Mistral Medium 3.5: A 128-Billion-Parameter Model from France Combines Chat, Reasoning, and Coding
May 3, 2026
On May 2, 2026, Mistral unveiled Medium 3.5: a dense 128-billion-parameter model with a 256k context window and configurable reasoning, released under a modified MIT license.
Mistral Medium 3.5: Dense 128-Billion-Parameter Model With a 256k Context Window
French AI company Mistral released its new flagship model, Medium 3.5, on May 2, 2026. Unlike many competitors, Mistral did not pick a Mixture-of-Experts architecture. Instead, it shipped a dense model with 128 billion parameters, all of which are activated for every generated token. The context window is 256,000 tokens, and the model is multimodal, with a vision encoder that handles variable image sizes.
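The dense-versus-MoE trade-off above can be made concrete with a back-of-envelope compute estimate. A common approximation is that a forward pass costs roughly 2 FLOPs per active parameter per token; the MoE figure below is purely illustrative and not a real Mistral configuration.

```python
# Rough per-token compute: ~2 FLOPs per ACTIVE parameter (standard approximation).
# A dense model activates every weight for every token; an MoE activates
# only the routed experts. The MoE numbers here are illustrative.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per generated token."""
    return 2 * active_params

dense_active = 128e9   # Medium 3.5: all 128B weights active per token
moe_active = 39e9      # hypothetical MoE with ~39B active params of a larger total

print(f"dense: {flops_per_token(dense_active):.2e} FLOPs/token")
print(f"MoE  : {flops_per_token(moe_active):.2e} FLOPs/token")
```

The upshot: a dense 128B model pays the full 128B-parameter compute bill on every token, which is why serving efficiency (quantization, few GPUs) matters so much for this design.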
Token Pricing and Modified MIT License on Hugging Face
Mistral charges 1.50 U.S. dollars per million input tokens and 7.50 U.S. dollars per million output tokens. That is more than many open-weight competitors, but cheaper than the flagships of OpenAI and Anthropic. The weights are available on Hugging Face under a modified MIT license.
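At those rates, per-request cost is simple arithmetic. A minimal sketch, using only the listed prices:

```python
# Cost estimate at the listed Medium 3.5 rates:
# $1.50 per million input tokens, $7.50 per million output tokens.

INPUT_PER_M = 1.50
OUTPUT_PER_M = 7.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single API request."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Example: summarize a 20,000-token document into 1,000 tokens.
cost = request_cost(20_000, 1_000)
print(f"${cost:.4f}")  # → $0.0375
```

At these prices, output tokens dominate only for generation-heavy workloads; for long-document analysis the input side drives the bill.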
SWE-Bench Verified at 77.6 Percent: Benchmark Details
On internal benchmarks, Mistral reports 77.6 percent on SWE-Bench Verified, which tests real GitHub bug fixes, and 91.4 percent on τ³-Telecom, a test of agentic tool use in telecommunications. That places the model in the upper tier, without quite reaching the top scores of GPT-5.5 or Claude.
One Model Replaces Three: Medium 3.1, Magistral, and Devstral 2 Retired
A notable detail is the consolidation. Medium 3.5 replaces three earlier lines: Medium 3.1, Magistral, and Devstral 2. Developers can configure reasoning effort per request. That lowers cost on easy tasks and lifts quality on hard ones.
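Per-request reasoning configuration might look like the sketch below. The field name `reasoning_effort`, its values, and the model identifier are assumptions for illustration; Mistral's actual API parameter may differ.

```python
# Sketch of a per-request reasoning-effort setting. The "reasoning_effort"
# field and its values are ASSUMED for illustration, not Mistral's documented API.

def build_request(prompt: str, effort: str) -> dict:
    """Build a chat payload with an explicit reasoning-effort level."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "mistral-medium-3.5",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # cheap on easy tasks, thorough on hard ones
    }

easy = build_request("Reformat this date as ISO 8601: 2 May 2026", "low")
hard = build_request("Find the race condition in this mutex code ...", "high")
```

One knob replacing three model lines is the point: instead of routing between a chat model, a reasoning model, and a coding model, the caller dials effort up or down on a single endpoint.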
Why it matters
For European companies, the release matters because it gives them a high-performance open-weight model that can also run on premises. According to Mistral, Medium 3.5 runs on as few as four GPUs. That meaningfully lowers the bar for self-hosting, especially in regulated industries such as pharma, banking, and insurance, where data must not leave the company. Mistral is also the only serious EU contender for frontier-scale LLMs.
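A rough memory estimate shows why the four-GPU claim is plausible. The precision and hardware below are assumptions for the sketch, not Mistral's published deployment setup:

```python
# Back-of-envelope: can 128B parameters fit on four 80 GB GPUs?
# 128B weights at 8-bit (FP8) precision is ~128 GB; at BF16 it is ~256 GB.
# Precision and GPU choice here are ASSUMPTIONS, not Mistral's published setup.

PARAMS = 128e9
GPU_MEM_GB = 80          # e.g. one H100
NUM_GPUS = 4

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for model weights alone, in GB."""
    return params * bytes_per_param / 1e9

fp8 = weight_memory_gb(PARAMS, 1.0)    # ~128 GB
bf16 = weight_memory_gb(PARAMS, 2.0)   # ~256 GB
available = NUM_GPUS * GPU_MEM_GB      # 320 GB

print(f"FP8: {fp8:.0f} GB, BF16: {bf16:.0f} GB, available: {available} GB")
```

At FP8 the weights fit with ample headroom for the KV cache; even BF16 squeezes in, though a 256k-token context would leave little cache room at that precision.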
Practical example
A German health insurer wants to automatically classify incoming claims and draft follow-up questions to policyholders. Running Medium 3.5 on four H100 GPUs in its own data center allows the model to process patient data without that data going to a U.S. cloud. Monthly costs are mainly electricity and hardware, not API fees. At scale, this pays off quickly, and a large part of the GDPR and EU AI Act debate about data export simply disappears.
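The insurer scenario can be sketched as a classification request against a self-hosted, OpenAI-compatible endpoint. The endpoint URL, category list, and model name below are illustrative assumptions, not details from the article:

```python
# Sketch of the insurer scenario: build a claim-classification payload for a
# self-hosted inference server. Categories, model name, and endpoint are
# illustrative ASSUMPTIONS.

CATEGORIES = ["dental", "hospital", "outpatient", "pharmacy", "other"]

def build_claim_request(claim_text: str) -> dict:
    """Build a chat-completion payload asking for exactly one category label."""
    system = (
        "Classify the insurance claim into exactly one category: "
        + ", ".join(CATEGORIES)
        + ". Reply with the category name only."
    )
    return {
        "model": "mistral-medium-3.5",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": claim_text},
        ],
        "temperature": 0.0,  # deterministic labels for downstream routing
    }

payload = build_claim_request("Invoice for a root canal treatment, 480 EUR.")
# POST this payload to the in-house cluster's /v1/chat/completions endpoint;
# no patient data leaves the company's data center.
```

The compliance benefit falls out of the architecture: the same payload that would go to a U.S. API provider goes to a local URL instead, and nothing else in the pipeline changes.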
💡 In plain English
Imagine a new writing tool from France that drafts texts, solves tricky problems, and fixes code. It is big enough to be smart, small enough to run on a few computers, and companies are allowed to host it on their own machines.
Key Takeaways
- ✅ Mistral Medium 3.5 launched on May 2, 2026, with 128 billion parameters.
- ✅ The model is dense, multimodal, and has a 256k token context window.
- ✅ Pricing is 1.50 USD per million input and 7.50 USD per million output tokens.
- ✅ SWE-Bench Verified score: 77.6 percent.
- ✅ Open weights under a modified MIT license; runs on four GPUs.
Sources & Context
- Mistral Medium 3.5 Folds Chat, Reasoning, and Code Into One 128B AI Model - WinBuzzer
- Mistral AI Launches Remote Agents in Vibe and Mistral Medium 3.5 with 77.6% SWE-Bench Verified Score - MarkTechPost
- mistralai/Mistral-Medium-3.5-128B - Hugging Face
- Remote agents in Vibe. Powered by Mistral Medium 3.5. - Mistral AI