# Mistral AI API Integration: Mistral Large, Small, Codestral
Mistral AI is a European LLM provider. Its key advantages are GDPR compliance (data is processed in the EU), competitive pricing, and open-weight models for local deployment. Codestral is a specialized code model with Fill-in-the-Middle (FIM) support.
## Integration via the Official SDK
```python
from mistralai import Mistral

client = Mistral(api_key="MISTRAL_API_KEY")

# Basic call
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.1,
)
print(response.choices[0].message.content)
```
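Under load the API can return transient errors (rate limits, 5xx), so production callers usually wrap SDK calls in a retry with exponential backoff. A minimal sketch: the `call` parameter is a stand-in for any SDK method such as `client.chat.complete`; the wrapper name and defaults are illustrative, not part of the SDK.

```python
import random
import time

def with_retries(call, *args, max_attempts=4, base_delay=1.0, **kwargs):
    """Retry a flaky API call with exponential backoff plus jitter.

    `call` stands in for any SDK method, e.g. client.chat.complete.
    """
    for attempt in range(max_attempts):
        try:
            return call(*args, **kwargs)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Backoff: base, 2x base, 4x base, ... plus up to 0.5 s of jitter
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

Usage: `response = with_retries(client.chat.complete, model="mistral-large-latest", messages=[...])`.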
```python
# Async
import asyncio

async def async_chat(prompt: str) -> str:
    response = await client.chat.complete_async(
        model="mistral-small-latest",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(asyncio.run(async_chat("Hello")))
```
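Because `complete_async` returns a coroutine, many prompts can be fanned out concurrently with `asyncio.gather`. The sketch below substitutes a stand-in coroutine for the real API call so it runs offline; in real use, replace `fake_chat` with a coroutine like `async_chat` above.

```python
import asyncio

async def fake_chat(prompt: str) -> str:
    # Stand-in for an async API call such as client.chat.complete_async
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"

async def fan_out(prompts: list[str]) -> list[str]:
    # Run all requests concurrently; gather preserves input order
    return await asyncio.gather(*(fake_chat(p) for p in prompts))

results = asyncio.run(fan_out(["a", "b", "c"]))
```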
```python
# Streaming
with client.chat.stream(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Long answer"}],
) as stream:
    for event in stream:
        print(event.data.choices[0].delta.content or "", end="")
```
## Function Calling
```python
tools = [{
    "type": "function",
    "function": {
        "name": "search_db",
        "description": "Search company database",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "integer", "default": 10},
            },
            "required": ["query"],
        },
    },
}]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Find information about client Ivanov"}],
    tools=tools,
    tool_choice="auto",
)
```
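When the model decides to call a tool, the response message carries `tool_calls` whose `arguments` arrive as a JSON-encoded string; your code runs the function and sends the result back as a `"tool"` role message. The dispatch step can be sketched offline: the `search_db` stub and the hard-coded `example_call` dict below are illustrative (on the real SDK object the fields are attributes, e.g. `tool_call.function.name`).

```python
import json

def search_db(query: str, limit: int = 10) -> list[dict]:
    # Illustrative stub; a real implementation would query your database
    return [{"name": "Ivanov", "query": query}][:limit]

AVAILABLE_TOOLS = {"search_db": search_db}

def dispatch_tool_call(tool_call: dict) -> str:
    """Execute one tool call and return a JSON string for the 'tool' message."""
    fn = AVAILABLE_TOOLS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])  # arguments are a JSON string
    return json.dumps(fn(**args))

# Shape mirrors response.choices[0].message.tool_calls[0]
example_call = {"function": {"name": "search_db", "arguments": '{"query": "Ivanov"}'}}
tool_result = dispatch_tool_call(example_call)
```

The result is then appended to `messages` as `{"role": "tool", "name": "search_db", "content": tool_result, "tool_call_id": ...}` and `chat.complete` is called again so the model can produce the final answer.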
## Codestral for Code (FIM)
```python
# Fill-in-the-Middle: completes code between a prefix and a suffix
fim_response = client.fim.complete(
    model="codestral-latest",
    prompt="def calculate_discount(price: float,",
    suffix=") -> float:\n    return discounted_price",
    temperature=0,
    max_tokens=256,
)
print(fim_response.choices[0].message.content)
```
```python
# Code generation
code_response = client.chat.complete(
    model="codestral-latest",
    messages=[{"role": "user", "content": "Write a Python function to parse CSV with error handling"}],
)
print(code_response.choices[0].message.content)
```
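Chat models typically wrap generated code in markdown fences, so downstream tooling (linters, test runners) needs the code extracted from the surrounding prose first. A simple regex-based sketch (the helper name is illustrative; the fence string is built at runtime to keep this example embeddable):

```python
import re

FENCE = "`" * 3  # a markdown code fence, built at runtime

def extract_code_blocks(text: str, lang: str = "python") -> list[str]:
    """Pull fenced code blocks (optionally language-tagged) out of a model reply."""
    pattern = rf"{FENCE}(?:{lang})?\n(.*?){FENCE}"
    return [m.strip() for m in re.findall(pattern, text, flags=re.DOTALL)]

reply = "Here you go:\n" + FENCE + "python\ndef f():\n    return 1\n" + FENCE + "\nDone."
blocks = extract_code_blocks(reply)
```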
## Embeddings
```python
response = client.embeddings.create(
    model="mistral-embed",
    inputs=["First document", "Second document"],
)
embeddings = [item.embedding for item in response.data]
```
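Embedding vectors are usually compared with cosine similarity for semantic search or deduplication. A pure-Python sketch (NumPy works just as well); the short toy vectors stand in for real `mistral-embed` outputs, which require an API call:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors in place of real embeddings
doc_vec = [1.0, 0.0, 1.0]
query_vec = [1.0, 0.0, 0.0]
score = cosine_similarity(doc_vec, query_vec)
```

To rank documents against a query, compute the score for each document vector and sort descending.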
## Mistral Pricing (2025)
| Model | Input ($ / 1M tokens) | Output ($ / 1M tokens) |
|---|---|---|
| Mistral Large | $2 | $6 |
| Mistral Small | $0.20 | $0.60 |
| Codestral | $0.20 | $0.60 |
| Mistral Embed | $0.10 | — |
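The table translates directly into a per-request cost estimator. A minimal sketch using the prices above; in practice the token counts come from `response.usage` on a completed call:

```python
# USD per 1M tokens (input, output), from the pricing table above
PRICES = {
    "mistral-large-latest": (2.00, 6.00),
    "mistral-small-latest": (0.20, 0.60),
    "codestral-latest": (0.20, 0.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request for the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

cost = estimate_cost("mistral-large-latest", 10_000, 2_000)
```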
## Local Deployment via Ollama
```shell
ollama pull mistral:7b
ollama pull codestral:22b
```
```python
# Mistral via Ollama exposes an OpenAI-compatible API
from openai import OpenAI

local_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
response = local_client.chat.completions.create(
    model="mistral:7b",
    messages=[{"role": "user", "content": "Hello"}],
)
```
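Because the local endpoint speaks the same protocol, a common pattern is to try the local model first and fall back to the hosted API when the Ollama server is unreachable. Sketch with injected callables so it runs offline; in practice `primary` and `fallback` would be small wrappers around `local_client` and the Mistral cloud client.

```python
def chat_with_fallback(prompt: str, primary, fallback) -> str:
    """Try the local endpoint first; fall back to the hosted API on any error.

    `primary` / `fallback` are callables taking a prompt and returning text.
    """
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)

def local_down(prompt: str) -> str:
    # Stand-in for an unreachable Ollama server
    raise ConnectionError("localhost:11434 refused")

answer = chat_with_fallback("Hello", local_down, lambda p: f"cloud: {p}")
```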
## Timeline
- Basic integration: 0.5–1 day
- Codestral for IDE assistant: 2–3 days
- Local deployment + optimization: 1 week







