Cohere API Integration: Command R, Command R+, Embed
Cohere focuses on enterprise NLP. Its multilingual embedding model (embed-multilingual-v3.0) has ranked among the strongest models on the MTEB benchmark for multilingual search. Command R+ is optimized for RAG: when documents are passed along with a request, the response includes citations that point back to the source documents. This makes it a good fit for enterprise search where answers must be verifiable.
Basic Integration
import asyncio
import os

import cohere

# Read the API key from the environment instead of hard-coding it
co = cohere.Client(os.environ["COHERE_API_KEY"])

# Chat (Command R+)
response = co.chat(
    model="command-r-plus",
    message="Explain the principles of transformers",
    temperature=0.1,
)
print(response.text)

# Async: AsyncClient exposes the same methods as coroutines
async def main():
    async_co = cohere.AsyncClient(os.environ["COHERE_API_KEY"])
    response = await async_co.chat(model="command-r-plus", message="Request")
    print(response.text)

asyncio.run(main())
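The async client pays off mainly when several requests run at once. Below is a minimal sketch of bounded concurrency. It assumes a coroutine `ask` that wraps the chat call; `gather_bounded`, `ask`, and `max_concurrency` are illustrative names, not part of the Cohere SDK.

```python
import asyncio
from typing import Awaitable, Callable, List


async def gather_bounded(
    ask: Callable[[str], Awaitable[str]],
    prompts: List[str],
    max_concurrency: int = 4,
) -> List[str]:
    """Run ask(prompt) for every prompt, at most max_concurrency at a time.

    Results are returned in the same order as the prompts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(prompt: str) -> str:
        # The semaphore caps how many calls are in flight at once
        async with semaphore:
            return await ask(prompt)

    return await asyncio.gather(*(worker(p) for p in prompts))


# Hypothetical wiring with the real client (not executed here):
#   async def ask(prompt):
#       r = await async_co.chat(model="command-r-plus", message=prompt)
#       return r.text
#   answers = asyncio.run(gather_bounded(ask, ["Q1", "Q2", "Q3"]))
```

A fixed cap like this is one simple way to stay under rate limits. The right value depends on the account's limits and is not specified here.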
RAG Mode with Built-in Citations
documents = [
    {"id": "doc_1", "title": "Security Policy", "text": "...text..."},
    {"id": "doc_2", "title": "Access Rules", "text": "...text..."},
]
# Command R+ grounds its answer in the supplied documents and cites them
response = co.chat(
    model="command-r-plus",
    message="How to get access to corporate systems?",
    documents=documents,
)
print(response.text)
# Each citation marks a span of the answer and the documents that support it
for citation in response.citations or []:
    print(f"Citation: {citation.text}, sources: {citation.document_ids}")
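To show readers which part of an answer each source supports, the citations can be turned into inline markers. The sketch below assumes each citation also carries `start` and `end` character offsets into `response.text`, with `end` exclusive. `Span` and `annotate` are hypothetical helpers written for this illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Stand-in holding the citation fields this sketch reads."""
    start: int               # assumed: offset of the first cited character
    end: int                 # assumed: offset just past the last cited character
    document_ids: List[str]


def annotate(text: str, citations: Sequence[Span]) -> str:
    """Append a " [doc_id, ...]" marker after each cited span of text."""
    out = text
    # Insert from the end of the text backwards so earlier offsets stay valid
    for c in sorted(citations, key=lambda c: c.end, reverse=True):
        marker = " [" + ", ".join(c.document_ids) + "]"
        out = out[: c.end] + marker + out[c.end :]
    return out


# Hypothetical usage with a real response (not executed here):
#   print(annotate(response.text, response.citations or []))
```

Because the function only reads `end` and `document_ids`, it should also accept the SDK's citation objects directly, as long as they expose those attributes.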
Embeddings (Best in Class for Search)
# Multilingual embeddings: one of the best options for RU/EN/UA
response = co.embed(
    texts=["Search documents", "Document search", "Пошук документів"],
    model="embed-multilingual-v3.0",
    input_type="search_query",  # "search_query" or "search_document"
)
embeddings = response.embeddings

# For indexing documents
doc_embeddings = co.embed(
    texts=["Document text 1", "Document text 2"],
    model="embed-multilingual-v3.0",
    input_type="search_document",
)
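Search then comes down to comparing a query vector against the indexed document vectors. Here is a minimal sketch that assumes cosine similarity as the metric (a common choice, not something stated above). `cosine` and `rank` are illustrative helpers in plain Python. A real index would normally use a vector database or NumPy.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical usage with the responses above (not executed here):
#   hits = rank(response.embeddings[0], doc_embeddings.embeddings)
```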
Rerank — Search Results Re-ranking
# Cohere Rerank: a powerful tool for improving RAG accuracy
docs = [
    "Python is an interpreted programming language",
    "Anaconda is a Python distribution for data science",
    "Python snakes are widespread in tropical regions",
    "Django is a Python web framework",
]
results = co.rerank(
    model="rerank-multilingual-v3.0",
    query="Python for machine learning",
    documents=docs,
    top_n=3,
)
# result.index points back into the original docs list
for result in results.results:
    print(f"Score: {result.relevance_score:.3f} | {docs[result.index]}")
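In a RAG pipeline the reranked results usually feed the chat call. The sketch below keeps only results above a relevance threshold and shapes them like the `documents` list used in the RAG example. `select_reranked`, the `doc_<index>` id scheme, and the 0.5 threshold are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def select_reranked(
    docs: Sequence[str],
    ranked: Iterable[Tuple[int, float]],
    min_score: float = 0.5,  # hypothetical cutoff; tune on your own data
) -> List[Dict[str, str]]:
    """Turn (index, relevance_score) pairs into chat-ready document dicts.

    Pairs below min_score are dropped; the incoming order is preserved.
    """
    selected = []
    for index, score in ranked:
        if score >= min_score:
            selected.append({"id": f"doc_{index}", "text": docs[index]})
    return selected


# Hypothetical usage with the rerank response above (not executed here):
#   pairs = [(r.index, r.relevance_score) for r in results.results]
#   response = co.chat(model="command-r-plus", message=query,
#                      documents=select_reranked(docs, pairs))
```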
Cost of Cohere (2025)
| Service | Cost (USD) |
|---|---|
| Command R+ | $2.50 input / $10.00 output per 1M tokens |
| Command R | $0.15 input / $0.60 output per 1M tokens |
| Embed multilingual v3 | $0.10 per 1M tokens |
| Rerank | $2.00 per 1,000 searches |
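For budgeting, the chat prices in the table translate into a simple estimate. This sketch hard-codes those figures and assumes the Command R prices are per 1M tokens like the Command R+ ones. `PRICES_PER_1M` and `chat_cost` are illustrative names, and actual billing may differ.

```python
from typing import Dict, Tuple

# (input price, output price) in USD per 1M tokens, copied from the table above
PRICES_PER_1M: Dict[str, Tuple[float, float]] = {
    "command-r-plus": (2.50, 10.00),
    "command-r": (0.15, 0.60),
}


def chat_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of chat usage for the given token counts."""
    price_in, price_out = PRICES_PER_1M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Example workload: 10,000 requests, each with 2,000 input and 300 output tokens
monthly = chat_cost("command-r-plus", 10_000 * 2_000, 10_000 * 300)
print(f"Estimated cost: ${monthly:.2f}")
```

For that workload the input side is 20M tokens ($50) and the output side is 3M tokens ($30), so the estimate comes to about $80 on Command R+.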
Timeline
- Basic integration of chat: 0.5–1 day
- RAG with citations: 2–3 days
- Rerank pipeline: 1–2 days







