Perplexity API Integration for AI Search
Perplexity provides LLMs with built-in real-time web search. Unlike regular LLMs, the model automatically searches for current information and returns an answer with links to sources. This is useful for tasks that depend on fresh data (news, prices, technical changes).
Basic Integration
import os

from openai import OpenAI

# Perplexity exposes an OpenAI-compatible API, so the OpenAI SDK works as-is
client = OpenAI(
    api_key=os.environ["PERPLEXITY_API_KEY"],  # read the key from the environment
    base_url="https://api.perplexity.ai",
)

# Search with current information
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",  # "online" = with search
    messages=[{
        "role": "user",
        "content": "What new features appeared in Python 3.13?",
    }],
)
print(response.choices[0].message.content)
# The answer contains current information plus links to sources

# Citations arrive as an extra field on the response, when present
for citation in getattr(response, "citations", None) or []:
    print(f"Source: {citation}")
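To show the sources next to the answer, the citation list can be joined to the text. A minimal sketch, assuming the answer refers to sources with bracketed markers such as [1] and that `citations` is a list of URL strings in marker order; both are assumptions about the response shape, and `render_sources` is a hypothetical helper name:

```python
import re


def render_sources(content: str, citations: list[str]) -> str:
    """Append a numbered source list for the [n] markers used in the answer.

    Assumption: marker [n] refers to citations[n - 1].
    """
    # Collect the marker numbers that actually appear in the text
    used = sorted({int(n) for n in re.findall(r"\[(\d+)\]", content)})
    lines = [
        f"[{n}] {citations[n - 1]}"
        for n in used
        if 1 <= n <= len(citations)  # skip markers with no matching citation
    ]
    if not lines:
        return content
    return content + "\n\nSources:\n" + "\n".join(lines)


# Sample data for illustration only, not real API output
sample_answer = "The release adds feature A [1] and changes feature B [2]."
sample_urls = ["https://example.com/a", "https://example.com/b"]
print(render_sources(sample_answer, sample_urls))
```

In a real call the inputs would be `response.choices[0].message.content` and the citation list read from the response.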
Search Setup
# search_domain_filter: limit search to specific domains
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-online",
    messages=[{"role": "user", "content": "Latest news about GPT-5"}],
    # Perplexity-specific parameters are passed through extra_body,
    # because they are not part of the OpenAI SDK signature
    extra_body={
        "search_domain_filter": ["openai.com", "techcrunch.com"],
        "search_recency_filter": "week",  # only the last week
        "return_images": False,
        "return_related_questions": True,
    },
)
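When the search parameters come from user input or configuration, it may help to build the `extra_body` dict in one place. A rough sketch: `build_search_options` and `ALLOWED_RECENCY` are hypothetical names, and the set of accepted recency values is an assumption that should be checked against the current Perplexity documentation:

```python
from urllib.parse import urlparse

# Assumption: recency values accepted by the API; verify against the docs
ALLOWED_RECENCY = {"hour", "day", "week", "month"}


def build_search_options(domains=None, recency=None, related_questions=False):
    """Build the extra_body dict for a search-enabled request."""
    options = {"return_related_questions": related_questions}
    if domains:
        cleaned = []
        for d in domains:
            # Accept full URLs or bare hosts; keep only the host name
            host = urlparse(d if "//" in d else f"//{d}").netloc.lower()
            host = host.removeprefix("www.")
            if host and host not in cleaned:
                cleaned.append(host)
        options["search_domain_filter"] = cleaned
    if recency is not None:
        if recency not in ALLOWED_RECENCY:
            raise ValueError(f"unsupported recency filter: {recency!r}")
        options["search_recency_filter"] = recency
    return options


print(build_search_options(["https://www.openai.com/blog", "techcrunch.com"], recency="week"))
```

The result can be passed directly as `extra_body=build_search_options(...)` in the request.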
Offline Models (without search)
# When search is not needed, the chat model is cheaper and faster
response = client.chat.completions.create(
    model="llama-3.1-sonar-large-128k-chat",  # "chat" = without search
    messages=[{"role": "user", "content": "Explain Dijkstra's algorithm"}],
)
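Choosing between the online and chat models can be automated per request. The sketch below uses a deliberately naive keyword heuristic; the trigger words and the `pick_model` helper are hypothetical and would need tuning on real traffic:

```python
import re

ONLINE_MODEL = "llama-3.1-sonar-large-128k-online"
CHAT_MODEL = "llama-3.1-sonar-large-128k-chat"

# Hypothetical trigger words that suggest the question needs fresh data
FRESHNESS_HINTS = {"latest", "today", "news", "price", "prices", "current", "release"}


def pick_model(question: str) -> str:
    """Return the online model only when the question looks time-sensitive."""
    # Match whole words so that e.g. "concurrent" does not trigger "current"
    words = set(re.findall(r"[a-z]+", question.lower()))
    return ONLINE_MODEL if words & FRESHNESS_HINTS else CHAT_MODEL


print(pick_model("Latest news about GPT-5"))
print(pick_model("Explain Dijkstra algorithm"))
```

A keyword list misses many time-sensitive questions, so defaulting to the online model when in doubt is the safer choice if answer freshness matters more than cost.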
Perplexity Models
| Model | Search | Context | Cost (per 1M tokens) |
|---|---|---|---|
| llama-3.1-sonar-huge-128k-online | Yes | 127K | $5 + $5/1000 searches |
| llama-3.1-sonar-large-128k-online | Yes | 127K | $1 + $5/1000 searches |
| llama-3.1-sonar-small-128k-online | Yes | 127K | $0.20 + $5/1000 searches |
| llama-3.1-sonar-large-128k-chat | No | 127K | $1 |
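The prices in the table translate into a simple budget estimate. A minimal sketch, assuming the per-token price applies to the total of input and output tokens and that the search fee is charged per 1000 searches; `estimate_cost` is a hypothetical helper and the figures are copied from the table, not fetched from the API:

```python
# Prices from the table: (USD per 1M tokens, USD per 1000 searches)
PRICES = {
    "llama-3.1-sonar-huge-128k-online": (5.00, 5.00),
    "llama-3.1-sonar-large-128k-online": (1.00, 5.00),
    "llama-3.1-sonar-small-128k-online": (0.20, 5.00),
    "llama-3.1-sonar-large-128k-chat": (1.00, 0.00),
}


def estimate_cost(model: str, tokens: int, searches: int = 0) -> float:
    """Estimate spend in USD for a given token volume and number of searches."""
    per_million_tokens, per_thousand_searches = PRICES[model]
    token_cost = tokens / 1_000_000 * per_million_tokens
    search_cost = searches / 1000 * per_thousand_searches
    return token_cost + search_cost


# Example: 2M tokens and 1000 searches on the large online model
cost = estimate_cost("llama-3.1-sonar-large-128k-online", 2_000_000, 1000)
print(f"Estimated cost: ${cost:.2f}")
```

With these numbers the search fee dominates for short requests, which is why routing questions that need no search to the chat model reduces cost.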
Use Cases
- News and competitor monitoring (current information)
- Checking whether technical documentation is up to date
- Answering questions about current events
- Research assistant with verifiable sources
Timeline
- Basic integration: 0.5 days
- Citation and source parsing: 1 day
- Integration into corporate search: 1 week







