
Sonar

perplexity/sonar

Sonar is Perplexity's lightweight search model. It combines language generation with built-in web search to deliver citation-backed answers at low cost within a context window of 127K tokens.

Tool Use · Vision (Image)
index.ts

```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'perplexity/sonar',
  prompt: 'Why is the sky blue?',
})

// Stream the answer to stdout as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}
```

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
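For API-key authentication, a typical setup is to export the key as an environment variable before running your app. The variable name below is the one the AI SDK's gateway integration conventionally reads; confirm it against your AI Gateway dashboard:

```shell
# Assumed variable name — verify in your AI Gateway settings.
export AI_GATEWAY_API_KEY="your-gateway-api-key"
```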

Sonar runs a web search on every request, so response latency includes the network round-trips needed to retrieve sources. Factor this into timeout budgets for latency-sensitive interfaces. Web search calls are billed separately from token usage; no per-thousand-searches rate is currently listed for this model.

When to Use Sonar

Best For

  • Customer-facing search:

    Answers must cite current web sources

  • High-volume factual Q&A:

    The lowest cost in the Sonar family, suitable for chatbots and support assistants

  • Automated research pipelines:

    Gather and synthesize web information across many queries without separate search infrastructure

  • Content verification workflows:

    Citation-backed responses provide auditable source trails

  • Lightweight RAG replacement:

    No need to build and maintain a custom retrieval stack

Consider Alternatives When

  • Deep multi-step reasoning:

    Sonar Reasoning Pro applies chain-of-thought before answering

  • Thorough source analysis:

    Sonar Pro searches more sources and produces more detailed synthesis

  • Purely generative work:

    Creative writing or code generation has no need for web grounding

  • Offline environments:

    Air-gapped deployments can't make web search calls

Conclusion

Sonar delivers Perplexity's signature capability: live web search integrated directly into language model inference. It's the most accessible price point in the Sonar family. If you need cited answers at scale without building retrieval infrastructure, it turns any prompt into a web-informed response in a single call.

FAQ

How does web search work with Sonar?

Every API call triggers a live web search. The model formulates search queries from your prompt, retrieves and evaluates web sources, then synthesizes the information into a response with inline citations. You don't need an external search API or RAG pipeline.

How does Sonar differ from Sonar Pro?

Sonar Pro searches more sources, produces longer and more detailed answers, and handles more complex multi-source queries. Sonar is optimized for speed and cost on straightforward factual queries.

Does Sonar provide citations?

Yes. Responses include inline citations referencing the web sources used for each claim. This gives your application verifiable provenance.

What is Sonar's context window?

127K tokens. This supports multi-turn conversations where search context accumulates across exchanges.

How much does Sonar cost through AI Gateway?

Current pricing is shown on this page. AI Gateway routes across providers, and rates may vary by provider.

How do I access Sonar through AI Gateway?

Use your AI Gateway API key with the model identifier `perplexity/sonar`. AI Gateway handles provider routing and authentication. You don't need a separate Perplexity API key when using gateway-managed access.

Can I disable web search?

No. Web search is integral to the Sonar architecture and runs on every request. If you need a model without search, use a general-purpose language model from another provider.