Interfaze Beta
Interfaze Beta merges specialized DNN/CNN models with an LLM to handle deterministic developer tasks like OCR, scraping, classification, structured outputs, and web extraction. It accepts up to 1M input tokens and returns up to 32K output tokens. On AI Gateway, it costs $1.5 per million input tokens and $3.5 per million output tokens.
import { streamText } from 'ai'
const result = streamText({ model: 'interfaze/interfaze-beta', prompt: 'Why is the sky blue?' })
Playground
Try out Interfaze Beta by Interfaze. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
About Interfaze Beta
Interfaze built Interfaze Beta around a routing layer. Each request goes to whichever specialized model fits the task. Small CNN and DNN models handle perception work like OCR and object detection. An LLM handles language reasoning. Custom tools cover web search, a code sandbox, and configurable safety guardrails. The endpoint is a single OpenAI-compatible URL.
The context window is 1M tokens and maximum output is 32K tokens. Interfaze reports 70.7% on OCRBench V2 for the native OCR path and 98 to 99% accuracy on structured output generation. Inputs accepted include text, images, audio, files, and video. Reasoning is available for harder queries.
Task coverage includes OCR and document extraction, object detection driven by natural language prompts, web scraping (with handling for sites that block bots), speech-to-text with speaker diarization, translation across many languages, classification, structured output, text-to-SQL, and multimodal question answering.
Because the underlying mix of CNNs, DNNs, and an LLM stays opaque behind one endpoint, integration looks identical to any other chat-completions model. Send a prompt with optional image, audio, or file attachments and get back a response that matches your requested schema. See the product documentation for details, and https://interfaze.ai/ for the model page.
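A minimal sketch of that request shape, assuming the gateway accepts the standard OpenAI Chat Completions body (`image_url` content parts and a `response_format` of type `json_schema`). The base URL `https://ai-gateway.vercel.sh/v1`, the helper name `buildExtractionRequest`, and the invoice schema are assumptions for illustration and are not confirmed by this page.

```typescript
// Hypothetical helper: builds an OpenAI-style Chat Completions request that
// pairs a text prompt with an image and asks for schema-shaped JSON.
// ASSUMPTION: the base URL and response_format support are not stated on this page.
const GATEWAY_BASE_URL = 'https://ai-gateway.vercel.sh/v1';

type JsonSchema = Record<string, unknown>;

interface GatewayRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

function buildExtractionRequest(
  apiKey: string,
  prompt: string,
  imageUrl: string,
  schema: JsonSchema,
): GatewayRequest {
  const body = {
    model: 'interfaze/interfaze-beta',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: prompt },
          { type: 'image_url', image_url: { url: imageUrl } },
        ],
      },
    ],
    // Ask for JSON that follows the supplied schema.
    response_format: {
      type: 'json_schema',
      json_schema: { name: 'extraction', schema },
    },
  };
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Illustrative schema for an invoice scan (hypothetical fields).
const invoiceSchema: JsonSchema = {
  type: 'object',
  properties: { invoiceNumber: { type: 'string' }, total: { type: 'number' } },
  required: ['invoiceNumber', 'total'],
};

const request = buildExtractionRequest(
  'YOUR_API_KEY',
  'Extract the invoice number and total.',
  'https://example.com/invoice.png',
  invoiceSchema,
);
```

Passing `request.url` and `request.init` to `fetch` would send the request. In a Chat Completions response, the JSON string to parse normally sits at `choices[0].message.content`.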
What To Consider When Choosing a Provider
- Workload Fit: Interfaze Beta is a beta release. It targets workloads where deterministic, parseable output matters more than open-ended conversation. If your pipeline lives or dies by JSON schema adherence, OCR fidelity, or reliable web scraping, Interfaze Beta is built for that shape. For free-form chat, code generation, or general reasoning, a general-purpose frontier model usually fits better.
- Configuration: Call Interfaze Beta through the AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python. Inputs cover text, images, audio, files, and video.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
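The authentication point above can be sketched as a small helper that picks whichever gateway credential is present. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the Bearer header format, and the helper name `gatewayAuthHeaders` are assumptions here; check the AI Gateway docs for the exact names in your setup.

```typescript
// Hypothetical helper: chooses a gateway credential from environment values.
// ASSUMPTION: the variable names below are not stated on this page.
type Env = Record<string, string | undefined>;

function gatewayAuthHeaders(env: Env): Record<string, string> {
  // Prefer an explicit API key; fall back to an OIDC token if one is set.
  // Empty strings are treated as missing.
  const token = env['AI_GATEWAY_API_KEY'] || env['VERCEL_OIDC_TOKEN'];
  if (!token) {
    throw new Error('No AI Gateway credential found');
  }
  return { Authorization: `Bearer ${token}` };
}
```

In a Node.js service this would typically be called as `gatewayAuthHeaders(process.env)`. No provider-specific key appears anywhere, which matches the point that provider credentials are not managed directly.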
When to Use Interfaze Beta
Best For
- OCR Pipelines: Document extraction workloads that need high field-level accuracy on scanned content
- Structured Output: API responses that downstream systems parse against a strict schema
- Web Extraction: Scraping workflows, including sites that block typical bots
- Multimodal Classification: Tasks that mix images, audio, files, or video alongside text
- Text-to-SQL: Natural-language queries translated into runnable SQL
Consider Alternatives When
- General Chat: A general-purpose frontier model fits open-ended conversation better
- Code Generation: A coding-tuned model handles long code synthesis with more depth
- Production Stability: A stable release is a safer pick than a beta for critical paths
- Creative Writing: A model tuned for long-form prose serves narrative work more naturally
Conclusion
Interfaze Beta packages OCR, web extraction, structured output, and multimodal reasoning behind one OpenAI-compatible endpoint. Reach for it when deterministic output matters more than open-ended conversation.
Frequently Asked Questions
What is Interfaze Beta?
Interfaze Beta is a hybrid AI system from Interfaze that routes each request to a specialized DNN or CNN model when one fits, and falls back to an LLM otherwise. It targets developer tasks like OCR, scraping, classification, structured outputs, and web extraction.
What is the context window and output limit?
The context window is 1M tokens and the maximum output is 32K tokens.
Which input modalities does Interfaze Beta support?
Text, images, audio, files, and video. The API stays OpenAI Chat Completions compatible across all of them.
How do I call Interfaze Beta through AI Gateway?
Set the model to interfaze/interfaze-beta in the AI SDK, Chat Completions API, Responses API, Messages API, or other API formats, from TypeScript or Python. AI Gateway handles authentication and routing. See https://interfaze.ai/ for the model page.
What is the pricing?
On AI Gateway, Interfaze Beta costs $1.5 per million input tokens and $3.5 per million output tokens. Current rates appear on this page.
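As a rough illustration of how those rates add up, the sketch below estimates the cost of a single request from its token counts. The function name `estimateCostUsd` is hypothetical, and the rates are hard-coded from the figures above, so treat the result as an estimate rather than a billing calculation.

```typescript
// Rates copied from this page; they may change, so treat them as a snapshot.
const INPUT_USD_PER_MILLION = 1.5;
const OUTPUT_USD_PER_MILLION = 3.5;

// Hypothetical helper: estimated USD cost for one request.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = inputTokens * INPUT_USD_PER_MILLION;
  const outputCost = outputTokens * OUTPUT_USD_PER_MILLION;
  return (inputCost + outputCost) / 1_000_000;
}

// Example: a 200,000-token document with a 10,000-token extraction result.
const exampleCost = estimateCostUsd(200_000, 10_000);
```

For that example the input side contributes $0.30 and the output side $0.035, so the request comes to roughly $0.335.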
How well does Interfaze Beta handle structured output?
Interfaze reports 98 to 99% accuracy on structured output generation, which makes Interfaze Beta a fit for pipelines that parse responses against a schema.
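A reported accuracy of 98 to 99% still leaves room for an occasional response that does not match the schema, so a pipeline may want to validate before handing data downstream. The sketch below is one way to do that with a plain type guard. The `Invoice` shape and the function names are hypothetical, and a schema library could replace the hand-written check.

```typescript
// Hypothetical target shape for an extraction result.
interface Invoice {
  invoiceNumber: string;
  total: number;
}

// Type guard: checks that a parsed value has the fields the pipeline relies on.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record['invoiceNumber'] === 'string' &&
    typeof record['total'] === 'number'
  );
}

// Returns the typed result, or null when the text is not valid JSON
// or does not match the expected shape.
function parseInvoice(responseText: string): Invoice | null {
  try {
    const parsed: unknown = JSON.parse(responseText);
    return isInvoice(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

A caller could retry the request or route the document to manual review whenever `parseInvoice` returns `null`.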
Does Interfaze Beta support zero data retention?
Zero Data Retention is offered on a per-provider basis and is not currently available for this model. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.