
Whisper

Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.

index.ts
import { experimental_transcribe as transcribe } from 'ai';
import { gateway } from '@ai-sdk/gateway';
import { readFile } from 'node:fs/promises';

// Transcribe a local audio file with Whisper through the gateway.
const result = await transcribe({
  model: gateway.transcriptionModel('openai/whisper-1'),
  audio: await readFile('audio.mp3'),
});

console.log(result.text);
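
A transcription is often turned into subtitles. The sketch below is one way to do that, assuming the transcription result exposes timed segments shaped as `{ text, startSecond, endSecond }`; that shape is an assumption and is not confirmed by this page. The `Segment` type and the helpers `toSrtTimestamp` and `segmentsToSrt` are hypothetical names introduced only for illustration.

```typescript
// Assumed segment shape (not confirmed by this page): one timed piece of text.
interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width = 2) => String(n).padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Hypothetical helper: convert timed segments into SRT subtitle text.
// Each cue is "index, time range, text", and cues are separated by a blank line.
function segmentsToSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      [
        String(i + 1),
        `${toSrtTimestamp(seg.startSecond)} --> ${toSrtTimestamp(seg.endSecond)}`,
        seg.text.trim(),
      ].join('\n'),
    )
    .join('\n\n');
}
```

If the provider returns segment timings, something like `segmentsToSrt(result.segments)` could be written to a `.srt` file. When no segments come back, the helper returns an empty string, so falling back to `result.text` may be more useful.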

More models by OpenAI

| Model | Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Web Search (Per Query) | Capabilities | Providers | ZDR | No Training | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1M | 0.8s | 93 tps | $5.00/M | $30.00/M | $0.5/M | | $10.00/K + input costs | +4 | Azure, Bedrock, OpenAI | | | 04/24/2026 |
| | 400K | 0.8s | 194 tps | $0.75/M | $4.50/M | $0.07/M | | $10.00/K + input costs | +4 | Azure, OpenAI | | | 03/17/2026 |
| | 400K | 0.5s | 156 tps | $0.20/M | $1.25/M | $0.02/M | | $10.00/K + input costs | +4 | Azure, OpenAI | | | 03/17/2026 |
| | 1.1M | 1.4s | 86 tps | $2.50/M | $15.00/M | $0.25/M | | $10.00/K + input costs | +4 | Azure, OpenAI | | | 03/05/2026 |
| | 400K | 3.3s | 331 tps | $0.25/M | $2.00/M | $0.03/M | | $14/K + input costs | +3 | Azure, OpenAI | | | 08/07/2025 |
| | 131K | 0.1s | 1800 tps | $0.35/M | $0.75/M | $0.25/M | | | | Baseten, Bedrock, Cerebras, +5 | | | 08/05/2025 |