Whisper
Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.
Translation, Transcription
index.ts
import { experimental_transcribe as transcribe } from 'ai';
import { gateway } from '@ai-sdk/gateway';
import { readFile } from 'node:fs/promises';

const result = await transcribe({
  model: gateway.transcriptionModel('openai/whisper-1'),
  audio: await readFile('audio.mp3'),
});

More models by OpenAI
| Model |
|---|
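
As a follow-up to the transcription call in index.ts, here is a minimal sketch of one way to turn a transcription result into timestamped lines. It assumes the result exposes a `text` string and a `segments` array whose entries carry `text`, `startSecond`, and `endSecond`; check that shape against your SDK version. The names `TranscriptLike`, `formatTimestamp`, and `toTimestampedLines` are hypothetical helpers introduced only for illustration.

```typescript
// Assumed (hypothetical) mirror of the fields read from a transcription result.
interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

interface TranscriptLike {
  text: string;
  segments: Segment[];
}

// Format a number of seconds as mm:ss, dropping fractional seconds.
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const pad = (n: number): string => (n < 10 ? `0${n}` : `${n}`);
  return `${pad(Math.floor(total / 60))}:${pad(total % 60)}`;
}

// Produce one "[start-end] text" line per segment.
// Falls back to the full text when no segments are available.
function toTimestampedLines(transcript: TranscriptLike): string[] {
  if (transcript.segments.length === 0) {
    return [transcript.text.trim()];
  }
  return transcript.segments.map(
    (s) =>
      `[${formatTimestamp(s.startSecond)}-${formatTimestamp(s.endSecond)}] ${s.text.trim()}`,
  );
}
```

If the result shape matches these assumptions, `toTimestampedLines(result).join('\n')` would give a plain-text transcript with one timestamped line per segment.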