Qwen3 VL 235B A22B Instruct
Qwen3 VL 235B A22B Instruct is Alibaba's multimodal vision-language model. It accepts interleaved text, image, and video input within a native context window of 262.1K tokens, and includes architectural improvements in spatial-temporal modeling and agentic GUI interaction.
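import { streamText } from 'ai'
// The model is selected by its ID string; streamText returns a result whose text arrives incrementally.
const result = streamText({ model: 'alibaba/qwen3-vl-instruct', prompt: 'Why is the sky blue?' })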
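
Because the model takes images as well as text, a request can interleave both in one user turn. The sketch below only builds the message payload. The helper name `buildVisionMessages` and the example URL are hypothetical, and the part shapes are assumed to follow the AI SDK's text and image content parts rather than being taken from this page.

```typescript
// Content parts for a single user turn (shape assumed to match the AI SDK's text/image parts).
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: URL }
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: places the question first, followed by one image part per URL.
function buildVisionMessages(question: string, imageUrls: string[]): UserMessage[] {
  const images = imageUrls.map((u): ImagePart => ({ type: 'image', image: new URL(u) }))
  return [{ role: 'user', content: [{ type: 'text', text: question }, ...images] }]
}

// Placeholder URL for illustration only.
const messages = buildVisionMessages('What is shown in this image?', ['https://example.com/sky.jpg'])
```

Under these assumptions, the resulting array would be passed to `streamText` as `messages` in place of `prompt`, with the same `model` string as in the snippet above.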