Gemma 4 26B A4B IT
Gemma 4 26B A4B IT is Google's open-weight mixture-of-experts model with 26B total parameters and roughly 4B active per forward pass. Built on the Gemini 3 architecture, it supports function-calling, structured JSON output, native vision, and 140+ languages within a context window of 262.1K tokens.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'google/gemma-4-26b-a4b-it',
  prompt: 'Why is the sky blue?',
})
```
Frequently Asked Questions
What does mixture-of-experts mean for Gemma 4 26B A4B IT?
Gemma 4 26B A4B IT has 26B total parameters split across expert sub-networks. A routing mechanism activates roughly 4B parameters per forward pass, selecting the most relevant experts for each input. This reduces compute per token compared to a dense model of equivalent total size.
How does Gemma 4 26B A4B IT compare to the dense Gemma 4 31B?
Gemma 4 26B A4B IT prioritizes latency and throughput by activating fewer parameters per token. The dense Gemma 4 31B activates all 31B parameters, targeting higher output quality at the cost of more compute. Choose Gemma 4 26B A4B IT when speed matters and the dense variant when quality is the priority.
What input modalities does Gemma 4 26B A4B IT support?
Gemma 4 26B A4B IT accepts text and image inputs. It does not generate images or audio. Use it for text generation, visual understanding, and structured output tasks.
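Image inputs are passed as message content parts in the AI SDK. A minimal sketch, assuming an AI Gateway environment is configured; the image URL is a placeholder:

```typescript
import { generateText } from 'ai'

const { text } = await generateText({
  model: 'google/gemma-4-26b-a4b-it',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this image in one sentence.' },
        // Placeholder URL; a Buffer or base64 string also works here.
        { type: 'image', image: new URL('https://example.com/photo.jpg') },
      ],
    },
  ],
})
```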
What languages does Gemma 4 26B A4B IT support?
Over 140 languages. The instruction-tuning covers multilingual conversational and task-oriented use cases.
How do I use Gemma 4 26B A4B IT on AI Gateway?
Set the model to google/gemma-4-26b-a4b-it in the AI SDK. AI Gateway handles provider routing, retries, and failover automatically.
Does Gemma 4 26B A4B IT support function-calling and structured output?
Yes. It supports function-calling for agentic workflows, structured JSON output, and system instructions natively, sharing these capabilities with the Gemini 3 architecture it is built on.
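Structured output can be requested through the AI SDK's generateObject with a Zod schema. A hedged sketch; the schema and prompt below are illustrative:

```typescript
import { generateObject } from 'ai'
import { z } from 'zod'

const { object } = await generateObject({
  model: 'google/gemma-4-26b-a4b-it',
  schema: z.object({
    city: z.string(),
    temperatureC: z.number(),
  }),
  prompt: 'Extract the city and temperature: "It is 21°C in Lisbon today."',
})
// object is typed as { city: string; temperatureC: number }
```

Function-calling works analogously via the `tools` option on generateText or streamText, with each tool defined by a schema the model fills in when it decides to call it.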