Gemma 4 31B IT

Gemma 4 31B IT is Google's open-weight dense model with 31B parameters, all active during inference. Built on the Gemini 3 architecture, it targets higher output quality than its MoE sibling, with support for function calling, structured JSON output, native vision, and 140+ languages. Your use is subject to Google's Terms and Privacy Policies.

File Input, Reasoning, Tool Use, Vision (Image)

Use with AI Gateway

TypeScript

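import { streamText } from 'ai'

// The plain "creator/model" string selects the model via AI Gateway.
const result = streamText({
  model: 'google/gemma-4-31b-it',
  prompt: 'Why is the sky blue?'
})

// Print the response incrementally as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}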

About Gemma 4 31B IT

Gemma 4 31B IT is the dense counterpart in Google's Gemma 4 family, released on April 2, 2026 alongside the mixture-of-experts Gemma 4 26B. While both share the Gemini 3 architecture, this model activates all 31B parameters during every forward pass.

The dense design means every parameter contributes to every prediction, whereas the MoE variant uses a routing mechanism to select a subset of parameters for each token. The dense approach is aimed at higher output quality on complex reasoning, generation, and analysis tasks. The tradeoff is more compute per token, which translates to higher latency and cost per request.
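To make the compute tradeoff concrete, here is a rough back-of-the-envelope sketch. It uses the common rule of thumb that a forward pass costs about 2 FLOPs per active parameter per token, which ignores attention cost over long contexts. The function name and the MoE active-parameter count are assumptions for illustration only; the page does not state how many parameters the MoE sibling activates.

```typescript
// Rule of thumb: ~2 FLOPs (one multiply, one add) per active parameter per token.
// This ignores attention-over-context cost, so treat it as a rough lower bound.
function approxForwardFlopsPerToken(activeParams: number): number {
  return 2 * activeParams;
}

// From the text: the dense model activates all 31B parameters on every pass.
const DENSE_ACTIVE_PARAMS = 31e9;

// HYPOTHETICAL value, for illustration only. The active-parameter count
// of the MoE sibling is not given in the text.
const HYPOTHETICAL_MOE_ACTIVE_PARAMS = 4e9;

const denseFlops = approxForwardFlopsPerToken(DENSE_ACTIVE_PARAMS);
const moeFlops = approxForwardFlopsPerToken(HYPOTHETICAL_MOE_ACTIVE_PARAMS);

console.log(`dense: ~${(denseFlops / 1e9).toFixed(0)} GFLOPs per token`);
console.log(`dense / MoE (hypothetical): ${(denseFlops / moeFlops).toFixed(2)}x`);
```

Under these assumptions the per-token cost scales linearly with active parameters, which is why a dense model of similar total size costs more per request than an MoE model that routes each token to a fraction of its weights.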

Gemma 4 31B IT accepts text and image inputs within a context window of 262.1K tokens, supports over 140 languages, and handles function calling, agentic workflows, structured JSON output, and system instructions. The instruction tuning (indicated by the "it" suffix) prepares the model for conversational and task-oriented use out of the box.
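As a minimal sketch of combining a system instruction with text and image input, the snippet below builds a message list in the text/image content-part shape that the AI SDK accepts for user messages. The local types, the helper name buildVisionMessages, and the example URL are assumptions made so the example runs without network access; they are not part of any official API.

```typescript
// Local types that mirror the message shape assumed here (hypothetical names).
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: string }; // http(s) URL or base64 string
type SystemMessage = { role: 'system'; content: string };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pairs a system instruction with one text + image user turn.
function buildVisionMessages(
  system: string,
  question: string,
  imageUrl: string
): [SystemMessage, UserMessage] {
  return [
    { role: 'system', content: system },
    {
      role: 'user',
      content: [
        { type: 'text', text: question },
        { type: 'image', image: imageUrl },
      ],
    },
  ];
}

const messages = buildVisionMessages(
  'You are a concise assistant.',
  'What is in this picture?',
  'https://example.com/cat.png' // placeholder URL
);

console.log(JSON.stringify(messages, null, 2));
```

The resulting array would then be passed as the messages option in place of prompt, for example streamText({ model: 'google/gemma-4-31b-it', messages }).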

Running Gemma 4 31B IT through AI Gateway provides unified billing, observability, automatic retries, and provider failover across a single API surface.
