Gemma 4 31B IT

Gemma 4 31B IT is Google's open-weight dense model with 31B parameters, all of which are active during inference. Built on the Gemini 3 architecture, it targets higher output quality than its MoE sibling and supports function calling, structured JSON output, native vision, and 140+ languages. Your use is subject to Google's Terms and Privacy Policies.

File Input, Reasoning, Tool Use, Vision (Image)

Use with AI Gateway

TypeScript

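import { streamText } from 'ai'

// Start a streaming request to the model through AI Gateway
const result = streamText({
  model: 'google/gemma-4-31b-it',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}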


More models by Google

Model	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Release Date
google/gemini-3.5-flash-lite	0.6s	367tps	$0.30/M	$2.50/M	$0.03/M	—	$14/K (+1 more) + input costs	07/21/2026
google/gemini-3.6-flash	2.5s	175tps	$1.50/M	$7.50/M	$0.15/M	—	$14/K (+1 more) + input costs	07/21/2026
google/gemini-3.5-flash	2.2s	229tps	$1.50/M	$9/M	$0.15/M	—	$14/K (+1 more) + input costs	05/19/2026
google/gemini-3.1-flash-lite	0.5s	236tps	$0.25/M	$1.50/M	$0.03/M	—	$14/K (+1 more) + input costs	05/07/2026
google/gemini-3-flash	0.7s	183tps	$0.50/M (+1 more)	$3/M (+1 more)	$0.05/M (+1 more)	—	$14/K (+1 more) + input costs	12/17/2025
google/gemini-2.5-flash-lite	0.3s	222tps	$0.10/M	$0.40/M	$0.01/M	—	$35/K (+1 more) + input costs	06/17/2025
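
To make the per-million-token prices listed above concrete, here is a minimal sketch of how a single request's cost could be estimated. The `estimateCostUsd` helper and the `pricing` map are hypothetical names introduced for illustration; the sketch assumes flat input and output rates and ignores cache reads, web search fees, and any additional pricing tiers the listing marks as "+1 more".

```typescript
// Prices in USD per million tokens, copied from the listing above.
// Only two models are included to keep the sketch short.
interface TokenPricing {
  inputPerMillion: number
  outputPerMillion: number
}

const pricing: Record<string, TokenPricing> = {
  'google/gemini-3.5-flash-lite': { inputPerMillion: 0.3, outputPerMillion: 2.5 },
  'google/gemini-2.5-flash-lite': { inputPerMillion: 0.1, outputPerMillion: 0.4 },
}

// Hypothetical helper: estimates cost from token counts alone.
// Assumption: no caching, no web search, no tiered pricing.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model]
  if (!p) {
    throw new Error(`No pricing entry for ${model}`)
  }
  const inputCost = (inputTokens / 1_000_000) * p.inputPerMillion
  const outputCost = (outputTokens / 1_000_000) * p.outputPerMillion
  return inputCost + outputCost
}

// Example: 20,000 input tokens and 2,000 output tokens
const cost = estimateCostUsd('google/gemini-3.5-flash-lite', 20_000, 2_000)
console.log(cost.toFixed(4))
```

Under these assumptions, output tokens tend to dominate the bill for long generations, since every model listed charges several times more per output token than per input token.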
