
Llama 4 Maverick 17B 128E Instruct FP8

Llama 4 Maverick 17B 128E Instruct FP8 is Meta's natively multimodal Mixture of Experts (MoE) model, with 17B active parameters and 128 experts. Published benchmarks cover both image and text tasks. Because only a subset of the experts runs for each token, the model activates a fraction of the parameters that a comparable dense model uses.
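To make the "fraction of the parameters" point concrete, here is a minimal sketch of generic top-k expert routing. It is not Meta's implementation: the router scores, the toy experts, and the names `softmax`, `topK`, and `moeForward` are all hypothetical, and the choice of k = 1 is an assumption made only to keep the example small.

```typescript
// Generic top-k MoE routing sketch (illustrative only, not Llama 4's actual router).
type Expert = (x: number[]) => number[];

// Numerically stable softmax over the router scores.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Indices of the k largest gate probabilities.
function topK(probs: number[], k: number): number[] {
  return probs
    .map((p, i) => [p, i] as const)
    .sort((a, b) => b[0] - a[0])
    .slice(0, k)
    .map(([, i]) => i);
}

// Run only the selected experts and mix their outputs by renormalized gate weight.
function moeForward(
  x: number[],
  routerScores: number[],
  experts: Expert[],
  k: number
): { output: number[]; used: number[] } {
  const probs = softmax(routerScores);
  const used = topK(probs, k);
  const norm = used.reduce((s, i) => s + probs[i], 0);
  const output = new Array<number>(x.length).fill(0);
  for (const i of used) {
    const y = experts[i](x);
    const w = probs[i] / norm;
    for (let d = 0; d < x.length; d++) output[d] += w * y[d];
  }
  return { output, used };
}

// Toy setup: 128 hypothetical experts; expert i scales its input by (i + 1).
const experts: Expert[] = Array.from({ length: 128 }, (_, i) => (x: number[]) =>
  x.map((v) => v * (i + 1))
);
// Hypothetical router scores that strongly favor expert 5.
const routerScores = Array.from({ length: 128 }, (_, i) => (i === 5 ? 10 : 0));
const { output, used } = moeForward([1, 2], routerScores, experts, 1);
console.log(used, output);
```

In this toy run only one of the 128 experts executes for the input, which is the general mechanism by which an MoE model keeps its active parameter count well below its total.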

Tool Use, Vision (Image)
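index.ts
import { streamText } from 'ai'

// Start a streaming text generation request
const result = streamText({
  model: 'meta/llama-4-maverick',
  prompt: 'Why is the sky blue?'
})

// Print each chunk of the response as it arrives
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}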
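Since the model is tagged with Vision (Image), a request can also carry an image alongside the text. The sketch below only builds the message payload, using the user-message shape with `text` and `image` content parts that the AI SDK accepts. The helper name `buildImageQuestion`, the local type aliases, and the example URL are hypothetical.

```typescript
// Local types mirroring the user-message shape with text and image parts.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: URL };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pair a question with an image URL in a single user turn.
function buildImageQuestion(question: string, imageUrl: string): UserMessage[] {
  return [
    {
      role: 'user',
      content: [
        { type: 'text', text: question },
        { type: 'image', image: new URL(imageUrl) },
      ],
    },
  ];
}

// Placeholder URL; substitute a real, reachable image.
const messages = buildImageQuestion(
  'What is in this picture?',
  'https://example.com/sky.jpg'
);
console.log(JSON.stringify(messages, null, 2));
```

Assuming the same setup as index.ts, the array would be passed as `messages` in place of `prompt`, for example `streamText({ model: 'meta/llama-4-maverick', messages })`.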

More models by Meta

| Model | Context | Latency | Throughput | Input | Output | Cache | Web Search | Per Query | Capabilities | Providers | ZDR | No Training | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 131K | 0.2s | 178 tps | $0.17/M | $0.66/M | | | | | bedrock, deepinfra, groq | | | 04/05/2025 |
| | 128K | 0.2s | 141 tps | $0.59/M | $0.72/M | | | | | bedrock, groq | | | 12/06/2024 |
| | 128K | 0.3s | 54 tps | $0.15/M | $0.15/M | | | | | bedrock | | | 09/18/2024 |
| | 128K | 0.2s | 88 tps | $0.10/M | $0.10/M | | | | | bedrock | | | 09/18/2024 |
| | 131K | 0.1s | 41 tps | $0.10/M | $0.10/M | Read: $0.1/M; Write: | | | | bedrock, cerebras, deepinfra, +2 | | | 07/23/2024 |
| | 131K | 0.4s | 31 tps | $0.72/M | $0.72/M | | | | | bedrock, deepinfra | | | 07/23/2024 |
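The Input and Output columns are priced per million tokens, so the cost of a request follows from simple arithmetic. This is a rough sketch, assuming input and output tokens are billed separately at the listed rates and ignoring cache pricing. The names `Pricing` and `estimateCostUSD` are hypothetical, and the rates plugged in are those of the first table row.

```typescript
// Per-million-token prices in USD, as listed in the Input and Output columns.
interface Pricing {
  inputPerM: number;
  outputPerM: number;
}

// Hypothetical helper: cost = tokens / 1,000,000 * price per million, summed over both directions.
function estimateCostUSD(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerM;
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerM;
  return inputCost + outputCost;
}

// Rates from the first row of the table: $0.17/M input, $0.66/M output.
const firstRow: Pricing = { inputPerM: 0.17, outputPerM: 0.66 };

// Example request: 2,000 prompt tokens and 500 generated tokens.
const cost = estimateCostUSD(2_000, 500, firstRow);
console.log(`Estimated cost: $${cost.toFixed(6)}`);
```

Output tokens cost more than input tokens in that row, so long generations dominate the bill even when prompts are several times larger.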