Llama 4 Scout 17B 16E Instruct

Llama 4 Scout 17B 16E Instruct is a natively multimodal Mixture of Experts (MoE) model with a context window of 131.1K tokens, purpose-built for processing entire codebases, multi-document corpora, and extended user activity logs in a single inference call. Your use is subject to Meta's Terms & Privacy Policies.
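
As a rough illustration of what a 131.1K-token window means in practice, the sketch below estimates whether a set of documents would fit in a single call. The 4-characters-per-token ratio is a common rule of thumb and not this model's tokenizer, and `estimateTokens`, `fitsInContext`, and the reserved output budget are hypothetical names and values chosen for the example.

```typescript
// Context window as listed on this page (131.1K tokens), treated as approximate.
const CONTEXT_WINDOW_TOKENS = 131_100

// Assumption: roughly 4 characters per token, a heuristic rather than the real tokenizer.
const CHARS_PER_TOKEN = 4

// Hypothetical helper: crude token estimate based on character count.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Hypothetical helper: true when the estimated prompt size plus a
// reserved output budget stays within the context window.
function fitsInContext(documents: string[], reservedOutputTokens = 2_000): boolean {
  const promptTokens = documents.reduce((sum, doc) => sum + estimateTokens(doc), 0)
  return promptTokens + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS
}

console.log(fitsInContext(['a'.repeat(400_000)]))
```

For real workloads, an exact count from the provider's tokenizer would be more reliable than a character-based estimate.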

Tool Use, Vision (Image)

Use with AI Gateway

TypeScript

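import { streamText } from 'ai'

// Model ID as listed for AI Gateway
const result = streamText({
  model: 'meta/llama-4-scout',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}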

More models by Meta

Model	Context	Latency	Throughput	Input	Output	Cache	Web Search	Release Date
meta/muse-spark-1.1		3.3s	195tps	$1.25/M	$4.25/M	Read: $0.15/M, Write: —	—	07/09/2026
meta/llama-4-maverick	131K	0.2s	179tps	$0.24/M	$0.97/M		—	04/05/2025
meta/llama-3.3-70b	128K	0.2s	179tps	$0.59/M	$0.72/M		—	12/06/2024
meta/llama-3.1-8b	131K	0.1s	158tps	$0.02/M	$0.05/M	Read: $0.03/M, Write: —	—	07/23/2024
meta/llama-3.1-70b	131K	0.3s	135tps	$0.72/M			—	07/23/2024
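
The per-million-token prices listed above translate into a per-request cost with simple arithmetic. The sketch below uses the meta/llama-4-maverick figures ($0.24/M input, $0.97/M output); the `Pricing` type and `estimateCostUSD` are hypothetical helpers for illustration, and cache and web search charges are ignored.

```typescript
// Hypothetical shape for the per-million-token prices listed above.
interface Pricing {
  inputPerMillion: number
  outputPerMillion: number
}

// Figures taken from the meta/llama-4-maverick listing.
const llama4Maverick: Pricing = { inputPerMillion: 0.24, outputPerMillion: 0.97 }

// Cost = (input tokens / 1M) * input price + (output tokens / 1M) * output price.
function estimateCostUSD(inputTokens: number, outputTokens: number, pricing: Pricing): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  )
}

// Example: a 100K-token prompt with a 2K-token reply.
console.log(estimateCostUSD(100_000, 2_000, llama4Maverick).toFixed(5))
```

Listed prices can change, so treat any result as an estimate rather than a billing figure.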
