back

Llama 4 Scout 17B

Llama 4 Community

Meta · 109B (17B active) · Mixture of Experts

MoE with 16 experts, 17B active params

207.4K downloads 1.2K likes 2025-04 128K context

Use Cases

chat vision reasoning

Mixture of Experts

Total experts: 16
Active experts: 1
Active params: 17.0B

Quantization Options

Quant Bits VRAM Quality Status
Q2_K 2 35.4 GB low
Q3_K_M 3 49.4 GB moderate
Q4_K_M 4 56.3 GB good
Q5_K_M 5 70.3 GB good
Q6_K 6 84.2 GB excellent
Q8_0 8 112.2 GB excellent
F16 16 223.8 GB lossless

About this model

image.png

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These two models leverage a mixture-of-experts (MoE) architecture and support native multimodality (image input).

Supported languages: Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese.

Input: multilingual text, image

Output: multilingual text, code

Models

Llama 4 Scout

ollama run llama4:scout

109B parameter MoE model with 17B active parameters

Llama 4 Maverick

ollama run llama4:maverick

400B parameter MoE model with 17B active parameters

Intended Use

Intended Use Cases: Llama 4 is intended for commercial and research use in multiple languages. Instruction tuned models are intended for assistant-like chat and visual reasoning tasks, whereas pretrained models can be adapted for natural language generation. For vision, Llama 4 models are also optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The Llama 4 model collection also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation. The Llama 4 Community License allows for these use cases.

Out-of-scope: Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 4 Community License. Use in languages or capabilities beyond those explicitly referenced as supported in this model card.

Note:

  1. Llama 4 has been trained on a broader collection of languages than the 12 supported languages (pre-training includes 200 total languages). Developers may fine-tune Llama 4 models for languages beyond the 12 supported languages provided they comply with the Llama 4 Community License and the Acceptable Use Policy. Developers are responsible for ensuring that their use of Llama 4 in additional languages is done in a safe and responsible manner.

  2. Llama 4 has been tested for image understanding up to 5 input images. If leveraging additional image understanding capabilities beyond this, Developers are responsible for ensuring that their deployments are mitigated for risks and should perform additional testing and tuning tailored to their specific applications.

Benchmarks

Category Benchmark # Shots Metric Llama 3.3 70B Llama 3.1 405B Llama 4 Scout Llama 4 Maverick
Image Reasoning MMMU 0 accuracy No multimodal support 69.4 73.4
MMMU Pro^ 0 accuracy 52.2 59.6
MathVista 0 accuracy 70.7 73.7
Image Understanding ChartQA 0 relaxed_accuracy 88.8 90.0
DocVQA (test) 0 anls 94.4 94.4
Code LiveCodeBench (10/01/2024-02/01/2025) 0 pass@1 33.3 27.7 32.8 43.4
Reasoning & Knowledge MMLU Pro 0 macro_avg/acc 68.9 73.4 74.3 80.5
GPQA Diamond 0 accuracy 50.5 49.0 57.2 69.8
Multilingual MGSM 0 average/em 91.1 91.6 90.6 92.3
Long Context MTOB (half book) eng->kgv/kgv->eng - chrF Context window is 128K 42.2 / 36.6 54.0 / 46.4
MTOB (full book) eng->kgv/kgv->eng - chrF 39.7 / 36.3 50.8 / 46.7
*reported numbers for MMMU Pro is the average of Standard and Vision tasks

Reference