LFM2 24B

Name: LFM2 24B
Author: Liquid AI

Liquid AI · 24B (2.3B active) · Mixture of Experts

Hybrid MoE with convolution+attention layers — 2.3B active

Hugging Face · Ollama · LM Studio

16.5K downloads · 275 likes · 2025-11 · 32K context

Use Cases

chat · edge · rag

Mixture of Experts

Total experts: 64

Active experts: 4

Active params: 2.3B
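
The expert counts above can be illustrated with a small routing sketch. This is a generic top-k softmax gate, not a confirmed description of LFM2's router: the card gives only the counts (64 total, 4 active), so the function name `top_k_route`, the random logits, and the choice to renormalize over the selected experts are all assumptions made for illustration.

```python
import math
import random

TOTAL_EXPERTS = 64   # "Total experts" from the card
ACTIVE_EXPERTS = 4   # "Active experts" from the card


def top_k_route(router_logits, k=ACTIVE_EXPERTS):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Hypothetical helper: generic top-k gating, assumed for illustration only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i],
                    reverse=True)
    chosen = ranked[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Stand-in router scores for a single token (random, not real model output).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(TOTAL_EXPERTS)]
routes = top_k_route(logits)

for expert, weight in routes:
    print(f"expert {expert:2d}  weight {weight:.3f}")
print(f"{ACTIVE_EXPERTS} of {TOTAL_EXPERTS} experts run for this token")
```

Only the selected experts' feed-forward weights take part in the computation for a given token, which is why the active parameter count (2.3B) is far below the 24B total.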

Quantization Options

Quant	Bits	VRAM	Quality	Status
Q2_K	2	8.2 GB	low	—
Q3_K_M	3	11.3 GB	moderate	—
Q4_K_M	4	12.8 GB	good	—
Q5_K_M	5	15.9 GB	good	—
Q6_K	6	18.9 GB	excellent	—
Q8_0	8	25.1 GB	excellent	—
F16	16	49.7 GB	lossless	—
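
As a rough aid for reading the table, the sketch below picks the highest-bit quantization whose listed VRAM figure fits a given memory budget. The figures are copied from the table; the helper name `best_fit` and the optional `headroom_gb` parameter (memory reserved for context and other processes) are assumptions, and real memory use will vary with context length and runtime.

```python
# (quant name, bits, VRAM in GB) copied from the table above
QUANTS = [
    ("Q2_K", 2, 8.2),
    ("Q3_K_M", 3, 11.3),
    ("Q4_K_M", 4, 12.8),
    ("Q5_K_M", 5, 15.9),
    ("Q6_K", 6, 18.9),
    ("Q8_0", 8, 25.1),
    ("F16", 16, 49.7),
]


def best_fit(budget_gb, headroom_gb=0.0):
    """Return the name of the highest-bit quant that fits, or None.

    Hypothetical helper; headroom_gb is an assumed safety margin.
    """
    fitting = [q for q in QUANTS if q[2] + headroom_gb <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[1])[0]


for budget in (8, 16, 24, 32):
    print(f"{budget} GB budget -> {best_fit(budget)}")
```

Passing a non-zero `headroom_gb` makes the choice more conservative, which may be sensible on machines where the same memory is shared with the operating system.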

About this model

LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient.

Best-in-class efficiency: A 24B MoE model with only about 2.3B active parameters per token, fitting in 32 GB of RAM for deployment on consumer laptops and desktops.
Fast edge inference: 112 tok/s decode on an AMD CPU, 293 tok/s on an H100.
Predictable scaling: Quality improves log-linearly from 350M to 24B total parameters, confirming the LFM2 hybrid architecture scales reliably across nearly two orders of magnitude.
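
To put the decode rates above in practical terms, the following sketch converts tokens per second into an approximate wall-clock time for a response. It uses only the two throughput figures quoted in the card; the helper `decode_seconds` and the 500-token response length are assumptions, and the estimate ignores prompt processing (prefill), which the card does not quantify.

```python
# Decode throughput figures quoted in the card, in tokens per second
DECODE_TOK_PER_S = {"AMD CPU": 112.0, "H100": 293.0}


def decode_seconds(new_tokens, tok_per_s):
    """Approximate time to generate new_tokens at a steady decode rate.

    Hypothetical helper; excludes prefill and any sampling overhead.
    """
    if new_tokens < 0 or tok_per_s <= 0:
        raise ValueError("new_tokens must be >= 0 and tok_per_s must be > 0")
    return new_tokens / tok_per_s


RESPONSE_TOKENS = 500  # assumed response length, for illustration
for device, rate in DECODE_TOK_PER_S.items():
    seconds = decode_seconds(RESPONSE_TOKENS, rate)
    print(f"{device}: about {seconds:.1f} s for {RESPONSE_TOKENS} tokens")
```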