DeepSeek · 685B (37B active) · Mixture of Experts
State-of-the-art MoE — 37B active params
Use Cases
Mixture of Experts
| Quant | Bits | VRAM | Quality |
|---|---|---|---|
| Q2_K | 2 | 219.8 GB | low |
| Q3_K_M | 3 | 307.5 GB | moderate |
| Q4_K_M | 4 | 351.4 GB | good |
| Q5_K_M | 5 | 439.1 GB | good |
| Q6_K | 6 | 526.8 GB | excellent |
| Q8_0 | 8 | 702.3 GB | excellent |
| F16 | 16 | 1404 GB | lossless |
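As a rough sanity check on the VRAM column, the sketch below computes a lower bound for weight storage alone: total parameters times bits per weight, divided by 8. It assumes 685B total parameters (from the header) and the nominal bit widths listed in the table; the helper name `weight_memory_gb` is illustrative, not part of any library. The bound ignores KV cache, activations, and runtime overhead. The table's figures are somewhat higher than this bound, and the source does not say how they were derived.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Lower-bound estimate of weight storage in GB (1 GB = 10**9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no per-block scales, KV cache, or runtime overhead.
    """
    return params_billion * bits_per_weight / 8


TOTAL_PARAMS_B = 685  # total (not active) parameters, in billions

# Nominal bit widths taken from the table above.
for name, bits in [("Q4_K_M", 4), ("Q8_0", 8), ("F16", 16)]:
    print(f"{name}: at least {weight_memory_gb(TOTAL_PARAMS_B, bits):.1f} GB")
```

The estimate uses the 685B total rather than the 37B active parameters because, assuming all experts are kept resident, a Mixture of Experts model still has to hold every expert's weights in memory even though only a subset runs per token.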
About this model
DeepSeek-V3.2 combines high computational efficiency with superior reasoning and agent performance. According to the DeepSeek team, the approach builds on three key technical breakthroughs:
DeepSeek Sparse Attention (DSA): an efficient attention mechanism that substantially reduces computational complexity while preserving model performance, specifically optimized for long-context scenarios.
Scalable Reinforcement Learning Framework: a robust RL protocol combined with scaled-up post-training compute, with which DeepSeek-V3.2 reportedly performs comparably to GPT-5.
Large-Scale Agentic Task Synthesis Pipeline: a novel synthesis pipeline, developed by the DeepSeek team to integrate reasoning into tool-use scenarios, that systematically generates training data at scale. This facilitates scalable agentic post-training, improving compliance and generalization in complex interactive environments.
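To make the idea of sparse attention in the first item more concrete, here is a minimal, generic top-k sketch in plain Python. It is an assumption-laden illustration, not DSA itself: the text above does not describe how DSA chooses which tokens to attend to, and the function name `topk_sparse_attention` and the parameter `k` are hypothetical. The sketch only shows the general principle that each query can attend to a small subset of keys instead of all of them.

```python
import math


def topk_sparse_attention(query, keys, values, k):
    """Single-query attention restricted to the k highest-scoring keys.

    Generic illustration only; not the DSA algorithm.
    """
    scale = 1.0 / math.sqrt(len(query))
    # Scaled dot-product score between the query and every key.
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]
    # Keep the indices of the k best keys; all other positions are ignored.
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the kept scores only (subtract the max for stability).
    top = max(scores[i] for i in keep)
    weights = [math.exp(scores[i] - top) for i in keep]
    total = sum(weights)
    dim = len(values[0])
    # Weighted average of the selected value vectors.
    return [
        sum(w * values[i][d] for w, i in zip(weights, keep)) / total
        for d in range(dim)
    ]


keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# With k=1 the output is exactly the value of the best-matching key.
print(topk_sparse_attention([1.0, 0.0], keys, values, k=1))
```

Note that this sketch still scores every key before selecting, so it saves no compute on its own. A practical sparse attention design needs a cheaper selection step, and the details of how DeepSeek-V3.2 handles that are in the referenced paper rather than in this summary.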
Reference
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models