DeepSeek · 6 Feb 2025 · 4 min read

DeepSeek-VL2: Advancing Vision-Language Models with Mixture-of-Experts

Discover DeepSeek-VL2, a state-of-the-art vision-language model built on a Mixture-of-Experts (MoE) architecture. Explore its innovations in dynamic tiling, Multi-head Latent Attention (MLA), data construction, training methodology, and benchmark evaluations.