Qwen3.5 397B VLM (a17b)
Qwen3.5 397B VLM A17B Instruct is Alibaba's flagship multimodal model, using a Mixture-of-Experts architecture with 397 billion total parameters and 17 billion active per token. The VLM (Vision-Language Model) designation indicates that it processes images alongside text for visual understanding and reasoning.
The model handles complex multimodal tasks including detailed image analysis, document understanding with charts and tables, visual question answering, and image-grounded reasoning. It supports high-resolution images and can process multiple images in a single context, enabling comparison and cross-referencing.
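Because the model accepts multiple images in a single context, a request typically packs text and image parts into one user message. Below is a minimal sketch of such a payload in the OpenAI-compatible chat format that many inference gateways expose; the model slug and image URLs are illustrative assumptions, not confirmed values.

```python
# Sketch: assembling a multi-image chat request in the OpenAI-compatible
# message format. The model slug below is hypothetical.

def build_multi_image_request(prompt: str, image_urls: list[str]) -> dict:
    """Build one user message containing a text part plus several image
    parts, so the model can compare and cross-reference images in one
    context."""
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {
        "model": "qwen3.5-397b-vlm-a17b-instruct",  # hypothetical slug
        "messages": [{"role": "user", "content": content}],
    }

payload = build_multi_image_request(
    "Which of these two charts shows the larger year-over-year growth?",
    ["https://example.com/chart_2023.png",
     "https://example.com/chart_2024.png"],
)
print(len(payload["messages"][0]["content"]))  # 1 text part + 2 image parts -> 3
```

Grouping all parts into a single user message (rather than one message per image) is what lets the model reason across images jointly, e.g. for side-by-side chart comparison.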
Qwen3.5 397B VLM represents the frontier of open multimodal AI, designed for applications requiring both visual understanding and text reasoning at the highest quality level.
Providers for Qwen3.5 397B VLM (a17b)
1 route · sorted by uptime

OpenRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.