Qwen3-VL-32B-Instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...
Best for
Image Understanding, Visual Q&A, OCR, Complex Reasoning
Context Window
262K tokens ≈ 583 pages of text
Input Cost
$0.50/1M tokens
Output Cost
$1.50/1M tokens
Latency p50
—
Pricing Details
Standard Pricing
Input (per 1M tokens)
$0.50
Output (per 1M tokens)
$1.50
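As a rough illustration of how the per-token rates above translate into a per-request cost, here is a minimal Python sketch. The helper name `estimate_cost_usd` and the constants are hypothetical and not part of any provider SDK. The sketch assumes billing is strictly linear in token counts at the listed rates, and that you already know the token counts for a request. How images and video frames are converted to input tokens is not stated on this page, so that step is left out.

```python
from decimal import Decimal

# Rates copied from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.50")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request at the listed rates.

    Assumes linear per-token billing with no minimums, tiers, or discounts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 4,000-token reply.
    print(estimate_cost_usd(200_000, 4_000))
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift when summed over many requests.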
Hallucination Score™ (est.)
Community reliability estimate · not official
—
Not yet rated
About this score: Community-estimated based on user reports and publicly available benchmark data (e.g. TruthfulQA). This is not an official score from the model provider. Scores may be inaccurate — always verify with the official leaderboard before making production decisions.
Price History
Not enough historical data yet. Check back after the next pricing sync.
Provider
qwen
