Qwen/Qwen3-VL-30B-A3B-Instruct
deepinfra
Qwen3-VL-30B-A3B-Instruct is a multimodal model that combines strong text generation with visual understanding of images and videos. The Instruct variant is tuned for instruction following on general multimodal tasks. It excels in perception...
- Context window: 131,072 tokens
- Input cost: not listed
- Output cost: not listed
- Latency (p50): not listed
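
As a rough illustration of what the 131,072-token context window means in practice, the sketch below checks whether a request fits. It assumes the window is shared between input tokens (text plus any image or video tokens) and generated output tokens, which the listing does not state. The helper names `fits_context` and `max_output_budget` are hypothetical, and the token counts are expected to come from whatever tokenizer or usage report you have available.

```python
# Context window taken from the listing; everything else is an assumption.
CONTEXT_WINDOW = 131_072  # tokens


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output fits in the window.

    Assumes input and output tokens share one window (not confirmed by the listing).
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


def max_output_budget(prompt_tokens: int,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many output tokens remain after the prompt, never below zero."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, context_window - prompt_tokens)
```

For example, under this assumption a 100,000-token prompt would leave room for at most 31,072 output tokens.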
