modelstop.top

Compare Models

Run side-by-side checks for pricing, context window, and latency.

llama-3.2-11b-vision-instruct

Meta

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

Context window
128,000 tokens
Input cost
$0.05 / 1M tokens
Output cost
$0.68 / 1M tokens
Latency (p50)