Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.

textvisioninstruct

Input$0.3500/1M

Output$0.5600/1M

📏128kcontext

Explore specs and pricingView details →

llama-3.2-11b-vision-instruct

meta

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

Explore specs and pricingView details →

bge-base-en-v1.5

baai

BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector

textcheaplong-context

Input$0.0670/1M

Output$0.0000/1M

📏154kcontext

Explore specs and pricingView details →

gemma-sea-lion-v4-27b-it

aisingapore

SEA-LION stands for Southeast Asian Languages In One Network, which is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.

Explore specs and pricingView details →

gemma-4-26b-a4b-it

google

Gemma 4 is Google's most intelligent family of open models, built from Gemini 3 research to maximize intelligence-per-parameter.

textcheaplong-context

Input$0.1000/1M

Output$0.3000/1M

📏256kcontext

Explore specs and pricingView details →

gpt-oss-20b

openai

OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases – gpt-oss-20b is for lower latency, and local or specialized use-cases.

Explore specs and pricingView details →

llama-4-scout-17b-16e-instruct

meta

Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

Explore specs and pricingView details →

kimi-k2.5

moonshotai

Kimi K2.5 is a frontier-scale open-source model with a 256k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.

Explore specs and pricingView details →

llama-guard-3-8b

meta

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.

textcheaplong-context

Input$0.4800/1M

Output$0.0300/1M

📏131kcontext

Explore specs and pricingView details →

glm-4.7-flash

zai-org

GLM-4.7-Flash is a fast and efficient multilingual text generation model with a 131,072 token context window. Optimized for dialogue, instruction-following, and multi-turn tool calling across 100+ languages.

textmultilingualcheap

Input$0.0600/1M

Output$0.4000/1M

📏131kcontext

Explore specs and pricingView details →

granite-4.0-h-micro

ibm-granite

Granite 4.0 instruct models deliver strong performance across benchmarks, achieving industry-leading results in key agentic tasks like instruction following and function calling. These efficiencies make the models well-suited for a wide range of use cases like retrieval-augmented generation (RAG), multi-agent workflows, and edge deployments.

Explore specs and pricingView details →

nemotron-3-120b-a12b

nvidia

NVIDIA Nemotron 3 Super is a hybrid MoE model with leading accuracy for multi-agent applications and specialized agentic AI systems.

Explore specs and pricingView details →

kimi-k2.6

moonshotai

Kimi K2.6 is a frontier-scale open-source 1T parameter model with a 262.1k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.

Explore specs and pricingView details →

gpt-oss-120b

openai

OpenAI’s open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases – gpt-oss-120b is for production, general purpose, high reasoning use-cases.

Explore specs and pricingView details →