AI Model Catalogue

Qwen3-30B-A3B-Instruct-2507

Open-source Qwen3-30B-A3B-Instruct-2507 model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-Coder-30B-A3B-Instruct

Open-source Qwen3-Coder-30B-A3B-Instruct model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-30B-A3B

Open-source Qwen3-30B-A3B model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-14B

Open-source Qwen3-14B model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-32B

Open-source Qwen3-32B model from qwen — available for download and self-hosting on Hugging Face.

gpt-oss-120b

Open-source gpt-oss-120b model from openai — available for download and self-hosting on Hugging Face.

gpt-oss-20b

Open-source gpt-oss-20b model from openai — available for download and self-hosting on Hugging Face.

code, reasoning, multilingual

Qwen3 235B A22B

alibaba

Qwen3 235B A22B is Alibaba's flagship mixture-of-experts model with 235B total parameters and 22B active per token. It delivers frontier-level performance on coding, reasoning, and multilingual tasks at significantly lower inference cost than a dense model of the same total size, since only the active parameters are used for each token.

long-context, instruct, cheap

AI21 Jamba 1.6 Mini

ai21

AI21 Jamba 1.6 Mini is a lightweight Mamba-Transformer hybrid optimized for cost-effective, high-throughput inference with an impressive 256K context window. An excellent choice for document-heavy workloads on a budget.

Input: $0.2000/1M tokens

Output: $0.4000/1M tokens

📏 256k context
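Prices in this catalogue are quoted per 1M tokens, separately for input and output. As a rough illustration of how a listing translates into a per-request cost, here is a minimal sketch using the AI21 Jamba 1.6 Mini prices above. The function name and the example token counts are hypothetical, and actual billing may differ (for example through minimum charges or rounding).

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    tokens_per_unit = 1_000_000  # catalogue prices are quoted per 1M tokens
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / tokens_per_unit


# Hypothetical workload: a 200,000-token document summarized into 2,000 tokens,
# priced at the listed $0.20 (input) and $0.40 (output) per 1M tokens.
cost = estimate_cost_usd(200_000, 2_000, 0.20, 0.40)
print(f"${cost:.4f}")  # $0.0408
```

The same function works for any other entry that lists input and output prices; only the two price arguments change.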

long-context, instruct, cheap

AI21 Jamba 1.6 Large

ai21

AI21 Jamba 1.6 Large uses a hybrid Mamba-Transformer architecture offering low memory footprint and high throughput compared to equivalent Transformer models. Features 256K context at a fraction of the inference cost.

Input: $2.0000/1M tokens

Output: $8.0000/1M tokens

📏 256k context

Microsoft Phi-4 Mini

microsoft

Microsoft Phi-4 Mini is a compact 3.8B-parameter model. It delivers impressive reasoning capabilities for edge and mobile deployment scenarios, with strong performance on math and coding tasks relative to its size.

reasoning, code, instruct

IBM Granite 3.0 2B Instruct

IBM Research

IBM Granite 3.0 2B Instruct is an ultra-compact enterprise model excelling at summarization, extraction, and classification. The smallest model in the Granite family, suitable for edge deployments and constrained environments.

instruct, open-source, cheap

IBM Granite 3.0 8B Instruct

IBM Research

IBM Granite 3.0 8B Instruct is a lightweight enterprise-grade language model trained on a carefully curated enterprise corpus and optimized for RAG, summarization, classification, and code generation in business contexts.

code, instruct, open-source

vision, multimodal, long-context

Amazon Nova Pro

amazon

Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost across a wide range of tasks. Supports text, image, and video inputs.

Input: $0.8000/1M tokens

Output: $3.2000/1M tokens

📏 300k context

Amazon Nova Lite

amazon

Amazon Nova Lite is a very low-cost multimodal model that can process image, video, and text inputs. Fast and accurate for a wide range of tasks requiring visual and language understanding.

vision, multimodal, cheap

Input: $0.0600/1M tokens

Output: $0.2400/1M tokens

📏 300k context

OpenAI: GPT-4 Turbo (older v1106)

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to April 2023.

text, vision, long-context

Input: $10.0000/1M tokens

Output: $30.0000/1M tokens

📏 128k context

Auto Router

openrouter

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...

OpenAI: GPT-4 Turbo Preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while...

Mistral Large

mistralai

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....

Anthropic: Claude 3 Haiku

anthropic

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal

OpenAI: GPT-4 Turbo

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to December 2023.

⭐Top Rated

OpenAI: GPT-4o

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...

OpenAI: GPT-4o (2024-05-13)

OpenAI: GPT-4o-mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

OpenAI: GPT-4o-mini (2024-07-18)

Mistral: Mistral Nemo

mistralai

A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese,...

text, multilingual, cheap

Input: $0.0200/1M tokens

Output: $0.0400/1M tokens

📏 131k context
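The Mistral Nemo description mentions a 128k token context length, while the listing shows 131k. One plausible reading, assumed here rather than confirmed by the source, is that "128k" counts in units of 1,024 tokens, which gives 131,072. The sketch below shows that arithmetic and a simple budget check; both helper names are hypothetical.

```python
def k_binary(n_k: int) -> int:
    """Convert a 'k' figure to tokens, assuming 1k = 1,024 tokens."""
    return n_k * 1024


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """Check that the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_output_tokens <= context_window


window = k_binary(128)
print(window)                                # 131072
print(fits_context(130_000, 2_000, window))  # False: 132,000 exceeds 131,072
```

Reserving room for the output matters because the context window typically covers both the prompt and the generated tokens.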

Meta: Llama 3.1 70B Instruct

meta-llama

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high-quality dialogue use cases. It has demonstrated strong...

OpenAI: GPT-4o (2024-08-06)

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the `response_format`. Read more [here](https://openai.com/index/introducing-structured-outputs-in-the-api/). GPT-4o ("o" for "omni") is...
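To make the structured-outputs note concrete, here is a minimal sketch of a JSON-schema payload of the kind the description refers to. It only builds and prints the dictionary and makes no API call. The field layout (`type`, `json_schema`, `name`, `strict`, `schema`) follows OpenAI's structured-outputs documentation as recalled here and should be checked against the current API reference; the helper name and the example schema are hypothetical.

```python
import json


def build_response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON schema in a structured-outputs style response format."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,      # label for the schema
            "strict": True,    # ask the model to adhere to the schema exactly
            "schema": schema,
        },
    }


# Hypothetical schema describing one catalogue entry.
model_card_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "context_tokens": {"type": "integer"},
    },
    "required": ["name", "context_tokens"],
    "additionalProperties": False,
}

response_format = build_response_format("model_card", model_card_schema)
print(json.dumps(response_format, indent=2))
```

The resulting dictionary would be passed as the response format parameter of a chat completion request; how it is passed depends on the client library in use.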

Nous: Hermes 3 405B Instruct (free)

nousresearch

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...

Nous: Hermes 3 70B Instruct

nousresearch

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...

Sao10K: Llama 3.1 Euryale 70B v2.2

sao10k

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor to [Euryale L3 70B v2.1](/models/sao10k/l3-euryale-70b).

Cohere: Command R (08-2024)

cohere

command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...

Cohere: Command R+ (08-2024)

cohere

command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latency compared to the previous Command R+ version, while keeping the hardware footprint...

Qwen2.5 72B Instruct

Qwen2.5 72B is part of Qwen2.5, the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and...

Meta: Llama 3.2 11B Vision Instruct

meta-llama

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and...