AI Model Catalogue

📏 33k context

⚡ 254 ms p50

qwen/qwen3-32b

groq

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

text, reasoning, cheap

📏 131k context

openai/gpt-oss-safeguard-20b

groq

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust...

mistral-small-2603

Mistral Small 4.

📏 262k context

⚡ 367 ms p50

codestral-2508

Our cutting-edge language model for coding released August 2025.

mistral-embed-2312

Official mistral-embed-2312 Mistral AI model

kimi-k2-thinking

moonshotai

Kimi K2 Thinking is the latest, most capable version of an open-source thinking model.

minimax-m2

minimax-m2 — available to run locally via Ollama on CPU and GPU hardware.

📏 205k context

gemini-3-flash-preview

gemini-3-flash-preview — available to run locally via Ollama on CPU and GPU hardware.

📏 1049k context

qwen3-coder-next

qwen3-coder-next — available to run locally via Ollama on CPU and GPU hardware.

text, code, cheap

📏 262k context

Llama-3.1-70B-Instruct

meta-llama

Open-source Llama-3.1-70B-Instruct model from meta-llama — available for download and self-hosting on Hugging Face.

kimi-k2-thinking

kimi-k2-thinking — available to run locally via Ollama on CPU and GPU hardware.

text, reasoning, cheap

📏 262k context

Llama-3.2-1B-Instruct

meta-llama

Open-source Llama-3.2-1B-Instruct model from meta-llama — available for download and self-hosting on Hugging Face.

text, instruct, cheap

📏 60k context

Llama-3.1-8B-Instruct

meta-llama

Open-source Llama-3.1-8B-Instruct model from meta-llama — available for download and self-hosting on Hugging Face.

Qwen3-30B-A3B-Instruct-2507

Open-source Qwen3-30B-A3B-Instruct-2507 model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-Coder-30B-A3B-Instruct

Open-source Qwen3-Coder-30B-A3B-Instruct model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-30B-A3B

Open-source Qwen3-30B-A3B model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-14B

Open-source Qwen3-14B model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-32B

Open-source Qwen3-32B model from qwen — available for download and self-hosting on Hugging Face.

Qwen3-8B

Open-source Qwen3-8B model from qwen — available for download and self-hosting on Hugging Face.

long-context, instruct, cheap

AI21 Jamba 1.6 Mini

ai21

AI21 Jamba 1.6 Mini is a lightweight Mamba-Transformer hybrid optimized for cost-effective, high-throughput inference with an impressive 256K context window. An excellent choice for document-heavy workloads on a budget.

Input: $0.2000 / 1M tokens

Output: $0.4000 / 1M tokens

📏 256k context

long-context, instruct, cheap

AI21 Jamba 1.6 Large

ai21

AI21 Jamba 1.6 Large uses a hybrid Mamba-Transformer architecture offering low memory footprint and high throughput compared to equivalent Transformer models. Features 256K context at a fraction of the inference cost.

Input: $2.0000 / 1M tokens

Output: $8.0000 / 1M tokens

📏 256k context
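
The per-1M prices listed for the two Jamba 1.6 models can be turned into a rough per-request cost. The sketch below is illustrative only: it reads "/1M" as "per one million tokens", and the function name `estimate_cost` and the example workload (a 200,000-token document summarised into 2,000 tokens) are assumptions, not part of the catalogue.

```python
from decimal import Decimal

# Per-1M-token prices in USD (input, output), copied from the listings above.
PRICES = {
    "AI21 Jamba 1.6 Mini": (Decimal("0.20"), Decimal("0.40")),
    "AI21 Jamba 1.6 Large": (Decimal("2.00"), Decimal("8.00")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-1M-token rates."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens, 2,000 output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 2_000)}")
```

At these rates the hypothetical request costs about $0.04 on Mini and about $0.42 on Large, a tenfold gap that is driven almost entirely by the input price when the document is long.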

Microsoft Phi-4 Mini

microsoft

Microsoft Phi-4 Mini is a 3.8B parameter compact model from Microsoft. Delivers impressive reasoning capabilities for edge and mobile deployment scenarios, with strong performance on math and coding tasks relative to its size.

reasoning, code, instruct

IBM Granite 3.0 2B Instruct

IBM Research

IBM Granite 3.0 2B Instruct is an ultra-compact enterprise model excelling at summarization, extraction, and classification. The smallest model in the Granite family, suitable for edge deployments and constrained environments.

instruct, open-source, cheap

Amazon Nova Lite

amazon

Amazon Nova Lite is a very low-cost multimodal model that can process image, video, and text inputs. Fast and accurate for a wide range of tasks requiring visual and language understanding.

vision, multimodal, cheap

Input: $0.0600 / 1M tokens

Output: $0.2400 / 1M tokens

📏 300k context

Amazon Nova Micro

amazon

Amazon Nova Micro is the fastest and most cost-effective text-only model in the Nova family, optimized for speed and low latency. Ideal for customer service, summarization, and translation at scale.

OpenAI: GPT-3.5 Turbo

openai

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

MythoMax 13B

gryphe

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

ReMM SLERP 13B

undi95

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

Mancer: Weaver (alpha)

mancer

An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.

Mistral: Mistral 7B Instruct v0.1

A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length.

Auto Router

openrouter

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used,...

Mistral: Mixtral 8x7B Instruct

Mixtral 8x7B Instruct is a pretrained generative Sparse Mixture of Experts, by Mistral AI, for chat and instruction use. Incorporates 8 experts (feed-forward networks) for a total of 47 billion...

OpenAI: GPT-3.5 Turbo (older v0613)

openai

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

Anthropic: Claude 3 Haiku

anthropic

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/claude-3-haiku) #multimodal

WizardLM-2 8x22B

microsoft

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models, and it consistently outperforms all existing state-of-the-art open-source models. It is...