🧠 Reasoning
174 models · Page 2 of 5
gpt-oss-20b-fast
Advanced 20B open-weight reasoning model to customize for any use case and run anywhere.
gemini-3.1-pro
Google's most intelligent model, with improved reasoning and a new medium thinking level
deepseek-r1
A reasoning model trained with reinforcement learning, on par with OpenAI o1
grok-4
Grok 4 is xAI's most advanced reasoning model. Excels at logical thinking and in-depth analysis. Ideal for insightful discussions and complex problem-solving.
gpt-5.4
OpenAI's most capable frontier model for complex professional work, coding, and multi-step reasoning.
kimi-k2-thinking
Kimi K2 Thinking is the latest, most capable version of an open-source thinking model.
seedream-5-lite
Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge
wan-2.7-image-pro
Generate and edit high-quality images with Alibaba's Wan 2.7 Pro with 4K output, thinking mode, text-to-image, multi-image editing, and image set generation
kimi-k2-thinking
kimi-k2-thinking, available to run locally via Ollama on CPU and GPU hardware.
HyperCLOVAX-SEED-Think-14B-GPTQ
Open-source HyperCLOVAX-SEED-Think-14B-GPTQ model from k-compression, available for download and self-hosting on Hugging Face.
Qwen3-4B-Thinking-2507
Open-source Qwen3-4B-Thinking-2507 model from qwen, available for download and self-hosting on Hugging Face.
Qwen3 235B A22B
Qwen3 235B A22B is Alibaba's flagship mixture-of-experts model with 235B total parameters and 22B active per token. Delivers frontier-level performance on coding, reasoning, and multilingual tasks at significantly lower inference cost.
Falcon 180B
Falcon 180B is one of the largest openly available language models, trained on 3.5 trillion tokens with TII's custom RefinedWeb dataset. Excels at reasoning, summarization, and generation tasks at state-of-the-art quality for open models.
Databricks DBRX Instruct
DBRX Instruct is an open, general-purpose LLM from Databricks. Built with a fine-grained mixture-of-experts (MoE) architecture, it was the most capable open LLM at launch and excels at code, math, and language tasks.
NVIDIA Nemotron-4 340B Instruct
NVIDIA Nemotron-4 340B Instruct is a large open language model trained to generate diverse synthetic data for training other LLMs. Strong at following instructions, classification, and generating reward model training data.
Microsoft Phi-4 Mini
Microsoft Phi-4 Mini is a 3.8B parameter compact model from Microsoft. Delivers impressive reasoning capabilities for edge and mobile deployment scenarios, with strong performance on math and coding tasks relative to its size.
OpenAI: GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning...
Mistral Large
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....
Nous: Hermes 3 405B Instruct (free)
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...
Nous: Hermes 3 70B Instruct
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...
Cohere: Command R (08-2024)
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, it is better at math, code and reasoning and...
Meta: Llama 3.2 3B Instruct (free)
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization. Designed with the latest transformer architecture, it...
Qwen2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significant improvements in **code generation**, **code reasoning**...
Mistral Large 2407
This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/)....
Cohere: Command R7B (12-2024)
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning...
OpenAI: o1
The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason...
Microsoft: Phi 4
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion...
DeepSeek: R1
DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....
OpenAI: o3 Mini
OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to...
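The `reasoning_effort` parameter described above is set per request. A minimal sketch, assuming the OpenAI Python SDK; the request payload is only constructed here, not sent, so no API key or network access is required:

```python
# Build a chat completion request for o3-mini with reasoning_effort set.
# "reasoning_effort" accepts "low", "medium", or "high"; higher values
# trade latency and cost for more thorough reasoning.
payload = {
    "model": "o3-mini",
    "reasoning_effort": "high",
    "messages": [
        {"role": "user", "content": "Prove that the square root of 2 is irrational."},
    ],
}

# With a configured client, this payload would be sent as:
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**payload)
print(payload["model"], payload["reasoning_effort"])
```

Lowering `reasoning_effort` to `"low"` on the same payload is a common way to cut latency for simpler queries without switching models.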
AionLabs: Aion-1.0-Mini
Aion-1.0-Mini is a 32B-parameter model distilled from DeepSeek-R1, designed for strong performance in reasoning domains such as mathematics, coding, and logic. It is a modified variant...
AionLabs: Aion-1.0
Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It is built on DeepSeek-R1, augmented with additional models and techniques such as Tree...
OpenAI: o3 Mini High
OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and...
Anthropic: Claude 3.7 Sonnet
Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and...
Qwen: QwQ 32B
QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks,...
Perplexity: Sonar Deep Research
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers...
Perplexity: Sonar Pro
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro) For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth, multi-step queries with added extensibility, like...
