AI Model Catalogue

DeepSeek R1 Distill Qwen 14B

Cogito v2.1 671B

Qwen3 Coder 30B A3B Instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B-parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...

Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems: math proofs, code synthesis/debugging, logic, and agentic...

Cogito V1 Preview Qwen 14B

DeepSeek R1 Distill Llama 70B

DeepSeek R1 Distill Qwen 1.5B

deepseek-ai

text, cheap, long-context

Qwen3-VL-8B-Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

Meta Llama 3.1 70B Instruct Turbo

Cogito V1 Preview Llama 70B

DeepSeek R1 Distill Qwen 7B

Holo3 35B A3B

Llama Guard 4 12B

meta-llama

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

Qwen3-VL-32B-Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

Cogito V1 Preview Qwen 32B

Cogito V1 Preview Llama 70B Turbo

Cogito V1 Preview Llama 8B

GLM 5 FP4

Qwen2.5 32B

GLM 4.7 FP8

zai-org

text, cheap, long-context

Facebook CWM

meta-llama/Llama-Guard-4-12B

Qwen/Qwen3-Max

text, reasoning, multilingual

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...

Input: Free

Context: 262k

Latency (p50): 78 ms

google/gemma-3-4b-it

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

google/gemma-3-12b-it

google/gemma-4-26B-A4B-it

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference, delivering near-31B quality at...

Qwen/Qwen3-Next-80B-A3B-Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...

openai/gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...

Qwen/Qwen3-VL-30B-A3B-Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...