modelstop.top

AI Model Catalogue

Browse 352 models across providers, modalities, and use cases.

🌐 All Models

352 models · Page 1 of 10

mistral-small-3.1-24b-instruct

mistralai

Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.

text · vision · instruct
128,000 ctx · $0.35/1M in

llama-3-8b-instruct-awq

meta

Quantized (int4) generative text model with 8 billion parameters from Meta.

text · instruct · cheap
8,192 ctx · $0.12/1M in

qwen3-30b-a3b-fp8

qwen

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

text · reasoning · agents
32,768 ctx · $0.05/1M in

gemma-sea-lion-v4-27b-it

aisingapore

SEA-LION (Southeast Asian Languages In One Network) is a collection of large language models (LLMs) that have been pretrained and instruction-tuned for the Southeast Asia (SEA) region.

text · instruct · cheap
128,000 ctx · $0.35/1M in

bge-base-en-v1.5

baai

BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector.

text · cheap · long-context
153,600 ctx · $0.07/1M in

gemma-4-26b-a4b-it

google

Gemma 4 is Google's most intelligent family of open models, built from Gemini 3 research to maximize intelligence-per-parameter.

text · cheap · long-context
256,000 ctx · $0.10/1M in

gemma-3-12b-it

google

Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Gemma 3 models are multimodal, handling text and image input and generating text output, with a large 128K context window and multilingual support in over 140 languages, and are available in more sizes than previous versions.

text · vision · reasoning
80,000 ctx · $0.35/1M in

gpt-oss-20b

openai

OpenAI's open-weight models are designed for powerful reasoning, agentic tasks, and versatile developer use cases. The gpt-oss-20b variant is for lower latency and for local or specialized use cases.

text · reasoning · agents
128,000 ctx · $0.20/1M in

bge-reranker-base

baai

Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score rather than an embedding. You can get a relevance score by passing a query and a passage to the reranker, and the score can be mapped to a float value in [0,1] with a sigmoid function.

text · cheap
ctx not listed · $0.00/1M in
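
The sigmoid mapping mentioned in the bge-reranker-base description can be sketched as follows. This is only an illustration: the helper name `to_unit_interval` and the sample raw scores are hypothetical, and the snippet does not call the model itself.

```python
import math

def to_unit_interval(raw_score: float) -> float:
    """Map a raw reranker score (any real number) into [0, 1] with a sigmoid."""
    # Numerically stable form: never call exp() on a large positive argument.
    if raw_score >= 0:
        return 1.0 / (1.0 + math.exp(-raw_score))
    z = math.exp(raw_score)
    return z / (1.0 + z)

# Hypothetical raw scores for three (query, passage) pairs.
raw_scores = [-4.2, 0.0, 3.1]
normalized = [to_unit_interval(s) for s in raw_scores]
print(normalized)
```

Because the sigmoid is monotonic, the ranking of passages is unchanged; only the scale of the scores differs, which can make it easier to apply a fixed relevance threshold.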

llama-4-scout-17b-16e-instruct

meta

Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

text · vision · instruct
131,000 ctx · $0.27/1M in

llama-3.2-11b-vision-instruct

meta

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

text · vision · reasoning
128,000 ctx · $0.05/1M in

llama-3.1-8b-instruct-awq

meta

Quantized (int4) generative text model with 8 billion parameters from Meta.

text · instruct · cheap
8,192 ctx · $0.12/1M in

qwq-32b

qwen

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.

text · reasoning · cheap
24,000 ctx · $0.66/1M in

bge-large-en-v1.5

baai

BAAI general embedding (Large) model that transforms any given text into a 1024-dimensional vector.

text · cheap
ctx not listed · $0.20/1M in

nemotron-3-120b-a12b

nvidia

NVIDIA Nemotron 3 Super is a hybrid MoE model with leading accuracy for multi-agent applications and specialized agentic AI systems.

text · agents · cheap
256,000 ctx · $0.50/1M in

bge-small-en-v1.5

baai

BAAI general embedding (Small) model that transforms any given text into a 384-dimensional vector.

text · cheap
ctx not listed · $0.02/1M in

plamo-embedding-1b

pfnet

PLaMo-Embedding-1B is a Japanese text embedding model developed by Preferred Networks, Inc. It can convert Japanese text input into numerical vectors and can be used for a wide range of applications, including information retrieval, text classification, and clustering.

text · cheap
ctx not listed · $0.02/1M in

granite-4.0-h-micro

ibm-granite

Granite 4.0 instruct models deliver strong performance across benchmarks, achieving industry-leading results in key agentic tasks like instruction following and function calling. These efficiencies make the models well-suited for a wide range of use cases like retrieval-augmented generation (RAG), multi-agent workflows, and edge deployments.

text · agents · instruct
131,000 ctx · $0.02/1M in

indictrans2-en-indic-1B

ai4bharat

IndicTrans2 is the first open-source transformer-based multilingual NMT model that supports high-quality translations across all 22 scheduled Indic languages.

text · multilingual · cheap
ctx not listed · $0.34/1M in

kimi-k2.6

moonshotai

Kimi K2.6 is a frontier-scale open-source 1T parameter model with a 262.1k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.

text · vision · agents
262,144 ctx · $0.95/1M in

mistral-7b-instruct-v0.1

mistral

Instruction-tuned version of the Mistral-7B generative text model with 7 billion parameters.

text · instruct · cheap
2,824 ctx · $0.11/1M in

llama-2-7b-chat-fp16

meta

Unquantized half-precision (fp16) generative text model with 7 billion parameters from Meta.

text · cheap
4,096 ctx · $0.56/1M in

llama-3.1-8b-instruct-fp8

meta

Llama 3.1 8B quantized to FP8 precision.

text · instruct · cheap
32,000 ctx · $0.15/1M in

gpt-oss-120b

openai

OpenAI's open-weight models are designed for powerful reasoning, agentic tasks, and versatile developer use cases. The gpt-oss-120b variant is for production, general-purpose, high-reasoning use cases.

text · reasoning · agents
128,000 ctx · $0.35/1M in

qwen3-embedding-0.6b

qwen

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks.

text · cheap
8,192 ctx · $0.01/1M in

m2m100-1.2b

meta

Multilingual encoder-decoder (seq-to-seq) model trained for many-to-many multilingual translation.

text · multilingual · cheap
ctx not listed · $0.34/1M in

deepseek-r1-distill-qwen-32b

deepseek-ai

DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

text · cheap
80,000 ctx · $0.50/1M in

glm-4.7-flash

zai-org

GLM-4.7-Flash is a fast and efficient multilingual text generation model with a 131,072-token context window. It is optimized for dialogue, instruction-following, and multi-turn tool calling across 100+ languages.

text · multilingual · cheap
131,072 ctx · $0.06/1M in

qwen2.5-coder-32b-instruct

qwen

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

text · code · instruct
32,768 ctx · $0.66/1M in

llama-3.2-3b-instruct

meta

The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.

text · agents · multilingual
80,000 ctx · $0.05/1M in

kimi-k2.5

moonshotai

Kimi K2.5 is a frontier-scale open-source model with a 256k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.

text · vision · agents
256,000 ctx · $0.60/1M in

bge-m3

baai

Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.

text · cheap
60,000 ctx · $0.01/1M in

distilbert-sst-2-int8

huggingface

Distilled BERT model fine-tuned on SST-2 for sentiment classification.

text · cheap
ctx not listed · $0.03/1M in

llama-3-8b-instruct

meta

Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.

text · reasoning · instruct
7,968 ctx · $0.28/1M in

llama-guard-3-8b

meta

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM: it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.

text · cheap · long-context
131,072 ctx · $0.48/1M in

llama-3.3-70b-instruct-fp8-fast

meta

Llama 3.3 70B quantized to fp8 precision, optimized to be faster.

text · instruct · cheap
24,000 ctx · $0.29/1M in