All Models
1,750 models · Page 5 of 49
llama-3-8b-instruct
Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.
gemma-2b-it-lora
This is a Gemma-2B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
kimi-k2.5
Kimi K2.5 is a frontier-scale open-source model with a 256k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.
qwen1.5-0.5b-chat
Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.
bge-m3
Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.
plamo-embedding-1b
PLaMo-Embedding-1B is a Japanese text embedding model developed by Preferred Networks, Inc. It can convert Japanese text input into numerical vectors and can be used for a wide range of applications, including information retrieval, text classification, and clustering.
qwen1.5-7b-chat-awq
Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.
distilbert-sst-2-int8
Distilled BERT model fine-tuned on SST-2 for sentiment classification.
llama-guard-3-8b
Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM: it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.
llama-3.2-1b-instruct
The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
kimi-k2.6
Kimi K2.6 is a frontier-scale open-source 1T parameter model with a 262.1k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.
deepseek-r1-distill-qwen-32b
DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
smart-turn-v2
The second version of an open-source, community-driven, native audio turn detection model.
qwen3-embedding-0.6b
The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks.
gpt-oss-120b
OpenAI's open-weight models are designed for powerful reasoning, agentic tasks, and versatile developer use cases; gpt-oss-120b is for production, general-purpose, high-reasoning use cases.
tinyllama-1.1b-chat-v1.0
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.
Nova Micro
Nova Micro: available via AWS Bedrock (us-east-1).
Mistral Small (24.02)
Mistral Small (24.02): available via AWS Bedrock (us-east-1).
Mistral 7B Instruct
Mistral 7B Instruct: available via AWS Bedrock (us-east-1).
Claude Sonnet 4.5
Claude Sonnet 4.5: available via AWS Bedrock (us-east-1).
Claude 3.5 Haiku
Claude 3.5 Haiku: available via AWS Bedrock (us-east-1).
Claude 3 Haiku
Claude 3 Haiku: available via AWS Bedrock (us-east-1).
Command R+
Command R+: available via AWS Bedrock (us-east-1).
Titan Embeddings G1 - Text
Titan Embeddings G1 - Text: available via AWS Bedrock (us-east-1).
Nova Premier
Nova Premier: available via AWS Bedrock (us-east-1).
Llama 3 70B Instruct
Llama 3 70B Instruct: available via AWS Bedrock (us-east-1).
Llama 4 Maverick 17B Instruct
Llama 4 Maverick 17B Instruct: available via AWS Bedrock (us-east-1).
Llama 3.1 8B Instruct
Llama 3.1 8B Instruct: available via AWS Bedrock (us-east-1).
Command R
Command R: available via AWS Bedrock (us-east-1).
Mixtral 8x7B Instruct
Mixtral 8x7B Instruct: available via AWS Bedrock (us-east-1).
Titan Text Embeddings V2
Titan Text Embeddings V2: available via AWS Bedrock (us-east-1).
Titan Multimodal Embeddings G1
Titan Multimodal Embeddings G1: available via AWS Bedrock (us-east-1).
Claude 3.7 Sonnet
Claude 3.7 Sonnet: available via AWS Bedrock (us-east-1).
Gemma 3 4B IT
Gemma 3 4B IT: available via AWS Bedrock (us-east-1).
Llama 3.2 1B Instruct
Llama 3.2 1B Instruct: available via AWS Bedrock (us-east-1).
gpt-oss-20b
gpt-oss-20b: available via AWS Bedrock (us-east-1).
