All Models
2,392 models · Page 5 of 67
mistral-7b-instruct-v0.2-lora
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruction fine-tuned version of Mistral-7B-v0.2.
whisper
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
flux
Flux is the first conversational speech recognition model built specifically for voice agents.
llama-2-7b-chat-fp16
Unquantized (fp16) generative text model with 7 billion parameters from Meta.
mistral-7b-instruct-v0.1
Instruction fine-tuned version of the Mistral-7B generative text model with 7 billion parameters.
melotts
MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai.
plamo-embedding-1b
PLaMo-Embedding-1B is a Japanese text embedding model developed by Preferred Networks, Inc. It can convert Japanese text input into numerical vectors and can be used for a wide range of applications, including information retrieval, text classification, and clustering.
flux-1-schnell
FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.
phoenix-1.0
Phoenix 1.0 is a model by Leonardo.Ai that generates images with exceptional prompt adherence and coherent text.
stable-diffusion-v1-5-inpainting
Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
qwen1.5-7b-chat-awq
Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.
llama-3.2-3b-instruct
The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
nova-3
Transcribe audio using Deepgram's speech-to-text model.
llama-3-8b-instruct
Generation over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.
flux-2-klein-9b
FLUX.2 [klein] 9B is a 9 billion parameter model that can generate images from text descriptions and supports multi-reference editing capabilities.
kimi-k2.5
Kimi K2.5 is a frontier-scale open-source model with a 256k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.
llama-guard-3-8b
Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM: it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.
qwen1.5-0.5b-chat
Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud.
bge-m3
Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.
gpt-oss-120b
OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases; gpt-oss-120b is for production, general-purpose, high-reasoning use cases.
gemma-2b-it-lora
This is a Gemma-2B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
tinyllama-1.1b-chat-v1.0
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. This is the chat model finetuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.
deepseek-r1-distill-qwen-32b
DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
stable-diffusion-xl-base-1.0
Diffusion-based text-to-image generative model by Stability AI. Generates and modifies images based on text prompts.
m2m100-1.2b
Multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation
distilbert-sst-2-int8
Distilled BERT model that was finetuned on SST-2 for sentiment classification
nemotron-3-120b-a12b
NVIDIA Nemotron 3 Super is a hybrid MoE model with leading accuracy for multi-agent applications and specialized agentic AI systems.
qwen2.5-coder-32b-instruct
Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder covers six mainstream model sizes (0.5, 1.5, 3, 7, 14, and 32 billion parameters) to meet the needs of different developers.
smart-turn-v2
The second version of an open-source, community-driven, native audio turn detection model.
deepseek-math-7b-instruct
DeepSeekMath-Instruct 7B is an instruction-tuned mathematical model derived from DeepSeekMath-Base 7B. DeepSeekMath is initialized with DeepSeek-Coder-v1.5 7B and continues pre-training on math-related tokens sourced from Common Crawl, together with natural language and code data, for 500B tokens.
indictrans2-en-indic-1B
IndicTrans2 is the first open-source transformer-based multilingual NMT model that supports high-quality translations across all 22 scheduled Indic languages.
flux-2-klein-4b
FLUX.2 [klein] is an ultra-fast, distilled image model. It unifies image generation and editing in a single model, delivering state-of-the-art quality and enabling interactive workflows, real-time previews, and latency-critical applications.
qwen3-embedding-0.6b
The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks.
bge-small-en-v1.5
BAAI general embedding (Small) model that transforms any given text into a 384-dimensional vector
falcon-7b-instruct
Falcon-7B-Instruct is a 7B-parameter causal decoder-only model built by TII, based on Falcon-7B and fine-tuned on a mixture of chat/instruct datasets.
llama-3.2-1b-instruct
The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.
