All Models
2,392 models · Page 4 of 67
aura-2-en
Aura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.
qwen1.5-14b-chat-awq
Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization.
openchat-3.5-0106
OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT - a strategy inspired by offline reinforcement learning.
gpt-oss-20b
OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-20b is for lower latency and local or specialized use cases.
stable-diffusion-v1-5-img2img
Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images. Img2img generates a new image from an input image with Stable Diffusion.
embeddinggemma-300m
EmbeddingGemma is a 300M parameter, state-of-the-art for its size, open embedding model from Google, built from Gemma 3 (with T5Gemma initialization) and the same research and technology used to create Gemini models. EmbeddingGemma produces vector representations of text, making it well-suited for search and retrieval tasks, including classification, clustering, and semantic similarity search. This model was trained with data in 100+ spoken languages.
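As a rough illustration of the semantic similarity search mentioned for embedding models, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors, the document IDs, and the `search` helper are all made up for the example; real embeddings would come from the model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def search(query_vec, corpus, top_k=2):
    """Return the IDs of the top_k corpus entries most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical toy embeddings standing in for real model output.
corpus = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]
results = search(query, corpus)
```

In practice the corpus vectors would be computed once and stored in a vector index, with only the query embedded at search time.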
phi-2
Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes on a mixture of Synthetic and Web datasets for NLP and coding.
bart-large-cnn
BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. You can use this model for text summarization.
bge-reranker-base
Unlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. You can get a relevance score by passing a query and a passage to the reranker. The score can be mapped to a float value in [0,1] with a sigmoid function.
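The sigmoid mapping described for the reranker can be sketched as follows. The raw scores and passage names are invented for illustration and are not actual model output; the point is only that the sigmoid squashes any real-valued score into the 0 to 1 range while preserving the ranking order.

```python
import math


def sigmoid(score: float) -> float:
    """Map a raw relevance score to a value between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


# Hypothetical raw reranker scores for three passages against one query.
raw_scores = {"passage_a": 4.2, "passage_b": -1.3, "passage_c": 0.0}

normalized = {name: sigmoid(s) for name, s in raw_scores.items()}
ranked = sorted(normalized, key=normalized.get, reverse=True)
```

Because the sigmoid is monotonic, sorting by the normalized value gives the same order as sorting by the raw score; the normalization mainly helps when applying a fixed relevance threshold.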
whisper-tiny-en
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model which was trained on the task of speech recognition.
whisper-large-v3-turbo
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.
aura-1
Aura is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.
uform-gen2-qwen-500m
UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model was pre-trained on an internal image captioning dataset and fine-tuned on public instruction datasets: SVIT, LVIS, and VQA datasets.
flux-2-dev
FLUX.2 [dev] is an image model from Black Forest Labs where you can generate highly realistic and detailed images, with multi-reference support.
gemma-7b-it-lora
This is a Gemma-7B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
bge-base-en-v1.5
BAAI general embedding (Base) model that transforms any given text into a 768-dimensional vector
gemma-sea-lion-v4-27b-it
SEA-LION stands for Southeast Asian Languages In One Network, a collection of Large Language Models (LLMs) that have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
mistral-small-3.1-24b-instruct
Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
qwen3-30b-a3b-fp8
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.
gemma-4-26b-a4b-it
Gemma 4 is Google's most intelligent family of open models, built from Gemini 3 research to maximize intelligence-per-parameter.
sqlcoder-7b-2
This model is intended to be used by non-technical users to understand data inside their SQL databases.
lucid-origin
Lucid Origin from Leonardo.AI is their most adaptable and prompt-responsive model to date. Whether you're generating images with sharp graphic design, stunning full-HD renders, or highly specific creative direction, it adheres closely to your prompts, renders text with accuracy, and supports a wide array of visual styles and aesthetics, from stylized concept art to crisp product mockups.
granite-4.0-h-micro
Granite 4.0 instruct models deliver strong performance across benchmarks, achieving industry-leading results in key agentic tasks like instruction following and function calling. These efficiencies make the models well-suited for a wide range of use cases like retrieval-augmented generation (RAG), multi-agent workflows, and edge deployments.
llama-3.3-70b-instruct-fp8-fast
Llama 3.3 70B quantized to fp8 precision, optimized to be faster.
dreamshaper-8-lcm
Stable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range.
llama-2-7b-chat-hf-lora
This is a Llama2 base model that Cloudflare dedicated for inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format.
phoenix-1.0
Phoenix 1.0 is a model by Leonardo.Ai that generates images with exceptional prompt adherence and coherent text.
nova-3
Transcribe audio using Deepgram's speech-to-text model
smart-turn-v2
The second version of an open-source, community-driven, native audio turn detection model
plamo-embedding-1b
PLaMo-Embedding-1B is a Japanese text embedding model developed by Preferred Networks, Inc. It can convert Japanese text input into numerical vectors and can be used for a wide range of applications, including information retrieval, text classification, and clustering.
stable-diffusion-xl-lightning
SDXL-Lightning is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
stable-diffusion-xl-base-1.0
Diffusion-based text-to-image generative model by Stability AI. Generates and modifies images based on text prompts.
deepseek-r1-distill-qwen-32b
DeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5. It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
llama-2-7b-chat-int8
Quantized (int8) generative text model with 7 billion parameters from Meta
llama-3.1-8b-instruct-fp8
Llama 3.1 8B quantized to FP8 precision
flux-1-schnell
FLUX.1 [schnell] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.
