All Models
1,316 models · Page 20 of 37
llada2.1-flash
The smartest diffusion language model, with up to ~800+ tps
seedream-4.5
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
depth-anything-v3-metric-pano
Monocular metric depth estimation for panoramic images
dotted-waveform-visualizer
Create a dotted waveform video from an audio file
op-replay-clipper-beta
Beta/RFC version of https://replicate.com/nelsonjchen/op-replay-clipper
platmoji-2.0
This is Platmoji 2, trained further to mimic emojis very closely. (For realism in emojis, use Platmoji 1.)
yoloe-11s
Ultralytics YOLOE-L Real-Time Seeing Anything model with 26.2M parameters. Achieves 52.6 mAP50-95 on COCO dataset. Optimized for real-time inference with 6.2 ms speed on T4 GPU.
lyria-3
Generate 30-second music clips from text prompts or images with Lyria 3, Google's music generation model
op-replay-clipper
GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.
llada2.1-mini
The fastest diffusion language model with up to ~1000+ tps
zozu
flux-2-max
The highest fidelity image model from Black Forest Labs
lofi
Lo-fi hip-hop music generation with ACE-Step 1.5 + LoRA
ffhqdat-4x-upscaler
4x face image upscaler trained on FFHQ dataset using DAT (Dual Aggregation Transformer) architecture. Optimized for portrait and face photos.
music-2.6
Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics
q3-turbo
Fast video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.
sdxl-cheetah
kling-v2.6-motion-control
Enables precise control of character actions and expressions from a reference image.
veo-3.1-lite
Google's cost-efficient video generation model with native audio, optimized for high-volume applications
veo-3.1
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
q3-pro
High-fidelity video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.
wan-2.7-image-pro
Generate and edit high-quality images with Alibaba's Wan 2.7 Pro with 4K output, thinking mode, text-to-image, multi-image editing, and image set generation
bgogo-feno
BiRefNet + Cutie video segmentation with stacked output video
product-photo-studio
Generate professional e-commerce product photos from a single image. Automatically removes background, creates realistic studio scenes, and adds natural shadows.
ernie-image-turbo
ERNIE-Image is an open text-to-image generation model developed by the ERNIE-Image team at Baidu
medibot
This chatbot is designed to answer medical-related questions and has been fine-tuned on a large dataset. It is built on the Qwen 2.5 3B Instruct base model.
p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
veo-3.1-fast
New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support
android-dream-v4
A custom Flux LoRA model trained on painterly illustrated poster art inspired by Blade Runner 2049. The style features atmospheric cyberpunk cityscapes with dramatic scale: tiny silhouetted figures dwarfed by massive holographic projections and towering
video-super-resolution-rife-pro
Super video quality enhancement featuring fast upscaling with TensorRT and frame interpolation with RIFE.
logo-marks-v1
geocalib
GeoCalib (ECCV 2024): Single-image camera calibration. Estimates focal length, FoV, distortion, roll and pitch from one image using a deep net + Levenberg-Marquardt optimizer. Works on both outdoor and indoor scenes.
qwen3-tts
A unified Text-to-Speech demo featuring three powerful modes: Voice, Clone and Design
yolov8s-worldv2
Ultralytics YOLOv8s worldv2 Real-Time Open-Vocabulary Object Detection model with 12.7M parameters. Achieves 37.7 mAP50-95 on COCO dataset. Optimized for real-time inference
imagen-4-ultra
Use this ultra version of Imagen 4 when quality matters more than speed and cost
qwen3guard-gen-4b
A 4B-parameter safety and content moderation model that classifies user prompts and assistant responses as Safe, Unsafe, or Controversial with fine-grained category labels and refusal detection. Supports 119 languages.
