modelstop.top

AI Model Catalogue

Browse 1,316 models across providers, modalities, and use cases.

All Models

1,316 models · Page 20 of 37

llada2.1-flash

prunaai

The smartest diffusion language model, with up to ~800+ tps

text · image · free
ctx · Free in

seedream-4.5

bytedance

Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge

vision · free
ctx · Free in

depth-anything-v3-metric-pano

vufinder

Monocular metric depth estimation for panoramic images

vision · free
ctx · Free in

dotted-waveform-visualizer

lucataco

Create a dotted waveform video from an audio file

audio · free
ctx · Free in

op-replay-clipper-beta

nelsonjchen

Beta/RFC version of https://replicate.com/nelsonjchen/op-replay-clipper

text · free
ctx · Free in

platmoji-2.0

appmeloncreator

This is Platmoji 2, trained further to mimic emojis as closely as possible. (For realism in emojis, go to Platmoji 1.)

text · free
ctx · Free in

yoloe-11s

ultralytics

Ultralytics YOLOE-L Real-Time Seeing Anything model with 26.2M parameters. Achieves 52.6 mAP50-95 on the COCO dataset. Optimized for real-time inference, at 6.2 ms on a T4 GPU.

text · free
ctx · Free in

lyria-3

google

Generate 30-second music clips from text prompts or images with Lyria 3, Google's music generation model

vision · image · free
ctx · Free in

op-replay-clipper

nelsonjchen

GPU-accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.

free
ctx · Free in

llada2.1-mini

prunaai

The fastest diffusion language model with up to ~1000+ tps

text · image · free
ctx · Free in

zozu

fultonyard

text · free
ctx · Free in

flux-2-max

black-forest-labs

The highest fidelity image model from Black Forest Labs

vision · free
ctx · Free in

lofi

frow

Lo-fi hip-hop music generation with ACE-Step 1.5 + LoRA

audio · free
ctx · Free in

ffhqdat-4x-upscaler

supersambat

4x face image upscaler trained on FFHQ dataset using DAT (Dual Aggregation Transformer) architecture. Optimized for portrait and face photos.

vision · free
ctx · Free in

music-2.6

minimax

Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics

audio · free
ctx · Free in

q3-turbo

vidu

Fast video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.

vision · image · audio
ctx · Free in

sdxl-cheetah

prunaai

text · free
ctx · Free in

kling-v2.6-motion-control

kwaivgi

Enables precise control of character actions and expressions from a reference image.

vision · free
ctx · Free in

veo-3.1-lite

google

Google's cost-efficient video generation model with native audio, optimized for high-volume applications

audio · free
ctx · Free in

veo-3.1

google

New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support

vision · audio · free
ctx · Free in

q3-pro

vidu

High-fidelity video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.

vision · image · audio
ctx · Free in

wan-2.7-image-pro

wan-video

Generate and edit high-quality images with Alibaba's Wan 2.7 Pro with 4K output, thinking mode, text-to-image, multi-image editing, and image set generation

vision · image · reasoning
ctx · Free in

bgogo-feno

hcolde

BiRefNet + Cutie video segmentation with stacked output video

free
ctx · Free in

product-photo-studio

i-tokyo

Generate professional e-commerce product photos from a single image. Automatically removes background, creates realistic studio scenes, and adds natural shadows.

vision · image · free
ctx · Free in

ernie-image-turbo

prunaai

ERNIE-Image is an open text-to-image generation model developed by the ERNIE-Image team at Baidu

vision · image · free
ctx · Free in

medibot

frvkygg

This chatbot is designed to answer medical-related questions and has been fine-tuned on a large dataset. It is built on the Qwen 2.5 3B Instruct base model.

text · instruct · free
ctx · Free in

p-image-edit

prunaai

A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, see p-image: https://replicate.com/prunaai/p-image

vision · image · free
ctx · Free in

veo-3.1-fast

google

New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support

audio · free
ctx · Free in

android-dream-v4

interfaceconjurer

A custom Flux LoRA model trained on painterly illustrated poster art inspired by Blade Runner 2049. The style features atmospheric cyberpunk cityscapes with dramatic scale: tiny silhouetted figures dwarfed by massive holographic projections and towering

vision · free
ctx · Free in

video-super-resolution-rife-pro

bitflow

Super video quality enhancement featuring fast upscaling with TensorRT and frame interpolation with RIFE.

free
ctx · Free in

logo-marks-v1

stefivanovs

text · free
ctx · Free in

geocalib

visionaix

GeoCalib (ECCV 2024): Single-image camera calibration. Estimates focal length, FoV, distortion, roll and pitch from one image using a deep net + Levenberg-Marquardt optimizer. Works on both outdoor and indoor scenes.

vision · free
ctx · Free in

qwen3-tts

qwen

A unified Text-to-Speech demo featuring three powerful modes: Voice, Clone and Design

text · free
ctx · Free in

yolov8s-worldv2

ultralytics

Ultralytics YOLOv8s worldv2 Real-Time Open-Vocabulary Object Detection model with 12.7M parameters. Achieves 37.7 mAP50-95 on COCO dataset. Optimized for real-time inference

text · free
ctx · Free in

imagen-4-ultra

google

Use this ultra version of Imagen 4 when quality matters more than speed and cost

vision · free
ctx · Free in

qwen3guard-gen-4b

ditto--ai

A 4B-parameter safety and content moderation model that classifies user prompts and assistant responses as Safe, Unsafe, or Controversial with fine-grained category labels and refusal detection. Supports 119 languages.

text · free
ctx · Free in