🎨 Image Generation
273 models · Page 8 of 8
stems-separator
Separates a song into stems using Demucs and Spleeter
q3-pro
High-fidelity video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.
wan-2.7-image
Generate and edit images with Alibaba's Wan 2.7
ffhqdat-4x-upscaler
4x face image upscaler trained on FFHQ dataset using DAT (Dual Aggregation Transformer) architecture. Optimized for portrait and face photos.
flux-2-max
The highest fidelity image model from Black Forest Labs
product-photo-studio
Generate professional e-commerce product photos from a single image. Automatically removes background, creates realistic studio scenes, and adds natural shadows.
android-dream-v4
A custom Flux LoRA model trained on painterly illustrated poster art inspired by Blade Runner 2049. The style features atmospheric cyberpunk cityscapes with dramatic scale: tiny silhouetted figures dwarfed by massive holographic projections and towering
p-image-upscale
Fastest image upscaler in the world (<1s) supporting outputs up to 128 MP.
geocalib
GeoCalib (ECCV 2024): Single-image camera calibration. Estimates focal length, FoV, distortion, roll and pitch from one image using a deep net + Levenberg-Marquardt optimizer. Works on both outdoor and indoor scenes.
seedvr2
🔥 SeedVR2: one-step video & image restoration with 7B and Adjustable Resolution
grok-imagine-r2v
Generate videos guided by reference images using xAI's Grok Imagine Video model
flux-2-pro
High-quality image generation and editing with support for eight reference images
siglip-large-patch16-384
Get embeddings for an image using siglip-large-patch16-384
ernie-image
ERNIE-Image is an open text-to-image generation model developed by the ERNIE-Image team at Baidu
lucy-edit-2
Edit and transform videos with text prompts and reference images. Style transfers, object replacement, character transformation, and more.
p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
firered-image-edit-1.1
FireRed-Image-Edit 1.1 is a general-purpose image editing model that delivers high-fidelity and consistent editing across a wide range of scenarios.
veo-3.1
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
Stable Diffusion 3.5 Large
Stable Diffusion 3.5 Large is Stability AI's most capable text-to-image model, delivering photorealistic and creative imagery with excellent prompt adherence and detail. Features multimodal diffusion transformer architecture.
Amazon Nova Pro
Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost across a wide range of tasks. Supports text, image, and video inputs.
Amazon Nova Lite
Amazon Nova Lite is a very low-cost multimodal model that can process image, video, and text inputs. Fast and accurate for a wide range of tasks requiring visual and language understanding.
