modelstop.top

AI Model Catalogue

Browse 91 models across providers, modalities, and use cases.

🆓 Free & Open

91 models · Page 3 of 3

imagen-3-fast

google

A faster and cheaper Imagen 3 model, for when price or speed are more important than final image quality

vision · free
Free in

gemini-2.5-flash

google

Google's hybrid "thinking" AI model optimized for speed and cost-efficiency

text · reasoning · free
Free in

upscaler

google

Upscale images 2x or 4x

vision · free
Run locally
Free in

nano-banana-pro

google

Google's state-of-the-art image generation and editing model 🍌🍌

vision · image · free
Free in

nano-banana-2

google

Google's fast image generation model with conversational editing, multi-image fusion, and character consistency

vision · image · free
Free in

gemini-3.1-pro

google

Google's most intelligent model, with improved reasoning and a new medium thinking level

text · reasoning · free
Run locally
Free in

imagen-4-ultra

google

Use this ultra version of Imagen 4 when quality matters more than speed and cost

vision · free
Free in

gemini-3.1-flash-tts

google

Google's fast, expressive text-to-speech model with 30 voices and 70+ language support

text · free
$5.00/1M in

lyria-3

google

Generate 30-second music clips from text prompts or images with Lyria 3, Google's music generation model

vision · image · free
Free in

veo-3.1

google

New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support

vision · audio · free
Free in

veo-3.1-lite

google

Google's cost-efficient video generation model with native audio, optimized for high-volume applications

audio · free
Free in

veo-3.1-fast

google

New and improved version of Veo 3 Fast, with higher-fidelity video, context-aware audio and last frame support

audio · free
Free in

lyria-3-pro

google

Generate full-length songs up to 3 minutes from text prompts or images with Lyria 3 Pro, Google's most capable music generation model

vision · image · free
Free in

gemma-3-1b-it

google

Open-source gemma-3-1b-it model from Google, available for download and self-hosting on Hugging Face.

text · free
Free in

t5gemma-s-s-prefixlm

google

Open-source t5gemma-s-s-prefixlm model from Google, available for download and self-hosting on Hugging Face.

text · free
Run locally
$0.00/1M in

Google: Lyria 3 Clip Preview

google

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate...

text · vision · image
1,048,576 ctx · $0.00/1M in

Google: Lyria 3 Pro Preview

google

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz...

text · vision · image
1,048,576 ctx · $0.00/1M in

Google: Gemma 4 31B (free)

google

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function...

text · vision · multimodal
262,144 ctx · $0.14/1M in

Google: Gemma 4 26B A4B (free)

google

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference, delivering near-31B quality at...

text · vision · multimodal
262,144 ctx · $0.12/1M in