modelstop.top

AI Model Catalogue

Browse 20 models across providers, modalities, and use cases.

๐ŸŽ™๏ธ Audio & Speech

20 models · Page 1 of 1

whisper-large-v3-turbo

openai

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

audio · free
$0.00/1M in

whisper-tiny-en

openai

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model which was trained on the task of speech recognition.

audio · free
$0.00/1M in

whisper

openai

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

audio · multilingual · free
$0.00/1M in

whisper-medium

openai

Open-source whisper-medium model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Run locally
Free in

whisper-tiny

openai

Open-source whisper-tiny model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Free in

whisper-large-v3

openai

Open-source whisper-large-v3 model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Run locally
Free in

whisper-base

openai

Open-source whisper-base model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Run locally
Free in

whisper-large-v3-turbo

openai

Open-source whisper-large-v3-turbo model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Run locally
Free in

whisper-small

openai

Open-source whisper-small model from OpenAI, available for download and self-hosting on Hugging Face.

audio · free
Run locally
Free in

gpt-audio-1.5

openai

text · audio · free
Free in

gpt-audio-mini-2025-12-15

openai

text · audio · free
Free in

gpt-audio-mini-2025-10-06

openai

text · audio · free
Free in

gpt-4o-mini-audio-preview-2024-12-17

openai

text · audio · free
Free in

gpt-4o-audio-preview-2024-12-17

openai

text · audio · free
Free in

gpt-audio-2025-08-28

openai

text · audio · free
Free in

gpt-4o-mini-audio-preview

openai

text · audio · free
Free in

gpt-4o-audio-preview-2025-06-03

openai

text · audio · free
Free in

OpenAI: GPT-4o Audio

openai

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...

text · audio · long-context
128,000 ctx · $2.50/1M in

OpenAI: GPT Audio Mini

openai

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...

text · audio · cheap
128,000 ctx · $0.60/1M in

OpenAI: GPT Audio

openai

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...

text · audio · long-context
128,000 ctx · $2.50/1M in