Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model, which was trained on the speech recognition task.

audio · free

Input: Free

Output: $0.0000/1M

⚡ 81 ms p50

whisper-large-v3-turbo

openai

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

aura-1

deepgram

Aura is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.

aura-2-en

deepgram

Aura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.

melotts

myshell-ai

MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai.

audio · free

Input: Free

⚡ 40 ms p50

aura-2-es

whisper

openai

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

audio · multilingual · free

Input: Free

Output: $0.0000/1M

⚡ 50 ms p50

flux

deepgram

Flux is the first conversational speech recognition model built specifically for voice agents.

smart-turn-v2

pipecat-ai

An open-source, community-driven, native audio turn detection model, now in its second version.

text · audio · free

Input: Free

⚡ 271 ms p50

nova-3

deepgram

Transcribes audio using Deepgram’s speech-to-text model.

stable-audio-open-1.0

stabilityai

Open-source stable-audio-open-1.0 model from stabilityai — available for download and self-hosting on Hugging Face.

text · audio · free

Run locally

Input: Free

stable-audio-open-small

stabilityai

Open-source stable-audio-open-small model from stabilityai — available for download and self-hosting on Hugging Face.

text · audio · free

Run locally

Input: Free

overlapped-speech-detection

pyannote

Open-source overlapped-speech-detection model from pyannote — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

whisper-medium

openai

Open-source whisper-medium model from openai — available for download and self-hosting on Hugging Face.

audio · free

Run locally

Input: Free

wav2vec2-xls-r-300m-cv7-turkish

mpoyraz

Open-source wav2vec2-xls-r-300m-cv7-turkish model from mpoyraz — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-large-xlsr-open-brazilian-portuguese-v2

lgris

Open-source wav2vec2-large-xlsr-open-brazilian-portuguese-v2 model from lgris — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

distil-whisper-large-v3-ptbr

freds0

Open-source distil-whisper-large-v3-ptbr model from freds0 — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-large-xls-r-300m-Urdu

kingabzpro

Open-source wav2vec2-large-xls-r-300m-Urdu model from kingabzpro — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

faster-whisper-base

systran

Open-source faster-whisper-base model from systran — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-large-xlsr-53-greek

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-greek model from jonatasgrosman — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

reverb-diarization-v1

revai

Open-source reverb-diarization-v1 model from revai — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

parakeet-tdt-0.6b-v3

mlx-community

Open-source parakeet-tdt-0.6b-v3 model from mlx-community — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

parakeet-tdt-0.6b-v3-coreml

fluidinference

Open-source parakeet-tdt-0.6b-v3-coreml model from fluidinference — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-large-xlsr-53-telugu

anuragshas

Open-source wav2vec2-large-xlsr-53-telugu model from anuragshas — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

Qwen3-ASR-0.6B

qwen

Open-source Qwen3-ASR-0.6B model from qwen — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

faster-whisper-small

systran

Open-source faster-whisper-small model from systran — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

parakeet-tdt-0.6b-v3

nvidia

Open-source parakeet-tdt-0.6b-v3 model from nvidia — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-base-960h

facebook

Open-source wav2vec2-base-960h model from facebook — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

Voxtral-Mini-4B-Realtime-2602

mistralai

Open-source Voxtral-Mini-4B-Realtime-2602 model from mistralai — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

whisper-tiny

openai

Open-source whisper-tiny model from openai — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

wav2vec2-large-xlsr-53-arabic

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-arabic model from jonatasgrosman — available for download and self-hosting on Hugging Face.

audio · free

Input: Free

speakerkit-pro

argmaxinc

Open-source speakerkit-pro model from argmaxinc — available for download and self-hosting on Hugging Face.

audio · free

Input: Free
