All Models
125 models · Page 1 of 4
aura-2-en
Aura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.
whisper-large-v3-turbo
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.
whisper-tiny-en
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model which was trained on the task of speech recognition.
aura-1
Aura is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.
nova-3
Transcribe audio using Deepgram's speech-to-text model.
aura-2-es
Aura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.
whisper
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
melotts
MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai.
flux
Flux is the first conversational speech recognition model built specifically for voice agents.
smart-turn-v2
An open-source, community-driven, native audio turn detection model, now in its second version.
stable-audio-open-small
Open-source stable-audio-open-small model from stabilityai, available for download and self-hosting on Hugging Face.
stable-audio-open-1.0
Open-source stable-audio-open-1.0 model from stabilityai, available for download and self-hosting on Hugging Face.
wav2vec2-large-xlsr-open-brazilian-portuguese-v2
Open-source wav2vec2-large-xlsr-open-brazilian-portuguese-v2 model from lgris, available for download and self-hosting on Hugging Face.
Voxtral-Mini-4B-Realtime-2602
Open-source Voxtral-Mini-4B-Realtime-2602 model from mistralai, available for download and self-hosting on Hugging Face.
filipino-wav2vec2-l-xls-r-300m-official
Open-source filipino-wav2vec2-l-xls-r-300m-official model from khalsuu, available for download and self-hosting on Hugging Face.
faster-whisper-tiny
Open-source faster-whisper-tiny model from systran, available for download and self-hosting on Hugging Face.
parakeet-tdt-0.6b-v3
Open-source parakeet-tdt-0.6b-v3 model from mlx-community, available for download and self-hosting on Hugging Face.
whisper-tiny
Open-source whisper-tiny model from openai, available for download and self-hosting on Hugging Face.
nb-wav2vec2-1b-nynorsk
Open-source nb-wav2vec2-1b-nynorsk model from nbailab, available for download and self-hosting on Hugging Face.
wav2vec2-large-xlsr-53-telugu
Open-source wav2vec2-large-xlsr-53-telugu model from anuragshas, available for download and self-hosting on Hugging Face.
wav2vec2-large-xls-r-300m-Urdu
Open-source wav2vec2-large-xls-r-300m-Urdu model from kingabzpro, available for download and self-hosting on Hugging Face.
parakeet-ctc-1.1b
Open-source parakeet-ctc-1.1b model from nvidia, available for download and self-hosting on Hugging Face.
Qwen3-ASR-0.6B
Open-source Qwen3-ASR-0.6B model from qwen, available for download and self-hosting on Hugging Face.
wav2vec2-xls-r-300m-cv7-turkish
Open-source wav2vec2-xls-r-300m-cv7-turkish model from mpoyraz, available for download and self-hosting on Hugging Face.
faster-whisper-base
Open-source faster-whisper-base model from systran, available for download and self-hosting on Hugging Face.
parakeet-tdt-0.6b-v3
Open-source parakeet-tdt-0.6b-v3 model from nvidia, available for download and self-hosting on Hugging Face.
parakeet-tdt-0.6b-v3-coreml
Open-source parakeet-tdt-0.6b-v3-coreml model from fluidinference, available for download and self-hosting on Hugging Face.
wav2vec2-large-xlsr-53-greek
Open-source wav2vec2-large-xlsr-53-greek model from jonatasgrosman, available for download and self-hosting on Hugging Face.
whisper-medium
Open-source whisper-medium model from openai, available for download and self-hosting on Hugging Face.
reverb-diarization-v1
Open-source reverb-diarization-v1 model from revai, available for download and self-hosting on Hugging Face.
faster-whisper-small
Open-source faster-whisper-small model from systran, available for download and self-hosting on Hugging Face.
wav2vec2-base-960h
Open-source wav2vec2-base-960h model from facebook, available for download and self-hosting on Hugging Face.
overlapped-speech-detection
Open-source overlapped-speech-detection model from pyannote, available for download and self-hosting on Hugging Face.
distil-whisper-large-v3-ptbr
Open-source distil-whisper-large-v3-ptbr model from freds0, available for download and self-hosting on Hugging Face.
distil-large-v3
Open-source distil-large-v3 model from distil-whisper, available for download and self-hosting on Hugging Face.
wav2vec2-large-xlsr-53-arabic
Open-source wav2vec2-large-xlsr-53-arabic model from jonatasgrosman, available for download and self-hosting on Hugging Face.
