modelstop.top

AI Model Catalogue

Browse 125 models across providers, modalities, and use cases.

๐ŸŽ™๏ธ Audio & Speech

125 models · Page 2 of 4

wav2vec2-large-xlsr-53-greek

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-greek model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free

reverb-diarization-v1

revai

Open-source reverb-diarization-v1 model from revai, available for download and self-hosting on Hugging Face.

audio · free

parakeet-tdt-0.6b-v3

mlx-community

Open-source parakeet-tdt-0.6b-v3 model from mlx-community, available for download and self-hosting on Hugging Face.

audio · free

parakeet-tdt-0.6b-v3-coreml

fluidinference

Open-source parakeet-tdt-0.6b-v3-coreml model from fluidinference, available for download and self-hosting on Hugging Face.

audio · free

wav2vec2-large-xlsr-53-telugu

anuragshas

Open-source wav2vec2-large-xlsr-53-telugu model from anuragshas, available for download and self-hosting on Hugging Face.

audio · free

Qwen3-ASR-0.6B

qwen

Open-source Qwen3-ASR-0.6B model from qwen, available for download and self-hosting on Hugging Face.

audio · free

faster-whisper-small

systran

Open-source faster-whisper-small model from systran, available for download and self-hosting on Hugging Face.

audio · free

parakeet-tdt-0.6b-v3

nvidia

Open-source parakeet-tdt-0.6b-v3 model from nvidia, available for download and self-hosting on Hugging Face.

audio · free

wav2vec2-large-xlsr-53-arabic

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-arabic model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free

whisper-tiny

openai

Open-source whisper-tiny model from openai, available for download and self-hosting on Hugging Face.

audio · free

whisper-base

openai

Open-source whisper-base model from openai, available for download and self-hosting on Hugging Face.

audio · free · Run locally

whisper-large-v3-turbo

openai

Open-source whisper-large-v3-turbo model from openai, available for download and self-hosting on Hugging Face.

audio · free · Run locally

wav2vec2-large-xlsr-53-portuguese

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-portuguese model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free

wav2vec2-large-xlsr-53-russian

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-russian model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free

whisperkit-coreml

argmaxinc

Open-source whisperkit-coreml model from argmaxinc, available for download and self-hosting on Hugging Face.

audio · free · Run locally

speaker-diarization-community-1

pyannote

Open-source speaker-diarization-community-1 model from pyannote, available for download and self-hosting on Hugging Face.

audio · free

wav2vec2-large-xlsr-korean

kresnik

Open-source wav2vec2-large-xlsr-korean model from kresnik, available for download and self-hosting on Hugging Face.

audio · free

mms-1b-all

facebook

Open-source mms-1b-all model from facebook, available for download and self-hosting on Hugging Face.

audio · free

mms-300m-1130-forced-aligner

mahmoudashraf

Open-source mms-300m-1130-forced-aligner model from mahmoudashraf, available for download and self-hosting on Hugging Face.

audio · free

wav2vec2-large-xlsr-53-japanese

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-japanese model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free · Run locally

whisper-small

openai

Open-source whisper-small model from openai, available for download and self-hosting on Hugging Face.

audio · free · Run locally

speaker-diarization-3.1

pyannote

Open-source speaker-diarization-3.1 model from pyannote, available for download and self-hosting on Hugging Face.

audio · free · Run locally

whisper-large-v3

openai

Open-source whisper-large-v3 model from openai, available for download and self-hosting on Hugging Face.

audio · free · Run locally

wav2vec2-large-xlsr-53-chinese-zh-cn

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-chinese-zh-cn model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free · Run locally

Qwen3-ASR-1.7B

qwen

Open-source Qwen3-ASR-1.7B model from qwen, available for download and self-hosting on Hugging Face.

audio · free · Run locally

wav2vec2-large-xlsr-53-polish

jonatasgrosman

Open-source wav2vec2-large-xlsr-53-polish model from jonatasgrosman, available for download and self-hosting on Hugging Face.

audio · free · Run locally

Google Veo 3.0 + Audio

google

text · audio · free · Run locally

Google Veo 3.0 Fast + Audio

google

text · audio · free

voxtral-mini-2507

mistralai

A mini audio understanding model released in July 2025.

text · audio · free · 32,768 ctx

gpt-audio-mini-2025-12-15

openai

text · audio · free

voxtral-small-2507

mistralai

A small audio understanding model released in July 2025.

text · audio · free · 32,768 ctx

gpt-audio-1.5

openai

text · audio · free

kling-v3-video

kwaivgi

Kling Video 3.0: Generate cinematic videos up to 15 seconds with multi-shot control, native audio, and improved consistency.

audio · free · Run locally

gpt-4o-mini-audio-preview-2024-12-17

openai

text · audio · free