Open Source · Medium Impact · Thursday, March 26, 2026

Cohere Launches Open-Source 2B Speech Recognition Model for Enterprise

Cohere released Transcribe, a 2B-parameter open-source ASR model that beats several major rivals on the Hugging Face Open ASR leaderboard with an average WER of 5.42.

What happened

Cohere launched Transcribe, its first voice model: a 2B-parameter open-source automatic speech recognition (ASR) model, available for free via API and on its Model Vault platform. It supports 14 languages and processes 525 minutes of audio per minute. On the Hugging Face Open ASR leaderboard it posts an average word error rate (WER) of 5.42, outperforming ElevenLabs Scribe v2, IBM Granite 4.0 1B, and Zoom Scribe v1. Human evaluators gave it a 61% win rate across accuracy, coherence, and usability, though it underperforms on Portuguese, German, and Spanish. Cohere plans to integrate Transcribe into its enterprise agent platform, North.

Why it matters to you


Transcribe is a 2B-parameter model that runs on consumer GPUs and processes audio at 525x real-time, a rare combination of speed, cost, and deployability. Because it is available for free via Cohere's API today, it undercuts paid ASR providers on price while outscoring several of them on the benchmark. The one caveat: accuracy on Portuguese, German, and Spanish lags behind competitors, so validate against your target language corpus before replacing your current provider.
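
To make the 525x figure concrete, here is a small Python sketch that converts it into estimated wall-clock processing time. It assumes the quoted throughput holds on your hardware and batch size, which the source does not specify. The function name and the sample workloads are illustrative.

```python
# Quoted throughput: 525 minutes of audio transcribed per minute of compute.
# Assumption: this rate holds on your hardware; treat results as estimates.
QUOTED_SPEEDUP = 525.0


def processing_seconds(audio_minutes: float, speedup: float = QUOTED_SPEEDUP) -> float:
    """Estimate wall-clock seconds needed to transcribe `audio_minutes` of audio."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_minutes * 60.0 / speedup


if __name__ == "__main__":
    # Hypothetical workloads, chosen only to illustrate the scale.
    for label, minutes in [("10-minute sample", 10), ("1,000-hour archive", 1000 * 60)]:
        print(f"{label}: ~{processing_seconds(minutes):.1f} s")
```

At the quoted rate, a 10-minute sample takes a little over a second and a 1,000-hour archive takes just under two hours. Measure your own throughput before planning capacity around these numbers.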

What to do about it

Try Transcribe via Cohere's API this week and benchmark it against your current ASR provider on a 10-minute real-world audio sample. If its WER is within 1 percentage point of your current provider's and latency holds, you may be able to eliminate your transcription cost line for as long as the API stays free.
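
The benchmark comes down to scoring each provider's transcript against the same human-verified reference. Below is a minimal Python sketch of that comparison, assuming you already have the reference and both transcripts as plain strings. The function names, the normalization (lowercasing and whitespace splitting), and the 1-point threshold helper are illustrative. This is not Cohere's or the leaderboard's scoring code, and a different text normalization will shift the numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference word count."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref_words)


def within_one_point(candidate_wer: float, baseline_wer: float, max_gap: float = 1.0) -> bool:
    """True if the candidate is at most `max_gap` WER points worse than the baseline."""
    return candidate_wer - baseline_wer <= max_gap


if __name__ == "__main__":
    # Hypothetical transcripts; replace with your own reference and provider output.
    reference = "the cat sat on the mat"
    current_provider = "the cat sat on the mat"
    candidate = "the cat sat on a mat"
    baseline_wer = word_error_rate(reference, current_provider)
    candidate_wer = word_error_rate(reference, candidate)
    print(f"baseline {baseline_wer:.2f}%, candidate {candidate_wer:.2f}%")
    print("within 1 point:", within_one_point(candidate_wer, baseline_wer))
```

A 10-minute sample contains relatively few words, so one or two errors can move WER noticeably. Score several samples that reflect your real audio (accents, noise, domain vocabulary) before deciding.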

Try this now

Cohere API · 5 min

1. Go to dashboard.cohere.com, sign in, and grab your API key from the API Keys tab.
