What is Whisper? — AI Glossary

James Swierczewski

May 16, 2026

What it is: Whisper is OpenAI’s speech-to-text model, also known as automatic speech recognition (ASR). It transcribes audio into text across 99+ languages with high accuracy, and its weights are openly available, meaning developers can run it themselves.
Who it is for: Anyone who works with audio transcription, voice notes, or accessibility. Used by podcast tools, voice assistants, and meeting recorders.
Best if: You need to convert spoken audio to text — whether for podcasting, accessibility, or analysis.
Skip if: You only use voice assistants like Siri or Alexa, which have their own built-in speech recognition.

What is Whisper?

Whisper is a speech recognition (audio-to-text) model released open-source by OpenAI in September 2022. It was trained on 680,000 hours of multilingual audio and can transcribe speech in 99+ languages, translate non-English audio into English, and detect language automatically.

Unlike most OpenAI products, Whisper has open weights — anyone can download the model and run it locally. This made it instantly popular across the AI ecosystem; it’s now the default speech recognition engine behind dozens of products, from podcast transcription tools to voice assistants to accessibility apps.

Why does Whisper matter?

Before Whisper, accurate multilingual speech recognition was expensive and locked behind cloud APIs (Google Cloud Speech, AWS Transcribe). Whisper made the state of the art free, downloadable, and runnable on a laptop. That changed the economics of voice-enabled software.

Today Whisper powers transcription in many of the products you use: Otter.ai, Riverside, Descript, the dictation features in many apps, and OpenAI’s own ChatGPT Voice. The model has been continually improved (Whisper Large v3 is the current best) and remains the leading open-source option for speech-to-text.

How do you use Whisper?

Three main ways:

1. Inside other products: if you use ChatGPT Voice, Otter.ai, Descript, Riverside, or any modern transcription tool, you’re likely using Whisper without knowing it.
2. OpenAI Whisper API: developers can transcribe audio via an API call (~$0.006/minute in 2026).
3. Self-hosted: download the open-weight model from Hugging Face and run it locally using whisper.cpp, faster-whisper, or the Python whisper library.
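
For readers who want to try the self-hosted route, here is a minimal sketch using the Python whisper library. It assumes the package is installed (pip install openai-whisper) and that ffmpeg is available on your system. The function names, the "base" model size, and the file name in the usage note are illustrative choices, not anything prescribed by OpenAI. The small cost helper simply multiplies minutes by the approximate API rate quoted above, so you can compare the hosted option against running the model yourself.

```python
def estimate_api_cost(audio_minutes: float, usd_per_minute: float = 0.006) -> float:
    """Rough hosted-API cost in USD.

    The default rate is the approximate per-minute figure quoted in this
    article; check current pricing before relying on it.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * usd_per_minute


def transcribe_locally(audio_path: str, model_name: str = "base",
                       to_english: bool = False) -> dict:
    """Transcribe one audio file with the open-source whisper package.

    Hypothetical helper for illustration. Set to_english=True to translate
    non-English speech into English instead of transcribing it as spoken.
    """
    # Imported here so the cost helper above works without the package installed.
    import whisper

    # Downloads the model weights the first time a given size is requested.
    model = whisper.load_model(model_name)

    task = "translate" if to_english else "transcribe"
    result = model.transcribe(audio_path, task=task)

    # The result also contains per-segment timestamps under "segments".
    return {"language": result["language"], "text": result["text"].strip()}
```

Calling transcribe_locally("interview.mp3") would return the detected language code and the transcript text, while estimate_api_cost(60) gives the approximate API cost of an hour of audio at the quoted rate. Larger model sizes are generally more accurate but slower and need more memory.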

Last reviewed: May 2026. AI terminology evolves quickly, so verify specifics against official sources.
