Voice Transcription
Coeus can transcribe audio from your microphone in real time, from an audio file you import, or from a YouTube URL.
Recording your voice
Click the mic button in the chat bar. Coeus starts recording from your microphone. Talk. Click the button again to stop.
The transcript shows up as your message. You can edit it before sending, or just send it as-is.
If you're dictating a note rather than asking a question, you can tell the AI: "Save this as a note." Or just start your message with the note's title.
Importing an audio file
Drop an audio file onto the Coeus window or use the import button (paperclip) to select one. Coeus transcribes it and creates a note with the transcript.
Supported formats: MP3, WAV, M4A, MP4, MOV, OGG, WebM, FLAC, AAC.
Good for recorded meetings, voice memos, or podcast episodes you want to take notes on.
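If you want to check a batch of files before importing them, the format list above can be turned into a simple extension check. This is only an illustrative sketch: the helper name `is_supported` is hypothetical and not part of Coeus, and Coeus may inspect files differently (for example, by content rather than by extension).

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".m4a", ".mp4", ".mov",
    ".ogg", ".webm", ".flac", ".aac",
}


def is_supported(path: str) -> bool:
    """Return True if the file's extension is in the supported list.

    The comparison is case-insensitive, so "memo.MP3" and "memo.mp3"
    are treated the same. Only the final extension is considered.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("standup.m4a")` is true, while `is_supported("notes.txt")` is false.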
YouTube transcription
Paste a YouTube URL into the chat bar. Coeus detects it and shows a banner asking what you want to do. Click Transcribe and Coeus downloads the audio and transcribes it.
The transcript is saved as a note you can search and ask questions about.
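To give a rough idea of what "detecting" a YouTube URL can involve, here is a minimal sketch that recognises the two most common link shapes (`youtube.com/watch?v=...` and `youtu.be/...`). It is an assumption about how such detection could work, not a description of Coeus's actual logic; the function name `youtube_video_id` and the host list are hypothetical, and other link shapes (playlists, Shorts, embeds) are deliberately ignored.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def youtube_video_id(text: str) -> Optional[str]:
    """Return the video ID if `text` looks like a YouTube watch link, else None."""
    parsed = urlparse(text.strip())
    if parsed.scheme not in ("http", "https"):
        return None

    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS:
        return None

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if parsed.path == "/watch":
        # Full links carry the ID in the `v` query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None

    return None
```

Anything that returns `None` here would simply be treated as an ordinary chat message.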
Transcription modes
There are two ways Coeus can transcribe audio. You pick one in Settings → Integrations → Speech & Transcription.
Local Whisper (default)
Coeus runs Whisper on your machine. Nothing gets sent to any server.
There are four model sizes:
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| tiny.en | 75 MB | Fastest | Lower |
| base.en | 142 MB | Fast | Good |
| small.en | 466 MB | Medium | Better |
| medium.en | 1.5 GB | Slower | Best |
Start with base.en. It's accurate enough for most speech and downloads quickly.
To download a model: Settings → Integrations → Speech & Transcription → Download model.
OpenAI Transcription API
Sends your audio to OpenAI's gpt-4o-mini-transcribe model. More accurate than local Whisper for some accents and noisy audio. Costs a small amount per minute.
To use it, enter your OpenAI API key in the Speech settings. Audio files are sent to OpenAI and the transcript is returned. The audio is not stored.
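For readers curious what a request to this model looks like outside Coeus, the sketch below uses the official `openai` Python package. It is not Coeus's own code: the function names and the file name in the usage note are placeholders, and the API key is assumed to be available in the `OPENAI_API_KEY` environment variable.

```python
def make_client():
    """Create an OpenAI client (requires `pip install openai`).

    The client reads the API key from the OPENAI_API_KEY environment
    variable. The import is done lazily so the rest of this sketch can
    be loaded without the package installed.
    """
    from openai import OpenAI

    return OpenAI()


def transcribe_with_openai(
    client, audio_path: str, model: str = "gpt-4o-mini-transcribe"
) -> str:
    """Upload one audio file and return the transcript text."""
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

A call would look like `transcribe_with_openai(make_client(), "voice-memo.m4a")`. Passing the client in as a parameter keeps the function easy to try out with a stand-in object before spending anything on real requests.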
Transcription quality tips
- Speak clearly and at a normal pace
- A quiet environment helps a lot with local Whisper
- The small.en or medium.en models handle accents and background noise better than tiny.en
- If you're transcribing long recordings, expect it to take a little while with local models