A Claude Code Skill for Audio Transcription

Disclaimer: Yes, this post was created with the help of AI — why not?

Finding a good speech-to-text service that handles Catalan well is genuinely hard. Most commercial offerings treat it as an afterthought, and the results show. After one too many mangled transcriptions, I decided to solve this myself by writing a Claude Code skill that uses OpenAI’s Whisper model locally.

The Skill

The skill hooks into Claude Code’s skill system. When triggered, it runs a short Python snippet inside a pre-configured virtual environment:

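source .venv/bin/activate && \
python3 << 'EOF'
import whisper

# Load the multilingual "base" checkpoint (downloaded on first use, then cached)
model = whisper.load_model("base")
# Whisper auto-detects the spoken language when none is specified
result = model.transcribe("<path-to-file>")
print(result["text"])
EOF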

The skill activates automatically when I mention an audio file or drag one into the chat. Claude Code identifies the file path, runs the script, and returns the full transcript.
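For anyone who would rather keep the logic in a file than in a heredoc, here is a minimal sketch of the same snippet as a standalone script. It is not the skill's actual code. The file name `transcribe.py`, the extension list, the helper names, and the `language="ca"` hint are all assumptions added for illustration; the original snippet lets Whisper detect the language on its own.

```python
import sys
from pathlib import Path
from typing import List, Optional

# Extensions assumed for illustration. Whisper decodes audio through ffmpeg,
# so the formats that really work depend on the local ffmpeg build.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".oga", ".flac"}


def has_audio_extension(path: Path) -> bool:
    """Return True if the file name ends in a known audio extension."""
    return path.suffix.lower() in AUDIO_EXTENSIONS


def transcribe(path: Path, model_name: str = "base",
               language: Optional[str] = "ca") -> str:
    """Transcribe one file with a local Whisper model.

    language="ca" (Catalan) is an assumption of this sketch; pass None
    to fall back to Whisper's automatic language detection.
    """
    import whisper  # imported lazily so the helpers work without it installed

    model = whisper.load_model(model_name)
    result = model.transcribe(str(path), language=language)
    return result["text"].strip()


def main(argv: List[str]) -> int:
    """Validate the argument, print the transcript, return an exit code."""
    if len(argv) != 1:
        print("usage: python transcribe.py <audio-file>", file=sys.stderr)
        return 2
    audio = Path(argv[0])
    if not (audio.is_file() and has_audio_extension(audio)):
        print(f"not a readable audio file: {audio}", file=sys.stderr)
        return 1
    print(transcribe(audio))
    return 0


# Only act as a command-line tool when a path was actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

With the virtual environment active, the skill could then call `python3 transcribe.py <path-to-file>` instead of the inline heredoc. Forcing the language may help on short clips where auto-detection has little audio to go on, but that is worth checking against your own recordings.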

Telegram Integration

The workflow that makes this actually useful day-to-day is a Telegram bot. I send voice messages directly to a private chat channel, the bot picks them up, passes the audio file to Claude Code with this skill loaded, and sends the transcript back. No apps to open, no files to move around — I just talk and get text.
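The bot itself is not shown in this post, so what follows is only a rough sketch of how such a loop could look using the Telegram Bot API over plain HTTP and the standard library. It assumes voice notes arrive as ordinary `message` updates in a private chat, and that Claude Code is invoked through its non-interactive print mode (`claude -p`) from a project where the skill is installed. The function names and the prompt wording are hypothetical.

```python
import json
import subprocess
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Any, Optional, Tuple

API_ROOT = "https://api.telegram.org"


def api_url(token: str, method: str) -> str:
    """URL of a Bot API method for the given bot token."""
    return f"{API_ROOT}/bot{token}/{method}"


def file_url(token: str, file_path: str) -> str:
    """Download URL for a file_path returned by getFile."""
    return f"{API_ROOT}/file/bot{token}/{file_path}"


def extract_voice(update: dict) -> Optional[Tuple[int, str]]:
    """Return (chat_id, file_id) if the update carries a voice message."""
    message = update.get("message") or {}
    voice = message.get("voice")
    if not voice:
        return None
    return message["chat"]["id"], voice["file_id"]


def call(token: str, method: str, **params: Any) -> Any:
    """POST form-encoded parameters to a Bot API method, return its result."""
    data = urllib.parse.urlencode(params).encode()
    with urllib.request.urlopen(api_url(token, method), data=data, timeout=60) as resp:
        return json.load(resp)["result"]


def download_voice(token: str, file_id: str, dest_dir: Path) -> Path:
    """Resolve a file_id with getFile and save the audio locally."""
    info = call(token, "getFile", file_id=file_id)
    dest = dest_dir / Path(info["file_path"]).name
    urllib.request.urlretrieve(file_url(token, info["file_path"]), dest)
    return dest


def transcribe_with_claude(audio: Path) -> str:
    """Hand the file to Claude Code and return whatever it prints.

    ASSUMPTION: `claude -p` is run where the transcription skill is
    available and is permitted to execute it; the prompt is illustrative.
    """
    prompt = f"Transcribe the audio file {audio}"
    done = subprocess.run(["claude", "-p", prompt],
                          capture_output=True, text=True, check=True)
    return done.stdout.strip()


def poll_once(token: str, offset: int, dest_dir: Path) -> int:
    """Process pending updates once and return the next offset to request."""
    for update in call(token, "getUpdates", offset=offset, timeout=30):
        offset = update["update_id"] + 1
        found = extract_voice(update)
        if found is None:
            continue
        chat_id, file_id = found
        text = transcribe_with_claude(download_voice(token, file_id, dest_dir))
        # Telegram caps a text message at 4096 characters, so send in chunks.
        for start in range(0, len(text), 4096):
            call(token, "sendMessage", chat_id=chat_id,
                 text=text[start:start + 4096])
    return offset
```

A driver would start with `offset = 0` and keep calling `offset = poll_once(token, offset, some_directory)` in a loop. A bot library would handle retries and errors more gracefully; the sketch sticks to raw HTTP only to keep every step of the round trip visible.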

Wrap-up

A single-file skill and a local Whisper model turned a frustrating gap — no decent Catalan speech-to-text — into a smooth, private, zero-cost workflow. If you work in a minority language and have hit the same wall, this setup is worth an afternoon of your time.