
Quick Thoughts

Listenr CLI streaming real-time ASR transcriptions to Lemonade — Whisper-Tiny loaded alongside gpt-oss-20b-mxfp4-GGUF.

Ongoing Series

Listenr & Fine-Tuning: Building a Personal ASR Pipeline

A running log of building Listenr, a privacy-first tool for capturing real-world conversational audio and using it to fine-tune Whisper speech recognition models locally on AMD hardware.

2 parts · First published

The goal is simple: as an individual, build enough high-quality audio clips and transcriptions to meaningfully improve a speech recognition model — privately, cheaply, and with a scalable process.

That means solving three problems in sequence:

  1. Capture — a frictionless way to record real-world conversational audio without sending anything to the cloud.
  2. Label — automated transcription plus LLM-assisted post-processing to get clean, accurate ground-truth text.
  3. Train — a reproducible fine-tuning pipeline that runs on consumer AMD hardware and produces a model that actually understands how I talk.

This series aims to document the whole journey.

All Parts

  1. Part 1

    How I locally fine-tuned Whisper using my own voice data and some effort

    Off-the-shelf Whisper models are impressive but struggle with personal vocabulary, accents, and jargon not present in their training data. This post covers why standard open datasets fall short for personal fine-tuning and how I built Listenr to continuously capture and transcribe my own conversational audio as a training set.

  2. Part 2 (coming soon)

    Fine-Tuning Whisper on AMD Hardware: ROCm, Docker, and Getting It Running

    Fine-tuning Whisper on a consumer AMD GPU means working through ROCm's Docker setup, setting the right HIP device flags, and finding a compatible PyTorch image tag before a single training step can run. This post documents the specific configuration that finally got it working and the pain points encountered along the way.