I Built a Sleep Talking Recorder in One Night
How I built Sleep Talker — an AI-powered app that records your night, transcribes what you say in your sleep, and hands you a morning highlight reel.
My girlfriend told me I talk in my sleep. I didn't believe her. So I did what any reasonable developer would do at midnight: I built an app to catch myself in the act.
The Problem
Sleep talking is one of those things everyone's heard about but almost nobody actually records. There are a few apps out there — Sleep Talk Recorder on iOS, for instance — but they're either paywalled, privacy-nightmare black boxes, or saddled with bad UX. What I wanted was simple:
- Hit record before bed
- Wake up to a digest of everything weird that happened
- Actually hear (and read) what I said
The hard part isn't the recording. It's the filtering. You don't want an 8-hour audio file. You want the 30 seconds that matter.
The Core Insight
The Web Audio API gives you real-time access to microphone input, including volume levels. The trick is voice activity detection (VAD) — a simple threshold-based system that starts capturing audio only when the input level crosses a certain dB threshold, and stops after a period of silence.
audioAnalyzer.ts
const analyser = audioContext.createAnalyser();
analyser.fftSize = 2048;
const dataArray = new Uint8Array(analyser.frequencyBinCount);

const THRESHOLD_DB = -40; // works well for quiet rooms

function checkAudioLevel() {
  analyser.getByteFrequencyData(dataArray);
  const avg = dataArray.reduce((a, b) => a + b, 0) / dataArray.length;
  const db = 20 * Math.log10(avg / 255);
  if (db > THRESHOLD_DB) {
    startCapture();
  }
  requestAnimationFrame(checkAudioLevel); // poll once per frame
}

Each captured segment gets a timestamp, a duration, and an audio blob captured via the MediaRecorder API and stored in the browser. When you wake up, you tap “View Last Night” and see a timeline of events.
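The start/stop side of that threshold check can be sketched as a tiny state machine. This is my own illustration, not the app's actual code — the class name, `silenceFrames` count, and constructor defaults are all assumptions:

```typescript
// Minimal VAD segmenter: feed it one dB reading per animation frame,
// and it reports when a capture segment should start and stop.
type VadEvent = "start" | "stop" | null;

class VadSegmenter {
  private capturing = false;
  private silentFrames = 0;

  constructor(
    private thresholdDb = -40,   // same threshold as the analyser loop
    private silenceFrames = 120, // ~2s of silence at 60fps ends a segment
  ) {}

  push(db: number): VadEvent {
    if (db > this.thresholdDb) {
      this.silentFrames = 0;
      if (!this.capturing) {
        this.capturing = true;
        return "start"; // caller would invoke mediaRecorder.start() here
      }
    } else if (this.capturing && ++this.silentFrames >= this.silenceFrames) {
      this.capturing = false;
      return "stop"; // caller would invoke mediaRecorder.stop() here
    }
    return null;
  }
}
```

Wiring `push()` into the polling loop and listening for MediaRecorder's `dataavailable` event is enough to turn a night of audio into a list of discrete blobs.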
Technical Decisions
No backend. Everything runs in the browser. Audio blobs live in localStorage (for short clips) and IndexedDB (for longer ones). This means zero privacy concerns — your sleep audio never leaves your device. That was non-negotiable.
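The split between the two stores comes down to what each can hold: localStorage only stores strings (so short clips have to be base64-encoded), while IndexedDB stores Blobs natively. A minimal sketch of that routing decision — the 100 KB cutoff is my guess, not the app's actual constant:

```typescript
// Route a captured clip to the right browser store by size.
// localStorage holds strings only, so small clips get base64-encoded;
// larger clips go to IndexedDB, which accepts Blobs directly.
const LOCALSTORAGE_LIMIT_BYTES = 100 * 1024; // illustrative cutoff

function chooseStore(blobSizeBytes: number): "localStorage" | "indexedDB" {
  return blobSizeBytes <= LOCALSTORAGE_LIMIT_BYTES
    ? "localStorage"
    : "indexedDB";
}
```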
AI transcription via OpenAI Whisper. When you review your night, each audio segment gets sent to Whisper for transcription. I debated doing this on-device with Whisper.js, but the accuracy delta is still too large for something as chaotic as sleep speech (low volume, mumbled, unpredictable). The API call happens client-side with the user's own key — keeps the product free to run forever.
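For reference, a client-side call shaped like this matches OpenAI's public transcription endpoint (multipart form with `file` and `model` fields, bearer-token auth). The function name and the `.webm` filename are my own; error handling is elided:

```typescript
// Build a fetch request for OpenAI's transcription endpoint using the
// user's own API key. Sketch only — no retries or error handling.
function buildTranscriptionRequest(clip: Blob, apiKey: string) {
  const form = new FormData();
  form.append("file", clip, "segment.webm"); // MediaRecorder's usual container
  form.append("model", "whisper-1");
  return {
    url: "https://api.openai.com/v1/audio/transcriptions",
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Usage: const { url, init } = buildTranscriptionRequest(blob, userKey);
//        const { text } = await (await fetch(url, init)).json();
```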
Classification heuristics. Before spending an API call on transcription, I run a simple frequency-domain classifier to guess what type of sound it is: speech (concentrated energy in 300–3000 Hz), snoring (rhythmic low-frequency bursts), or movement (broadband transient). It's not perfect, but it labels 80% of clips correctly and saves unnecessary API calls.
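A single-frame sketch of that band-energy heuristic might look like the following. It's simplified — a real snoring detector also needs the rhythm across frames — and while the band edges follow the article (300–3000 Hz for speech), the 0.6 energy-fraction thresholds are illustrative:

```typescript
// Classify one FFT frame by where its energy is concentrated.
// spectrum is the byte array from analyser.getByteFrequencyData().
type SoundClass = "speech" | "snoring" | "movement";

function classifyFrame(
  spectrum: Uint8Array,
  sampleRate = 48000,
  fftSize = 2048,
): SoundClass {
  const binHz = sampleRate / fftSize; // Hz covered by each FFT bin
  let low = 0, voice = 0, total = 0;
  spectrum.forEach((mag, i) => {
    const hz = i * binHz;
    total += mag;
    if (hz < 300) low += mag;           // snore-range energy
    else if (hz <= 3000) voice += mag;  // speech-range energy
  });
  if (total === 0) return "movement";
  if (voice / total > 0.6) return "speech";  // concentrated in the voice band
  if (low / total > 0.6) return "snoring";   // dominant low-frequency energy
  return "movement";                          // broadband / everything else
}
```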
What Surprised Me
The biggest surprise was how emotional sleep recordings feel. When I listened back to my first test session, it wasn't just funny — it was genuinely weird to hear yourself be so unguarded. There's something vulnerable about it. I ended up spending a lot more time on the UX around “reviewing” a session than I expected, because it needed to feel safe and playful rather than clinical.
The second surprise: threshold tuning. -40 dB sounds quiet, but in a real bedroom with an AC unit, partner breathing, and street noise, you're constantly triggering. I ended up building a calibration step that samples 30 seconds of ambient noise on first launch and sets the threshold dynamically. Annoying edge case, but essential.
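The calibration step boils down to: measure the noise floor, then set the trigger a margin above it. A sketch under my own assumptions — the 95th-percentile floor estimate and the +6 dB margin are choices I'd make, not necessarily the app's:

```typescript
// Derive a dynamic VAD threshold from ~30s of ambient dB samples.
// Uses the 95th percentile as the noise floor so brief ambient spikes
// don't inflate the threshold; falls back to -40 dB with no samples.
function calibrateThreshold(ambientDb: number[], marginDb = 6): number {
  if (ambientDb.length === 0) return -40; // static default from the article
  const sorted = [...ambientDb].sort((a, b) => a - b);
  const floor = sorted[Math.floor(sorted.length * 0.95)];
  return floor + marginDb; // only sounds clearly above ambient noise trigger
}
```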
What's Next
- On-device Whisper via WASM when model quality catches up
- Sleep quality scoring based on movement frequency and speech intensity
- Shared highlights — send your partner a voice message of them being absurd at 3am
- Streaks and trends — your speech frequency over 30 nights, labeled by life events
Curious what you're saying at night?
sleep-talker-recorder.vercel.app →

Hit record before bed tonight. I'll be curious what you find.
Jacobo
Builder · March 7, 2026