How do I properly transcribe an interview?


Accepted Answer

A clean recording, an AI first pass, and 30-60 minutes of editing per hour of audio produce a publication-ready transcript in a fraction of the time manual typing takes.

A good interview transcript comes out of three phases: preparation, recording, and editing. AI now replaces the typing pass between recording and editing; it doesn't replace the care.

**Preparation**
- One mic per speaker: each person gets their own lavalier or headset mic. Close miking cuts down room reverb and makes speakers easier to tell apart. If you only have one mic, place it equidistant from all speakers.
- Choose a room with carpet, curtains, soft furniture – not a bare conference room.
- Recording device: a dedicated recorder (Zoom H1n, H5) is more reliable than a laptop. Run a backup recording on your phone.
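- Set levels before you start: have each speaker talk at normal volume and adjust the gain so the loudest passages peak high on the meter without clipping (roughly -12 to -6 dBFS is a common target).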
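
To check levels on a short test recording, a minimal sketch like the one below can report the peak in dBFS. It assumes a 16-bit PCM WAV file; the function names and the -1 / -20 dBFS thresholds are illustrative choices, not fixed rules.

```python
import array
import math
import sys
import wave


def peak_dbfs(path: str) -> float:
    """Return the peak level of a 16-bit PCM WAV file in dBFS (0.0 = full scale)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)  # signed 16-bit samples, all channels
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure silence
    return 20 * math.log10(peak / 32768)


def level_verdict(db: float, hot: float = -1.0, quiet: float = -20.0) -> str:
    """Classify a peak level; the thresholds are assumptions for illustration."""
    if db >= hot:
        return "clipping risk"
    if db < quiet:
        return "too quiet"
    return "ok"
```

Record ten seconds of each speaker at normal volume, run `level_verdict(peak_dbfs("test.wav"))`, and adjust the gain until it reports "ok".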

**During the interview**
- Open with a clean intro that names everyone ("Today I'm speaking with Maria Müller, …") – helps diarization (automatic speaker separation) later.
- Let one person finish before the next starts, and avoid talking over the speaker with an affirmative "mm-hmm" or "yeah". Overlapping speech wrecks diarization.
- On Zoom/Teams: capture local audio, not just the cloud version (which is compressed).

**AI first pass**
- Upload to a provider like DeepScript, pick the Premium model for research interviews (diarization, higher accuracy), set the correct language, and add a custom vocabulary with proper nouns and jargon if relevant.
- Processing: 2-5 minutes per hour of audio.
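
For planning, the per-hour figures in this answer (2-5 minutes of processing, 30-60 minutes of editing) can be turned into a rough turnaround estimate. The function name is made up for illustration, and the result is only as good as those ranges.

```python
def turnaround_minutes(audio_minutes: float) -> tuple[float, float]:
    """Estimate (best case, worst case) minutes from upload to finished transcript."""
    hours = audio_minutes / 60
    best = hours * (2 + 30)   # 2 min processing + 30 min editing per audio hour
    worst = hours * (5 + 60)  # 5 min processing + 60 min editing per audio hour
    return best, worst


print(turnaround_minutes(90))  # (48.0, 97.5)
```

So a 90-minute interview needs roughly 50 to 100 minutes of work after the recording is done.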

**Editing**
AI transcripts are a first draft. Budget 30-60 minutes of editing per audio hour. Steps:
1. Verify speaker labels: AI sometimes swaps speakers on short interjections. Fix them and use real names instead of "Speaker 1" / "Speaker 2".
2. Cross-check proper nouns and jargon against the audio and your vocabulary list.
3. Choose a format:
   - **Verbatim**: every "um," every stutter, every pause noted. For conversation analysis, phonetics, court records.
   - **Clean / smooth**: filler words removed, grammar smoothed. For publication, book quotes, journalism.
   - **Smart verbatim**: middle ground – content complete but lightly cleaned. Default for qualitative research.
4. Insert timestamps every 1-2 minutes or at topic shifts.
5. Append a final paragraph with date, interviewer/interviewee, duration, location.
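
The mechanical parts of steps 1, 4, and 5 can be scripted. The sketch below assumes the AI pass gives you a list of segments with a start time, a speaker label, and text; the `Segment` structure and function names are hypothetical and not any provider's actual export format. Verifying that the labels are correct, and choosing a format, still need a human ear.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    speaker: str  # label from the AI pass, e.g. "Speaker 1"
    text: str


def stamp(seconds: float) -> str:
    """Format seconds as [HH:MM:SS]."""
    total = int(seconds)
    return f"[{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}]"


def render(segments, names, every=90.0, footer=None):
    """Replace speaker labels, add a timestamp at most every `every` seconds,
    and append a closing metadata line if `footer` is given."""
    lines = []
    last_stamp = None
    for seg in segments:
        if last_stamp is None or seg.start - last_stamp >= every:
            lines.append(stamp(seg.start))
            last_stamp = seg.start
        speaker = names.get(seg.speaker, seg.speaker)  # keep unknown labels as-is
        lines.append(f"{speaker}: {seg.text}")
    if footer:
        lines.append("")
        lines.append("; ".join(f"{key}: {value}" for key, value in footer.items()))
    return "\n".join(lines)


segments = [
    Segment(0.0, "Speaker 1", "Today I'm speaking with Maria Müller."),
    Segment(4.2, "Speaker 2", "Thanks for having me."),
    Segment(95.0, "Speaker 1", "Let's talk about your fieldwork."),
]
names = {"Speaker 1": "Interviewer", "Speaker 2": "Maria Müller"}
print(render(segments, names))
```

With `every=90.0`, timestamps land inside the 1-2 minute window; adding stamps at topic shifts remains a manual step.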

**Export**
For research: DOCX with speaker labels and timestamps; for publication: plain text with structured paragraphs. DeepScript exports TXT, SRT, VTT, and JSON directly; for DOCX, convert the export with Pandoc or open it in Word.
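
For the Pandoc route, a small wrapper might look like the sketch below. It assumes Pandoc is installed and on your PATH and that the edited transcript is a plain-text file; the function names are illustrative. The `hard_line_breaks` extension keeps each speaker turn on its own line, since plain Markdown would otherwise merge consecutive lines into one paragraph.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(src: Path, dest: Path) -> list[str]:
    """Build the Pandoc call; the output format is inferred from the .docx suffix."""
    return [
        "pandoc", str(src),
        "--from", "markdown+hard_line_breaks",
        "--output", str(dest),
    ]


def txt_to_docx(txt_path: str) -> Path:
    """Convert an edited transcript to DOCX next to the source file."""
    src = Path(txt_path)
    dest = src.with_suffix(".docx")
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(src, dest), check=True)
    return dest
```

Calling `txt_to_docx("interview.txt")` would write `interview.docx` alongside the source file.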
