DeepScript
Product

The most accurate transcription for German-speaking dialects

We specialise in one thing: audio from the DACH region. Swiss German, Bavarian, Viennese, Low German — where generic models guess, we listen carefully.

3 free transcriptions · no credit card · data stays in Germany

app.deepscript.com
kickoff-meeting.mp3
Premium
Speaker 1 00:04

Glad the appointment worked out.

Speaker 2 00:09

My pleasure. Shall we get started right away?

Speaker 1 00:13

Yes, I'm recording the conversation, is that OK?


Most transcription services run a generic multilingual model that treats all 99 languages roughly the same. For clean Standard German that's fine. But the moment a Bernese client mumbles, a Viennese woman speaks rapid-fire or a Bavarian meeting tips into dialect, accuracy collapses.

DeepScript took the other route: our Premium model is fine-tuned on over 50,000 hours of DACH audio from real business meetings, interviews and podcasts. That dataset grew out of the years Aliru GmbH spent running Sally (sally.io), an AI meeting assistant, before DeepScript. The result: a Word Error Rate (WER) of roughly 3–5% on clean German, versus 7–9% for generic Whisper-large.

Proof

Why we can claim this

50,000+ hours of DACH audio in training

Real meetings, interviews and phone calls from Germany, Austria and Switzerland — no synthetic data.

~3–5% WER on clean German (Premium)

Measured against a curated Standard-German test set. Generic Whisper-large scores 7–9% on the same set.
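
For readers who want to check a figure like this on their own recordings, here is a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name and the simple lowercase-and-split normalisation are assumptions for illustration. This is not DeepScript's evaluation pipeline, and real benchmarks usually also normalise punctuation and numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalisation: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping one row of the table at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "sehr gerne sollen wir direkt starten"
    hypothesis = "sehr gern sollen wir starten"
    # One substitution and one dropped word against six reference words.
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

A 3–5% WER therefore means roughly three to five wrong, missing or extra words per hundred reference words.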

Heritage: Sally (sally.io)

Aliru GmbH has been building meeting AI for DACH companies for years. That experience went straight into the DeepScript model.

ISO 27001 / 9001 / 14001 certified

Information security, quality management and environmental management — three certifications, one operation.

Servers in Nuremberg & Falkenstein

Our own Hetzner hardware in German data centres. No US-cloud subprocessor anywhere in the transcription path.

In practice

What this looks like in practice

  • Dialect handling for Swiss German, Bavarian, Viennese, Low German and Saxon — where generic models drift into English, we stay in context.
  • Speaker diarization is included in both tiers. Standard reliably handles 2–6 speakers; Premium also separates voices that sound alike.
  • Custom vocabulary for proper nouns, jargon and company-specific acronyms — measurably boosts recognition of names like "Schwarz-Schilling" or "T1-weighted MRI".
  • 99 languages available (auto-detect or manual selection), with DACH languages prioritised for accuracy and latency.
  • Word-level timestamps with per-word confidence scores — visible in the editor, exportable as JSON for downstream pipelines.
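
As an illustration of what a downstream pipeline could do with the JSON export, the sketch below flags words whose confidence falls under a threshold so a reviewer can check them first. The field names (`words`, `text`, `start`, `end`, `confidence`, `speaker`), the sample values and the 0.80 threshold are all assumptions made up for this example. They are not DeepScript's documented schema, so adjust them to the actual export.

```python
import json

# Hypothetical export: field names and values are invented for illustration.
SAMPLE_EXPORT = """
{
  "words": [
    {"text": "Sollen", "start": 9.10, "end": 9.42, "confidence": 0.98, "speaker": "Speaker 2"},
    {"text": "wir", "start": 9.42, "end": 9.55, "confidence": 0.97, "speaker": "Speaker 2"},
    {"text": "direkt", "start": 9.55, "end": 9.90, "confidence": 0.61, "speaker": "Speaker 2"},
    {"text": "starten", "start": 9.90, "end": 10.40, "confidence": 0.93, "speaker": "Speaker 2"}
  ]
}
"""


def low_confidence_words(export_json: str, threshold: float = 0.80) -> list[dict]:
    """Return every word entry whose confidence is below the threshold."""
    data = json.loads(export_json)
    return [word for word in data["words"] if word["confidence"] < threshold]


if __name__ == "__main__":
    for word in low_confidence_words(SAMPLE_EXPORT):
        print(f'{word["start"]:.2f}s  {word["text"]}  ({word["confidence"]:.2f})')
```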
app.deepscript.com
Summarise last week's customer calls.
I retrieved 7 transcripts via MCP. The three most important topics: pricing feedback, a feature request for an export API, and two renewals.
via deepscript-mcp · 7 sources

How to use it

Up and running in a few steps

  1. Pick a model

     Standard (€0.18/h) for clean recordings and everyday conversations. Premium (€0.27/h) for dialect, noise, multiple speakers, or whenever the transcript needs to hold up to scrutiny.

  2. Add a vocabulary (optional)

     Drop proper nouns, product names and jargon into a list. The same vocabulary is then applied to every transcription in the project.

  3. Upload or record live

     Drop in an audio or video file, or use live transcription via the microphone. Premium runs on the priority queue.

  4. Review and export

     In the editor: rename speakers, click any word for the audio position, use confidence colouring. Export as TXT, SRT, VTT or JSON.

FAQ

Frequently asked questions

How is DeepScript better than generic Whisper?

Generic Whisper-large is trained on a broad mix of 99 languages — average across the board, excellent at none. We take the same architecture and continue training on over 50,000 hours of DACH-specific audio. On clean German that gives us ~3–5% WER, where generic Whisper-large measures 7–9% on the same set. The gap widens significantly on dialects.

Why focus on DACH?

Before DeepScript, Aliru GmbH spent years running Sally (sally.io), an AI meeting assistant with a mostly German-speaking customer base. That produced a training corpus and operational experience generic providers simply don't have. We're building the product we ourselves needed.

Do I get the same quality for English?

English is well supported via the Whisper base: WER typically lands at 4–6% on clean recordings. Our fine-tuning doesn't measurably improve it, since English already dominated the original Whisper training. We outperform on DACH audio; on English we're roughly on par with the big providers.

What does speaker diarization mean in practice?

Each word gets a speaker label in addition to its text ("Speaker 1", "Speaker 2" …). In the editor you can replace these labels with real names. In SRT/VTT exports the label appears as a prefix before each subtitle; in JSON it is a field on every word. You can override any label manually.
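
To make the relationship between the per-word labels and the subtitle prefix concrete, here is a rough sketch that groups consecutive words from the same speaker into one SRT cue and prefixes it with the label. The word fields (`text`, `start`, `end`, `speaker`) and the `Speaker 1: ` prefix style are assumptions for illustration. DeepScript's own exporter may segment and format cues differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict]) -> str:
    """Merge consecutive same-speaker words into cues with a speaker prefix."""
    cues: list[dict] = []
    for word in words:
        if cues and cues[-1]["speaker"] == word["speaker"]:
            # Same speaker keeps talking: extend the current cue.
            cues[-1]["end"] = word["end"]
            cues[-1]["text"].append(word["text"])
        else:
            # Speaker change: start a new cue.
            cues.append({
                "speaker": word["speaker"],
                "start": word["start"],
                "end": word["end"],
                "text": [word["text"]],
            })

    blocks = []
    for number, cue in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(cue['start'])} --> {srt_timestamp(cue['end'])}\n"
            f"{cue['speaker']}: {' '.join(cue['text'])}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical word entries, invented for this example.
    demo_words = [
        {"text": "Sehr", "start": 9.0, "end": 9.3, "speaker": "Speaker 2"},
        {"text": "gerne.", "start": 9.3, "end": 9.8, "speaker": "Speaker 2"},
        {"text": "Ja,", "start": 13.0, "end": 13.25, "speaker": "Speaker 1"},
        {"text": "gut.", "start": 13.25, "end": 13.75, "speaker": "Speaker 1"},
    ]
    print(words_to_srt(demo_words))
```

Grouping only on speaker change keeps the sketch short; a production exporter would also split long turns by duration or character count.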

How is Premium different from Standard?

Three things: (1) the DACH fine-tuning is only active in the Premium model, while Standard uses a leaner variant; (2) Premium runs on a priority queue with shorter wait times; (3) speaker diarization is more finely tuned for similar-sounding voices. Standard is €0.18/h, Premium €0.27/h.
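
To put the price difference in perspective, here is a small back-of-the-envelope calculator using the two hourly rates above. It assumes audio is billed pro rata by the minute and rounded to the cent. That billing granularity is an assumption of this sketch, not something stated on this page.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rates taken from the pricing above.
RATES_EUR_PER_HOUR = {
    "standard": Decimal("0.18"),
    "premium": Decimal("0.27"),
}


def estimate_cost(audio_minutes: int, tier: str) -> Decimal:
    """Estimate the cost in euros, assuming pro-rata billing per minute."""
    hours = Decimal(audio_minutes) / Decimal(60)
    cost = hours * RATES_EUR_PER_HOUR[tier]
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    minutes = 100 * 60  # 100 hours of meetings
    for tier in RATES_EUR_PER_HOUR:
        print(f"{tier}: EUR {estimate_cost(minutes, tier)}")
```

Under these assumptions, 100 hours of audio comes to €18.00 on Standard and €27.00 on Premium, a difference of €9.00.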

See it for yourself

Upload a file and see the result in minutes. Three transcriptions free, no credit card.
