Product

The most accurate transcription for German dialects

We specialise in one thing: audio from the DACH region. Swiss German, Bavarian, Viennese, Low German — where generic models guess, we listen carefully.

Try free · See pricing

3 free transcriptions · no credit card · data stays in Germany

app.deepscript.com

kickoff-meeting.mp3

Premium

Sprecher 1 · 00:04

Schön, dass es mit dem Termin geklappt hat.

Sprecher 2 · 00:09

Sehr gerne. Sollen wir direkt starten?

Sprecher 1 · 00:13

Ja — ich nehme das Gespräch auf, ist das ok?

12:48

Most transcription services run a generic multilingual model that treats all 99 languages alike. For clean Standard German that's fine. But the moment a Bernese client mumbles, a Viennese speaker goes rapid-fire or a Bavarian meeting tips into dialect, accuracy collapses.

DeepScript took the other route: our Premium model is fine-tuned on over 50,000 hours of DACH audio from real business meetings, interviews and podcasts. That dataset comes from Aliru GmbH's years of running Sally (sally.io), an AI meeting assistant, before DeepScript. The result: ~3–5% Word Error Rate on clean German, versus 7–9% for generic Whisper-large.

Proof

Why we can claim this

50,000+ hours of DACH audio in training

Real meetings, interviews and phone calls from Germany, Austria and Switzerland — no synthetic data.

~3–5% WER on clean German (Premium)

Measured against a curated Standard-German test set. Generic Whisper-large scores 7–9% on the same set.

Heritage: Sally (sally.io)

Aliru GmbH has been building meeting AI for DACH companies for years. That experience went straight into the DeepScript model.

ISO 27001 / 9001 / 14001 certified

Information security, quality management and environmental management — three certifications, one operation.

Servers in Nuremberg & Falkenstein

Our own Hetzner hardware in German data centres. No US-cloud subprocessor anywhere in the transcription path.
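For readers who want to sanity-check WER figures like the ones quoted above on their own recordings, here is a minimal sketch of the standard calculation: word-level edit distance divided by the number of reference words. The normalisation used here (lowercasing and whitespace splitting) is an assumption; this page does not say how the DeepScript test set is scored, and the sample sentences are purely illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalisation: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "schön dass es mit dem termin geklappt hat"
hypothesis = "schön das es mit dem termin geklappt hat"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 12.5%
```

One wrong word in an eight-word sentence already gives 12.5%, which is why WER is only meaningful when averaged over a large test set.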

In practice

What this looks like in practice

Dialect handling for Swiss German, Bavarian, Viennese, Low German and Saxon — where generic models drift into English, we stay in context.
Speaker diarization included in both tiers. Standard reliably handles 2–6 speakers; Premium also separates voices that sound alike.
Custom vocabulary for proper nouns, jargon and company-specific acronyms — measurably boosts recognition of names like "Schwarz-Schilling" or "T1-weighted MRI".
99 languages available (auto-detect or manual selection); DACH languages are prioritised for accuracy and latency.
Word-level timestamps with per-word confidence scores — visible in the editor, exportable as JSON for downstream pipelines.
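As an illustration of what a downstream pipeline could do with the JSON export, the sketch below lists every word whose confidence falls under a threshold so a reviewer can jump straight to the doubtful spots. The payload layout and the field names (`words`, `text`, `start`, `end`, `confidence`) are assumptions made for this example, not the documented export schema, and the 0.8 threshold is arbitrary.

```python
import json

# Hypothetical export payload: the field names are assumptions for
# illustration, not the documented DeepScript schema.
EXPORT_JSON = """
{
  "words": [
    {"text": "Schön,", "start": 4.0, "end": 4.4, "confidence": 0.97},
    {"text": "dass", "start": 4.4, "end": 4.6, "confidence": 0.58},
    {"text": "es", "start": 4.6, "end": 4.7, "confidence": 0.97},
    {"text": "geklappt", "start": 4.7, "end": 5.3, "confidence": 0.74},
    {"text": "hat.", "start": 5.3, "end": 5.6, "confidence": 0.97}
  ]
}
"""


def low_confidence_words(export_json, threshold=0.8):
    """Return (text, start_seconds, confidence) for each word below the threshold."""
    words = json.loads(export_json)["words"]
    return [
        (word["text"], word["start"], word["confidence"])
        for word in words
        if word["confidence"] < threshold
    ]


for text, start, confidence in low_confidence_words(EXPORT_JSON):
    print(f"{start:6.2f}s  {text:<10} confidence={confidence:.2f}")
```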

app.deepscript.com

Summarise last week's customer calls.

I retrieved 7 transcripts via MCP. The three most important topics: pricing feedback, a feature request for an export API, and two renewals.
via deepscript-mcp · 7 sources

How to use it

Up and running in a few steps

1. Pick a model
Standard (€0.18/h) for clean recordings and everyday conversations. Premium (€0.27/h) for dialect, noise, multiple speakers, or whenever the transcript needs to hold up to scrutiny.
2. Add a vocabulary (optional)
Drop proper nouns, product names and jargon into a list. The same vocabulary is then applied to every transcription in the project.
3. Upload or record live
Drop in an audio or video file, or use live transcription via the microphone. Premium runs on the priority queue.
4. Review and export
In the editor: rename speakers, click any word for the audio position, use confidence colouring. Export as TXT, SRT, VTT or JSON.
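To compare the two tiers for a given workload, a rough estimate can be sketched from the per-hour prices quoted in step 1. The function name and the assumption of plain linear billing (no minimum charge, rounding to whole cents) are illustrative; actual invoicing may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-hour list prices quoted on this page, in euros.
RATES_EUR_PER_HOUR = {"standard": Decimal("0.18"), "premium": Decimal("0.27")}


def estimate_cost(audio_hours, tier):
    """Estimate the price of `audio_hours` of audio, rounded to whole cents.

    Assumes plain linear per-hour billing; real invoices may round differently.
    """
    rate = RATES_EUR_PER_HOUR[tier]
    total = rate * Decimal(str(audio_hours))
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for tier in RATES_EUR_PER_HOUR:
    print(f"40 h on {tier}: €{estimate_cost(40, tier)}")
# 40 h on standard: €7.20
# 40 h on premium: €10.80
```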

FAQ

Frequently asked questions

How is DeepScript better than generic Whisper?

Generic Whisper-large is trained on a broad mix of 99 languages, which makes it a generalist with no specialisation in German dialects. We take the same architecture and continue training on over 50,000 hours of DACH-specific audio. On clean German that gives us ~3–5% WER, where generic Whisper-large measures 7–9% on the same curated Standard-German test set. The gap widens significantly on dialects.

Why focus on DACH?

Before DeepScript, Aliru GmbH spent years running Sally (sally.io), an AI meeting assistant with a mostly German-speaking customer base. That produced a training corpus and operational experience generic providers simply don't have. We're building the product we ourselves needed.

Do I get the same quality for English?

English is excellently supported via the Whisper base — WER typically lands at 4–6% on clean recordings. Our fine-tuning doesn't measurably improve it (English already dominated the original Whisper training). We outperform on DACH; on English we're roughly on par with the big providers.

What does speaker diarization mean in practice?

Each word gets a speaker label in addition to its text ("Speaker 1", "Speaker 2" …). In the editor you can rename these labels to real names. In SRT/VTT exports the label appears as a prefix before each subtitle; in JSON it is a field on every word. You can override any label manually.
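To make the relationship between the two export shapes concrete, the sketch below groups consecutive words by speaker and renders each turn as an SRT cue with the speaker name as a prefix. The word fields (`text`, `start`, `end`, `speaker`), the `Name: text` prefix style and the sample name are assumptions for illustration; the actual export layout may differ.

```python
from itertools import groupby


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, names=None):
    """Turn per-word speaker labels into one SRT cue per speaker turn."""
    names = names or {}
    cues = []
    turns = groupby(words, key=lambda word: word["speaker"])
    for index, (speaker, group) in enumerate(turns, start=1):
        turn = list(group)
        label = names.get(speaker, speaker)  # renamed label, else the default
        text = " ".join(word["text"] for word in turn)
        start = srt_timestamp(turn[0]["start"])
        end = srt_timestamp(turn[-1]["end"])
        cues.append(f"{index}\n{start} --> {end}\n{label}: {text}\n")
    return "\n".join(cues)


# Hypothetical word-level data; field names are assumed, not documented.
words = [
    {"text": "Sehr", "start": 9.0, "end": 9.3, "speaker": "Speaker 2"},
    {"text": "gerne.", "start": 9.3, "end": 9.8, "speaker": "Speaker 2"},
    {"text": "Ja,", "start": 13.0, "end": 13.2, "speaker": "Speaker 1"},
    {"text": "gut.", "start": 13.2, "end": 13.5, "speaker": "Speaker 1"},
]
print(words_to_srt(words, names={"Speaker 2": "Anna"}))
```

Grouping by consecutive speaker keeps one cue per turn; a real subtitle export would likely also split long turns by duration or character count.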

How is Premium different from Standard?

Three things: (1) the DACH fine-tuning is only active in the Premium model — Standard uses a leaner variant; (2) Premium runs on a priority queue with lower wait times; (3) speaker diarization is more finely tuned for similar-sounding voices. Standard is €0.18/h, Premium €0.27/h.

See it for yourself

Upload a file and see the result in minutes. Three transcriptions free, no credit card.

Try free · All features