Integration
REST API for transcription — three calls, done
Upload, poll, fetch. JSON responses, word-level timestamps, 99 languages, hosted on our own servers in Germany.
The DeepScript API follows the classic async job pattern: POST an audio or video file to `/v1/transcriptions`, receive a job ID, then either poll `/v1/transcriptions/{id}` every few seconds or get notified via webhook. Authentication uses an `X-API-KEY` header; keys start with `ds_live_` and are generated in the dashboard. The response contains full text, word-level timestamps with confidence scores, detected language, speaker labels and computed cost. Export formats (TXT, SRT, VTT, JSON) are available via `/v1/transcriptions/{id}/export?format=srt`. The full OpenAPI 3.1 spec lives at `/openapi.json`, an interactive Scalar UI at `/docs`.
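As a small illustration of the conventions described above, the sketch below builds the authentication header and an export URL. The helper names `auth_headers` and `export_url` are hypothetical, and lowercasing the format value is an assumption based on the `format=srt` example; the header name, key prefix, base URL and format list come from the text.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.deepscript.com/v1"
EXPORT_FORMATS = {"txt", "srt", "vtt", "json"}  # formats named in the text


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the auth header; reject keys without the documented prefix."""
    if not api_key.startswith("ds_live_"):
        raise ValueError("expected an API key starting with 'ds_live_'")
    return {"X-API-KEY": api_key}


def export_url(job_id: str, fmt: str) -> str:
    """Return the export URL for a job in the requested format."""
    fmt = fmt.lower()  # assumption: the API expects lowercase, as in format=srt
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{BASE}/transcriptions/{quote(job_id, safe='')}/export?{query}"
```

Both helpers are pure functions, so they can be checked without network access before being wired into an HTTP client.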
What you can build
- Upload audio and video files up to 500 MB (mp3, wav, flac, ogg, m4a, aac, mp4, mkv, webm, mov).
- Word-level timestamps with per-word confidence — ideal for subtitle generation and editor integrations.
- Speaker diarization in both tiers, DACH dialect optimisation in the Premium model.
- Custom vocabulary per request — company names, medical terms and proper nouns get recognised correctly.
- On-demand export formats: TXT, SRT, VTT, JSON. No client-side re-encoding required.
- Webhook callbacks on `transcription.completed` — polling is optional, no long-polling required.
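The upload limits in the list above can be checked on the client before any bytes are sent. This is a minimal sketch, not part of the API: `check_upload` is a hypothetical helper, and reading "500 MB" as 500,000,000 bytes is an assumption (the server may count differently).

```python
from pathlib import Path

# Assumption: "500 MB" is interpreted as decimal megabytes.
MAX_UPLOAD_BYTES = 500 * 1000 * 1000
# Extensions listed in the feature overview.
ALLOWED_EXTENSIONS = {
    "mp3", "wav", "flac", "ogg", "m4a", "aac", "mp4", "mkv", "webm", "mov",
}


def check_upload(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(name).suffix.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file extension: {ext or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file exceeds {MAX_UPLOAD_BYTES} bytes")
    return problems
```

For a file on disk, this could be called as `check_upload(path.name, path.stat().st_size)` with `path` being a `pathlib.Path`.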
Code samples
# Upload an audio file and start a Premium transcription job
curl -X POST https://api.deepscript.com/v1/transcriptions \
  -H "X-API-KEY: ds_live_xxx" \
  -F "file=@meeting.mp3" \
  -F "model=premium" \
  -F "language=de"

# Response:
# {
#   "id": "8b1f2e4a-9c3d-4f7e-a1b2-1234567890ab",
#   "status": "queued",
#   "progress": 0,
#   "model": "premium",
#   "createdAt": "2026-06-09T10:14:22Z"
# }

# Poll until done (repeat this call every few seconds)
curl https://api.deepscript.com/v1/transcriptions/8b1f2e4a-9c3d-4f7e-a1b2-1234567890ab \
  -H "X-API-KEY: ds_live_xxx"

# Download as SRT
curl -o meeting.srt \
  "https://api.deepscript.com/v1/transcriptions/8b1f2e4a-9c3d-4f7e-a1b2-1234567890ab/export?format=srt" \
  -H "X-API-KEY: ds_live_xxx"

import { readFile } from "node:fs/promises";

const API_KEY = process.env.DEEPSCRIPT_API_KEY; // "ds_live_xxx"
const BASE = "https://api.deepscript.com/v1";

// Parse a JSON response, throwing on non-2xx status codes
async function json(res) {
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

async function transcribe(filePath) {
  // Build the multipart body from the file on disk
  const buffer = await readFile(filePath);
  const blob = new Blob([buffer], { type: "audio/mpeg" });
  const form = new FormData();
  form.append("file", blob, "meeting.mp3");
  form.append("model", "premium");
  form.append("language", "de");

  // Create the job; the response carries the job ID
  const created = await fetch(`${BASE}/transcriptions`, {
    method: "POST",
    headers: { "X-API-KEY": API_KEY },
    body: form,
  }).then(json);

  // Poll every 3 seconds until the job completes or fails
  while (true) {
    await new Promise((r) => setTimeout(r, 3000));
    const job = await fetch(`${BASE}/transcriptions/${created.id}`, {
      headers: { "X-API-KEY": API_KEY },
    }).then(json);
    if (job.status === "completed") return job.result;
    if (job.status === "failed") throw new Error(job.errorMessage);
  }
}

const result = await transcribe("./meeting.mp3");
console.log(result.text);

import os
import time

import requests

API_KEY = os.environ["DEEPSCRIPT_API_KEY"]  # "ds_live_xxx"
BASE = "https://api.deepscript.com/v1"
HEADERS = {"X-API-KEY": API_KEY}


def transcribe(path: str) -> dict:
    # Upload the file and create the job
    with open(path, "rb") as f:
        response = requests.post(
            f"{BASE}/transcriptions",
            headers=HEADERS,
            files={"file": f},
            data={"model": "premium", "language": "de"},
            timeout=120,
        )
    response.raise_for_status()
    job_id = response.json()["id"]

    # Poll every 3 seconds until the job completes or fails
    while True:
        time.sleep(3)
        response = requests.get(
            f"{BASE}/transcriptions/{job_id}", headers=HEADERS, timeout=30
        )
        response.raise_for_status()
        job = response.json()
        if job["status"] == "completed":
            return job["result"]
        if job["status"] == "failed":
            raise RuntimeError(job["errorMessage"])


result = transcribe("meeting.mp3")
print(result["text"])

Setup in a few steps
1. Generate an API key
   Generate a key under Settings → Security in the dashboard. The key is shown only once and starts with `ds_live_`. Store it in your app as the environment variable `DEEPSCRIPT_API_KEY`.
2. Send the upload request
   Send a multipart upload to `POST /v1/transcriptions` with the fields `file`, `model` (`standard` or `premium`) and, optionally, `language` (ISO 639-1) and `vocabularyId`. The API returns a job ID immediately (HTTP 202).
3. Poll or wait for the webhook
   Either call `GET /v1/transcriptions/{id}` every 2-5 seconds, or register a webhook on `transcription.completed`. Rule of thumb: 1 minute of audio takes 5-15 seconds to process in Standard, slightly longer in Premium.
4. Fetch or export the result
   Once `status` is `completed`, the `result` field contains the full text, words with timestamps and speaker labels. For SRT, VTT, TXT or JSON export, use `GET /v1/transcriptions/{id}/export?format=srt`.
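The polling step above runs until the job finishes, so production code may want an upper bound. The following is a sketch under stated assumptions: `estimate_timeout` and `wait_for_job` are hypothetical helpers, the safety factor of 2 is an arbitrary choice, and `get_job` stands for any callable that returns the parsed JSON of `GET /v1/transcriptions/{id}`.

```python
import time
from typing import Callable


def estimate_timeout(audio_minutes: float, safety_factor: float = 2.0) -> float:
    """Derive a timeout from the upper end of the 5-15 s per audio minute rule of thumb."""
    return audio_minutes * 15.0 * safety_factor


def wait_for_job(
    get_job: Callable[[], dict],
    timeout_s: float,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll get_job() until the job completes, fails, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        job = get_job()
        if job["status"] == "completed":
            return job["result"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("errorMessage", "transcription failed"))
        if clock() >= deadline:
            raise TimeoutError(f"job not finished after {timeout_s} seconds")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply.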
Frequently asked questions
What are the rate limits?
100 requests per minute per API key for authenticated calls, 30 per minute for unauthenticated calls. Responses include `X-RateLimit-Limit`, `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers. If you exceed the limit, you receive HTTP 429 with a `Retry-After` header.
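One way to honour a 429 response is to read `Retry-After` and wait that long before retrying. The sketch below handles both forms the HTTP standard allows (delay in seconds or an HTTP date); `retry_after_seconds` is a hypothetical helper, and the 1-second fallback for a missing or unparseable header is an assumption. It does not interpret `X-RateLimit-Reset`, because the text does not say whether that value is a timestamp or a delay.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

FALLBACK_DELAY_S = 1.0  # assumption: used when Retry-After is missing or unparseable


def retry_after_seconds(
    status: int, headers: Mapping[str, str], now: Optional[datetime] = None
) -> Optional[float]:
    """Return the delay before a retry, or None if the response was not a 429."""
    if status != 429:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after", "").strip()
    if value.isdigit():
        return float(value)  # delay-seconds form
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return FALLBACK_DELAY_S
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

With `requests`, this could be called as `retry_after_seconds(response.status_code, response.headers)`.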
Do you support idempotency keys?
Yes — send `Idempotency-Key: <uuid>` as a header on POST `/v1/transcriptions`. Identical keys within 24 hours return the same response without starting a second job. Recommended for retries on network errors.
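The important detail when retrying is that every attempt carries the same key. A minimal sketch, assuming a hypothetical `send` callable that performs the actual `POST /v1/transcriptions` with the extra headers it is given and raises `OSError` on network failure (the `requests` exceptions derive from `OSError`):

```python
import uuid
from typing import Callable


def post_with_retry(send: Callable[[dict], dict], attempts: int = 3) -> dict:
    """Call send(headers) up to `attempts` times, reusing one Idempotency-Key."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # Generate the key once, outside the loop, so retries are deduplicated.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(attempts):
        try:
            return send(headers)
        except OSError as exc:  # network-level failure, safe to retry
            last_error = exc
    raise last_error
```

A real client would usually add a short backoff between attempts; it is left out here to keep the key handling visible.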
What polling interval should I use?
We recommend 2-5 seconds. For longer audio (>30 min) every 10 seconds is fine. If you'd rather avoid polling, use webhooks (`/v1/webhooks`) or the Server-Sent Events stream at `/v1/transcriptions/{id}/events`.
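The interval also determines how many jobs one key can poll in parallel under the limit of 100 requests per minute. The helpers below are a hypothetical sketch of that arithmetic; the choice of 3 seconds within the recommended 2-5 second range is arbitrary, and the calculation ignores any other requests made with the same key.

```python
def poll_interval(audio_seconds: float) -> float:
    """Pick a polling interval: 10 s for audio longer than 30 minutes, else 3 s."""
    return 10.0 if audio_seconds > 30 * 60 else 3.0


def max_parallel_jobs(interval_s: float, limit_per_minute: int = 100) -> int:
    """How many jobs can be polled at this interval without exceeding the limit."""
    # Each job costs 60 / interval_s requests per minute.
    return int(limit_per_minute * interval_s // 60)
```

At a 3-second interval, for example, five jobs polled in parallel already use the full budget of 100 requests per minute, which is one reason to prefer webhooks for batch workloads.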
What happens on a failed job?
Status flips to `failed` and `errorMessage` contains an RFC 7807 Problem Details string. Common causes: file too short (<1s), no detectable audio, unsupported format. You are not charged for failed jobs.
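To turn the error into something loggable, the string can be parsed into its Problem Details members. This sketch assumes `errorMessage` is a JSON-encoded object, which the text does not state explicitly, and falls back to treating it as plain text otherwise; `parse_error` is a hypothetical helper. The member names are the standard ones from RFC 7807.

```python
import json

PROBLEM_FIELDS = ("type", "title", "status", "detail", "instance")


def parse_error(error_message: str) -> dict:
    """Parse errorMessage as Problem Details JSON, falling back to plain text."""
    try:
        problem = json.loads(error_message)
    except (TypeError, ValueError):
        problem = None
    if not isinstance(problem, dict):
        # Not JSON (or not an object): keep the raw text as the detail.
        return {"type": "about:blank", "title": "Transcription failed",
                "status": None, "detail": str(error_message), "instance": None}
    parsed = {field: problem.get(field) for field in PROBLEM_FIELDS}
    parsed["type"] = parsed["type"] or "about:blank"  # RFC 7807 default
    return parsed
```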
Is there an official SDK?
For now we ship the OpenAPI 3.1 spec at `/openapi.json` — use `openapi-generator-cli` or `openapi-typescript` to generate a typed client in any language. Official TypeScript and Python SDKs are in the pipeline.
Ready to ship this to production?
Create an account, generate an API key, ship. Three transcriptions free to try. Full OpenAPI 3.1 docs at api.deepscript.com/docs.