v0.4.0 · AGPL-3.0 · Self-Hosted

OpenTranscribe

AI-powered transcription with speaker identification, summarization, and search. Runs on your infrastructure. No per-minute fees.

[Screenshot: OpenTranscribe interface showing transcription with speaker labels]
Install with one command
curl -fsSL https://raw.githubusercontent.com/davidamacey/OpenTranscribe/master/setup-opentranscribe.sh | bash
70x realtime speed · 100+ languages · 8 LLM providers · $0 per-minute cost

What you get

A complete transcription platform, not just a Whisper wrapper.

Transcription

WhisperX with word-level timestamps across 100+ languages. Native cross-attention alignment, 70x realtime on GPU, configurable models.

Speaker Identification

PyAnnote v4 diarization with cross-video voice fingerprinting. GPU-accelerated clustering, gender classification, and automatic profile matching.

AI Analysis

BLUF summaries, action item extraction, auto-labeling. Works with OpenAI, Claude, vLLM, Ollama, or OpenRouter. Bring your own model.

Hybrid Search

Full-text and semantic search across all transcripts. BM25 + neural vectors merged via Reciprocal Rank Fusion on OpenSearch.
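To illustrate how a BM25 ranking and a vector-similarity ranking can be merged, here is a minimal Reciprocal Rank Fusion sketch in plain Python (the constant k=60 is the conventional default from the RRF literature, not necessarily the value OpenSearch or OpenTranscribe uses):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs (best first).

    Each document's fused score is the sum of 1 / (k + rank) over
    every list it appears in, so documents ranked well by multiple
    retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that tops the keyword list and places second in the semantic list will usually outrank one that appears in only a single list, which is exactly the behavior hybrid search wants.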

Enterprise Auth

LDAP/AD, Keycloak OIDC, PKI/X.509, MFA. FedRAMP-aligned controls, AES-256-GCM encryption, audit logging, FIPS 140-3 ready.

Self-Hosted

Docker Compose deployment. Multi-GPU scaling, 3-stage Celery pipeline, health monitoring. Runs air-gapped. Your data, your servers.

Use cases

Meetings

Speaker-attributed transcripts with action items

Media Production

Subtitles, indexing, and podcast transcription

Legal & Compliance

Depositions, audit trails, FIPS compliance

Research

Interview analysis and multilingual support

Government

Air-gapped, PKI auth, classification banners

Call Centers

High-volume analysis and trend detection

Open stack, zero lock-in

Every component is open source. Deploy on bare metal, in your cloud, or fully air-gapped. Same Docker Compose config scales from a laptop to a multi-GPU rack.

  • 3-stage Celery chain separates CPU and GPU work
  • Multi-GPU worker scaling with configurable concurrency
  • Automatic Alembic migrations on startup
  • Non-root containers, health checks, Flower monitoring
  • Offline deployment with pre-cached models
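The CPU/GPU split above can be sketched as a Celery routing config: CPU-bound and GPU-bound stages go to separate queues, so GPU workers only pull GPU work. All task and queue names here are illustrative assumptions, not OpenTranscribe's actual identifiers:

```python
# Hypothetical celeryconfig.py fragment for a 3-stage pipeline.
# Stage names are made up for illustration; OpenTranscribe's real
# task identifiers and queue layout (7 queues) will differ.
task_routes = {
    "transcribe.extract_audio": {"queue": "cpu"},   # stage 1: CPU-bound demux/decode
    "transcribe.whisperx_asr":  {"queue": "gpu"},   # stage 2: GPU-bound ASR + diarization
    "transcribe.postprocess":   {"queue": "cpu"},   # stage 3: CPU-bound summaries, indexing
}
```

With routing like this, scaling is a matter of how many workers consume each queue, e.g. one worker per GPU with `celery -A app worker -Q gpu --concurrency=1` and a larger pool on `-Q cpu`.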
  • WhisperX + PyAnnote v4: transcription and speaker diarization
  • OpenSearch 3.4: full-text and neural vector search
  • Celery + Redis: 7-queue task pipeline (GPU/CPU split)
  • PostgreSQL: relational storage with Alembic migrations
  • MinIO: S3-compatible object storage, AES-256
  • Svelte + FastAPI: frontend and async API backend

How it compares

Feature comparison with commercial platforms and open source alternatives

[Comparison table: per-product support markers did not survive extraction; the products and feature rows below are what remains recoverable.]

Platforms compared against OpenTranscribe. Commercial: Otter.ai, Rev, Descript, Sonix, Trint. Open source: Vibe, Scriberr, LinTO, Whishper, MacWhisper.

Features compared: speaker diarization, word-level timestamps, 100+ languages (some alternatives cover 20+, 49+, or 50+), AI summarization, custom LLM providers, full-text + semantic search, cross-video speaker matching, self-hosted / air-gapped deployment, enterprise auth (LDAP/PKI/OIDC), multi-user / roles, multi-GPU scaling, cloud ASR providers, URL import (YouTube etc.), Docker Compose deploy, desktop app, subtitle editor, SOC 2 / ISO 27001.

Data based on publicly available documentation as of March 2026. Features may vary by plan or version.

Own your transcription pipeline

Deploy in under 5 minutes. No account needed, no data leaves your servers.