Architecture

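OpenTranscribe is split into four parts: a Svelte frontend, a FastAPI backend, Celery workers organized into task-specific queues, and a data layer made up of PostgreSQL, MinIO, OpenSearch, and Redis.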

System Components

Frontend (Svelte)

Progressive Web App
TypeScript
Responsive design
Real-time WebSocket updates

Backend (FastAPI)

Async Python
RESTful API
OpenAPI documentation
WebSocket support

Workers (Celery)

GPU Queue: Transcription, diarization
Download Queue: YouTube downloads
CPU Queue: Waveform generation
NLP Queue: LLM features
Utility Queue: Maintenance
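
As a rough illustration of how tasks might be spread across these queues, the sketch below defines a router function in the shape Celery accepts for its task_routes setting. The queue names (gpu, download, cpu, nlp, utility) and the task-name prefixes are assumptions made for this example; they are not taken from the OpenTranscribe codebase.

```python
# Hypothetical task-name prefixes mapped to hypothetical queue names.
# Neither the prefixes nor the queue names come from the OpenTranscribe source.
ROUTES_BY_PREFIX = {
    "transcription.": "gpu",
    "diarization.": "gpu",
    "download.": "download",
    "waveform.": "cpu",
    "llm.": "nlp",
}

DEFAULT_QUEUE = "utility"


def route_task(name, args=None, kwargs=None, options=None, task=None, **kw):
    """Pick a queue for a task based on its name.

    The signature matches what Celery expects from a router function,
    so this could be listed in app.conf.task_routes.
    """
    for prefix, queue in ROUTES_BY_PREFIX.items():
        if name.startswith(prefix):
            return {"queue": queue}
    # Anything unrecognized is treated as maintenance work.
    return {"queue": DEFAULT_QUEUE}


if __name__ == "__main__":
    for task_name in ("transcription.run", "waveform.generate", "cleanup.tmp"):
        print(task_name, "->", route_task(task_name)["queue"])
```

With a router like this, each worker can be started with Celery's -Q option so that it consumes only its own queue, which keeps GPU-bound work from competing with downloads or maintenance jobs.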

Data Layer

PostgreSQL: Relational data
MinIO: Object storage (S3-compatible)
OpenSearch: Full-text and vector search
Redis: Task queue and caching
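
One way to picture the data layer is as four network endpoints that the backend and workers need to reach. The sketch below collects them into a single settings object. The environment variable names (POSTGRES_HOST, MINIO_PORT, and so on) and the localhost defaults are assumptions for illustration; the port numbers are only the usual defaults for each service, not values confirmed for OpenTranscribe.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEndpoint:
    host: str
    port: int


# Usual default ports for each service; hosts are placeholders.
DEFAULTS = {
    "postgres": ("localhost", 5432),    # relational data
    "minio": ("localhost", 9000),       # S3-compatible object storage
    "opensearch": ("localhost", 9200),  # full-text and vector search
    "redis": ("localhost", 6379),       # task queue and caching
}


def load_endpoints(env=None):
    """Build endpoints from environment variables, falling back to DEFAULTS.

    Variable names follow a hypothetical <SERVICE>_HOST / <SERVICE>_PORT scheme.
    """
    env = os.environ if env is None else env
    endpoints = {}
    for name, (host, port) in DEFAULTS.items():
        prefix = name.upper()
        endpoints[name] = ServiceEndpoint(
            host=env.get(f"{prefix}_HOST", host),
            port=int(env.get(f"{prefix}_PORT", port)),
        )
    return endpoints


if __name__ == "__main__":
    for name, endpoint in load_endpoints().items():
        print(f"{name}: {endpoint.host}:{endpoint.port}")
```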

Data Flow

1. User uploads file
2. File stored in MinIO
3. Celery task queued
4. GPU worker processes:
   - Transcription (WhisperX)
   - Diarization (PyAnnote)
5. Results stored in PostgreSQL
6. Indexed in OpenSearch
7. WebSocket notification to UI
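
The flow above can be traced end to end with in-memory stand-ins. This is a minimal sketch, not the project's implementation: the Stores class replaces MinIO, PostgreSQL, OpenSearch, the Celery queue, and the WebSocket channel with plain Python containers, and fake_transcribe and fake_diarize are hypothetical placeholders for WhisperX and PyAnnote.

```python
from dataclasses import dataclass, field


@dataclass
class Stores:
    objects: dict = field(default_factory=dict)        # stands in for MinIO
    rows: dict = field(default_factory=dict)           # stands in for PostgreSQL
    index: dict = field(default_factory=dict)          # stands in for OpenSearch
    queue: list = field(default_factory=list)          # stands in for the Celery queue
    notifications: list = field(default_factory=list)  # stands in for WebSocket messages


def upload(stores, file_id, data):
    """Store the uploaded file, then queue a task for it."""
    stores.objects[file_id] = data
    stores.queue.append(file_id)


def fake_transcribe(data):
    """Placeholder for transcription: treats the bytes as UTF-8 text."""
    return data.decode("utf-8")


def fake_diarize(text):
    """Placeholder for diarization: assigns everything to one speaker."""
    return [{"speaker": "SPEAKER_00", "text": text}]


def run_gpu_worker(stores):
    """Process queued files: transcribe, diarize, store, index, notify."""
    while stores.queue:
        file_id = stores.queue.pop(0)
        data = stores.objects[file_id]
        text = fake_transcribe(data)
        segments = fake_diarize(text)
        stores.rows[file_id] = {"text": text, "segments": segments}
        stores.index[file_id] = text.lower().split()
        stores.notifications.append({"file_id": file_id, "status": "completed"})


if __name__ == "__main__":
    stores = Stores()
    upload(stores, "f1", b"Hello world")
    run_gpu_worker(stores)
    print(stores.rows["f1"])
    print(stores.notifications)
```

The point of the sketch is the ordering: the upload returns as soon as the file is stored and the task is queued, and everything after that happens in the worker, with the notification sent only once results are stored and indexed.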

Deployment Models

Development: docker-compose with hot reload
Production: docker-compose with optimizations
Offline: Airgapped deployment
Cloud: AWS/GCP/Azure with GPU instances

Next Steps
