Skip to main content

Environment Variables

Comprehensive reference for all OpenTranscribe environment variables.

Quick Reference

Edit .env file in installation directory. See .env.example for full template.

GPU Configuration

TORCH_DEVICE=auto  # or: cuda, mps, cpu
USE_GPU=auto # or: true, false
GPU_DEVICE_ID=0 # Which GPU (0, 1, 2, etc.)
COMPUTE_TYPE=auto # or: float16, float32, int8
BATCH_SIZE=auto # or: 8, 16, 32

Model & Caching

Configure AI models, caching behavior, and model discovery.

# Whisper Transcription Models
WHISPER_MODEL=large-v3-turbo # or: large-v3, large-v2, medium, small, base, tiny

# PyAnnote Speaker Diarization
PYANNOTE_VERSION=auto # or: v3, v4 (auto-detect installed version)
EMBEDDING_MODE=auto # or: v3, v4 (embedding model version)
MIN_SPEAKERS=1
MAX_SPEAKERS=20

# Model Caching & Storage
MODEL_CACHE_DIR=./models
HUGGINGFACE_CACHE=${MODEL_CACHE_DIR}/huggingface
TORCH_CACHE=${MODEL_CACHE_DIR}/torch
HUGGINGFACE_TOKEN=hf_your_token_here

# Warm Cache (Pre-load Models on Startup)
WARM_CACHE_ENABLED=false

Transcription Performance Options

# Whisper beam_size: lower = faster but slightly less accurate (default: 5)
# Set to 1 for greedy decoding (~25-40% faster, ~1-2% lower WER for English)
WHISPER_BEAM_SIZE=5

# Whisper compute_type: quantization for faster inference
# Default: auto-detected (float16 on CUDA). Options: float16, int8_float16, int8, float32
# int8_float16 gives ~15-25% speedup with negligible quality loss
WHISPER_COMPUTE_TYPE=float16

Model Recommendations

Use CaseModelNotes
English (primary)large-v3-turbo6x faster, excellent English accuracy
Multilinguallarge-v3Best accuracy for 100+ languages
Translation to Englishlarge-v3Turbo cannot translate
Speed-criticallarge-v3-turboRecommended for most use cases
Maximum accuracylarge-v3Slower but best overall

Whisper Model VRAM Requirements

ModelBatch Size 1Batch Size 8Batch Size 16
tiny~1GB~2GB~3GB
base~1GB~2GB~3GB
small~2GB~4GB~6GB
medium~5GB~10GB~15GB
large-v3-turbo~6GB~10GB~15GB
large-v3~10GB~20GB~30GB
large-v2~10GB~20GB~30GB

PyAnnote v4 Configuration

Configure speaker diarization and voice fingerprinting for speaker identification and tracking.

# Speaker Diarization Version
PYANNOTE_VERSION=auto # or: v3, v4 (auto-detect installed version)

# Speaker Detection Ranges
MIN_SPEAKERS=1 # Minimum speakers to detect
MAX_SPEAKERS=20 # Maximum speakers to detect (no hard limit, can increase for large events)

# Embedding & Fingerprinting
EMBEDDING_MODE=auto # or: v3, v4 (which embedding model to use)

# Model Caching & Warmup
WARM_CACHE_ENABLED=false # Pre-load speaker models on startup for faster first transcription
MODEL_CACHE_DIR=./models

Speaker Detection Use Cases

Event TypeSpeakersRecommended MAX_SPEAKERSNotes
Small meetings2-520 (default)Works well with default
Medium meetings5-1520 (default)Works well with default
Large conferences15-3030-40Increase MAX_SPEAKERS
Very large events30-50+50-100No hard limit

Warm Cache Benefits

Enabling WARM_CACHE_ENABLED=true pre-loads PyAnnote models on startup:

  • First transcription: 15-20 seconds faster (models already loaded)
  • Subsequent transcriptions: No performance change
  • Trade-off: ~500MB additional memory usage at startup
  • Recommended for: High-throughput systems with continuous transcription

Configure neural search capabilities for semantic search across transcriptions.

# Enable/Disable Neural Search
NEURAL_SEARCH_ENABLED=true

# ML Commons Plugin (Powers Neural Search)
OPENSEARCH_ML_COMMONS_ENABLED=true

# OpenSearch Connection
OPENSEARCH_URL=http://opensearch:9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=your_secure_password

# Vector Search Configuration
NEURAL_SEARCH_MODEL_ID=your_model_id # ML Commons model ID for embeddings
NEURAL_SEARCH_BATCH_SIZE=32

Neural Search VRAM Requirements

Embedding ModelVRAM (GPU)VRAM (CPU)Batch Size
sentence-transformers/all-MiniLM-L6-v2~200MB~500MB32-64
sentence-transformers/all-mpnet-base-v2~400MB~1GB16-32
bge-base-en-v1.5~400MB~1GB16-32
bge-large-en-v1.5~1GB~2GB8-16
BAAI/bge-large-zh-v1.5~1GB~2GB8-16

Search Performance Tuning

# Collapse optimization: max concurrent group searches (default: 20, 0 = sequential)
SEARCH_COLLAPSE_MAX_CONCURRENT=20

# Bulk batch size: chunks per OpenSearch bulk request (default: 100)
SEARCH_BULK_BATCH_SIZE=100

# Neural ingest batch size: documents per embedding call (default: 5)
SEARCH_NEURAL_BATCH_SIZE=5

# Reindex refresh interval: flush Lucene segments every N files (default: 100)
SEARCH_REINDEX_REFRESH_INTERVAL=100

# Hybrid search over-fetch cap: max candidates per sub-query before RRF merge (default: 200)
# Increase for large indexes where top-200 misses relevant results
SEARCH_MAX_OVERFETCH=200

# RRF rank constant: lower = more aggressive top-result boosting (default: 30)
SEARCH_RRF_RANK_CONSTANT=30

ML Commons Plugin

The OpenSearch ML Commons plugin enables vector embeddings and semantic search:

  • Status: Automatically detected on OpenSearch startup
  • Configuration: Database-driven via Admin UI
  • Fallback: Full-text search if neural search disabled

Cloud ASR Providers

Configure cloud-based speech recognition as an alternative to local GPU processing.

# ASR Provider Selection
ASR_PROVIDER=local # local, deepgram, assemblyai, openai, google, azure, aws, speechmatics, gladia

# Deepgram
DEEPGRAM_API_KEY=
DEEPGRAM_MODEL=nova-3

# AssemblyAI
ASSEMBLYAI_API_KEY=
ASSEMBLYAI_MODEL=universal

# OpenAI Whisper / GPT-4o Transcribe (uses OPENAI_API_KEY)
OPENAI_ASR_MODEL=gpt-4o-transcribe

# Google Cloud Speech
GOOGLE_CLOUD_CREDENTIALS= # Path to service account JSON
GOOGLE_ASR_MODEL=chirp-3

# Azure Speech
AZURE_SPEECH_KEY=
AZURE_SPEECH_REGION=eastus
AZURE_ASR_MODEL=whisper

# Amazon Transcribe
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=us-east-1
AWS_ASR_MODEL=standard
AWS_TRANSCRIBE_BUCKET= # S3 bucket for intermediate output

# Speechmatics
SPEECHMATICS_API_KEY=
SPEECHMATICS_MODEL=standard

# Gladia
GLADIA_API_KEY=
GLADIA_MODEL=standard

# Cloud ASR Options
CLOUD_ASR_EXTRACT_EMBEDDINGS=true # Extract speaker embeddings locally for cross-file matching
CLOUD_ASR_WORKER_CONCURRENCY=4 # Concurrency for cloud-asr worker

Deployment Mode

DEPLOYMENT_MODE=full  # full (local GPU + optional cloud) or lite (cloud-only, no GPU, ~2GB image)
BACKEND_LITE_IMAGE=davidamacey/opentranscribe-backend-lite:latest

LLM Integration

LLM_PROVIDER=  # vllm, openai, anthropic, ollama, openrouter
VLLM_BASE_URL=http://localhost:8012/v1
VLLM_MODEL_NAME=mistralai/Mistral-7B-Instruct-v0.2
VLLM_API_KEY=
OPENAI_API_KEY=
OPENAI_MODEL_NAME=gpt-4o-mini
OPENAI_BASE_URL=https://api.openai.com/v1
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL_NAME=claude-3-haiku-20240307
ANTHROPIC_BASE_URL=https://api.anthropic.com
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL_NAME=llama2:7b-chat
OPENROUTER_API_KEY=
OPENROUTER_MODEL_NAME=anthropic/claude-3-haiku
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1

GPU Concurrent Processing

# GPU Concurrent Model Sharing (multiple Celery threads share one model copy)
GPU_CONCURRENT_REQUESTS=1 # auto calculates from VRAM: (total - 9GB) / 2GB, max 4
GPU_WORKER_POOL=threads # Default: threads. Use "prefork" only for legacy single-threaded setups

# Model preloading gate: set to true only on GPU workers to prevent CPU workers
# from initializing CUDA contexts and causing memory leaks (default: false)
PRELOAD_GPU_MODELS=false # Set true in GPU worker compose service only

# GPU Worker Max Tasks (restart after N tasks for memory safety)
GPU_MAX_TASKS=100000 # Default: effectively never restart
GPU_DEFAULT_BATCH_SIZE=12 # Batch size for default GPU worker (auto-detected if unset)

# VRAM Profiling (temporary diagnostic tool)
ENABLE_VRAM_PROFILING=false # Captures per-step GPU memory usage and timing data

Multi-GPU Scaling

GPU_SCALE_ENABLED=false
GPU_SCALE_DEVICE_ID=2
GPU_SCALE_WORKERS=4
GPU_SCALE_DEFAULT_WORKER=1 # Scale default worker (0 to disable)
GPU_SCALE_MAX_TASKS=500 # Restart scaled worker after N tasks (memory safety)

Worker Concurrency Tuning

# Download worker: parallel video/URL downloads
DOWNLOAD_CONCURRENCY=3 # Default: 3
DOWNLOAD_MAX_TASKS=10 # Restart after N tasks

# NLP worker: LLM summarization, speaker ID
NLP_CONCURRENCY=4 # Default: 4
NLP_MAX_TASKS=50 # Restart after N tasks

# Cloud ASR worker
CLOUD_ASR_WORKER_CONCURRENCY=4

Flower Monitoring Dashboard

FLOWER_USER=admin
FLOWER_PASSWORD=auto_generated_on_install
FLOWER_URL_PREFIX=flower # URL prefix (must match nginx proxy_pass path)

Flower provides industry-standard Celery task monitoring with persistent task history, queue visibility, and worker status. Access at http://localhost:5175/flower (or via NGINX at /flower/).

Storage Encryption

# MinIO Server-Side Encryption at Rest (AES-256-GCM)
# Generate with: echo "opentranscribe-key:$(openssl rand -base64 32)"
MINIO_KMS_SECRET_KEY=auto_generated_on_install
MINIO_KMS_AUTO_ENCRYPTION=on # Set to 'on' to enable

# API Key Encryption (for LLM keys stored in database)
ENCRYPTION_KEY=auto_generated_on_install # NEVER change after first use

Database

POSTGRES_HOST=postgres
POSTGRES_PORT=5176
POSTGRES_USER=postgres
POSTGRES_PASSWORD=auto_generated_on_install
POSTGRES_DB=opentranscribe
POSTGRES_SSLMODE=prefer # disable/allow/prefer/require/verify-ca/verify-full

Database initialization is handled entirely by Alembic migrations on backend startup. No external SQL init file is needed.

Ports

FRONTEND_PORT=5173
BACKEND_PORT=5174
FLOWER_PORT=5175
POSTGRES_PORT=5176
REDIS_PORT=5177
MINIO_PORT=5178
MINIO_CONSOLE_PORT=5179
OPENSEARCH_PORT=5180
OPENSEARCH_ADMIN_PORT=5181
DOCS_PORT=5183

HTTPS/SSL Configuration

Enable HTTPS with NGINX reverse proxy for secure access and browser microphone recording from network devices.

# Set hostname to enable HTTPS (triggers NGINX reverse proxy)
NGINX_SERVER_NAME=opentranscribe.local

# Optional: Custom ports (defaults shown)
NGINX_HTTP_PORT=80
NGINX_HTTPS_PORT=443

# Optional: Custom certificate paths (defaults shown)
NGINX_CERT_FILE=./nginx/ssl/server.crt
NGINX_CERT_KEY=./nginx/ssl/server.key

Quick setup: Run ./opentranscribe.sh setup-ssl to configure interactively.

See NGINX Setup Guide for full documentation.

Content Security Policy

OpenTranscribe's production NGINX configuration includes a Content Security Policy header to mitigate cross-site scripting (XSS) and other injection attacks (#124). The CSP restricts script sources, style sources, connection targets, and frame ancestors. Key directives include:

  • default-src 'self' -- baseline restriction to same-origin resources
  • script-src 'self' 'unsafe-inline' -- inline scripts required by Svelte hydration (nonce-based CSP is a planned improvement)
  • connect-src 'self' ws: wss: -- allows WebSocket connections for real-time updates
  • frame-ancestors 'self' -- prevents clickjacking
  • object-src 'none', base-uri 'self', form-action 'self' -- defense-in-depth directives

CSP is enforced in production via frontend/nginx.conf. Development mode (Vite dev server) does not apply CSP headers.

File Retention

OpenTranscribe supports admin-configurable automatic file retention (#134). Admins can set a retention period (delete files older than N days) to support GDPR compliance and storage management. File deletion is audit-logged and controlled exclusively by super admins via Settings → Admin → File Retention.

URL Download Quality Settings

URL downloads (YouTube, TikTok, and 1800+ platforms via yt-dlp) support configurable quality settings (#122):

# These are user-level settings stored in the database, configurable via Settings UI.
# Per-download overrides are also available in the URL upload tab.
# Default: "best" for both video and audio (current behavior).

Quality options include video resolution selection (best, 4K, 1440p, 1080p, 720p, 480p, 360p), audio-only mode for podcasts, and audio bitrate selection. The yt-dlp format string builder uses a fallback chain: if the requested quality is unavailable, it automatically downloads the next best option. This is designed for bandwidth-conscious users and storage optimization.

Authentication Configuration

OpenTranscribe uses a database-driven authentication system with support for multiple simultaneous auth methods (hybrid authentication). See Authentication Overview for detailed configuration.

Configuration Sources

Authentication is configured via Super Admin UI (Settings → Authentication) and stored in the database with AES-256-GCM encryption:

Configuration LevelSourcePriorityNotes
PrimaryDatabase (auth_config table)✅ PrimarySet via Super Admin UI
Legacy/OverrideEnvironment variablesSecondaryENV vars override DB for legacy support

Multi-Method Authentication

Multiple authentication methods can be enabled simultaneously. Users can authenticate via any enabled method:

# Authentication Type (Indicator - set via Super Admin UI)
AUTH_TYPE=local,ldap,keycloak # Enabled methods (informational, read-only)

LDAP/Active Directory Configuration

# LDAP/Active Directory (configured via Super Admin UI)
# These ENV variables are for legacy/development use only
LDAP_SERVER=ldaps://your-ad-server.domain.com
LDAP_PORT=636
LDAP_USE_SSL=true
LDAP_BIND_DN=CN=service-account,CN=Users,DC=domain,DC=com
LDAP_BIND_PASSWORD=your-service-account-password
LDAP_SEARCH_BASE=DC=domain,DC=com
LDAP_USERNAME_ATTR=sAMAccountName

Keycloak/OIDC Configuration

# Keycloak/OIDC (configured via Super Admin UI)
# These ENV variables are for legacy/development use only
KEYCLOAK_SERVER_URL=https://keycloak.yourdomain.com
KEYCLOAK_REALM=opentranscribe
KEYCLOAK_CLIENT_ID=opentranscribe-app
KEYCLOAK_CLIENT_SECRET=your-client-secret
KEYCLOAK_ADMIN_ROLE=admin

PKI/X.509 Certificate Configuration

# PKI/X.509 Certificates (configured via Super Admin UI)
# These ENV variables are for legacy/development use only
PKI_CA_CERT_PATH=/path/to/ca.crt
PKI_ADMIN_DNS=CN=Admin User,O=Company,C=US

Security Features

# Password Policy (FedRAMP IA-5)
PASSWORD_POLICY_ENABLED=true
PASSWORD_MIN_LENGTH=12
PASSWORD_REQUIRE_UPPERCASE=true
PASSWORD_REQUIRE_LOWERCASE=true
PASSWORD_REQUIRE_DIGIT=true
PASSWORD_REQUIRE_SPECIAL=true
PASSWORD_HISTORY_COUNT=24
PASSWORD_MAX_AGE_DAYS=60

# Account Lockout (NIST AC-7)
ACCOUNT_LOCKOUT_ENABLED=true
ACCOUNT_LOCKOUT_THRESHOLD=5
ACCOUNT_LOCKOUT_DURATION_MINUTES=15
ACCOUNT_LOCKOUT_PROGRESSIVE=true
ACCOUNT_LOCKOUT_MAX_DURATION_MINUTES=1440

# Multi-Factor Authentication
MFA_ENABLED=true
MFA_ISSUER_NAME=OpenTranscribe
MFA_BACKUP_CODE_COUNT=10

# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_AUTH_PER_MINUTE=10
RATE_LIMIT_API_PER_MINUTE=100

# Audit Logging (FedRAMP AU-2/AU-3)
AUDIT_LOG_ENABLED=true
AUDIT_LOG_FORMAT=json # or: cef
AUDIT_LOG_TO_OPENSEARCH=false

# Login Banner
LOGIN_BANNER_ENABLED=false
LOGIN_BANNER_TITLE=Security Notice
LOGIN_BANNER_TEXT=This is a restricted system...

Next Steps