Prompt Engineering

OpenTranscribe uses Large Language Models for summarization, speaker identification, topic extraction, and auto-labeling. The quality of these AI outputs depends heavily on the prompts that guide the model. This guide covers how prompts work in OpenTranscribe and how to customize them for your use case.

How OpenTranscribe Uses Prompts

OpenTranscribe separates AI instructions into two layers:

System prompts define the model's role, expertise, and output standards. These set the overall behavior and are configured once per feature.
User prompts contain the actual transcript content, speaker data, and task-specific instructions. These are assembled automatically for each request.

Where Prompts Are Used

Feature	What the Prompt Does
Summarization	Generates BLUF summaries with action items, decisions, and speaker analysis
Speaker Identification	Analyzes conversation context to suggest speaker names
Topic Extraction	Identifies major topics and themes from transcript content
Auto-Labeling	Suggests tags and categories for uploaded media

The system prompt and default summarization prompt are stored in the database and can be edited through Settings > AI > Prompts. Per-collection prompts override the default for files within that collection.

BLUF Summary Format

OpenTranscribe defaults to BLUF (Bottom Line Up Front) format, a military and executive communication standard that leads with the conclusion.

Default Output Structure

Every summary includes:

BLUF (2-3 sentences) -- the key takeaway, decision, or recommendation
Brief Summary -- supporting context and details
Speaker Analysis -- who spoke, for how long, and their key contributions
Action Items -- tasks with owners, deadlines, and priorities
Key Decisions -- what was decided and why
Major Topics -- themes discussed with key points

What Makes a Good BLUF

A strong BLUF is specific, outcome-oriented, and self-contained:

GOOD: "Engineering team approved Q4 roadmap with two feature deferrals
to address $50K budget overrun. Marketing budget preserved at current
levels. Sarah Chen updating roadmap by Friday."

GOOD: "Product launch delayed 2 weeks due to critical security
vulnerability discovered in testing. Security team implementing fix
with high priority. New launch date: March 15."

Avoid vague or passive summaries:

BAD: "This meeting discussed various topics including budget..."
BAD: "The team had a productive discussion..."
BAD: "We talked about Q4 and everyone shared thoughts..."

Customizing the Format

To change the default summary format, edit the summarization prompt in Settings > AI > Prompts. You can modify the output schema, add or remove sections, or change the tone. The system validates that the response is valid JSON matching the expected structure.

Writing Custom Prompts

Use XML Tags for Structure

XML tags prevent the model from mixing up prompt sections and improve accuracy. OpenTranscribe uses this pattern internally:

<transcript>
{actual_transcript_content}
</transcript>

<speaker_data>
{speaker_statistics_json}
</speaker_data>

<task_instructions>
Analyze the transcript above and generate a BLUF summary.
</task_instructions>

<output_format>
{
  "bluf": "...",
  "brief_summary": "...",
  "action_items": [...]
}
</output_format>

Common tags: <instructions>, <context>, <examples>, <transcript>, <output_format>, <constraints>.

Include Few-Shot Examples

Providing 2-5 concrete examples is more effective than lengthy instructions. Wrap examples in <examples> tags:

<examples>
<example>
<input_transcript>
John Smith [00:00]: We need to finalize the Q4 budget.
Sarah Chen [00:15]: Engineering is over budget by $50K.
John Smith [00:30]: Can we reallocate from marketing?
Mike Johnson [00:45]: Marketing budget is tight. Defer two features instead.
Sarah Chen [01:00]: Agreed. I'll update the roadmap by Friday.
</input_transcript>

<output>
{
  "bluf": "Q4 budget requires $50K reduction; team agreed to defer
           two feature releases rather than cut marketing",
  "action_items": [
    {
      "item": "Update Q4 roadmap to reflect deferred features",
      "owner": "Sarah Chen",
      "due_date": "Friday",
      "priority": "high"
    }
  ]
}
</output>
</example>
</examples>

Include diverse examples that cover different meeting types, edge cases, and output structures.

Use Chain-of-Thought for Complex Analysis

For tasks requiring deep reasoning, instruct the model to think step-by-step before answering:

<instructions>
Analyze this transcript and generate a structured summary.

ANALYSIS PROCESS:
1. First, in <thinking> tags:
   - Identify main topics and themes
   - Note speaker roles and contributions
   - Extract key decisions with context
   - Identify action items with assignees

2. Then, in <answer> tags:
   - Provide the structured JSON summary
</instructions>

Chain-of-thought improves accuracy significantly but increases token usage and latency. It is recommended for summarization (high-value output) but not for simple extraction tasks.

Organization Context

Organization context injects domain-specific knowledge into all AI prompts, helping the model correctly interpret jargon, acronyms, and references specific to your organization.

Configure in Settings > AI > Organization Context.

Writing Effective Context

Good organization context is concise and focused on information the model would not otherwise know:

GOOD:
"Acme Corp is a fintech company. 'The Board' refers to the Board of
Directors. Q3 = July-September fiscal quarter. Project Phoenix is our
mobile app rewrite. Engineering teams: Platform (backend), Frontier
(frontend), Shield (security). CTO = Jane Smith, VP Eng = Bob Lee."

BAD:
"We are a company that makes software and has meetings sometimes."

Focus on:

Team and project names that appear in meetings
Acronyms and abbreviations unique to your organization
Key personnel and their roles
Domain terminology that might be ambiguous

The context is injected as a preamble before the transcript in all LLM calls. It applies to both summarization and speaker identification.

Per-Collection Prompts

Collections can have their own default summarization prompt, overriding the global default for all files within that collection. This is useful when different types of content need different analysis approaches.

Use Cases

Collection	Prompt Focus
Legal Depositions	Witness statements, objections, exhibits referenced, legal terminology
Team Standups	Blockers, progress updates, commitments, brief format
Customer Interviews	Pain points, feature requests, sentiment, competitive mentions
Medical Dictation	Clinical findings, diagnoses, treatment plans, medication changes
Board Meetings	Motions, votes, resolutions, fiduciary topics

Setting a Collection Prompt

Navigate to the collection
Open collection settings
Enter a custom summarization prompt under AI Prompt
Files added to this collection will use the collection prompt instead of the global default

If a file belongs to multiple collections, the most specific prompt takes precedence.

Speaker Identification Prompts

Speaker identification uses multi-signal analysis to suggest names for unlabeled speakers. The model looks for:

Signal	Confidence	Example
Direct name mention	0.95	"Thanks John for that insight"
Role reference	0.85	"As the CEO, I think..."
Expertise demonstration	0.70	Deep knowledge in a specific domain
Speech patterns	0.70	Technical jargon, vocabulary complexity
Conversational dynamics	0.60	Who asks vs. answers questions, deference patterns

Improving Speaker ID Accuracy

Organization context significantly improves speaker identification. If the model knows the meeting participants, their roles, and their areas of responsibility, it can match speakers with much higher confidence.

Example context for speaker ID:

Meeting participants typically include:
- Dr. Sarah Chen (Chief Medical Officer) - discusses clinical trials
- James Park (Head of Regulatory) - references FDA submissions
- Maria Lopez (VP of Marketing) - discusses launch timelines

Speaker identification suggestions are never auto-applied. They appear as recommendations in the speaker panel for manual verification.

Temperature and Model Settings

Temperature

Temperature controls randomness in outputs. Lower values produce more deterministic, consistent results.

Task	Recommended Temperature	Rationale
Summarization	0.1	Maximum consistency and factual accuracy
Speaker Identification	0.2	Consistent analytical reasoning
Data Extraction	0.1	Deterministic, structured outputs
Creative Writing	0.7-1.0	Allow variety and expressiveness

OpenTranscribe defaults to 0.1 for summarization and 0.2 for speaker identification. Adjust only if you need more varied output (and accept reduced consistency).

caution

Alter either temperature or top-p, not both simultaneously. Changing both creates unpredictable behavior.

Max Tokens

Max tokens controls the maximum length of the model's response. For most transcripts, the default is sufficient. Increase it for very long transcripts (2+ hours) that may produce lengthy summaries.

Provider-Specific Tips

vLLM (Self-Hosted)

Best for privacy-sensitive deployments where data cannot leave your network
Supports response prefilling (forcing JSON output format)
Configure the endpoint as http://<host>:<port>/v1
See the Security Hardening guide if containers cannot reach your vLLM server

OpenAI

GPT-4o recommended for best quality-to-cost ratio
Supports function/tool calling for guaranteed JSON schema compliance
API key required; set in Settings > AI > LLM Provider

Anthropic (Claude)

Claude 3.5 Sonnet or Claude Opus 4 recommended
Excellent at following XML-structured prompts
Strong performance on long transcripts (200K+ token context)
Supports response prefilling for reliable JSON output

Ollama (Local)

Good for development and testing
Model quality varies significantly -- use 7B+ parameter models for summarization
Runs on CPU if no GPU available (slower but functional)
Configure endpoint as http://<host>:11434

OpenRouter

Access to multiple model providers through one API
Useful for comparing model performance
Pay-per-token pricing across providers

Example Prompts

Meeting Summary (Default)

You are a senior executive assistant with 10+ years experience
analyzing business meetings and creating actionable summaries.

YOUR EXPERTISE:
- Identifying critical business decisions and their implications
- Extracting actionable items with clear ownership
- Recognizing speaker roles and contribution patterns
- Distilling complex discussions into executive-level BLUF format

OUTPUT STANDARD:
- Follow BLUF (Bottom Line Up Front) format
- Ensure all JSON fields are properly formatted
- Provide specific evidence for all claims
- Flag low-confidence items explicitly

BLUF FORMAT REQUIREMENTS:
- First sentence: What happened / what was decided
- Second sentence: Why it matters / what's the impact
- Optional third sentence: Next critical action
- Total length: 2-3 sentences maximum
- Must be understandable without reading rest of summary

Legal Deposition

You are a legal analyst specializing in deposition transcript review.

Focus on:
- Witness statements of fact vs. opinion
- Objections raised and their basis
- Exhibits referenced with identification numbers
- Inconsistencies or contradictions in testimony
- Key admissions or denials

Format action items as follow-up questions or investigation tasks.
Use legal terminology appropriately. Flag any testimony that
contradicts previously established facts.

Medical Dictation

You are a medical transcription analyst with clinical terminology
expertise.

Focus on:
- Chief complaint and history of present illness
- Physical examination findings
- Assessment and diagnosis (using standard medical terminology)
- Treatment plan including medications with dosages
- Follow-up instructions and referrals

IMPORTANT: Flag any medication interactions or contraindications
mentioned. Use ICD-10 codes where diagnosis is clearly stated.
Do not infer diagnoses not explicitly stated by the provider.

Podcast Episode

You are a content analyst for podcast production.

Focus on:
- Key topics discussed with timestamps
- Notable quotes from guests
- Audience takeaways and actionable advice
- References to external resources (books, tools, websites)
- Potential show notes and episode description

Format the BLUF as a one-paragraph episode summary suitable for
podcast directories. Keep action items focused on content the
production team needs to follow up on (fact-checking, links, etc.).

Troubleshooting

Summaries Are Too Verbose

Lower the temperature to 0.1
Add explicit length constraints: "BLUF must be 2-3 sentences maximum"
Add a negative instruction: "Do NOT include preamble, introduction, or meta-commentary"
Provide a concrete example showing the desired length

Summaries Miss Important Details

Check if the transcript is being chunked (long transcripts are processed in sections)
Add "Quote relevant evidence before summarizing" to the prompt
Increase max tokens if the response is being truncated
Use chain-of-thought to force the model to extract key information first

JSON Parsing Errors

OpenTranscribe uses response prefilling to force JSON output. If you still see parsing errors:

Check backend logs: docker logs opentranscribe-backend | grep "Failed to parse"
Verify the prompt does not contain instructions that conflict with JSON output
Ensure few-shot examples use valid JSON

Speaker Identification Confidence Is Low

Add organization context with participant names and roles
Ensure the transcript has enough content per speaker (very short utterances provide little signal)
Check that the transcript includes natural name mentions or role references
Lower confidence thresholds are expected for speakers who are only briefly mentioned

Hallucinated Information in Summaries

Lower the temperature to 0.1 (minimum randomness)
Add explicit constraints: "Only include information explicitly stated in the transcript. Do not infer or assume."
Add edge case handling: "If information is unclear, say 'unclear from transcript' rather than guessing"
Use chain-of-thought to force evidence extraction before conclusions

LLM Not Generating Summaries

If summaries never appear after triggering them:

Verify LLM provider is configured in Settings > AI > LLM Provider
Test the connection using the Test Connection button
Check celery worker logs: ./opentr.sh logs celery-nlp-worker
If using a self-hosted model (vLLM/Ollama), verify network connectivity from Docker containers -- see the LLM Integration docs for firewall configuration

How OpenTranscribe Uses Prompts​

Where Prompts Are Used​

BLUF Summary Format​

Default Output Structure​

What Makes a Good BLUF​

Customizing the Format​

Writing Custom Prompts​

Use XML Tags for Structure​

Include Few-Shot Examples​

Use Chain-of-Thought for Complex Analysis​

Organization Context​

Writing Effective Context​

Per-Collection Prompts​

Use Cases​

Setting a Collection Prompt​

Speaker Identification Prompts​

Improving Speaker ID Accuracy​

Temperature and Model Settings​

Temperature​

Max Tokens​

Provider-Specific Tips​

vLLM (Self-Hosted)​

OpenAI​

Anthropic (Claude)​

Ollama (Local)​

OpenRouter​

Example Prompts​

Meeting Summary (Default)​

Legal Deposition​

Medical Dictation​

Podcast Episode​

Troubleshooting​

Summaries Are Too Verbose​

Summaries Miss Important Details​

JSON Parsing Errors​

Speaker Identification Confidence Is Low​

Hallucinated Information in Summaries​

LLM Not Generating Summaries​