Chat, RAG & Flows
Streamdiver turns every media asset into a knowledge base. The Chat API lets you ask natural-language questions about videos, audio files, and documents -- powered by RAG, which grounds answers in your actual content. The Flows API extends this with parameterized, agentic automation workflows.
Retrieval-Augmented Generation (RAG) combines search with AI text generation. Instead of relying solely on a language model's training data, RAG first retrieves relevant passages from your media -- transcripts, document pages, speaker segments -- and then generates an answer grounded in that specific content. This means answers include source references and stay faithful to what was actually said or written.
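The retrieve-then-generate pattern can be sketched in a few lines. This is a deliberately naive illustration -- it ranks passages by keyword overlap as a stand-in for the semantic retrieval Streamdiver actually performs over transcripts and documents -- but it shows how retrieved context is prepended so the model answers from your content rather than its training data:

```python
import re

# Toy illustration of the retrieve-then-generate pattern behind RAG.
# Real pipelines use semantic (vector) search; keyword overlap here is
# purely a stand-in to show the shape of the flow.

def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by word overlap with the question (stand-in for vector search)."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = sorted(
        passages,
        key=lambda p: -len(q_words & set(re.findall(r"\w+", p.lower()))),
    )
    return scored[:top_k]

def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Prepend retrieved context so answers stay faithful to the content."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

passages = [
    "We achieved a 15% cost reduction this quarter.",
    "The new office opens in March.",
    "Cost reduction was driven by cloud migration.",
]
prompt = build_grounded_prompt("What drove the cost reduction?", passages)
```

Only the two cost-related passages make it into the prompt; the irrelevant one is dropped, which is exactly why grounded answers come with source references.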
Ask a Question
The Chat endpoint POST /chats accepts a prompt and optional filters. Without filters it searches your entire library; with filters you can scope the query to specific assets, channels, speakers, or asset types.
Ask About a Specific Asset
To ask a question about a single video, audio file, or document, pass its ID in the mediaIds filter:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "What are the key takeaways from this presentation?",
"mediaIds": ["{mediaId}"]
}'
import requests

response = requests.post(
    "https://api.streamdiver.com/v2/chats",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "prompt": "What are the key takeaways from this presentation?",
        "mediaIds": [media_id],
    },
).json()

print(response["data"]["answer"])
for ctx in response["data"]["context"]:
    for part in ctx.get("parts", []):
        print(f"  Source: {part['text']} (at {part['metadata']['startTime']}s)")
const response = await fetch("https://api.streamdiver.com/v2/chats", {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
prompt: "What are the key takeaways from this presentation?",
mediaIds: [mediaId],
}),
}).then((r) => r.json());
console.log(response.data.answer);
response.data.context.forEach((ctx) =>
ctx.parts?.forEach((part) =>
console.log(` Source: ${part.text} (at ${part.metadata?.startTime}s)`)
)
);
The response includes the generated answer and the source context that was used:
{
"data": {
"answer": "The presentation highlights three key takeaways: ...",
"context": [
{
"parts": [
{
"text": "In summary, we achieved a 15% cost reduction...",
"metadata": { "startTime": 124.5 },
"ref": "chunk-abc123"
}
],
"presentation": { "thumbnail": { "url": "..." } }
}
]
}
}
For documents, metadata.page indicates the page number instead of startTime.
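Display code therefore needs to branch on which metadata field is present. A small helper (the name `format_source` is illustrative; the field names follow the response shape shown above) might look like:

```python
def format_source(part: dict) -> str:
    """Render a context part's source reference for display.

    Video/audio parts carry metadata.startTime (seconds); document parts
    carry metadata.page instead.
    """
    meta = part.get("metadata", {})
    if "startTime" in meta:
        return f"{part['text']} (at {meta['startTime']}s)"
    if "page" in meta:
        return f"{part['text']} (page {meta['page']})"
    return part["text"]

print(format_source({"text": "In summary...", "metadata": {"startTime": 124.5}}))
# -> In summary... (at 124.5s)
print(format_source({"text": "Section 2...", "metadata": {"page": 3}}))
# -> Section 2... (page 3)
```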
Ask Across Your Entire Library
Omit the filters to query all media assets at once:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "What has been said about sustainability across all recordings?"
}'
import requests

response = requests.post(
    "https://api.streamdiver.com/v2/chats",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "prompt": "What has been said about sustainability across all recordings?"
    },
).json()

print(response["data"]["answer"])
for ctx in response["data"]["context"]:
    print(f"  Asset: {ctx['presentation']['thumbnail']['url']}")
const response = await fetch("https://api.streamdiver.com/v2/chats", {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
prompt: "What has been said about sustainability across all recordings?",
}),
}).then((r) => r.json());
Filters
Narrow the scope with filters -- combine as needed:
{
"prompt": "What decisions were made?",
"channelIds": ["{channelId}"],
"mediaIds": ["{mediaId1}", "{mediaId2}"],
"speakers": ["Dr. Smith"],
"tags": ["board-meeting"],
"types": ["video", "audio"]
}
| Filter | Description |
|---|---|
| channelIds | Restrict to specific channels |
| mediaIds | Restrict to specific assets (use a single ID to target one asset) |
| speakers | Only consider content from named speakers |
| tags | Filter by asset tags |
| types | Filter by asset type: video, audio, document |
Inference Modes
Control the speed/quality tradeoff with the mode parameter:
| Mode | Description |
|---|---|
| auto | Automatically selects the best mode (default) |
| turbo | Faster responses, may reduce precision due to limited context |
| smart | Deeper analysis, considers more context for complex questions |
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "Summarize the budget discussion",
"mediaIds": ["{mediaId}"],
"mode": "smart"
}'
Use GET /tenants/current/speakers to list all recognized speakers across your library, including their IDs and labels. You can then pass speaker names in the speakers filter to ask what specific people said.
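Putting the two endpoints together, you can resolve speaker labels from the speakers listing and feed them into a chat request body. The response shape below (`data` items with `id` and `label` fields) is an assumption based on the "IDs and labels" description, and `build_speaker_query` is an illustrative helper:

```python
def build_speaker_query(prompt: str, speakers_response: dict, wanted: set[str]) -> dict:
    """Build a POST /chats body scoped to speakers whose labels match `wanted`.

    Assumes the speakers endpoint returns {"data": [{"id": ..., "label": ...}]};
    only the labels are passed to the `speakers` filter.
    """
    labels = [s["label"] for s in speakers_response["data"] if s["label"] in wanted]
    return {"prompt": prompt, "speakers": labels}

body = build_speaker_query(
    "What did the CFO say about the budget?",
    {"data": [{"id": "sp-1", "label": "Dr. Smith"}, {"id": "sp-2", "label": "Jane Doe"}]},
    {"Dr. Smith"},
)
# body -> {"prompt": "What did the CFO say about the budget?", "speakers": ["Dr. Smith"]}
```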
Conversation History
Maintain multi-turn conversations by passing previous messages in messageHistory:
{
"prompt": "Can you elaborate on the third point?",
"messageHistory": [
{ "role": "user", "content": "What are the key takeaways?" },
{ "role": "assistant", "content": "The presentation highlights three key takeaways: ..." }
]
}
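On the client side, this means appending each completed user/assistant exchange before sending the next prompt. A minimal sketch (the `Conversation` class is illustrative, not part of any SDK):

```python
class Conversation:
    """Accumulate turns and build POST /chats bodies with messageHistory."""

    def __init__(self):
        self.history = []

    def request_body(self, prompt: str, **filters) -> dict:
        """Build the next request body, including history from earlier turns."""
        body = {"prompt": prompt, **filters}
        if self.history:
            body["messageHistory"] = list(self.history)
        return body

    def record(self, prompt: str, answer: str) -> None:
        """Append a completed user/assistant exchange to the history."""
        self.history.append({"role": "user", "content": prompt})
        self.history.append({"role": "assistant", "content": answer})

chat = Conversation()
first = chat.request_body("What are the key takeaways?")          # no history yet
chat.record("What are the key takeaways?", "Three key takeaways: ...")
followup = chat.request_body("Can you elaborate on the third point?")
```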
Streaming Responses (SSE)
For real-time UIs, use the streaming endpoint that returns Server-Sent Events:
curl --request POST \
--url https://api.streamdiver.com/v2/chats/stream \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: text/event-stream" \
--data '{ "prompt": "Summarize the latest board meeting" }'
Each SSE event contains a token chunk. The final event signals completion:
data: {"Token": "The board", "isComplete": false}
data: {"Token": " discussed", "isComplete": false}
data: {"Token": "", "isComplete": true}
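A client can reassemble the answer by concatenating `Token` values until `isComplete` arrives. A sketch of that loop, using the event field names shown above (in a real client the lines would come from an SSE connection rather than a list):

```python
import json

def accumulate_tokens(sse_lines) -> str:
    """Join token chunks from a chat SSE stream until isComplete is seen."""
    answer = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        event = json.loads(line[len("data: "):])
        if event.get("isComplete"):
            break
        answer.append(event["Token"])
    return "".join(answer)

stream = [
    'data: {"Token": "The board", "isComplete": false}',
    'data: {"Token": " discussed", "isComplete": false}',
    'data: {"Token": "", "isComplete": true}',
]
print(accumulate_tokens(stream))  # -> The board discussed
```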
Flows: Agentic Automation
Flows are parameterized AI workflows that go beyond simple Q&A. They enable automated content analysis (insights) and content generation (creation) -- from summarization to structured data extraction.
Discover Available Flows
List all flows available for your tenant, optionally filtered by category:
# List all insight flows
curl --request GET \
--url "https://api.streamdiver.com/v2/flows?category=insights&lang=en" \
--header "Authorization: Bearer {token}"
The response includes each flow's name, description, supported media types, and estimated duration:
{
"data": [
{
"id": "flow-uuid",
"name": "Meeting Summary",
"description": "Generates a structured summary of a meeting recording.",
"category": "insights",
"mediaTypes": ["video", "audio"],
"estimatedDuration": "medium"
}
]
}
| Category | Purpose |
|---|---|
| insights | Analyze existing content (summaries, key topics, compliance checks) |
| creation | Generate new content (social media posts, newsletters, reports) |
Inspect a Flow's Schema
Each flow has a dynamic input schema. Inspect it before execution to see what parameters are required:
curl --request GET \
--url "https://api.streamdiver.com/v2/flows/{flowId}/schema?lang=en" \
--header "Authorization: Bearer {token}"
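Since the schema is dynamic, it's worth validating parameters client-side before starting a run. The sketch below assumes a JSON-Schema-like shape with a top-level `required` list -- the actual format returned by the endpoint may differ, so treat this as illustrative:

```python
def missing_params(schema: dict, params: dict) -> list[str]:
    """Return required parameter names absent from `params`.

    Assumes a JSON-Schema-like shape with a top-level "required" list.
    """
    return [name for name in schema.get("required", []) if name not in params]

# Hypothetical schema for a summary flow
schema = {
    "required": ["language", "format"],
    "properties": {"language": {"type": "string"}, "format": {"type": "string"}},
}
missing = missing_params(schema, {"language": "en"})  # -> ["format"]
```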
Execute a Flow
Run a flow against a media asset. The response is streamed via SSE:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/flows/{flowId}/runs \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: text/event-stream" \
--data '{
"mediaId": "{mediaId}",
"llmProvider": "streamdiver/nitrox",
"params": {
"language": "en",
"format": "bullet_points"
}
}'
import requests

response = requests.post(
    f"https://api.streamdiver.com/v2/flows/{flow_id}/runs",
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "text/event-stream",
    },
    json={
        "mediaId": media_id,
        "llmProvider": "streamdiver/nitrox",
        "params": {"language": "en", "format": "bullet_points"},
    },
    stream=True,
)

for line in response.iter_lines():
    if line:
        print(line.decode())
const response = await fetch(
`https://api.streamdiver.com/v2/flows/${flowId}/runs`,
{
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
Accept: "text/event-stream",
},
body: JSON.stringify({
mediaId,
llmProvider: "streamdiver/nitrox",
params: { language: "en", format: "bullet_points" },
}),
}
);
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value));
}
The stream emits events of different types: message (content tokens), search (retrieval progress), progress (status updates), and error.
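A client typically dispatches each payload to a handler based on its type. The sketch below assumes each SSE data payload is JSON with a `type` field naming the event kind -- the real event envelope is not shown in this section, so both the field name and the handler map are illustrative:

```python
import json

def handle_event(raw: str, handlers: dict) -> None:
    """Dispatch a flow-run SSE data payload to a handler by event type.

    Assumes each payload is JSON with a "type" field (message, search,
    progress, or error); unknown types are ignored.
    """
    event = json.loads(raw)
    handler = handlers.get(event.get("type"), lambda e: None)
    handler(event)

tokens = []
handle_event(
    '{"type": "message", "content": "Summary: ..."}',
    {
        "message": lambda e: tokens.append(e["content"]),
        "error": lambda e: print("flow failed:", e),
    },
)
```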
Further Resources
- Interactive API Reference -- Chat -- all chat endpoints and parameters
- Interactive API Reference -- Flows -- flow discovery and execution
- Transcripts & Subtitles -- manage the transcripts that power Chat
- Metadata -- explore entities, keywords, and speakers used as context
- Search -- keyword and semantic search across your library
Related Use Cases
- AI Media Analysis -- metadata extraction, semantic search, and automated AI workflows
- Media Monitoring -- RAG-based Q&A for media archives
- Corporate Communications -- AI summaries and content repurposing via Chat and Flows