Chat, RAG & Flows
Streamdiver turns every media asset into a knowledge base. The Chat API lets you ask natural-language questions about videos, audio files, and documents -- powered by RAG, which grounds answers in your actual content. The Flows API extends this with parameterized, agentic automation workflows.
Retrieval-Augmented Generation (RAG) combines search with AI text generation. Instead of relying solely on a language model's training data, RAG first retrieves relevant passages from your media -- transcripts, document pages, speaker segments -- and then generates an answer grounded in that specific content. This means answers include source references and stay faithful to what was actually said or written.
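The retrieve-then-generate pattern can be sketched in a few lines. This is a deliberately naive illustration -- it ranks passages by keyword overlap as a stand-in for the semantic retrieval Streamdiver actually performs over transcripts and documents -- but it shows how retrieved context is prepended so the model answers from your content rather than its training data:

```python
import re

# Toy illustration of the retrieve-then-generate pattern behind RAG.
# Real pipelines use semantic (vector) search; keyword overlap here is
# purely a stand-in to show the shape of the flow.

def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by word overlap with the question (stand-in for vector search)."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = sorted(
        passages,
        key=lambda p: -len(q_words & set(re.findall(r"\w+", p.lower()))),
    )
    return scored[:top_k]

def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Prepend retrieved context so answers stay faithful to the content."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

passages = [
    "We achieved a 15% cost reduction this quarter.",
    "The new office opens in March.",
    "Cost reduction was driven by cloud migration.",
]
prompt = build_grounded_prompt("What drove the cost reduction?", passages)
```

Only the two cost-related passages make it into the prompt; the irrelevant one is dropped, which is exactly why grounded answers come with source references.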
Ask a Question
The Chat endpoint POST /chats accepts a prompt and optional filters. Without filters it searches your entire library; with filters you can scope the query to specific assets, channels, speakers, or asset types.
Ask About a Specific Asset
To ask a question about a single video, audio file, or document, pass its ID in the mediaIds filter:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "What are the key takeaways from this presentation?",
"mediaIds": ["{mediaId}"]
}'
import requests

response = requests.post(
    "https://api.streamdiver.com/v2/chats",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "prompt": "What are the key takeaways from this presentation?",
        "mediaIds": [media_id],
    },
).json()

print(response["data"]["answer"])
for ctx in response["data"]["context"]:
    for part in ctx.get("parts", []):
        print(f"  Source: {part['text']} (at {part['metadata']['startTime']}s)")
const response = await fetch("https://api.streamdiver.com/v2/chats", {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
prompt: "What are the key takeaways from this presentation?",
mediaIds: [mediaId],
}),
}).then((r) => r.json());
console.log(response.data.answer);
response.data.context.forEach((ctx) =>
ctx.parts?.forEach((part) =>
console.log(` Source: ${part.text} (at ${part.metadata?.startTime}s)`)
)
);
The response includes the generated answer and the source context that was used:
{
"data": {
"answer": "The presentation highlights three key takeaways: ...",
"context": [
{
"parts": [
{
"text": "In summary, we achieved a 15% cost reduction...",
"metadata": { "startTime": 124.5 },
"ref": "chunk-abc123"
}
],
"presentation": { "thumbnail": { "url": "..." } }
}
]
}
}
For documents, metadata.page indicates the page number instead of startTime.
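Display code therefore needs to branch on which metadata field is present. A small helper (the name `format_source` is illustrative; the field names follow the response shape shown above) might look like:

```python
def format_source(part: dict) -> str:
    """Render a context part's source reference for display.

    Video/audio parts carry metadata.startTime (seconds); document parts
    carry metadata.page instead.
    """
    meta = part.get("metadata", {})
    if "startTime" in meta:
        return f"{part['text']} (at {meta['startTime']}s)"
    if "page" in meta:
        return f"{part['text']} (page {meta['page']})"
    return part["text"]

print(format_source({"text": "In summary...", "metadata": {"startTime": 124.5}}))
# -> In summary... (at 124.5s)
print(format_source({"text": "Section 2...", "metadata": {"page": 3}}))
# -> Section 2... (page 3)
```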
Ask Across Your Entire Library
Omit the filters to query all media assets at once:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "What has been said about sustainability across all recordings?"
}'
import requests

response = requests.post(
    "https://api.streamdiver.com/v2/chats",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "prompt": "What has been said about sustainability across all recordings?"
    },
).json()

print(response["data"]["answer"])
for ctx in response["data"]["context"]:
    print(f"  Asset: {ctx['presentation']['thumbnail']['url']}")
const response = await fetch("https://api.streamdiver.com/v2/chats", {
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
prompt: "What has been said about sustainability across all recordings?",
}),
}).then((r) => r.json());
Filters
Narrow the scope with filters -- combine as needed:
{
"prompt": "What decisions were made?",
"channelIds": ["{channelId}"],
"mediaIds": ["{mediaId1}", "{mediaId2}"],
"speakers": ["Dr. Smith"],
"tags": ["board-meeting"],
"types": ["video", "audio"]
}
| Filter | Description |
|---|---|
| channelIds | Restrict to specific channels |
| mediaIds | Restrict to specific assets (use a single ID to target one asset) |
| speakers | Only consider content from named speakers |
| tags | Filter by asset tags |
| types | Filter by asset type: video, audio, document |
Inference Modes
Control the speed/quality tradeoff with the mode parameter:
| Mode | Description |
|---|---|
| auto | Automatically selects the best mode (default) |
| turbo | Faster responses, may reduce precision due to limited context |
| smart | Deeper analysis, considers more context for complex questions |
curl --request POST \
--url https://api.streamdiver.com/v2/chats \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"prompt": "Summarize the budget discussion",
"mediaIds": ["{mediaId}"],
"mode": "smart"
}'
Use GET /tenants/current/speakers to list all recognized speakers across your library, including their IDs and labels. You can then pass speaker names in the speakers filter to ask what specific people said.
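Putting the two endpoints together, you can resolve speaker labels from the speakers listing and feed them into a chat request body. The response shape below (`data` items with `id` and `label` fields) is an assumption based on the "IDs and labels" description, and `build_speaker_query` is an illustrative helper:

```python
def build_speaker_query(prompt: str, speakers_response: dict, wanted: set[str]) -> dict:
    """Build a POST /chats body scoped to speakers whose labels match `wanted`.

    Assumes the speakers endpoint returns {"data": [{"id": ..., "label": ...}]};
    only the labels are passed to the `speakers` filter.
    """
    labels = [s["label"] for s in speakers_response["data"] if s["label"] in wanted]
    return {"prompt": prompt, "speakers": labels}

body = build_speaker_query(
    "What did the CFO say about the budget?",
    {"data": [{"id": "sp-1", "label": "Dr. Smith"}, {"id": "sp-2", "label": "Jane Doe"}]},
    {"Dr. Smith"},
)
# body -> {"prompt": "What did the CFO say about the budget?", "speakers": ["Dr. Smith"]}
```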
Conversation History
Maintain multi-turn conversations by passing previous messages in messageHistory:
{
"prompt": "Can you elaborate on the third point?",
"messageHistory": [
{ "role": "user", "content": "What are the key takeaways?" },
{ "role": "assistant", "content": "The presentation highlights three key takeaways: ..." }
]
}
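On the client side, this means appending each completed user/assistant exchange before sending the next prompt. A minimal sketch (the `Conversation` class is illustrative, not part of any SDK):

```python
class Conversation:
    """Accumulate turns and build POST /chats bodies with messageHistory."""

    def __init__(self):
        self.history = []

    def request_body(self, prompt: str, **filters) -> dict:
        """Build the next request body, including history from earlier turns."""
        body = {"prompt": prompt, **filters}
        if self.history:
            body["messageHistory"] = list(self.history)
        return body

    def record(self, prompt: str, answer: str) -> None:
        """Append a completed user/assistant exchange to the history."""
        self.history.append({"role": "user", "content": prompt})
        self.history.append({"role": "assistant", "content": answer})

chat = Conversation()
first = chat.request_body("What are the key takeaways?")          # no history yet
chat.record("What are the key takeaways?", "Three key takeaways: ...")
followup = chat.request_body("Can you elaborate on the third point?")
```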
Streaming Responses (SSE)
For real-time UIs, use the streaming endpoint that returns Server-Sent Events:
curl --request POST \
--url https://api.streamdiver.com/v2/chats/stream \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: text/event-stream" \
--data '{ "prompt": "Summarize the latest board meeting" }'
Each SSE event contains a token chunk. The final event signals completion:
data: {"Token": "The board", "isComplete": false}
data: {"Token": " discussed", "isComplete": false}
data: {"Token": "", "isComplete": true}
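A client can reassemble the answer by concatenating `Token` values until `isComplete` arrives. A sketch of that loop, using the event field names shown above (in a real client the lines would come from an SSE connection rather than a list):

```python
import json

def accumulate_tokens(sse_lines) -> str:
    """Join token chunks from a chat SSE stream until isComplete is seen."""
    answer = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        event = json.loads(line[len("data: "):])
        if event.get("isComplete"):
            break
        answer.append(event["Token"])
    return "".join(answer)

stream = [
    'data: {"Token": "The board", "isComplete": false}',
    'data: {"Token": " discussed", "isComplete": false}',
    'data: {"Token": "", "isComplete": true}',
]
print(accumulate_tokens(stream))  # -> The board discussed
```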
Flows: Agentic Automation
Flows are parameterized AI workflows that go beyond simple Q&A. They enable automated content analysis (insights) and content generation (creation) -- from summarization to structured data extraction.
Discover Available Flows
List all flows available for your tenant, optionally filtered by category:
# List all insight flows
curl --request GET \
--url "https://api.streamdiver.com/v2/flows?category=insights&lang=en" \
--header "Authorization: Bearer {token}"
The response includes each flow's name, description, supported media types, and estimated duration:
{
"data": [
{
"id": "flow-uuid",
"name": "Meeting Summary",
"description": "Generates a structured summary of a meeting recording.",
"category": "insights",
"mediaTypes": ["video", "audio"],
"estimatedDuration": "medium"
}
]
}
| Category | Purpose |
|---|---|
| insights | Analyze existing content (summaries, key topics, compliance checks) |
| creation | Generate new content (social media posts, newsletters, reports) |
Inspect a Flow's Schema
Each flow has a dynamic input schema. Inspect it before execution to see what parameters are required:
curl --request GET \
--url "https://api.streamdiver.com/v2/flows/{flowId}/schema?lang=en" \
--header "Authorization: Bearer {token}"
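Since the schema is dynamic, it's worth validating parameters client-side before starting a run. The sketch below assumes a JSON-Schema-like shape with a top-level `required` list -- the actual format returned by the endpoint may differ, so treat this as illustrative:

```python
def missing_params(schema: dict, params: dict) -> list[str]:
    """Return required parameter names absent from `params`.

    Assumes a JSON-Schema-like shape with a top-level "required" list.
    """
    return [name for name in schema.get("required", []) if name not in params]

# Hypothetical schema for a summary flow
schema = {
    "required": ["language", "format"],
    "properties": {"language": {"type": "string"}, "format": {"type": "string"}},
}
missing = missing_params(schema, {"language": "en"})  # -> ["format"]
```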
Execute a Flow
Run a flow against a media asset. The response is streamed via SSE:
- curl
- Python
- TypeScript
curl --request POST \
--url https://api.streamdiver.com/v2/flows/{flowId}/runs \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: text/event-stream" \
--data '{
"mediaId": "{mediaId}",
"llmProvider": "streamdiver/nitrox",
"params": {
"language": "en",
"format": "bullet_points"
}
}'
import requests

response = requests.post(
    f"https://api.streamdiver.com/v2/flows/{flow_id}/runs",
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "text/event-stream",
    },
    json={
        "mediaId": media_id,
        "llmProvider": "streamdiver/nitrox",
        "params": {"language": "en", "format": "bullet_points"},
    },
    stream=True,
)

for line in response.iter_lines():
    if line:
        print(line.decode())
const response = await fetch(
`https://api.streamdiver.com/v2/flows/${flowId}/runs`,
{
method: "POST",
headers: {
Authorization: `Bearer ${token}`,
"Content-Type": "application/json",
Accept: "text/event-stream",
},
body: JSON.stringify({
mediaId,
llmProvider: "streamdiver/nitrox",
params: { language: "en", format: "bullet_points" },
}),
}
);
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value));
}
The stream emits events of different types: message (content tokens), search (retrieval progress), progress (status updates), and error.
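A client typically dispatches each payload to a handler based on its type. The sketch below assumes each SSE data payload is JSON with a `type` field naming the event kind -- the real event envelope is not shown in this section, so both the field name and the handler map are illustrative:

```python
import json

def handle_event(raw: str, handlers: dict) -> None:
    """Dispatch a flow-run SSE data payload to a handler by event type.

    Assumes each payload is JSON with a "type" field (message, search,
    progress, or error); unknown types are ignored.
    """
    event = json.loads(raw)
    handler = handlers.get(event.get("type"), lambda e: None)
    handler(event)

tokens = []
handle_event(
    '{"type": "message", "content": "Summary: ..."}',
    {
        "message": lambda e: tokens.append(e["content"]),
        "error": lambda e: print("flow failed:", e),
    },
)
```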
Further Resources
- Interactive API Reference -- Chat -- all chat endpoints and parameters
- Interactive API Reference -- Flows -- flow discovery and execution
- Transcripts & Subtitles -- manage the transcripts that power Chat
- Metadata -- explore entities, keywords, and speakers used as context
- Search -- keyword and semantic search across your library
Related Use Cases
- AI Media Analysis -- metadata extraction, semantic search, and automated AI workflows
- Media Monitoring -- RAG-based Q&A for media archives
- Corporate Communications -- AI summaries and content repurposing via Chat and Flows