Metadata
Every uploaded media asset is automatically analyzed by Streamdiver's AI pipeline. Beyond transcripts, the platform extracts entities, keywords, speakers, on-screen text (OCR), and document content. This tutorial shows how to retrieve, expand, and re-extract this metadata.
Retrieve Metadata
Get the metadata overview for an asset:
- curl
- Python
- TypeScript
curl --request GET \
--url https://api.streamdiver.com/v2/media/{assetId}/metadata \
--header "Authorization: Bearer {token}"
import requests

metadata = requests.get(
    f"https://api.streamdiver.com/v2/media/{asset_id}/metadata",
    headers={"Authorization": f"Bearer {token}"},
).json()
print("Status:", metadata["data"]["status"])
print("Summary:", metadata["data"].get("summary"))
const metadata = await fetch(
  `https://api.streamdiver.com/v2/media/${assetId}/metadata`,
  { headers: { Authorization: `Bearer ${token}` } }
).then((r) => r.json());
console.log("Status:", metadata.data.status);
console.log("Summary:", metadata.data.summary);
The response includes a status field (e.g. running or completed) and a summary of the asset. Detailed metadata for each category is returned only as a count; use the expand parameter to include the full data.
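Because extraction runs asynchronously, clients typically poll this endpoint until the status settles. A minimal sketch, assuming the response shape shown above and assuming that completed and failed are the terminal status values (the full status vocabulary isn't listed here):

```python
def is_terminal(metadata: dict) -> bool:
    # The terminal status values ("completed", "failed") are an assumption.
    return metadata["data"]["status"] in ("completed", "failed")

sample = {"data": {"status": "running", "summary": None}}
print(is_terminal(sample))  # prints False
```

A polling loop would call the metadata endpoint, check `is_terminal`, and sleep between attempts until it returns True.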
Expand Specific Metadata
Use the expand query parameter to include detailed data for specific metadata categories in the response:
- curl
- Python
- TypeScript
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=entities,keywords,speakers" \
--header "Authorization: Bearer {token}"
metadata = requests.get(
    f"https://api.streamdiver.com/v2/media/{asset_id}/metadata",
    params={"expand": "entities,keywords,speakers"},
    headers={"Authorization": f"Bearer {token}"},
).json()

# Access expanded entities
for entity in metadata["data"]["entities"]["items"]:
    print(entity["name"], entity["count"])

# Access expanded keywords
for keyword in metadata["data"]["keywords"]["items"]:
    print(keyword["name"])
const metadata = await fetch(
  `https://api.streamdiver.com/v2/media/${assetId}/metadata?` +
    new URLSearchParams({ expand: "entities,keywords,speakers" }),
  { headers: { Authorization: `Bearer ${token}` } }
).then((r) => r.json());
metadata.data.entities.items.forEach((e) =>
  console.log(e.name, e.count)
);
Available Expand Fields
| Field | Asset types | Description |
|---|---|---|
| entities | Video, audio | Named entities: people, places, organizations |
| keywords | Video, audio | Automatically derived keywords and topics |
| speakers | Video, audio | Recognized speakers with labels and timestamps |
| videoTexts | Video | On-screen text extracted via OCR |
| imageTexts | Image | Text detected in images via OCR |
| documentContent | Document | Full extracted text content of documents |
| file | All | File-level metadata (EXIF, IPTC, XMP) |
Transcripts are not included in expand results. Use the dedicated Transcripts API to list and download transcripts.
Entities
Entities are named references to people, locations, and organizations extracted from transcripts and document content:
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=entities" \
--header "Authorization: Bearer {token}"
Each entity includes its name, category, and the number of occurrences.
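For reporting, it is often useful to group entities by category and rank them by frequency. A sketch assuming the item shape from the earlier examples (`items`, `name`, `count`) plus a `category` key, whose exact name and values are assumptions:

```python
from collections import defaultdict

def entities_by_category(metadata: dict) -> dict:
    # Group expanded entities by category, most frequent first.
    grouped = defaultdict(list)
    for item in metadata["data"]["entities"]["items"]:
        grouped[item["category"]].append((item["name"], item["count"]))
    for pairs in grouped.values():
        pairs.sort(key=lambda pair: -pair[1])
    return dict(grouped)

sample = {"data": {"entities": {"items": [
    {"name": "Graz", "category": "location", "count": 4},
    {"name": "ACME GmbH", "category": "organization", "count": 2},
    {"name": "Vienna", "category": "location", "count": 7},
]}}}
print(entities_by_category(sample)["location"])
# prints [('Vienna', 7), ('Graz', 4)]
```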
Keywords
Keywords represent the most relevant topics and terms found in the asset:
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=keywords" \
--header "Authorization: Bearer {token}"
Speakers
For video and audio assets, Streamdiver automatically identifies and segments speakers. Expand speakers to see who spoke and when:
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=speakers" \
--header "Authorization: Bearer {token}"
Speakers are initially anonymous and can be labeled via the Publishing-Suite or the API:
# List all recognized speakers across your tenant
curl --request GET \
--url "https://api.streamdiver.com/v2/tenants/current/speakers?includeAnonymous=false" \
--header "Authorization: Bearer {token}"
Once a speaker is labeled, they are automatically recognized across all future uploads.
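A client can join the tenant-level speaker list with an asset's expanded speakers to show friendly names. A sketch only: the `id` and `label` field names are assumptions and should be checked against the API reference:

```python
def label_asset_speakers(asset_speakers: list, tenant_speakers: list) -> list:
    # Attach tenant-level labels to per-asset speakers; speakers without
    # a tenant label stay anonymous. Field names are assumptions.
    labels = {s["id"]: s["label"] for s in tenant_speakers}
    return [
        {**sp, "label": labels.get(sp["id"], "anonymous")}
        for sp in asset_speakers
    ]

asset = [{"id": "spk-1"}, {"id": "spk-2"}]
tenant = [{"id": "spk-1", "label": "Jane Doe"}]
print(label_asset_speakers(asset, tenant))
```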
OCR Text (Video & Image)
Text appearing on screen in videos or embedded in images is extracted via OCR:
# Video on-screen text
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=videoTexts" \
--header "Authorization: Bearer {token}"
# Image text
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=imageTexts" \
--header "Authorization: Bearer {token}"
OCR text is also indexed for search, so slides, whiteboards, and titles in your videos become searchable.
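To feed the detected text into your own tooling, the expanded items can be flattened into plain strings. A sketch, assuming each item carries a `text` field (an assumption; the real item shape may include timestamps or regions):

```python
def ocr_lines(metadata: dict, field: str = "videoTexts") -> list:
    # Flatten detected OCR snippets, skipping empty detections.
    return [item["text"] for item in metadata["data"][field]["items"] if item.get("text")]

sample = {"data": {"videoTexts": {"items": [
    {"text": "Q3 Revenue"},
    {"text": ""},
    {"text": "Roadmap 2025"},
]}}}
print(ocr_lines(sample))
# prints ['Q3 Revenue', 'Roadmap 2025']
```

The same helper works for image assets by passing `field="imageTexts"`.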
Document Content
For uploaded documents (PDF, DOCX, etc.), the full text content is extracted and indexed:
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=documentContent" \
--header "Authorization: Bearer {token}"
File Metadata
Retrieve technical file metadata (EXIF, IPTC, XMP) for any asset:
curl --request GET \
--url "https://api.streamdiver.com/v2/media/{assetId}/metadata?expand=file" \
--header "Authorization: Bearer {token}"
Re-Extract Metadata
Trigger a new metadata extraction for an asset, optionally with specific presets:
curl --request PUT \
--url https://api.streamdiver.com/v2/media/{assetId}/metadata \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--data '{
"presets": [
{ "presetId": "{presetId}", "configuration": "{\"language\": \"de-v2\"}" }
]
}'
Use GET /tenants/current/metadata/presets to list the metadata extraction presets available for your tenant, along with their configuration schemas and dependencies.
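Note that each preset's configuration is sent as a JSON-encoded string, not a nested object, as the example request shows. A small helper can build the request body from plain dicts:

```python
import json

def reextract_payload(presets: dict) -> dict:
    # Each preset's configuration travels as a JSON-encoded string,
    # mirroring the PUT example above.
    return {
        "presets": [
            {"presetId": pid, "configuration": json.dumps(cfg)}
            for pid, cfg in presets.items()
        ]
    }

payload = reextract_payload({"preset-transcribe": {"language": "de-v2"}})
print(payload["presets"][0]["configuration"])
# prints {"language": "de-v2"}
```

The preset ID here is a placeholder; use the IDs returned by the presets endpoint for your tenant.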
SEO Metadata
For public video assets, Streamdiver generates JSON-LD structured data compliant with the Google Video SEO standard:
curl --request GET \
--url https://api.streamdiver.com/v2/media/{assetId}/seo \
--header "Authorization: Bearer {token}"
Embed the returned JSON-LD in your page's <head> to improve search engine visibility for embedded videos.
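For example, if you render pages server-side, the returned JSON-LD can be serialized into the script element Google's structured-data guidelines expect. A sketch (the sample payload is illustrative, not the endpoint's actual response):

```python
import json

def jsonld_script_tag(seo: dict) -> str:
    # Wrap the SEO endpoint's JSON-LD in a <script> element for <head>.
    return '<script type="application/ld+json">' + json.dumps(seo) + "</script>"

tag = jsonld_script_tag({"@context": "https://schema.org", "@type": "VideoObject"})
print(tag)
```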
Further Resources
- Interactive API Reference -- Metadata -- all metadata endpoints
- Transcripts & Subtitles -- manage transcripts separately
- Search -- search across all metadata layers
- Chat, RAG & Flows -- ask questions using metadata as context
Related Use Cases
- AI media analysis (KI-Medienanalyse) -- full AI pipeline: entities, keywords, speakers, OCR, and automated workflows
- Media monitoring (Medienbeobachtung) -- entity extraction and analytics for media monitoring platforms