Media Metadata

Each uploaded Media asset is analyzed automatically and its Metadata is extracted. This starts with basic Metadata such as file size and creation date, and goes far beyond that. It also includes descriptive Metadata: transcripts, speaker diarization and recognition, keywords and entities, and scene recognition, all obtained by applying various machine learning techniques.

Transcription in multiple languages is supported. You can retrieve a list of the languages available for your tenant and request transcripts in multiple languages for an asset via Extract metadata. Automatic cross-language transcription from any source language to English is also supported.

The speakers in video and audio assets are automatically assigned representations that can be labeled. Once labeled, they are named automatically in the transcripts and Subtitles whenever they are recognized again in subsequently uploaded Media assets.

The Metadata is indexed and forms the basis for the deep-search functionality across all Media types. The Metadata can also be retrieved as structured data for SEO enhancement of embedded Media assets.

📄️ Retrieve metadata

This endpoint returns all derived `Metadata` for a given `Media` asset.

📄️ Extract metadata

Extracts metadata (with optional presets) from the `Media` asset.

📄️ Delete metadata

Deletes the metadata for a given asset.

📄️ List transcripts

The transcripts endpoint returns a list of transcripts associated with the given `Media` asset.

📄️ Retrieve a transcript

Retrieves a transcript of a `Media` asset by its identifier. An ID is needed since there can be multiple `Transcripts` for a single `Media` asset.

📄️ Update a transcript

Updates a particular `Transcript` of a `Media` asset. Partial updates are supported based on a subset of the `Transcript` paragraphs.
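As a rough illustration of what a paragraph-level partial update means, the sketch below merges a subset of changed paragraphs into an existing list. The paragraph shape (`id`, `text`) and the `apply_partial_update` helper are assumptions made for this example; they are not taken from the API schema, and the service may identify paragraphs differently.

```python
from typing import Dict, List


def apply_partial_update(paragraphs: List[Dict], changes: List[Dict]) -> List[Dict]:
    """Merge changed paragraphs into the full list, matching on a hypothetical `id` field."""
    changes_by_id = {change["id"]: change for change in changes}
    unknown = set(changes_by_id) - {p["id"] for p in paragraphs}
    if unknown:
        raise KeyError(f"unknown paragraph ids: {sorted(unknown)}")
    # Paragraphs without a matching change are copied through unchanged.
    return [{**p, **changes_by_id.get(p["id"], {})} for p in paragraphs]


paragraphs = [
    {"id": "p1", "text": "Helo world"},
    {"id": "p2", "text": "Second paragraph"},
]
updated = apply_partial_update(paragraphs, [{"id": "p1", "text": "Hello world"}])
```

Only `p1` is sent in the change set; `p2` is carried over untouched, which is the point of a partial update.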

📄️ Delete a transcript

Deletes a `Transcript` of a `Media` asset.

📄️ Update the transcript time alignment

Realigns the word timestamps of a `Transcript` and updates the `Transcript` with the realigned word timestamps.

📄️ Activate a transcript

Adds the `isActive` attribute to the specified `Transcript` and removes it from all other `Transcripts` of the given `Media` asset.
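The exclusive-flag behaviour described here can be pictured with a small in-memory sketch. Only the `isActive` attribute name comes from the description; the dictionary shape, the `id` field, and the `activate_transcript` helper are assumptions for illustration, not the service's data model or client API.

```python
from typing import Dict, List


def activate_transcript(transcripts: List[Dict], transcript_id: str) -> List[Dict]:
    """Return copies of the transcripts in which only `transcript_id` carries `isActive`."""
    if not any(t["id"] == transcript_id for t in transcripts):
        raise KeyError(f"unknown transcript: {transcript_id}")
    result = []
    for t in transcripts:
        # Drop the flag everywhere first, then add it back to the chosen transcript.
        updated = {k: v for k, v in t.items() if k != "isActive"}
        if t["id"] == transcript_id:
            updated["isActive"] = True
        result.append(updated)
    return result


transcripts = [
    {"id": "t1", "languageModel": "en", "isActive": True},
    {"id": "t2", "languageModel": "de"},
]
activated = activate_transcript(transcripts, "t2")
```

After the call, `t2` is the only entry carrying `isActive`, so at most one transcript per asset is active at any time.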

📄️ Update a speaker in a transcript

Updates a speaker in a `Transcript` by assigning a different speaker to the `paragraph`. This tells the system that the recognition was incorrect and that the representation of the speaker should be updated.

📄️ Assign a transcript as subtitle

Derives subtitles from the `Transcript` and assigns them. The ISO-639 `languageModel` code of the `Transcript` is automatically mapped to the language name.
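The mapping from a language code to a language name could look roughly like the sketch below. The lookup table, the handling of region-qualified values such as `en-US`, and the fallback to the raw code are all assumptions for illustration; the service's actual table and fallback behaviour are not documented here.

```python
# Hypothetical lookup table with a handful of ISO 639-1 codes; a real table would cover more.
LANGUAGE_NAMES = {
    "en": "English",
    "de": "German",
    "fr": "French",
    "es": "Spanish",
}


def language_name(language_model: str) -> str:
    """Map a language code to a display name, falling back to the code itself."""
    # Assumption: region-qualified values are reduced to their primary subtag.
    primary = language_model.replace("_", "-").split("-")[0].lower()
    return LANGUAGE_NAMES.get(primary, language_model)


label = language_name("de")
```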

📄️ Download a transcript

Downloads a transcript in the specified format. Available formats are `JSON`, `TXT`, `SRT`, `WebVTT` and `DOCX` (*Microsoft Word*). A [template can be created](#tag/Design-Settings/operation/createDocxExportDesignSettings) and applied to the `DOCX` format export.
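To give a feel for one of these formats, the sketch below renders timed text as `SRT` cues (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, then the text). The cue dictionaries with `start`, `end`, and `text` keys are a hypothetical input shape chosen for this example; the export itself is performed by the endpoint, and its internal transcript structure may differ.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Dict]) -> str:
    """Render cues as SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(cue['start'])} --> {srt_timestamp(cue['end'])}"
        blocks.append(f"{index}\n{time_range}\n{cue['text']}")
    return "\n\n".join(blocks) + "\n"


srt_text = to_srt([
    {"start": 0.0, "end": 1.25, "text": "Hello"},
    {"start": 1.25, "end": 3.0, "text": "and welcome."},
])
```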