Speech Recognition - Transcript Schema – Nuclei, Inc.

Overview

This guide describes the JSON transcript schema used by Nuclei Speech Recognition. Integration partners who receive JSON-formatted transcripts from Nuclei can use the schema to build custom post-processing on top of:

Word-level timestamps
Word-level accuracy confidence scores
Speaker segmented transcripts

Example use cases include:

Select a word in the transcript and jump to the corresponding audio segment for replay.
Hover over a word in the transcript to see the confidence score for that word.
Display speaker-segmented transcript summaries.
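The first two use cases can be sketched in a few lines of Python. This is a minimal illustration rather than an official client: the helper names `word_at` and `describe` are hypothetical, the field names follow the Schema section of this guide, and the code assumes that timestamps are expressed in seconds and that the first entry in `alternatives` is the preferred one.

```python
from typing import Optional


def word_at(items: list[dict], seconds: float) -> Optional[dict]:
    """Return the spoken-word item whose time span covers `seconds`, if any."""
    for item in items:
        # Punctuation items carry no timestamps in the example, so skip them.
        if item.get("type") != "pronunciation":
            continue
        # Timestamps arrive as strings in the example, so convert before comparing.
        if float(item["start_time"]) <= seconds <= float(item["end_time"]):
            return item
    return None


def describe(item: dict) -> tuple[str, float, float]:
    """Return (word, start time, confidence) for a spoken-word item.

    Assumption: the first alternative is the one to display.
    """
    best = item["alternatives"][0]
    return best["content"], float(item["start_time"]), float(best["confidence"])


# Items copied from the example transcript in this guide.
items = [
    {"start_time": "0.64", "end_time": "1.24", "type": "pronunciation",
     "alternatives": [{"confidence": "1.0", "content": "Hello"}]},
    {"type": "punctuation",
     "alternatives": [{"confidence": "0.0", "content": ","}]},
    {"start_time": "1.39", "end_time": "1.71", "type": "pronunciation",
     "alternatives": [{"confidence": "1.0", "content": "world"}]},
]

hit = word_at(items, 1.5)
if hit is not None:
    word, start, confidence = describe(hit)
    print(f"{word!r} starts at {start} with confidence {confidence}")
```

A player UI could seek to the returned start time when a word is clicked, and show the confidence value in a tooltip on hover.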

Schema

Each JSON transcript contains the following fields:

id
metadata
- identify_language
- identify_multiple_languages
- identified_language_score
- language_code
- language_options
- media_format
- settings
  - channel_identification
  - max_speaker_labels
  - show_speaker_labels
results
- summary
- items
  - start_time
  - end_time
  - alternatives
    - confidence
    - content
  - type
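For typed access to `results.items`, the structure above can be mirrored with small data classes. The sketch below is one possible mapping, not a definitive one: the class names `Item` and `Alternative` and the function `parse_item` are hypothetical, and the code assumes what the example in this guide shows, namely that numeric values are encoded as strings and that punctuation items omit `start_time` and `end_time`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alternative:
    confidence: float
    content: str


@dataclass
class Item:
    type: str
    alternatives: list[Alternative]
    # Optional because punctuation items in the example have no timestamps.
    start_time: Optional[float] = None
    end_time: Optional[float] = None


def _to_float(value: Optional[str]) -> Optional[float]:
    """Convert a string-encoded number, passing through a missing value."""
    return None if value is None else float(value)


def parse_item(raw: dict) -> Item:
    """Build an Item from one entry of results.items."""
    return Item(
        type=raw["type"],
        alternatives=[
            Alternative(confidence=float(a["confidence"]), content=a["content"])
            for a in raw["alternatives"]
        ],
        start_time=_to_float(raw.get("start_time")),
        end_time=_to_float(raw.get("end_time")),
    )
```

Converting the string values once at the parsing boundary keeps later comparisons and arithmetic on timestamps free of repeated `float(...)` calls.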

Examples

The following example demonstrates the transcript schema for a basic recording:

{
    "id": "c70e570f-c970-49af-a90c-4bfeb76b6f95",
    "metadata": {
        "identify_language": true,
        "identify_multiple_languages": false,
        "identified_language_score": 0.9849987030029297,
        "language_code": "en-US",
        "language_options": ["de-DE", "en-US", "fr-FR"],
        "media_format": "mp3",
        "settings": {
            "channel_identification": false,
            "max_speaker_labels": 2,
            "show_speaker_labels": false
        }
    },
    "results": {
        "summary": "Hello, world!",
        "items": [
            {
                "start_time": "0.64",
                "end_time": "1.24",
                "alternatives": [
                    {
                        "confidence": "1.0",
                        "content": "Hello"
                    }
                ],
                "type": "pronunciation"
            },
            {
                "alternatives": [
                    {
                        "confidence": "0.0",
                        "content": ","
                    }
                ],
                "type": "punctuation"
            },
            {
                "start_time": "1.39",
                "end_time": "1.71",
                "alternatives": [
                    {
                        "confidence": "1.0",
                        "content": "world"
                    }
                ],
                "type": "pronunciation"
            },
            {
                "alternatives": [
                    {
                        "confidence": "0.0",
                        "content": "!"
                    }
                ],
                "type": "punctuation"
            }
        ]
    }
}
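As a rough illustration of consuming this payload, the Python sketch below loads an abbreviated copy of the example and rebuilds readable text from `results.items`. The function name `render_text` is hypothetical, and the spacing rule (a space before each word, none before punctuation) is an assumption that suits the English example rather than documented behavior.

```python
import json


def render_text(transcript: dict) -> str:
    """Join the first alternative of every item into a readable string."""
    parts: list[str] = []
    for item in transcript["results"]["items"]:
        content = item["alternatives"][0]["content"]
        # Punctuation attaches to the previous word; words are space-separated.
        if item["type"] == "punctuation" or not parts:
            parts.append(content)
        else:
            parts.append(" " + content)
    return "".join(parts)


# Abbreviated copy of the example above (metadata omitted for brevity).
raw = """
{
  "id": "c70e570f-c970-49af-a90c-4bfeb76b6f95",
  "results": {
    "summary": "Hello, world!",
    "items": [
      {"start_time": "0.64", "end_time": "1.24", "type": "pronunciation",
       "alternatives": [{"confidence": "1.0", "content": "Hello"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": ","}]},
      {"start_time": "1.39", "end_time": "1.71", "type": "pronunciation",
       "alternatives": [{"confidence": "1.0", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "!"}]}
    ]
  }
}
"""

transcript = json.loads(raw)
print(render_text(transcript))
```

Rebuilding the text from `items` instead of reading a single string keeps each displayed word tied to its timestamps and confidence score, which the click-to-seek and hover use cases depend on.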

More Information

For more information, contact Nuclei's sales team at hello@nuclei.ai.
