1     Definition 2     Functional Requirements 3     Syntax
4     Semantics 5    Conformance Testing 6     Performance Assessment

1      Definition

A Data Type representing characteristic elements extracted from the input speech over a period of time, specifically Pitch, Intensity, Tempo, Personal Status, and NNSpeechFeatures.

2      Functional Requirements

Speech Descriptors may include Neural Network Descriptors.

3      Syntax

https://schemas.mpai.community/MMC/V2.4/data/SpeechDescriptors.json

4      Semantics

| Label | Description |
|---|---|
| Header | Speech Descriptors Header |
| – Standard – SpeechDescriptors | The characters “MMC-SPD-V” |
| – Version | Major version – 1 or 2 characters |
| – Dot-separator | The character “.” |
| – Subversion | Minor version – 1 or 2 characters |
| MInstanceID | ID of the Metaverse Instance. |
| SpeechDescriptorsID | ID of Speech Descriptors. |
| SpeechDescriptorsData | Data associated with the input speech. |
| NNSpeechFeatures | The output vector of a neural network that takes Speech as input. |
| Duration | The Time over which the Speech Descriptors are computed. |
| Pitch | Real number measuring the fundamental frequency of Speech in Hz (hertz). |
| Intensity | Real number measuring the Energy of Speech in dB (decibels). |
| Tempo | Real number measuring the rate at which specified linguistic units (Phonemes, Syllables, or Words) are produced. |
| Personal Status | The Speech Personal Status carried by the input speech. |
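
The Header layout in the table above (the characters “MMC-SPD-V”, a major version, a dot, and a minor version) can be checked with a short pattern. The following is a minimal sketch, not part of the specification. It assumes that the version "characters" are decimal digits, which the table does not state, and the function name `parse_header` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: the 1 or 2 version "characters" are decimal digits; the
# Semantics table only says "1 or 2 characters".
HEADER_PATTERN = re.compile(r"MMC-SPD-V([0-9]{1,2})\.([0-9]{1,2})")


def parse_header(header: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the header is well formed, else None."""
    match = HEADER_PATTERN.fullmatch(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Under these assumptions, `parse_header("MMC-SPD-V2.4")` yields the pair `(2, 4)`, while a string with a missing dot-separator or a three-character version yields `None`.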

5     Conformance Testing

A Data instance Conforms with MPAI-MMC V2.3 Speech Descriptors (MMC-SPD) if:

  1. The Data validates against the Speech Descriptors’ JSON Schema.
  2. All Data in the Speech Descriptors’ JSON Schema:
    1. Have the specified type
    2. Validate against their JSON Schemas
    3. Conform with their Data Qualifiers if present.
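
As an illustration of condition 2.1 only, the sketch below checks that the fields described in Semantics as real numbers (Pitch, Intensity, Tempo) actually hold real numbers. It is a partial approximation under stated assumptions, not a conformance test: the key names are taken from the Semantics labels and may differ from those in the published JSON Schema, the function name `find_type_errors` is hypothetical, and full validation against the schema is still required by condition 1.

```python
from numbers import Real
from typing import Any, Dict, List

# Hypothetical key names, taken from the labels in the Semantics table; the
# published JSON Schema is authoritative for the real names and nesting.
REAL_VALUED_FIELDS = ("Pitch", "Intensity", "Tempo")


def find_type_errors(descriptor_data: Dict[str, Any]) -> List[str]:
    """List the real-valued fields that are present but not real numbers."""
    errors = []
    for name in REAL_VALUED_FIELDS:
        if name not in descriptor_data:
            continue  # presence rules are left to the JSON Schema
        value = descriptor_data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            errors.append(
                f"{name} must be a real number, got {type(value).__name__}"
            )
    return errors
```

An empty list means no type problem was found in the fields checked; it does not by itself establish conformance.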

6     Performance Assessment