1 Definition
A Data Type that can be used by a Text-To-Speech AI Module to generate a Speech Object from a Text Object.
2 Functional Requirements
The generated Speech Object can be perceived as:
- Having been generated by a specific human.
- Having a specific language or dialect intonation.
- Not having any specific connotation.
- Etc.
The Speech Model can be implemented as a Neural Network trained to generate utterances with specific features. Each Neural Network has its own set of parameters, which determine its Format; each Format is identified by an ID.
3 Syntax
https://schemas.mpai.community/MMC/V2.2/data/SpeechModel.json
4 Semantics
Label | Size | Description |
Header | N1 Bytes | |
· Standard-Speech Model | 9 Bytes | The characters “MMC-SML-V” |
· Version | N2 Bytes | Major version – 1 or 2 characters |
· Dot-separator | 1 Byte | The character “.” |
· Subversion | N3 Bytes | Minor version – 1 or 2 characters |
MInstanceID | N4 Bytes | Identifier of M-Instance. |
SpeechModelID | N5 Bytes | Identifier of the Speech Model. |
SpeechModelData | N6 Bytes | Standard set of Model Data |
– SpeechModelQualifier | N7 Bytes | Model Format ID |
– SpeechModelPayload | N8 Bytes | Set of Data Length and URI |
– SpeechModelDataLength | N9 Bytes | Model Data Length in Bytes |
– SpeechModelDataURI | N10 Bytes | Model Data URI |
DescrMetadata | N11 Bytes | Descriptive Metadata |
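As an illustration only, the fields in the table above could be assembled into a JSON-like object as in the following Python sketch. It is not derived from the normative JSON schema referenced under Syntax. It assumes that the version characters are decimal digits, that all identifiers are strings, and that SpeechModelDataLength and SpeechModelDataURI are nested inside SpeechModelPayload. The names `SpeechModel`, `parse_header`, and every example value are hypothetical.

```python
import json
import re
from dataclasses import dataclass

# Header layout from the Semantics table: "MMC-SML-V" + major + "." + minor.
# Assumption: the 1 or 2 version characters are decimal digits.
HEADER_RE = re.compile(r"MMC-SML-V(\d{1,2})\.(\d{1,2})")


def parse_header(header):
    """Return (major, minor), or raise ValueError if the header is malformed."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"not a Speech Model header: {header!r}")
    return int(match.group(1)), int(match.group(2))


@dataclass
class SpeechModel:
    """Hypothetical in-memory view of the Speech Model fields."""
    header: str
    m_instance_id: str
    speech_model_id: str
    qualifier: str        # Model Format ID
    data_length: int      # Model Data length in bytes
    data_uri: str         # Model Data URI
    descr_metadata: str = ""

    def to_dict(self):
        parse_header(self.header)  # reject malformed headers early
        return {
            "Header": self.header,
            "MInstanceID": self.m_instance_id,
            "SpeechModelID": self.speech_model_id,
            "SpeechModelData": {
                "SpeechModelQualifier": self.qualifier,
                # Assumed nesting: the payload groups length and URI.
                "SpeechModelPayload": {
                    "SpeechModelDataLength": self.data_length,
                    "SpeechModelDataURI": self.data_uri,
                },
            },
            "DescrMetadata": self.descr_metadata,
        }


# Placeholder values, chosen only to exercise the structure.
model = SpeechModel(
    header="MMC-SML-V2.2",
    m_instance_id="example-m-instance",
    speech_model_id="example-speech-model",
    qualifier="example-format-id",
    data_length=1024,
    data_uri="https://example.com/models/voice.bin",
)
print(json.dumps(model.to_dict(), indent=2))
```

Keeping the Model Data behind a length and a URI, rather than inline, lets the object stay small while the network parameters themselves are stored elsewhere.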
5 Data Formats
IDs required:
- ModelFormat
- ModelAttribute
- ModelAttributeFormat
6 To Respondents
Respondents are requested to:
- Comment on the characterisation of Speech Model.
- Propose Speech Model Formats.