1 Definition
A Data Type that can be used by a Text-To-Speech AI Module to generate a Speech Object from a Text Object.
2 Functional Requirements
The generated Speech Object can be perceived as:
- Having been generated by a specific human.
- Having a specific language or dialect intonation.
- Not having any specific connotation.
- Etc.
The Speech Model can be implemented as a Neural Network trained to generate utterances with specific features. Each Neural Network has its own set of parameters, which define its Format; the Format is identified by an ID.
3 Syntax
https://schemas.mpai.community/MMC/V2.2/data/SpeechModel.json
4 Semantics
| Label | Size | Description |
| Header | N1 Bytes | |
| · Standard-SpeechModel | 9 Bytes | The characters “MMC-SML-V” |
| · Version | N2 Bytes | Major version – 1 or 2 characters |
| · Dot-separator | 1 Byte | The character “.” |
| · Subversion | N3 Bytes | Minor version – 1 or 2 characters |
| MInstanceID | N4 Bytes | Identifier of M-Instance. |
| SpeechModelID | N5 Bytes | Identifier of the Speech Model. |
| SpeechModelData | N6 Bytes | Standard set of Model Data |
| – SpeechModelQualifier | N7 Bytes | Model Format ID |
| – SpeechModelPayload | N8 Bytes | Set of Data Length and URI |
| – SpeechModelDataLength | N9 Bytes | Model Data Length in Bytes |
| – SpeechModelDataURI | N10 Bytes | Model Data URI |
| DescrMetadata | N11 Bytes | Descriptive Metadata |
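The table above can be illustrated with a minimal Python sketch. It is not the normative JSON Schema referenced in the Syntax section. The class names, attribute names and field types are hypothetical, and the sketch assumes that the Version and Subversion characters are decimal digits, which the table does not state.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Header layout from the Semantics table: "MMC-SML-V", major version,
# ".", minor version. Assumption: each version part is 1 or 2 decimal digits.
HEADER_RE = re.compile(r"MMC-SML-V(\d{1,2})\.(\d{1,2})")


def parse_header(header: str) -> Tuple[int, int]:
    """Return (major, minor) from a header string such as "MMC-SML-V2.2"."""
    match = HEADER_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"not a Speech Model header: {header!r}")
    return int(match.group(1)), int(match.group(2))


@dataclass
class SpeechModelPayload:
    # Hypothetical Python names for SpeechModelDataLength and SpeechModelDataURI.
    data_length: int  # Model Data Length in Bytes
    data_uri: str     # Model Data URI


@dataclass
class SpeechModel:
    # Hypothetical container mirroring the rows of the table; types are assumed.
    header: str
    m_instance_id: str
    speech_model_id: str
    qualifier: str               # Model Format ID
    payload: SpeechModelPayload
    descr_metadata: str = ""

    def version(self) -> Tuple[int, int]:
        """Parse the major and minor version out of the header."""
        return parse_header(self.header)
```

A caller could build a `SpeechModel` with the header `"MMC-SML-V2.2"` and use `version()` to check that it matches the version it supports before fetching the data behind the URI.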
5 Data Formats
IDs required:
- ModelFormat
- ModelAttribute
- ModelAttributeFormat
6 To Respondents
Respondents are requested to:
- Comment on the characterisation of the Speech Model.
- Propose Speech Model Formats.