| 1 Function | 2 Reference Model | 3 Input/Output Data | 
| 4 SubAIMs | 5 JSON Metadata | 6 Profiles | 
| 7 Reference Software | 8 Conformance Testing | 9 Performance Assessment | 
1 Functions
Text-To-Speech (MMC-TTS):
| Receives | Text Object | | 
| | Personal Status | to be contained in the Synthesised Speech Object. | 
| | Speech Model | used by AIM depending on Profile. | 
| Feeds | Text Object and Personal Status | to Speech Model. | 
| Produces | Synthesised Speech Object | output of AIM. | 
2 Reference Model
Figure 1 specifies the Reference Model of the Text-To-Speech (MMC-TTS) AIM.

Figure 1 – The Text-To-Speech AIM Reference Model
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Text-To-Speech AIM.
Table 1 – I/O Data of the Text-To-Speech AIM
| Input | Description | 
| Text Object | Input Text. | 
| Personal Status | Input Personal Status of the Speech Modality. | 
| Speech Model | NN Model used to produce Speech from Text and Personal Status. | 
| Output | Description | 
| Speech Object | Output of the Text-To-Speech AIM. | 
4 SubAIMs
No SubAIMs.
5 JSON Metadata
https://schemas.mpai.community/MMC/V2.3/AIMs/TextToSpeech.json
6 Profiles
The Text-To-Speech Profiles are specified.
7 Reference Software
7.1 Disclaimers
- The purpose of this MMC-TTS Reference Software is to provide a working Implementation of MMC-TTS, not to provide a ready-to-use product.
- MPAI disclaims the suitability of the Software for any other purposes and does not guarantee that it is secure.
- Use of this Reference Software may require acceptance of licences from the respective repositories. Users shall verify that they have the right to use any third-party software required by this Reference Software.
7.2 Guide to the MMC-TTS code
This AI Module is intended for developers who are familiar with Python and with downloading models from HuggingFace.
The code is a wrapper for the SpeechT5 NN Module that:
- Manages input files and parameters: Text Object.
- Executes the SpeechT5 Module to synthesise speech from each input Text Object.
- Outputs the resulting Speech Object.
The MMC-TTS Reference Software is found at the MPAI-NNW GitLab site. It contains:
- The Python code implementing the AIM.
- The required libraries: PyTorch, transformers (HuggingFace), datasets (HuggingFace), and soundfile.
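The wrapper structure described in this guide (read the Text Object, run the NN Module, write the Speech Object) can be sketched as follows. This is an illustrative sketch, not the Reference Software itself. It assumes that the wrapped NN Module is the HuggingFace SpeechT5 model (`microsoft/speecht5_tts` with the `microsoft/speecht5_hifigan` vocoder), which the text above does not confirm. The function names, the speaker-embedding dataset and index, and the `tone_backend` stand-in are all hypothetical, and the dataset loading call may need adjusting for the installed `datasets` version. Personal Status input is not handled.

```python
import math
import struct
import wave
from pathlib import Path
from typing import Callable, List, Tuple

# A backend maps text to (mono float samples in [-1, 1], sample rate in Hz).
Backend = Callable[[str], Tuple[List[float], int]]


def speecht5_backend() -> Backend:
    """Build a backend on the SpeechT5 checkpoints (assumed model, see lead-in)."""
    # Imported lazily so the wrapper can be dry-run without the heavy libraries.
    import torch
    from datasets import load_dataset
    from transformers import (SpeechT5ForTextToSpeech, SpeechT5HifiGan,
                              SpeechT5Processor)

    processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
    model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")
    # Speaker x-vector; this dataset and index are illustrative choices only.
    xvectors = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
    speaker = torch.tensor(xvectors[7306]["xvector"]).unsqueeze(0)

    def run(text: str) -> Tuple[List[float], int]:
        inputs = processor(text=text, return_tensors="pt")
        speech = model.generate_speech(inputs["input_ids"], speaker, vocoder=vocoder)
        return speech.tolist(), 16000  # SpeechT5 generates 16 kHz audio

    return run


def tone_backend(text: str) -> Tuple[List[float], int]:
    """Stand-in backend for dry runs: 10 ms of a 440 Hz tone per character."""
    rate = 16000
    n = len(text) * rate // 100
    return [0.3 * math.sin(2 * math.pi * 440 * i / rate) for i in range(n)], rate


def write_wav(path: Path, samples: List[float], rate: int) -> None:
    """Write mono float samples as a 16-bit PCM WAV file."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(str(path), "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(pcm)


def text_to_speech(text_file: str, wav_file: str, backend: Backend) -> Path:
    """Read a text file, synthesise speech with the backend, write a WAV file."""
    text = Path(text_file).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError("empty Text Object")
    samples, rate = backend(text)
    write_wav(Path(wav_file), samples, rate)
    return Path(wav_file)
```

A call such as `text_to_speech("input.txt", "speech.wav", speecht5_backend())` would download the models on first use. Passing `tone_backend` instead exercises the file handling without any model.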
7.3 Acknowledgements
This version of the MMC-TTS Reference Software has been developed by the MPAI Neural Network Watermarking Development Committee (NNW-DC).
8 Conformance Testing
Table 2 provides the Conformance Testing Method for the MMC-TTS AIM.
If a schema contains references to other schemas, conformance of data for the primary schema implies that any data referencing a secondary schema shall also validate against the relevant schema, if present, and conform with the Qualifier, if present.
Table 2 – Conformance Testing Method for MMC-TTS AIM
| Input | Text Object | Shall validate against Text Object schema. Text Data shall conform with Text Qualifier. | 
| | Personal Status | Shall validate against Personal Status schema. | 
| | Speech Model | Shall validate against Machine Learning Model schema. Machine Learning Model Data shall conform with Machine Learning Model Qualifier. | 
| Output | Synthesised Speech Object | Shall validate against Speech Object schema. Speech Data shall conform with Speech Qualifier. | 
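Because conformance of the primary schema extends to any referenced secondary schema, a tester first needs to know which schemas are referenced. The sketch below lists them, assuming the schemas express such references with the standard JSON Schema `$ref` keyword (an assumption about the MPAI schemas, not something stated here). The `collect_refs` helper and the example fragment are hypothetical and do not reproduce the published MMC-TTS schema.

```python
from typing import Any, Set


def collect_refs(schema: Any) -> Set[str]:
    """Return every "$ref" target found anywhere in a JSON Schema document."""
    refs: Set[str] = set()
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key == "$ref" and isinstance(value, str):
                refs.add(value)
            else:
                # Recurse into nested subschemas (properties, items, ...).
                refs |= collect_refs(value)
    elif isinstance(schema, list):
        for item in schema:
            refs |= collect_refs(item)
    return refs


# Hypothetical fragment for illustration only.
example_schema = {
    "type": "object",
    "properties": {
        "TextObject": {"$ref": "https://example.org/schemas/TextObject.json"},
        "PersonalStatus": {"$ref": "https://example.org/schemas/PersonalStatus.json"},
        "Items": {"type": "array", "items": [{"$ref": "#/definitions/Item"}]},
    },
}

for target in sorted(collect_refs(example_schema)):
    print(target)
```

Each listed target would then be resolved and the corresponding data validated against it with a JSON Schema validator of the tester's choice.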
Table 3 provides an example of MMC-TTS AIM conformance testing.
Table 3 – An example of MMC-TTS AIM conformance testing
| Input Data | Data Type | Input Conformance Testing Data | 
| Machine Text | Unicode | All input Text files to be drawn from Text files. | 
| Machine Emotion | JSON | All input JSON Emotion files to be drawn from Emotion JSON files. | 
| Output Data | Data Type | Output Conformance Testing Criteria | 
| Machine Speech | .wav | All Speech files produced shall conform with Speech. | 
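As a first step towards the output check in Table 3, a tester might confirm that each produced file is a readable WAV file and record its basic parameters. The sketch below uses only the Python standard library. It does not check the Speech Qualifier, whose requirements are not reproduced here, and the standard `wave` module reads PCM-encoded files only. The function names and the report layout are illustrative assumptions.

```python
import wave
from pathlib import Path
from typing import Dict, Optional


def describe_wav(path: Path) -> Dict[str, int]:
    """Open a file as WAV and return its basic parameters."""
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
        }


def check_outputs(folder: str) -> Dict[str, Optional[Dict[str, int]]]:
    """Map each .wav file name in folder to its parameters, or None if unreadable."""
    report: Dict[str, Optional[Dict[str, int]]] = {}
    for path in sorted(Path(folder).glob("*.wav")):
        try:
            report[path.name] = describe_wav(path)
        except (wave.Error, EOFError):
            # Not a valid (PCM) WAV file, or truncated.
            report[path.name] = None
    return report
```

Files mapped to `None` fail this basic check. The recorded parameters of the remaining files can then be compared with whatever the Speech Qualifier requires.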
