1     Functions 2     Reference Model 3     Input/Output Data
4     SubAIMs 5     JSON Metadata 6     Profiles
7     Reference Software 8     Conformance Testing 9     Performance Assessment

1     Functions

The Text and Speech Translation Composite (MMC-TST) AIM:

Receives:
– Selection information specifying whether:
  – The AIM output should be Text or Speech.
  – The output Speech should retain the input Speech Features.
– Language Preferences in the form of requested input and output languages.
– Personal Status.
– Text.
– Speech.
Performs (a subset of) the following:
– Conversion of input Speech into Text using Personal Status.
– Translation of Text to the target language.
– Extraction of Features from Speech.
– Conversion of Text into Speech adding the input Speech's Features.
Produces:
– Translated Text.
– Translated Speech.
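
The phrase "a subset of" can be made concrete with a minimal sketch of how the selection might determine which operations run. The function name, flag names, and step names below are hypothetical and not part of the specification. The sketch also assumes that a request to retain Speech Features is ignored when the input is Text, since there is then no input Speech to extract them from.

```python
def plan_steps(input_is_speech: bool,
               output_is_speech: bool,
               retain_features: bool) -> list[str]:
    """Return the ordered operations (hypothetical names) for one request."""
    steps = []
    if input_is_speech:
        # Conversion of input Speech into Text.
        steps.append("speech_to_text")
    # Translation of Text to the target language always runs.
    steps.append("translate_text")
    if output_is_speech:
        # Assumption: features can only be extracted from input Speech.
        if retain_features and input_is_speech:
            steps.append("extract_speech_features")
        # Conversion of translated Text into Speech.
        steps.append("text_to_speech")
    return steps


print(plan_steps(input_is_speech=True, output_is_speech=True, retain_features=True))
print(plan_steps(input_is_speech=False, output_is_speech=False, retain_features=False))
```

Under these assumptions, a Text-to-Text request runs only the translation step, while a Speech-to-Speech request with retained features runs all four operations.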

2     Reference Model

Figure 1 depicts the Reference Model of the Text-and-Speech Translation Composite AIM.

Figure 1 – Text-and-Speech Translation (MMC-TST) AIM Reference Model

3    Input/Output Data

Table 1 specifies the Input and Output Data of the Text-and-Speech Translation AIM.

Table 1 – I/O Data of the Text-and-Speech Translation AIM

Input     Semantics
Selector     Signals:
1.     Whether the input is Text or Speech.
2.     Whether the input Speech features are preserved in the output Speech.
3.     The input and output languages.
Speech Object     Speech produced in input language by a human desiring translation into output language.
Text Object     Alternative textual source information to be translated into and pronounced in output language depending on the value of Input Selection.
Output     Description
Translated Speech Object     Speech in input language translated into output language preserving the input Speech features in the output Speech, depending on Selector.
Translated Text Object     Text of input Speech or input Text translated into output language, depending on Selector.
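
As an illustration of Table 1, the input data could be modelled as follows. This is a sketch only: the class names, field names, and the use of raw bytes for Speech are assumptions, and the normative data formats are those referenced by the JSON Metadata.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Selector:
    """The three signals carried by the Selector (hypothetical field names)."""
    input_is_speech: bool           # signal 1: input is Speech (True) or Text (False)
    preserve_speech_features: bool  # signal 2: keep input Speech features in output
    input_language: str             # signal 3: input language
    output_language: str            # signal 3: output language


@dataclass(frozen=True)
class TstInput:
    """One request: a Selector plus a Speech Object or a Text Object."""
    selector: Selector
    speech: Optional[bytes] = None  # assumed representation of a Speech Object
    text: Optional[str] = None      # assumed representation of a Text Object

    def payload_problems(self) -> list[str]:
        """List mismatches between the Selector and the supplied payload."""
        problems = []
        if self.selector.input_is_speech and self.speech is None:
            problems.append("Selector says Speech, but no Speech Object was supplied")
        if not self.selector.input_is_speech and self.text is None:
            problems.append("Selector says Text, but no Text Object was supplied")
        return problems
```

A caller could reject a request early when `payload_problems()` returns a non-empty list, before any translation step is attempted.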

4     SubAIMs

Text and Speech Translation is a Composite AIM whose Reference Model is depicted in Figure 2.

Figure 2 – Text-and-Speech Translation Composite (MMC-TST) AIM

Acronym   AIM Name JSON
MMC-TST Text-and-Speech Translation X
MMC-ASR Automatic Speech Recognition X
MMC-TTT Text-to-Text Translation X
MMC-ISD Entity Speech Description X
MMC-DTS Descriptors Text-to-Speech X

5     JSON Metadata

https://schemas.mpai.community/MMC/V2.2/AIMs/TextAndSpeechTranslation.json

6     Profiles

The Profiles of Text and Speech Translation are specified.

7     Reference Software

8     Conformance Testing

Important note: This Conformance Testing Specification does not provide methods and datasets to test the Conformance of the individual Speech Feature Extraction and Text-to-Speech Basic AIMs, only of the Composite AIMs that contain them.

Input Data     Data Type     Input Conformance Testing Data
Input Selector     Selector     All Input Selectors shall conform with Selector.
Requested Language     Selector     All Language Selectors shall be drawn from Language Codes.
Input Text     Unicode     All input Text files shall be drawn from Text files.
Input Speech     .wav     All input Speech files shall be drawn from Speech files.
Output Data     Data Type     Conformance Test
Machine Text     Unicode     All Text files produced shall conform with Text files.
Machine Speech     .wav     All Speech files produced shall conform with Speech files.
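
The data-type checks in the table above could be approximated as shown below. This is a rough sketch, not the normative conformance procedure: the function names are hypothetical, Unicode text is assumed to be UTF-8 encoded, and Python's standard `wave` module only reads uncompressed PCM, so other .wav encodings would be rejected.

```python
import io
import wave


def is_unicode_text(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 (assumed Unicode encoding)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_wav(data: bytes) -> bool:
    """Return True if the bytes parse as a WAV file readable by the wave module."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav_file:
            # Reading the frame count confirms the header was parsed.
            return wav_file.getnframes() >= 0
    except (wave.Error, EOFError):
        return False
```

Such checks cover only the file format; whether the translated content is correct is a separate matter.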

9     Performance Assessment