1 Function | 2 Reference Model | 3 Input/Output Data |
4 SubAIMs | 5 JSON Metadata | 6 Profiles |
7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Functions
The Text and Speech Translation Composite (MMC-TST) AIM:
Receives | Selection info on whether: |
 | – The AIM output should be Text or Speech. |
 | – The output Speech should retain the input Speech Features. |
 | Language Preferences in the form of requested input and output languages. |
 | Personal Status. |
 | Text. |
 | Speech. |
Performs | (A subset of) the following: |
 | – Conversion of input Speech into Text using Personal Status. |
 | – Translation of Text to the target language. |
 | – Extraction of Features from Speech. |
 | – Conversion of Text into Speech adding the Input Speech’s Features. |
Produces | Translated Text. |
 | Translated Speech. |
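Because the AIM performs "(a subset of)" the listed operations, the steps that run presumably depend on the selection info. The following is a minimal sketch of one way that subset might be chosen. The function name, flag names, and step labels are hypothetical and do not come from the specification, and the sketch assumes that translation of Text is always required.

```python
def plan_steps(input_is_speech: bool, output_is_speech: bool,
               retain_features: bool) -> list[str]:
    """Return the ordered processing steps for one request (illustrative only)."""
    steps = []
    if input_is_speech:
        # Speech input is first converted into Text.
        steps.append("speech_to_text")
    # Assumption: Text translation is needed for every request.
    steps.append("translate_text")
    if output_is_speech:
        if input_is_speech and retain_features:
            # Features can only be extracted when input Speech exists.
            steps.append("extract_speech_features")
        steps.append("text_to_speech")
    return steps
```

Under these assumptions, a Text-in, Text-out request needs only the translation step, while a Speech-in, Speech-out request that retains features uses all four.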
2 Reference Model
Figure 1 depicts the Reference Model of the Text-and-Speech Translation Composite (MMC-TST) AIM.
Figure 1 – Text-and-Speech Translation (MMC-TST) AIM Reference Model
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Text-and-Speech Translation (MMC-TST) AIM.
Table 1 – I/O Data of the Text-and-Speech Translation (MMC-TST) AIM
Input | Semantics |
Selector | Signals: 1. Whether the input is Text or Speech. 2. Whether the input Speech features are preserved in the output Speech. 3. The input and output languages. |
SpeechObject | Speech produced in the input language by a human desiring translation into the output language. |
TextObject | Alternative textual source to be translated into, and pronounced in, the output language, depending on the value of Selector. |
Output | Description |
Translated SpeechObject | Speech in input language translated into output language preserving the Input Speech features in the Output Speech, depending on Selector. |
Translated TextObject | Text of Input Speech or Input Text translated into output language, depending on Selector. |
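As an illustration of Table 1, the three things the Selector signals can be held in a small record that is then used to pick the active input. This is a sketch only: the class, field, and function names are hypothetical, and the actual data formats are defined by the MPAI JSON schemas, not by this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selector:
    # Hypothetical field names; Table 1 only lists what is signalled.
    input_is_speech: bool
    preserve_speech_features: bool
    input_language: str
    output_language: str


def pick_source(selector, text=None, speech=None):
    """Return the input object the Selector points at, or raise if it is missing."""
    source = speech if selector.input_is_speech else text
    if source is None:
        kind = "Speech" if selector.input_is_speech else "Text"
        raise ValueError(f"Selector requests {kind} input, but none was supplied")
    return source
```

For example, `pick_source(Selector(False, False, "en", "it"), text="hello")` returns the Text input and ignores any Speech input that may also be present.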
4 SubAIMs
Text and Speech Translation is a Composite AIM whose Reference Model is depicted in Figure 2.
Figure 2 – Text-and-Speech Translation Composite (MMC-TST) AIM
Table 2 – AIMs of Text-and-Speech Translation Composite (MMC-TST) AIM
AIM | AIMs | AIM Names | JSON |
MMC-TST | | Text-and-Speech Translation | X |
 | MMC-ASR | Automatic Speech Recognition | X |
 | MMC-TTT | Text-to-Text Translation | X |
 | MMC-ISD | Entity Speech Description | X |
 | MMC-DTS | Descriptors Text-to-Speech | X |
5 JSON Metadata
https://schemas.mpai.community/MMC/V2.2/AIMs/TextAndSpeechTranslation.json
6 Profiles
The Profiles of Text and Speech Translation are specified.
7 Reference Software
8 Conformance Testing
Important note. This Conformance Testing Specification does not provide methods and datasets to Test the Conformance of the individual Speech Feature Extraction and Text-To-Speech Basic AIMs, only of their Descriptors Speech Translation Composite AIMs.
Table 3 provides an example of MMC-TST AIM conformance testing.
Table 3 – An example of MMC-TST AIM conformance testing
Input Data | Data Type | Input Conformance Testing Data |
Input Selector | Selector | All Input Selectors to conform with Selector. |
Requested Language | Selector | All Language Selectors to be drawn from Language Codes. |
Input Text | Unicode | All input Text files shall be drawn from Text files. |
Input Speech | .wav | All input Speech files shall be drawn from Speech files. |
Output Data | Data Type | Conformance Test |
Machine Text | Unicode | All Text files produced shall conform with Text files. |
Machine Speech | .wav | All Speech files produced shall conform with Speech files. |
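The output rows of Table 3 could be checked with a short script along the following lines. This is a sketch under stated assumptions, not the normative test: it reads "Unicode" as UTF-8-decodable bytes and ".wav" as a container that Python's standard `wave` module can parse, and the function names are hypothetical.

```python
import io
import wave


def is_unicode_text(data: bytes) -> bool:
    """True if the bytes decode as UTF-8 (one possible reading of 'Unicode')."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_wav(data: bytes) -> bool:
    """True if the bytes parse as a WAV container with at least one frame."""
    try:
        with wave.open(io.BytesIO(data), "rb") as w:
            return w.getnframes() > 0
    except (wave.Error, EOFError):
        # wave.Error: not a valid RIFF/WAVE file; EOFError: truncated header.
        return False
```

A fuller test would also compare the language of the produced Text or Speech with the requested output language, which is outside the scope of this sketch.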