1     Version

V2.1

2     Functions

The Text and Speech Translation Composite AIM (MMC-TST):

  1. Receives:
    • Input Selection determining whether the input is provided as text or speech. If the desired output is speech, Input Selection also specifies whether the speech features should be preserved in the translated speech.
    • Language Preference determining the input Language and the target Language.
    • Input Text.
    • Input Speech.
  2. Produces Translated Text or Translated Speech in the target Language.
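The inputs listed above can be sketched as a small data model with a consistency check. This is only an illustrative sketch: the class and field names (`Selection`, `Request`, `preserve_speech_features`, and so on) are hypothetical and not taken from the specification, and the assumption that the Selection also carries the desired output kind is inferred from the wording of the first bullet.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    TEXT = "text"
    SPEECH = "speech"


@dataclass(frozen=True)
class Selection:
    # Hypothetical field names; the specification only describes the semantics.
    input_kind: Kind
    output_kind: Kind                       # assumed to be carried by the Selection
    preserve_speech_features: bool = False  # only meaningful for speech output


@dataclass(frozen=True)
class Request:
    selection: Selection
    source_language: str                    # Language A
    target_language: str                    # Language B
    input_text: Optional[str] = None
    input_speech: Optional[bytes] = None


def validate(request: Request) -> None:
    """Raise ValueError if the Selection and the supplied inputs disagree."""
    sel = request.selection
    if sel.input_kind is Kind.TEXT and request.input_text is None:
        raise ValueError("Selection says text, but no Input Text was supplied")
    if sel.input_kind is Kind.SPEECH and request.input_speech is None:
        raise ValueError("Selection says speech, but no Input Speech was supplied")
    if sel.preserve_speech_features:
        # Assumption: features come from Input Speech and go to speech output.
        if sel.output_kind is not Kind.SPEECH or sel.input_kind is not Kind.SPEECH:
            raise ValueError("Speech features can only be preserved speech-to-speech")
```

A caller would build a `Request` and call `validate` before handing it to the translation pipeline, so that a mismatch between the Selection and the supplied data is caught early.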

3     Reference Architecture

Figure 9 depicts the Reference Architecture of the Text-and-Speech Translation Composite AIM.

Figure 9 – Text-and-Speech Translation (MMC-TST) Composite AIM

4     I/O Data

The table below specifies the Input and Output Data of the Text-and-Speech Translation AIM.

Table – I/O Data of the Text-and-Speech Translation AIM

| Input | Semantics |
|---|---|
| Input Selector | Determines whether: 1. the input will be in Text or Speech; 2. the Input Speech features are preserved in the Output Speech. |
| Language Preferences | User-specified input Language (A) and output Language (B). |
| Input Speech | Speech produced in Language A by a human desiring translation into language B. |
| Input Text | Alternative textual source information to be translated into, and pronounced in, language B, depending on the value of Input Selection. |

| Output | Description |
|---|---|
| Translated Speech | Input Speech in language A translated into language B, preserving the Input Speech features in the Output Speech, depending on the value of Input Selection. |
| Translated Text | Text of Input Speech or Input Text translated into language B, depending on the value of Input Selection. |

5     SubAIMs

MMC-ASR Automatic Speech Recognition X
MMC-TTT Text-to-Text Translation X
MMC-ISD Input Speech Description X
MMC-TTS Text-to-Speech X
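One plausible way the four SubAIMs listed above could be chained is sketched below. The wiring is an assumption based on the I/O descriptions in this page (the Reference Architecture figure is the authoritative source), and the function name `run_tst`, its parameters, and the callable signatures are all hypothetical stand-ins rather than MPAI-defined interfaces.

```python
from typing import Any, Callable, Optional


def run_tst(
    *,
    input_is_speech: bool,
    output_is_speech: bool,
    preserve_features: bool,
    source_lang: str,
    target_lang: str,
    input_text: Optional[str] = None,
    input_speech: Optional[Any] = None,
    asr: Callable[[Any, str], str],            # stand-in for MMC-ASR
    ttt: Callable[[str, str, str], str],       # stand-in for MMC-TTT
    isd: Callable[[Any], Any],                 # stand-in for MMC-ISD
    tts: Callable[[str, str, Any], Any],       # stand-in for MMC-TTS
) -> Any:
    """Return Translated Text or Translated Speech, depending on the flags."""
    # Step 1: obtain source-language text, recognising speech if necessary.
    text = asr(input_speech, source_lang) if input_is_speech else input_text

    # Step 2: translate the text from Language A to Language B.
    translated = ttt(text, source_lang, target_lang)
    if not output_is_speech:
        return translated

    # Step 3: optionally describe the Input Speech so its features can be reused.
    features = isd(input_speech) if (preserve_features and input_is_speech) else None

    # Step 4: synthesise the translated text, with or without the features.
    return tts(translated, target_lang, features)
```

Passing the SubAIMs as callables keeps the sketch independent of any concrete implementation; in a test they can be replaced by simple lambdas.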

6     JSON Metadata

https://schemas.mpai.community/MMC/V2.1/AIMs/TextAndSpeechTranslation.json