1     Functions

The Text-and-Speech Translation Composite AIM (MMC-TST):

Receives:

  1. Selector to inform whether:
     1. The AIM output should be Text or Speech.
     2. The output Speech should retain the input Speech Features.
  2. Language Preferences in the form of requested input and output languages.
  3. Personal Status.
  4. Text.
  5. Speech.

Performs (a subset of) the following:

  1. Converts input Speech into Text using Personal Status.
  2. Translates the Text into the target language.
  3. Extracts the Speech Features from the input Speech.
  4. Converts the translated Text into Speech, adding the Speech Features of the input Speech.

Produces:

  1. Translated Text.
  2. Translated Speech.
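
The processing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the normative behaviour of the AIM: the names `Selector`, `Translation`, `run_tst` and the four stage functions are hypothetical, the stages are placeholders that pass strings around instead of audio, and the rule that Speech takes precedence over Text when both are supplied is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Selector:
    """Hypothetical encoding of the Selector; field names are assumptions."""
    output_speech: bool    # True: produce Translated Speech as well as Translated Text
    retain_features: bool  # True: reuse the input Speech Features in the output Speech


@dataclass
class Translation:
    translated_text: str
    translated_speech: Optional[str] = None


# Placeholder stages. Strings stand in for audio so the sketch runs without models.
def recognise_speech(speech: str, language: str, personal_status: Optional[str]) -> str:
    return f"text({speech})"


def translate_text(text: str, source: str, target: str) -> str:
    return f"{target}:{text}"


def extract_features(speech: str) -> str:
    return f"features({speech})"


def synthesise_speech(text: str, language: str, features: Optional[str]) -> str:
    return f"speech({text}, {features})"


def run_tst(selector: Selector, source: str, target: str,
            personal_status: Optional[str] = None,
            text: Optional[str] = None,
            speech: Optional[str] = None) -> Translation:
    # Assumption: when Speech is supplied it takes precedence over Text.
    if speech is not None:
        text = recognise_speech(speech, source, personal_status)
    elif text is None:
        raise ValueError("either text or speech must be supplied")

    translated = translate_text(text, source, target)
    if not selector.output_speech:
        return Translation(translated)

    # Features can only be extracted when there is input Speech to extract them from.
    features = None
    if selector.retain_features and speech is not None:
        features = extract_features(speech)
    return Translation(translated, synthesise_speech(translated, target, features))
```

Calling `run_tst(Selector(output_speech=False, retain_features=False), "en", "it", text="hello")` exercises only the translation step, which matches the "subset" wording above: steps that the Selector and the available inputs do not require are skipped.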

2     Reference Architecture

Figure 1 depicts the Reference Architecture of the Text-and-Speech Translation Composite AIM.

Figure 1 – Text-and-Speech Translation (MMC-TST) AIM

3     I/O Data

Table 1 specifies the Input and Output Data of the Text-and-Speech Translation AIM.

Table 1 – I/O Data of the Text-and-Speech Translation AIM

Input              Semantics
Input Selector     Signals:
                   1. Whether the input will be Text or Speech.
                   2. Whether the Input Speech features are preserved in the Output Speech.
                   3. The input language (A) and the output language (B).
Input Speech       Speech produced in language A by a human desiring translation into language B.
Input Text         Alternative textual source information to be translated into and pronounced in language B, depending on the value of Input Selector.

Output             Description
Translated Speech  Input Speech in language A translated into language B, preserving the Input Speech features in the Output Speech, depending on the value of Input Selector.
Translated Text    Text of Input Speech or Input Text translated into language B, depending on the value of Input Selector.

4     SubAIMs

Text-and-Speech Translation is a Composite AIM whose Reference Model is depicted in Figure 1.

Acronym  AIM Name                      JSON
MMC-TST  Text-and-Speech Translation   X
MMC-ASR  Automatic Speech Recognition  X
MMC-TTT  Text-to-Text Translation      X
MMC-ISD  Entity Speech Description     X
MMC-TTS  Text-to-Speech                X
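
As an illustration of how the AIMs in the table might be chained, the following Python sketch selects which of them are needed for a given request. It is an assumption-laden reading, not part of the specification: the function `active_sub_aims` and its parameters are hypothetical, and the sketch assumes that MMC-ISD is the AIM that extracts the Speech Features and that it is only needed when both input and output are Speech.

```python
from typing import List

# Acronyms and names copied from the table above (the composite MMC-TST itself is omitted).
SUB_AIMS = {
    "MMC-ASR": "Automatic Speech Recognition",
    "MMC-TTT": "Text-to-Text Translation",
    "MMC-ISD": "Entity Speech Description",
    "MMC-TTS": "Text-to-Speech",
}


def active_sub_aims(input_is_speech: bool, output_is_speech: bool,
                    retain_features: bool) -> List[str]:
    """Return the acronyms of the AIMs a request would pass through, in order."""
    chain = []
    if input_is_speech:
        chain.append("MMC-ASR")      # Speech must first become Text
    chain.append("MMC-TTT")          # translation is always performed
    if output_is_speech:
        # Assumption: features are extracted only when there is input Speech to describe.
        if input_is_speech and retain_features:
            chain.append("MMC-ISD")
        chain.append("MMC-TTS")
    return chain


if __name__ == "__main__":
    for acronym in active_sub_aims(True, True, True):
        print(acronym, SUB_AIMS[acronym])
```

A Text-in, Text-out request reduces the chain to MMC-TTT alone, while a Speech-in, Speech-out request that retains features passes through all four AIMs.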

5     JSON Metadata


6     Profiles

The Profiles of Text and Speech Translation are specified.