1 Version
V2.1
2 Functions
Bidirectional Speech Translation (MMC-BST) enables two humans, each speaking a different language, to hold a conversation in which each speaks and hears their own language. It:
- Receives
  - Input Selector – indicates whether
    - Input is Text or Speech.
    - Speech Features of Input Speech should be preserved in Translated Speech.
  - Requested Language – Languages of the Input Speech and of the Target Speech.
  - Input Text1 – Text to be translated.
  - Input Speech1 – Speech to be translated.
  - Input Text2 – Text to be translated.
  - Input Speech2 – Speech to be translated.
- Produces Translated Text1 or Speech1 and Translated Text2 or Speech2.
3 Reference Model
Figure 1 depicts the AIMs and the data exchanged between AIMs.
Figure 1 – Reference Model of Bidirectional Speech Translation (BST)
4 I/O Data
The input and output data of the Bidirectional Speech Translation Use Case are given by Table 1:
Table 1 – I/O Data of Bidirectional Speech Translation
Input | Descriptions |
Input Selector | Determines whether the input will be Text or Speech. |
Language Preferences | User-specified input and output languages. |
Input Speech1 | Speech by human1 desiring spoken translation in the specified language. |
Input Text1 | Alternative Input Text to be translated to the specified language. |
Input Speech2 | Speech by human2 desiring spoken translation in the specified language. |
Input Text2 | Alternative Input Text to be translated to the specified language. |
Output | Descriptions |
Output Speech1 | Translated Speech of Speaker 1. |
Output Text1 | Text of the translated Speech of Speaker 1. |
Output Speech2 | Translated Speech of Speaker 2. |
Output Text2 | Text of the translated Speech of Speaker 2. |
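As an illustration only, the inputs and outputs of Table 1 could be modelled as plain records. The Python sketch below is not part of the specification: the class names, the field names, and the `validate` helper are hypothetical, languages are assumed to be short tags such as "en", and speech is represented as raw bytes purely for convenience.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InputSelector(Enum):
    """Hypothetical encoding of the Input Selector: Text or Speech."""
    TEXT = "text"
    SPEECH = "speech"


@dataclass
class SpeakerInput:
    """Input from one human (Input Text1/Speech1 or Input Text2/Speech2)."""
    selector: InputSelector
    language: str                   # assumed to be a language tag such as "en"
    text: Optional[str] = None
    speech: Optional[bytes] = None  # assumed container for audio data

    def validate(self) -> None:
        """Check that the payload named by the selector is present."""
        if self.selector is InputSelector.TEXT and self.text is None:
            raise ValueError("selector is TEXT but no Input Text was given")
        if self.selector is InputSelector.SPEECH and self.speech is None:
            raise ValueError("selector is SPEECH but no Input Speech was given")


@dataclass
class SpeakerOutput:
    """Translated output for one speaker (Output Text and/or Output Speech)."""
    text: Optional[str] = None
    speech: Optional[bytes] = None


@dataclass
class BidirectionalRequest:
    """Both humans' inputs for one exchange."""
    human1: SpeakerInput
    human2: SpeakerInput

    def validate(self) -> None:
        self.human1.validate()
        self.human2.validate()


# Usage: human1 types English text, human2 speaks Italian.
request = BidirectionalRequest(
    human1=SpeakerInput(InputSelector.TEXT, "en", text="hello"),
    human2=SpeakerInput(InputSelector.SPEECH, "it", speech=b"\x00\x01"),
)
request.validate()  # raises ValueError if a payload is missing
```

Keeping the selector next to the payload lets a mismatch (for example, Speech selected but only Text supplied) be rejected before any processing starts.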
5 JSON Metadata
https://schemas.mpai.community/MMC/V2.1/AIWs/BidirectionalSpeechTranslation.json
AIMs | Name | JSON |
MMC-ASR | Automatic Speech Recognition | X |
MMC-TTT | Text-to-Text Translation | X |
MMC-ISD | Input Speech Description | X |
MMC-TTS | Text-to-Speech | X |
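Because Figure 1 is not reproduced here, the following Python sketch shows one plausible way the four AIMs could be chained. The wiring is an assumption rather than the reference model: recognition (MMC-ASR) feeds translation (MMC-TTT), which feeds synthesis (MMC-TTS), while MMC-ISD supplies speech features when they are to be preserved. Every function is a hypothetical stub standing in for an AIM and does no real recognition, translation, or synthesis.

```python
from typing import Optional, Tuple


def asr(speech: bytes) -> str:
    """MMC-ASR stub: pretends the audio bytes are UTF-8 text."""
    return speech.decode("utf-8")


def isd(speech: bytes) -> dict:
    """MMC-ISD stub: returns a placeholder description of the speech."""
    return {"length_bytes": len(speech)}


def ttt(text: str, source: str, target: str) -> str:
    """MMC-TTT stub: tags the text instead of translating it."""
    return f"[{source}->{target}] {text}"


def tts(text: str, features: Optional[dict]) -> bytes:
    """MMC-TTS stub: encodes the text as bytes; features are ignored here."""
    return text.encode("utf-8")


def translate_one_way(
    source: str,
    target: str,
    text: Optional[str] = None,
    speech: Optional[bytes] = None,
    preserve_features: bool = False,
) -> Tuple[str, bytes]:
    """Translate one human's input and return (translated text, translated speech)."""
    if speech is not None:
        recognized = asr(speech)
        features = isd(speech) if preserve_features else None
    elif text is not None:
        recognized, features = text, None
    else:
        raise ValueError("either text or speech must be provided")
    translated = ttt(recognized, source, target)
    return translated, tts(translated, features)


def translate_both_ways(lang1: str, lang2: str, input1: dict, input2: dict):
    """Run the one-way chain once per human, swapping the two languages."""
    out1 = translate_one_way(lang1, lang2, **input1)
    out2 = translate_one_way(lang2, lang1, **input2)
    return out1, out2


# Usage: human1 types English text, human2 speaks Italian.
out1, out2 = translate_both_ways(
    "en", "it", {"text": "hello"}, {"speech": b"ciao", "preserve_features": True}
)
```

The sketch returns both text and speech for each direction; a caller would keep whichever form was requested.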