Bidirectional Speech Translation (MMC-BST)

1 Scope of Bidirectional Speech Translation

2 Reference Architecture of Bidirectional Speech Translation

3 I/O Data of Bidirectional Speech Translation

4 Functions of AI Modules of Bidirectional Speech Translation

5 I/O Data of AI Modules of Bidirectional Speech Translation

6 Specification of Bidirectional Speech Translation AIW, AIMs, and JSON Metadata

1 Scope of Bidirectional Speech Translation

The goal of the Bidirectional Speech Translation (MMC-BST) Use Case is to support a conversation between two people, each speaking a different language. The machine translates each input speech segment into the selected language as speech or text. If the desired output is speech, users can specify whether their speech features (voice colour, emotional charge, etc.) should be preserved in the translated speech.

The flow of control (from Input Speech to Translated Text to Output Speech) is identical to that of the Unidirectional case. The difference is that, rather than one such flow, two flows are provided in two different channels – the first from language A to language B, and the second from language B to language A.

Depending on the value of Input Selector, in the Language-A-to-Language-B channel:

Input Text in Language A is translated into Translated Text in Language B and pronounced as Speech in Language B.
When the input is Speech, the Speech features (voice colour, emotional charge, etc.) in Language A are preserved in the Speech in Language B.

The same applies for the Language-B-to-Language-A channel.
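The two-channel arrangement can be sketched as one unidirectional flow instantiated twice with the languages swapped. This is a minimal illustration, not part of the specification: `Channel`, `make_channels`, and `toy_flow` are hypothetical names, and the toy flow only tags the text instead of performing recognition, translation, or synthesis.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical signature of a unidirectional flow:
# (text, source_language, target_language) -> translated text.
UnidirectionalFlow = Callable[[str, str, str], str]


@dataclass(frozen=True)
class Channel:
    """One direction of the conversation (hypothetical helper type)."""
    source_language: str
    target_language: str
    flow: UnidirectionalFlow

    def run(self, text: str) -> str:
        # Apply the shared flow in this channel's direction.
        return self.flow(text, self.source_language, self.target_language)


def make_channels(
    flow: UnidirectionalFlow, language_a: str, language_b: str
) -> Tuple[Channel, Channel]:
    """Build the A-to-B and B-to-A channels from a single unidirectional flow."""
    return (
        Channel(language_a, language_b, flow),
        Channel(language_b, language_a, flow),
    )


def toy_flow(text: str, source: str, target: str) -> str:
    # Stand-in for the real processing chain; it only labels the direction.
    return f"[{source}->{target}] {text}"


a_to_b, b_to_a = make_channels(toy_flow, "en", "it")
print(a_to_b.run("hello"))
print(b_to_a.run("ciao"))
```

Both channels share the same flow object, which mirrors the statement that the bidirectional case differs from the unidirectional one only in the number of flows.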

2 Reference Architecture of Bidirectional Speech Translation

Figure 1 depicts the AIMs and the data exchanged between AIMs.

Figure 1 – Reference Model of Bidirectional Speech Translation (BST)

3 I/O Data of Bidirectional Speech Translation

The input and output data of the Bidirectional Speech Translation Use Case are given in Table 1.

Table 1 – I/O Data of Bidirectional Speech Translation

Input	Descriptions
Input Selector	Determines whether the input will be Text or Speech.
Language Preferences	User-specified input and output languages.
Input Speech1	Speech of Speaker 1 to be translated into speech in the specified language.
Input Text1	Alternative Input Text of Speaker 1 to be translated into the specified language.
Input Speech2	Speech of Speaker 2 to be translated into speech in the specified language.
Input Text2	Alternative Input Text of Speaker 2 to be translated into the specified language.
Output	Descriptions
Output Speech1	Translated Speech of Speaker 1.
Output Text1	Text of the translated Speech of Speaker 1.
Output Speech2	Translated Speech of Speaker 2.
Output Text2	Text of the translated Speech of Speaker 2.
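As an illustration of how the data in Table 1 might be grouped per speaker, the sketch below uses Python dataclasses. All type and field names (`InputSelector`, `LanguagePreferences`, `SpeakerInput`, `SpeakerOutput`) are assumptions made for this example, as is the use of raw bytes for speech; the specification does not prescribe them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InputSelector(Enum):
    """Determines whether the input is Text or Speech."""
    TEXT = "text"
    SPEECH = "speech"


@dataclass
class LanguagePreferences:
    input_language: str
    output_language: str


@dataclass
class SpeakerInput:
    """Inputs of one speaker (hypothetical grouping of the Table 1 rows)."""
    selector: InputSelector
    preferences: LanguagePreferences
    speech: Optional[bytes] = None  # Input Speech; raw bytes are an assumption
    text: Optional[str] = None      # alternative Input Text

    def validate(self) -> None:
        # The selected input kind must actually be present.
        if self.selector is InputSelector.SPEECH and self.speech is None:
            raise ValueError("Input Selector is Speech but Input Speech is missing")
        if self.selector is InputSelector.TEXT and self.text is None:
            raise ValueError("Input Selector is Text but Input Text is missing")


@dataclass
class SpeakerOutput:
    """Outputs for one speaker: translated text and, optionally, speech."""
    text: str
    speech: Optional[bytes] = None


speaker1 = SpeakerInput(
    selector=InputSelector.TEXT,
    preferences=LanguagePreferences("en", "it"),
    text="good morning",
)
speaker1.validate()  # passes: the selected Input Text is present
```

A bidirectional session would hold two such inputs, one per speaker, with the language preferences of the second mirroring those of the first.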

4 Functions of AI Modules of Bidirectional Speech Translation

Table 2 gives the functions of Bidirectional Speech Translation AIMs.

Table 2 – Functions of Bidirectional Speech Translation AI Modules

AIM	Functions
Automatic Speech Recognition	Recognises Speech
Text-to-Text Translation	Translates Recognised Text
Input Speech Description	Extracts Speech Features
Text-to-Speech (Features)	Synthesises Translated Text adding Speech Features

5 I/O Data of AI Modules of Bidirectional Speech Translation

Table 3 gives the I/O Data of the AI Modules.

Table 3 – I/O Data of AI Modules of Bidirectional Speech Translation

AIM	Receives	Produces
Automatic Speech Recognition	1. Input Speech1 Segment 2. Input Speech2 Segment	1. Recognised Text1 2. Recognised Text2
Text-to-Text Translation	1. Input Text1 or Recognised Text1 2. Input Text2 or Recognised Text2, based on the value of Input Selector	1. Translated Text1 2. Translated Text2
Input Speech Description	1. Input Speech1 2. Input Speech2	1. Speech Descriptors1 2. Speech Descriptors2
Text-to-Speech (Features)	1. Translated Text1 2. Translated Text2 3. Speech Descriptors1 and Speech Descriptors2, based on the value of Input Selector	1. Translated Speech1 2. Translated Speech2
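The data flow between the four AIMs can be sketched for a single channel as follows. This is a toy model under stated assumptions: every function is a stub, the `Speech` class stands in for real audio, the string values `"speech"` and `"text"` for Input Selector are invented for the example, and the choice to synthesise with a `"neutral"` default when no Speech Descriptors exist (text input) is an assumption rather than something the table states.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Speech:
    """Toy stand-in for an audio segment (real data would be audio samples)."""
    words: str
    features: str  # label standing in for voice colour, emotion, etc.


def automatic_speech_recognition(speech: Speech) -> str:
    return speech.words  # Recognised Text


def text_to_text_translation(text: str, target_language: str) -> str:
    return f"{text} ({target_language})"  # Translated Text (toy tagging only)


def input_speech_description(speech: Speech) -> str:
    return speech.features  # Speech Descriptors


def text_to_speech(translated_text: str, descriptors: Optional[str]) -> Speech:
    # Assumption: fall back to neutral features when no descriptors exist.
    return Speech(words=translated_text, features=descriptors or "neutral")


def run_channel(
    selector: str,
    target_language: str,
    input_speech: Optional[Speech] = None,
    input_text: Optional[str] = None,
) -> Tuple[str, Speech]:
    """Run one channel; selector is "speech" or "text" (hypothetical values)."""
    if selector == "speech":
        if input_speech is None:
            raise ValueError("Input Speech is required when selector is 'speech'")
        source_text = automatic_speech_recognition(input_speech)
        descriptors = input_speech_description(input_speech)
    elif selector == "text":
        if input_text is None:
            raise ValueError("Input Text is required when selector is 'text'")
        source_text = input_text
        descriptors = None
    else:
        raise ValueError(f"unknown Input Selector value: {selector!r}")
    translated_text = text_to_text_translation(source_text, target_language)
    translated_speech = text_to_speech(translated_text, descriptors)
    return translated_text, translated_speech


# Channel 1 takes speech, channel 2 takes text.
text1, speech1 = run_channel("speech", "it", input_speech=Speech("good morning", "warm"))
text2, speech2 = run_channel("text", "en", input_text="buongiorno")
```

In this sketch the features extracted from the input speech of channel 1 reach the synthesiser unchanged, while channel 2, which starts from text, has no features to carry over.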

6 Specification of Bidirectional Speech Translation AIW, AIMs, and JSON Metadata

Table 4 – AIMs and JSON Metadata

AIW and AIMs			Name	JSON
MMC-BST			Bidirectional Speech Translation	X
	–	MMC-ASR	Automatic Speech Recognition	X
	–	MMC-TTT	Text-to-Text Translation	X
	–	MMC-ISD	Input Speech Description	X
	–	MMC-TTS	Text-to-Speech	X
