1 Functions
Text and Image Query (MMC-TIQ):
Receives | Textual part of the query. |
Receives | Image part of the query. |
Produces | Text in response to the Text and Image provided as input. |
2 Reference Model
Figure 1 depicts the Reference Model of the Text and Image Query (MMC-TIQ) AIM.
Figure 1 – The Text and Image Query (MMC-TIQ) AIM Reference Model
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Text and Image Query (MMC-TIQ) AIM.
Table 1 – I/O Data of the Text and Image Query (MMC-TIQ) AIM
Input | Description |
Text Object | Text asking question about the Image. |
Visual Object | Image about which a question is asked. |
Output | Description |
Text Object | Response produced by Text and Image Query. |
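The interface in Table 1 can be sketched as a typed Python signature. This is only an illustration: the `TextObject` and `VisualObject` classes below are simplified, hypothetical stand-ins (the actual MPAI data types are specified elsewhere), and `placeholder_query` is a dummy implementation that calls no model.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TextObject:
    """Simplified, hypothetical stand-in for an MPAI Text Object."""
    text: str


@dataclass(frozen=True)
class VisualObject:
    """Simplified, hypothetical stand-in for an MPAI Visual Object."""
    image_bytes: bytes
    media_type: str = "image/jpeg"  # assumed field, not taken from the specification


class TextAndImageQuery(Protocol):
    """Shape of the AIM: (Text Object, Visual Object) -> Text Object."""

    def __call__(self, question: TextObject, image: VisualObject) -> TextObject:
        ...


def placeholder_query(question: TextObject, image: VisualObject) -> TextObject:
    """Dummy implementation that only echoes its inputs; no model is involved."""
    return TextObject(
        text=f"[no model loaded] {question.text} ({len(image.image_bytes)} image bytes)"
    )
```

Any callable with this signature, such as a wrapper around a visual question answering model, could take the place of `placeholder_query`.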
4 SubAIMs
Text and Image Query (MMC-TIQ) can be implemented as a Composite AIM whose Reference Model is depicted in Figure 2.
Figure 2 – Text and Image Query (MMC-TIQ) Composite AIM Reference Model
The AIMs and their JSON Metadata are specified in Table 2.
Table 2 – AIMs and JSON Metadata of Text and Image Query (MMC-TIQ) Composite AIM
Acronym | AIM Name | JSON |
MMC-TIQ | Text-and-Image Query | X |
OSD-VOI | Visual Object Identification | X |
MMC-NLU | Natural Language Understanding | X |
MMC-QAM | Question Analysis Module | X |
MMC-AQM | Answer to Question Module | X |
5 JSON Metadata
https://schemas.mpai.community/MMC/V2.2/AIMs/TextAndImageQuery.json
6 Profiles
No Profiles.
7 Reference Software
7.1 Disclaimers
- The purpose of this MMC-TIQ Reference Software is to provide a working Implementation of MMC-TIQ, not to provide a ready-to-use product.
- MPAI disclaims the suitability of the Software for any other purposes and does not guarantee that it is secure.
- Use of this Reference Software may require acceptance of licences from the respective repositories. Users shall verify that they have the right to use any third-party software required by this Reference Software.
7.2 Guide to the TIQ code
Note that the Reference Software implements the Basic MMC-TIQ AIM.
This AI Module is intended for developers who are familiar with Python and with downloading models from HuggingFace.
The Reference Software is a wrapper for the BLIP NN Module that:
- Manages input files and parameters: Text Object, Visual Object.
- Executes the BLIP Module to perform question answering on each pair of Text Object and Visual Object.
- Outputs a Text Object as the answer.
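The three steps above can be sketched with the HuggingFace `transformers` BLIP classes. This is a minimal sketch, not the Reference Software itself: the checkpoint name `Salesforce/blip-vqa-base` and the helper names `load_blip_answerer` and `run_queries` are assumptions made for illustration, and the actual code may use a different checkpoint and structure.

```python
from typing import Callable, Iterable, List, Tuple


def load_blip_answerer(
    model_name: str = "Salesforce/blip-vqa-base",  # assumed checkpoint
) -> Callable[[str, str], str]:
    """Load BLIP once and return a function mapping (question, image_path) to an answer."""
    # Heavy dependencies are imported lazily so the module can be imported without them.
    import torch
    from PIL import Image
    from transformers import BlipForQuestionAnswering, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForQuestionAnswering.from_pretrained(model_name)
    model.eval()

    def answer(question: str, image_path: str) -> str:
        # Visual Object: read the image file and normalise it to RGB.
        image = Image.open(image_path).convert("RGB")
        # Text Object and Visual Object are encoded together for the model.
        inputs = processor(image, question, return_tensors="pt")
        with torch.no_grad():
            output_ids = model.generate(**inputs)
        # Output Text Object: decode the generated token ids into a string.
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return answer


def run_queries(
    pairs: Iterable[Tuple[str, str]],
    answer: Callable[[str, str], str],
) -> List[str]:
    """Apply the answer function to each (question, image_path) pair, in order."""
    return [answer(question, image_path) for question, image_path in pairs]
```

A typical call would be `run_queries([("What is in the picture?", "photo.jpg")], load_blip_answerer())`, where `photo.jpg` is a hypothetical local file. Passing the answer function as a parameter means the model is loaded only once and the pairing logic can be exercised without downloading any weights.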
The MMC-TIQ Reference Software is found at the NNW GitLab site. It contains the Python code implementing the AIM.
The required libraries are PyTorch, Transformers (HuggingFace), and PIL.
7.3 Acknowledgements
This version of the MMC-TIQ Reference Software has been developed by the MPAI Neural Network Watermarking Development Committee (NNW-DC).