1 Function | 2 Reference Model | 3 Input/Output Data |
4 SubAIMs | 5 JSON Metadata | 6 Profiles |
7 Reference Software | 8 Conformance Testing | 9 Performance Assessment |
1 Functions
Text and Image Query (MMC-TIQ):
Receives | Text Object | Textual part of query. |
| Image Visual Object | Image part of query. |
Produces | Text Object | Response to the Text and Image provided as input. |
2 Reference Model
Figure 1 depicts the Reference Model of the Text and Image Query (MMC-TIQ) AIM.
Figure 1 – The Text and Image Query (MMC-TIQ) AIM Reference Model
3 Input/Output Data
Table 1 specifies the Input and Output Data of the Text and Image Query (MMC-TIQ) AIM.
Table 1 – I/O Data of the Text and Image Query (MMC-TIQ) AIM
Input | Description |
Text Object | Text asking question about the Image. |
Visual Object | Image about which a question is asked. |
Output | Description |
Text Object | Response produced by Text and Image Query. |
4 SubAIMs
Text and Image Query (MMC-TIQ) can be implemented as a Composite AIM whose Reference Model is depicted in Figure 2.
Figure 2 – Text and Image Query (MMC-TIQ) Composite AIM Reference Model
The AIMs and their JSON Metadata are specified in Table 2.
Table 2 – AIMs and JSON Metadata of Text and Image Query (MMC-TIQ) Composite AIM
Acronym | AIM Name | JSON |
MMC-TIQ | Text-and-Image Query | X |
OSD-VOI | Visual Object Identification | X |
MMC-NLU | Natural Language Understanding | X |
MMC-QAM | Question Analysis Module | X |
MMC-AQM | Answer to Question Module | X |
5 JSON Metadata
https://schemas.mpai.community/MMC/V2.3/AIMs/TextAndImageQuery.json
6 Profiles
No Profiles.
7 Reference Software
7.1 Disclaimers
- The purpose of this MMC-TIQ Reference Software is to provide a working Implementation of MMC-TIQ, not to provide a ready-to-use product.
- MPAI disclaims the suitability of the Software for any other purposes and does not guarantee that it is secure.
- Use of this Reference Software may require acceptance of licences from the respective repositories. Users shall verify that they have the right to use any third-party software required by this Reference Software.
7.2 Guide to the TIQ code
Note that the Reference Software implements the Basic MMC-TIQ AIM.
This AI Module is intended for developers who are familiar with Python and with downloading models from HuggingFace.
The Reference Software is a wrapper for the BLIP NN Module. The wrapper:
- Manages input files and parameters: Text Object and Visual Object.
- Executes the BLIP Module to perform the question answering on each individual pair of Text and Visual Object.
- Outputs Text Object as answer.
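The wrapper steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the Reference Software itself: the function names `load_blip_answerer` and `run_tiq` are hypothetical, and the checkpoint name `Salesforce/blip-vqa-base` is an assumption, since this page does not state which BLIP model is used. The heavy dependencies are imported lazily so that the pairing logic can be exercised without downloading a model.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed checkpoint; the specification does not name the BLIP model.
DEFAULT_CHECKPOINT = "Salesforce/blip-vqa-base"

# (image_path, question) -> answer text
AnswerFn = Callable[[str, str], str]


def load_blip_answerer(checkpoint: str = DEFAULT_CHECKPOINT) -> AnswerFn:
    """Build an answer function backed by BLIP (weights are downloaded on first use)."""
    # Lazy imports: PyTorch, PIL and transformers are only needed here.
    import torch
    from PIL import Image
    from transformers import BlipForQuestionAnswering, BlipProcessor

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForQuestionAnswering.from_pretrained(checkpoint)
    model.eval()

    def answer(image_path: str, question: str) -> str:
        # Visual Object: the image file; Text Object: the question string.
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, text=question, return_tensors="pt")
        with torch.no_grad():
            output_ids = model.generate(**inputs)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return answer


def run_tiq(pairs: Iterable[Tuple[str, str]], answer_fn: AnswerFn) -> List[str]:
    """Answer each (image_path, question) pair independently, preserving order."""
    return [answer_fn(image_path, question) for image_path, question in pairs]
```

A call such as `run_tiq([("photo.jpg", "What is on the table?")], load_blip_answerer())` would return a list containing one answer string; the file name and question here are placeholders.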
The MMC-TIQ Reference Software is found at the NNW GitLab site. It contains:
- The Python code implementing the AIM.
- Required libraries are: PyTorch, transformers (HuggingFace), and PIL.
7.3 Acknowledgements
This version of the MMC-TIQ Reference Software has been developed by the MPAI Neural Network Watermarking Development Committee (NNW-DC).
8 Conformance Testing
The Conformance Testing Method for the MMC-TIQ Basic AIM is provided here. The Conformance Testing Methods for the individual Basic AIMs of the MMC-TIQ Composite AIM are provided by the individual Basic AIMs.
Table 3 provides the Conformance Testing Method for MMC-TIQ AIM.
If a schema contains references to other schemas, conformance of data to the primary schema implies that any data referencing a secondary schema shall also validate against that schema, if present, and conform with the Qualifier, if present.
Table 3 – Conformance Testing Method for MMC-TIQ AIM
Input | Text Object | Shall validate against Text Object schema. Text Data shall conform with Text Qualifier. |
| Image Visual Object | Shall validate against Visual Object schema. Visual Data shall conform with Visual Qualifier. |
Output | Text Object | Shall validate against Text Object schema. Text Data shall conform with Text Qualifier. |
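The checks in Table 3 can be sketched as a small test harness. This is an illustrative outline under stated assumptions, not the normative Conformance Testing procedure: the role keys, the function name `check_tiq_conformance`, and the idea of passing each schema and Qualifier check as a callable are all hypothetical. In practice the schema check would wrap a JSON Schema validator loaded with the relevant MPAI schema.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Obj = Dict[str, Any]
Check = Callable[[Obj], bool]

# One role per row of Table 3; the key strings are illustrative only.
ROLES = ("input_text", "input_visual", "output_text")


def check_tiq_conformance(
    objects: Dict[str, Obj],
    checks: Dict[str, Tuple[Check, Optional[Check]]],
) -> List[str]:
    """Return readable failure messages; an empty list means every check passed."""
    failures: List[str] = []
    for role in ROLES:
        obj = objects.get(role)
        if obj is None:
            failures.append(f"{role}: object missing")
            continue
        schema_ok, qualifier_ok = checks[role]
        if not schema_ok(obj):
            failures.append(f"{role}: does not validate against its schema")
        # The Qualifier check applies only when a Qualifier is present.
        if qualifier_ok is not None and not qualifier_ok(obj):
            failures.append(f"{role}: data does not conform with its Qualifier")
    return failures
```

Passing the checks in as callables keeps the harness independent of any particular schema library, so the same loop could be reused for the Basic AIMs of the Composite AIM.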