| 1 Definition | 2 Functional Requirements | 3 Syntax | 4 Semantics |
1 Definition
A Genomics Omics Qualifier specifies metadata describing a Genomics Omics Data instance and providing information about:
- the Sub‑Type of the genomics/omics data (e.g., whole‑genome sequencing variants, RNA‑seq expression matrix, methylation profiles, proteomics abundance tables), and
- the Format used to encode the data (e.g., VCF variant representation, FASTQ/BAM reads, TSV/CSV matrices, HDF5‑based assay containers).
The combination of Genomics Omics Data and Genomics Omics Qualifier is called Genomics Omics Object, specified by AIH-HSP V1.0.
2 Functional Requirements
- Genomics Omics Sub‑Type Identification: The Genomics Omics Qualifier shall specify the Sub‑Type of the Genomics Omics Data.
- Genomics Omics Format Identification: The Genomics Omics Qualifier shall specify the Format used to encode the Genomics Omics Data.
- Separation from Data: The Genomics Omics Qualifier shall not contain intrinsic genomics/omics samples; it shall only describe the Genomics Omics Data.
- Interpretability of Data: The Genomics Omics Qualifier shall provide sufficient information to interpret the Genomics Omics Data.
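The requirements above can be illustrated with a minimal Python sketch. The class name `GenomicsOmicsQualifier` and the field names `sub_type` and `data_format` are assumptions made for illustration; they are not taken from the JSON Schema referenced in Section 3, which remains the normative syntax. The sketch only shows the idea that a qualifier carries descriptive metadata (Sub‑Type and Format) and no samples.

```python
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class GenomicsOmicsQualifier:
    """Hypothetical in-memory view of a qualifier: metadata only, no samples."""

    sub_type: str     # assumed field name, e.g. "Whole-genome variants"
    data_format: str  # assumed field name, e.g. "VCF"

    def __post_init__(self):
        # Sub-Type and Format identification: neither field may be empty.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{f.name} must be a non-empty string")


# The qualifier describes the data; the data itself is held elsewhere.
qualifier = GenomicsOmicsQualifier(
    sub_type="Whole-genome variants",
    data_format="VCF",
)
print(asdict(qualifier))
```

Keeping the class frozen and limited to two descriptive fields is one way to make the "Separation from Data" requirement visible in code; a real implementation would follow whatever structure the schema defines.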
Users needing additional entries in the Genomics Omics Qualifier or support for new Qualifiers should make a documented request to the MPAI Secretariat. Requests will be considered by the appropriate MPAI committee.
3 Syntax
https://schemas.mpai.community/TFA/V1.5/data/GenomicsOmicsQualifier.json
4 Semantics
- Sub‑Types
  - DNA‑Based Data
    - Whole‑genome variants
    - Whole‑exome variants
    - Targeted panel variants
    - Structural variants
    - Copy‑number profiles
  - RNA‑Based Data
    - Gene‑expression matrix (bulk RNA‑seq)
    - Transcript‑level quantification
    - Splicing / junction counts
  - Epigenomic Data
    - Methylation array profiles
    - Bisulfite sequencing profiles
  - Proteomics Data
    - Protein abundance profiles
    - Peptide quantification tables
  - Metabolomics Data
    - Metabolite abundance profiles
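The Sub‑Type grouping above can be sketched as a simple lookup table. The dictionary `SUB_TYPES` and the helper `category_of` are hypothetical names introduced here for illustration, and the string spellings are copied from the list rather than from the JSON Schema, which may encode them differently.

```python
# Sub-Type categories and their entries, transcribed from the list above.
SUB_TYPES = {
    "DNA-Based Data": [
        "Whole-genome variants",
        "Whole-exome variants",
        "Targeted panel variants",
        "Structural variants",
        "Copy-number profiles",
    ],
    "RNA-Based Data": [
        "Gene-expression matrix (bulk RNA-seq)",
        "Transcript-level quantification",
        "Splicing / junction counts",
    ],
    "Epigenomic Data": [
        "Methylation array profiles",
        "Bisulfite sequencing profiles",
    ],
    "Proteomics Data": [
        "Protein abundance profiles",
        "Peptide quantification tables",
    ],
    "Metabolomics Data": [
        "Metabolite abundance profiles",
    ],
}


def category_of(sub_type: str) -> str:
    """Return the category that lists the given Sub-Type (hypothetical helper)."""
    for category, names in SUB_TYPES.items():
        if sub_type in names:
            return category
    raise KeyError(f"unknown Sub-Type: {sub_type!r}")


print(category_of("Structural variants"))
```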
- Formats
  - DNA‑Based Formats
    - VCF
    - BCF
    - FASTQ (raw reads)
    - BAM / CRAM (aligned reads)
    - TSV/CSV (variant tables)
  - RNA‑Based Formats
    - FASTQ (raw reads)
    - BAM / CRAM
    - TSV/CSV (gene‑expression matrices)
    - HDF5 (e.g., assay containers)
  - Epigenomic Formats
    - IDAT (array raw files)
    - TSV/CSV (processed methylation profiles)
    - HDF5
  - Proteomics Formats
    - TSV/CSV (protein/peptide quantification)
    - HDF5
  - Metabolomics Formats
    - TSV/CSV
    - HDF5
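As an illustration of how the Format lists might be used, the following Python sketch checks whether a format is listed for a given category. The names `FORMATS`, `is_listed`, and `categories_listing` are hypothetical, and the sketch assumes that paired entries such as "BAM / CRAM" and "TSV/CSV" denote two separate formats each; the JSON Schema may represent them differently.

```python
# Formats per category, transcribed from the list above.
# Assumption: "BAM / CRAM" and "TSV/CSV" are split into individual formats.
FORMATS = {
    "DNA-Based": frozenset({"VCF", "BCF", "FASTQ", "BAM", "CRAM", "TSV", "CSV"}),
    "RNA-Based": frozenset({"FASTQ", "BAM", "CRAM", "TSV", "CSV", "HDF5"}),
    "Epigenomic": frozenset({"IDAT", "TSV", "CSV", "HDF5"}),
    "Proteomics": frozenset({"TSV", "CSV", "HDF5"}),
    "Metabolomics": frozenset({"TSV", "CSV", "HDF5"}),
}


def is_listed(category: str, fmt: str) -> bool:
    """True if fmt appears in the list for category (case-insensitive on fmt)."""
    return fmt.upper() in FORMATS.get(category, frozenset())


def categories_listing(fmt: str) -> list[str]:
    """Return, in alphabetical order, every category whose list includes fmt."""
    return sorted(c for c, formats in FORMATS.items() if fmt.upper() in formats)


print(is_listed("DNA-Based", "vcf"))
print(categories_listing("IDAT"))
```

A check of this kind could flag a qualifier whose Format does not appear under the category of its Sub‑Type, for example VCF paired with a proteomics Sub‑Type. Whether such a pairing is actually invalid is not stated in the text above, so the check is best treated as a warning rather than a conformance rule.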