1 Definition
The Genomic Processing Type defines the allowable operations, methods, and processing descriptors used for Genomic (DNA/RNA) data.
It reuses Common Definitions for: Header, Algorithm, Algorithms, FeatureClass, Features.
2 Functional Requirements
The Genomic Processing Type shall:
- Fix Domain = Genomics.
- Validate Operation against genomics‑specific enumerations.
- Validate Method against genomics computational techniques.
- Allow Algorithm to be a string identifier or an AlgorithmObject.
- Allow Algorithms to be an array of Algorithm items.
- Require Features to be a non‑empty array of unique strings.
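The requirements above can be sketched as a small Python check. This is a minimal, non-normative illustration: the JSON Schema referenced in the Syntax section remains the authoritative definition. The enumerations are copied from the Semantics table; the function names, the error messages, and the choice to treat Algorithm and Algorithms as optional are assumptions made for the example.

```python
# Enumerations copied from the Semantics table (Operation and Method rows).
OPERATIONS = (
    "QualityControl", "Alignment", "VariantCalling", "VariantFiltering",
    "Annotation", "GeneExpressionQuantification", "DifferentialExpression",
    "CopyNumberAnalysis", "HaplotypeReconstruction", "EpigenomicProcessing",
)
METHODS = (
    "FastQC", "Cutadapt", "Trimmomatic", "BWA", "Bowtie2", "STAR", "HISAT2",
    "GATK", "FreeBayes", "DeepVariant", "ANNOVAR", "VEP", "Salmon",
    "Kallisto", "DESeq2", "EdgeR", "CNVkit",
)


def _algorithm_errors(value, label):
    """Accept a string identifier or an AlgorithmObject-like dict with a Name."""
    if isinstance(value, str):
        return []
    if isinstance(value, dict):
        name = value.get("Name")
        if isinstance(name, str) and name:
            return []
        return [f"{label}: AlgorithmObject requires a non-empty string Name"]
    return [f"{label}: must be a string or an AlgorithmObject"]


def validate_genomic_processing_type(obj):
    """Return a list of violations; an empty list means every check passed."""
    errors = []
    if obj.get("Domain") != "Genomics":
        errors.append("Domain: must be the constant 'Genomics'")
    if obj.get("Operation") not in OPERATIONS:
        errors.append("Operation: not a listed genomics operation")
    if obj.get("Method") not in METHODS:
        errors.append("Method: not a listed genomics method")
    # Assumption: Algorithm and Algorithms are checked only when present.
    if "Algorithm" in obj:
        errors += _algorithm_errors(obj["Algorithm"], "Algorithm")
    if "Algorithms" in obj:
        items = obj["Algorithms"]
        if not isinstance(items, list):
            errors.append("Algorithms: must be an array")
        else:
            for i, item in enumerate(items):
                errors += _algorithm_errors(item, f"Algorithms[{i}]")
    features = obj.get("Features")
    if not isinstance(features, list) or not features:
        errors.append("Features: must be a non-empty array")
    elif not all(isinstance(f, str) for f in features):
        errors.append("Features: every item must be a string")
    elif len(set(features)) != len(features):
        errors.append("Features: items must be unique")
    return errors
```

Returning a list of violations rather than raising on the first one lets a caller report every problem in an instance at once. In practice, a JSON Schema validator run against the published schema would replace this hand-written check.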
3 Syntax
https://schemas.mpai.community/AIH1/V1.0/data/GenomicProcessingType.json
4 Semantics
| Label | Description |
|---|---|
| Header | Genomic Processing Type Header, Standard “AIH‑GNT‑Vx.y”. |
| Domain | Constant value "Genomics". The Processing Type applies exclusively to genomic data. |
| Operation | Specifies the genomics‑specific processing step. Enumerated list includes: QualityControl, Alignment, VariantCalling, VariantFiltering, Annotation, GeneExpressionQuantification, DifferentialExpression, CopyNumberAnalysis, HaplotypeReconstruction, EpigenomicProcessing. |
| QualityControl | Operation performing read‑level QC (quality checks, trimming, adapter removal). |
| Alignment | Operation mapping sequencing reads to a reference genome. |
| VariantCalling | Operation identifying SNPs, indels, or structural variants. |
| VariantFiltering | Operation filtering variants using quality thresholds or rules. |
| Annotation | Operation adding functional/clinical annotations to variants or genes. |
| GeneExpressionQuantification | Operation quantifying gene/isoform expression from RNA‑seq. |
| DifferentialExpression | Operation comparing expression levels across groups/conditions. |
| CopyNumberAnalysis | Operation detecting genomic amplifications or deletions (CNV). |
| HaplotypeReconstruction | Operation phasing variants into haplotypes. |
| EpigenomicProcessing | Operation analysing methylation or chromatin accessibility signals. |
| Method | Processing technique used to implement the operation. Must be one of: FastQC, Cutadapt, Trimmomatic, BWA, Bowtie2, STAR, HISAT2, GATK, FreeBayes, DeepVariant, ANNOVAR, VEP, Salmon, Kallisto, DESeq2, EdgeR, CNVkit. |
| Algorithm | String identifier or AlgorithmObject from CommonDefinitions. Represents the algorithm used. |
| AlgorithmObject.Name | Required algorithm name (e.g., “GATK‑HC”, “DeepVariant‑Model”). |
| AlgorithmObject.Version | Optional version identifier. |
| AlgorithmObject.Params | Free‑form object containing algorithm configuration parameters. |
| Algorithms | Array of Algorithm entries, each either a string ID or an AlgorithmObject. |
| FeatureClass | Category describing type of genomic features (e.g., variant features, expression features, CNV features). |
| Features | Non‑empty array of unique genomic feature names (e.g., SNP_Count, TPM_FoldChange, CNV_Events). |
| Trace | Provenance information and Time. |
| DescrMetadata | Descriptive Metadata. |
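
To make the table concrete, the following sketch builds one possible instance as a Python dictionary and serialises it to JSON. It is illustrative only. The header value "AIH-GNT-V1.0" (one filling of the "Vx.y" pattern), the version string, the Params content, the FeatureClass value, and the feature name Indel_Count are hypothetical. Trace and DescrMetadata are omitted because their structure is not defined in this section.

```python
import json

# Illustrative instance; values marked "hypothetical" are not taken from the specification.
instance = {
    "Header": "AIH-GNT-V1.0",              # hypothetical filling of "AIH-GNT-Vx.y"
    "Domain": "Genomics",                  # constant required by the type
    "Operation": "VariantCalling",         # one of the enumerated operations
    "Method": "GATK",                      # one of the enumerated methods
    "Algorithm": {                         # AlgorithmObject form
        "Name": "GATK-HC",
        "Version": "4.5",                  # hypothetical version identifier
        "Params": {"min_base_quality": 20},  # hypothetical free-form parameters
    },
    "Algorithms": [                        # string IDs and AlgorithmObjects may be mixed
        "GATK-HC",
        {"Name": "DeepVariant-Model"},
    ],
    "FeatureClass": "variant features",    # hypothetical value format
    "Features": ["SNP_Count", "Indel_Count"],  # non-empty, unique strings
}

document = json.dumps(instance, indent=2)
print(document)
```

The Algorithm field uses the object form here, while the Algorithms array shows that the string form and the object form can appear side by side.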