1 Definition 2 Functional Requirements 3 Syntax 4 Semantics

1 Definition

Types of Data Processing that an AIH Data Processing AIM can perform on AIH Data.

2 Functional Requirements

The Functional Requirements are organised into Common Definitions and sections specific to the ECG, EEG, Medical Image, and Genomics domains.

2.1 Common Definitions – Functional Requirements

  • The Common Definitions shall define the permissible top-level Domain values (ECG, EEG, Medical Image, Genomics).
  • The Common Definitions shall enable routing of validation and processing logic based on Domain.
  • The Common Definitions shall allow Feature Class to specify the category of features (e.g., Morphological, Temporal, Frequency, Statistical, HRV, Spectral bands, ERP components).
  • The Common Definitions shall validate Features as a non-empty array of strings with unique items.
  • The Common Definitions shall support Algorithm as either a string identifier or an object with Name, Version, and Params.
  • The Common Definitions shall support Algorithms as an array whose items are either a string identifier or an Algorithm object.
  • The Common Definitions shall capture provenance using Trace.
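The Features, Algorithm, and Algorithms requirements above can be sketched as plain Python checks. This is a minimal illustration, not the normative schema: the function names are hypothetical, and the assumptions that Name is the only required key of an Algorithm object and that a string identifier must be non-empty are not stated in the requirements.

```python
from typing import Any

# Keys an Algorithm object may carry, as named in the requirements above.
ALGORITHM_KEYS = {"Name", "Version", "Params"}


def is_valid_features(features: Any) -> bool:
    """Features: a non-empty array of strings with unique items."""
    return (
        isinstance(features, list)
        and len(features) > 0
        and all(isinstance(item, str) for item in features)
        and len(set(features)) == len(features)
    )


def is_valid_algorithm(algorithm: Any) -> bool:
    """Algorithm: a string identifier, or an object with Name, Version, Params."""
    if isinstance(algorithm, str):
        # Assumption: an empty identifier is rejected.
        return len(algorithm) > 0
    if isinstance(algorithm, dict):
        # Assumption: Name is required; Version and Params are optional.
        return isinstance(algorithm.get("Name"), str) and set(algorithm) <= ALGORITHM_KEYS
    return False


def is_valid_algorithms(algorithms: Any) -> bool:
    """Algorithms: an array whose items are each a valid Algorithm."""
    return isinstance(algorithms, list) and all(
        is_valid_algorithm(item) for item in algorithms
    )
```

For example, `is_valid_algorithms(["FFT", {"Name": "ICA"}])` accepts a mix of identifier and object forms, while `is_valid_features(["SDNN", "SDNN"])` fails the uniqueness check.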

2.2 ECG Processing Type – Functional Requirements

  • The ECG Processing Type shall support operations: Pre-processing, Beat Detection, Wave Delimitation, Feature Extraction, Arrhythmia Classification, Heart Rate Variability (HRV) Analysis.
  • The ECG Processing Type shall validate Operation against ECG-specific enumerations (e.g., RPeak Detection, QRS Detection, QT Measurement, ST Deviation, HRV Time Domain, HRV Frequency Domain, Beat Classification, Baseline Wander Removal, Denoising, Arrhythmia Detection).
  • The ECG Processing Type shall validate Method against ECG techniques (e.g., Pan Tompkins, Wavelet Transform, Savitzky Golay, Adaptive Thresholding, Template Matching, Lomb Scargle, Welch PSD, Median Filter, CNN, RNN, Transformer).
  • The ECG Processing Type shall allow Targets to include waveform components P, QRS, T, ST.
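The ECG checks above amount to membership tests against enumerations. The sketch below uses only the example values quoted in the requirements; because those lists are introduced with "e.g.", the schema's actual enumerations may be longer and may spell the values differently. The record layout (keys Operation, Method, Targets) and the function name are assumptions made for illustration.

```python
# Values copied from the requirement text above. The Operation and Method
# lists are examples only, so the real enumerations may contain more entries.
ECG_OPERATIONS = frozenset({
    "RPeak Detection", "QRS Detection", "QT Measurement", "ST Deviation",
    "HRV Time Domain", "HRV Frequency Domain", "Beat Classification",
    "Baseline Wander Removal", "Denoising", "Arrhythmia Detection",
})
ECG_METHODS = frozenset({
    "Pan Tompkins", "Wavelet Transform", "Savitzky Golay",
    "Adaptive Thresholding", "Template Matching", "Lomb Scargle",
    "Welch PSD", "Median Filter", "CNN", "RNN", "Transformer",
})
ECG_TARGETS = frozenset({"P", "QRS", "T", "ST"})


def check_ecg_processing(record: dict) -> list:
    """Return the problems found in a hypothetical ECG record; empty means it passed."""
    problems = []
    operation = record.get("Operation")
    if operation not in ECG_OPERATIONS:
        problems.append(f"Operation not in ECG enumeration: {operation!r}")
    method = record.get("Method")
    # Assumption: Method is optional and is only checked when present.
    if method is not None and method not in ECG_METHODS:
        problems.append(f"Method not in ECG techniques: {method!r}")
    # Assumption: Targets is optional and defaults to no targets.
    for target in record.get("Targets", []):
        if target not in ECG_TARGETS:
            problems.append(f"Target is not one of P, QRS, T, ST: {target!r}")
    return problems
```

Returning a list of problems rather than a single boolean lets a caller report every failed field at once.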

2.3 EEG Processing Type – Functional Requirements

  • The EEG Processing Type shall support operations: Pre-processing, Epoching, Feature Extraction, Source Localisation, Classification.
  • The EEG Processing Type shall validate Operation against EEG-specific enumerations (e.g., Band Power, Spectral Analysis, Event Related Potential, Artifact Removal, Independent Component Analysis (ICA) Decomposition, Source Localisation, Sleep Stage Classification, Epoching, Connectivity Analysis, Time Frequency Analysis).
  • The EEG Processing Type shall validate Method against EEG techniques (e.g., FFT, Welch PSD, Morlet Wavelet, ICA, CSP, sLORETA, Beamforming, Notch Filter, Bandpass Filter, DCNN, LSTM).

2.4 Medical Imaging Processing Type – Functional Requirements

  • The Medical Imaging Processing Type shall support operations: Pre-processing, Segmentation, Registration, Detection Classification, Reconstruction, Quantification.
  • The Medical Imaging Processing Type shall validate Operation against imaging-specific enumerations (e.g., Segmentation, Registration, Denoising, Enhancement, Lesion Detection, Classification, Quantification, Reconstruction, Motion Correction, Feature Extraction).
  • The Medical Imaging Processing Type shall validate Method against imaging techniques (e.g., Otsu Thresholding, UNet, ResNet, FLIRT, ANTs, SIFT, SURF, Non Local Means, Bilateral Filter, Histogram Equalisation, CNN, Transformer, LevelSet, GraphCut).

2.5 Genomic Processing Type – Functional Requirements

  • The Genomic Processing Type shall support operations: Alignment, Quality Control, Trimming, Variant Calling, Variant Filtration, Annotation, Expression Quantification, Differential Expression, Peak Calling, Assembly, Phasing, Normalization.
  • The Genomic Processing Type shall validate Method against genomics techniques (e.g., BWA, Bowtie2, Minimap2, STAR, HISAT2, GATK.HaplotypeCaller, FreeBayes, bcftools, Samtools, FastQC, Cutadapt, Trimmomatic, DESeq2, edgeR, Salmon, Kallisto, MACS2, SPAdes, Trinity).
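Taken together, the "shall support operations" lists in Sections 2.2 to 2.5 suggest a Domain-keyed lookup, which is one way to realise the routing by Domain required in Section 2.1. The sketch below is an assumption-laden illustration: the operation names are copied from the requirement text, whereas the schema may use compact identifiers (such as those in the Semantics table), and the function names are hypothetical.

```python
# Supported operations per Domain, copied from Sections 2.2 to 2.5.
# Assumption: the spelling follows the requirement text, not the schema.
OPERATIONS_BY_DOMAIN = {
    "ECG": (
        "Pre-processing", "Beat Detection", "Wave Delimitation",
        "Feature Extraction", "Arrhythmia Classification",
        "Heart Rate Variability (HRV) Analysis",
    ),
    "EEG": (
        "Pre-processing", "Epoching", "Feature Extraction",
        "Source Localisation", "Classification",
    ),
    "Medical Image": (
        "Pre-processing", "Segmentation", "Registration",
        "Detection Classification", "Reconstruction", "Quantification",
    ),
    "Genomics": (
        "Alignment", "Quality Control", "Trimming", "Variant Calling",
        "Variant Filtration", "Annotation", "Expression Quantification",
        "Differential Expression", "Peak Calling", "Assembly", "Phasing",
        "Normalization",
    ),
}


def operations_for(domain: str) -> tuple:
    """Route on Domain and return the operations that domain supports."""
    try:
        return OPERATIONS_BY_DOMAIN[domain]
    except KeyError:
        raise ValueError(f"unsupported Domain: {domain!r}") from None


def is_supported(domain: str, operation: str) -> bool:
    """True when the operation belongs to the given Domain."""
    return operation in operations_for(domain)
```

An unknown Domain raises `ValueError` instead of returning an empty tuple, so a misspelt Domain is not mistaken for a domain with no operations.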

3 Syntax

https://schemas.mpai.community/AIH1/V1.0/data/AIHDataProcessingType.json

4 Semantics

| Label | Description |
|---|---|
| Header | Health Data Processing Types Header |
| – Standard – HealthDataProcessingTypes | The characters AIH-HDP-V |
| – Version | Major version – 1 or 2 characters |
| – Dot-separator | The character . |
| – Subversion | Minor version – 1 or 2 characters |
| Common Definitions | |
| Domains | Top-level processing areas covered by this taxonomy. |
| Algorithm | Identifier (string) or object (Name, Version, Params). Use ID when minimal; object when detailed. |
| FeatureClass | Class of features (e.g., Morphological, Temporal, Frequency, Statistical, HRV, Spectral bands, ERP components). |
| Features | Examples per domain: ECG → QRS_Duration, RR_Interval, SDNN, RMSSD; EEG → PSD bands, ERP amplitudes; Imaging → Radiomics features. |
| Trace | Capture source/time provenance (MPAI Trace). Trace.SourceAIM → AIF/V3.0/data/AIMInstance.json; Trace.Time → OSD/V1.5/data/Time.json. |
| DescrMetadata | Descriptive metadata for human-readable context (title, description, notes). |
| Domains – ECG | Electrocardiogram signal processing domain. |
| –– Preprocessing | Noise/artifact mitigation before analysis. |
| –– BeatDetection | Detection of heart beats and R-peaks. |
| –– WaveDelimitation | Identification of P/QRS/T boundaries and ST segments. |
| –– FeatureExtraction | Derivation of morphological/temporal/frequency features. |
| –– ArrhythmiaClassification | Automated classification of rhythm abnormalities. |
| –– HRVAnalysis | Time/frequency/nonlinear analysis of heart-rate variability. |
| Methods – ECG | Techniques for ECG domain operations. |
| –– Preprocessing | BaselineWanderRemoval; PowerlineNotch; BandpassFiltering; MotionArtifactSuppression; WaveletDenoising. |
| –– BeatDetection | QRSDetection; RPeakDetection. |
| –– WaveDelimitation | PWaveSegmentation; QRSTDelimitation; TWaveEndDetection. |
| –– FeatureExtraction – FeatureClass | Morphological; Temporal; Frequency; Statistical; HRV. |
| –– FeatureExtraction – Features | Examples: QRS_Duration, RR_Interval, PSD bands, SDNN, RMSSD. |
| –– ArrhythmiaClassification | RuleBased; ClassicalML; DeepLearning. |
| –– HRVAnalysis | TimeDomain; FrequencyDomain; Nonlinear. |
| Targets – P | Atrial depolarisation wave. |
| Targets – QRS | Ventricular depolarisation complex. |
| Targets – T | Ventricular repolarisation wave. |
| Targets – ST | Segment between S end and T start. |
| Domains – EEG | Electroencephalography signal processing domain. |
| –– Preprocessing | Filtering, re-referencing, artifact removal. |
| –– Epoching | Segmentation of continuous EEG around events. |
| –– FeatureExtraction | Spectral/ERP/time–frequency/connectivity features. |
| –– SourceLocalization | Estimation of neural generators from scalp signals. |
| –– Classification | Model-based categorisation of signals or states. |
| Methods – EEG | Techniques for EEG domain operations. |
| –– Preprocessing | Filtering_Bandpass; Filtering_Notch; ReReferencing; Resampling; ArtifactRejection_ICA; ArtifactRejection_Automated. |
| –– Epoching | EventLockedEpochs; BaselineCorrection. |
| –– FeatureExtraction | Spectral; ERP; TimeFrequency; Connectivity. |
| –– FeatureExtraction – FeatureClass | Spectral bands; ERP components; wavelet/STFT; coherence/PLI/PLV. |
| –– SourceLocalization | DipoleFitting; DistributedInverseSolution. |
| –– Classification | ClassicalML; DeepLearning. |
| Domains – Genomics | Genomics/transcriptomics processing domain (e.g., WGS/WES/RNA-seq/ChIP-seq). |
| –– Alignment | Mapping reads to a reference genome/transcriptome. |
| –– QualityControl | Read/coverage quality assessment and reporting. |
| –– Trimming | Adapter/low-quality base removal before alignment. |
| –– VariantCalling | Detection of SNPs/indels/structural variants. |
| –– VariantFiltration | Applying quality/annotation-based filters to variants. |
| –– Annotation | Adding biological/functional context to variants. |
| –– ExpressionQuantification | Quantifying gene/transcript expression from RNA-seq. |
| –– DifferentialExpression | Comparing expression between conditions/groups. |
| –– PeakCalling | Identifying enriched regions (e.g., ChIP-seq peaks). |
| –– Assembly | De novo or reference-guided assembly of sequences. |
| –– Phasing | Inferring haplotypes from genotype/reads. |
| –– Normalization | Scaling/normalizing counts/signals for comparability. |
| Methods – Genomics | Techniques for genomics operations. |
| –– Alignment | BWA; Bowtie2; Minimap2; STAR; HISAT2. |
| –– QualityControl | FastQC. |
| –– Trimming | Cutadapt; Trimmomatic. |
| –– VariantCalling | GATK.HaplotypeCaller; FreeBayes; bcftools; Samtools. |
| –– Annotation | Examples include ANNOVAR or SnpEff (if used). |
| –– ExpressionQuantification | Salmon; Kallisto. |
| –– DifferentialExpression | DESeq2; edgeR. |
| –– PeakCalling | MACS2. |
| –– Assembly | SPAdes; Trinity. |
| –– Normalization | TPM/RPKM/FPKM or domain-specific approaches (if applicable). |
| –– FeatureExtraction – FeatureClass | Coverage; Variants; Expression; Peak metrics. |
| –– FeatureExtraction – Features | Examples: read_depth, variant_count, TPM, peak_score. |
| Domains – MedicalImaging | Medical image processing domain. |
| –– Preprocessing | Denoising, contrast/bias correction, normalization. |
| –– Segmentation | Partitioning images into anatomical/lesion regions. |
| –– Registration | Spatial alignment within/across modalities or time. |
| –– DetectionClassification | Finding and labeling abnormalities or tissues. |
| –– Reconstruction | Improving or rebuilding images from raw/undersampled data. |
| –– Quantification | Measurement and radiomics feature computation. |
| Methods – MedicalImaging | Techniques for Medical Imaging domain operations. |
| –– Preprocessing | Denoising; ContrastEnhancement; BiasFieldCorrection; Normalization. |
| –– Segmentation | Thresholding; RegionGrowing; ActiveContour; GraphCut; UNet; AttentionUNet; TransformerBased. |
| –– Registration | Rigid; Affine; Deformable; IntensityBased; LandmarkBased. |
| –– DetectionClassification | FeatureBased; ClassicalML; DeepLearning. |
| –– Reconstruction | IterativeReconstruction; SuperResolution; Denoising. |
| –– Quantification | Radiomics; VolumeMeasurement; ShapeAnalysis. |
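
The Header rows at the top of the Semantics table describe a string made of the characters AIH-HDP-V, a major version, a dot, and a minor version. A minimal sketch of parsing such a header follows. It assumes the "1 or 2 characters" of each version field are decimal digits, which the table does not state, and the function name is hypothetical.

```python
import re

# "AIH-HDP-V", then Version, the dot separator, then Subversion.
# Assumption: each version field is 1 or 2 decimal digits.
HEADER_PATTERN = re.compile(r"AIH-HDP-V([0-9]{1,2})\.([0-9]{1,2})")


def parse_header(header: str):
    """Return (major, minor) as integers, or None if the header does not match."""
    match = HEADER_PATTERN.fullmatch(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Here `fullmatch` is used so that trailing or leading characters around the header cause a rejection rather than a partial match.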