Preprocessing
The Preprocessing tab is the first step in the MCNV2 workflow. It performs CNV annotation and inheritance status calculation.
Overview
This tab allows you to:
Upload input files (CNV calls, pedigree, problematic regions)
Set inheritance parameters (overlap threshold, genome build)
Annotate CNVs with genes, LOEUF scores, and problematic regions
Calculate inheritance status (transmitted vs non-transmitted)
View annotated results
Input files
CNV file (mandatory)
Tab-delimited file with columns: CHR, START, STOP, TYPE (DEL/DUP), SAMPLE_ID
See Input formats for detailed specifications.
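As a minimal sketch of what a conforming CNV file looks like, the Python snippet below parses a tab-delimited table and checks the five required columns. The helper name read_cnv_table, the example rows, and the START < STOP check are illustrative assumptions and not part of MCNV2; the Input formats page remains the authoritative specification.

```python
import csv
import io

REQUIRED_COLUMNS = ["CHR", "START", "STOP", "TYPE", "SAMPLE_ID"]


def read_cnv_table(handle):
    """Parse a tab-delimited CNV table and validate the required columns.

    Hypothetical helper for illustration; not MCNV2's own parser.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["TYPE"] not in ("DEL", "DUP"):
            raise ValueError(f"line {line_no}: TYPE must be DEL or DUP")
        row["START"] = int(row["START"])
        row["STOP"] = int(row["STOP"])
        # Assumption: START must be strictly smaller than STOP.
        if row["START"] >= row["STOP"]:
            raise ValueError(f"line {line_no}: START must be smaller than STOP")
        rows.append(row)
    return rows


# Made-up example content with the five mandatory columns.
example = io.StringIO(
    "CHR\tSTART\tSTOP\tTYPE\tSAMPLE_ID\n"
    "chr1\t100000\t250000\tDEL\tchild01\n"
    "chr7\t500000\t620000\tDUP\tfather01\n"
)
cnvs = read_cnv_table(example)
```

Extra columns such as quality scores pass through unchanged, because the check only requires the mandatory columns to be present.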
Pedigree file (mandatory)
Three-column file: SAMPLE_ID, FATHER_ID, MOTHER_ID
Only complete trios (a child with both a father and a mother listed) are analyzed.
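The sketch below shows one way to pick complete trios out of a three-column pedigree. It assumes a tab-delimited file and that an unknown parent is encoded as an empty field, 0, or NA; both are assumptions for illustration, and the function name complete_trios is hypothetical.

```python
import csv
import io

# Assumed encodings for an unknown parent; adjust to your own pedigree files.
MISSING = {"", "0", "NA"}


def complete_trios(handle):
    """Return (child, father, mother) tuples where both parents are listed."""
    trios = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip malformed rows and an optional header row.
        if len(fields) != 3 or fields[0] == "SAMPLE_ID":
            continue
        child, father, mother = (f.strip() for f in fields)
        if father not in MISSING and mother not in MISSING:
            trios.append((child, father, mother))
    return trios


# Made-up pedigree: one complete trio, one child with a missing mother,
# and one founder with no parents listed.
pedigree = io.StringIO(
    "SAMPLE_ID\tFATHER_ID\tMOTHER_ID\n"
    "child01\tfather01\tmother01\n"
    "child02\tfather02\t0\n"
    "father01\t0\t0\n"
)
trios = complete_trios(pedigree)
```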
Problematic regions file (optional, BED format)
BED file with problematic genomic regions:
Segmental duplications
Centromeres
Telomeres
HLA region
Note
A default file is provided. You can replace it with your own BED file if needed.
Parameters for inheritance calculation
Inheritance threshold (child CNV proportion)
Default: 0.5 (50%)
This parameter sets the minimum fraction of a child CNV that must be covered by a parental CNV for the child CNV to be considered inherited.
Value range: 0.01 to 1.0 (1% to 100%)
Interpretation: If at least X% of the length of a child CNV overlaps a CNV in the father or the mother, the child CNV is classified as inherited (True)
Example: With threshold=0.5, a child CNV is inherited if at least 50% of its length overlaps a CNV in at least one parent
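The threshold rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MCNV2's implementation: coordinates are treated as half-open intervals, a parental CNV must be on the same chromosome and of the same TYPE to count, and each parental CNV is tested on its own rather than merged with others. The function names are hypothetical.

```python
def child_overlap_fraction(child, parent):
    """Fraction of the child CNV covered by one parental CNV.

    Each CNV is a (chr, start, stop, type) tuple.
    """
    # Assumption: only same-chromosome, same-type CNVs can match.
    if child[0] != parent[0] or child[3] != parent[3]:
        return 0.0
    overlap = min(child[2], parent[2]) - max(child[1], parent[1])
    return max(0, overlap) / (child[2] - child[1])


def is_transmitted(child, parental_cnvs, threshold=0.5):
    """True if any single parental CNV covers at least `threshold` of the child CNV."""
    return any(child_overlap_fraction(child, p) >= threshold for p in parental_cnvs)


# Made-up example: the father's deletion covers 600 of the child's 1000 bases.
child = ("chr1", 1000, 2000, "DEL")
father_cnvs = [("chr1", 1400, 3000, "DEL")]
mother_cnvs = [("chr1", 1000, 2000, "DUP")]  # same span, different type

fraction = child_overlap_fraction(child, father_cnvs[0])
inherited_default = is_transmitted(child, father_cnvs + mother_cnvs)
inherited_strict = is_transmitted(child, father_cnvs + mother_cnvs, threshold=0.7)
```

In this example the covered fraction is 0.6, so the child CNV counts as inherited at the default threshold of 0.5 but not at a stricter threshold of 0.7.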
Genome build
Default: GRCh38/hg38
Planned: GRCh37/hg19 (not yet implemented)
Workflow
Step 1: Upload files
Upload your CNV file, pedigree file, and optionally a custom problematic regions file.
Step 2: Set parameters
Inheritance threshold: Adjust the minimum proportion of the child CNV that must overlap a parental CNV (default 0.5, i.e. 50%)
Genome build: Select GRCh38/hg38 (currently the only implemented build)
Step 3: Submit
Click Submit to start annotation and inheritance calculation.
Annotation process
MCNV2 annotates each CNV with:
Gene annotation
Each CNV is intersected with gene coordinates (Gencode v45):
GeneName — HGNC gene symbol
GeneID — Ensembl gene ID
Transcript — Ensembl transcript ID
If a CNV overlaps multiple genes, one row is created per gene (CNV-gene pairs).
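The one-row-per-gene expansion can be pictured with the following sketch. The gene records, their names and IDs, and the choice to keep a single row with empty gene fields for a CNV that overlaps no gene are assumptions made for illustration; they are not taken from MCNV2 or Gencode.

```python
def cnv_gene_pairs(cnvs, genes):
    """Expand each CNV into one row per overlapping gene (CNV-gene pairs)."""
    pairs = []
    for cnv in cnvs:
        hits = [
            g for g in genes
            if g["chr"] == cnv["CHR"]
            and g["start"] < cnv["STOP"]
            and g["end"] > cnv["START"]
        ]
        if not hits:
            # Assumption: a CNV without gene overlap keeps one row with empty gene fields.
            pairs.append({**cnv, "GeneName": None, "GeneID": None})
        for g in hits:
            pairs.append({**cnv, "GeneName": g["name"], "GeneID": g["id"]})
    return pairs


# Placeholder gene records; names and IDs are invented for the example.
genes = [
    {"chr": "chr1", "start": 50, "end": 150, "name": "GENE_A", "id": "ID_A"},
    {"chr": "chr1", "start": 400, "end": 600, "name": "GENE_B", "id": "ID_B"},
    {"chr": "chr1", "start": 700, "end": 800, "name": "GENE_C", "id": "ID_C"},
]
cnvs = [
    {"CHR": "chr1", "START": 100, "STOP": 500, "TYPE": "DEL", "SAMPLE_ID": "child01"},
    {"CHR": "chr2", "START": 100, "STOP": 500, "TYPE": "DUP", "SAMPLE_ID": "child01"},
]
pairs = cnv_gene_pairs(cnvs, genes)
```

Here the first CNV yields two rows (GENE_A and GENE_B) and the second yields one row with no gene, so the annotated table can contain more rows than the input file.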
LOEUF scores (gnomAD v4)
LOEUF (Loss-of-function Observed/Expected Upper bound Fraction) quantifies gene constraint:
Low LOEUF (≤0.6): Highly constrained genes (intolerant to loss-of-function)
High LOEUF (>0.6): Less constrained genes
LOEUF is used for:
Stratifying Mendelian Precision by gene constraint
Optional filtering (exclude constrained genes to focus on technical precision)
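For stratification, the 0.6 cutoff described above amounts to a simple rule. The sketch below is illustrative only; the function name, the category labels, and the handling of genes without a LOEUF score are assumptions.

```python
def loeuf_category(score, cutoff=0.6):
    """Classify a gene by LOEUF using the cutoff described above."""
    if score is None:
        # Assumption: genes without a LOEUF score get their own category.
        return "unknown"
    return "constrained" if score <= cutoff else "less_constrained"


categories = [loeuf_category(s) for s in (0.35, 0.6, 1.2, None)]
```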
Problematic regions overlap
Percentage of CNV overlapping with problematic regions:
Segmental duplications
Centromeres
Telomeres
HLA region
Note
This percentage is used in the filtering step to exclude CNVs with high overlap (e.g., >50%).
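One way to compute such a percentage is to clip each problematic region to the CNV, merge overlapping regions so that no base is counted twice, and divide the covered length by the CNV length. The sketch below follows that approach; whether MCNV2 merges regions in the same way is an assumption, and the function name is hypothetical.

```python
def problematic_overlap_percent(cnv_start, cnv_stop, regions):
    """Percentage of a CNV covered by problematic regions on the same chromosome.

    `regions` is a list of (start, end) intervals, half-open as in BED.
    """
    # Keep only regions that overlap the CNV and clip them to its boundaries.
    clipped = sorted(
        (max(s, cnv_start), min(e, cnv_stop))
        for s, e in regions
        if s < cnv_stop and e > cnv_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged block before starting a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / (cnv_stop - cnv_start)


# Made-up regions: the first two overlap each other, the third extends past the CNV.
percent = problematic_overlap_percent(
    1000, 2000, [(900, 1200), (1100, 1300), (1800, 2500)]
)
```

In this example 500 of the 1000 bases are covered, so the result is 50.0, which would not be removed by a filter that excludes CNVs with more than 50% overlap.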
Annotated CNV table
After clicking Submit, the first table displays annotated CNVs:
Columns:
Original CNV file columns (CHR, START, STOP, TYPE, SAMPLE_ID, quality scores, etc.)
GeneName — Overlapping gene name
GeneID — Ensembl gene ID
Transcript — Ensembl transcript ID
LOEUF — Constraint score
problematic_region_overlap — Percentage overlap with problematic regions
Table navigation:
Show 10, 50, 100, or all entries
Scroll horizontally to view all columns
File path:
The file path where the annotated table is saved is displayed below the table.
Inheritance status calculation
Click Proceed to inheritance status to calculate transmission for each CNV.
MCNV2 uses two complementary approaches to determine inheritance:
Transmitted_CNV (coordinate-based) — Based on genomic coordinate overlap
Transmitted_gene (gene-based) — Based on shared affected genes
See also
See Inheritance status for a comprehensive explanation of the two inheritance matching approaches.
Column values in the inheritance table:
Transmitted_CNV: True / False
Transmitted_gene: True / False / intergenic
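To make the three possible Transmitted_gene values concrete, here is a rough sketch of gene-based matching on annotated CNV-gene rows. It assumes that a gene counts as shared only when a parent carries a CNV of the same TYPE affecting it, and that rows without a gene have GeneID set to None; both are assumptions for illustration, and the Inheritance status page describes the actual rules.

```python
def transmitted_gene_status(child_row, parent_rows):
    """Return True, False, or "intergenic" for one child CNV-gene row."""
    gene = child_row["GeneID"]
    if gene is None:
        return "intergenic"
    # Assumption: a match requires the same gene and the same CNV type in a parent.
    parental = {
        (r["GeneID"], r["TYPE"]) for r in parent_rows if r["GeneID"] is not None
    }
    return (gene, child_row["TYPE"]) in parental


# Placeholder gene IDs, invented for the example.
parent_rows = [
    {"GeneID": "ID_A", "TYPE": "DEL"},
    {"GeneID": "ID_B", "TYPE": "DUP"},
]
status_shared = transmitted_gene_status({"GeneID": "ID_A", "TYPE": "DEL"}, parent_rows)
status_other_type = transmitted_gene_status({"GeneID": "ID_B", "TYPE": "DEL"}, parent_rows)
status_no_gene = transmitted_gene_status({"GeneID": None, "TYPE": "DEL"}, parent_rows)
```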
Inheritance status table
The second table displays inheritance results:
Columns:
All columns from the annotated CNV table
Transmitted_CNV — True/False (coordinate-based inheritance)
Transmitted_gene — True/False/intergenic (gene-based inheritance)
Interpretation:
True (both columns): CNV is inherited from at least one parent
False (Transmitted_CNV): Candidate de novo CNV (no parental CNV overlaps it at or above the inheritance threshold)
False (Transmitted_gene): The gene is not affected by a CNV in either parent
“intergenic” (Transmitted_gene): CNV does not overlap any gene
File path:
The file path where the inheritance table is saved is displayed below the table.
Next step: Mendelian Precision analysis
Once inheritance status is calculated, click Go to Mendelian Precision analysis to:
Compute Mendelian Precision across size ranges and quality thresholds
Apply filters (quality scores, problematic regions, LOEUF, caller concordance)
Generate publication-ready plots
See Mendelian Precision for details on the MP analysis workflow.
Tips
Understanding the two inheritance approaches
Transmitted_CNV (coordinate-based) and Transmitted_gene (gene-based) have different strengths:
Transmitted_CNV depends on the chosen overlap threshold but works for all CNVs
Transmitted_gene is more robust to breakpoint variability and CNV fragmentation but only works for genic CNVs
Recommendation: Use both approaches to get a comprehensive view of CNV inheritance.
See Inheritance status for detailed comparison, advantages, limitations, and use cases.
Filtering intergenic CNVs
Intergenic CNVs can only be evaluated using coordinate-based matching (Transmitted_CNV), because their Transmitted_gene value is always "intergenic".
Troubleshooting
Error: “No complete trios found”
Check that:
Pedigree file has correct format (SAMPLE_ID, FATHER_ID, MOTHER_ID)
Each child row lists all three IDs (child, father, mother)
Sample IDs are consistent between CNV and pedigree files
Warning: “Some samples in pedigree not found in CNV file”
This is expected. A sample with no detected CNVs does not appear in the CNV file, and its trio is still valid.
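If you want to check sample ID consistency yourself before uploading, a small set comparison is enough. The sketch below lists pedigree members that never appear in the CNV file; a long list, or one that contains near-matches of IDs you expected to find, usually points to a naming mismatch rather than to samples without CNVs. The function name and sample IDs are hypothetical.

```python
def pedigree_samples_without_cnvs(trios, cnv_sample_ids):
    """Return the sorted pedigree IDs that do not occur in the CNV file."""
    pedigree_ids = {sample for trio in trios for sample in trio}
    return sorted(pedigree_ids - set(cnv_sample_ids))


# Made-up IDs: mother01 has no CNV calls, which is allowed.
trios = [("child01", "father01", "mother01")]
cnv_sample_ids = ["child01", "child01", "father01"]
absent = pedigree_samples_without_cnvs(trios, cnv_sample_ids)
```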
Table too large to display
Use pagination (show 10/50/100 entries) or download the full table for offline analysis.