Mendelian Precision
Mendelian Precision (MP) is the core quality metric in MCNV2, quantifying the proportion of CNV calls that follow Mendelian inheritance patterns in parent–offspring trios.
Note
Mendelian Precision is defined as:
where:
N = Total number of CNVs detected in offspring
I = Number of inherited CNVs (found in at least one parent)
E = Number of non-inherited CNVs (not found in parents)
Rationale:
Given the low expected rate of genuine de novo CNVs (typically 1.92% per individual), non-inherited CNVs are predominantly false positives. MP therefore provides a biologically grounded estimate of CNV call precision without requiring an external reference callset.
Interpretation
High MP (≥80%):
Most CNVs are inherited → High call quality
Few false positives
Callset suitable for downstream analyses
Moderate MP (50-80%):
Mixed quality
May benefit from additional filtering
Investigate size-specific or quality-score-specific patterns
Low MP (<50%):
High false positive rate
Requires aggressive filtering or caller parameter tuning
Consider alternative CNV calling methods
Transmission types
MCNV2 calculates MP using two complementary approaches:
CNV-level matching (coordinate-based) — Based on genomic coordinate overlap
Gene-level matching (gene-based) — Based on shared affected genes
See also
See Inheritance status for detailed algorithms, advantages, and limitations of each approach.
MP calculation:
MP can be computed separately using each approach:
Why use both:
CNV-level works for all CNVs (including intergenic)
Gene-level is robust to breakpoint variability (genic CNVs only)
The two approaches provide complementary quality assessments
Recommendation:
Compute MP using both approaches. Differences between MP_CNV and MP_gene reflect the interplay of breakpoint precision and CNV fragmentation in your specific dataset.
MP stratification
To identify quality issues and optimization strategies, MP should be computed across multiple dimensions:
By CNV size
Size ranges:
1-30 kb
30-50 kb
50-100 kb
100-200 kb
200-500 kb
500 kb-1 Mb
>1 Mb
Typical pattern:
Small CNVs (<30 kb): Low MP (high false positive rate)
Medium CNVs (50-200 kb): Moderate to high MP
Large CNVs (>500 kb): High MP (more reliable)
Interpretation:
If MP increases dramatically with size, consider applying a minimum size filter.
By CNV type
DEL vs DUP:
MP is calculated separately for deletions and duplications, as they often have different precision profiles.
By quality score
Quality metrics:
MP can be stratified by various quality scores:
Score — Caller-specific quality score
SNP — Number of supporting SNPs (array data)
% Overlap — Reciprocal overlap percentage between CNVs detected by multiple algorithms
Typical pattern:
MP increases with quality score, often plateauing at a certain threshold. This plateau identifies the optimal filtering threshold.
Example:
Score ≥10: MP = 65%
Score ≥30: MP 780%
Score ≥50: MP= 90%
Score ≥200: M = 9 2% (plateau)
Optimal threshold: 50 (additional filtering beyond this point provides no MP improvement)
Filtering strategies
To maximize MP while retaining biologically relevant CNVs, MCNV2 supports multiple filtering approaches:
CNV-level filters
Size filters:
Minimum size (bp)
Maximum size (bp)
Target specific size ranges
Quality score filters:
Minimum quality score
Minimum number of supporting probes/reads
Minimum caller concordance
Important
Highly recommended: Apply problematic region filters to all datasets.
CNVs in these regions are enriched for false positives due to:
Read mismapping (segmental duplications)
Poor mappability (centromeres, telomeres)
Genuine polymorphism (HLA region)
Recommended threshold: Exclude CNVs with >50% overlap.
Gene-level filters
Exclusion lists:
Upload a list of genes to exclude (e.g., immunoglobulin genes, olfactory receptors)
Useful for removing genes prone to technical artifacts
Gene constraint filters (LOEUF):
Exclude CNVs affecting highly constrained genes (LOEUF < threshold)
Rationale: CNVs in constrained genes may be genuine de novo events, reducing technical MP
Important
LOEUF filter for MP calculation only
Excluding CNVs in constrained genes (low LOEUF) helps distinguish:
Technical MP — Precision excluding likely de novo events
Overall MP — Precision including all non-inherited CNVs
CNVs affecting constrained genes are enriched for genuine de novo events, which reduce MP but are biologically real. Excluding them from MP calculation provides a cleaner assessment of technical false positive rate.
Critical: These CNVs must be retained in your final dataset for downstream analyses (disease association, burden tests) as they may represent pathogenic variants.
See Filtering strategies for detailed filtering strategies and optimization approaches.
Optimal threshold identification
Strategy:
Plot MP versus quality score threshold (line plot)
Identify where MP plateaus
Use the lowest threshold at which MP reaches plateau
Example:
For deletions 50-100kb:
- Score ≥30 → MP = 75% (n = 500)
- Score ≥50 → MP = 85% (n = 300)
- Score ≥70 → MP = 92% (n = 150) ← Plateau starts
- Score ≥70 → MP = 92% (n = 140) ← No further MP gain
Optimal threshold: 150 (plateau reached, retains 150 CNVs)
Technical vs biological non-inheritance
Challenge:
Non-inherited CNVs include both:
Technical false positives (reduce MP, should be filtered)
Genuine *de novo* CNVs (reduce MP, should be retained)
Solution:
Use gene constraint (LOEUF) to distinguish these two categories:
CNVs affecting constrained genes (low LOEUF) → Enriched for de novo events
CNVs affecting unconstrained genes → Enriched for false positives
Workflow:
Calculate MP (all CNVs) → Includes both technical and biological non-inheritance
Calculate MP (excluding LOEUF < 0.6) → Focuses on technical precision only
Compare the two values:
Large difference (>10%): High de novo rate in constrained genes
Small difference (<5%): Most non-inherited CNVs are false positives
Important
LOEUF exclusion is for assessment only
Example:
1000 CNVs total:
- MP (all CNVs) = 85% → 150 non-inherited
- MP (LOEUF ≥ 0.6) = 92% → 80 non-inherited (970 CNVs after excluding 30)
Interpretation:
- ~30 CNVs (3%) likely genuine *de novo* in constrained genes
- ~120 CNVs (12%) technical false positives
- Keep all 1000 CNVs for downstream analysis
- Apply quality filters to remove the 120 false positives
Global vs stratified MP
Global MP:
Single value for all CNVs (or all DEL, all DUP)
Simple summary metric
May mask quality issues in specific size ranges or quality bins
Stratified MP:
MP computed for each size range and/or quality threshold
Reveals patterns (e.g., low MP for small CNVs)
Enables targeted filtering strategies
Best practice: Always examine stratified MP before applying filters.
Use cases for Mendelian Precision
1. CNV caller evaluation
Compare MP across different callers or parameter settings to identify the best configuration.
2. Quality control
Assess CNV call quality in a new dataset.
3. Filter optimization
Systematically evaluate the impact of different filters on MP to maximize precision while retaining CNVs.
4. Method development
Use MP as an optimization criterion when developing new CNV calling methods.
5. Publication
Report MP alongside standard metrics (sensitivity, specificity) to provide a biologically grounded quality assessment.
Limitations
Assumptions:
Parents are biological parents (not step-parents or adoption)
Trio relationships are correctly specified in pedigree file
De novo CNV rate is low (~1.92%)
Caveats:
MP does not distinguish between technical false positives and genuine de novo events
MP is not a measure of sensitivity (cannot detect false negatives)
See also
Preprocessing — How inheritance status is calculated
Inheritance status — CNV-level vs gene-level matching in detail
Filtering strategies — Filtering strategies to optimize MP
Outputs — Visualizations and tables