BSMS205 · Genetics

Haploinsufficiency

Chapter 11 · Part II · Variation
A question to start with

When is half
not enough?

Two copies · why we usually have a buffer

  • Diploid: one allele from Mom, one from Dad
  • For most genes, one copy covers normal demand
  • The other copy = biological backup
  • But not always: some genes are sensitive to dosage
The cake analogy
A recipe needs two cups of sugar.
With one, you get something cake-like.
It just doesn't taste right.

Haploinsufficient genes follow the two-cup recipe.

From Chapter 10 → here

Last time

  • Dominant alleles · one bad copy is enough
  • Many mechanisms

Today

  • One mechanism in detail:
  • Haploinsufficiency — dose-sensitive genes

Roadmap for today

  1. Definitions · haploinsufficient vs haplosufficient
  2. Why selection keeps harmful PTVs rare
  3. Measuring intolerance · pLI
  4. The next-gen score · LOEUF
  5. Case study · SCN2A
  6. Why this matters in the clinic
  7. Summary & what comes next
§ 1

Defining
the Terms

Haploinsufficient · one copy is not enough

  • One allele knocked out (often by a protein-truncating variant, PTV)
  • Remaining copy cannot keep up
  • Result: disease or abnormal trait
  • These genes are dose-sensitive
Cut the protein in half → the cell can't compensate.

Haplosufficient · one copy is fine

  • One working copy → enough protein
  • The cell tolerates the missing allele
  • Most genes behave this way
  • This is the default state for diploids

LoF-tolerant (loss-of-function tolerant) · even both copies can go

  • Some genes are even more forgiving
  • You can lose one or both copies without harm
  • Often have backup paralogs elsewhere
  • Or simply not critical for fitness

The dose-sensitivity spectrum

Class             | One copy lost | Two copies lost    | Examples
Haploinsufficient | Disease       | Often lethal       | Transcription factors, channels
Haplosufficient   | No phenotype  | Recessive disease  | Most disease genes
LoF-tolerant      | No phenotype  | Often no phenotype | Olfactory receptors
A surprising number
~100
PTVs · in every healthy person
  • Healthy adults each carry about 100 PTVs
  • Most sit in LoF-tolerant genes
  • A PTV by itself is not a death sentence
  • What matters is which gene got hit
§ 2

Selection
Removes the Worst

Natural selection · in one sentence

Variants that reduce fitness
get filtered out across generations.
  • Fitness = ability to survive and reproduce
  • Helpful variants spread; harmful ones vanish
  • Slow-motion quality control on the genome

Purifying selection · the cleanup crew

  • Type of natural selection that removes harmful variants
  • Most active on essential genes
  • PTVs in haploinsufficient genes → strongly disfavored
  • PTVs in LoF-tolerant genes → no pressure to remove

The signal we exploit

If a gene has far fewer PTVs
than expected by chance,
selection has been removing them.
  • Far fewer PTVs than expected in healthy people → likely essential, dose-sensitive gene
  • About as many PTVs as expected → tolerant gene
  • This is the basis of every constraint metric
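
One way to make "far fewer than expected by chance" concrete is a Poisson tail probability. The sketch below assumes that PTV counts in a gene scatter around the expected count like a Poisson variable. The function name and the example numbers (20 expected, 2 observed) are hypothetical and do not come from ExAC or gnomAD.

```python
import math

def poisson_depletion_pvalue(observed: int, expected: float) -> float:
    """P(X <= observed) for X ~ Poisson(expected), with expected > 0.

    A small value means the gene has fewer PTVs than chance alone
    would plausibly produce.
    """
    return sum(
        math.exp(k * math.log(expected) - expected - math.lgamma(k + 1))
        for k in range(observed + 1)
    )

# Hypothetical gene: 20 PTVs expected, only 2 observed.
p = poisson_depletion_pvalue(observed=2, expected=20.0)
print(f"P(<= 2 PTVs | 20 expected) = {p:.2e}")
```

A tiny probability says that chance is a poor explanation for the shortfall, which leaves selection as the natural candidate.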
§ 3

Measuring with
pLI

The ExAC study · 2016

  • 60,706 adult exomes · the largest at the time
  • Adults without severe developmental disorders
  • Healthy population = baseline
  • Lek et al. 2016, Nature

What pLI means

pLI = probability that a gene is loss-of-function intolerant
  • Score from 0 to 1
  • pLI ≥ 0.9 → highly intolerant · likely haploinsufficient
  • pLI ≈ 0 → tolerates PTVs · LoF-tolerant
  • A binary-style readout: intolerant or not

How pLI is calculated

compare observed PTVs ↔ expected PTVs
  • Expected: predicted from gene length and a background mutation-rate model
  • Observed: actual PTVs found in 60,706 exomes
  • Big shortfall → strong selection against PTVs
  • Big shortfall → high pLI
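
The observed-versus-expected comparison can be sketched as a three-class mixture, loosely in the spirit of the ExAC model. Several things here are assumptions: the class ratios (1.0, 0.463, 0.089) are quoted from memory of the ExAC model and should be checked against Lek et al. 2016, the equal class priors are a simplification (the published model estimates them from all genes), and the function names are hypothetical.

```python
import math

# Assumed observed/expected ratio for each gene class (verify before reuse).
CLASS_RATIOS = {"null": 1.0, "recessive": 0.463, "haploinsufficient": 0.089}

def log_poisson(k: int, mu: float) -> float:
    """Log of the Poisson probability of seeing k events when mu are expected."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def pli_sketch(observed: int, expected: float, priors=None) -> float:
    """Posterior probability that a gene belongs to the haploinsufficient class."""
    if priors is None:
        # Simplifying assumption: every class is equally likely a priori.
        priors = {name: 1 / len(CLASS_RATIOS) for name in CLASS_RATIOS}
    log_post = {
        name: math.log(priors[name]) + log_poisson(observed, expected * ratio)
        for name, ratio in CLASS_RATIOS.items()
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_post.values())
    weights = {name: math.exp(value - top) for name, value in log_post.items()}
    return weights["haploinsufficient"] / sum(weights.values())

print(pli_sketch(observed=18, expected=165.8))  # strongly depleted gene
print(pli_sketch(observed=20, expected=20.0))   # gene with no depletion
```

The first call should land very close to 1 and the second very close to 0, which is the yes/no behavior that pLI is known for.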

3,230 highly intolerant genes

3,230
genes with pLI ≥ 0.9
  • ~16% of human protein-coding genes
  • Enriched in ribosome assembly
  • Enriched in chromatin regulation
  • Enriched in cell cycle control

pLI flags known disease genes

pLI distributions across functional gene categories
Figure 1. Known haploinsufficient (dominant) disease genes pile up at high pLI; recessive disease genes spread more broadly. pLI captures real biology. Source: Lek et al. 2016, Nature. CC-BY 4.0.

Why use healthy adults?

  • Haploinsufficient genes → developmental disorders
  • Healthy adults already survived and may have reproduced
  • Selection has already worked on this sample
  • The signal you see = what survived selection

PTVs are mostly singletons

Allele frequency distribution of PTVs in ExAC
Figure 2. Most PTVs in ExAC appear in only one person — the signature of purifying selection keeping harmful variants rare. Source: Lek et al. 2016, Nature. CC-BY 4.0.
§ 4

The Refined Score:
LOEUF

The gnomAD study · 2020

  • 141,456 individuals · more than 2× ExAC
  • Both exomes and whole genomes
  • Adults without severe developmental disorders
  • Karczewski et al. 2020, Nature

What LOEUF means

LOEUF = Loss-of-function Observed/Expected Upper-bound Fraction
  • Continuous score (no hard threshold)
  • Lower = more intolerant
  • Rule of thumb: LOEUF < 0.35 → likely haploinsufficient
  • Upper bound of the 90% confidence interval on o/e · accounts for statistical uncertainty
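
A minimal sketch of an "upper bound on observed/expected": scan candidate o/e ratios on a grid, weight each by how well it explains the observed count under a Poisson model, and report the ratio where 95% of the weight has accumulated. This approximates the idea behind LOEUF and is not the gnomAD pipeline itself. The grid range, step size, and function names are assumptions chosen for illustration.

```python
import math

def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability of k events when mu are expected."""
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def oe_upper_bound(observed: int, expected: float,
                   alpha: float = 0.05,
                   grid_max: float = 2.0,
                   step: float = 0.001) -> float:
    """Upper end of a (1 - 2*alpha) interval on the o/e ratio."""
    n_steps = int(round(grid_max / step))
    ratios = [i * step for i in range(n_steps + 1)]
    density = [poisson_pmf(observed, expected * r) for r in ratios]
    total = sum(density)
    cumulative = 0.0
    upper = ratios[0]
    for ratio, weight in zip(ratios, density):
        cumulative += weight / total
        if cumulative >= 1 - alpha:
            break
        upper = ratio  # last ratio still below the 95% mark
    return upper

# SCN2A counts from this lecture: 18 observed, 165.8 expected.
print(oe_upper_bound(observed=18, expected=165.8))
```

Because the bound widens when few variants are expected, a short gene with zero observed PTVs does not automatically look as constrained as a long gene with a large shortfall.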

pLI vs LOEUF · what changed

pLI

  • Yes / no test
  • Cutoff at 0.9
  • Misses moderate intolerance

LOEUF

  • Sliding scale
  • Lower = more intolerant
  • Captures shades of gray

The LOFTEE upgrade

  • LOFTEE = LoF Transcript Effect Estimator
  • Filters out likely false-positive PTVs · sequencing errors, alternative transcripts
  • Output: 443,769 high-confidence PTVs
  • Cleaner input → more reliable score

LOEUF lines up with biology

LOEUF distributions across gene categories
Figure 3. Genes with low LOEUF are essential in mouse knockouts and enriched for human disease. External validation that LOEUF captures real biological importance. Source: Karczewski et al. 2020, Nature. CC-BY 4.0.
§ 5

Case Study:
SCN2A

SCN2A · the dosage profile

165.8
expected PTVs
18
observed PTVs

o/e = 0.11 · pLI = 1.0 · LOEUF very low
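
The arithmetic behind the o/e value, using the two counts on this slide. The variable names are illustrative only.

```python
expected_ptvs = 165.8  # PTVs predicted if SCN2A were under no selection
observed_ptvs = 18     # PTVs actually seen in the reference population

oe_ratio = observed_ptvs / expected_ptvs
missing_ptvs = expected_ptvs - observed_ptvs

print(f"o/e = {oe_ratio:.2f}")                      # o/e = 0.11
print(f"missing PTVs = {missing_ptvs:.1f}")         # missing PTVs = 147.8
print(f"fraction depleted = {1 - oe_ratio:.0%}")    # fraction depleted = 89%
```

Roughly nine out of ten expected PTVs are absent, which is the depletion the constraint scores summarize.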

SCN2A on gnomAD · the actual page

gnomAD constraint table for SCN2A
Figure 4. The gnomAD constraint table for SCN2A. pLoF row: expected 165.8, observed 18, o/e = 0.11, pLI = 1.0. Missense Z-score = 8.73 also shows depletion. Source: gnomAD Browser, ENSG00000136531.

The clinical reality of SCN2A

  • ~1,500–2,000 known affected individuals worldwide
  • Severe epilepsy and neurodevelopmental disorders
  • De novo PTVs (not inherited) → fitness essentially zero
  • That is why the constraint signal is so strong
§ 6

Why It Matters
in the Clinic

Variant prioritization in diagnosis

  • Patient has a rare PTV in Gene X
  • pLI = 1.0 / LOEUF = 0.1 → very suspicious
  • pLI = 0 / LOEUF = 1.5 → probably not the cause
  • Constraint scores = front-line filter for clinicians
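
The triage logic above can be sketched as a small filter. The cutoffs are the ones used in this lecture (pLI ≥ 0.9, LOEUF < 0.35). The function name and labels are hypothetical, and a real diagnostic pipeline would weigh much more evidence than constraint alone.

```python
PLI_CUTOFF = 0.9     # lecture threshold for "highly intolerant"
LOEUF_CUTOFF = 0.35  # lecture threshold for "likely haploinsufficient"

def constraint_flag(pli: float, loeuf: float) -> str:
    """Label a gene carrying a rare PTV by how suspicious its constraint looks."""
    if pli >= PLI_CUTOFF or loeuf < LOEUF_CUTOFF:
        return "suspicious"
    return "lower priority"

print(constraint_flag(pli=1.0, loeuf=0.1))  # suspicious
print(constraint_flag(pli=0.0, loeuf=1.5))  # lower priority
```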

Drug target prioritization

  • Drug target = often a gene we want to turn off
  • Lower LOEUF → likely more side effects if inhibited
  • Higher LOEUF → maybe safer to inhibit
  • Constraint helps rank candidates early
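
Ranking candidates by constraint can be as simple as a sort. The gene names and LOEUF values below are made up for illustration, and the helper name is hypothetical.

```python
# Made-up LOEUF values for three placeholder targets.
candidates = {"GENE_A": 0.12, "GENE_B": 1.40, "GENE_C": 0.65}

def rank_by_tolerance(loeuf_by_gene: dict) -> list:
    """Order genes from most LoF-tolerant (highest LOEUF) to least."""
    return sorted(loeuf_by_gene, key=loeuf_by_gene.get, reverse=True)

ranked = rank_by_tolerance(candidates)
print(ranked)  # ['GENE_B', 'GENE_C', 'GENE_A']
```

Genes at the front of the list are the ones where inhibition is least likely to mimic a harmful loss of function.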

The PCSK9 twist · LoF can be good

Some humans are born with broken PCSK9.
They have low cholesterol and less heart disease.
  • Drug companies designed PCSK9 inhibitors
  • Essentially: "PTVs in a pill"
  • Constraint data → therapeutic opportunities
§ 7

Summary

What to take away

  • Haploinsufficiency = one copy is not enough · dose-sensitive
  • Healthy people carry ~100 PTVs · mostly in tolerant genes
  • Purifying selection removes harmful PTVs over generations
  • pLI ≥ 0.9 or LOEUF < 0.35 → likely haploinsufficient
  • SCN2A: 165.8 expected, 18 observed, pLI 1.0
  • Constraint scores → diagnosis · drug targets · therapy
Next lecture

What if one bad copy
is tolerated, and it takes
two to cause disease?

Chapter 12 · Recessive Alleles