CVJ

The Dark Matter of the Genome

Recoding the non-expressed genome into a new generation of molecules

For nearly fifty years, drug discovery focused almost entirely on the ~2% of the genome that codes for proteins. The remaining ~98% (non-coding, non-expressing, and “retired” sequences) was dismissed as junk. Our work is built on a different premise, now supported by more than fifteen years of evidence: this dark matter of the genome is a vast, untapped reservoir of functional molecules that can be computationally decoded, synthetically expressed, and engineered into first-in-class therapeutics, enzymes, and vaccines.

A discovery, eighteen years in the making

The foundation was laid in 2009, when our group provided the first experimental proof that naturally silent intergenic DNA from Escherichia coli can be synthetically expressed into functional proteins. Six non-expressing intergenic sequences were cloned and induced; all six produced protein, one (Eka1) showed clear biological activity, and computational modelling predicted stable globular folds. What began as a single experiment has since become a reproducible, sustained discovery platform.
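The computational side of this idea can be illustrated with a toy example. The sketch below is not the group's pipeline; it only shows, under simplifying assumptions, how gaps between annotated genes might be located and read as hypothetical peptides. The function names, the `min_len` threshold, the toy genome, and the choice to read only the forward strand in frame 0 are all illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def intergenic_regions(genome, genes, min_len=30):
    """Return (start, end, sequence) for gaps between annotated genes.

    `genes` holds 0-based, half-open (start, end) coordinates.
    `min_len` is an arbitrary illustrative cut-off, not a published value.
    """
    regions, cursor = [], 0
    for start, end in sorted(genes):
        if start - cursor >= min_len:
            regions.append((cursor, start, genome[cursor:start]))
        cursor = max(cursor, end)  # tolerate overlapping gene annotations
    if len(genome) - cursor >= min_len:
        regions.append((cursor, len(genome), genome[cursor:]))
    return regions


def translate(dna):
    """Translate frame 0 of the forward strand up to the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Toy genome: two annotated genes flanking one 18-base intergenic gap.
genome = "ATGAAATAA" + "GCTGCTGGTTGGAAATAG" + "ATGCCCTAA"
genes = [(0, 9), (27, 36)]

for start, end, seq in intergenic_regions(genome, genes, min_len=9):
    print(start, end, translate(seq))
```

A real analysis would also consider the reverse strand, all reading frames, and the regulatory elements needed for expression; those steps are omitted here for brevity.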

The untapped reservoir

We organise the dark genome into three functional classes, each a distinct source of new biomolecules:

Conventional biology treats all three as silent. We have shown that their redesigned versions encode functional peptides, proteins, and pathways.

The platform: The Dark Genome

The platform is not a concept; it has delivered functional molecules against major disease classes:

Anti-malarial

Peptides from yeast intergenic sequences blocked more than 60% of Plasmodium falciparum parasites from invading red blood cells.

Anti-Alzheimer's

Antimicrobials

Peptides from E. coli intergenic sequences showed strong activity against both Gram-positive and Gram-negative bacteria.

Vaccines

Epitopes derived from tREPs (tRNA-encoded peptides) and directed against Mamastrovirus and Norovirus showed favourable, stable binding to immune targets, opening a computational route to peptide vaccines.

Anti-leishmanial (tREPs)

tREP-18, a peptide encoded by transfer RNA (tRNA), showed potent anti-leishmanial activity at nanomolar concentrations (IC₅₀ ≈ 22 nM) while remaining safe for human cells. This is the first evidence that tRNA can be repurposed into a therapeutic molecule, and it defines an entirely new class we call tRNA-encoded peptides (tREPs).
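To give a feel for what an IC₅₀ of about 22 nM implies, the sketch below evaluates a simple Hill dose-response model. Only the 22 nM figure comes from the text above; the Hill coefficient of 1, the function name, and the example concentrations are assumptions made for illustration, and the measured dose-response curve may look different.

```python
def fraction_inhibited(conc_nm: float, ic50_nm: float = 22.0, hill: float = 1.0) -> float:
    """Fraction of activity inhibited at a given concentration (Hill model).

    ic50_nm defaults to the reported value; hill = 1.0 is an assumed slope.
    """
    if conc_nm < 0:
        raise ValueError("concentration must be non-negative")
    return conc_nm**hill / (ic50_nm**hill + conc_nm**hill)


# Hypothetical concentrations spanning a tenth to ten times the IC50.
for conc in (2.2, 22.0, 220.0):
    print(f"{conc:6.1f} nM -> {fraction_inhibited(conc):.0%} inhibition")
```

Under this assumed model, inhibition is 50% at 22 nM by definition and rises to roughly 91% at a ten-fold higher concentration.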

Why it matters

For academia

This is Functional Genomics 2.0 — a shift from studying individual genes to treating the entire genome as a design canvas. It reframes how we read evolution, expands the known proteome, and opens new questions about why nature transcribes only a fraction of its own code.

For industry

Traditional pipelines are stalling on derivatives of known drugs, rising resistance, and ballooning R&D costs. The dark genome supplies entirely new molecular starting points — unconstrained by homology or historical annotation — for diseases where current therapies fall short, from drug-resistant infections to neurodegeneration. Because a single genome can be mined for thousands of candidates with AI and quantum tools, the approach also democratises discovery: a fast, data-rich, adaptive pipeline that does not depend on large compound libraries.

Our vision

To build a deep genome foundry that turns the dark matter of the genome into novel medicines, enzymes, and pathways, all originating from the unread instructions already within our own DNA.

References