EVO: The AI Model Decoding the Blueprint of Life — Genomics, Drug Discovery, and What's Next

2026-01-21 Hamamoto

Patrick Hsu and Arc Institute have developed EVO, a DNA-level foundation model trained on genomes from across the evolutionary record. This article explores how EVO is reshaping genomics, drug discovery, and the interpretation of disease-causing genetic variants, and what the next decade of AI-biology integration looks like.

This is Hamamoto from TIMEWELL.

AI Decodes the Blueprint of Life

The rapid advancement of AI is reshaping not just industry but science itself. In biology, machine learning is becoming the key to unlocking the fundamental mysteries of life — accelerating the development of new treatments, diagnostic tools, and our basic understanding of how living systems work.

Patrick Hsu, a pioneer of CRISPR genome editing technology and co-founder of Arc Institute, is working at exactly this frontier. His team has developed EVO, a biological foundation model that learns from DNA itself. EVO can interpret and generate genomic sequences in ways that were not previously possible.

This article examines how AI is transforming genomics, drug discovery, and life science research — and what EVO specifically makes possible.

Topics:

  1. AI and biology converge: computational biology and the challenge of genomic interpretation
  2. EVO: the DNA foundation model and its capabilities
  3. Drug discovery to basic science: the future of AI-accelerated research and Arc Institute's mission
  4. Summary

Part 1: The Genomic Interpretation Problem

Computational biology has long wrestled with a central challenge: we can read the genome, but we struggle to interpret what we read.

Services like 23andMe give individuals access to their own genetic data. Medical institutions sequence patients as part of diagnostic workups. The result is an enormous volume of genomic data and a frustrating reality: most of the genetic variants found are classified as "Variants of Unknown Significance" (VUS). Scientists have a self-deprecating phrase for this: "We have no idea what's happening here."

Some variants cause serious hereditary diseases — muscular dystrophy, cystic fibrosis, hereditary breast and ovarian cancer. But most variants' effects are unknown. Interpreting VUS is precisely where AI foundation models of the EVO type can contribute.

AI's role in VUS interpretation: EVO can predict whether a mutation in a gene is likely to be pathogenic, and Hsu's team reports that it outperforms existing methods at this task. For a gene like BRCA1 (associated with hereditary breast and ovarian cancer risk), EVO can evaluate not just known benign and pathogenic variants but also previously uncharacterized VUS.

This has direct clinical implications: a patient who receives a VUS result from genetic testing currently faces genuine uncertainty about whether preventive interventions (such as prophylactic surgery) are warranted. An AI model that can classify those variants with high confidence changes that clinical reality.

Beyond drug discovery: Hsu is emphatic that drug discovery, while important, represents a relatively small portion of what "AI for Bio" can achieve. His scientific focus is on applying AI to biology's existing "unified theory" — evolution. Where physicists are still searching for a unified theory, biology already has one: evolution acts on life at every scale, from ecosystems to molecules. Incorporating evolutionary principles into AI models creates the foundation for genuinely deep understanding of biological systems.

EVO's name derives directly from this: the model is named for evolution itself.

Part 2: EVO — The DNA Foundation Model

What EVO Does

EVO is an autoregressive model, meaning it is trained to predict the next element in a sequence. Just as a large language model learns to predict the next word and through that task acquires grammar, semantics, and world knowledge, EVO learns to predict the next base in a DNA sequence. Because proteins and gene order are encoded in that DNA, doing this well implicitly requires anticipating the next amino acid in a protein or the next gene in a genomic arrangement.

Through this seemingly simple task, the model learns the "molecular logic" of biology — the high-level patterns that encode how life functions.
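To make the next-element objective concrete, here is a minimal Python sketch. It is not EVO's architecture or code: a tiny count-based model over the previous k bases stands in for the neural network, and the names `train_markov`, `next_base_probs`, and `log_likelihood`, as well as the training string, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

BASES = "ACGT"

def train_markov(sequences, k=2, pseudocount=1.0):
    """Count which base follows each k-base context (toy stand-in for a neural model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return {"k": k, "counts": counts, "pseudocount": pseudocount}

def next_base_probs(model, context):
    """Return P(next base | last k bases), smoothed so no probability is zero."""
    ctx = context[-model["k"]:]
    seen = model["counts"].get(ctx, Counter())
    total = sum(seen.values()) + model["pseudocount"] * len(BASES)
    return {b: (seen[b] + model["pseudocount"]) / total for b in BASES}

def log_likelihood(model, seq):
    """Sum of log P(base | preceding bases) over the sequence: the autoregressive objective."""
    k = model["k"]
    return sum(math.log(next_base_probs(model, seq[:i])[seq[i]])
               for i in range(k, len(seq)))

# Made-up training data: a simple repeating pattern, not a real genome.
model = train_markov(["ACGT" * 10], k=2)
probs = next_base_probs(model, "ACGTAC")
print(max(probs, key=probs.get))  # prints G, the base that always followed "AC"
```

A real DNA foundation model replaces the count table with a deep network and a far longer context, but the training signal is the same: assign high probability to the base that actually comes next.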

EVO uses a hybrid architecture that combines convolution-based sequence layers with attention, designed to handle the long contexts that genomic sequences require.

What EVO Can Do

VUS functional prediction: EVO predicts pathogenicity for genetic variants with state-of-the-art accuracy, using databases like ClinVar as ground truth for evaluation. Hsu's team invested significant effort in building rigorous evaluation frameworks — because, as in AI generally, model quality is only as good as the benchmark used to assess it.

CRISPR system design: EVO can be applied to designing new CRISPR genome editing components — new CRISPR-associated proteins and guide RNAs that edit specific genes more efficiently and accurately.

Zero-shot capabilities: EVO demonstrates "zero-shot" performance on tasks it was not explicitly trained for — a hallmark of genuine generalization rather than task-specific memorization.

Open ecosystem: EVO is open-source. Researchers worldwide can access, use, and build on it — creating an "app store" of biological applications on top of a shared foundation model.
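The variant-scoring and benchmarking ideas in this section can be sketched in Python. One common way to use an autoregressive sequence model zero-shot is to compare its log-likelihood for a mutated sequence against the reference sequence, then check how well those scores separate labeled pathogenic and benign variants. Whether EVO's own evaluation works exactly this way is an assumption here. The `toy_log_likelihood` function, the consensus sequence, the conservation values, and the variant labels are all invented stand-ins, not EVO outputs or ClinVar records.

```python
import math

# Hypothetical stand-in for a trained model: each position has a consensus
# base and a made-up probability of observing that base.
CONSENSUS = "ATGGCCAAGT"
CONSERVATION = [0.97, 0.97, 0.97, 0.40, 0.40, 0.90, 0.40, 0.90, 0.40, 0.40]

def toy_log_likelihood(seq):
    """Log-probability of seq under the toy per-position model."""
    total = 0.0
    for base, ref, p in zip(seq, CONSENSUS, CONSERVATION):
        total += math.log(p if base == ref else (1.0 - p) / 3.0)
    return total

def delta_log_likelihood(log_likelihood, reference, position, alt_base):
    """Score a single-base substitution as loglik(mutant) - loglik(reference)."""
    mutant = reference[:position] + alt_base + reference[position + 1:]
    return log_likelihood(mutant) - log_likelihood(reference)

def auroc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented variants: (position, alternate base, label), 1 = pathogenic, 0 = benign.
variants = [(0, "C", 1), (2, "A", 1), (5, "T", 1),
            (3, "A", 0), (6, "G", 0), (8, "C", 0)]

# A larger drop in likelihood is treated as more likely pathogenic.
scores = [-delta_log_likelihood(toy_log_likelihood, CONSENSUS, pos, alt)
          for pos, alt, _ in variants]
labels = [label for _, _, label in variants]

# The toy labels were constructed to match the conservation values,
# so the perfect score here is by design and says nothing about EVO.
print(f"AUROC on toy data: {auroc(scores, labels):.2f}")  # AUROC on toy data: 1.00
```

The key design point is that no pathogenicity labels are used to fit the scorer. Labels appear only at evaluation time, which is what makes the setup zero-shot and why the quality of the benchmark matters so much.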

Training Data

EVO's training data comes from public biological databases (such as the Sequence Read Archive) — the accumulated scientific output of over 25 years of genomic sequencing. The data covers bacteria, viruses, humans, primates, fish, insects — a "Noah's Ark" of biological diversity.

This means EVO is trained on the results of evolution's grand experiment: the genomic diversity between individual humans, between species, between entire domains of life. Every evolutionary selection pressure that shaped those genomes is embedded in EVO's training data.

EVO vs. AlphaFold

AlphaFold's protein structure prediction represents an important complementary approach. The sequence-to-structure-to-function paradigm, which picks up where the central dogma (DNA to RNA to protein) leaves off, is scientifically elegant, and structural information is genuinely valuable.

EVO's approach is different: it attempts to predict function directly from DNA sequence, without necessarily going through the intermediate step of structure. This is advantageous when structure is unknown, or when evaluating non-coding regulatory DNA regions that don't map to protein structure but still affect gene expression.

Challenges in Using Biological Language Models

Hsu offers a vivid analogy for the current state of the field: using EVO is like reading a text that is 99% Russian and 1% English — or like speaking DNA's language with an extremely strong accent. Understanding and interpreting model outputs requires annotation specialists and interpretability techniques.

The "prompt engineering" for biological language models is still in its earliest stages. Developing the equivalent of sophisticated natural language prompting techniques for genomic models is an important unsolved problem — and one that requires collaboration across biology, computational science, and AI research.

Part 3: The Future of AI-Accelerated Biology

AI Agents for Scientific Research

If 2026 is "the year of AI agents" in business, Hsu expects the same to be true in science. Not just AI that interprets molecules, but AI that supports the meta-level scientific workflow: hypothesis generation, literature review, experimental design, data analysis, result interpretation, and writing.

Arc Institute has already demonstrated this in practice. Its recently released "Virtual Cell Atlas", one of the world's largest single-cell datasets, was built in significant part by AI agents that automatically crawled public databases and structured unorganized metadata. Work that would have taken skilled bioinformaticians years was completed by a small team using AI assistance at scale.

This experience has convinced Hsu that AI agents can dramatically increase both the scale and efficiency of scientific research. Fully automated end-to-end scientific discovery loops remain a future goal; AI-assisted research cycles are happening now.

Arc Institute's Mission

Arc Institute connects the major research universities of the Bay Area (Stanford, UC Berkeley, UCSF) with the biotechnology and technology sectors. Its CTO, Dave Burke, was previously an engineering leader at Google.

The goal is not merely to publish in top journals — though rigorous publication is essential for community impact. The goal is to produce research that is "touchable": technology platforms that many people can use, and approaches that actually change how diseases are diagnosed and treated.

This requires a different organizational model than traditional academia: close collaboration between academic principal investigators and industry-trained technical staff (software engineers, data scientists, operations specialists), with "bilingual" individuals who can bridge the disciplinary cultures.

Predictions: What AI-Biology Integration Delivers

By 2025–2026: Complete antibody drugs (IgG) designed computationally from a target protein specification. De novo enzyme design reaching practical maturity. (These remain largely protein-centric.)

By 2030: "Virtual cell" models precise enough to impress cell biologists — enabling better target selection and treatment efficacy prediction in drug development.

By 2050 (or earlier): Scientific superintelligence: fully integrated wet lab / AI systems with self-improvement cycles that dramatically compress the timeline from hypothesis to validated therapy.

Hsu notes that some challenges will persist: predicting toxicity, long-term effects, and bridging the gap between mouse models and humans. More human-derived experimental data — ethically obtained through approaches like ex vivo organ perfusion — will be needed to close this gap.

Personalized Medicine

Longer-term, Hsu envisions "AI physicians" that integrate an individual's genomic data, wearable physiological data (blood glucose, heart rate, sleep patterns), and laboratory results to deliver personalized health recommendations — diet, exercise, supplementation — calibrated to individual genetics and lifestyle.

The fundamental equation G × E = P (genotype × environment = phenotype) applied at the individual level, in real time, is the direction of travel. Current tools collect these data points in isolation; AI integration that treats them as a unified system is the next step.

Summary

EVO and the broader AI-for-biology field represent a transformation of scientific methodology itself — not just a new tool but a new way of doing science.

Key points:

  • EVO learns from DNA directly, capturing evolutionary information accumulated over billions of years
  • VUS interpretation can change clinical decisions for hereditary disease patients
  • CRISPR design applications may accelerate genome editing tool development
  • AI agents are already being used to build large-scale biological data resources that were previously impractical
  • The field is still early — "biological prompt engineering" remains an open problem
  • Hsu predicts precise virtual cell models by 2030 and transformative acceleration of drug discovery this decade

The convergence of AI and biology is just beginning. Its impact on our understanding of health, disease, and life itself is likely to exceed what most people currently anticipate.

Reference: https://www.youtube.com/watch?v=v-_58dabswU
