A Measured Approach to AI+Biosecurity: The "If-Then" Framework from the 2025 NASEM Report, The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations
In theory, I'm totally on board with the monitor-and-respond approach vs. "no biology for anyone because we're afraid some evil person might do evil things." But I don't think your footnote 1 supports the claim you make about AI overall. It may be that general-purpose LLMs like Claude Opus 4.5 can't design a novel pathogen, but biological design tools like AlphaFold and Evo can design novel proteins predicted to hit the same biological targets as known toxins,[1] and Evo has designed an entire working phage genome from scratch.[2]
Also, it seems like the dataset usage test approach is still targeting capabilities, when what we really want to target is usage. Hopefully the intent is closer to verifying whose key is accessing the dataset and being able to trace that access if more red flags show up from the same source, i.e. a systematic "know your customer" approach (rough sketch of what I mean below the footnotes).
[1] https://www.science.org/content/article/made-order-bioweapon-ai-designed-toxins-slip-through-safety-checks-used-companies
[2] https://www.biorxiv.org/content/10.1101/2025.09.12.675911v1
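To make the "know your customer" idea concrete, here's a rough sketch of the kind of keyed-access gateway I have in mind. Everything here (the class name, fields, hashing scheme, thresholds) is invented for illustration; it's not anything the report actually specifies:

```python
# Toy "know your customer" gate for a sensitive dataset (illustrative only).
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessGateway:
    # SHA-256 hash of an issued key -> verified institution / named researcher
    registered_keys: dict[str, str]
    audit_log: list[dict] = field(default_factory=list)

    def request(self, api_key: str, dataset_id: str) -> bool:
        key_hash = hashlib.sha256(api_key.encode()).hexdigest()
        entry = {
            "when": datetime.now(timezone.utc).isoformat(),
            "key_hash": key_hash,
            "dataset": dataset_id,
            "identity": self.registered_keys.get(key_hash),
        }
        self.audit_log.append(entry)          # every attempt is recorded, granted or not
        return entry["identity"] is not None  # only verified identities get the data

    def history_for(self, api_key: str) -> list[dict]:
        # If red flags show up elsewhere, pull everything this key has touched.
        key_hash = hashlib.sha256(api_key.encode()).hexdigest()
        return [e for e in self.audit_log if e["key_hash"] == key_hash]


gateway = AccessGateway(
    registered_keys={hashlib.sha256(b"lab-123").hexdigest(): "Example University, Dr. A"}
)
print(gateway.request("lab-123", "tox-seq-v2"))   # True: known key, logged
print(gateway.request("anon-999", "tox-seq-v2"))  # False: unknown key, still logged
print(gateway.history_for("lab-123"))
```

The point is just that the audit trail exists whether or not access is granted, so a key's activity can be investigated retroactively rather than only gated up front.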
The "if-then" framework basically makes it so we don't overreact and shut down legit research just because something could theoretically be dangerous later. Using dataset availability as an early warning system is pretty clever since you can't train models without good data. Feels way more practical than trying to guess every future risk scenario and locking everything down preemptively.
Hey, great read as always. What specific "if" scenarios does the report lay out?
The dataset availability angle is smart because it sidesteps the whole "regulate everything now just in case" trap. But I'm curious how fast monitoring systems can actually scale when new datasets drop, especially if they're scattered across different institutions. If something goes from niche to widely available in weeks, does this framework have enough runway to catch capabilities before they're deployed broadly?