BoltzGen Paper Review

Introduction

After reading AlphaProteo, one question naturally lingers: “Can this level of performance be reached with publicly available tools?” AlphaProteo was a strong performance benchmark, obtaining binders for 7 of 8 targets and showing sub-nanomolar affinity on several of them, but neither the method nor the model was released. That leaves a gap in the field. A closed system shows how far things have come, but researchers cannot easily take the pipeline apart, modify it, or reuse it.

BoltzGen is a good paper to place on the other side of that gap. The title alone is ambitious: “BoltzGen: Toward Universal Binder Design.” The authors say they will handle protein binders, nanobodies, linear peptides, disulfide-cyclic peptides, and small-molecule binders with a single all-atom generative framework and a design specification language. The more important difference is openness. The paper states that the model weights, data, inference code, training code, and designs are released under the MIT license.

In this post I will not reduce BoltzGen to an “open AlphaProteo.” The two papers belong to the same binder design field, but they sit on different comparison axes. AlphaProteo is a validation benchmark for a closed, high-performance, target-conditioned binder system and shows strong post-filter wet-lab performance; BoltzGen shows how far an open all-atom platform can go across multiple modalities. Its performance claim is also hard to summarize in one line. The breadth of 26 targets across 8 wet-lab campaigns is impressive, but the assay, affinity range, functional readout, and specificity controls differ from campaign to campaign. It is therefore safer to read the paper not as “universal binder design is solved” but as an account of how much experimental evidence an open platform has attached across modalities.

BoltzGen as a binder design platform

Existing binder design work has mostly been split by modality. RFdiffusion is the public milestone for backbone/interface generation, and BindCraft follows as a practical miniprotein binder pipeline. RFdiffusion-Antibody and the DiffAb family handle antibody/VHH framework constraints separately. Covering small-molecule binders and cyclic peptides in the same model is harder still.

This is where BoltzGen's motivation starts. A real discovery campaign does not end at “make me a protein that binds this target.” It comes with constraints: which surface of the target to bind, which regions to avoid, whether the peptide should be cyclic, whether to include a disulfide bond, whether the nanobody framework is fixed, and how long the CDR loops should be. The authors express these conditions in a design specification language and have a single all-atom diffusion model perform both structure prediction and design.

At this point BoltzGen reads less like a single-model paper and more like a platform paper. The core generation model, specification language, refolding/evaluation, ranking/filtering, diversity-aware selection, and wet-lab handoff are bundled together. So the performance numbers are best read not as “the model emitted raw samples” but as “a large candidate pool was generated and passed through several filters before experimental candidates were chosen.”

Generating sequence and structure together in all-atom diffusion

The technical core of BoltzGen is how it handles designed residues. In typical protein design, sequence is a discrete token and structure is a continuous coordinate, and tying the two together in joint generation complicates the modeling. BoltzGen does not generate residue identity directly as a discrete amino acid token; instead it represents each designed residue with a fixed 14-atom representation. The residue type is encoded by the pattern of virtual atoms superposed on the backbone N, C-alpha, C, and O, and the remaining atoms are interpreted as side-chain atoms.
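
To make the 14-slot idea concrete, here is a minimal sketch of the bookkeeping, not BoltzGen's actual code. The heavy-atom counts are standard chemistry; the function names, the 0.5 Å tolerance, and the way the toy alanine spreads its virtual atoms over CA and O are assumptions made purely for illustration. The paper's real lookup from virtual-atom pattern to residue type is not reproduced here.

```python
import math

# Heavy-atom counts (backbone N, CA, C, O plus side chain, no OXT)
# for the 20 standard amino acids.
HEAVY_ATOMS = {
    "GLY": 4, "ALA": 5, "SER": 6, "CYS": 6, "VAL": 7, "THR": 7, "PRO": 7,
    "LEU": 8, "ILE": 8, "ASN": 8, "ASP": 8, "MET": 8, "GLN": 9, "GLU": 9,
    "LYS": 9, "HIS": 10, "PHE": 11, "ARG": 11, "TYR": 12, "TRP": 14,
}
N_SLOTS = 14
BACKBONE = ("N", "CA", "C", "O")


def n_virtual_atoms(resname):
    """Slots left over once the residue's real heavy atoms are placed."""
    return N_SLOTS - HEAVY_ATOMS[resname]


def virtual_atom_pattern(coords, tol=0.5):
    """Count the extra atoms that sit on top of each backbone atom.

    coords: 14 (x, y, z) tuples; the first four are N, CA, C, O.
    tol: distance in angstroms below which an atom counts as superposed
         (hypothetical threshold chosen for this sketch).
    """
    if len(coords) != N_SLOTS:
        raise ValueError("expected exactly 14 atom slots")
    backbone = coords[:4]
    pattern = {name: 0 for name in BACKBONE}
    for atom in coords[4:]:
        for name, ref in zip(BACKBONE, backbone):
            if math.dist(atom, ref) <= tol:
                pattern[name] += 1
                break  # an atom is assigned to at most one backbone anchor
    return pattern


# Toy alanine: 5 real atoms, then 9 virtual atoms placed arbitrarily
# (3 on CA, 6 on O) just to exercise the counting.
n, ca, c, o = (0.0, 0.0, 0.0), (1.46, 0.0, 0.0), (2.0, 1.4, 0.0), (1.4, 2.4, 0.0)
cb = (2.0, -0.8, 1.2)
toy_ala = [n, ca, c, o, cb] + [ca] * 3 + [o] * 6
print(virtual_atom_pattern(toy_ala), n_virtual_atoms("ALA"))
```

Several residues share the same heavy-atom count (Leu, Ile, Asn, Asp, and Met all have 8), so the number of virtual atoms alone cannot identify a residue. That is why the distribution of virtual atoms over the four backbone anchors matters as the distinguishing signal.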

The advantage of this idea is that sequence and structure can be handled within the same continuous all-atom diffusion objective. The model denoises the all-atom complex of target and binder, and in doing so generates the geometry corresponding to the binder's backbone, side chains, and residue identity together. The architecture follows the trunk and diffusion module layout of the Boltz-2 / AlphaFold3 family. Non-designed entities, including protein, RNA, DNA, small molecules, and covalent modifications, enter as atom/token representations, while designed entities are produced during diffusion.

This explanation matters for understanding BoltzGen, but there is one thing readers should hold onto alongside it. All-atom generation does not by itself yield binding ground truth. The paper, too, applies filters after generation, such as Boltz/Boltz-2 refolding agreement, interface confidence, hydrogen bonds and salt bridges, delta SASA, affinity estimates, rank aggregation, and diversity selection. The all-atom representation is a tool for producing richer candidates; wet-lab success can only be interpreted once the downstream filtering and validation are included.
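
The ranking and selection step can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's pipeline: the metric names (`refold_rmsd`, `iptm`), the worst-rank aggregation rule, the crude position-wise identity, and the 0.8 identity cutoff are all hypothetical choices.

```python
def rank_aggregate(candidates, metrics):
    """Order candidates by their worst per-metric rank (lower is better).

    candidates: list of dicts with an "id" and one value per metric.
    metrics: dict mapping metric name -> True if higher values are better.
    """
    worst = {c["id"]: 0 for c in candidates}
    for name, higher_is_better in metrics.items():
        ordered = sorted(candidates, key=lambda c: c[name], reverse=higher_is_better)
        for rank, cand in enumerate(ordered):
            worst[cand["id"]] = max(worst[cand["id"]], rank)
    return sorted(candidates, key=lambda c: worst[c["id"]])


def identity(a, b):
    """Crude position-wise identity; a real pipeline would align first."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def select_diverse(ranked, k, max_identity=0.8):
    """Greedily keep top-ranked candidates that are not near-duplicates."""
    chosen = []
    for cand in ranked:
        if len(chosen) >= k:
            break
        if all(identity(cand["seq"], c["seq"]) <= max_identity for c in chosen):
            chosen.append(cand)
    return chosen


# Made-up candidates, for illustration only.
pool = [
    {"id": "a", "seq": "AAAAAAAAAA", "refold_rmsd": 1.0, "iptm": 0.90},
    {"id": "b", "seq": "AAAAAAAAAG", "refold_rmsd": 1.2, "iptm": 0.85},
    {"id": "c", "seq": "GGGGGKKKKK", "refold_rmsd": 2.0, "iptm": 0.70},
]
ranked = rank_aggregate(pool, {"refold_rmsd": False, "iptm": True})
picked = select_diverse(ranked, k=2)
print([c["id"] for c in picked])
```

Candidate `b` ranks second but is 90% identical to `a`, so the diversity step skips it and takes `c` instead. The final experimental set is shaped by the selection rule as well as by the generator, which is the point of reading the wet-lab numbers as post-filter results.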

Figure 1: Reading the 26-target number by breaking it down

Figure 1 of BoltzGen is ambitious. The authors present wet-lab validation on 26 targets across 8 campaigns. These include a novel protein/nanobody target panel, bioactive peptide binders, binders to disordered NPM1, RagC and RagA:RagC peptides, nanobody yeast-display examples, small-molecule binders, GyrA antimicrobial peptides, and benchmark targets.

This breadth is BoltzGen's greatest strength. Rather than producing good results in a single modality, the same framework touches protein binders, nanobodies, peptides, cyclic peptides, and small-molecule binders. For an open platform that breadth is significant: being able to write down diverse binder design problems in one specification language has practical value.

But the phrase “26 targets validated” should not be read as a single kind of evidence. Some campaigns measured nM affinity by SPR/BLI, some looked at a live-cell localization proxy, some observed weak micromolar binding by fluorescence, and GyrA was assessed through in-cell growth inhibition and alanine-mutant specificity. All of it is wet-lab evidence, but not validation of the same depth. When reading this paper, it is safer to separate assay endpoint and follow-up depth first, before counting targets.

Results on 9 novel targets

The cleanest strong claim in BoltzGen is the 9 novel target campaign. The authors explain that they chose monomeric targets for which no protein in a bound context anywhere in the PDB exceeds 30% sequence identity. For each target they generated 60,000 nanobodies and 60,000 protein binders, and after filtering and selection tested up to 15 candidates per modality by SPR/BLI.

For protein binders, nM binders were obtained for 6 of the 9 targets. For targets such as AMBP, HNMT, IDI2, MZB1, PHYH, and PMVK, binders are reported from the 10 nM range up to several hundred nM. MZB1 yields multiple nM binders, and PMVK shows binders at 10–12 nM.

For nanobodies, nM binders were also obtained for 6 of the 9 targets, including PHYH at 7.8 nM, PMVK at 6.1/9.1/13 nM, and RFK at 8.8/18 nM. Nanobody design differs from general miniprotein binder design: the framework is fixed, the CDR regions must be designed, and the structural and developability constraints specific to antibodies/VHHs apply. That BoltzGen reports 6/9 target-level success for both protein binders and nanobodies is strong evidence.

Here too, separating the denominators comes first. The 6/9 figure is target-level success, which is different from a per-candidate hit rate. It does not mean that 6/9 of 60,000 raw generations were binders. It means that after large-scale generation, filtering, and selection, testing up to 15 candidates yielded at least one nM binder for 6 of the 9 targets. Missing this distinction leads to overestimating BoltzGen's performance. Conversely, the practical significance of the result remains even after granting it. Obtaining nM binders for more than half of the novel targets after narrowing the experimental candidates to 15 or fewer is a result that is hard to dismiss for an open platform.
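
The two denominators can be written out as a small worked calculation. It uses only the numbers quoted above; the variable names are my own. Note what it cannot compute: a per-candidate hit rate would need the number of binders among the tested designs for each target, which this summary does not list.

```python
from fractions import Fraction

generated_per_modality = 60_000  # designs generated per target, per modality
tested_per_modality = 15         # upper bound on designs tested per target, per modality
targets = 9
targets_with_nm_binder = 6       # same count reported for protein binders and nanobodies

# Fraction of the generated pool that ever reaches the wet lab (upper bound).
tested_fraction = Fraction(tested_per_modality, generated_per_modality)

# Target-level success: at least one nM binder among the tested designs.
target_level_success = Fraction(targets_with_nm_binder, targets)

print(f"tested fraction of generated designs: at most {float(tested_fraction):.3%}")
print(f"target-level success: {float(target_level_success):.1%}")
```

At most 1 in 4,000 generated designs is tested, so the 6/9 figure describes the whole generate-filter-select pipeline rather than the raw sampler.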

Peptide and cyclic peptide results

BoltzGen does not stop at proteins and nanobodies; it moves on to peptide binders. Protegrin, melittin, and indolicidin were chosen as bioactive peptide targets, and 6 designs per target were tested. For melittin, some designs neutralized antimicrobial activity and hemolysis, and mel2/mel3 showed micromolar affinity by intrinsic tryptophan fluorescence. For indolicidin, all 6 designs showed detectable binding, and indo4 showed sub-micromolar affinity (nanomolar by SPR confirmation) together with robust neutralization. Protegrin also yields cases of micromolar binding and antimicrobial neutralization.

These results are interesting, but they are hard to read in the same terms as the protein/nanobody nM binder campaign. For peptide targets, binding assays, aggregation/SEC behavior, and antimicrobial neutralization do not fully overlap. Some candidates appear to bind weakly yet show functional neutralization, and for others the biophysical behavior is tricky to interpret. This part works well as a proof of concept for BoltzGen's modality breadth, but it does not by itself support the claim that the method “generally produces high-affinity peptide binders.”

The RagC and RagA:RagC peptide campaigns are similar. For RagC linear peptides, 10,000 designs were generated, 29 were tested by SPR, and 7 binders were found, with a best affinity of 3.5 µM. For RagA:RagC disulfide-cyclic peptides, 50,000 designs were generated, 24 were tested, and 14 showed a specific binding signal, with resolved affinities in the 80–1100 µM range. “Specific binders found” is accurate here, but the affinity level is different from the nM binder campaign. BoltzGen's strength is that it attempted many constraints and modalities on one platform; it is hard to read this as showing the same level of potency in every modality.

NPM1 and GyrA: evidence beyond binding assays

The NPM1-c mutant campaign is of yet another kind. The authors generated 20,000 peptide designs targeting a disordered region and tested 5 of them in live U2OS cells by GFP fusion localization. One of them, NPM1-binder-4, showed nucleolar localization and colocalization with endogenous NPM1.

This result is interesting in that it shows something resembling target engagement inside cells. But it is a different kind of evidence from SPR/BLI affinity or structural pose validation. Nucleolar localization and colocalization indicate that the designed peptide enters an NPM1-associated cellular compartment, but they do not directly show in which pose it binds the intended disordered region. The paper itself leans on generated structures and predictions as the main support for this part.

The GyrA campaign is on the side of strong functional readouts. Peptides of 10–50 residues were designed against the GyrA C-gate closure interface, and 1,808 were tested in parallel in an E. coli growth inhibition assay. Of these, 352 (19.5%) inhibited growth at least 4-fold. Stopping there, however, would give an overly broad functional hit rate. To check whether the activity is tied to the designed interface, the authors compared each design with a mutant in which target-proximal residues were replaced by alanine; under a delta inhibition ≥2 criterion, 54 peptides (3.0% of the total) are interpreted as interface-specific inhibitors.
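
The two-stage readout can be sketched as follows. The counts come from the paragraph above; the classifier is an assumption-heavy illustration. In particular, this summary does not define delta inhibition, so the sketch assumes it is the designed peptide's fold inhibition minus that of its alanine mutant, and it assumes the specificity test is applied only to peptides that already pass the 4-fold cutoff.

```python
TESTED = 1808
GROWTH_HITS = 352    # at least 4-fold growth inhibition
SPECIFIC_HITS = 54   # delta inhibition >= 2 versus the alanine mutant


def classify(designed_fold, mutant_fold, min_fold=4.0, min_delta=2.0):
    """Label one peptide from two fold-inhibition measurements.

    ASSUMPTION: delta inhibition = designed_fold - mutant_fold.
    ASSUMPTION: specificity is only evaluated for peptides that are active.
    """
    if designed_fold < min_fold:
        return "inactive"
    if designed_fold - mutant_fold >= min_delta:
        return "interface-specific"
    return "active, not interface-specific"


growth_rate = GROWTH_HITS / TESTED
specific_rate = SPECIFIC_HITS / TESTED
print(f"growth inhibition hits: {growth_rate:.1%}")
print(f"interface-specific hits: {specific_rate:.1%}")

# Made-up measurements, for illustration only.
print(classify(8.0, 1.5))
print(classify(8.0, 7.0))
print(classify(2.0, 1.0))
```

A peptide whose alanine mutant inhibits growth almost as strongly as the design itself is still "active" but gives no evidence that the designed interface is responsible, which is why the 3.0% figure is the more conservative one to quote.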

This campaign shows BoltzGen's strengths and limits at the same time. Finding dozens of active peptides in an in-cell functional screen is impressive. At the same time, the evidence rests on growth inhibition plus mutant specificity, not on purified binding affinity or high-resolution pose validation. A functional readout is closer to the actual biological effect, but the structure of the assay has to be kept in view when narrowing down claims about mechanism.

Small-molecule binders: closer to proof of concept

BoltzGen also covers small-molecule binders. Binders were designed for rucaparib and an undisclosed rhodamine derivative, and micromolar affinity is reported based on fluorescence shift or polarization. The rucaparib binders fall roughly in the 43–151 µM range, and the rhodamine derivative binders in the 31–252 µM range.

These results are meaningful as a proof of concept that “an all-atom generative platform can reach beyond protein-protein binders to small molecules.” They are hard to read, however, as a claim of low-nanomolar small-molecule binders or drug-like binding protein design. In particular, when comparing with the ligand-conditioned design cases of RFAA/RFdiffusionAA, it is safer to separate assay endpoint, affinity range, and the presence or absence of structural validation.

Benchmark targets and the memorization caveat

The paper also tests benchmark targets such as PD-L1, TNF-alpha, PDGFR, IL-7R-alpha, and InsulinR. It reports nM binders for 4 of the 5 targets for both protein binders and nanobodies, with some very high-affinity cases on PDGFR.

The authors themselves note, however, that for many of these targets known binders are present in public datasets or the training data, which limits their value as generalization evidence. These results can be viewed as a performance showcase, but as evidence representative of a novel discovery scenario the 9 novel target panel carries more weight.

Another important caveat is ubiquitin memorization. The authors write that for protein binder designs of length 73–76, BoltzGen suffers a diversity collapse on certain targets, sampling almost nothing but ubiquitin. With an open model, failure patterns like this are also worth examining. In practice, using the model looks like a hands-on workflow: inspect the generated structures, adjust the target, length, and specification, and repeat filtering and selection. The universal binder design the paper describes is less a plug-and-play magic button and more an open design platform in which many modalities can be specified.

Structural validation is relatively thin

BoltzGen has broad wet-lab coverage, but unlike AlphaProteo or some RFdiffusion cases, it is not a paper in which high-resolution validation of the designed pose takes center stage. The novel target protein/nanobody campaign is strong on SPR/BLI affinity, and GyrA is strong on functional screening, but cases where the designed pose is reproduced by cryo-EM or X-ray do not stand out within the scope of this summary.

This is less a criticism that lowers the paper's value and more a matter of placing the evidence layers correctly. Binding, function, specificity, and pose validation are different layers of evidence. Binding affinity strengthens the evidence that something is a binder. A functional readout adds evidence of a biological effect. Whether the designed interface is realized as intended in the actual structure is a separate question. This distinction also matters when using BoltzGen as a platform: even when prediction/refolding agreement is good, pose validation remains a proxy without an experimental structure.

Assessment: BoltzGen as a broad all-atom binder platform

BoltzGen puts “universal binder design” in its title. Frankly, the phrase feels a little large. Taken to mean plug-and-play binder generation for every target and modality, it is still far off. Evidence depth differs by campaign, the affinity range is wide, and a large filtering layer sits between target-level success and raw generation success. The small-molecule binders are close to proof of concept, NPM1 is a localization proxy, and for peptides and cyclic peptides both potency and assay interpretation vary by target.

Even so, the reason BoltzGen matters is clear: it bundles an open all-atom model, a design specification language, and multi-modality wet-lab campaigns into one paper. While closed systems set a high performance baseline, BoltzGen shows how much breadth and validation a public platform can secure. In particular, the report of 6/9 target-level nM success for both protein binders and nanobodies on 9 novel targets is strong enough to stand as a headline claim.

In my view, BoltzGen is less “the completion of universal binder design” and more “the starting point of competition among open binder design platforms.” If RFdiffusion opened up backbone/interface generation and AlphaProteo showed the closed performance ceiling, BoltzGen has packaged all-atom generation and a specification language into a public platform and widened the surface the field can experiment on. The key question going forward is this: how reliably does this open platform reproduce in researchers' hands-on pipelines, on which target classes does it fail, and where do filtering and validation become the bottleneck? BoltzGen is a big enough step to raise that question. Without exaggeration, it is not yet a universal solution, but it is an important reference point as an open binder design platform.

References

• Stark et al., “BoltzGen: Toward Universal Binder Design”, bioRxiv, 2025. https://www.biorxiv.org/content/10.1101/2025.11.20.689494v1

• BoltzGen GitHub: https://github.com/HannesStark/boltzgen