RFdiffusion3 Paper Review

RFdiffusion2 showed how to condition directly on the catalytic atom geometry of an enzyme active site. RFdiffusion3 asks a broader question: can protein backbone, sidechains, ligands, DNA, and motif atoms all be generated within the same scene? And if so, how much closer does that bring us to biomolecular interaction design?

The RFdiffusion3 paper, “De novo Design of All-atom Biomolecular Interactions with RFdiffusion3”, is a bioRxiv preprint. It has not yet been peer reviewed, and at the time of writing I could not clearly confirm a dedicated, runnable RFdiffusion3 code/weights release from publicly available material. Its performance claims should therefore be read with care.

Even so, the paper is worth a close look, because it makes clear where the RFdiffusion lineage is heading: from a backbone generator toward an all-atom biomolecular interaction generator. That does not mean “it is all-atom, so the problem is solved.” If anything, the paper shows that even with an all-atom representation, sequence design, AF3-style filtering, and assay validation remain decisive.

Limits of backbone-only generation

RFdiffusion was strong at residue-frame backbone generation. The pipeline of generating a new backbone conditioned on a target interface or motif, assigning a sequence with ProteinMPNN, and checking with AF2 that the sequence refolds to the design worked well for many design tasks. But when function depends on atom-level sidechain contacts, a backbone-only representation becomes the bottleneck.

The limit is obvious for DNA major groove recognition, small-molecule pockets, and enzyme active sites. Function is determined by where a residue's sidechain donor/acceptor points, how deeply ligand atoms are buried, and what geometry the catalytic triad adopts. For these problems, a “roughly plausible backbone” is not enough.

RFdiffusion2 tackled this from the enzyme theozyme side: it conditions on catalytic functional-group atoms and ligand/transition-state atoms and builds a scaffold around them. RFdiffusion3 generalizes further. It diffuses protein residues themselves in an atom-level representation and aims to handle non-protein context such as ligands, DNA, small molecules, and motif atoms within the same diffusion scene.

14-atom residue slot

The core representation in RFdiffusion3 is the 14-atom residue slot. Because the number of sidechain atoms differs between amino acids, the paper pads every residue to 4 backbone atoms plus 10 sidechain slots. Slots with no real sidechain atom are filled with a virtual atom placed at Cβ, or at Cα for glycine. This lets the model generate backbone and sidechain-like atom coordinates together even before the sequence has been decided.
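As a rough illustration of the padding idea, the sketch below pads a single residue to 14 coordinate slots. The function name `pad_residue`, the dict-based input, and the slot ordering (backbone first, then sidechain atoms in input order) are my own assumptions for illustration; this review does not cover the paper's actual atom ordering or data structures.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]

BACKBONE = ("N", "CA", "C", "O")  # the 4 backbone atoms
N_SLOTS = 14                      # 4 backbone atoms + 10 sidechain slots


def pad_residue(atoms: Dict[str, Coord]) -> List[Coord]:
    """Pad one residue to 14 coordinate slots (hypothetical helper).

    Slots 0-3 hold the backbone atoms, the next slots hold the real
    sidechain atoms, and any remaining slots are filled with a virtual
    atom at CB, or at CA when the residue has no CB (glycine).
    """
    missing = [name for name in BACKBONE if name not in atoms]
    if missing:
        raise ValueError(f"missing backbone atoms: {missing}")

    slots = [atoms[name] for name in BACKBONE]
    sidechain = [xyz for name, xyz in atoms.items() if name not in BACKBONE]
    if len(sidechain) > N_SLOTS - len(BACKBONE):
        raise ValueError("more than 10 sidechain atoms")
    slots.extend(sidechain)

    # Virtual-atom position used for padding: CB if present, otherwise CA.
    anchor = atoms.get("CB", atoms["CA"])
    slots.extend([anchor] * (N_SLOTS - len(slots)))
    return slots
```

With this layout, alanine (one sidechain atom) ends up with nine padded slots at Cβ, while tryptophan (ten sidechain atoms) fills every slot with a real atom.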

The architecture is described as a transformer-based U-Net close to the AtomWorks framework and the AlphaFold3 diffusion module. It moves between atom-level and residue/token-level features, uses sparse attention and region-based readout, and adopts an EDM-style diffusion framework. According to the supplement, training took about 7 days on 16 NVIDIA H200 GPUs.

The conditioning is broad as well. It includes coordinates of fixed motif or binding-partner atoms, hydrogen-bond donor/acceptor labels, per-atom ligand RASA, center-of-mass conditioning, symmetry, DNA sequence/shape context, and partial or fixed ligand settings. Classifier-free guidance is also used to improve adherence to the conditioning.
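For readers unfamiliar with classifier-free guidance, the sketch below shows the generic form of the technique: the model is evaluated once with and once without the conditioning, and the two predictions are combined with a guidance scale. This is the textbook formulation, not something taken from the RFdiffusion3 paper; the function name `cfg_combine`, the flat-list inputs, and the scale values are assumptions for illustration only.

```python
from typing import List, Sequence


def cfg_combine(cond: Sequence[float], uncond: Sequence[float], scale: float) -> List[float]:
    """Generic classifier-free guidance combination (illustrative only).

    guided = uncond + scale * (cond - uncond)
    scale = 0 gives the unconditional prediction, scale = 1 the conditional
    one, and scale > 1 pushes further toward the conditioning signal.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

In a diffusion sampler this combination would be applied to the denoiser output at every step; how RFD3 applies guidance per condition type is not detailed here.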

There is an important caveat, however. Even though RFdiffusion3 generates atom coordinates, final sequence realization in the paper still relies on ProteinMPNN or LigandMPNN. The reported benchmark numbers do not come from the sequence-like atom identities inferred from RFD3 output alone; they come after downstream sequence design and AF3/RF3-style filtering.

[Figure 1] A change in the unit of generation

Figure 1 is useful for placing RFdiffusion3. Where RFdiffusion generated residue frames, RFdiffusion3 represents each residue with 14 atoms and diffuses the atom coordinates themselves. The figure is best read less as an illustration of the word “all-atom” and more as showing that protein sidechains and ligand/DNA/motif context are handled within the same scene.

The supplement makes clear that RFdiffusion3 is not simply a lightly modified AF3. It combines the EDM diffusion framework with an AtomWorks-style atom/token representation, does not reuse the conventional AF3 sequence-local attention as is, and aims for a leaner architecture through sparse atom/token processing.

[Figure 2] Conditioning examples: DNA, ligand, motif

Figure 2 summarizes the biomolecular context that RFdiffusion3 can take as conditions: hydrogen-bond donor/acceptor labels, ligand RASA, center-of-mass specification, symmetry conditions, and DNA shape/context. These are not mere options; they are closer to a language for expressing which contacts one wants to create in all-atom interaction design.

The DNA tasks also did not come solely from the same data mixture as general protein generation. The supplement lists separate DNA distillation, DNA interfaces, and free DNA sets. The DNA distillation set comes from high-confidence predicted protein-DNA complexes, the DNA interfaces set is a subset of PDB protein-DNA interfaces, and the free DNA set is based on all possible DNA hexamers built with DeepDNAshape and X3DNA-rebuild. It is therefore safer to read the DNA binder results as a proof of principle obtained on a training/input setup that handles DNA interfaces and DNA shape/context separately.

[Figure 3] In silico benchmarks: many tasks, but evidence depth varies

Unconditional generation

The paper first evaluates unconditional protein generation. For designs of length 100–200, it reports that in 98% of cases at least one of 8 ProteinMPNN sequences refolds, by AF3 prediction, to within 1.5 Å RMSD of the design backbone. It also reports a diversity result: 96 generations yield 41 clusters at a TM-score cutoff of 0.5.
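The “at least one of 8 sequences refolds” criterion is easy to misread as a per-sequence rate, so here is a minimal sketch of how such a per-backbone pass rate can be computed. The function name and input layout are hypothetical, and treating “within 1.5 Å” as an inclusive cutoff is my assumption.

```python
from typing import Sequence


def refold_success_rate(rmsds_per_backbone: Sequence[Sequence[float]], cutoff: float = 1.5) -> float:
    """Fraction of backbones with at least one sequence refolding within the cutoff.

    rmsds_per_backbone[i] holds the backbone RMSD (in Å) of each designed
    sequence for backbone i, e.g. 8 values per backbone.
    """
    if not rmsds_per_backbone:
        raise ValueError("no backbones given")
    passed = sum(
        1 for rmsds in rmsds_per_backbone
        if any(r <= cutoff for r in rmsds)  # best-of-N criterion
    )
    return passed / len(rmsds_per_backbone)
```

Because a backbone passes if any one of its sequences passes, this metric is more forgiving than the fraction of individual sequences that refold.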

What this shows is that RFdiffusion3 can produce foldable-looking backbone/atomistic structures. AF3 recapitulation and cluster diversity are, however, a different tier of metric from wet-lab foldability. Whether a protein is actually produced and folds needs separate evidence.

Protein-protein binder benchmark

RFdiffusion3 is also evaluated on protein-protein binder design. RFD3 accepts atom-level rather than residue-level hotspots, and on 5 therapeutically relevant targets the paper reports better AF3-based in silico pass rates than RFD1.

This benchmark is not a binding assay. It is an in silico comparison based on AF3, RMSD, and interface confidence. It should not be read as “RFD3 protein-protein binders were validated in the wet lab”: the paper reports no wet-lab validation of protein-protein binders.

Small-molecule binder benchmark

The small-molecule binder benchmark uses four ligands: FAD, OQO, IAI, and SAM. It distinguishes a fixed-ligand setting from a diffused-ligand setting, and for each target/setting it evaluates backbone RMSD and ligand RMSD using 400 designs, 8 LigandMPNN sequences, and AF3 refolding.

Using RASA conditioning and classifier-free guidance to control ligand burial or hydrogen-bond interactions is methodologically interesting. But this paper contains no wet-lab validation of small-molecule binders. These results should not be confused with the wet-lab binder evidence for digoxigenin, heme, and bilin from RFAA/RFdiffusionAA.

[Figure 4] Wet-lab evidence: DNA binder and cysteine hydrolase

DNA binder: 1/5 proof-of-principle hit

DNA binder generation is one of the most interesting parts of the RFdiffusion3 paper. For three held-out DNA sequences, it generates 100 structures per sequence, assigns 4 LigandMPNN sequences per backbone, and evaluates them with AF3. The metric is a DNA-aligned RMSD: the design and prediction are aligned on DNA phosphate atoms, and the RMSD is then computed over protein Cα atoms.

The experimental pipeline in the supplement is very much a proof of principle. The random target DNA sequence is CGAGAACATAGTCG. Designs of length 50–80 are generated with a fine-tuned RFD3, 8 LigandMPNN sequences are assigned per backbone, filters such as AF3 DNA-aligned protein RMSD < 3 Å and iPTM > 0.8 are applied, and 5 designs are ordered. In yeast display, display expression is read out with anti-c-Myc FITC, and binding with a biotinylated dsDNA target oligo and streptavidin-PE.
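The two thresholds named above can be written as a simple candidate filter. This is only a sketch of those two cutoffs: the `DnaDesign` record and its field names are hypothetical, and the paper's pipeline may apply further filters that are not listed in this review.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DnaDesign:
    """Hypothetical record of AF3 metrics for one designed sequence."""
    name: str
    dna_aligned_rmsd: float  # protein Cα RMSD (Å) after aligning on DNA phosphates
    iptm: float              # AF3 interface pTM


def filter_dna_designs(designs: List[DnaDesign],
                       rmsd_max: float = 3.0,
                       iptm_min: float = 0.8) -> List[DnaDesign]:
    """Keep designs with DNA-aligned RMSD < rmsd_max and iPTM > iptm_min."""
    return [
        d for d in designs
        if d.dna_aligned_rmsd < rmsd_max and d.iptm > iptm_min
    ]
```

For example, a design with RMSD 2.1 Å and iPTM 0.85 is kept, while one with RMSD 2.1 Å and iPTM 0.75 is dropped.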

Of these, DBRFD3 is reported to bind its cognate DNA target. In a no-avidity titration the apparent EC50 is 5.89 ± 2.15 µM, based on three replicates. This is a real binding signal, but it does not broadly establish sequence-specific DNA recognition. A broad specificity panel, high-resolution structural validation of the binding pose, and cellular function are not established in this paper.

Cysteine hydrolase: 35/190 multi-turnover hits

The main functional validation of RFdiffusion3 is a cysteine hydrolase. The target reaction is hydrolysis of 4-methylumbelliferyl phenyl acetate (4MU-PA), and the design uses an active-site motif derived from Ulp-1. The conditioning consists of the Cys-His-Asp catalytic triad, a Gln, the backbone atoms flanking the cysteine, and the substrate geometry of the tetrahedral intermediate.

The supplement makes clear that RFD3 did not produce active enzymes on its own; it worked together with an updated downstream sequence-design pipeline. TI1 is encoded as a non-canonical thioacylated cysteine, and LigandMPNN-AF3 iterative refinement is used. In round 1, 10 LigandMPNN sequences and an AF3 ensemble are run per backbone, followed by filters of pLDDT > 80, ensemble-average pTM > 0.7, and a stereochemistry check. The AF3 TI1 models are then recycled as conditioning structures for LigandMPNN, with distance constraints maintaining the catalytic geometry.
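To make the round 1 filter concrete, here is a small sketch that averages pTM over an AF3 ensemble before thresholding. The function name and arguments are hypothetical, pLDDT is treated as a single per-design value (the review does not say how it is aggregated), and the stereochemistry check is reduced to a boolean computed elsewhere.

```python
from statistics import mean
from typing import Sequence


def passes_round1(plddt: float,
                  ensemble_ptm: Sequence[float],
                  stereochem_ok: bool,
                  plddt_min: float = 80.0,
                  ptm_min: float = 0.7) -> bool:
    """Round 1 filter sketch: pLDDT > 80, ensemble-average pTM > 0.7, stereochemistry OK."""
    if not ensemble_ptm:
        raise ValueError("ensemble_ptm must contain at least one value")
    # pTM is averaged over ensemble members before the threshold is applied.
    return plddt > plddt_min and mean(ensemble_ptm) > ptm_min and stereochem_ok
```

Averaging before thresholding means one weak ensemble member does not reject a design on its own, as long as the mean stays above the cutoff.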

Finally, AF3 is used to check several states, including holo/ES, TI1, and acyl-enzyme, and designs are consolidated by catalytic distances and model confidence. Two 96-design sets were assembled, and after excluding 2 designs that failed DNA synthesis QC, 190 were screened. Of these, 73/190 showed detectable activity and 35/190 were multi-turnover hits. The best design, C6, is reported to have kcat/KM ≈ 3.6 × 10³ M⁻¹s⁻¹.
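The counts above are easier to compare as percentages. The short calculation below only restates the reported numbers; the variable and function names are mine.

```python
SETS = 2               # two 96-design sets
DESIGNS_PER_SET = 96
QC_FAILURES = 2        # designs dropped at DNA synthesis QC

screened = SETS * DESIGNS_PER_SET - QC_FAILURES  # 190 designs screened


def percent(hits: int, total: int) -> float:
    """Hit rate as a percentage, rounded to one decimal place."""
    return round(100.0 * hits / total, 1)


detectable_rate = percent(73, screened)      # 38.4 % with detectable activity
multi_turnover_rate = percent(35, screened)  # 18.4 % multi-turnover hits
```

So roughly two in five screened designs show some activity and just under one in five show multi-turnover activity, all after the AF3-based filtering described above.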

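This is a strong result: it is a case in which RFdiffusion3's atom-level active-site/context generation carried through to actual catalytic activity. Structural validation, however, still centers on AF3 predictions; the designed active-site pose was not confirmed by X-ray crystallography or cryo-EM. The activity evidence is real, but how accurately the designs realize the intended mechanism and structure still needs further validation.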

Comparing RFdiffusion2 and RFdiffusion3

The names make the two papers look like a simple version update, but their focus differs. The central problem in RFdiffusion2 is scaffolding an enzyme theozyme directly, without residue-index or rotamer enumeration. That is, functional-group atoms are given as conditions, and the model also chooses which residues and positions will realize them.

RFdiffusion3 takes a wider scope. It represents protein residues as 14-atom slots and places protein sidechains, ligands, DNA, and motif atoms in the same atom-level diffusion scene. If RFdiffusion2 is a milestone tightly focused on enzyme active sites, RFdiffusion3 is an attempt to extend the RF lineage to all-atom biomolecular interaction generation.

RFdiffusion3 should therefore not be read as a strict superset of RFdiffusion2. RFD2 solves a sharply defined problem in enzyme active-site scaffolding. RFD3 broadens generality, but the depth of evidence differs by task: protein-protein and small-molecule binders rest mainly on in silico evidence, while wet-lab results are limited to the DNA binder and the cysteine hydrolase.

Limitations

First, RFdiffusion3 is a bioRxiv preprint. In publication-facing writing it is safer not to state its results as flatly as peer-reviewed ones.

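Second, the release status of dedicated, runnable RFdiffusion3 code/weights is not clear from the public material available at the time of writing. The paper text mentions the AtomWorks GitHub, but whether a reader can run RFdiffusion3 itself as is and reproduce the results falls outside the evidence the paper directly presents.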

Third, many of the benchmarks are in silico structural/proxy metrics. AF3 refolding, ligand RMSD, and interface confidence are useful, but they are not validation of binding or function.

Fourth, the wet-lab evidence is task-specific. The DNA binder has a 1/5 hit with an apparent EC50 of 5.89 ± 2.15 µM; the cysteine hydrolase has 35/190 multi-turnover hits and a best kcat/KM of ~3.6 × 10³ M⁻¹s⁻¹. By contrast, the paper reports no wet-lab validation of protein-protein or small-molecule binders.

Assessment

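RFdiffusion3 is a paper that makes one think about how far to take the phrase “all-atom generation” in protein design. Being all-atom does not immediately solve every binder and function design problem. But the direction of placing sidechains and non-protein context inside the same generation problem clearly deserves attention.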

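In particular, for problems where atom-level contacts directly determine function, such as DNA binders and enzyme active sites, the RFdiffusion3 representation is persuasive. A model that only produces good backbones can hardly carry such problems through to the end. Even in this paper, though, final success still depends on LigandMPNN/ProteinMPNN sequence design, AF3-style filtering, and wet-lab assays.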

In my view, the best reading of RFdiffusion3 is this: it is an important attempt to extend protein design toward all-atom biomolecular interaction generation, but current evidence does not support the claim that “all-atom solves binder/function design.” A more accurate conclusion is that “the representation bottleneck is shrinking, but the sequence realization and experimental validation bottlenecks remain.”

References

• Butcher et al. 2025, bioRxiv, “De novo Design of All-atom Biomolecular Interactions with RFdiffusion3”, DOI: 10.1101/2025.09.18.676967
• Related wiki pages: RFdiffusion3, RFdiffusion Lineage, All-Atom Generation, Enzyme Design, Context-aware Design, Candidate Filtering, Wet-lab Validation, Assay Cascade