Chroma paper review

Introduction

Protein generation papers usually fall into one of two camps. One asks “how well-folding a protein backbone did we generate?” and the other asks “what functional design did we produce for a given target or motif?” Chroma does not fit neatly into either. Rather than a model that achieves the best performance on a specific binder task, it is a system paper that tries to turn protein design into a generative prior that can be steered with conditions.

In “Illuminating protein space with a programmable generative model”, published in Nature in 2023, Generate Biomedicines presented Chroma as a programmable protein generation system. ChromaBackbone generates protein backbones and complexes by diffusion, and ChromaDesign adds a sequence and sidechain torsions that fit the backbone. On top of this, conditioners such as symmetry, substructure, motif, shape, semantic class, and natural-language prompt are composed at generation time.

So when reading Chroma, the more natural question is not “does it make better binders than RFdiffusion?” but “how programmable did it make a protein generative model?” Chroma's strengths are its broad conditioner framework and its wet-lab foldability validation. Conversely, it is not a paper that demonstrates target-specific binder affinity or enzyme activity.

Protein design as Bayesian inference

Chroma's problem statement is simple: within the space of possible proteins, how do we directly sample proteins that satisfy the conditions a user wants? Conventional computational protein design relied heavily on custom search and filtering for each project, and needed a separate protocol whenever the conditions changed. Chroma recasts this problem as the combination of a generative prior and conditioners.

The paper frames protein design as Bayesian inference under external constraints. The diffusion model supplies the protein prior, and the user's constraints are added to the sampling process as conditioner energies or likelihood terms. The idea is to perform posterior-like sampling by applying conditions at generation time, without retraining the model for each new task.
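Written out, the Bayesian view amounts to the relation below. This is a generic sketch of conditional diffusion sampling, not the paper's exact notation: the symbols x_t (the noised structure at diffusion time t), y (the user's constraint), and U_t (a conditioner energy) are assumptions introduced here for illustration.

```latex
\begin{align}
p(x \mid y) &\propto p(x)\, p(y \mid x) \\
\nabla_{x_t} \log p_t(x_t \mid y) &= \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t) \\
\log p_t(y \mid x_t) &= -U_t(x_t; y) + \text{const}
\end{align}
```

The first term on the right-hand side of the second line is the score of the learned prior, and the second term comes from the conditioner. Because the two terms are simply added, a new constraint changes only the second term and leaves the trained prior untouched.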

This viewpoint is the core of Chroma. Chroma is less “a model where you put in text and a protein comes out” and more a framework for programming geometric and semantic conditions on top of a protein prior. The natural-language prompt is just one demo among them; symmetry, motif, substructure, and shape conditioning are the more solid center of the paper.

Generating backbone, sequence, and sidechains separately

Chroma does not generate a full all-atom protein in one black-box step. It factorizes the joint distribution into three parts. The first is the likelihood of the backbone heavy-atom coordinates, which covers the N, Cα, C, and O backbone atoms. The second is the backbone-conditioned sequence likelihood. The third is the likelihood of the sidechain torsion angles given the backbone and the sequence.
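The three-way split can be written as a standard chain-rule factorization. The symbols are assumptions chosen here for illustration: x for the backbone coordinates, s for the amino-acid sequence, and χ for the sidechain torsion angles.

```latex
\log p(x, s, \chi) = \log p(x) + \log p(s \mid x) + \log p(\chi \mid x, s)
```

Each term can then be modeled and sampled in order: backbone first, then sequence given the backbone, then sidechain torsions given both.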

In this structure, ChromaBackbone produces the backbone coordinates and ChromaDesign generates the sequence and sidechain conformations. ChromaDesign is the system's internal design network: given a sampled backbone, it assigns an amino-acid sequence and sidechain torsions. The supplement reports that ChromaDesign is comparable to ProteinMPNN at recovering natural protein sequences.

This matters. Many backbone generation papers generate a backbone, attach an external sequence design model such as ProteinMPNN, and then check self-consistency with ESMFold or AF2. Chroma includes sequence and sidechain design inside the system. So Chroma is better viewed not as backbone-only diffusion but as a programmable protein generation system that bundles backbone generation, sequence design, and sidechain modeling.

Correlated polymer diffusion

ChromaBackbone's diffusion also differs from standard image diffusion. Instead of adding independent Gaussian noise to each coordinate, it uses a correlated diffusion that reflects the polymer statistics of a protein chain. The covariance of the forward noising process is designed to reflect local chain constraints and the radius-of-gyration scaling law.
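To make the contrast between independent and chain-correlated noise concrete, here is a minimal NumPy sketch. It is a toy under stated assumptions, not Chroma's covariance: the correlated noise is an ideal random-walk polymer (cumulative sum of Gaussian steps), whose radius of gyration grows like the square root of chain length, so it does not reproduce the scaling law the paper designs for. All function names are hypothetical.

```python
import numpy as np


def chain_noise(n_residues: int, bond_scale: float = 1.0, seed: int = 0) -> np.ndarray:
    """Sample (n_residues, 3) noise that is correlated along the chain.

    Toy assumption: an ideal random walk, x_i = sum_{j<=i} z_j with
    z_j ~ N(0, bond_scale^2 I). This is NOT the covariance used by Chroma.
    """
    rng = np.random.default_rng(seed)
    steps = rng.normal(scale=bond_scale, size=(n_residues, 3))
    coords = np.cumsum(steps, axis=0)
    return coords - coords.mean(axis=0)  # remove the arbitrary translation


def independent_noise(n_residues: int, scale: float, seed: int = 0) -> np.ndarray:
    """Sample (n_residues, 3) i.i.d. Gaussian noise, as in image-style diffusion."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=scale, size=(n_residues, 3))


def mean_neighbor_distance(coords: np.ndarray) -> float:
    """Average distance between consecutive residues."""
    return float(np.linalg.norm(np.diff(coords, axis=0), axis=1).mean())


def radius_of_gyration(coords: np.ndarray) -> float:
    """Root-mean-square distance of residues from their centroid."""
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


chain = chain_noise(200)
# Match the overall size of the i.i.d. cloud to the chain so only locality differs.
iid = independent_noise(200, scale=radius_of_gyration(chain) / np.sqrt(3))

print("neighbor distance, chain noise:", mean_neighbor_distance(chain))
print("neighbor distance, i.i.d. noise:", mean_neighbor_distance(iid))
```

At a comparable overall size, the chain-correlated sample keeps consecutive residues close together while the independent sample scatters them across the whole cloud. That locality is the property a polymer-structured noising process is meant to preserve.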

This polymer-based framing contrasts with FrameDiff. FrameDiff puts residue rigid frames and diffusion on the SE(3) manifold front and center. Chroma puts a polymer-structured covariance on the backbone heavy-atom coordinates, so that the protein chain is noised into a collapsed polymer ensemble. Both are protein backbone diffusion, but they differ in what they take as the diffusion state and what prior they build in.

Chroma also puts strong emphasis on scaling. It claims that combining a graph neural network that uses random long-range graph connections with an equivariant geometry solver lets it handle large protein complexes. The paper shows a demo of an icosahedral complex with 60 subunits, 60,000 residues, and 240,000 atoms, which corresponds to four backbone heavy atoms per residue. This demo shows algorithmic scalability, but it does not mean that formation of the assembly was verified experimentally.

Conditioner framework: the real center of Chroma

The most important part of Chroma is the conditioners. The paper treats the diffusion model as a time-dependent protein prior and composes user-specified constraints as energy terms or state transformations. These include substructure constraints, distance restraints, motif restraints, symmetry, arbitrary shapes, neural-network classifiers, and natural-language prompts.
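The "constraint as an energy term" idea can be illustrated with a deliberately small toy. The sketch below is an assumption-laden stand-in, not Chroma's sampler: the prior is a 1D standard normal with a fixed (time-independent) score, the conditioner is a hypothetical harmonic restraint toward a target value, and sampling uses plain unadjusted Langevin dynamics. The point is only that the prior score and the conditioner gradient are added at sampling time.

```python
import numpy as np


def prior_score(x: np.ndarray) -> np.ndarray:
    """Score of the toy prior N(0, 1): d/dx log p(x) = -x."""
    return -x


def restraint_energy_grad(x: np.ndarray, target: float, sigma: float) -> np.ndarray:
    """Gradient of the hypothetical restraint energy (x - target)^2 / (2 sigma^2)."""
    return (x - target) / sigma ** 2


def conditioned_score(x: np.ndarray, target: float = 2.0, sigma: float = 1.0) -> np.ndarray:
    """Prior score minus the conditioner energy gradient."""
    return prior_score(x) - restraint_energy_grad(x, target, sigma)


def langevin_sample(score_fn, n_samples=5000, n_steps=500, step=0.01, seed=0):
    """Unadjusted Langevin dynamics: x <- x + step * score + sqrt(2 * step) * noise."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)
    for _ in range(n_steps):
        noise = rng.normal(size=n_samples)
        x = x + step * score_fn(x) + np.sqrt(2 * step) * noise
    return x


samples = langevin_sample(conditioned_score)
# For this Gaussian toy the exact posterior is N(1.0, 0.5):
# precision = 1 (prior) + 1 (restraint), mean = target / 2.
print("sample mean:", samples.mean())
print("sample variance:", samples.var())
```

Swapping in a different restraint only changes `restraint_energy_grad`; the prior score function is reused as is. That is the compositional property the conditioner framework relies on, here shown in the simplest possible setting.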

Symmetry conditioning covers cyclic, dihedral, tetrahedral, octahedral, and icosahedral symmetry. Substructure conditioning is demonstrated by regenerating half of DHFR, rebuilding VHH CDR loops, and motif outfilling. The motif examples include an αββ packing motif, the chymotrypsin catalytic triad active site, and an EF-hand Ca-binding motif. Shape conditioning uses arbitrary point-cloud shapes such as the Latin alphabet and Arabic numerals as a stress test.

These examples show that Chroma is not merely a model that makes “pretty protein-like objects” but aims at a programming interface for injecting constraints into the generation process. Even here, though, most of the evidence is refolding/self-consistency and structural plausibility. Producing a shape that satisfies the conditions and having that protein actually perform a biological function are different levels of claim.
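As one concrete picture of a constraint expressed as a state transformation, the sketch below tiles a single asymmetric unit into a cyclic C_n complex by applying the group's rotation matrices. This is a minimal geometric illustration under my own assumptions (rotation about the z axis, hypothetical function names), not the paper's symmetry conditioner.

```python
import numpy as np


def cyclic_rotations(n: int) -> np.ndarray:
    """Return the n rotation matrices of the cyclic group C_n about the z axis, shape (n, 3, 3)."""
    angles = 2 * np.pi * np.arange(n) / n
    c, s = np.cos(angles), np.sin(angles)
    zeros, ones = np.zeros(n), np.ones(n)
    return np.stack(
        [
            np.stack([c, -s, zeros], axis=-1),      # first row of each matrix
            np.stack([s, c, zeros], axis=-1),       # second row
            np.stack([zeros, zeros, ones], axis=-1),  # third row
        ],
        axis=1,
    )


def symmetrize(asymmetric_unit: np.ndarray, n: int) -> np.ndarray:
    """Tile an (L, 3) asymmetric unit into an (n, L, 3) complex with C_n symmetry."""
    rotations = cyclic_rotations(n)
    # out[k, l, :] = rotations[k] @ asymmetric_unit[l]
    return np.einsum("kij,lj->kli", rotations, asymmetric_unit)


# Two made-up points standing in for one subunit's coordinates.
unit = np.array([[1.0, 0.0, 0.0], [2.0, 0.5, 1.0]])
trimer = symmetrize(unit, 3)
print(trimer.shape)
```

Because every subunit is generated from the same asymmetric unit, the complex is symmetric by construction: rotating the whole assembly by one group element only permutes the subunits.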

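Semantic conditioning and natural-language prompts

The most eye-catching demo in the Chroma paper is the natural-language prompt. Captions such as “crystal structure of Fab”, “SH2 domain”, “kinase domain”, and “de novo designed Rossmann fold protein” are given as conditions, and samples matching them are generated. Semantic conditioning uses a CATH class/topology classifier to bias diffusion sampling toward a higher probability of the desired class.

This part is interesting, but it needs the most careful reading. The authors themselves explain that alignment looks relatively good for classes that are common in the training data, such as the Rossmann fold, whereas for cases such as the Ig fold or β-barrel there are examples that do not fully satisfy the canonical topology. The natural-language prompt, likewise, is closer to an early demonstration than to a reliable text-to-protein interface.

In other words, Chroma's natural-language conditioning is not a finished feature for “designing the protein you want from a sentence”. More precisely, it is an early example showing that a learned semantic classifier can be placed on top of a protein prior to bias sampling. Keeping this distinction clear is what lets us see Chroma's strengths and limitations at the same time.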

In silico evaluation: novelty, diversity, refolding

For unconditional generation, Chroma sampled and analyzed 100,000 single-chain proteins and 20,000 complexes. The paper reports that low-order statistics such as secondary-structure usage, contact order, radius of gyration, long-range contact frequency, and inter-residue contact density are broadly similar to the PDB. Low-temperature samples, however, show an overrepresentation of α-helices.

Novelty is evaluated by structural homology to the PDB and CATH. The paper reports that many samples have a low nearest-neighbor TM-score against the PDB and that novelty increases for longer structures. Because database coverage is lower for long proteins, though, novelty can be overestimated. For that reason the paper also uses a length-normalized novelty metric based on CATH domain coverage.
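For readers less familiar with the TM-score used in the novelty analysis, the sketch below evaluates the standard TM-score formula for one fixed alignment and superposition. It is a simplified illustration: real tools such as TM-align also search over alignments and superpositions to maximize the score, which this sketch does not do, and the function names are hypothetical.

```python
import numpy as np


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Å) of the TM-score."""
    d0 = 1.24 * np.cbrt(target_length - 15) - 1.8
    return float(max(d0, 0.5))  # guard for very short chains, where the formula goes non-positive


def tm_score(distances: np.ndarray, target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: distances in Å between aligned residue pairs.
    target_length: length of the target protein used for normalization.
    """
    d0 = tm_d0(target_length)
    return float(np.sum(1.0 / (1.0 + (distances / d0) ** 2)) / target_length)


# A perfect match over all 100 residues versus a perfect match over only half of them.
print(tm_score(np.zeros(100), target_length=100))
print(tm_score(np.zeros(50), target_length=100))
```

The normalization by target length is why a low nearest-neighbor TM-score reads as novelty: a generated protein scores high only when most of its residues superimpose closely on some known structure.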

Refolding means attaching a sequence with ChromaDesign and then checking whether AlphaFold, ESMFold, and OmegaFold recapitulate the generated backbone. The main text reports that widespread refolding was observed for single-chain unconditional samples across length, helical content, and novelty. But refolding consistency is a proxy that rests on the assumption that structure prediction models generalize to novel folds. It can support foldability and structural plausibility, but it does not replace experimental folding or function.
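A refolding check ultimately compares two sets of coordinates, the generated backbone and the predicted one. The sketch below shows one common way to do that, RMSD after optimal superposition with the Kabsch algorithm. Treat it as an assumption about how such a comparison can be implemented, not as the paper's evaluation code; the paper's exact metrics and thresholds are not reproduced here, and the function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(mobile: np.ndarray, reference: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    # Center both structures so that only a rotation remains to be fitted.
    p = mobile - mobile.mean(axis=0)
    q = reference - reference.mean(axis=0)

    # Kabsch: SVD of the 3x3 covariance matrix gives the optimal rotation.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    aligned = p @ rotation.T
    return float(np.sqrt(((aligned - q) ** 2).sum(axis=1).mean()))


# Made-up coordinates: a rotated and translated copy should superimpose exactly.
rng = np.random.default_rng(1)
designed = rng.normal(size=(50, 3))
theta = 0.7
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
predicted = designed @ rot_z.T + np.array([3.0, -2.0, 5.0])
print(kabsch_rmsd(predicted, designed))
```

The same kind of superposition-based RMSD is what makes a statement like "backbone RMSD of about 1 Å to the design model" meaningful, whether the second structure comes from a predictor or from a crystal.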

Wet-lab validation: where Chroma is strong

Chroma's strongest evidence is its wet-lab validation. The paper used a simple design protocol based on Chroma v0: sample backbones with low-temperature Chroma, attach sequences with ChromaDesign, and then pick a subset mainly by sequence and structure likelihood. Interestingly, structure-prediction refolding filters and energy calculations were deliberately not used.

A total of 310 proteins went to experimental characterization. Of these, 268 were unconditional designs and 42 were semantic conditioning designs. The first unconditional set contained 172 proteins of length 100–450 aa, and in the pooled split-GFP solubility assay all 172 received a higher enrichment score than the negative control. Of the 20 top-scoring proteins, 19 were confirmed as solubly expressed by western blot, compared with 0 of the 20 lowest-scoring ones.

The second unconditional set contained 96 proteins of length 100–950 aa. The split-GFP results were similar here, and 9 of the 10 top-scoring proteins showed soluble expression by western blot. Seven proteins from the top 10% of the split-GFP solubility screen were purified and measured by CD and DSC, and most showed stable, folded secondary structure.
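The counts quoted above are easy to lose track of, so here is a small bookkeeping sketch that tallies them. It uses only the numbers stated in this review; the variable names and dictionary labels are my own.

```python
# Design counts as quoted in this review.
unconditional_sets = {
    "set 1 (100-450 aa)": 172,
    "set 2 (100-950 aa)": 96,
}
semantic_designs = 42

unconditional_total = sum(unconditional_sets.values())
total_characterized = unconditional_total + semantic_designs

# Western blot confirmations as (soluble, tested).
western_blot = {
    "set 1, top 20 by split-GFP": (19, 20),
    "set 1, bottom 20 by split-GFP": (0, 20),
    "set 2, top 10 by split-GFP": (9, 10),
}
soluble_fraction = {name: hits / tested for name, (hits, tested) in western_blot.items()}

print("unconditional designs:", unconditional_total)
print("all characterized designs:", total_characterized)
for name, fraction in soluble_fraction.items():
    print(f"{name}: {fraction:.0%} soluble")
```

The two unconditional sets add up to the 268 unconditional designs, and adding the 42 semantic designs gives the 310 total. The gap between the top-scoring and bottom-scoring groups is what indicates that the split-GFP enrichment score tracks soluble expression.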

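The most convincing part is the X-ray structures. UNC_079 yielded a structure at 1.1 Å resolution with a backbone RMSD of 1.1 Å to the design model. UNC_239 yielded a structure at 2.4/2.36 Å resolution with a backbone RMSD of 1.0 Å to the design model. Both are reported as novel structures relative to the PDB. These results show that some of the de novo proteins made by Chroma can actually be expressed solubly, fold, and match the designed structure closely at the atomic level.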

What was validated, and what was not

The Chroma wet-lab data give fairly strong support for the expression, solubility, folding, stability, and structural accuracy of de novo generated proteins. It is especially important that, without any refolding filter, two crystal structures matched their design models at about 1 Å RMSD. This is evidence that Chroma is not just a visual generator but has learned a generative prior that connects to real protein foldability.

But this validation is not target-specific binder design or enzyme function validation. Split-GFP, western blot, CD, DSC, and X-ray crystallography show that a protein can be expressed, fold, and adopt a stable structure. They do not directly show binding affinity, specificity, cellular function, or catalytic activity.

So presenting Chroma as a binder design milestone overstates it. RFdiffusion is the paper that showed BLI and cryo-EM validation for target-conditioned binder design, and BindCraft presented binding and functional evidence on several targets with a miniprotein binder pipeline. Chroma sits one level above those: it is closer to a general programmable protein generation prior. Its foldability validation is strong, but it is of a different nature from binder or function validation.

Position relative to RFdiffusion and FrameDiff

Chroma and RFdiffusion can both be grouped under diffusion-based protein design, but their emphasis differs. RFdiffusion is a practical design milestone that extends from motif scaffolding and symmetric oligomers to target-conditioned binder design. For binder design in particular, it goes through ProteinMPNN sequence design and AF2 filtering all the way to wet-lab BLI and cryo-EM validation.

Chroma aims at broader programmable generation. Its conditioner framework is general, it includes sequence and sidechain design inside the system, and its wet-lab work focuses on soluble, folded de novo proteins and crystal structure validation. If RFdiffusion is the anchor for “conditional backbone/interface generation can lead to real binders”, Chroma is the anchor for “diverse constraints can be applied compositionally to a protein generative prior”.

The difference from FrameDiff is also clear. FrameDiff is a method paper that defines protein backbone diffusion on the SE(3) manifold in a principled way. It showed monomer backbone generation and ProteinMPNN+ESMFold self-consistency but has no wet-lab validation. Chroma is a system paper that bundles polymer-structured coordinate diffusion, a scalable GNN, a conditioner framework, ChromaDesign, and wet-lab validation. Where FrameDiff is closer to “how should backbone diffusion be defined?”, Chroma is closer to “how can a prior over protein space be used as a programmable design engine?”

Assessment: a reference point for programmable protein generation

Chroma's strengths are clear. It extended protein generation from a plain sampling problem to a programmable conditional sampling problem. It bundled a backbone prior, sequence and sidechain design, a conditioner framework, and wet-lab foldability validation into one system. In particular, reaching soluble expression and crystal structure validation without a refolding filter sets Chroma apart from purely in silico models.

It is safer to keep the limitations in view as well. Natural-language conditioning is still an early demo. The refolding metric is a proxy that relies on the generalization of structure prediction models. The large symmetric assembly demo is algorithmically impressive but is not broad validation of assembly formation. The wet-lab validation is evidence of folded proteins, not of binder affinity or enzyme function. Low-temperature sampling can raise quality at the cost of diversity, so simple leaderboard comparisons with other backbone generators also call for caution.

Even so, Chroma is an important reference point among protein design models. If RFdiffusion showed the practical impact of binder/interface design, Chroma showed the potential of a programmable generative prior for conditionally exploring protein space. When reading later papers on all-atom generation, motif scaffolding, and semantic conditioning, Chroma is a paper one keeps returning to as a point of comparison.

References

• Ingraham, J. B. et al. “Illuminating protein space with a programmable generative model”, Nature, 2023.

• DOI: https://doi.org/10.1038/s41586-023-06728-8

• Main points of comparison: RFdiffusion, FrameDiff, Proteus, ProteinMPNN, BindCraft.