Latent-X Paper Review

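Introduction

The bottleneck Latent-X targets in binder design

Reading it as a closed all-atom platform

Filtering is part of the performance

Macrocycle result: high hit rates paired with low affinity

Mini-binder result: a strong binding signal from a small candidate pool

The scope of the specificity evidence

An in silico benchmark is not a wet-lab hit rate

Structural diversity and the speed claim

Reading along the figures

Where it stands relative to AlphaProteo, RFdiffusion, and Latent-X2

Guardrails for reading this paper

Assessment: the gap between the performance ceiling and transparency

References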

Introduction

A common illusion when reading AI binder design papers starts from looking at a single number: “how many designs were made, and how many bound?” In practice, raw generation, in silico filtering, synthesis/expression feasibility, binding assay, specificity, and structural validation are distinct layers of evidence. Latent-X is a paper where that layering has to be read with particular care.

Latent-X is an arXiv preprint released by Latent Labs in 2025. Its title is “An Atom-level Frontier Model for De Novo Protein Binder Design,” and the model generates mini-binders and macrocyclic peptides conditioned on a target structure and a hotspot epitope. The headline numbers are strong. For macrocycles the paper reports a 91–100% hit rate among tested binders, and for mini-binders a 10–64% HT-BLI hit rate depending on the target. Some mini-binders are reported with pM-level affinity.

However, this paper reads less like an open method paper and more like a company-led performance report on a closed platform. The model weights, runnable code, full architecture, and training recipe have not been released. In this review I therefore read Latent-X not as “a pipeline I can try right away” but as a benchmark-like report on how far a closed all-atom binder design platform has pushed experimental performance.

The bottleneck Latent-X targets in binder design

Traditional computational binder design is split into several stages: build a backbone, design a sequence, check with a structure prediction model that the design refolds, and filter on interface metrics. A small number of candidates then go to experiment. The approach is powerful, but the sidechain interactions required at the target epitope and the consistency between binder sequence and structure are often reconciled only late in the process.

Latent-X asks a slightly different question. If you specify a target epitope, can a model generate the binder sequence and the all-atom complex structure together, so that binding hits still appear when the number of experimental candidates is cut to roughly 30–100? The paper's framing is that precision AI design can produce experiment-ready candidates directly, instead of screening a large library.

The key word here is “post-filter.” The candidates Latent-X sent to the wet lab are not raw generated samples. They passed Chai-1 and Boltz-2 scoring, a novelty filter, a synthesis filter, and diversity selection. The hit rate the paper reports is therefore not a raw generation hit rate but the wet-lab handoff efficiency of platform generation plus filtering.
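
The two denominators can be made explicit with a short calculation. The sketch below uses only the per-target counts quoted later in this review (700 generated and 30 sent to synthesis for macrocycles; 20,000 generated and 100 tested by HT-BLI for mini-binders). The function name `handoff_fraction` and the `campaigns` dictionary are illustrative assumptions, not anything defined by the paper.

```python
def handoff_fraction(n_generated: int, n_tested: int) -> float:
    """Fraction of raw generated designs that reach the wet lab."""
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    if not 0 <= n_tested <= n_generated:
        raise ValueError("n_tested must be between 0 and n_generated")
    return n_tested / n_generated


# Per-target counts as quoted in this review: (generated, sent to experiment).
campaigns = {
    "macrocycle": (700, 30),
    "mini-binder": (20_000, 100),
}

fractions = {
    name: handoff_fraction(generated, tested)
    for name, (generated, tested) in campaigns.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2%} of generated designs were tested")
```

A reported hit rate applies only to the tested subset. The untested designs carry no experimental label, so the raw generation hit rate stays unknown.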

Reading it as a closed all-atom platform

Latent-X takes a target structure and hotspot residues as a prompt and generates a binder-target complex. The paper describes this as joint sequence–structure generation. It differs from a backbone-only generator in that the binder sequence, the binder structure, and target-side interaction detail are generated together.

The inputs are a target protein mmCIF, hotspot residues, a binder length, and a target crop. The context window is 512 residues for target and binder combined, and the paper states that only target backbone atoms enter the model as input. This point is subtle. The paper says Latent-X can co-generate and adapt target sidechain rotamers, but the user input does not supply target sidechain detail as-is. In other words, epitope conditioning operates on the target backbone plus the hotspot specification.
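
The 512-residue budget implies a simple trade-off between binder length and target crop size. The sketch below assumes the budget is a plain sum of target and binder residues; how the platform actually counts or crops residues is not described here, and `max_target_crop` is a hypothetical helper.

```python
CONTEXT_WINDOW = 512  # residues, target crop plus binder, as stated in the paper


def max_target_crop(binder_length: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Largest target crop (in residues) that still fits next to the binder.

    Assumes the budget is simply target residues + binder residues.
    """
    if binder_length <= 0:
        raise ValueError("binder_length must be positive")
    remaining = context_window - binder_length
    if remaining <= 0:
        raise ValueError("binder alone fills or exceeds the context window")
    return remaining


# Binder lengths taken from the experiments described in this review.
for length in (12, 18, 80, 120):
    print(f"binder length {length}: target crop up to {max_target_crop(length)} residues")
```

Under this assumption, a 120-residue mini-binder leaves room for a target crop of at most 392 residues, so large targets have to be cropped around the epitope.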

The training data is a mixture of PDB and AFDB v4, with PDB entries after November 23, 2023 excluded. The paper distinguishes Latent-X v1 from v1.1. v1 is the macrocycle/mini-binder generation model used for experimental validation, while v1.1 is a slightly improved version used for the in silico expected hit-rate study and for platform serving.
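
When checking whether a target of interest could have been seen during training, the date cutoff is the first thing to test. The sketch below is a minimal illustration; whether the cutoff refers to the release date or the deposition date of an entry is an assumption here, and `is_after_training_cutoff` is a hypothetical helper.

```python
from datetime import date

# Cutoff as stated in the paper; which PDB date field it applies to is assumed.
TRAINING_CUTOFF = date(2023, 11, 23)


def is_after_training_cutoff(entry_date: date, cutoff: date = TRAINING_CUTOFF) -> bool:
    """True if a PDB entry is dated strictly after the training cutoff."""
    return entry_date > cutoff


# Hypothetical entry dates, used only to exercise the function.
examples = [date(2021, 6, 1), date(2023, 11, 23), date(2024, 2, 14)]
for entry_date in examples:
    status = "excluded from training" if is_after_training_cutoff(entry_date) else "possibly in training"
    print(entry_date.isoformat(), status)
```

A date check alone does not rule out close homologs deposited earlier, so it is a necessary condition for a held-out target rather than a sufficient one.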

This difference is easy to miss. The wet-lab results and the in silico benchmark do not always come from the same model variant. When comparing performance numbers, it is safer to keep the v1 experimental validation separate from the v1.1 platform and in silico analysis.

Filtering is part of the performance

Latent-X's reported success cannot be interpreted without the filtering. The mini-binder filter uses structure prediction consistency based on Chai-1 and Boltz-2. On the Chai-1 side, thresholds such as `min_ipae < 1`, `ptm_binder > 0.9`, and `complex_rmsd < 2` are applied. For macrocycles, `ptm_binder` comes out low even on validated cyclic peptide complexes, so the mini-binder filter is not reused as-is. A relaxed filter centered on `complex_rmsd` and `min_ipae` is used instead.
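
The thresholds above translate directly into a predicate over per-design scores. The sketch below is an assumption-laden illustration: the `DesignScores` container and both function names are hypothetical, the mini-binder thresholds are the Chai-1 values quoted above, and the macrocycle thresholds are placeholders because the relaxed values are not given in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignScores:
    """Per-design metrics from a structure prediction model (hypothetical container)."""
    min_ipae: float
    ptm_binder: float
    complex_rmsd: float


def passes_minibinder_filter(scores: DesignScores) -> bool:
    """Chai-1 thresholds quoted in the text: all three conditions must hold."""
    return (
        scores.min_ipae < 1.0
        and scores.ptm_binder > 0.9
        and scores.complex_rmsd < 2.0
    )


def passes_macrocycle_filter(
    scores: DesignScores,
    max_min_ipae: float = 1.0,      # placeholder, not the paper's value
    max_complex_rmsd: float = 2.0,  # placeholder, not the paper's value
) -> bool:
    """Relaxed filter that ignores ptm_binder, as described for macrocycles."""
    return scores.min_ipae < max_min_ipae and scores.complex_rmsd < max_complex_rmsd


# Made-up scores: a design with low binder pTM fails one filter but not the other.
example = DesignScores(min_ipae=0.8, ptm_binder=0.55, complex_rmsd=1.4)
print(passes_minibinder_filter(example), passes_macrocycle_filter(example))
```

The example shows why the mini-binder filter cannot be reused for cyclic peptides: a design with a good interface score and a low `ptm_binder` would be rejected by the first predicate and kept by the second.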

Candidate selection also includes novelty and manufacturability filters. For macrocycles, a short-peptide database search that accounts for cyclic permutation removes hits with >50% identity, and vendor synthesis guidelines are used to screen on the fraction of hydrophilic residues, repeated residues, long hydrophobic stretches, cysteine-containing designs, and similar criteria. For mini-binders, a UniRef50 MMseqs2 search removes hits with >20% identity, Foldseek clustering secures structural diversity, and cysteine-containing designs are excluded because they complicate expression and purification.
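
Two of these checks are easy to sketch: identity that accounts for cyclic permutation, and simple sequence-level synthesis rules. The code below is a toy illustration rather than the authors' search. It compares equal-length peptides position by position instead of running a database search, and the hydrophobic alphabet and the `max_hydrophobic_run` default are assumptions, not vendor guideline values.

```python
def max_cyclic_identity(query: str, reference: str) -> float:
    """Highest fraction of identical positions over all rotations of `query`.

    Only equal-length peptides are compared in this toy version.
    """
    if not query or len(query) != len(reference):
        return 0.0
    best = 0
    for shift in range(len(query)):
        rotated = query[shift:] + query[:shift]
        matches = sum(a == b for a, b in zip(rotated, reference))
        best = max(best, matches)
    return best / len(reference)


def longest_hydrophobic_run(sequence: str, hydrophobic: str = "AVILMFWY") -> int:
    """Length of the longest stretch of residues from the assumed hydrophobic set."""
    longest = current = 0
    for residue in sequence:
        current = current + 1 if residue in hydrophobic else 0
        longest = max(longest, current)
    return longest


def passes_synthesis_sketch(sequence: str, max_hydrophobic_run: int = 4) -> bool:
    """Reject cysteine-containing designs and long hydrophobic stretches."""
    return "C" not in sequence and longest_hydrophobic_run(sequence) <= max_hydrophobic_run


# A rotated copy of a cyclic peptide is the same molecule, so identity is 1.0.
print(max_cyclic_identity("GSTKE", "TKEGS"))
```

Without the rotation step, a linear comparison would score the rotated copy as unrelated and let a non-novel macrocycle through the novelty filter.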

The performance of Latent-X therefore cannot be explained by the latent space of a single model. It is the joint result of the generator, the Chai-1/Boltz-2 filter, the novelty filter, the synthesis/expression feasibility filter, and diversity selection. The same guardrail applies when reading papers in the AlphaProteo, BindCraft, and RFdiffusion family.

Macrocycle result: high hit rates paired with low affinity

Latent-X reports binding hits on all three macrocycle targets: MDM2, MCL-1, and PD-L1. For each target, 100 designs were generated per length across lengths 12–18, for 700 designs in total, and the top 30 designs after filtering were sent for synthesis. Synthesis/cyclization success at >90% purity is reported as 77% for MCL-1, 57% for MDM2, and 87% for PD-L1.

The experimental hit rates among tested binders are 90.9% for MDM2, 100.0% for MCL-1, and 94.1% for PD-L1. For macrocyclic peptide design these are quite strong results. Affinities, however, are mostly in the µM range. Representative examples are `LL_CYC_MDM2_18` at KD 5.35 µM, `LL_CYC_MCL-1_2` at KD 18.4 µM, and `LL_CYC_PD-L1_24` at KD 71.7 µM.
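
To put µM affinities on a common scale, KD can be converted to a standard binding free energy with ΔG° = RT ln(KD / 1 M). The sketch below applies that textbook relation to the three KD values quoted above. The temperature of 298.15 K is an assumption, since the assay temperature is not stated here, and `binding_free_energy` is an illustrative helper.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, relative to a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# KD values quoted in this review, converted from micromolar to molar.
macrocycle_kd_molar = {
    "LL_CYC_MDM2_18": 5.35e-6,
    "LL_CYC_MCL-1_2": 18.4e-6,
    "LL_CYC_PD-L1_24": 71.7e-6,
}

free_energies = {name: binding_free_energy(kd) for name, kd in macrocycle_kd_molar.items()}

for name, delta_g in free_energies.items():
    print(f"{name}: {delta_g:.2f} kcal/mol")
```

At this temperature a tenfold change in KD corresponds to roughly 1.36 kcal/mol, which gives a sense of how much optimization separates a µM hit from a nM binder.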

This result supports the claim that macrocycle binders can be produced with high probability. Stretching it to the claim that the platform directly produces potent therapeutic macrocycles goes too far. Hit rate and affinity are different layers. The macrocycle result demonstrates binding discovery efficiency, while potency optimization and drug-like properties remain separate problems.

Specificity is assessed with all-against-all SPR. Each macrocycle bound its intended target most strongly, and off-target affinities are described as typically several orders of magnitude weaker. The paper also notes, however, that for canonical amino acid macrocycles low-level off-target binding is observed even in the positive controls. The specificity evidence is therefore meaningful, but it does not extend to proteome-wide specificity or in vivo selectivity.

Mini-binder result: a strong binding signal from a small candidate pool

The mini-binder experiments target BHRF1, IL-7Rα, PD-L1, SARS-CoV-2 RBD, and TrkA. For each target, 20,000 designs were generated in the length range 80–120, and after the in silico filter, the novelty filter, and Foldseek diversity selection, 100 designs per target were tested by HT-BLI. In parallel, 88 designs per target were also evaluated by mDisplay.

The HT-BLI hit rates by target are BHRF1 64.0%, TrkA 10.0%, PD-L1 49.0%, IL-7Rα 26.0%, and SC2RBD 52.0%. The best affinity examples are also strong. The BHRF1 binder is reported at KD 22.5 nM, the TrkA binder at 0.04 nM, the PD-L1 binder at 0.27 nM, and the top IL-7Rα and SC2RBD binders at <0.01 nM.
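
With 100 designs tested per target, each percentage maps to a hit count, and the sampling uncertainty on that count can be estimated. The sketch below recovers the counts from the quoted hit rates and computes a 95% Wilson score interval. The interval is an illustration added in this review, not an analysis from the paper, and `wilson_interval` is a generic helper.

```python
import math

TESTED_PER_TARGET = 100  # HT-BLI designs per target, as quoted above

hit_rate_percent = {
    "BHRF1": 64.0,
    "TrkA": 10.0,
    "PD-L1": 49.0,
    "IL-7Rα": 26.0,
    "SC2RBD": 52.0,
}


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("need n > 0 and 0 <= hits <= n")
    p = hits / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


hit_counts = {
    target: round(rate * TESTED_PER_TARGET / 100)
    for target, rate in hit_rate_percent.items()
}

for target, hits in hit_counts.items():
    low, high = wilson_interval(hits, TESTED_PER_TARGET)
    print(f"{target}: {hits}/{TESTED_PER_TARGET} hits, 95% interval {low:.1%} to {high:.1%}")
```

Even at n = 100 the intervals are wide enough that small differences between targets, or between methods tested at similar scale, should not be over-read.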

This is the core evidence of the Latent-X paper. A 10–64% hit rate per target from a wet-lab handoff of about 100 candidates, with some binders showing pM-range affinity, is hard to dismiss. The authors also attempt to compare against published AlphaProteo binders for SC2RBD and PD-L1 under their own assay conditions.

The head-to-head comparison is still best read with caution. Table 1 contains cases where the published affinity and the replicated affinity differ, and the assay format, reagents, constructs, and expression/purification conditions can vary. What Latent-X shows is closer to evidence that this platform yields strong candidates under one laboratory's conditions. It is far from a conclusion that it beat every published method in a fully controlled comparison.

The scope of the specificity evidence

Mini-binder specificity is evaluated with an mDisplay all-against-all assay. For the top four binders against IL-7Rα, PD-L1, and SC2RBD, the paper reports no detectable binding outside the intended target. The Pearson correlation between mDisplay and HT-BLI is given as 0.68–0.79.
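
For readers who want to check what a correlation in this range means on their own paired measurements, Pearson's r is straightforward to compute. The sketch below is a plain-Python implementation. The paired values are synthetic numbers invented for the example, and which quantities the authors actually correlated (raw signal, log-transformed values, or something else) is not specified in this review.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(var_x * var_y)


# Synthetic paired readouts for the same designs in two assays (not paper data).
assay_a = [0.10, 0.35, 0.42, 0.80, 0.95, 1.30]
assay_b = [0.20, 0.15, 0.60, 0.55, 1.10, 0.90]

print(f"r = {pearson_r(assay_a, assay_b):.2f}")
```

A value around 0.7 indicates that the two assays rank designs similarly overall while still disagreeing on individual designs, so a hit in one format is not guaranteed in the other.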

This result is useful. It does not stop at “it bound” but checks whether the selected top binders show signal for the intended target across several targets. Even so, this is specificity within a selected panel. It says nothing about proteome-wide off-targets, human serum binding, cell-surface context, immunogenicity, or developability.

Macrocycle specificity is confirmed by SPR all-against-all. The intended target binds most strongly, but some low-level off-target binding is present. The mini-binder specificity story is cleaner, and for macrocycles it is safer to read both affinity and specificity as presupposing follow-up optimization.

An in silico benchmark is not a wet-lab hit rate

Latent-X also compares in silico pass rates on 200 held-out PDB targets. Three epitopes are selected automatically per target, and 100 designs are generated per epitope. Under the Chai-1 based filter, macrocycles pass at 8.26% for Latent-X vs 1.72% for RFpeptides, and mini-binders at 5.11% for Latent-X vs 3.02% for RFdiffusion. The qualitative trend holds under the Boltz-2 based filter as well.
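
The scale of the benchmark and the size of the gaps follow from simple arithmetic on the numbers above. The sketch below assumes that every method generates the same number of designs per modality (200 targets × 3 epitopes × 100 designs), which this review does not confirm. The variable names are illustrative.

```python
N_TARGETS = 200
EPITOPES_PER_TARGET = 3
DESIGNS_PER_EPITOPE = 100

# Assumed to apply to each method and modality separately.
designs_per_method = N_TARGETS * EPITOPES_PER_TARGET * DESIGNS_PER_EPITOPE

# Chai-1 based pass rates in percent, as quoted above: (Latent-X, baseline).
pass_rate_percent = {
    "macrocycle vs RFpeptides": (8.26, 1.72),
    "mini-binder vs RFdiffusion": (5.11, 3.02),
}

fold_ratio = {
    comparison: latent_x / baseline
    for comparison, (latent_x, baseline) in pass_rate_percent.items()
}

print(f"designs per method and modality: {designs_per_method}")
for comparison, ratio in fold_ratio.items():
    print(f"{comparison}: {ratio:.2f}x filter pass rate")
```

These ratios describe how often designs clear the authors' filter. They say nothing about how the surviving designs would compare in a binding assay.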

This benchmark adds weight to Latent-X's platform-level claims. But it measures the pass rate through a structure prediction filter, not a wet-lab binding hit rate. It is affected by how the held-out targets were constructed, the hotspot selection algorithm, the choice of filter thresholds, and prediction model bias.

Figure 6 is therefore best read not as “Latent-X binds several times more often than RFpeptides/RFdiffusion in experiments” but as “Latent-X-generated designs pass the authors' Chai-1/Boltz-2 filter more often.” The in silico pass rate is a proxy for choosing experimental candidates, not a binding assay.

Structural diversity and the speed claim

The paper claims that Latent-X produces more diverse, beta-sheet-containing mini-binder folds than RFdiffusion. In Figure S7, RFdiffusion skews toward helical binders, whereas Latent-X shows a broader distribution of sheet fraction. This can be interpreted as an advantage of all-atom joint generation.

A runtime benchmark is also provided. For a single sample of an 80-aa PD-L1 binder, the paper reports 3.8 s for Latent-X vs 35.0 s for RFdiffusion on an A100, and 1.9 s vs 21.1 s on an H100, or roughly 10× faster. Runtime nonetheless depends on model implementation, hardware, batching, and whether filtering is included. The number matters from a platform-serving perspective, but it is not direct evidence of wet-lab success.
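
The per-sample timings can be turned into speedup factors and a rough campaign-level estimate. The sketch below is a back-of-the-envelope calculation that assumes strictly sequential single-sample generation with no batching and applies the 80-aa timing to a 20,000-design campaign, even though the mini-binder campaigns used lengths of 80–120. The helper `sequential_hours` is hypothetical.

```python
# Seconds per single 80-aa PD-L1 binder sample, as quoted above.
runtime_seconds = {
    "A100": {"Latent-X": 3.8, "RFdiffusion": 35.0},
    "H100": {"Latent-X": 1.9, "RFdiffusion": 21.1},
}

speedup = {
    gpu: timings["RFdiffusion"] / timings["Latent-X"]
    for gpu, timings in runtime_seconds.items()
}


def sequential_hours(seconds_per_sample: float, n_samples: int) -> float:
    """Wall-clock hours if samples are generated one at a time on one GPU."""
    return seconds_per_sample * n_samples / 3600


CAMPAIGN_SIZE = 20_000  # designs per mini-binder target in this review

for gpu, factor in speedup.items():
    print(f"{gpu}: {factor:.1f}x faster per sample")

for method, seconds in runtime_seconds["A100"].items():
    print(f"{method} on one A100: about {sequential_hours(seconds, CAMPAIGN_SIZE):.0f} h for {CAMPAIGN_SIZE} designs")
```

The measured factors are about 9× on the A100 and 11× on the H100, which is where the rounded 10× figure comes from.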

Reading along the figures

Figure 1 shows representative validated designs and the overall hit-rate/affinity story. The point to watch here is not to compare macrocycles and mini-binders on the same number. Macrocycles have high hit rates but affinities that are often in the µM range, while mini-binder hit rates swing more widely by target but the best affinities are far stronger.

Figure 2 is the end-to-end workflow. Prompt, generate, in silico score, and lab validate all appear in one figure. It shows that Latent-X is a platform workflow report rather than a model-only paper.

Figure 4 and Table 1 present binding curves and the best-binder comparison. Here the differences between replicated controls and published values need to be read together. Figure 5 is the specificity assay. What matters is that the specificity evidence for mini-binders and macrocycles is not equally deep.

Figure 6 is the in silico hit-rate benchmark. It is a useful proxy but not wet-lab validation. Supplementary Figures S4–S6 and Table S1 show the Chai-1/Boltz-2 filter tuning and thresholds, so for understanding Latent-X's hit rates they are effectively as important as the main figures.

Where it stands relative to AlphaProteo, RFdiffusion, and Latent-X2

Latent-X is easy to compare with AlphaProteo. Both are closed, high-performance binder design systems that put target-conditioned generation and wet-lab validation up front. Latent-X, however, covers both mini-binders and macrocycles and emphasizes all-atom complex generation and platform serving. AlphaProteo is a DeepMind system that showed strong binder design evidence on selected protein targets, and its method transparency and experimental scope differ.

It should also be read differently from RFdiffusion and BindCraft. RFdiffusion is a backbone generation anchor that is widely used in real pipelines within the open ecosystem. BindCraft is a practical pipeline that produced binder design hits through AF2-style hallucination/optimization. Latent-X is hard to adopt as a locally reproducible method the way these two are, yet it presents strong post-filter wet-lab hit rates. Method accessibility and performance evidence thus point in opposite directions.

The relationship to Latent-X2 matters as well. Latent-X is the paper that demonstrates the performance of a closed all-atom platform on mini-binders and macrocycles. Latent-X2 later moves the validation frontier toward antibody/VHH/scFv formats and adds developability and ex vivo immunogenicity proxies. If Latent-X asks “how far can a closed all-atom model push binder hit rates,” Latent-X2 is closer to asking “how close can this approach get to a drug-like antibody modality.”

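Guardrails for reading this paper

First, a small number of tested candidates does not mean that raw generation is efficient. Latent-X tested 30 or 100 candidates per target, but generation and filtering came before that. The wet-lab denominator and the raw generation denominator must be read separately.

Second, the macrocycle hit rate and the mini-binder affinity should not be merged into a single performance number. Macrocycles have high hit rates but mostly µM affinity. Mini-binders have strong affinity, but their hit rates swing from 10% to 64% by target.

Third, the specificity evidence must be read within the selected assay panel. The mDisplay and SPR all-against-all assays are good additional evidence, but broad off-target binding, immunogenicity, developability, and in vivo properties are separate territory.

Fourth, the fact that the model is closed is part of the interpretation. Latent-X is strong as a performance report, but it is not a reproducible method anchor. The information you can take from the paper and drop straight into your own pipeline is limited.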

Assessment: the gap between the performance ceiling and transparency

Latent-X is a performance milestone that is hard to ignore in the AI binder design field. Obtaining pM–nM binders from about 100 candidates for mini-binders, and showing tested hit rates above 90% for macrocycles, is a strong result. From the standpoint of post-filter wet-lab handoff efficiency, it can be read as a sign that the ceiling demonstrated by closed binder design platforms since AlphaProteo is rising.

At the same time, the paper's limits on transparency are clear. Because the model architecture, weights, code, and full training details are not released, the method is hard to verify, reproduce, or modify. In addition, experimental structural validation, broad developability, immunogenicity, functional cellular assays, and in vivo evidence are not central evidence in this paper.

The balanced reading of Latent-X is this: credit the experimental results fully, but do not generalize the numbers to raw model capability or to a capability the whole field can reproduce. Holding together the performance ceiling a closed platform has demonstrated and the gap to open science and reproducibility is the safest way to read this paper.

References

- Paper: “An Atom-level Frontier Model for De Novo Protein Binder Design” / “Latent-X: An Atom-level Frontier Model for De Novo Protein Binder Design”
- Authors: Latent Labs Team
- arXiv: https://arxiv.org/abs/2507.19375
- DOI: https://doi.org/10.48550/arXiv.2507.19375
- Platform: https://platform.latentlabs.com
- Raw source: `raw/papers/Latent-X/latent-x.pdf`
- Supplement: model inputs, filtering thresholds, target/hotspot table, binder sequences, experimental protocols