Technology
THE ARCHITECTURE OF DISCOVERY.
Five layers of differentiation that competitors cannot replicate overnight.
Layer 01
Generative Chemistry — MolForge-Gen Engine
A proprietary 25.4M-parameter Transformer (GPT-2 architecture), pretrained on 1.6M drug-like molecules and then fine-tuned on five targets with property-conditioned generation. Specify a target (TYK2, CDK4, CDK6, TNIK, or EGFR) and the desired QED, hERG, and pIC50 ranges, and the engine generates novel molecules matching those constraints. FTO-aware (freedom to operate): every output has a Tanimoto similarity below 0.21 to rentosertib. The TNIK library contains 33,755 validated compounds.
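The FTO-aware cutoff can be pictured as a plain similarity filter. The following is a minimal sketch, assuming fingerprints are represented as sets of on-bit indices; the function names `tanimoto` and `passes_fto_filter` and the toy fingerprints are hypothetical, and the fingerprint type the engine actually uses is not stated here.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here: two empty fingerprints share nothing
    return len(fp_a & fp_b) / union


def passes_fto_filter(candidate_fp: set, reference_fp: set, cutoff: float = 0.21) -> bool:
    """Keep a candidate only if it is strictly below the similarity cutoff."""
    return tanimoto(candidate_fp, reference_fp) < cutoff


# Toy fingerprints (hypothetical, not real molecules).
reference = {1, 4, 9, 16, 25, 36}
novel = {2, 3, 5, 7, 9, 11}      # shares 1 of 11 bits with the reference
close = {1, 4, 9, 16, 25, 49}    # shares 5 of 7 bits with the reference

print(passes_fto_filter(novel, reference))  # True
print(passes_fto_filter(close, reference))  # False
```

In practice the fingerprints would come from a cheminformatics toolkit; the set-based form is used only to keep the arithmetic visible.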
Layer 02
Uncertainty Quantification
Every prediction comes with a 90% prediction interval from conformal prediction. No more bare point estimates. You know how much to trust each number before spending a dollar on synthesis.
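To show where a 90% interval can come from, here is a minimal sketch of split conformal prediction with absolute-residual scores. This is one common variant, chosen for illustration; which variant the platform uses is not stated here, and the function names and calibration data are hypothetical.

```python
import math


def conformal_halfwidth(calib_true, calib_pred, alpha=0.10):
    """Half-width q so that [pred - q, pred + q] targets (1 - alpha) coverage.

    Uses held-out calibration data the model was not trained on.
    """
    residuals = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return residuals[k - 1]


def predict_interval(point_pred, halfwidth):
    """Symmetric interval around a point prediction."""
    return (point_pred - halfwidth, point_pred + halfwidth)


# Toy calibration set (hypothetical): 19 compounds with known pIC50 values.
calib_true = [i / 10 for i in range(1, 20)]
calib_pred = [0.0] * 19
q = conformal_halfwidth(calib_true, calib_pred, alpha=0.10)
low, high = predict_interval(7.2, q)
print(f"pIC50 7.2, 90% interval [{low:.1f}, {high:.1f}]")
```

The interval width depends only on how wrong the model was on the calibration set, which is why new assay results can tighten or widen it over time.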
Layer 03
Failure Boundary Mapping
ROBOGATE adaptive sampling plus multi-objective Pareto optimization. We don't just find good molecules; we map the regions of parameter space where candidates fail. hERG toxicity, mutagenicity, poor bioavailability: every failure mode is charted.
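The Pareto part of this layer can be illustrated with a short sketch. It covers only the non-dominated filtering step, not ROBOGATE's adaptive sampling, and it assumes every objective is expressed so that lower is better; the objective pair (hERG risk, 1 - QED) and the candidate scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    ]


# Hypothetical scores: (predicted hERG risk, 1 - QED), both minimized.
candidates = {
    "A": (0.10, 0.40),
    "B": (0.30, 0.20),
    "C": (0.35, 0.45),  # worse than A on both objectives
    "D": (0.05, 0.70),
}

print(pareto_front(candidates))  # ['A', 'B', 'D']
```

Candidates that fall off the front, like C here, are the ones a failure map would record alongside the objective on which they lost.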
Layer 04
Certified Compound Tiers
Tier 1: AI Screened (ADMET + QSAR). Tier 2: Structure Verified (Boltz-2 ligand_ptm ≥ 0.85). Tier 3: Experimentally Validated (CRO assay in progress). Pharmaceutical R&D teams know exactly what level of evidence they're working with.
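The tiering can be read as a sequence of gates. The sketch below assumes the tiers are cumulative (a compound must clear each earlier gate to reach the next); the `Compound` fields and the `assign_tier` function are hypothetical names, and only the 0.85 `ligand_ptm` threshold comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Compound:
    passed_admet_qsar: bool              # Tier 1 gate: AI screening
    ligand_ptm: Optional[float] = None   # Boltz-2 score, if structure prediction was run
    assay_validated: bool = False        # Tier 3 gate: CRO assay result available


def assign_tier(c: Compound, ptm_cutoff: float = 0.85) -> int:
    """Return 0 (uncertified) or the highest tier whose gate and all earlier gates are met."""
    if not c.passed_admet_qsar:
        return 0
    if c.ligand_ptm is None or c.ligand_ptm < ptm_cutoff:
        return 1
    if not c.assay_validated:
        return 2
    return 3


print(assign_tier(Compound(passed_admet_qsar=True)))                    # 1
print(assign_tier(Compound(passed_admet_qsar=True, ligand_ptm=0.91)))   # 2
print(assign_tier(Compound(passed_admet_qsar=False, ligand_ptm=0.95)))  # 0
```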
Layer 05
Continuous Feedback Loop
Every CRO assay result — hit or miss — feeds back into the models. Negative data is a first-class asset. Conformal calibration and Pareto boundary maps improve with each experiment. This is the moat that takes time to build.
Validated Benchmark
| Metric | Value | Status |
|---|---|---|
| AUC-ROC | 0.903 | PASS |
| BEDROC | 0.838 | PASS |
| Pearson R | 0.503 | PASS |

Descriptor: ExtDesc
Split: Bemis-Murcko scaffold split with a Tanimoto < 0.4 similarity cap between train and test. Leak-proof external benchmarks (LP-PDBBind, PLINDER-ECOD, PLINDER-TIME) are scheduled for 2026-Q2 and 2026-Q3.
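A scaffold split with a similarity cap can be sketched as follows. The sketch assumes scaffold strings and fingerprints are already computed (for example by a cheminformatics toolkit) and that fingerprints are sets of on-bit indices; the greedy largest-group-first assignment, the function names, and the toy records are assumptions for illustration, not the benchmark's actual procedure.

```python
from collections import defaultdict


def tanimoto(a: set, b: set) -> float:
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def scaffold_split(records, test_fraction=0.2, sim_cap=0.4):
    """records: list of (mol_id, scaffold, fingerprint_set) tuples."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[1]].append(rec)  # molecules sharing a scaffold stay together

    # Largest scaffold groups fill the training set first; the rest go to test.
    n_train_target = len(records) - round(len(records) * test_fraction)
    train, test = [], []
    for group in sorted(groups.values(), key=len, reverse=True):
        (train if len(train) < n_train_target else test).extend(group)

    # Similarity cap: drop test molecules too close to any training molecule.
    test = [t for t in test if all(tanimoto(t[2], tr[2]) < sim_cap for tr in train)]
    return train, test


# Toy records (hypothetical scaffolds and fingerprints).
records = [
    ("m1", "scafA", {1, 2, 3, 4}),
    ("m2", "scafA", {1, 2, 3, 5}),
    ("m3", "scafA", {1, 2, 4, 5}),
    ("m4", "scafB", {10, 11, 12}),
    ("m5", "scafC", {1, 2, 3, 9}),  # different scaffold, but Tanimoto 0.6 to m1
]
train, test = scaffold_split(records, test_fraction=0.4)
print([r[0] for r in train])  # ['m1', 'm2', 'm3']
print([r[0] for r in test])   # ['m4']
```

The toy molecule m5 shows why the cap matters: it has its own scaffold, yet it is close enough to a training molecule that keeping it would inflate test scores.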
Pipeline
From Target to Dossier in Hours
| Step | Time |
|---|---|
| Target Selection | Auto |
| Seed Collection | ~30 min |
| Generative Design | ~1 h |
| Structure Prediction | ~2 h |
| ADMET Filtering | ~15 min |
| QSAR Scoring | ~10 min |
| Failure Boundary | ~2 min |
| Compound Dossier | ~30 min |
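As a quick check on the "in hours" claim, the stage times above can be summed. This sketch assumes the stages run strictly one after another, treats the approximate durations as exact, and counts the automatic target selection as zero minutes.

```python
# Stage durations in minutes, taken from the pipeline listing; "Auto" is counted as 0.
STAGE_MINUTES = {
    "Target Selection": 0,
    "Seed Collection": 30,
    "Generative Design": 60,
    "Structure Prediction": 120,
    "ADMET Filtering": 15,
    "QSAR Scoring": 10,
    "Failure Boundary": 2,
    "Compound Dossier": 30,
}

total = sum(STAGE_MINUTES.values())
hours, minutes = divmod(total, 60)
print(f"{hours} h {minutes} min")  # 4 h 27 min
```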
Validated Results
TYK2 — A Phase III Clinical Target
| Metric | Value |
|---|---|
| Total Compounds | 108,220+ |
| ADMET Passed (5 targets) | 71,651 |
| External Benchmark R | 0.5+ |
| Pareto Top (5 targets) | 2,403 |
Publications & Methods
| Title | Type | Status |
|---|---|---|
| MolForge: Failure Boundary Mapping with Conformal Prediction for Drug Discovery | Preprint | In preparation |
| ROBOGATE: Adaptive Sampling for Multi-Objective Pareto Optimization in ADMET Space | Preprint | In preparation |
| Uncertainty-Aware Compound Tiering: A 3-Gate Framework for AI Drug Discovery | Conference | Submitted |
| TNIK Inhibitors for Obesity: A Computational Repurposing Study | Research Article | In preparation |
| MolForge-Gen: A Property-Conditioned Transformer for FTO-Aware Multi-Target Drug Design | Technical Report | Data collected |
| TNIK Binding Affinity Prediction: A GNN Approach with Conformal Calibration | Preprint | In preparation |