A short, honest account of what RataFold tries to do, what it does not, and the choices behind the way we publish it.
AlphaFold and the structures it predicts are the foundation we build on; without that work, this project would not exist. AlphaFold gives the most likely shape of a protein. We try to be useful in a small, complementary way: surfacing a curated set of alternative shapes that the same backbone could plausibly adopt, with energies, structural distances, and an honest confidence check on each.
The hope is that these alternatives are useful starting points for downstream work — molecular dynamics validation, docking exploration, thinking about binding states or misfolded forms. We are explicit that they are candidates, not verified structures.
Released under CC-BY 4.0 — free to use, free to redistribute, free to integrate into any downstream pipeline. No login, no API key.
The search engine behind these variants is kept private while the outputs, schema, viewer, and audit framework are open. The contribution we hope to make is the combination of four ordinary ideas put together for this purpose:
10^(constant × residue-count), roughly 10^883 for insulin and 10^6184 for the Amyloid-β precursor.
None of these ideas is new on its own. We are simply trying to put them together carefully and publish what falls out.
Trying to walk through a 10^50-conformation space by brute force is not feasible on current hardware. Some rough numbers, for context:
| Hardware | Time to evaluate 10^50 conformations |
|---|---|
| Modern laptop core | ~3.5 × 10^40 years |
| Single H100-class GPU | ~1 × 10^37 years |
| One exascale supercomputer | ~1 × 10^31 years |
| All top-10 supercomputers combined | ~3 × 10^29 years |
| (For reference: age of the universe) | ~1.4 × 10^10 years |
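The table's arithmetic can be checked in a few lines. The sketch below is our own back-of-the-envelope reading, not part of the RataFold engine: it takes the years quoted in the table and works backwards to the throughput (evaluations per second) each row implies, assuming a constant rate and a 365.25-day year. The function and constant names are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.156e7 seconds
N_CONFORMATIONS = 1e50                 # search-space size used in the table
UNIVERSE_AGE_YEARS = 1.4e10            # figure quoted alongside the table

# Years quoted in the table for each hardware class.
TABLE_YEARS = {
    "Modern laptop core": 3.5e40,
    "Single H100-class GPU": 1e37,
    "One exascale supercomputer": 1e31,
    "All top-10 supercomputers combined": 3e29,
}


def implied_rate(years, n=N_CONFORMATIONS):
    """Evaluations per second needed to cover n conformations in `years`."""
    return n / (years * SECONDS_PER_YEAR)


def universe_ages(years):
    """Express a duration in years as a multiple of the universe's age."""
    return years / UNIVERSE_AGE_YEARS


if __name__ == "__main__":
    for name, years in TABLE_YEARS.items():
        print(f"{name}: ~{implied_rate(years):.1e} evaluations/s, "
              f"{universe_ages(years):.1e} universe ages")
```

Even the fastest row works out to more than 10^19 times the age of the universe, which is the point the table is making.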
Anything that wants to reach beyond about 10^30 in conformational space has to do so by being clever about which points to evaluate, not by being faster per evaluation. Our work sits in that "be clever" tradition rather than competing on raw compute. The 30-second-per-protein figure is a consequence of the sampling discipline, not of any privileged hardware.
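One way to read the "about 10^30" ceiling, offered as our interpretation of the numbers rather than a stated derivation: if the top-10 supercomputers combined need ~3 × 10^29 years for 10^50 conformations, then running them for the entire age of the universe (~1.4 × 10^10 years) covers only a few times 10^30 points. The sketch below does that arithmetic and also shows how little a hypothetical million-fold hardware speedup would change the picture. All names are illustrative.

```python
import math

N_CONFORMATIONS = 1e50             # search-space size from the table
UNIVERSE_AGE_YEARS = 1.4e10        # age of the universe, from the table
TOP10_YEARS_FOR_FULL_SPACE = 3e29  # table row: all top-10 supercomputers


def reachable(budget_years, years_for_full_space, n=N_CONFORMATIONS):
    """Conformations evaluated in `budget_years` at a constant rate."""
    return n * budget_years / years_for_full_space


# Brute force for the whole age of the universe on the fastest row.
ceiling = reachable(UNIVERSE_AGE_YEARS, TOP10_YEARS_FOR_FULL_SPACE)

# Hypothetical: the same budget on hardware a million times faster.
faster = reachable(UNIVERSE_AGE_YEARS, TOP10_YEARS_FOR_FULL_SPACE / 1e6)

if __name__ == "__main__":
    print(f"reachable today:       ~10^{math.log10(ceiling):.1f}")
    print(f"with 10^6x faster h/w: ~10^{math.log10(faster):.1f}")
    print(f"fraction of the space: {faster / N_CONFORMATIONS:.1e}")
```

A million-fold speedup moves the exponent by six, which still leaves the overwhelming majority of a 10^50 space untouched. That is why the choice of which points to evaluate matters more than per-evaluation speed.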
Every tool below does something we cannot — we are listing them with respect.
| Tool | What it does well |
|---|---|
| AlphaFold | State-of-the-art single-structure prediction. The native structures we display come from AlphaFold; we do not produce predictions ourselves. |
| Molecular Dynamics (GROMACS, AMBER, OpenMM) | Physics-based trajectories with real timesteps. The trusted way to validate any candidate structure, including ours. |
| Rosetta | Structure design and ab initio sampling. Decades of refinement; the reference point for high-quality protein modeling. |
| RataFold | A curated set of energy-ranked atomic-detail alternatives to each AlphaFold prediction, with an AF-confidence honesty check, in roughly 30 seconds per protein. |
The search engine is kept private; the data, schema, viewer, and audit framework are open. This separation is pragmatic, not philosophical — it lets us improve the engine without breaking downstream consumers, and lets you build on the data without depending on us being around.
If you are doing academic work that needs methodology details to publish, please reach out. We will share enough to make your study reproducible against our outputs.
Thanks for reading. We are building this carefully and welcome corrections.