Responsibility

A short, honest account of what RataFold tries to do, what it does not, and the choices behind the way we publish it.

What we're trying to do

AlphaFold and the structures it predicts are the foundation we build on; without that work, this project would not exist. AlphaFold gives the most likely shape of a protein. We try to be useful in a small, complementary way: surfacing a curated set of alternative shapes that the same backbone could plausibly adopt, with energies, structural distances, and an honest confidence check on each.

The hope is that these alternatives are useful starting points for downstream work — molecular dynamics validation, docking exploration, thinking about binding states or misfolded forms. We are explicit that they are candidates, not verified structures.

What we publish, per protein

Released under CC-BY 4.0 — free to use, free to redistribute, free to integrate into any downstream pipeline. No login, no API key.

The approach, briefly

The search engine behind these variants is kept private while the outputs, schema, viewer, and audit framework are open. The contribution we hope to make is the combination of four ordinary ideas put together for this purpose:

  1. A structured state space. Each protein is mapped into a discrete, finite, reproducible space whose size scales as 10^(constant × residue-count): roughly 10^883 for insulin and 10^6184 for the Amyloid-β precursor.
  2. Sampling, not enumeration. Spaces of this size cannot be enumerated by any current computer. We sample them, biasing toward energetically reasonable neighborhoods and keeping a rolling pool of the most interesting candidates.
  3. Atomic-grade reconstruction. Every accepted sample is rebuilt as a full-atom structure (backbone + sidechains) using the per-residue geometry measured in the AlphaFold model, then scored with a multi-term force field.
  4. AlphaFold-aware honesty check. Each variant carries a verdict comparing it against the regions AlphaFold was most confident about (pLDDT ≥ 70). Variants that preserve those regions earn an AF ✓; ones that scramble them are flagged so users can prioritize accordingly.

None of these ideas is new on its own. We are simply trying to put them together carefully and publish what falls out.
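For readers who want to run a comparable check on their own side, the fourth idea can be sketched in a few lines of Python. This is an illustration only, not RataFold's implementation: the function names, the use of C-alpha RMSD on pre-superimposed coordinates, and the 2.0 Å cutoff are all assumptions made for the example. Only the pLDDT ≥ 70 threshold comes from the text above.

```python
import math

PLDDT_CONFIDENT = 70.0  # threshold stated in the text
RMSD_CUTOFF = 2.0       # hypothetical cutoff in angstroms, for illustration only


def confident_region_rmsd(native, variant, plddt, threshold=PLDDT_CONFIDENT):
    """RMSD over C-alpha positions whose pLDDT meets the threshold.

    native, variant: lists of (x, y, z) tuples, one per residue, assumed
    to be superimposed in the same coordinate frame already.
    plddt: per-residue confidence scores taken from the AlphaFold model.
    Returns None when no residue reaches the threshold.
    """
    if not (len(native) == len(variant) == len(plddt)):
        raise ValueError("inputs must have one entry per residue")
    squared = [
        sum((a - b) ** 2 for a, b in zip(n, v))
        for n, v, p in zip(native, variant, plddt)
        if p >= threshold
    ]
    if not squared:
        return None
    return math.sqrt(sum(squared) / len(squared))


def af_verdict(native, variant, plddt, cutoff=RMSD_CUTOFF):
    """Label a variant by how well it keeps the confident regions in place."""
    rmsd = confident_region_rmsd(native, variant, plddt)
    if rmsd is None:
        return "no confident region"
    return "preserved" if rmsd <= cutoff else "flagged"


# Toy four-residue chain: the first two residues are confident, the last two are not.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
plddt = [92.0, 88.0, 45.0, 30.0]

# Moves only the low-confidence tail.
tail_moved = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 5.0, 0.0), (11.4, 9.0, 0.0)]
# Displaces the confident residues by 4 angstroms each.
core_moved = [(0.0, 4.0, 0.0), (3.8, -4.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

print(af_verdict(native, tail_moved, plddt))
print(af_verdict(native, core_moved, plddt))
```

In this toy case the first variant counts as preserved because the low-confidence tail is ignored, while the second is flagged. A real check would first superimpose the structures and would likely use more than a single distance cutoff.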

Why this is computationally hard

Trying to walk through a 10^50-conformation space by brute force is not feasible on current hardware. Some rough numbers, for context:

Hardware | Time to evaluate 10^50 conformations
Modern laptop core | ~3.5 × 10^40 years
Single H100-class GPU | ~1 × 10^37 years
One exascale supercomputer | ~1 × 10^31 years
All top-10 supercomputers combined | ~3 × 10^29 years

For comparison, the age of the universe is ≈ 1.4 × 10^10 years.

Anything that wants to reach beyond about 10^30 in conformational space has to do so by being clever about which points to evaluate, not by being faster per evaluation. Our work sits in that "be clever" tradition rather than competing on raw compute. Our turnaround of roughly 30 seconds per protein is a consequence of the sampling discipline, not of any privileged hardware.
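The arithmetic behind estimates of this kind is short enough to check in a few lines of Python. The throughput values used below (one million and 10^18 evaluations per second) are hypothetical round numbers, not measured rates for any hardware; the space sizes and the universe age are the figures quoted on this page.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.156e7 seconds
UNIVERSE_AGE_YEARS = 1.4e10            # figure quoted in the text


def log10_years_to_enumerate(log10_conformations, evals_per_second):
    """Return log10 of the years needed to evaluate every conformation.

    Working in log10 keeps the arithmetic safe for exponents such as 883
    or 6184, which would overflow a float if computed directly.
    """
    return (
        log10_conformations
        - math.log10(evals_per_second)
        - math.log10(SECONDS_PER_YEAR)
    )


# Hypothetical throughput of one million evaluations per second.
log10_years = log10_years_to_enumerate(50, 1e6)
log10_universe_ages = log10_years - math.log10(UNIVERSE_AGE_YEARS)

print(f"about 10^{log10_years:.1f} years")
print(f"about 10^{log10_universe_ages:.1f} universe ages")

# Same calculation for a 10^883 space at a hypothetical 10^18 evaluations per second.
print(f"about 10^{log10_years_to_enumerate(883, 1e18):.1f} years")
```

Because each factor of ten in throughput removes only one from the exponent, no realistic speed-up closes a gap of this size, which is the point the section makes.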

Where we fit among existing tools

Every tool below does something we cannot — we are listing them with respect.

Tool | What it does well
AlphaFold | State-of-the-art single-structure prediction. The native structures we display come from AlphaFold; we do not produce predictions ourselves.
Molecular Dynamics (GROMACS, AMBER, OpenMM) | Physics-based trajectories with real timesteps. The trusted way to validate any candidate structure, including ours.
Rosetta | Structure design and ab initio sampling. Decades of refinement; the reference point for high-quality protein modeling.
RataFold | A curated set of energy-ranked atomic-detail alternatives to each AlphaFold prediction, with an AF-confidence honesty check, in roughly 30 seconds per protein.

What we don't claim

How we try to be a good citizen

Why the search engine stays private

The search engine is kept private; the data, schema, viewer, and audit framework are open. This separation is pragmatic, not philosophical — it lets us improve the engine without breaking downstream consumers, and lets you build on the data without depending on us being around.

If you are doing academic work that needs methodology details to publish, please reach out. We will share enough to make your study reproducible against our outputs.

Thanks for reading. We are building this carefully and welcome corrections.