A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules
Ruiz-Botella, M, Sales-Pardo, M, Guimerà, R.
Nat. Mach. Intell. (2026).
Developing new molecular compounds is crucial to address pressing challenges, from health to environmental sustainability. However, exploring the molecular space to discover new molecules is difficult owing to the vastness of the space. Here we introduce CoCoGraph, a collaborative and constrained graph diffusion model capable of generating molecules that are guaranteed to be chemically valid. Thanks to the constraints built into the model and to the collaborative mechanism, CoCoGraph outperforms state-of-the-art approaches on standard benchmarks while being more efficient. Analysis of 36 chemical properties also demonstrates that CoCoGraph generates molecules with distributions more closely matching real molecules than current models. To illustrate the potential of the model, we created a database of 8.2 million synthetically generated molecules and show how this database and CoCoGraph could be used for molecular discovery. We also conduct a Turing-like test with organic chemistry experts to further assess the plausibility of the generated molecules and the potential biases and limitations of CoCoGraph.