494d The Core and the Most Useful Molecules of Organic Chemistry

Kyle J. M. Bishop (1), Rafal Klajn (2), and Bartosz A. Grzybowski (2). (1) Dept of Chemical and Biological Engineering, Northwestern University, 2145 Sheridan Rd/TechE136, Evanston, IL 60208, (2) Chemical Engineering, Northwestern University, 2145 Sheridan Rd/TechE136, Evanston, IL 60208

Organic syntheses reported in the literature between 1700 and 2004 are analyzed at the simplified level of a connected network. Using mathematical tools from network theory and statistical physics, we demonstrate that there exists a small set of strongly connected, chemically diverse substances (the “core”) from which the majority of other known organic compounds (the “periphery”) can be made in three or fewer synthetic steps. The core thus represents the backbone of all organic syntheses and contains synthetically important molecules. Furthermore, knowledge of the core, combined with Monte Carlo (MC) search algorithms, allows identification of the most useful chemicals, that is, those from which the maximal number of other substances can be made. Aside from their purely scientific interest, these collections of most-useful compounds should be of practical value to specialty chemicals companies, which could use them to optimize product selection and cater to the most diverse group of chemical customers. We also discuss how the analysis of compounds not connected to the core or periphery (i.e., the “islands”) can help identify challenging synthetic targets.
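The core/periphery picture described above can be illustrated on a toy reaction network. The following is a minimal sketch under stated assumptions, not the authors' method: the graph `toy`, the compound labels, and the helper names (`reachable`, `largest_strongly_connected_set`, `classify`) are all hypothetical, the "core" is approximated as the largest strongly connected set of nodes, and everything not reached from it within three steps is simply lumped together as `unreached` (which includes, but is broader than, the islands). The Monte Carlo search is not sketched here.

```python
from collections import deque

def reachable(graph, sources, max_steps=None):
    """Breadth-first search; returns {node: steps from the nearest source}."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if max_steps is not None and dist[node] >= max_steps:
            continue  # do not expand beyond the step limit
        for product in graph.get(node, ()):
            if product not in dist:
                dist[product] = dist[node] + 1
                queue.append(product)
    return dist

def reverse(graph):
    """Flip every substrate -> product edge."""
    rev = {}
    for substrate, products in graph.items():
        for product in products:
            rev.setdefault(product, []).append(substrate)
    return rev

def all_nodes(graph):
    return set(graph) | {p for products in graph.values() for p in products}

def largest_strongly_connected_set(graph):
    """Largest set of nodes that can all be made from one another.

    Simple quadratic approach, adequate for a toy graph only.
    """
    rev = reverse(graph)
    best, seen = set(), set()
    for node in sorted(all_nodes(graph)):
        if node in seen:
            continue
        component = set(reachable(graph, [node])) & set(reachable(rev, [node]))
        seen |= component
        if len(component) > len(best):
            best = component
    return best

def classify(graph, max_steps=3):
    """Split nodes into core, periphery (<= max_steps from core), and the rest."""
    core = largest_strongly_connected_set(graph)
    dist = reachable(graph, sorted(core), max_steps)
    periphery = set(dist) - core
    unreached = all_nodes(graph) - core - periphery
    return core, periphery, unreached

# Hypothetical toy network: keys are substrates, values are their products.
toy = {
    "A": ["B", "D"], "B": ["C"], "C": ["A"],  # A, B, C interconvert
    "D": ["E"], "E": ["F"], "F": ["G"],       # linear chain leaving the cycle
    "X": ["Y"],                               # disconnected pair
}
core, periphery, unreached = classify(toy)
print(sorted(core), sorted(periphery), sorted(unreached))
```

In this toy network, G lies four steps from the cycle and so falls outside the three-step periphery, while X and Y have no connection to it at all; a real analysis would need to distinguish these two cases.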