In a pioneering study, University of Bonn researchers have developed an AI similar to ChatGPT, but for molecules. This “chemical language model” predicts compounds with dual-target activities, opening new vistas in pharmaceutical research.
Researchers from the University of Bonn have trained an artificial intelligence model to predict potential active ingredients with specific properties, much like a ChatGPT for molecules. The study, published in the journal Cell Reports Physical Science, could help pharmaceutical research identify compounds with dual-target activities.
The AI model, referred to as a chemical language model, is designed to generate the structural formulas of chemical compounds that can bind to two different target proteins simultaneously.
“In pharmaceutical research, these types of active compounds are highly desirable due to their polypharmacology,” Jürgen Bajorath, a professor and chair of Life Science Informatics at the University of Bonn, said in a news release.
“Because compounds with desirable multi-target activity influence several intracellular processes and signaling pathways at the same time, they are often particularly effective – such as in the fight against cancer,” he added.
Traditionally, such dual-target effects might be achieved by co-administering multiple drugs. However, this carries risks such as unwanted drug interactions, and the drugs may be metabolized at different rates in the body, which complicates their administration.
The AI model developed at the University of Bonn offers a way around this by proposing single compounds that could achieve these effects on their own.
Creating single-target molecules in drug discovery is a formidable challenge. Designing compounds with predefined dual-target activities is even more complex.
The newly developed chemical language model could change this aspect of drug discovery. Much as ChatGPT learns from vast amounts of written text, the chemical language model has been trained with thousands of chemical representations known as SMILES strings, which encode the structures and compositions of organic molecules as sequences of letters and symbols.
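To make the idea of a SMILES string concrete, here is a minimal Python sketch that splits one into the tokens a language model would read. The regular expression and the `tokenize_smiles` helper are illustrative assumptions, not the tokenizer used in the study, and the pattern is deliberately simplified: it accepts any single letter as an atom rather than checking chemical validity.

```python
import re

# Simplified token pattern (an assumption for illustration): bracket atoms,
# the two-letter halogens Br and Cl, two-digit ring labels such as %12, then
# single-letter atoms, single-digit ring labels, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/()@.:~*$]"
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; reject characters the pattern misses."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


# Aspirin, written as a SMILES string.
ASPIRIN = "CC(=O)Oc1ccccc1C(=O)O"

if __name__ == "__main__":
    print(tokenize_smiles(ASPIRIN))
    print(tokenize_smiles("ClCCBr"))  # two-letter atoms stay intact
```

Each token plays the role that a word or word fragment plays in a text model: the model sees the molecule as a sequence and learns which symbols tend to follow which.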
“We have now trained our chemical language model with pairs of strings,” Sanjana Srinivasan, a member of Bajorath’s research group, said in the news release. “One of the strings described a molecule that we know only acts against one target protein. The other represented a compound that, in addition to this protein, also influences a second target protein.”
The AI model was trained with over 70,000 pairs of these strings, allowing it to learn what distinguishes dual-target compounds from single-target ones. After this training, the model could take a known compound as input and suggest molecules that would act not only against that compound's target protein but also against a second, distinct protein.
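The paired setup described above resembles a translation task: the single-target compound is the input sequence and the dual-target compound is the desired output. The sketch below shows one way such pairs could be turned into numeric sequences for a sequence-to-sequence model. Everything in it is an assumption for illustration: the `CompoundPair` class, the character-level encoding (which would split two-letter atoms such as Cl), and the placeholder molecules, which are chosen only because they are valid SMILES and carry no claim about biological activity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundPair:
    source_smiles: str  # compound known to act against one target only
    target_smiles: str  # compound acting against that target and a second one


# Special tokens (hypothetical names) for padding and sequence boundaries.
PAD, START, END = "<pad>", "<start>", "<end>"


def build_vocab(pairs: list[CompoundPair]) -> dict[str, int]:
    """Map every character seen in the pairs, plus special tokens, to an id."""
    chars = sorted(
        {ch for p in pairs for ch in p.source_smiles + p.target_smiles}
    )
    return {tok: i for i, tok in enumerate([PAD, START, END] + chars)}


def encode(smiles: str, vocab: dict[str, int]) -> list[int]:
    """Encode one SMILES string as ids wrapped in start and end markers."""
    return [vocab[START]] + [vocab[ch] for ch in smiles] + [vocab[END]]


# Placeholder pairs: syntactically valid SMILES, NOT real activity data.
pairs = [
    CompoundPair("c1ccccc1", "c1ccccc1O"),
    CompoundPair("CCO", "CCN"),
]

vocab = build_vocab(pairs)
examples = [
    (encode(p.source_smiles, vocab), encode(p.target_smiles, vocab))
    for p in pairs
]

if __name__ == "__main__":
    for src, tgt in examples:
        print(src, "->", tgt)
```

A model trained on many such input and output sequences learns a mapping from one to the other, which is what lets it propose a candidate dual-target molecule when given a new single-target compound.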
The versatility of the AI model was further enhanced through a fine-tuning phase, where the model was trained with several dozen specialized pairs. This phase prepared the AI to predict compounds affecting entirely different classes of enzymes or receptors, akin to asking ChatGPT to switch from writing a sonnet to a limerick.
After fine-tuning, the AI was able to suggest molecules already known to act against desired protein combinations.
“This shows that the process works,” Bajorath said.
However, the real power of this approach lies in the AI’s ability to suggest novel chemical structures.
“To a certain extent, it triggers ‘out of the box’ ideas and comes up with original solutions that can lead to new design hypotheses and approaches,” he added.
The study marks a notable step forward in the integration of AI into the life sciences, potentially setting new standards in drug discovery and development.