A protein target is selected, and an inseparable DNA-Encoded Library (DEL) of approximately 50 billion compounds is created, with each compound assembled from chemical building blocks and labeled with a DNA tag. A binding assessment is then performed using methods such as AS-MS and SPR to identify compounds that bind to the target protein. The top 100,000 DEL binders are selected, and their tags are sequenced to filter out inactive binders. From the remaining binders, active compounds are identified using the Receptor.ai fit-for-target workflow. To reduce the cost and time of synthesis, the platform is also used to search for the best analogues among over 30 billion commercially available compounds. Around 100 active compounds are identified for further study.
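The tag-sequencing filter can be illustrated with a small sketch that ranks DNA tags by how strongly they are enriched in the target selection relative to a no-target control. This is a generic enrichment ratio and not necessarily the scoring used in the pipeline described here; the function name `rank_by_enrichment`, the pseudocount, and the idea of a control sample are assumptions made for illustration.

```python
from collections import Counter


def rank_by_enrichment(target_reads, control_reads, top_n, pseudocount=1.0):
    """Rank DNA tags by read enrichment in the target sample vs. a control.

    target_reads / control_reads: iterables of tag identifiers, one per read.
    Hypothetical helper: a pseudocount keeps unseen control tags from
    producing a division by zero.
    """
    target = Counter(target_reads)
    control = Counter(control_reads)
    target_total = sum(target.values())
    control_total = sum(control.values())

    scores = {}
    for tag, count in target.items():
        target_freq = (count + pseudocount) / (target_total + pseudocount)
        control_freq = (control.get(tag, 0) + pseudocount) / (control_total + pseudocount)
        scores[tag] = target_freq / control_freq

    # Highest enrichment first; keep only the requested number of binders.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Toy data: tag "A" dominates the target selection, tag "B" the control.
toy_target = ["A"] * 8 + ["B"] * 2
toy_control = ["A"] * 1 + ["B"] * 9
top_binders = rank_by_enrichment(toy_target, toy_control, top_n=1)
```

In a real campaign the cut-off would be the top 100,000 tags rather than one, and the reads would come from sequencing output rather than in-memory lists.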
Advantages of our DEL pipeline:
The search begins in a commercial space of approximately 20 billion compounds, where the Receptor.ai fit-for-target workflow is used to design initial hits. This yields a query set of more than 1,000 hit candidates. A set of more than 100 billion DNA-encoded libraries is then screened to select the most appropriate target-specific DEL, using a similarity query built from diverse, active fragments. A binding assessment is performed, and sequencing is carried out on a reduced DEL of approximately 1 billion compounds to decrease noise. Active compounds are then selected from the binders, and the Receptor.ai platform is used to search for analogues in commercial spaces to reduce the cost and time of synthesis. Approximately 100 active compounds are selected for additional investigation.
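A similarity query of this kind is commonly expressed as a Tanimoto comparison between molecular fingerprints. The sketch below is a minimal version that assumes fingerprints are already available as sets of "on" bit indices; the function names, the threshold, and the toy catalogue are hypothetical, and the source does not state which similarity measure the platform uses.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def find_analogues(query_fp, catalogue, threshold=0.5):
    """Return (compound_id, similarity) pairs at or above the threshold, best first.

    catalogue: dict mapping a compound id to its fingerprint bit set.
    """
    hits = []
    for compound_id, fingerprint in catalogue.items():
        similarity = tanimoto(query_fp, fingerprint)
        if similarity >= threshold:
            hits.append((compound_id, similarity))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints (assumed values, not real chemistry).
query = {1, 2, 3, 4}
toy_catalogue = {
    "c1": {1, 2, 3, 4},   # identical to the query
    "c2": {1, 2, 3, 9},   # shares three of five distinct bits
    "c3": {7, 8},         # unrelated
}
analogues = find_analogues(query, toy_catalogue)
```

A linear scan like this is only practical for small catalogues; searching billions of compounds would need an indexed or approximate search, which is outside the scope of this sketch.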
The drug-target interaction (DTI) prediction process begins with a search of a fragment library for the best molecular fragments for each protein subpocket. The fragments are represented as graphs and processed through graph neural network blocks, followed by dense neural network layers. The output identifies the best fragments for each subpocket.
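To make the "graph neural network blocks followed by dense layers" idea concrete, here is a minimal pure-Python sketch of mean-aggregation message passing followed by a single dense output unit. It is not the model described above: the weight matrices, the sharing of weights across blocks, the mean pooling, and every function name are assumptions chosen to keep the example small.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def message_passing_step(features, edges, w_self, w_nbr):
    """One GNN block: combine each atom's features with the mean of its neighbours'."""
    n_atoms = len(features)
    neighbours = {i: [] for i in range(n_atoms)}
    for i, j in edges:  # undirected bonds
        neighbours[i].append(j)
        neighbours[j].append(i)

    updated = []
    for i in range(n_atoms):
        dim = len(features[i])
        nbrs = neighbours[i]
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim
        combined = [a + b for a, b in zip(matvec(w_self, features[i]), matvec(w_nbr, mean))]
        updated.append([max(0.0, x) for x in combined])  # ReLU
    return updated


def score_fragment(features, edges, w_self, w_nbr, w_out, b_out, n_blocks=2):
    """Run n_blocks message-passing steps, mean-pool the atoms, apply a dense sigmoid unit."""
    hidden = features
    for _ in range(n_blocks):  # weights are shared across blocks for brevity
        hidden = message_passing_step(hidden, edges, w_self, w_nbr)
    dim = len(hidden[0])
    pooled = [sum(atom[k] for atom in hidden) / len(hidden) for k in range(dim)]
    logit = sum(w * x for w, x in zip(w_out, pooled)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Toy two-atom fragment with one bond and identity weights (assumed values).
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_features = [[1.0, 0.0], [0.0, 1.0]]
toy_edges = [(0, 1)]
toy_score = score_fragment(toy_features, toy_edges, identity, identity,
                           w_out=[0.5, -0.5], b_out=0.0, n_blocks=1)
```

In practice the weights would be learned, the subpocket would also be an input to the model, and a framework with automatic differentiation would replace the hand-written loops.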
Next, the compatibility of these fragments is checked using reaction-specific dense neural network layers. Because some reactions are predicted to fail and others to succeed, this step narrows the selection to the most compatible fragments.
Finally, the compatible fragments are combined into a complete molecule. The fragments are encoded with a SMILES tokenizer and passed through an attention-based neural machine translation model. The translated sequences are then decoded into a whole molecule using a detokenizer. The best ligand is ultimately constructed from the selected fragments within the protein binding pocket.
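The tokenizer and detokenizer steps can be sketched with a regular expression of the kind widely used for SMILES in sequence models. This is an illustrative sketch only: the pattern and the function names are assumptions, the attention-based translation model that sits between the two steps is not shown, and the platform's actual tokenizer may differ.

```python
import re

# Bracket atoms, two-letter halogens, organic-subset atoms, bonds, branches,
# and ring-closure digits each become one token.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize(smiles):
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


def detokenize(tokens):
    """Join tokens back into a SMILES string."""
    return "".join(tokens)


# Aspirin as a toy input.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
aspirin_tokens = tokenize(aspirin)
round_trip = detokenize(aspirin_tokens)
```

Tokenizing at the level of atoms and bonds rather than single characters keeps multi-character symbols such as `Cl`, `Br`, and bracket atoms intact, which gives the translation model a cleaner vocabulary. The round trip only checks the string form; it does not validate that the SMILES describes a chemically sensible molecule.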