IV Jornada del GRIB
Transcript
Computer-assisted Drug Discovery
Ferran Sanz
Unitat de Recerca en Informàtica Biomèdica (GRIB), IMIM, Universitat Pompeu Fabra, Barcelona
www.imim.es/grib

What does it take to discover a new drug?
• 12 years average drug development timeline.
• Total cost over 800 million $.

IT tools along the Drug Life Cycle
• Life-cycle stages: target identification; target validation; identification and obtaining of compounds; preclinical assessment; clinical trials; postmarketing monitoring
• IT tools: computational genomics; systems biology; virtual screening; structure-based drug design; ADMET prediction; biosimulation
• Across all stages: standardisation and interoperability; knowledge management and integration

Raw genomic data

Gene identification and annotation
By:
• signal identification
• sequence comparison

Comparative genomics

Protein sequences and protein structures
QIKDLLVSSSTDLDTTLVLVNAIYFK
GMWKTAFNAEDTREMPFHVTKQESKP
VQMMCMNNSFNVATLPAEKMKILELP
FASGDLSMLVLLPDEVSDLERIEKTI
NFEKLTEWTNPNTMEKRRVKVYLPQM
KIEEKYNLTSVLMALGMTDLFIPSAN
LTGISSAESLKISQAVHGAFMELSED
GIEMAGSTGVIEDIKHSPESEQFRAD
HPFLFLIKHNPTNTIVYFGRYWSP

Protein interaction networks (systems biology)

Prediction of protein structures
Relatively few 3D structures of proteins are experimentally known:

                                      2001       Estimate for 2005
No. of sequences known                600,000    millions
No. of 3D structures known (exp.)     14,000     ~40,000

Source: Lecture of A. 
Sali, 2001
⇒ Need for prediction of protein structures:
• "ab initio" prediction
• homology modelling

Homology modelling of cytochrome P450 1A2: sequence alignment

Homology model of cytochrome P450 1A2

Computer-assisted drug design
• Direct CADD (complementarity)
• Indirect CADD (similarity)

Structure-based drug design

Indirect CADD: CYP1A2 substrates
[Figure: chemical structures of 7-ethoxyresorufin, phenacetin, MeIQ, caffeine and enoxacin]

Molecular Interaction Potentials (MIP)
• Useful tool to study and compare molecular interaction features
• Definition: interaction energies of the studied compounds with relevant molecular probes placed around them
• Probes: H+ (MEP); NH3+; OH; CH3; etc.
• Usually computed on grids of points around the compounds
• Usually plotted by means of isopotential surfaces

MEP distributions of CYP1A2 substrates
[Figure: MEP maps of 7-ethoxyresorufin, phenacetin, caffeine, MeIQ and enoxacin]

CYP1A2 MEP-based pharmacophore
[Figure: pharmacophore made of two MEP minima, the oxidation site and the heterocyclic system, all in the same plane; distances given in Å]

Issues to be considered in molecular library sampling
• Selection of the relevant molecular descriptors
• Characteristics of the descriptor space: in general, the descriptors are not independent and, consequently, redundancy needs to be avoided
• Sampling methodology

Information redundancy
[Figure: two descriptor tables for compounds 1 to n, one with descriptors MW and DM, the other with MW, MVol and DM]
• The same information (approx.) is considered twice
• Reduced relative weight
• Different relative distances between the compounds!
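The effect described on the "Information redundancy" slide can be illustrated with a small sketch. MW, DM and MVol are the descriptor labels used on the slide; the numerical values are hypothetical, autoscaled numbers invented for illustration, and MVol is simply set equal to MW to mimic a descriptor that carries the same information.

```python
import math

# Hypothetical autoscaled descriptor values, invented for illustration only.
compounds = {
    "Compound 1": {"MW": 0.0, "DM": 0.0},
    "Compound 2": {"MW": 1.0, "DM": 0.0},
    "Compound 3": {"MW": 0.0, "DM": 1.2},
}


def distance(a, b, descriptors):
    """Euclidean distance between two compounds over the chosen descriptors."""
    return math.dist([a[d] for d in descriptors], [b[d] for d in descriptors])


def with_redundant_volume(values):
    """Copy a descriptor dict and add MVol as an exact duplicate of MW."""
    return {**values, "MVol": values["MW"]}


two = ["MW", "DM"]
three = ["MW", "MVol", "DM"]
extended = {name: with_redundant_volume(v) for name, v in compounds.items()}

# Distances from compound 1, first without and then with the redundant descriptor.
d12_two = distance(compounds["Compound 1"], compounds["Compound 2"], two)
d13_two = distance(compounds["Compound 1"], compounds["Compound 3"], two)
d12_three = distance(extended["Compound 1"], extended["Compound 2"], three)
d13_three = distance(extended["Compound 1"], extended["Compound 3"], three)

print(f"MW, DM:        d(1,2) = {d12_two:.2f}  d(1,3) = {d13_two:.2f}")
print(f"MW, MVol, DM:  d(1,2) = {d12_three:.2f}  d(1,3) = {d13_three:.2f}")
```

With these made-up values, compound 2 is the nearest neighbour of compound 1 when only MW and DM are used, but compound 3 becomes the nearest once the size information is counted twice. DM has lost relative weight and the relative distances have changed.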
Principal Components Analysis (PCA)
• A series of compounds is originally described by 3 variables (x1, x2 and x3), perhaps including redundant information
• The compounds are then described using a reduced number of new variables (PC1 and PC2) that are independent (not redundant at all) and retain the major part of the variability between the compounds

Sampling methodologies
• Random ("lottery")
• Systematic sampling on a molecular descriptor or PCA space
• More sophisticated methods (k-means, MDC)
Exemplified by the selection of a maximum-diversity sample of 50 amines from the 923 available in the Aldrich catalogue (to be introduced as substituents on a scaffold)

Simple random sampling
[Figure: sampled compounds in the MolProp1 / MolProp2 / MolProp3 space]
• Too low presence of "extreme" compounds
• Non-optimal drop in diversity

Factorial sampling
[Figure: sampled compounds in the MolProp1 / MolProp2 / MolProp3 space]
• Too high presence of "extreme" compounds

k-means clustering method
[Figure sequence: compounds (~) and cluster seeds (s) over successive iterations]
k-means clustering method
[Figure sequence, continued: compounds (~) and cluster seeds (s)] ... until stability

Library vs. sample score plots

Using different seeds for clustering

Discovering new targets: the case of GPCRs
• Membrane proteins involved in signal transduction (transfer of information from outside to inside the cell)
• More than half of all drugs act on GPCRs
• Large protein family (2001): 4170 GPCR sequences described (Swiss-Prot + TrEMBL)
  • 597 human
  • 448 complete
  • 86 "orphans", with no known function
See: http://www.gpcr.org/7tm/

GPCR modelling (difficulties)
• A single high-resolution (2.8 Å) 3D structure experimentally known: bovine rhodopsin
• High sequence variability
• Diversity of ligand binding modes

GPCR modelling (common features)
• Seven transmembrane α-helices
• Interaction with few and homologous G-proteins in the intracellular domain
• Highly conserved residues in the transmembrane domain

GPCR classification
• GPCR classes: Class A (rhodopsin-like); Class B; Class C; Class D; Class E
• Class A families: Amine; Peptide; Hormone protein; (Rhod)opsin; Olfactory; Prostanoid; Nucleotide-like (purinergic receptors, adenosine receptors); Cannabinoid; Platelet activating factor; Gonadotropin-releasing hormone; Thyrotropin-releasing hormone & Secretagogue; Melatonin; Viral; Lysosphingolipid & LPA (EDG); Leukotriene B4 receptor; Orphan/other

Adenosine receptors
• Natural agonist: adenosine
• 4 subtypes: A1; A2A; A2B; A3
• A1 receptor: involved in cardiovascular modulation (infarct) and renal control 
(diuresis)

A1AR mutagenesis data

hA1AR modelling
[Flowchart] Sequence alignment → TMH prediction → building of each TMH → energy minimization → TMH bundle packing (rhodopsin template) → energy minimization → molecular dynamics → conformation selection → binding site identification → ligand docking explorations

hA1AR modelling: sequence analysis

Helices building and optimisation
• Standard angles (φ = -59°; ψ = -44°; ω = 180°) for all the residues except Pro (φ = -71°)
• Molecular mechanics optimisation using AMBER 6 (PARM94 ff)
• Dielectric constant ε = 4r
• 10 Å cutoff
• Constraints on Cα progressively reduced: 50, 25, 10, 5, 0 kcal/mol·Å²
• Minimisation algorithm:
  • 500 steepest descent cycles + 500 conjugate gradient cycles for the 50 kcal/mol·Å² constraint
  • 500 conjugate gradient cycles for the rest of the constraints
  • Until RMS < 0.001 kcal/mol when no constraint is applied

TMH bundle packing & optimisation
Packing of the 7 TMH bundle
• Superposition on rhodopsin considering the Cα trace of segments defined by highly conserved residues
• Checking of:
  • highly conserved H-bond clusters (Asn in TMH1 + Asp in TMH2 + crystallisation H2O + Asn in TMH7)
  • hydrophobicity profiles
  • geometry of residues involved in ligand recognition
  • side-chain bumps
TMH bundle optimisation
• Same as for the isolated TMHs, but keeping the force constant at 5 kcal/mol·Å²

bRho crystallographic water molecule
• bRho: crystallographic water molecule between highly conserved residues
• Incorporation of a crystallographic water molecule into the hA1AR model

Molecular dynamics of the TMH bundle
Simulation:
• Software: AMBER 6 (PARM94 ff)
• Dielectric constant ε = 4r
• 10 Å cutoff
• Time: 1000 ps (1 ns)
• Temp.: 310 K
• Time step: 2 fs
• SHAKE algorithm
• Constraint on Cα: 5 kcal/mol·Å²
MD analysis:
• AMBER 6 (CARNAL)
• Visual inspection (VMD)
• H-bonds / aromatic clusters
• Protonation of TM histidines (δ or ε)
• Clustering of conformers
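The helix minimisation schedule and the MD settings listed above can be written down as plain data, which makes the arithmetic explicit. This is only a bookkeeping sketch, not AMBER input: the class and variable names are hypothetical, and treating the unrestrained stage as "conjugate gradient until the RMS criterion is met" is one reading of the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MinimisationStage:
    restraint: int                      # Cα restraint, kcal/mol·Å²
    steepest_descent: int               # steepest descent cycles
    conjugate_gradient: Optional[int]   # None = run until RMS < 0.001 (assumed reading)


# Restraints progressively reduced: 50, 25, 10, 5, 0.
HELIX_SCHEDULE = [
    MinimisationStage(50, 500, 500),
    MinimisationStage(25, 0, 500),
    MinimisationStage(10, 0, 500),
    MinimisationStage(5, 0, 500),
    MinimisationStage(0, 0, None),
]

# Cycles with a fixed length, i.e. everything before the unrestrained stage.
fixed_cycles = sum(
    s.steepest_descent + s.conjugate_gradient
    for s in HELIX_SCHEDULE
    if s.conjugate_gradient is not None
)

# MD settings from the slide: 1000 ps simulated with a 2 fs time step.
SIMULATION_PS = 1000
TIME_STEP_FS = 2
md_steps = SIMULATION_PS * 1000 // TIME_STEP_FS  # 1 ps = 1000 fs

print("restraint schedule:", [s.restraint for s in HELIX_SCHEDULE])
print("fixed-length minimisation cycles:", fixed_cycles)
print("MD integration steps:", md_steps)
```

Under this reading each helix goes through 2500 fixed minimisation cycles before the final unrestrained stage, and the 1 ns trajectory corresponds to 500,000 integration steps.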
Molecular dynamics

MD analysis: protonation of histidines
[Figure: histidine protonated on Nδ or on Nε]
• 1st trial: both histidines protonated in the Nε position. Interaction between His7.43 and Glu1.39 not detected. Consequently, Nε protonation of His7.43 is not feasible.
• 2nd trial: both histidines protonated in the Nδ position. Interaction between His7.43 and Glu1.39 observed, but Nδ protonation of His6.52 forces this residue to be exposed to the lipid bilayer.
• 3rd trial: protonation of His6.52 on Nε and His7.43 on Nδ. Agreement with experimental data, since His7.43 interacts with Glu1.39, and His6.52 is oriented toward the inner part of the bundle.

MD analysis: H-bonds

PCA/MDC analysis on the MD results
• PCA using the RMSs of all side chains (last 700 ps)
• Cluster analysis on the PCA space
• Selection of the conformer for docking analysis
[Data matrix: rows Conf. 1, Conf. 2, ..., Conf. 700; columns RMS res. 1, RMS res. 2, ..., RMS res. n]

hA1AR: Location of the binding site
• Adenosine = adenine + ribose
• Ribose docking exploration (GROUP module of GRID)
[Figure legend: polar, acidic, basic]

Validation of the ribose binding site
• Comparison with experimental data (ribose-containing complexes from the PDB)
• Description of each binding site using GRID Independent Descriptors (GRIND/ALMOND)
• Comparison of the GRIND correlograms using the Hodgkin similarity index

Adenosine docking
Done using AUTODOCK 3.0:
• Receptor rigid, ligand flexible
• Exploration: Lamarckian genetic algorithm
• Scoring function: AMBER-like force field
• 100 independent experiments: clustering on RMS, energetic evaluation
Two binding positions (A and B), each with 37% population; ∆Gbinding = -10.6 kcal/mol and -10.4 kcal/mol

Adenosine in A1: molecular dynamics simulation

hA1AR agonists
[Figure labels: Bulky; Alkylamine (bulky)]

hA1AR agonists
Compound                                          A1AR Ki (nM)   ∆G (kcal/mol)
ADO (adenosine)                                   73 (a)         -9.8
NECA (5'-N-ethylcarboxamidoadenosine)             13.6 (b)       -10.8
CPA (N6-cyclopentyladenosine)                     2.25 (b)       -11.9
CCPA (2-chloro-N6-cyclopentyladenosine; R2 = Cl)  0.83 (b)       -12.5
(a) EC50, inhibition of adenylate cyclase in rat A1AR (Daly et al. Biochem Pharmacol (1992); 43:1089)
(b) Klotz et al. Naunyn Schmiedebergs Arch Pharmacol (1998); 357:1

Docking results
Compound   Position A          Position B
           ∆G       Rank       ∆G       Rank
ADO        -10.4    1*         -10.6    3*
NECA       -        -          -11.1    1
CPA        -10.8    6          -13.5    1
CCPA       -10.7    6          -12.7    1
* quasi-degenerate solutions

Docking of A1AR agonists
Explanation for structure-affinity relationships:
• Hydrophobic pocket around R6
• Steric limitation for R2 and R5'
• Highly polar binding pocket for the ribose moiety
[Figure: agonist scaffold (R2, R5', R6) in the binding site; labelled residues: Ser1.46, Asp2.50, Leu3.33, Thr3.36, Trp6.48, Leu6.51, Ile7.39, Thr7.42, His7.43, Ser7.46]

5-HT2A and 5-HT2C receptors
Involved in psychological disorders, including depression, mania, anxiety, bipolar disorder and schizophrenia

Incorporation of loops
Database searching for similar loops:
• Same number of residues forming the loop (±2)
• Distance between extremes equal to rhodopsin (±5 Å)

Docking of serotonin
[Figure labels: F339, F340]
Mutagenesis experiments on 5-HT2A:
Mutant   Ki
D155N    ↓ 37 fold
S159A    ↓ 18 fold
F339L    ↑ 2.4 fold
F340L    ↓ 27 fold

Docking of ketanserin
Mutagenesis experiments on 5-HT2A:
Mutant   Ki
D120N    ↓ 10
D155N    ↓ 75
F339A    ↓ 12
F340Y    ↓ 73
W76A     ↓ 10
W336A    ↓ 100
Y370A    ↓ 20

Docking of QF610B in 5-HT2A
Possible docking hypotheses for the compound with R = H: I, IIa, IIb
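Returning to the hA1AR agonists table earlier in this section: its ∆G column is consistent with the standard relation ∆G = RT ln Ki. The sketch below checks this for the four Ki values from the table. The temperature of 300 K is an assumption (it is not stated on the slide) chosen because it reproduces the tabulated values; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3       # gas constant, kcal/(mol·K)
TEMPERATURE_K = 300.0   # assumption: the slide does not state the temperature


def binding_free_energy(ki_nanomolar, temperature=TEMPERATURE_K):
    """Return ∆G = RT ln(Ki) in kcal/mol, with Ki given in nM."""
    return R_KCAL * temperature * math.log(ki_nanomolar * 1e-9)


# Ki values (nM) as listed in the hA1AR agonists table.
ki_values_nm = {"ADO": 73.0, "NECA": 13.6, "CPA": 2.25, "CCPA": 0.83}

for name, ki in ki_values_nm.items():
    dg = binding_free_energy(ki)
    print(f"{name:5s} Ki = {ki:6.2f} nM   dG = {dg:6.1f} kcal/mol")
```

Rounded to one decimal, this gives -9.8, -10.8, -11.9 and -12.5 kcal/mol, matching the table. Note that the ADO entry is an EC50 rather than a Ki (footnote a), so for that compound the conversion is only approximate.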
Docking of QF610B in 5-HT2A
Docking results for the compound with R = H: I, IIa, IIb

Docking of QF610B in 5-HT2A
Compound   R          pKi 5-HT2A
QF610B     H          8.56
QF620B     n-pentyl   7.68

Docking of QF620B in 5-HT2A
Docking results for the compound with R = n-pentyl: I, IIa, IIb

Docking of QF620B in 5-HT2A

Current challenges in GPCR modelling
• Consideration of the structural water molecules
• Modelling of the receptor loops
• Modelling of the heterogeneous environment
• Modelling of the receptor activation processes
• Influence of ligands on receptor conformations
• GPCR dimerisation

Atypical antipsychotics

GRIND
• MIP distributions are automatically simplified, obtaining the most favorable regions for the interaction with the considered probe
• Distances between the most favorable interaction regions (and the corresponding energy values) are summarized in spectra-like plots called "correlograms"
• Correlograms allow the alignment-free comparison of compounds
• Usually, several correlograms are obtained for each compound, using several probes or pairs of probes
• Correlograms are stored in a data matrix
[Figure: data matrix of compounds by correlogram, e.g. NH-NH and NH-O=]

3D-QSAR analysis
Data matrix, one row per compound (Compound 1 ... Compound n):
• Y: biological activity (Y1 ... Yn)
• X1 ... Xm: autocorrelogram values using probe A
• Z1 ... Zm: autocorrelogram values using probe B
• Cross-correlogram values (probe A vs. probe B)
PLS statistical analysis: Activity = f (structural features)

3D-QSAR model for 5-HT2A antagonists
(n = 52; LV = 2; r² = 0.85; q² = 0.74)

Consistency between direct and indirect modelling

Pharmacogenomics vs. Pharmacogenetics
Pharmacogenomics is the application of genomic approaches and technologies to the identification of drug targets. 
Pharmacogenetics is a subset of pharmacogenomics which uses genomic/bioinformatic methods to identify genomic correlates, for example SNPs (Single Nucleotide Polymorphisms), characteristic of particular patient response profiles and uses those markers to inform the development and administration of therapies. (bioinformatics.org)

Pharmacogenomic and pharmacogenetic strategies
During drug development:
• Discovery of new targets.
• Consideration of genetic polymorphisms:
  • Designing drugs suited to groups of individuals with certain genetic characteristics ("druglets").
  • Selecting targets and drug structures not (or only slightly) affected by genetic polymorphisms ("best-in-class drugs").
• The subjects taking part in clinical trials need to be genetically characterised.
After marketing:
• Selecting the most appropriate drug and dose for the genetic profile of the patient.
• Relating adverse effects to genetic characteristics.

Pharmacogenetic approaches
Pre-market approaches:
• Target discovery.
• Consideration of genetic polymorphisms in drug design:
  • Designing relevant drugs for genetically defined groups of people ("druglets").
  • Selecting targets and drug structures not (or less) affected by genetic polymorphisms ("best-in-class drugs").
Post-market approaches:
• Selecting the drug and dosage most relevant for the genetic profile of the patient.
• Reporting side-effects with genetic information.

Pharmacogenetic approaches: "Druglets"
Theoretically attractive but... is it realistic?
• If the development of a new drug takes >10 years and >€500 million (a cost that could increase because of the genetic studies during the clinical trials).
• If it is extremely expensive to develop a drug for the whole population (without taking into account the inter-individual variability)…. 
Is it realistic and economically affordable to develop a different drug for every genetically distinct subpopulation?
More realistic approach: reconsider as potential "druglets", in genetically characterized subpopulations, drugs previously discarded because of their side-effects.

Molecular modeling in pharmacogenetics
Comparative modeling, considering the genetic variability, of:
• proteins
• protein-protein complexes
• ligand-protein complexes, particularly substrate-enzyme complexes

Pointing to the most effective drug target sites taking into account the genetic variability
(Maggio et al. Tibtech 2001;19: 266-272)

Single Nucleotide Polymorphisms (SNPs) database
snp.cshl.org

www.pharmgkb.org

How are drugs discovered today?
• Multidisciplinary teams
• Different organisations involved
• Geographically scattered
• Using different platforms
• Strict security requirements
⇒ Slow, inefficient process:
• Direct costs
• Opportunity costs
• Time
• etc.

Telecollaboration in drug R&D: Link3D
Main features:
• Project funded by the European Union.
• Based on a thorough investigation of user needs and on tests in real environments.
• Software designed to replace and supplement physical meetings.
• Incorporates audio and advanced moderation.
• Allows 2 to 10 participants.
• Meets the strict security and confidentiality requirements of the pharmaceutical industry.
• Authentication based on passwords and certificates.
• Encryption of the transmitted data.
• Accepts most graphical object formats (biopolymer sequences, 3D structures of biomolecules, molecular formulas, biological images, etc.). 
• Advanced functions for manipulating the shared objects (change of display format, marks, etc.).
• Multi-platform (MS-Windows, Linux, SGI-IRIX).
• Low bandwidth requirements.

Expected effects of telecollaboration with Link3D
[Figure: progress towards the GOAL over time under the classical working practice and under the proposed working practice; legend: classical meeting, scheduled virtual meeting, informal virtual meeting]

Acknowledgements
• Dr. Manuel Pastor
• Dr. Hugo Gutiérrez de Terán
• Cristina Dezi
• Fabien Fontaine
Universidad de Santiago de Compostela
Almirall Prodesfarma