HIV 2-D STRUCTURAL DATABASE

 


 T. N. Bhat

(phone: 301-975-5448; e-mail: bhat@nist.gov)

Biochemical Science Division

National Institute of Standards and Technology

Gaithersburg, MD 20899 U.S.A.

 

Dr. Mohamed Nasr

(phone: 301-496-0636; e-mail: mn12p@nih.gov)

DDCSB, TRP, DAIDS,

NIAID, NIH, Bethesda  MD


How to Use HIV 2-D Structural Database

·         Credits

·         Information in the database

·         Need for a central resource

·         Annotation of inhibitors and search using data-tree

·         Inhibitor Searches

·         Fragment Searches

·         Accessibility Requirements

·         HIV 2-D Structural database to 3-D Structural database

·         Back To main Page

HIV 2-D Structural database  HIV 3-D Structural database

Background and Significance: Since its outbreak, AIDS has caused the deaths of more than 20 million people, and the death toll is expected to triple by 2010 (CNN report, Sept. 30, 2002). It has shattered millions of families and orphaned more than 14 million children. AIDS is a major health concern in both the East and the West, and in both developed and developing countries. The rising infection rate of the virus and its ability to develop drug resistance have made effective treatment and eradication an international imperative.

A cure for AIDS is still far from a reality. Almost all current methods of AIDS treatment fall into the category of either containment or prevention. Advancing our knowledge of the AIDS virus offers the promise of developing effective treatments for the disease. Currently, much of the research on the treatment of AIDS is directed towards either vaccine development or drug development. Although several promising vaccine leads have been reported (Shiver et al. 2002; Barouch et al. 2002), no effective vaccine has been developed at this time (Veljkovic et al. 2002). Virtually all vaccines work by stimulating the immune system to make antibodies that target invading microbes, coat them and tag them for destruction. In the case of AIDS, it has been very hard to identify what type of antibody, if any, actually protects a person against infection, not to mention how to get the immune system to make such a substance. For that reason, much of the AIDS research in recent years has focused on vaccines that do not prevent infection but prime the body to fight a permanent war of suppression through ‘cell-mediated’ immunity (Sasaki et al. 1999; Toda et al. 1997).

Another approach to AIDS treatment is the use of drugs that selectively inhibit specific molecules such as the HIV protease (Gulick et al. 1997). In fact, such drugs provide the only proven method for the treatment of AIDS. Drugs specifically designed to inhibit the HIV protease, an aspartic protease that carries out the posttranslational processing of the viral gag-pol polypeptide into functional viral components, have been quite successful. The processing of the gag-pol translational product by HIV protease releases the viral replication enzymes (protease, reverse transcriptase/ribonuclease H, and integrase) (Kohl et al. 1988). This activity is essential for the viral life cycle, and therefore disrupting the proteolytic activity with inhibitors results in non-infectious virions, preventing infection (Lambert et al. 1992). For this reason, many laboratories have concentrated on elucidating the enzyme/inhibitor interactions of this and other enzymes, and many efforts have focused on developing strategies for disrupting critical macromolecular interactions required for the viral life cycle (Turner and Summers, 1999). Thus, there is a critical need for structural information on these systems for as long as drug development for the treatment of AIDS remains a work in progress, and support and infrastructure for such activities are all the more important at this time. For this reason, two databases have been established. (a) HIV 2-D Structural database: this database contains 2-D structural information, was established in collaboration with NIAID, and is the subject of this web page. (b) HIV 3-D Structural database: this database contains 3-D structural information obtained from X-ray or 3-D NMR studies and is a collaboration with NCI.


Need for a special annotation of structural information on AIDS related compounds: We are witnessing the emergence of a web-based “data-rich” era for chemical and biological compounds. In the past decade, databases have become an integral part of research and development in the biomedical sciences. Bioinformatics now plays an essential role in deciphering chemical data, and web sites are an important part of current information exchange. However, the ability to organize and retrieve data remains primitive. Users must be able to find the most closely related information when data for a specific substance are not available, and to discover compounds with desired structural characteristics and relationships. Currently, the capability of finding similar, related substances in large, complex collections is unsatisfactory. Enormous resources have been brought to bear by the research community on drug discovery activities that target various molecules associated with the AIDS virus. Frequently the scientists associated with these efforts publish their results, and structural information is the dominant mantra of several data resources on AIDS. The goal of this resource is to annotate, archive and distribute (Chem-BLAST) structural results and associated chemical data from as many sources as possible.

Why NIST and how does it fit with the overall mission of NIST: NIST, particularly CSTL, has been focusing on data-related work on biological and chemical-structure based data collections (see, for instance, | http://webbook.nist.gov/chemistry/ and | http://www.nist.gov/srd/ and | for prior work PDB). The production and dissemination of chemical information in NIST Standard Reference Data collections is part of the NIST mission. This program intends to provide advances in the annotation and dissemination of AIDS-related ligand data for AIDS research, with particular emphasis on drug design interests. None of the structural data held in this database are common to the PDB.

Data Standards: The Chemical Semantic Web appears to be a possible solution for the future needs of chemical databases. For this reason, HIVSDB efforts are also focused on developing Semantic Web technology (Semantic Web) using AIDS inhibitors.

Industrial interest: The majority of drug development activities for AIDS have been carried out by industry, and this continues to be the case. The proposed HIVSDB is expected to be the central archive and distribution system for structural data on complexes of both wild-type and mutant enzymes with AIDS drugs. Drug resistance mutations are the most troubling aspect of AIDS drug development (John L, Marra F, Ensom MH. 2001), and structural analysis and annotation of structural data are crucial for elucidating how to circumvent this problem. The HIVSDB is expected to actively facilitate this work.

Healthcare: Since its outbreak, AIDS has caused the deaths of more than 20 million people, and the death toll is expected to triple by 2010. It has shattered millions of families and orphaned more than 14 million children. AIDS is a major health concern in both the East and the West, and in both developed and developing countries. The rising infection rate of the virus and its ability to develop drug resistance have made effective treatment and eradication an international imperative. A cure for AIDS is still far from a reality. The most proven approach to the treatment of AIDS is the use of drugs that selectively inhibit specific molecules such as the HIV protease; in fact, such drugs provide the only proven method for the treatment of AIDS. Drugs specifically designed to inhibit the HIV protease, an aspartic protease that carries out the posttranslational processing of the viral gag-pol polypeptide into functional viral components, have been quite successful. The proposal is to develop a centralized archival, annotation and distribution system for structural data on AIDS.


Annotation of inhibitors: |(Inhibitor Searches using data-tree) Despite the wide and expanding availability and use of chemical and biochemical data collections, the ability to organize and retrieve structure-based data remains primitive. While it is possible to readily find compounds whose structures are known in advance, the ability of a user or automated search method to find similar substances in large, complex structural collections is generally unsatisfactory. Such searching or browsing serves at least two purposes: 1) to find the most closely related information when data for a specific substance are not available, and 2) to enable users to discover compounds with desired structural characteristics. The principal difficulty in such searching is that structural features of interest to a user often cannot be defined (and indexed) in advance, owing to the natural complexity of structure/property relations, which can depend on discipline, task and user.

One of the objectives of this project is to organize the inhibitor data in a tree-like arrangement and to develop sophisticated navigation tools to run the web interface. A sample of the inhibitor data tree is shown below (Fig. 1); examples may be found at Chem-BLAST for AIDS inhibitors.

 

 

 

The data-tree described above is novel, and it provides dynamic navigation paths for a user. For instance, a user may start with a ring and get all ligands with ring structures. Alternatively, a user may start with a given ligand, such as 003504, and traverse back in the data tree to locate 001836 and 001837, both of which have the motif of Val and Phe. The utility of a data-tree increases with the number, size and complexity of the molecules of interest. This annotation technique lays the foundation for Semantic Web technology on chemical compounds. It enables the attachment of 'meaning', i.e., semantics, to data in a manner that far exceeds the current practice of associating 'metadata' with data. This is accomplished by creating a knowledge base (or ontology) associated with the data.
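The two navigation directions described above (walking down from a motif to its ligands, and walking back up from a ligand to its structural relatives) can be illustrated with a small sketch. This is not the database's actual implementation: the `Node` class, the helper functions and the tree layout are assumptions made for illustration. Only the three ligand IDs and the Val/Phe motif come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a toy data-tree: a structural motif plus the ligands filed under it."""
    motif: str
    ligands: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a more specific motif below this one and return it."""
        child.parent = self
        self.children.append(child)
        return child


def ligands_below(node: Node) -> List[str]:
    """Walk down: collect every ligand filed at or below this node."""
    found = list(node.ligands)
    for child in node.children:
        found.extend(ligands_below(child))
    return found


def find_ligand(node: Node, ligand_id: str) -> Optional[Node]:
    """Locate the node under which a given ligand is filed."""
    if ligand_id in node.ligands:
        return node
    for child in node.children:
        hit = find_ligand(child, ligand_id)
        if hit is not None:
            return hit
    return None


def ancestors(node: Node) -> List[Node]:
    """Walk up: list the nodes from the parent to the root."""
    path = []
    while node.parent is not None:
        node = node.parent
        path.append(node)
    return path


# Hypothetical layout: the real tree is not reproduced in the text.
root = Node("any inhibitor")
val_phe = root.add(Node("Val + Phe", ligands=["001836", "001837"]))
val_phe.add(Node("Val + Phe + further fragment (hypothetical)", ligands=["003504"]))

start = find_ligand(root, "003504")
neighbours = [lig for node in ancestors(start) for lig in node.ligands]
print(neighbours)           # ['001836', '001837']
print(ligands_below(root))  # ['001836', '001837', '003504']
```

Keeping a parent pointer on each node is what makes the backward traversal cheap; a tree that stored only children would need a full search from the root for every such query.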


·        Information in the database: The database contains chemical structural data together with IUPAC names and synonyms and the data may be searched either by text inputs or by structural components in two different ways:

o                                Searching inhibitor information with the data-tree, as described in |Annotation of inhibitors |Inhibitor search using data RDF

o                                Text search on all data by user input values as described in |Fragment Searches |(Fragment Searches)

·         Using inhibitor information stored in the database: |(Inhibitor Searches )

Data are contained in the database as several distinct classes of information (i.e. columns): the NIAID_ID, the method used in the study, citation, abstract, unit cell data, inhibitor names, quality evaluation data such as R-factor and resolution, and so on. Inhibitor data are critically important for successful use of the web resource. For this reason, considerable effort has been invested in annotating and presenting the inhibitor data both as a single molecule and as several smaller standard fragments. A user may type in a text string and select inhibitors using any of the key words that may exist in any of the columns. A user may also choose to view certain optional information such as the abstract, unit cell and refinement parameters. Once a selection is made, the user is presented with a sketch of the molecule, its fragments and certain descriptive information such as the citation. At this point the user may make a new query using new strings entered in the text box, or may decide to perform fragment searches on the inhibitor. To search for a fragment, the user selects one of the fragments displayed immediately after the sketch of the entire molecule. For instance, if the user selects valine, the next page displays all the inhibitor molecules that have valine as a fragment (a total of ~10 pages). The user may then select another fragment, for instance benzyl formate, and the tool presents all the molecules that have both valine and benzyl formate (~2 pages). At this stage the user may select yet another fragment, for instance phenylethanamine, resulting in one page of results. Using this tool, a user may rapidly search for multiple fragments and thus perform homology searches.
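The successive narrowing described above behaves like a set intersection: each newly selected fragment keeps only those inhibitors that also contain it. The sketch below shows that idea under stated assumptions. The `FRAGMENT_INDEX` mapping, the `narrow` function and the inhibitor IDs are all hypothetical, and the real tool may well be implemented differently.

```python
from typing import Dict, Iterable, Set

# Hypothetical fragment -> inhibitor-ID index; the IDs are placeholders, not database entries.
FRAGMENT_INDEX: Dict[str, Set[str]] = {
    "valine": {"A", "B", "C", "D"},
    "benzyl formate": {"B", "C", "E"},
    "phenylethanamine": {"C", "F"},
}


def narrow(selected: Iterable[str], index: Dict[str, Set[str]] = FRAGMENT_INDEX) -> Set[str]:
    """Return the inhibitors that contain every selected fragment."""
    result = None
    for fragment in selected:
        hits = index.get(fragment, set())
        # The first fragment seeds the result; each later one intersects with it.
        result = set(hits) if result is None else result & hits
    return result if result is not None else set()


step1 = narrow(["valine"])
step2 = narrow(["valine", "benzyl formate"])
step3 = narrow(["valine", "benzyl formate", "phenylethanamine"])
print(sorted(step1))  # ['A', 'B', 'C', 'D']
print(sorted(step2))  # ['B', 'C']
print(sorted(step3))  # ['C']
```

Each added fragment can only shrink the result, which mirrors the drop from about ten pages to two pages to one page in the walkthrough.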


·         Text search on all data by user input values| (Fragment Searches)

Values for data are contained in the database as several distinct classes of information (i.e. columns): the enzyme name, the method used in the study, citation, abstract, unit cell data, inhibitor names, quality evaluation data such as R-factor and resolution, and so on. One may query the database with user-entered input values for a given column by using | user input values. Input text strings may be chosen from one or more key-words that may be found in abstracts, author names, journal information, crystallization data, space group, inhibitor names, mutation information and so on. In this query option, one may use AND or OR to specify multiple key-words. For instance, one may use Bhat AND Erickson, phe OR val, SAIC OR NCI, or Roche OR Dupont. While several ANDs or several ORs may be used in a given selection box, one may not use both AND and OR simultaneously.
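The query rule above (several ANDs or several ORs, but never both in one box) can be sketched as a small parser. This is only an illustration: `parse_query` and `matches` are hypothetical names, and case-insensitive substring matching is an assumption, since the text does not say how the site compares key-words.

```python
import re
from typing import List, Tuple


def parse_query(query: str) -> Tuple[str, List[str]]:
    """Split a query into its operator and key-words, rejecting mixed AND/OR."""
    has_and = re.search(r"\bAND\b", query) is not None
    has_or = re.search(r"\bOR\b", query) is not None
    if has_and and has_or:
        raise ValueError("AND and OR may not be mixed in one selection box")
    operator = "OR" if has_or else "AND"
    terms = [t.strip() for t in re.split(r"\b(?:AND|OR)\b", query) if t.strip()]
    return operator, terms


def matches(text: str, query: str) -> bool:
    """Assumed semantics: case-insensitive substring match of each key-word."""
    operator, terms = parse_query(query)
    haystack = text.lower()
    hits = [term.lower() in haystack for term in terms]
    return all(hits) if operator == "AND" else any(hits)


print(parse_query("Bhat AND  Erickson"))                      # ('AND', ['Bhat', 'Erickson'])
print(matches("Authors: Bhat, Erickson", "Bhat AND  Erickson"))  # True
print(matches("inhibitor with a val side chain", "phe OR val"))  # True
```

A query such as "Roche AND Dupont OR NCI" raises `ValueError` in this sketch, which corresponds to the restriction stated in the text.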

·       Display/download structural data values| Display/download

A user may display and/or download structural data in two formats. A user may choose to download only the inhibitor data or the entire data set. The data download page allows download of already selected data or of additional selections based on NIAID_ID or text searches.

   

·        Credits:

 Primary Correspondence: T. N. Bhat bhat@nist.gov

Contributors:  Dr. Mohamed Nasr mn12p@nih.gov ,  Anh Dao Nguyen

Prior Publications: None.

In citing this work please use the following publications:

1. Prasanna, M.D., Vondrasek, J., Wlodawer, A., Bhat, T. N. Application of InChI to curate, index and query 3-D structures. PROTEINS: Structure, Function, and Bioinformatics 60, 1-4 (2005).

2. Prasanna, M.D., Vondrasek, J., Wlodawer, A., Rodriguez, H., Bhat, T. N. Chemical compound navigator: a web-based Chem-BLAST, chemical taxonomy-based search engine for browsing compounds. Proteins 63(4), 907-917 (2006).

 

    ·        Accessibility Requirements:

A black-on-white title is provided for each web page.

All ‘gif’ representations have been provided with text explanations using ‘ALT=explanation’ in the web.

Descriptive structural or chemical descriptions, usually given as lengthy text in most other related web resources, have been replaced or augmented in this web resource by 2-D drawings. Standard conventions for bonds and molecules have been used to draw such 2-D sketches. Links for molecular fragments are usually provided using 2-D sketches that denote the fragment selected by the link. Whenever a 2-D drawing is not hyperlinked, an explanation is provided. All atoms are colored according to IUPAC conventions when colors are used. All molecular names are provided using IUPAC conventions.

Descriptions are provided using ‘ALT =description’ in web links that use ‘gif’ files to clarify the result of the action. All text based links are either numbered or preceded by ‘|’ to accommodate customers with difficulties in recognizing links that are shown in color by default.

   ·        HIV 2-D Structural database to 3-D Structural database:

Structure-based drug design is the process of developing new drugs using the information contained in three-dimensional structures of enzyme-inhibitor complexes. A key step of this design approach is to postulate and model biologically active lead compounds in the active site of the enzyme and to analyze the drug-enzyme interactions. These interactions are then used to design and develop new compounds with improved properties, such as biological potency and tolerance to both wild-type and drug-resistant mutant enzymes. Such homology modeling of drugs into the active site of an enzyme relies on the hypothesis that similar fragments of different drugs bind in a predictable and similar fashion to the active site of an enzyme. Thus, the first step in homology modeling of a drug involves gathering all the 3-D structural data on enzyme-inhibitor complexes of similar drugs. This page primarily focuses on this aspect of HIV protease-drug interactions.

This page shows the structural neighbors of a 2-D structure in the 3-D database. The listing of fragments that are common to 2-D and 3-D structures may not be complete. The purpose of this page is to facilitate a tentative mapping between 2-D and 3-D structures inside the active site of HIV-1 protease, to better understand drug-protein interactions and drug resistance when a 3-D structure determined by X-ray or NMR is not available. Result pages are presented in descending order of the number of fragments common to the 2-D and 3-D data.
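The ordering of result pages described above amounts to counting shared fragments and sorting by that count. The following is a minimal sketch, assuming each compound is represented as a set of fragment names. The `rank_neighbours` function, the entry IDs and the fragment sets are hypothetical and are not taken from either database.

```python
from typing import Dict, List, Set, Tuple


def rank_neighbours(
    query_fragments: Set[str], entries_3d: Dict[str, Set[str]]
) -> List[Tuple[str, int]]:
    """Rank 3-D entries by the number of fragments shared with a 2-D compound."""
    scored = [
        (entry_id, len(query_fragments & fragments))
        for entry_id, fragments in entries_3d.items()
    ]
    # Drop entries with nothing in common, then sort by descending count (ties by ID).
    shared = [(entry_id, count) for entry_id, count in scored if count > 0]
    return sorted(shared, key=lambda item: (-item[1], item[0]))


# Hypothetical 2-D query compound and 3-D entries.
query = {"valine", "benzyl formate", "phenylethanamine"}
entries = {
    "3D-X": {"valine", "benzyl formate"},
    "3D-Y": {"valine"},
    "3D-Z": {"quinoline"},
}

ranking = rank_neighbours(query, entries)
print(ranking)  # [('3D-X', 2), ('3D-Y', 1)]
```

Because the fragment listing may be incomplete, as noted above, a ranking of this kind is best read as a tentative guide rather than a definitive similarity measure.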

The data may be queried using a text string or using the IDs of the 2-D compounds. In the text box, multiple words may be combined using 'AND' or 'OR'.

 
