How genomics and bioinformatics help in drug discovery for COVID-19
Author: Minnie Hung
In the past few months, the world has witnessed the emergence of the COVID-19 pandemic. The coronavirus outbreak originated in Wuhan, China, in December 2019 and has now infected more than 350,000 people and caused over 15,000 deaths worldwide. One important step towards developing therapies against COVID-19 is to understand the genomic makeup of the virus that causes it, SARS-CoV-2. In January, the genome sequence of SARS-CoV-2 was released to the scientific community and posted on NCBI GenBank. Since then, comparative genomics analyses have shown that this novel virus belongs to the beta-coronavirus genus and shares greater sequence homology with SARS-CoV than with MERS-CoV. Unfortunately, there is still no specific antiviral treatment for either SARS-CoV or MERS-CoV. In light of this, how can scientists use the available genomic information on SARS-CoV-2 to better understand this novel virus and find a cure for it?
Electron microscope image of SARS-CoV-2, the virus that causes COVID-19. (Source: U.S. National Institutes of Health)
A study recently published in Nature Medicine examined the molecular structure of SARS-CoV-2 and tried to trace the possible origin of the virus. Because SARS-CoV-2 is similar to bat SARS-CoV-like coronaviruses, the researchers suspected that bats serve as the reservoir hosts of the SARS-CoV-2 progenitor. Interestingly, the greatest genetic difference between SARS-CoV-2 and RaTG13, a beta-coronavirus sampled from a Rhinolophus affinis bat, was found in the spike protein. The spike protein, a crown-like protein that covers the surface of the coronavirus and binds to the mammalian cell membrane during infection, may provide clues to how SARS-CoV-2 became able to infect human cells.
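To make the idea of locating the most divergent region concrete, here is a minimal sketch that computes percent identity over named regions of two pre-aligned sequences and reports the least similar one. The function names, the region names and coordinates, and the two short toy strings are all assumptions made for illustration; they are not real viral sequences and this is not the method used in the study.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same symbol.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def least_similar_region(a: str, b: str, regions: dict) -> tuple:
    """Return (region_name, identity) for the region with the lowest identity.

    `regions` maps a name to a (start, end) slice of the alignment.
    """
    scores = {
        name: percent_identity(a[start:end], b[start:end])
        for name, (start, end) in regions.items()
    }
    name = min(scores, key=scores.get)
    return name, scores[name]


# Toy aligned sequences and hypothetical region boundaries (illustration only).
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "ACGTACGTTCGAACGTACGT"
toy_regions = {"region_1": (0, 8), "region_2": (8, 12), "region_3": (12, 20)}

print(least_similar_region(toy_a, toy_b, toy_regions))
```

In a real analysis the inputs would be full genome or protein alignments produced by an alignment tool, with region boundaries taken from the genome annotation.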
A comparative analysis of genomic data from multiple coronaviruses found that the receptor binding domain (RBD) of the spike protein is the most variable region across all hosts. Six RBD amino acids have been shown to be important for binding to the human angiotensin-converting enzyme 2 (ACE2) receptor in SARS-CoV-2: L455, F486, Q493, S494, N501 and Y505. Although five of these six residues differ from those of SARS-CoV, a computational model showed that the RBDs of the SARS-CoV-2 and SARS-CoV spike proteins share an almost identical 3D structure and maintain similar binding mechanisms at the spike protein-ACE2 interface. Both SARS-CoV-2 and SARS-CoV therefore appear to use human ACE2 as their entry point for infection, and the spike protein of SARS-CoV-2 might have evolved through natural selection to optimize its binding to the human ACE2 receptor, thereby gaining the ability to infect humans.
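The residue comparison above can be sketched in a few lines. The SARS-CoV-2 residues are the six named in the text. The SARS-CoV counterparts (Y442, L472, N479, D480, T487, Y491) are not given in this article; they are taken from Andersen et al. and should be checked against that paper before reuse. The function name is hypothetical.

```python
# Key RBD residues in SARS-CoV-2, as listed in the text (amino acid + position).
SARS_COV_2_RBD = ["L455", "F486", "Q493", "S494", "N501", "Y505"]

# Corresponding SARS-CoV residues. Assumption: taken from Andersen et al. (2020),
# not from this article; listed in the same order as the residues above.
SARS_COV_RBD = ["Y442", "L472", "N479", "D480", "T487", "Y491"]


def compare_residues(new_virus, old_virus):
    """Pair up corresponding residues and flag whether the amino acid is conserved."""
    rows = []
    for new, old in zip(new_virus, old_virus):
        conserved = new[0] == old[0]  # first character is the one-letter amino acid code
        rows.append((new, old, conserved))
    return rows


rows = compare_residues(SARS_COV_2_RBD, SARS_COV_RBD)
n_changed = sum(1 for _, _, conserved in rows if not conserved)

for new, old, conserved in rows:
    print(f"{new} vs {old}: {'conserved' if conserved else 'changed'}")
print(f"{n_changed} of {len(rows)} key residues differ")
```

Under these inputs only the last pair (Y505 and Y491) is conserved, which matches the "five of six" figure in the text. A comparison like this says nothing about 3D structure, which is why the structural model was needed.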
Apart from using comparative genomics to analyze spike protein sequences in different coronavirus strains, bioinformatics and cheminformatics can also be applied to model how chemical compounds interact with the spike protein or the ACE2 receptor, which helps in drug discovery. Biophysicists at the University of Tennessee used the IBM-built supercomputer Summit to sift through 9,000 compounds from the SWEETLEAD library, which consists of drugs, chemicals, herbal medicines, and natural products, and found at least 77 compounds that could potentially prevent SARS-CoV-2 infection in human cells. The computational screening combined structural modeling of the complex formed by the SARS-CoV-2 spike protein and the ACE2 receptor with small-molecule docking to the protein interaction interface. In other words, compounds that can bind to the SARS-CoV-2 spike protein could theoretically block its binding to the ACE2 receptor, thereby stopping the virus from invading human cells. Together, these results provide a framework for researchers to study how those compounds behave in cells.
The compound, shown in gray, was calculated to bind to the SARS-CoV-2 spike protein, shown in cyan, to prevent it from docking to the human cell receptor, shown in purple. (Source: Micholas Smith/Oak Ridge National Laboratory, U.S. Dept. of Energy)
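The last step of a virtual screen like the one described above, turning docking scores into a shortlist, can be sketched as follows. This is an illustration only: the compound names, scores, and cutoff are placeholders, not results from the study, and the sketch assumes the common docking convention that a more negative score means stronger predicted binding. The class and function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class DockingResult(NamedTuple):
    compound: str
    score: float  # predicted binding score; assumed: more negative = stronger binding


def shortlist(results: List[DockingResult], cutoff: float,
              top_n: Optional[int] = None) -> List[DockingResult]:
    """Keep compounds scoring at or below `cutoff`, best (most negative) first."""
    hits = sorted((r for r in results if r.score <= cutoff), key=lambda r: r.score)
    return hits if top_n is None else hits[:top_n]


# Placeholder values for illustration; not data from the screening study.
placeholder_results = [
    DockingResult("compound_A", -7.2),
    DockingResult("compound_B", -4.1),
    DockingResult("compound_C", -6.5),
    DockingResult("compound_D", -5.0),
]

hits = shortlist(placeholder_results, cutoff=-6.0)
for hit in hits:
    print(f"{hit.compound}: {hit.score}")
```

A shortlist produced this way is only a set of candidates. As the article notes, the compounds still need to be studied in cells.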
Andersen, K.G., Rambaut, A., Lipkin, W.I. et al. The proximal origin of SARS-CoV-2. Nat Med (2020). https://doi.org/10.1038/s41591-020-0820-9
Xu, X., Chen, P., Wang, J., Feng, J., Zhou, H., Li, X., Zhong, W., Hao, P. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Science China Life Sciences 63, 457-460 (2020). https://doi.org/10.1007/s11427-020-1637-5
Smith, M., Smith, J.C. Repurposing Therapeutics for COVID-19: Supercomputer-Based Docking to the SARS-CoV-2 Viral Spike Protein and Viral Spike Protein-Human ACE2 Interface. ChemRxiv, preprint (2020). https://doi.org/10.26434/chemrxiv.11871402.v3