CoVDB Coronavirus Database

Overview

CoVdb extensively collects published Coronavirus genome data and identifies 322 Coronavirus gene clusters in the genomes of 104 strains. These genomes contain 780 possible ORFs, and there are on average 5-14 genes in each strain. We performed a subcellular localization analysis of the Coronavirus genes to predict their roles in the infection process. We found that 27% are host nucleus or host cytoplasm proteins and 32% are membrane proteins. In the gene ontology analysis, Coronavirus genes are enriched in membrane activity. Moreover, based on population genetics tests, we found that 11 genes have a significantly low Tajima’s D and a significantly high composite likelihood ratio (CLR) (Rank Test, P-value < 0.05), indicating that these genes may recently have been under positive selection. A further 159 genes have a significantly high Tajima’s D value (Table S4, Rank Test, P-value < 0.05) and may be under balancing selection. These results are informative for future Coronavirus research.

Data in CoVdb. (A) The structure of Coronavirus. (B) The percentages of Coronavirus gene clusters in each subcellular location group. (C) The gene ontology (GO) enrichment of Coronavirus gene clusters; this panel was drawn with WEGO. (D) The distribution of strains in CoVdb by country. (E) The phylogenetic tree of Coronavirus strains in CoVdb.
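For readers unfamiliar with the population genetics statistics named in the overview, here is a minimal Python sketch of how Pi (mean pairwise differences), Watterson's Theta and Tajima's D can be computed from a small alignment using the standard formulas. It is not the CoVdb pipeline: the function name `tajimas_d` and the toy sequences are assumptions made for illustration, and the sketch assumes equal-length, gap-free sequences.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (pi, theta_w, D) for equal-length, gap-free aligned sequences.

    pi      : mean number of pairwise differences
    theta_w : Watterson's estimator, S / a1
    D       : Tajima's D, or None when there are no segregating sites
    """
    n = len(seqs)
    if n < 4:
        # With fewer than 4 sequences the variance term degenerates to zero.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns that carry more than one allele
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # pi: average Hamming distance over all pairs of sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = s_sites / a1
    if s_sites == 0:
        return pi, theta_w, None

    # Constants of the variance of (pi - theta_w)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (pi - theta_w) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))
    return pi, theta_w, d


# Toy alignment (made up): variants at intermediate frequency give D > 0.
pi, theta_w, d = tajimas_d(["AAAA", "AAAT", "AATT", "ATTT"])
```

In this sketch, an excess of rare variants pushes D below zero and an excess of intermediate-frequency variants pushes it above zero, which matches how low and high Tajima’s D values are interpreted in the overview.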

Main Functions

The main search box of CoVdb, located at the center of the home page, allows users to search for genes by general gene name, gene accession or function description. We created a gene alias list to make the search results as complete as possible. The right column of the home page lists the other gene search methods in CoVdb. First, users can run BLAST against a single Coronavirus genome, following the workflow of the UCSC Genome Browser [19], or alternatively against coding sequences (CDS) or protein sequences. Second, users can search for genes at specific subcellular locations. In the results window, we list the topological information and functional annotation of each gene so that users can easily place the gene in the host-virus interaction. Third, users can find genes with a specific function according to their GO annotations. “Gene Clusters” lists the persistence of the Coronavirus genes across the 104 strains. Users can view specific genes by clicking items in the gene list table. In addition, to make it easier to track a set of genes, CoVdb returns gene links when users enter a list of genomic locations or a list of gene accession numbers. These operations facilitate personalized gene list analysis with CoVdb.
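The idea of using an alias list to make name searches more complete can be sketched as follows. This is only an illustration, not the CoVdb implementation: the record layout, the helper names `build_alias_index` and `search_genes`, and the identifiers and accessions in `GENES` are all hypothetical placeholders. The sketch handles exact, case-insensitive matches on name, accession or alias only; searching function descriptions would need a text search and is left out.

```python
def build_alias_index(genes):
    """Map every lower-cased name, accession and alias to a set of gene ids."""
    index = {}
    for gene in genes:
        keys = [gene["name"], gene["accession"], *gene.get("aliases", [])]
        for key in keys:
            index.setdefault(key.strip().lower(), set()).add(gene["id"])
    return index


def search_genes(index, query):
    """Return sorted gene ids whose name, accession or alias equals the query."""
    return sorted(index.get(query.strip().lower(), set()))


# Hypothetical records: ids and accessions are placeholders, not CoVdb data.
GENES = [
    {"id": "g1", "name": "S", "accession": "ACC_0001",
     "aliases": ["spike", "surface glycoprotein"]},
    {"id": "g2", "name": "N", "accession": "ACC_0002",
     "aliases": ["nucleocapsid"]},
    {"id": "g3", "name": "spike", "accession": "ACC_0003",
     "aliases": ["S"]},
]
INDEX = build_alias_index(GENES)

# Both records are found whether the user types the short name or the alias.
hits = search_genes(INDEX, "Spike")
```

Normalizing every key to lower case at index time keeps the lookup a single dictionary access, and storing sets of ids lets several strains' genes share one alias.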

Users can go to the genome browser page by clicking the taxon name in the tree or selecting an item in the “Gbrowser” list. On the genome browser page, gene segments are arranged sequentially along the genome according to gene location. Genes annotated from the NCBI GFF files are colored in scarlet as “Annotation from NCBI”. Genes annotated by mapping are colored in mauve as “Newly annotated”. “Genetic Remains” are colored in gray. The gene name is shown above each gene segment, and “>” or “<” indicates the direction of gene transcription. Four tracks of population genetics statistics, Pi, Theta [21], Tajima’s D [22] and CLR [23, 24], are listed below the gene track. To help trace selective signatures, we drew a top 5% line for CLR and two 5% lines (one for the highest and one for the lowest values) for Pi, Theta and Tajima’s D. Using the tool bar at the top, users can move or zoom the genome in the browser, set the focus bar to a selected area, export the graph of one track with gene segments, and export the data of one track.
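The 5% guide lines on the population genetics tracks can be approximated with a simple rank-based cutoff. The sketch below is an assumption about how such lines could be derived, not a description of the CoVdb code: `tail_cutoffs` and `flag_outliers` are hypothetical helper names, and the nearest-rank rule is one of several reasonable ways to define a percentile.

```python
def tail_cutoffs(values, percent=5):
    """Return (low_cutoff, high_cutoff) for the bottom and top `percent` of values.

    Uses the nearest-rank rule on the sorted values; at least one value
    always falls in each tail.
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, n * percent // 100)  # number of values in each tail
    return ordered[k - 1], ordered[n - k]


def flag_outliers(named_values, percent=5):
    """Return (names in the bottom tail, names in the top tail) of a {name: value} dict."""
    low, high = tail_cutoffs(list(named_values.values()), percent)
    lows = sorted(name for name, v in named_values.items() if v <= low)
    highs = sorted(name for name, v in named_values.items() if v >= high)
    return lows, highs


# Made-up track values: 100 windows scored 1..100.
low_line, high_line = tail_cutoffs(list(range(1, 101)))
```

For a statistic such as CLR only the upper cutoff would be drawn, while for Pi, Theta and Tajima’s D both cutoffs are of interest.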

Clicking on a gene segment leads to the gene’s information page. The complete list of annotations includes basic information (strain, gene name, description, location in the genome, GenBank accession and full name), sequence (CDS and protein), summary (function, UniProt accession, related PubMed ID, related EMBL ID, corresponding Proteomes ID, related Pfam ID and related InterPro ID), ontologies (GO and KEGG), subcellular location, topology (transmembrane region prediction), genomic alignment in the CDS region, multiple alignment of orthologues, gene tree of the corresponding NCBI-annotated or newly annotated proteins, and orthologous genes across strains. Text that points to internal or external sites is hyperlinked to facilitate viewing and analysis.

Copyright © 2018-2023. Comments and suggestions can be sent to: zhuzl@cqu.edu.cn, mg@cau.edu.cn. 渝ICP备19006517号

渝公网安备 50010602502065号
