Four structural proteins are essential for virion assembly and infection of CoVs. Homotrimers of S proteins make up the spikes on the viral surface and are responsible for attachment to host receptors. The M protein has three transmembrane domains; it shapes the virions, promotes membrane curvature, and binds to the nucleocapsid. The E protein plays a role in virus assembly and release, and it is involved in viral pathogenesis. The N protein contains two domains, both of which can bind the viral RNA genome via different mechanisms. It is reported that the N protein can bind to the nsp3 protein to help tether the genome to the RTC and package the encapsidated genome into virions. N is also an antagonist of interferon (IFN) and a virally encoded repressor of RNA interference, which appears to be beneficial for viral replication.
FUNCTIONS OF NONSTRUCTURAL AND STRUCTURAL PROTEINS IN CORONAVIRAL REPLICATION
Most of the nsps among nsp1-16 have been reported to play specific roles in the replication of CoVs. However, the functions of some of the nsps are unknown or not well understood.
The maintenance of such a large genome may be related to the special features of the CoV RTC, which contains several RNA processing enzymes such as the 3′-5′ exoribonuclease of nsp14. The 3′-5′ exoribonuclease is unique to CoVs among all RNA viruses and probably provides a proofreading function for the RTC. Sequence analysis shows that 2019-nCoV possesses a typical CoV genome structure and belongs to the cluster of betacoronaviruses that includes Bat-SARS-like (SL)-ZC45, Bat-SL-ZXC21, SARS-CoV and MERS-CoV. Based on the phylogenetic tree of CoVs, 2019-nCoV is more closely related to Bat-SL-CoV ZC45 and Bat-SL-CoV ZXC21 and more distantly related to SARS-CoV.
The genome sequence alignment of CoVs shows 58% identity in the nsp-coding region and 43% identity in the structural protein-coding region among different CoVs, with 54% at the whole-genome level, suggesting that the nsps are more conserved and the structural proteins are more diverse, in keeping with the need to adapt to new hosts. Since the mutation rates in the replication of RNA viruses are much higher than those of DNA viruses, the genomes of RNA viruses are usually less than 10 kb in length. However, the CoV genome is much larger, at roughly 30 kb in length, the largest known among RNA viruses.
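The identity percentages quoted above come from pairwise comparison of aligned sequences. As a minimal sketch of how such a figure can be computed, the function below counts matching positions between two pre-aligned sequences. The function name `percent_identity` and the toy sequences are illustrative assumptions, and the choice to ignore gapped columns is only one of several conventions in use; the source does not state which definition or tool produced its numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns with identical residues
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example (hypothetical sequences, not real viral data):
# 9 gap-free columns, 8 of them identical.
identity = percent_identity("AUGGCU-ACG", "AUGACUUACG")
print(f"{identity:.1f}% identity")
```

Applying the same calculation separately to the nsp-coding and structural protein-coding regions of an alignment would give region-level figures comparable in kind to those reported above.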
Other ORFs on the one-third of the genome near the 3′-terminus encode at least four main structural proteins: spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins. Besides these four main structural proteins, different CoVs encode special structural and accessory proteins, such as HE protein, 3a/b protein, and 4a/b protein. All the structural and accessory proteins are translated from the sgRNAs of CoVs.
The genome and subgenomes of a typical CoV contain at least six ORFs. The first ORFs (ORF1a/b), about two-thirds of the whole genome length, encode 16 nsps (nsp1-16), except in Gammacoronavirus, which lacks nsp1. There is a -1 frameshift between ORF1a and ORF1b, leading to the production of two polypeptides: pp1a and pp1ab. These polypeptides are processed into 16 nsps by the virally encoded chymotrypsin-like protease (3CLpro), also known as the main protease (Mpro), and one or two papain-like proteases.
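The effect of the -1 frameshift can be illustrated with a toy translation. The sketch below is not a model of the real viral sequence: the RNA string, the slip position, and the names `translate` and `slip_after` are hypothetical, chosen only to show how one message can yield a shorter product (read straight through to the first stop codon) and a longer one (after stepping back one nucleotide). Real frameshifting is probabilistic and depends on RNA signals that are not modeled here.

```python
from itertools import product
from typing import Optional

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(rna: str, slip_after: Optional[int] = None) -> str:
    """Translate from position 0 until the first stop codon.

    If slip_after is given, the reader steps back one nucleotide (-1 frame)
    once, immediately after decoding the codon that ends at that index.
    """
    protein = []
    pos = 0
    slipped = False
    while pos + 3 <= len(rna):
        aa = CODON_TABLE[rna[pos:pos + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
        pos += 3
        if slip_after is not None and not slipped and pos == slip_after:
            pos -= 1  # the -1 frameshift
            slipped = True
    return "".join(protein)


# Hypothetical toy message, not a real coronavirus sequence.
toy_rna = "AUGGCUUUAAACGGGUAACUGGUUAA"
short_product = translate(toy_rna)                 # stops at the in-frame UAA
long_product = translate(toy_rna, slip_after=12)   # slips, bypasses that stop
print(short_product, long_product)
```

In this toy case the two products share the same N-terminal residues and differ only downstream of the slip site, which mirrors the relationship between the shorter and the longer polypeptide described above.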
Subsequently, a nested set of subgenomic RNAs (sgRNAs) is synthesized by the RTC in a manner of discontinuous transcription. These subgenomic messenger RNAs (mRNAs) possess common 5′-leader and 3′-terminal sequences. Transcription termination and subsequent acquisition of a leader RNA occur at transcription regulatory sequences, located between open reading frames (ORFs). These minus-strand sgRNAs serve as the templates for the production of subgenomic mRNAs.
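The "nested set" with a common 5′-leader and 3′-terminus can be made concrete with a short sketch. The code below models only the sequence of the resulting plus-strand subgenomic mRNAs, not the minus-strand intermediates or the polymerase mechanism. The motif `ACGAAC` is used as an arbitrary placeholder for a transcription regulatory sequence, and the toy genome, the function name `nested_sg_mrnas`, and the junction rule (leader up to and including the first motif, joined to everything downstream of a later motif) are simplifying assumptions.

```python
from typing import List


def nested_sg_mrnas(genome: str, trs: str) -> List[str]:
    """Return leader-joined subgenomic mRNAs, longest first.

    Assumption: the first occurrence of `trs` marks the end of the leader,
    and every later occurrence is a body site where the leader is joined.
    """
    first = genome.find(trs)
    if first == -1:
        return []
    leader = genome[: first + len(trs)]  # 5' leader, including the motif

    products = []
    site = genome.find(trs, first + len(trs))
    while site != -1:
        body = genome[site + len(trs):]  # everything down to the 3' end
        products.append(leader + body)
        site = genome.find(trs, site + 1)
    return products


# Hypothetical toy genome: leader, three motif copies, shared 3' end.
TRS = "ACGAAC"
toy_genome = "GGUU" + TRS + "AAAAAAAA" + TRS + "CCCC" + TRS + "GGUUUU"

for mrna in nested_sg_mrnas(toy_genome, TRS):
    print(mrna)
```

Each product begins with the same leader and ends with the same 3′ sequence, and each shorter product is contained in the longer one downstream of the leader, which is what "nested" means here.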
CORONAVIRAL GENOME STRUCTURE AND REPLICATION
CoVs belong to the subfamily Coronavirinae in the family Coronaviridae of the order Nidovirales, and this subfamily includes four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus. The genome of CoVs is a single-stranded positive-sense RNA (+ssRNA) (∼30 kb) with a 5′-cap structure and a 3′-poly-A tail. The genomic RNA is used as a template to directly translate polyprotein 1a/1ab (pp1a/pp1ab), which contains the non-structural proteins (nsps) that form the replication-transcription complex (RTC) in double-membrane vesicles (DMVs).
The sporadic emergence and outbreaks of new types of CoVs remind us that CoVs are a severe global health threat. It is highly likely that new CoV outbreaks are unavoidable in the future due to changes in climate and ecology and the increased interactions of humans with animals. Thus, there is an urgent need to develop effective therapies and vaccines against CoVs.
The Chinese government and researchers have been taking swift measures to control the outbreak and conduct etiological studies. The causative agent of the mystery pneumonia has been identified as a novel coronavirus (nCoV) through deep sequencing and etiological investigations by at least five independent laboratories in China. On 12 January 2020, the World Health Organization temporarily named the new virus 2019 novel coronavirus (2019-nCoV).