dcyphr | Preliminary Identification of Potential Vaccine Targets for the COVID-19 Coronavirus (SARS-CoV-2) Based on SARS-CoV Immunological Studies


Researchers looked for vaccine targets against COVID-19 by investigating genetic similarities between SARS-CoV-2, the virus that causes COVID-19, and SARS-CoV, which caused a major outbreak in 2003. The researchers identified B cell and T cell epitopes in the S and N proteins that are identical between the two viruses. For T cell epitopes, the researchers also calculated the percentage of the global and Chinese population that could be covered by vaccines that target these epitopes. The researchers proposed a final set of epitopes that could guide studies about vaccine development against SARS-CoV-2.


The researchers wanted to find epitopes (the parts of an antigen that antibodies or T cell receptors recognize during an immune response) that were identical between SARS-CoV and SARS-CoV-2. Because these epitopes are associated with known immune responses against SARS-CoV, they could elicit similar responses against SARS-CoV-2 and could therefore be targeted in a vaccine against SARS-CoV-2.


There have been several recent coronavirus outbreaks, including Severe Acute Respiratory Syndrome (SARS) in 2003 and Middle East Respiratory Syndrome (MERS) in 2012. SARS-CoV-2 belongs to the same family as SARS-CoV and MERS-CoV, which cause SARS and MERS. There is a lack of understanding about immune responses against SARS-CoV-2, which makes it difficult to develop a vaccine. However, studies suggest SARS-CoV and SARS-CoV-2 are very similar, which means immune responses against these two viruses could also be similar.

Previous studies suggest that SARS-CoV is targeted by both humoral (antibody-based, B cell) and cell-mediated (T cell) immune responses. However, cell-mediated responses have provided the most long-lasting and efficient protection. In addition, T cell responses against the spike (S) and nucleocapsid (N) proteins were the most effective.

The researchers analyzed SARS-CoV B cell and T cell epitopes. They compared these epitopes with SARS-CoV-2 sequences and selected the ones that were identical. Because these epitopes are identical, immune responses against them could protect against both SARS and COVID-19. The researchers focused on the S and N proteins since they have been found to provide effective and long-term immune responses in SARS. For T cell epitopes, the researchers also looked at their associated major histocompatibility complex (MHC) alleles. The MHC is the part of the genome that codes for proteins essential to the adaptive immune system. Different MHC alleles are present in different individuals, so the researchers looked for epitopes associated with MHC alleles that are present in a greater percentage of the population. This would increase the number of people who could be protected by a potential vaccine targeting those epitopes.
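
The selection step described above can be pictured as an exact substring search: an epitope is kept only if its full amino acid sequence appears unchanged in a SARS-CoV-2 protein. The following is a minimal sketch of that idea, not the researchers' actual pipeline. The function name `identical_epitopes` and all sequences are made up for illustration and are not real viral data.

```python
def identical_epitopes(epitopes, target_proteins):
    """Return the epitopes that occur verbatim in at least one target protein.

    epitopes: iterable of peptide strings (for example, SARS-CoV-derived epitopes)
    target_proteins: dict mapping a protein name to its amino acid sequence
    """
    hits = {}
    for epitope in epitopes:
        # Record every protein in which the epitope appears unchanged.
        found_in = [name for name, seq in target_proteins.items() if epitope in seq]
        if found_in:
            hits[epitope] = found_in
    return hits


# Toy data only (hypothetical): these are not real sequences or epitopes.
toy_proteins = {"S": "MKTLLVAGCDEFGHIK", "N": "MSDNGPQRSTVWYAAK"}
toy_epitopes = ["LLVAGC", "GPQRST", "LLVAGX"]

result = identical_epitopes(toy_epitopes, toy_proteins)
print(result)
```

In this toy run the third peptide differs by one residue, so it is dropped, which mirrors why only fully identical epitopes were carried forward.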


Structural Proteins of SARS-CoV-2 are genetically similar to SARS-CoV, but not MERS-CoV

A comparison of their genomes shows that SARS-CoV-2 is genetically closer to SARS-CoV than to MERS-CoV. This is also true at the level of individual structural proteins. The membrane (M), N, and envelope (E) proteins of SARS-CoV-2 and SARS-CoV have over 90% genetic similarity, while the S protein has a lower but still high similarity. The similarity between SARS-CoV-2 and MERS-CoV was substantially lower for all proteins. The researchers decided to focus on the S and N proteins because they produce long-term, strong immune responses against SARS-CoV.
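
One common way to put a number on this kind of similarity is percent identity over an alignment: the share of aligned positions where both sequences carry the same residue. The sketch below assumes the two sequences are already aligned (equal length, with "-" for gaps). The summary does not state which exact metric the researchers used, so this is an illustration rather than their method, and the sequences are toy data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns in which both residues match.

    Columns where both sequences have a gap are skipped; a gap opposite a
    residue counts as a mismatch. This is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment (hypothetical): 4 of 6 columns match.
identity = percent_identity("MKT-LV", "MKTALI")
print(f"{identity:.1f}% identity")
```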

Mapping the SARS-CoV-Derived T Cell Epitopes That Are Identical in SARS-CoV-2, and Determining Those with Greatest Estimated Population Coverage

Using results from positive T cell assays, the researchers found 115 SARS-CoV T cell epitopes and compared them with SARS-CoV-2 protein sequences. Of these, 27 were identical between the viruses, and all 27 were present in the N or S proteins. Nineteen of these epitopes were associated with 5 MHC alleles, and the population coverage calculated from these alleles was low (59.76% for the global population, 32.36% for China). The MHC alleles for the remaining 8 epitopes were unknown, so their population coverage could not be calculated.

To identify T cell epitopes that could cover more of the population, the researchers also considered epitopes identified in positive MHC binding assays. They found 229 T cell epitopes that were identical between the viruses, with 102 found in the S and N proteins. Taken together, a set of these T cell epitopes provided an estimated population coverage of 96.2% globally and 88.11% in China.

Mapping the SARS-CoV-Derived B cell Epitopes that are Identical in SARS-CoV-2

The researchers used a similar approach to find B cell epitopes. They considered two kinds of epitopes: linear B cell epitopes and discontinuous B cell epitopes. Of 298 linear B cell epitopes, 49 were identical between the viruses, and 45 of these were found in the S or N proteins. The researchers also found 6 discontinuous B cell epitopes, all in the S protein. None were fully identical between the viruses, but three were partially identical.
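
A discontinuous epitope is a set of residues at separate positions rather than one contiguous stretch, so "partially identical" can be pictured as the fraction of those positions that carry the same residue in the other virus. The sketch below is a hypothetical illustration of that idea. It assumes the positions have already been mapped onto the target sequence's coordinates, and the function name, positions, and sequence are invented for the example.

```python
def discontinuous_match(epitope_residues, target_seq):
    """Compare a discontinuous epitope with a target protein sequence.

    epitope_residues: dict mapping a 1-based position to the expected amino acid
    target_seq: amino acid sequence of the target protein
    Returns (fraction of positions that match, sorted list of matching positions).
    """
    if not epitope_residues:
        raise ValueError("epitope must contain at least one residue")
    matched = [
        pos
        for pos, aa in epitope_residues.items()
        if 1 <= pos <= len(target_seq) and target_seq[pos - 1] == aa
    ]
    return len(matched) / len(epitope_residues), sorted(matched)


# Toy data (hypothetical): three of the four residues are conserved.
toy_target = "MKTLLVAGCD"
toy_epitope = {2: "K", 5: "L", 8: "G", 9: "A"}

fraction, positions = discontinuous_match(toy_epitope, toy_target)
print(fraction, positions)
```

A fraction of 1.0 would correspond to a fully identical epitope, while anything lower corresponds to the partially identical case described above.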

The researchers also found that many of the identical B cell epitopes were associated with the two subunits of the S protein, S1 and S2. Of the 23 linear epitopes in the S protein, 20 were found in the S2 subunit. Antibodies targeting these epitopes could protect against both SARS-CoV and SARS-CoV-2. However, the S2 subunit is less exposed and could be more difficult to target than S1. The 3 discontinuous epitopes were found in the S1 subunit, but because they are not fully identical between the viruses, vaccines targeting them might not be effective.


The T-cell epitopes found in SARS-CoV that are identical in SARS-CoV-2 are expected to cover a large population.

The B-cell epitopes found are in agreement with recent studies. The study suggests that vaccines targeting the S1 subunit in the SARS-CoV-2 S protein may not be effective. The S2 subunit may be more promising for vaccine development and should be explored further.

Overall, the proposed SARS-CoV epitopes are identical in SARS-CoV-2 and are potential candidates to guide vaccine development efforts. Further experiments must be done to confirm the potential of the proposed epitopes as vaccine targets.


Whole genome sequences of SARS-CoV-2 were downloaded from GISAID. These sequences were aligned to a GenBank reference sequence and translated into amino acids. The amino acid sequences were then aligned using MAFFT. Reference protein sequences for SARS-CoV and MERS-CoV were also obtained from GenBank.

SARS-CoV B cell and T cell epitopes were retrieved from the NIAID Virus Pathogen Database and Analysis Resource (ViPR). The researchers limited their search to epitopes that were experimentally confirmed.

Population coverage for the T cell epitopes was computed using the population coverage tool of the Immune Epitope Database (IEDB). This tool uses the distribution of MHC alleles within a population to estimate population coverage, which is the percentage of individuals expected to mount an immune response to at least one epitope in a given set.
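
To make the idea of population coverage concrete, here is a simplified sketch. It assumes each person carries two alleles per MHC locus drawn independently from the population frequencies (Hardy-Weinberg proportions) and that loci are independent. A person is "covered" if at least one of their alleles is associated with an epitope in the set. This is an illustrative approximation and not necessarily the exact algorithm of the IEDB tool. The allele labels and frequencies are invented.

```python
def population_coverage(covered_freqs_by_locus):
    """Estimate the fraction of people carrying at least one covered allele.

    covered_freqs_by_locus: dict mapping a locus name to a dict of
    {allele label: population frequency} for alleles linked to an epitope.
    """
    uncovered = 1.0
    for locus, freqs in covered_freqs_by_locus.items():
        total = sum(freqs.values())
        if not 0.0 <= total <= 1.0 + 1e-9:
            raise ValueError(f"allele frequencies at {locus} must sum to at most 1")
        total = min(total, 1.0)
        # Probability that both of a person's alleles at this locus are uncovered.
        uncovered *= (1.0 - total) ** 2
    return 1.0 - uncovered


# Made-up frequencies for illustration only.
toy_freqs = {
    "locus_A": {"allele_A1": 0.25, "allele_A2": 0.15},
    "locus_B": {"allele_B1": 0.10},
}

coverage = population_coverage(toy_freqs)
print(f"{coverage:.2%}")
```

Adding epitopes tied to more alleles raises the covered frequency at each locus, which is why the larger epitope set from MHC binding assays reached a higher estimated coverage than the smaller set tied to only 5 alleles.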

The researchers used software called PASTA to construct a phylogenetic tree for each structural protein from its sequences in SARS-CoV, SARS-CoV-2, and MERS-CoV.