This study explores the evolutionary patterns of SARS-CoV-2 variants, emphasizing key mutations and their global distribution in 2021 (16). The phylogenetic analysis reveals that SARS-CoV-2 variants from different regions cluster together based on the mutations they acquired (17). The Delta variant, in particular, branches away from the ancestral virus, hCoV-19/Wuhan/Hu-1/2019 (NC_045512.2), as well as from other VOIs and VOCs, indicating greater nucleotide divergence from them (see Figure 3) (18).
The phylogenetic analysis also revealed close genetic similarities among isolates from Europe, North America, and South America for the Delta, Alpha, Beta, and Lambda variants. The Alpha variant, which originated in the United Kingdom (Europe), was first detected in South America in São Paulo, following an unusual PCR result during routine SARS-CoV-2 diagnosis (18). The genome obtained from the patient fell within a cluster of 10 sequences with 85% bootstrap support, 60% of which originated in the United Kingdom (Europe). This aligns with the travel history of a close contact of the patient, an asymptomatic family member infected with SARS-CoV-2, who had traveled from Italy to the United Kingdom and then from London to São Paulo. It is likely that multiple introductions of the Alpha variant occurred in São Paulo, with the city's extensive economic, transportation, and communication networks contributing to its prolonged transmission (19). This supports the conclusion by (20) that travel was a key factor in importing variants and subsequently contributing to infections, illustrating the role of globalization in the spread of the SARS-CoV-2 pandemic. Hence, Alpha variant isolates from Europe, North America, and South America showed close genetic similarities.
Similarly, the South American Lambda variant (sublineage C.37.1) displayed close relationships with samples from North America and Europe, characterized by the presence of spike mutations such as Q675H in North American samples and R21I and T572I in European samples. The Q675H mutation was also found in Lambda variants from South America, while the V826L mutation present in both European and North American Lambda variants further highlighted their close evolutionary relationships (21). Additionally, (22) described genomic analyses during a Beta variant outbreak in Canada (North America), confirming that all B.1.351 genomes were closely related, with two or fewer single nucleotide polymorphisms between the sequences.
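As an illustration of the relatedness criterion described above (very few single nucleotide polymorphisms between genomes), the following is a minimal sketch of a pairwise SNP count on aligned sequences. The sequences are short hypothetical fragments, not real B.1.351 genomes, and the choice to skip gaps and ambiguous bases is an assumption rather than the method used in the cited study.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count single-nucleotide differences between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (an assumed convention for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


# Hypothetical aligned fragments used only to demonstrate the calculation.
genomes = {
    "isolate_1": "ATGGCTAGCTAGGCTA",
    "isolate_2": "ATGGCTAGTTAGGCTA",
    "isolate_3": "ATGACTAGTTAGGCNA",
}

# SNP distance for every pair of isolates.
pairwise = {
    (x, y): snp_distance(genomes[x], genomes[y])
    for x, y in combinations(genomes, 2)
}

# Threshold of two SNPs, matching the criterion quoted in the text.
closely_related = all(d <= 2 for d in pairwise.values())

for pair, dist in pairwise.items():
    print(pair, dist)
print("all pairs within 2 SNPs:", closely_related)
```

In practice the input would be a multiple sequence alignment of whole genomes, and the threshold would depend on the outbreak time frame.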
Following its discovery in India, the Delta variant spread rapidly worldwide, with the B.1.617.2 lineage predominating in the United Kingdom. Common mutations found in most B.1.617.2 sequences include T19R, G142D, R158G, L452R, T478K, D614G, P681R, D950N, and deletions at positions 156-157. The B.1.617.2 lineage continues to evolve, with concerning mutations such as K417N emerging in the AY.1/B.1.617.2.1 sub-lineage (23). After the integration of the parent B.1.617.2 lineage, 245 lineages were categorized under AY in the Pango nomenclature system. The AY.44 and AY.103 lineages dominated in California, while AY.20 and AY.26 were prevalent in Mexico. According to (24), the most common Delta-related variants detected in Brazil were AY.99.2, AY.43, AY.101, AY.34.1, AY.43.1, AY.43.2, AY.46.3, AY.100, AY.99.1, and AY.36. Additionally, T19R, T95I, E156G, DEL157/158, L452R, T478K, D614G, Q677H, P681R, D950N, V1104L, and L1265F were each reported in more than half of the sequences from these lineages. Thus, the common mutations found in the Delta variant across Europe, North America, and South America suggest close genetic relationships among these genomes.
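The overlap between the two mutation lists quoted above can be made explicit with a small set comparison. This is only an illustrative sketch: the variable names are hypothetical, the deletions are left out because the two lists annotate them differently (156-157 versus 157/158), and the Jaccard index is used here as one simple, assumed measure of overlap, not a statistic reported by the study.

```python
import re

# Substitutions listed in the text for most B.1.617.2 sequences.
b16172_common = {
    "T19R", "G142D", "R158G", "L452R", "T478K", "D614G", "P681R", "D950N",
}

# Substitutions listed in the text for the Brazilian AY lineages.
brazil_ay_common = {
    "T19R", "T95I", "E156G", "L452R", "T478K", "D614G",
    "Q677H", "P681R", "D950N", "V1104L", "L1265F",
}


def position(mutation: str) -> int:
    """Extract the residue position from a label such as 'L452R'."""
    return int(re.search(r"\d+", mutation).group())


# Mutations present in both lists, ordered by residue position.
shared = sorted(b16172_common & brazil_ay_common, key=position)

# Jaccard index: size of the intersection divided by size of the union.
jaccard = len(b16172_common & brazil_ay_common) / len(b16172_common | brazil_ay_common)

print("shared:", shared)
print(f"Jaccard index: {jaccard:.2f}")
```

A shared core of substitutions of this kind is what the paragraph above refers to when it describes the Delta genomes from different continents as closely related.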
As mentioned in the results, mutations were observed in nsp, S, E, M, N, and accessory proteins. A study by (25) identified a P314L amino acid mutation caused by the nucleotide change at position 14408 (C14408T), affecting nsp12, the viral RNA-dependent RNA polymerase. The P314L mutation in nsp12 may enhance viral replication, increasing the virus's transmissibility and infectiousness. Another notable mutation was C3037T, which falls in nsp3 (a viral predicted phosphoesterase) at position 3037. According to (26), the most prevalent mutations in nsp3 were P1228L, P1469S, and A488S. In the P1228L mutation, proline (HΦ∼13.5) was replaced by leucine (HΦ∼16.0); this increase in hydrophobicity, located in the α-helical region of nsp3 within the cytoplasm, may reduce the protein's stability at that site. In the P1469S mutation, proline (HΦ∼13.5) was replaced by serine (HΦ∼3.0), a hydrophilic amino acid, potentially affecting the protein's stability. A study by (25) suggested that both the P1228L and P1469S mutations in nsp3 may negatively impact protein binding, as these mutations occur near the terminus of the protein sequence, which is essential for linking nsp3 to nsp4.
Additionally, the A488S mutation replaced alanine (HΦ∼11.0) with serine (HΦ∼3.0), likely increasing the stability of nsp3 in this region. However, this mutation also affected the SUD domain of nsp3, which may influence binding selectivity for G-quadruplexes (G4). This characteristic could play a role in the formation of the viral replication/transcription complex (RTC). Therefore, it is speculated that the A488S mutation may enhance nsp3 stability and strengthen the interaction between the SUD domain and G4s, thereby improving RTC functionality.
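The hydrophobicity shifts discussed for the three nsp3 mutations can be tabulated with a short calculation. The sketch below uses only the HΦ values quoted in the text; the dictionary and function names are hypothetical, and the sign of the shift by itself is not claimed to predict stability.

```python
# Hydrophobicity (HΦ) values as quoted in the text for the residues involved.
HYDROPHOBICITY = {"P": 13.5, "L": 16.0, "S": 3.0, "A": 11.0}


def hydrophobicity_shift(mutation: str) -> float:
    """Return HΦ(new residue) - HΦ(original residue) for a label like 'P1228L'."""
    original, new = mutation[0], mutation[-1]
    return HYDROPHOBICITY[new] - HYDROPHOBICITY[original]


# The three most prevalent nsp3 mutations named in the text.
nsp3_mutations = ["P1228L", "P1469S", "A488S"]
shifts = {m: hydrophobicity_shift(m) for m in nsp3_mutations}

for mutation, shift in shifts.items():
    direction = "more hydrophobic" if shift > 0 else "more hydrophilic"
    print(f"{mutation}: {shift:+.1f} ({direction})")
```

P1228L is the only one of the three that increases hydrophobicity; P1469S and A488S both move toward the hydrophilic serine value.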
The SARS-CoV-2 S proteins have undergone mutations, including changes at glycosylation sites. One of the most prevalent mutations is D614G, which has been shown to significantly enhance viral infectivity (27). The highest density of mutations is located at the S protein's protease cleavage site. These alterations may benefit the virus by allowing it to undergo proteolytic cleavage by various host enzymes, aiding its survival during evolution. Additionally, the spike protein can undergo mutations at multiple sites, with a single site sometimes associated with more than one mutation. To date, the D614G mutation is the only one consistently observed in S proteins across all continents. Mutations near the receptor-binding domain (RBD) that are close to the ACE2 receptor may impact the shape and charge of the protein near the interaction interface. However, despite these alterations in the S protein's RBD, the virus remains capable of inducing infection (28).
Furthermore, the N501Y mutation, which has been identified in the Alpha, Beta, Gamma, and Omicron VOCs but not in Delta, is an S glycoprotein mutation associated with increased viral transmission. This is due to enhanced binding affinity for the host receptor, ACE2, resulting from a slower rate of dissociation from the receptor (29). Additionally, the K417T mutation in the Gamma variant and the K417N mutation in the Beta and Omicron variants are notable for causing conformational changes in the S protein, contributing to antibody escape (30). Moreover, analysis by (31) indicated that the L452R mutation in the Delta variant's S protein is associated with increased infectivity and transmission, as it lies in the RBD and enhances ACE2 binding.
The largest accessory protein in SARS-CoV-2, ORF3a, activates the innate immune receptor NLRP3 inflammasome, triggering host inflammatory responses. This leads to an uncontrolled release of pro-inflammatory cytokines and other mediators, contributing to a cytokine storm, a clinical hallmark of SARS-CoV-2 pathogenesis (32). As predicted by (33), mutations in ORF3a may result in the loss of B cell epitopes, thereby affecting ORF3a's antigenicity. Variations in ORF3a could potentially intensify the host immune response, leading to different severities of COVID-19 among individuals, as ORF3a is predicted to interact with host signaling pathways (34). The ORF8 locus in SARS-CoV-2 is highly prone to mutations, including deletions, stop codon changes, and point mutations, with L84S being one of the most frequent. These mutations, particularly deletions, have been associated with milder symptoms and reduced infection severity due to a more effective immune response (35).
Mutations play a crucial role in shaping clinical outcomes and the healthcare response by influencing viral infectivity and transmissibility, as observed with SARS-CoV-2 (36). Key mutations, such as E484Q in the B.1.617 lineage and D614G, have been linked to increased infection rates, presenting challenges for public health efforts. These mutations also impact the virus's ability to evade immune responses, potentially affecting the efficacy of monoclonal antibodies and vaccines. Understanding these mutations is essential for developing effective treatments, particularly those targeting virus-host protein interactions (37). Additionally, deletions in regions such as ORF7b have been reported globally, underscoring the need for ongoing research to evaluate their effects on viral fitness and pathogenicity.
5.1. Conclusions
In summary, the ongoing mutation and evolution of SARS-CoV-2 have given rise to various variants with distinct genetic profiles, including mutations in key proteins such as nsp, S, E, M, N, and accessory proteins. These mutations have been linked to changes in infectivity, transmissibility, and potential immune response, all of which influence the clinical progression of COVID-19. Understanding these variants and the specific mutations they carry is critical for developing effective diagnostic, therapeutic, and preventive strategies. Further research, particularly in the field of in silico modeling, is needed to fully grasp the implications of these mutations and their potential impact on public health.