1. Background
2. Objectives
3. Methods
3.1. Genome Sequences
3.2. Microsatellite Extraction
3.3. Statistical Analysis
3.4. SSR Distribution Across Coding Regions
3.5. Phylogenetic Tree Construction
3.6. Heat Map of the Studied Genomes
4. Results
4.1. Genome Features
Summary of the SSRs, cSSRs, and cSSR% extracted in this study. A, Genome features (genome size and GC content) and SSR/cSSR incidence across the studied genomes. B, Relative abundance and relative density of SSRs and cSSRs. C, cSSR% (percentage of SSRs present as part of a cSSR) across genomes. The incidence and distribution of SSRs follow no consistent pattern across genomes, as indicated by the varying peaks of the graph.
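The three summary metrics named in this caption can be sketched in a few lines of Python. Only cSSR% is defined in the caption itself; the definitions of relative abundance (SSR count per kb of genome) and relative density (bp covered by SSRs per kb of genome) used here are the ones commonly adopted in microsatellite surveys and are an assumption, as are the function name and its parameters.

```python
def ssr_summary(genome_length_bp, ssr_lengths_bp, ssrs_in_cssr):
    """Return (relative abundance, relative density, cSSR%) for one genome.

    genome_length_bp : genome size in bp
    ssr_lengths_bp   : list with the length in bp of every SSR found
    ssrs_in_cssr     : number of those SSRs that are part of a compound SSR
    """
    n_ssr = len(ssr_lengths_bp)
    genome_kb = genome_length_bp / 1000

    # Assumed definition: number of SSRs per kb of genome.
    relative_abundance = n_ssr / genome_kb
    # Assumed definition: bp covered by SSRs per kb of genome.
    relative_density = sum(ssr_lengths_bp) / genome_kb
    # From the caption: percentage of SSRs present as part of a cSSR.
    cssr_percent = 100 * ssrs_in_cssr / n_ssr if n_ssr else 0.0

    return relative_abundance, relative_density, cssr_percent


# Hypothetical toy genome: 30 kb, four SSRs, two of them inside a cSSR.
ra, rd, cssr_pct = ssr_summary(30000, [6, 6, 8, 10], 2)
print(ra, rd, cssr_pct)
```

Normalising by genome length in kb is what allows genomes of different sizes to be compared on the same axis.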
4.2. Incidence of SSRs and cSSRs
4.3. Relative Abundance, Relative Density and cSSR%
4.4. Repeat Motif Prevalence as per Size and Their Composition
Nucleotide composition of the incident SSRs. A, Coverage of SSRs (mono- to tri-nucleotide repeats) in genomes. Note that di-nucleotide motifs contribute the most across genomes, with a few exceptions (CV02, CV08, CV60). B, Prevalent motif constituents across mono-, di- and tri-nucleotide motifs. C, SSR distribution across coding and non-coding regions. D, Distribution of genomes on the basis of the mono-nucleotide contribution from the A/T region (shown here as AT%). The highest incidence of di-nucleotide repeats makes the genome susceptible to variation, while motif composition is inclined towards A/T irrespective of motif size, probably owing to genome composition.
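As an illustration of the AT% shown in panel D, the following minimal sketch counts mono-nucleotide repeats in a sequence and reports the share that consist of A or T. It is not the extraction tool used in the study: the function name, the regular-expression approach, and the minimum repeat number of 6 are all assumptions chosen for the example.

```python
import re


def mono_at_percent(sequence, min_repeats=6):
    """Percentage of mono-nucleotide SSRs that are A or T runs.

    min_repeats is a hypothetical threshold: a run counts as an SSR
    only if the same base occurs at least this many times in a row.
    Returns None when the sequence contains no mono-nucleotide SSR.
    """
    # Treat RNA input like DNA so that U runs count as T runs.
    seq = sequence.upper().replace("U", "T")

    # One base followed by at least (min_repeats - 1) copies of itself.
    pattern = r"([ACGT])\1{%d,}" % (min_repeats - 1)
    run_bases = [m.group(1) for m in re.finditer(pattern, seq)]

    if not run_bases:
        return None
    at_runs = sum(1 for base in run_bases if base in "AT")
    return 100 * at_runs / len(run_bases)


# Toy sequence with three runs: A x6, T x7 and G x6.
print(mono_at_percent("AAAAAAGCGTTTTTTTGGGGGGCA"))
```

A value of 100 corresponds to a genome whose mono-nucleotide SSRs are exclusive to the A/T region.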
4.5. Microsatellites in Coding Region
| S. No. | Genome ID | Gene with Highest SSR Density | Highest SSR Density | Gene with Lowest SSR Density | Lowest SSR Density |
|---|---|---|---|---|---|
| 1 | CV32 | Small membrane protein | 12.04819 | Nucleocapsid phosphoprotein | 1.508296 |
| 2 | CV28 | ORF1ab polyprotein | 4.134367 | | |
| 3 | CV30 | Hemagglutinin-esterase | 6.27451 | Membrane protein | 1.443001 |
| 4 | CV36 | Hypothetical protein | 6.582885 | Spike glycoprotein | 2.875817 |
| 5 | CV02 | Non-structural protein 3a | 12.65823 | Membrane protein | 2.487562 |
| 6 | CV08 | Membrane protein | 4.405286 | Protein 3 | 1.474926 |
| 7 | CV04 | Putative 3a protein | 13.88889 | Matrix protein | 2.534854 |
| 8 | CV61 | Orf10 | 17.09402 | Membrane protein (M) | 1.494768 |
| 9 | CV44 | Small membrane protein | 12.04819 | Membrane glycoprotein | 1.515152 |
| 10 | CV05 | Non-structural protein 7 | 8.438819 | Spike protein | 2.180431 |
| 11 | CV13 | Hypothetical protein | 17.3913 | Spike glycoprotein | 3.542958 |
| 12 | CV57 | Nonstructural protein | 14.49275 | Membrane protein | 1.529052 |
| 13 | CV03 | N protein | 5.30504 | Membrane protein (M) | 1.262626 |
| 14 | CV06 | Non-structural protein 3a | 13.88889 | Non-structural protein 3b | 2.721088 |
| 15 | CV33 | Envelope protein (E) | 3.745318 | Membrane protein (M) | 1.455604 |
| 16 | CV55 | Nonstructural protein | 6.116208 | Nucleocapsid phosphoprotein | 1.888574 |
| 17 | CV53 | Nonstructural protein | 20.83333 | Nucleocapsid phosphoprotein | 0.952381 |
| 18 | CV11 | Envelope protein | 8.888889 | Nucleocapsid protein | 0.788022 |
| 19 | CV07 | Envelope protein | 12.82051 | Surface glycoprotein | 1.703578 |
| 20 | CV14 | Envelope protein | 4.329004 | Putative ORF3 | 1.481481 |
| 21 | CV09 | Spike protein | 3.149225 | Nucleocapsid protein | 0.854701 |
| 22 | CV58 | Hemagglutinin esterase | 5.555556 | Nucleocapsid phosphoprotein | 3.968254 |
| 23 | CV60 | Putative nucleocapsid protein | 8.230453 | Putative membrane protein | 1.461988 |
| 24 | CV48 | Orf 9 | 8.714597 | ORF 5c | 1.893939 |
| 25 | CV46 | 5b protein | 12.04819 | Membrane protein | 2.949853 |
| 26 | CV12 | Nucleocapsid protein | 5.279035 | Spike protein | 2.16763 |
| 27 | CV35 | Small membrane protein | 20.08032 | Membrane glycoprotein | 1.508296 |
| 28 | CV34 | Membrane protein | 5.822416 | Hemagglutinin-esterase | 0.757576 |
| 29 | CV47 | ORF 5b | 8.032129 | N protein | 2.439024 |
| 30 | CV38 | Protein (E) | 8.658009 | Orf1ab polyprotein (Pp1ab) | 2.827388 |
| 31 | CV37 | E protein | 8.658009 | Nonstructural polyprotein Pp1ab | 2.827388 |
4.6. Mono-nucleotide Repeat Motif Exclusivity for Hosts
4.7. Phylogenetic and Similarity Analysis
5. Discussion
Correlation between mono-nucleotide A/T repeat incidence and host. The studied species of Coronaviridae are arranged in decreasing order of mono-nucleotide A/T repeat incidence (left to right; blue represents 100%, i.e., mono-nucleotide SSRs exclusive to the A/T region). The corresponding hosts are also indicated. Since no direct relation exists, multiple factors are expected to determine the viral host.




