In this study, we screened seven Filoviridae family genomes for the presence, abundance and composition of SSR tracts. The incidence of SSRs (mononucleotide to hexanucleotide repeats) was proportional to genome size: Filoviridae family genomes carried 64-80 SSRs per genome, compared with 23-45 SSRs in potyviruses (29), 22-48 SSRs in Human immunodeficiency virus isolates (30), and 4-19 SSRs in geminiviruses, which have a smaller genome. Although relative density tends to be positively correlated with genome size in some fungal and other genomes (31-33), in Filoviridae family species neither relative density nor relative abundance was significantly correlated with genome size or GC content.
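The two measures discussed above can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in the SSR literature (relative abundance as SSRs per kb of genome, relative density as bp covered by SSRs per kb of genome). The minimum repeat thresholds, the function names and the simplified regular-expression scan are illustrative assumptions, not the settings or software used in this study.

```python
import re

# Minimum copy number per motif length (mono- to hexanucleotide).
# Illustrative thresholds only; not necessarily those used in the study.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

def relative_abundance(ssrs, genome_len):
    """Number of SSRs per kb of sequence analyzed."""
    return len(ssrs) / (genome_len / 1000)

def relative_density(ssrs, genome_len):
    """Total bp covered by SSRs per kb of sequence analyzed."""
    return sum(end - start for start, end, _, _ in ssrs) / (genome_len / 1000)

def pearson_r(xs, ys):
    """Pearson correlation coefficient, e.g. genome size vs. relative density."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5
```

Applied per genome, the two measures could then be passed with the corresponding genome sizes or GC contents to `pearson_r`; a significance test on the coefficient would still be needed and is not shown here.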
The sequence composition of repeats determines the abundance of microsatellites. In the Filoviridae family, AC/CA repeats predominated whereas GC/CG repeats were rare. GC/CG repeats are also rare in geminiviruses, human, Drosophila, Arabidopsis thaliana, Caenorhabditis elegans, yeast (3), fungi (31, 34) and some eukaryotes (35). Dinucleotide repeats are more prevalent than trinucleotide repeats, which is attributed to their higher slippage rate and the resulting instability (35). Repeat sequences may provide a molecular device for faster adaptation to environmental stresses (9, 19, 36) and may thus accelerate the evolution of the Filoviridae family.
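As an illustration of how motif classes such as AC/CA or GC/CG can be tallied, the following Python sketch groups motifs that are cyclic rotations of one another under a single label. The grouping rule and the function names are assumptions made for illustration; the source does not state how motifs were pooled, and reverse complements are deliberately not merged here.

```python
from collections import Counter

def motif_class(motif):
    """Label a motif by its alphabetically smallest cyclic rotation,
    so that AC and CA (or GC and CG) fall into the same class."""
    motif = motif.upper()
    return min(motif[i:] + motif[:i] for i in range(len(motif)))

def tally_motif_classes(motifs):
    """Count SSR motifs per rotation class."""
    return Counter(motif_class(m) for m in motifs)
```

With this convention the class written AC/CA in the text is labelled `AC`, and GC/CG is labelled `CG`.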
Notably, no significant correlation was observed between genome size and two of the microsatellite features (relative density and relative abundance), consistent with observations in E. coli and HIV-1. The analysis of cSSRs was also informative. These compound microsatellites are reportedly involved in the regulation of gene expression and in protein function in several species (3). Although their significance in the Filoviridae family is not clear, our results suggest possibly complex regulation at the functional level. Further, varying dMAX from 10 to 50 showed that the cSSR percentage in the five analyzed species of the Filoviridae family increased with dMAX, though not linearly. Approximately 97% of the extracted cSSRs consisted of only two motifs. The largest compound microsatellite in the Filoviridae family was composed of three SSRs, whereas the largest compound microsatellite has four SSRs in prokaryotes and three in eukaryotes. In general, cSSR incidence decreases as complexity increases. Interestingly, the cSSR percentage ranged from 1.25 to 5.13% in Filoviridae family genomes, compared with 0-15.15% in potyvirus genomes (28), 0-24.24% in HIV-1 genomes, 4-25% in eight eukaryotic genomes (17), and 1.75-2.85% in E. coli genomes (37). The distribution of microsatellites in the viral genome is organism specific rather than host specific. This is supported by the fact that the taxonomy of the Filoviridae family shows no comparable congruence with host taxonomy, and species from the same lineage may have quite unrelated hosts (38). Interestingly, each Filoviridae family species possesses at least one cSSR, which might contribute to their variation and evolution.
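The dependence of cSSR percentage on dMAX can be sketched as follows. The Python example assumes that two adjacent SSRs belong to the same compound microsatellite when the gap between them is at most dMAX bp, and that cSSR percentage is the number of compound tracts divided by the total number of SSRs. Both definitions, the function names and the coordinates used are illustrative assumptions rather than the study's implementation or data.

```python
def compound_ssrs(ssrs, dmax):
    """Group SSR intervals (start, end) into compound microsatellites.

    Adjacent SSRs are chained when the gap between the end of one and the
    start of the next is <= dmax; only chains of two or more are returned.
    """
    groups, current = [], []
    for start, end in sorted(ssrs):
        if current and start - current[-1][1] <= dmax:
            current.append((start, end))
        else:
            if len(current) > 1:
                groups.append(current)
            current = [(start, end)]
    if len(current) > 1:
        groups.append(current)
    return groups

def cssr_percentage(ssrs, dmax):
    """Compound tracts as a percentage of all SSRs (assumed definition)."""
    if not ssrs:
        return 0.0
    return 100.0 * len(compound_ssrs(ssrs, dmax)) / len(ssrs)

# Toy coordinates, invented purely to show the effect of dmax.
toy = [(0, 12), (20, 30), (75, 85), (200, 212), (250, 260), (500, 510)]
for dmax in (10, 20, 30, 40, 50):
    print(dmax, round(cssr_percentage(toy, dmax), 2))
```

Because raising dMAX can either create a new compound tract or extend an existing one, the percentage need not rise by a fixed step at each increment, in line with the non-linear increase described above.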
Microsatellite regions, which have higher mutation rates than the rest of the genome (16, 39), play a crucial role in genome evolution by acting as a source of quantitative genetic variation (40). The SSR mutation rate is known to be affected by motif length, motif sequence, number of repeats and purity of repetition (41). A single base substitution can stabilize a pure microsatellite by reducing its purity of repetition. The functional role of tandem repeats in viruses remains to be fully elucidated. However, since repetitive sequences reportedly act as hot spots for recombination (42), we postulate their involvement in genetic events such as recombination, replication, and repair, which drive sequence diversity and thereby form the genetic basis of adaptation. The microsatellites in Filoviridae family genomes may serve as one of the tools for a better understanding of viral genetic diversity and its implications.