1. Background
2. Objectives
3. Patients and Methods
3.1. Study Population
3.2. PCR and Sequencing
| Fragment / Primer | Position | Primer Sequence | Reference |
|---|---|---|---|
| A | | | this study |
| F1A | 1088 | GAC CAT TTC ATC ATC ATG TCC CA | |
| R1A | 1425 | TGT ATG CGG CGG CGA ACA AGA CC | |
| F2A | 1113 | CTT CGG AGG GCC GTT GAC TAC TTA GCG | |
| R2A | 1413 | CGA ACA AGA CCC CCC AGT GGG | |
| B | | | |
| M105 | 1292 | ATG GCA TGG GAC ATG ATG ATG | (27) |
| R1B | 2061 | TAG GCC CTA AGT TGC AGG GTG GA | this study |
| M106 | 1298 | TGG GAC ATG ATG ATG AAT TGG | (27) |
| R2B | 2022 | CAA ACC CTG TGG AAT TCA TCC AG | this study |
| C | | | this study |
| F1C | 1743 | GGC TGG GGA ACT ATC AGC TAT | |
| R1C | 2636 | AAA CCC ATG AGT CCC CGC AGC C | |
| F2C | 1773 | TCG GGC CCC AGT GAT GAC AAG | |
| R2C | 2612 | AGC CGC GTT TAG GAC AAT GAC GTT CT | |
3.3. Analysis of N-Linked Glycosylation Sites
3.4. Prediction of B-Cell Epitopes
3.5. Peptide Design
3.6. GenBank Accession Numbers
4. Results
4.1. Sequence Alignment and Genetic Distances
Genetic Distances a,b

| No. | Sequence | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ZADGM7890 | ||||||||||||||||||
| 2 | ZADGM6544 | 0.12 | |||||||||||||||||
| 3 | ZADGM4227 | 0.13 | 0.12 | ||||||||||||||||
| 4 | ZADGM1908 | 0.14 | 0.12 | 0.13 | |||||||||||||||
| 5 | ZADGM1707 | 0.10 | 0.12 | 0.11 | 0.12 | ||||||||||||||
| 6 | ZADGM651 | 0.13 | 0.13 | 0.14 | 0.13 | 0.12 | |||||||||||||
| 7 | ZADGM308 | 0.12 | 0.12 | 0.12 | 0.12 | 0.10 | 0.11 | ||||||||||||
| 8 | ZADGM6485 | 0.13 | 0.10 | 0.13 | 0.12 | 0.12 | 0.12 | 0.10 | |||||||||||
| 9 | ZADGM4124 | 0.13 | 0.13 | 0.13 | 0.13 | 0.12 | 0.14 | 0.13 | 0.12 | ||||||||||
| 10 | ZADGM2439 | 0.14 | 0.15 | 0.15 | 0.15 | 0.12 | 0.14 | 0.12 | 0.13 | 0.14 | |||||||||
| 11 | ZADGM2352 | 0.13 | 0.14 | 0.14 | 0.14 | 0.13 | 0.12 | 0.11 | 0.12 | 0.14 | 0.13 | ||||||||
| 12 | ZADGM525gp | 0.14 | 0.14 | 0.15 | 0.14 | 0.12 | 0.15 | 0.14 | 0.14 | 0.14 | 0.16 | 0.16 | |||||||
| 13 | ZADGM869 | 0.14 | 0.14 | 0.15 | 0.13 | 0.13 | 0.10 | 0.13 | 0.13 | 0.14 | 0.14 | 0.14 | 0.17 | ||||||
| 14 | ZADGM3013 | 0.15 | 0.14 | 0.14 | 0.15 | 0.12 | 0.15 | 0.13 | 0.12 | 0.15 | 0.15 | 0.15 | 0.17 | 0.15 | |||||
| 15 | ZADGM0518 | 0.09 | 0.12 | 0.11 | 0.12 | 0.08 | 0.13 | 0.10 | 0.11 | 0.11 | 0.12 | 0.13 | 0.14 | 0.13 | 0.11 | ||||
| 16 | ZADGM2582 | 0.14 | 0.14 | 0.13 | 0.14 | 0.13 | 0.13 | 0.12 | 0.13 | 0.14 | 0.13 | 0.13 | 0.16 | 0.15 | 0.15 | 0.13 | |||
| 17 | ZADGM2088 | 0.14 | 0.13 | 0.13 | 0.13 | 0.12 | 0.12 | 0.11 | 0.12 | 0.13 | 0.14 | 0.12 | 0.13 | 0.14 | 0.14 | 0.12 | 0.12 | ||
| 18 | ZADGM1104 | 0.13 | 0.14 | 0.13 | 0.14 | 0.12 | 0.13 | 0.09 | 0.12 | 0.14 | 0.14 | 0.14 | 0.13 | 0.15 | 0.15 | 0.12 | 0.14 | 0.13 | |
a Values range between 0 (0%) and 1 (100%) substitutions per nucleotide site.
b The column numbers 1-18 correspond to the sequence numbers listed in the rows.
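As an illustration of how a pairwise value in a matrix of this kind can be obtained, the following is a minimal sketch of the simplest distance measure, the uncorrected p-distance (the proportion of differing sites between two aligned sequences). The substitution model actually used for the table is not stated in this section, so the choice of p-distance is an assumption; the function name `p_distance` is hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Assumption: sequences are already aligned to equal length and
    alignment gaps are written as '-'. Gapped columns are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0   # columns where both sequences have a nucleotide
    differing = 0  # compared columns where the nucleotides differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared
```

A value of 0.12 under this measure would mean that 12% of the compared sites differ. Model-corrected distances are generally somewhat larger than the p-distance for the same pair.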
4.2. Analysis of E1 and E2 N-Linked Glycosylation
Probability at Glycosylation Site a,b

| No. | Sequence | E1 196 | E1 209 | E1 234 | E1 305 | E1 325 | E1 No. of Sites | E2 417 | E2 430 | E2 448 | E2 476 | E2 533 | E2 541 | E2 557 | E2 623 | E2 645 | E2 No. of Sites |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ZADGM7890 | + | ++ | ++ | + | - | 4 | + | ++ | - | - | + | ++ | + | + | + | 7 |
| 2 | ZADGM6544 | + | ++ | + | - | - | 3 | + | ++ | - | - | - | ++ | + | + | + | 6 |
| 3 | ZADGM4227 | + | ++ | ++ | + | - | 4 | ++ | ++ | - | - | + | - | + | + | + | 6 |
| 4 | ZADGM1908 | + | ++ | + | - | - | 3 | ++ | + | - | + | ++ | + | + | + | - | 7 |
| 5 | ZADGM1707 | + | ++ | + | + | - | 4 | ++ | + | + | + | + | ++ | + | + | + | 9 |
| 6 | ZADGM651 | + | ++ | ++ | + | - | 4 | ++ | + | - | - | + | + | + | + | - | 6 |
| 7 | ZADGM308 | + | ++ | ++ | - | - | 3 | ++ | - | + | - | + | ++ | + | + | - | 6 |
| 8 | ZADGM6485 | + | ++ | ++ | - | - | 3 | ++ | ++ | - | + | + | ++ | + | + | - | 7 |
| 9 | ZADGM4124 | + | ++ | + | + | - | 4 | ++ | + | - | - | + | + | + | + | + | 7 |
| 10 | ZADGM2439 | + | ++ | + | + | - | 4 | ++ | + | - | + | + | + | + | + | - | 7 |
| 11 | ZADGM2352 | + | ++ | ++ | - | - | 3 | ++ | ++ | - | - | + | ++ | + | + | - | 6 |
| 12 | ZADGM525gp | - | ++ | ++ | + | - | 3 | ++ | + | - | - | - | + | + | + | + | 6 |
| 13 | ZADGM869 | + | ++ | + | - | - | 3 | + | ++ | - | - | + | + | + | + | + | 7 |
| 14 | ZADGM3013 | + | ++ | + | + | - | 4 | ++ | ++ | ++ | + | + | ++ | + | + | + | 9 |
| 15 | ZADGM0518 | + | ++ | + | - | - | 3 | ++ | ++ | ++ | - | - | ++ | + | + | - | 6 |
| 16 | ZADGM2582 | + | ++ | + | - | - | 3 | ++ | + | - | + | + | ++ | + | + | + | 8 |
| 17 | ZADGM2088 | - | ++ | ++ | - | - | 2 | ++ | + | - | - | + | + | + | + | - | 6 |
| 18 | ZADGM1104 | + | ++ | ++ | - | - | 3 | ++ | ++ | + | + | + | + | + | + | + | 9 |
a Numbering is based on the M62321 full-length sequence.
b Glycosylation probability is shown by +++ (probability > 70%), ++ (probability between 60 and 70%), + (probability between 50 and 60%), and - (not predicted).
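The notation used in the table can be made concrete with a short sketch. N-linked glycosylation is conventionally sought at N-X-S/T sequons, where X is any residue except proline. The code below scans a protein sequence for such sequons and maps a predicted probability to the symbols defined in footnote b. The prediction tool is not named in this section, and the handling of the exact 60% and 70% boundaries is an assumption; the function names `find_sequons` and `probability_symbol` are hypothetical.

```python
from __future__ import annotations


def find_sequons(protein: str, start: int = 1) -> list[int]:
    """Return positions of Asn residues in N-X-S/T sequons (X != Pro).

    `start` is the position number assigned to the first residue,
    so that results can follow an external numbering scheme.
    """
    seq = protein.upper()
    positions = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            positions.append(start + i)
    return positions


def probability_symbol(probability: float) -> str:
    """Map a predicted probability (0-1) to the table's symbols.

    Assumption: exactly 0.70 counts as '++' and exactly 0.60 as '++',
    since the footnote does not specify the boundary cases.
    """
    if probability > 0.70:
        return "+++"
    if probability >= 0.60:
        return "++"
    if probability >= 0.50:
        return "+"
    return "-"
```

For example, `find_sequons("CNCSIYSGH", start=304)` returns `[305]`, which corresponds to the E1 site at position 305 in the table when the peptide is numbered from position 304.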
4.3. B-Cell Epitope Prediction
| Position | Predicted Epitopes | Antigen score | Genotype 1a | Genotype 1b | Genotype 2 | Genotype 3 | Genotype 4 | Genotype 6 |
|---|---|---|---|---|---|---|---|---|
| 504 | GPVYCFTPSPVVVGTT | 1.1613 | 92 | 97 | 88 | 94 | 89 | 73 |
| 675 | LPCSFTPTPALSTGLI | 0.5340 | 0 | 0 | 0 | 0 | 0 | 0 |
| 685 | LSTGLIHLHQNIVDTQ | 0.6639 | 0 | 0 | 0 | 0 | 0 | 2 |
4.4. Peptide Design
| Position a | Peptide | Length | Molecular Weight | Theoretical pI | Extinction Coefficient (M⁻¹ cm⁻¹) | Instability Index | Aliphatic Index | GRAVY | Hydrophobic Amino Acids, % b | N-linked Glycosylation c | N-linked Phosphorylation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 201 | YHTNDCPNSSI | 14 | 1611.7 | 5.08 | 2980 | 17.36 | 69.29 | -0.343 | 21.4 | + | - |
| 262 | VDYLAGGAA | 9 | 835.9 | 3.80 | 1490 | -3.53 | 108.89 | 0.867 | 22.2 | - | - |
| 304 | CNCSIYSGH | 9 | 983 | 6.72 | 1615 | 5.69 | 43.33 | -0.056 | 11.1 | ++ | - |
| 314 | TGHRMAWDMMMNWSPT | 16 | 1952.2 | 6.41 | 11000 | 29.16 | 6.25 | -0.706 | 37.5 | - | - |
| 352 | HWGVLFAAAY | 10 | 1134.3 | 6.74 | 6990 | -4.25 | 98 | 1.040 | 40 | - | - |
| 562 | VKTCGAPPC | 9 | 875 | 8.03 | 125 | 30.68 | 43.33 | 0.311 | 11.1 | - | - |
| 585 | TDCFRKHP | 8 | 1003.1 | 7.92 | 0 | 5.15 | 0 | -1.512 | 12.5 | - | - |
| 645 | ACNWTRGERCDL | 12 | 1423.5 | 6.1 | 5625 | 27.31 | 40.83 | -0.908 | 16.7 | + | - |
| 664 | LSPLLHTTTQ | 10 | 1110.2 | 6.74 | - | 37.86 | 117 | 0.020 | 30 | - | - |
| 675 | AILPCSFTPTPALSTGLIHLHQNIVDTQ | 28 | 2988.4 | 5.97 | 0 | 28.19 | 115 | 0.421 | 32.1 | - | - |
| 725 | FLLLADAR | 8 | 918.1 | 5.84 | - | -1.86 | 171.25 | 1.225 | 50 | - | - |
a Numbering is based on the M62321 full-length sequence.
b Hydrophobic amino acids counted: Leu, Val, Ile, Met, Phe, and Trp.
c Glycosylation probability is shown by +++ (probability > 70%), ++ (probability between 60 and 70%), + (probability between 50 and 60%), and - (not present).
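Three of the tabulated parameters can be recomputed directly from a peptide sequence. The sketch below uses the standard Kyte-Doolittle hydropathy scale for GRAVY, Ikai's formula for the aliphatic index, and the hydrophobic residue set given in footnote b. The software the authors used is not stated in this section, so these definitions are an assumption about how the values were derived; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Hydrophobic residues as listed in footnote b of the table.
HYDROPHOBIC = set("LVIMFW")


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def aliphatic_index(peptide: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    n = len(peptide)

    def mole_percent(aa: str) -> float:
        return 100.0 * peptide.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def hydrophobic_percent(peptide: str) -> float:
    """Percentage of residues in the footnote-b hydrophobic set."""
    return 100.0 * sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)
```

For the peptide `VDYLAGGAA` these functions give a GRAVY of 0.867, an aliphatic index of 108.89, and 22.2% hydrophobic residues, in agreement with the row for position 262.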