Designing a Chimeric Vaccine Against Colorectal Cancer

authors:

Ziba Veisi Malekshahi 1, Masoumeh Rajabibazl 2, 3, Walead Ebrahimizadeh 4, Jafar Amani 5, *, Babak Negahdari 1, **

Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran, IR Iran
Department of Clinical Biochemistry, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, IR Iran
School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical sciences, Tehran, IR Iran
Department of Medical Biotechnology, School of Advanced Technologies in Medicine, Shiraz University of Medical Sciences, Shiraz, IR Iran
Applied Microbiology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, IR Iran
Corresponding Authors:

how to cite: Veisi Malekshahi Z, Rajabibazl M, Ebrahimizadeh W, Amani J, Negahdari B. Designing a Chimeric Vaccine Against Colorectal Cancer. Int J Cancer Manag. 2017;10(12):e7743. https://doi.org/10.5812/ijcm.7743.

Abstract

Background:

Colorectal cancer is a major cause of death worldwide. Recently, cancer immunotherapy has been used in cancer prevention and treatment; it directs the immune system to respond to cancerous cells and eliminate them. Recent studies have specifically targeted tumor markers for vaccine design.

Objectives:

Here, we designed a chimeric vaccine comprising CEA and CA19-9 against colorectal cancer.

Methods:

The construct was analyzed, using bioinformatics tools and servers. The physicochemical properties, structures, antigenicity, stability, MHC binding properties, and ligand-receptor docking of chimeric protein (CE-CA) were predicted.

Results:

The results showed that the structure of CE-CA has a suitable conformation in which each domain is separated; furthermore, the epitope mapping did not change after the domains were combined.

Conclusions:

The results showed that the construct can be an appropriate vaccine candidate against colorectal cancer and could generate a potent immune response against this cancer. In silico tools are useful for vaccine design.

1. Background

Colorectal cancer is considered a major gastrointestinal cancer. It is the second cancer-related cause of death after lung cancer around the world (1). Based on epidemiologic studies, each year 1 million new cases of colorectal cancer are diagnosed and 500 000 patients lose their lives to the disease (2). The highest incidence of colorectal cancer is reported in North America, Australia, New Zealand, Europe, and Japan, and the lowest incidence is in South America, Asia, and Africa (3). Colorectal cancer has the third highest number of patients and is the second leading cause of death due to cancer (4). Both genders have an equal chance of developing colorectal cancer, and the disease is mostly seen in elderly people. Approximately 90% of cases are diagnosed after the age of 50; thus, this age group is the target group for diagnostic assays (5).

Common treatment includes surgery, chemotherapy, and radiotherapy, which in most cases are followed by recurrence of cancer and metastasis. Furthermore, these treatment methods are considered invasive and hazardous and are ineffective in the prevention of metastasis. Studies show that the immune system is capable of acting against cancer and can prevent metastasis if properly guided. Thus, enhancing the immune system and introducing tumor-associated antigens to the immune cells can greatly affect the outcome of treatment (6). One of the methods of boosting the immune system to battle cancer is the use of cancer vaccines, which has gained significant interest among cancer researchers in the past few years. The main goal of cancer immunotherapy is immune recognition and removal of cancer cells. Several research studies showed that antigens expressed by tumor cells can elicit specific cellular and humoral immune responses (7). Antigen cancer vaccines, which consist of cytotoxic T lymphocyte (CTL) epitopes from tumor-specific antigens (TSA) or tumor-associated antigens (TAA), are important in immunotherapy (8).

One of the major drawbacks in developing an effective diagnostic method or treatment for colorectal cancer is the lack of a specific tumor marker. Most tumor markers in colorectal cancer are genetic mutations that are detected through the analysis of multiple genes, such as P53 and Ras, or through microsatellite instability and assays for loss of heterozygosity of the long arm of chromosome 18 (9). The most commonly used protein markers are carcinoembryonic antigen (CEA) and CA19-9. Using either of these markers alone does not provide sufficient evidence to confirm or rule out cancer. Usually, both markers are used together to provide reasonable evidence of the status of cancer (10, 11). If a cancer vaccine is to be developed against colorectal cancer, these 2 antigens need to be used. The proper choice of antigen domains and the construction of a new cancer antigen that can stimulate the immune system against colorectal cancer cells have a vital impact on the quality of the vaccine (12). Thus, bioinformatics plays an important role in the design and development of a cancer vaccine. Among the cancer vaccines developed so far for colorectal cancer, CEA has had the most application (13). However, CEA alone is not recognized as a specific colorectal cancer tumor-associated antigen; hence, using CEA in combination with another vaccine candidate could increase the specificity of the vaccine. In that regard, the CA19-9 antigen has the appropriate characteristics of a vaccine candidate. CA19-9 is a cell-surface antigen whose expression increases significantly in colorectal cancer, and it has already been approved as a marker of colorectal cancer. Thus, based on available data, these 2 antigens in combination can provide specificity for a colorectal cancer vaccine. In this study, using bioinformatics tools, we aimed at designing a colorectal cancer vaccine based on immunodominant domains of the CEA and CA19-9 proteins.

2. Methods

2.1. Domains Selection and Construct Design

CEA is a large antigen with a molecular weight of 180 kDa and it is heavily glycosylated; approximately 60% of the weight of the complete protein with post-translational modifications (PTM) belongs to carbohydrates (14). Since most expression systems, including simple eukaryotes, are unable to correctly mimic the glycosylation pattern of mammals, the chosen domains need to have a minimum number of glycosylation sites. Study of the CEA glycosylation pattern revealed that domains 1, 4, 5, and 7, with 2, 4, 3, and 3 glycosylation sites, respectively, are the least glycosylated of the protein's 7 domains. Domain 7 is located at the C-terminus of the protein and is, therefore, less likely to be recognized by and interact with immune system components. Thus, domains 1, 4, and 5 were chosen for the final construct. These domains contain the immunogenic sites and can fold independently to mimic the conformation of complete CEA. Unlike CEA, CA19-9 is a much smaller protein with only 1 glycosylation site. Hence, after the removal of the first 64 amino acids, which are either signal peptide or cytoplasmic domain, the complete sequence was included in the final chimeric construct. This chimeric construct includes domain 1 of CEA attached to domains 4 and 5 via a flexible Serine-Glycine linker; these are then linked to the CA19-9 sequence by a rigid linker to provide independent folding for each section of the protein. The construct was named CE-CA. The sequences of the major colorectal cancer antigens (CEA, CA19-9) were obtained from UniProt/Swiss-Prot and NCBI (15). The sequences were submitted to the basic local alignment search tool (BLAST) to confirm that the selected sequences were conserved (16).
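The domain arrangement described above can be sketched in a few lines of Python. This is only an illustration: the three domain strings are short hypothetical placeholders (not the real CEA or CA19-9 sequences), and the repeat unit and repeat counts of the linkers are assumptions chosen for the example.

```python
# Minimal sketch: join antigen domains with a flexible and a rigid linker.
# All domain sequences below are hypothetical placeholders.

FLEXIBLE_UNIT = "GGGGS"  # assumed Serine-Glycine repeat unit
RIGID_UNIT = "EAAAK"     # helix-forming repeat unit


def make_linker(unit: str, repeats: int) -> str:
    """Return the repeat unit concatenated `repeats` times."""
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    return unit * repeats


def assemble_construct(cea_d1: str, cea_d4_5: str, ca19_9: str,
                       flex_repeats: int = 2, rigid_repeats: int = 4) -> str:
    """CEA domain 1 + flexible linker + CEA domains 4-5 + rigid linker + CA19-9."""
    flexible = make_linker(FLEXIBLE_UNIT, flex_repeats)
    rigid = make_linker(RIGID_UNIT, rigid_repeats)
    return cea_d1 + flexible + cea_d4_5 + rigid + ca19_9


# Placeholder inputs, 10 residues each.
construct = assemble_construct("MKLTIESTPF", "PELPKPSISS", "RVFSSWAQLY")
print(construct)
print(len(construct))
```

Keeping the linkers as parameters makes it easy to test alternative linker lengths before submitting the sequence to structure prediction servers.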

2.2. Antigenicity and Allergenicity Evaluation

APPLE and ALGPred servers were used to analyze the allergenicity of the construct from sequence-derived structural and physicochemical properties of the whole protein. The accuracy of ALGPred can exceed 80% when its approaches are combined (17). The VaxiJen server was used for the prediction of protective antigens, tumor antigens, and subunit vaccines (18).

2.3. 3D structural Model Prediction

Since proper folding of the recombinant chimeric construct has a vital impact on the B-cell and T-cell immune response, the construct needs to fold similarly to the template proteins, namely CEA and CA19-9. The secondary and tertiary structures were predicted by the GOR (19) and I-TASSER servers, respectively, and the models were then aligned with both CEA and CA19-9 (20).

2.4. Homology Modeling

Homology modeling of the chimeric protein was performed by the Phyre2 (Protein Homology/analogY Recognition Engine V 2.0) server (21).

2.5. Evaluation of Model Stability and Validation

The 3D structural stability of the chimeric protein was analyzed with Swiss-PdbViewer software for energy minimization and with the RAMPAGE server (22).

For tertiary structure validation, the ProSA (23), PROCHECK (24), and ERRAT (25) servers were used. ProSA-web is a server that checks 3D models of protein structures for potential errors. The ProSA-web z-score is shown in a plot that contains the z-scores of experimentally determined protein structures in the PDB. The residue-by-residue stereochemical quality of the protein structure was validated using the Ramachandran plot obtained from PROCHECK. The ERRAT server is a protein structure verification algorithm that evaluates the statistics of non-bonded interactions between different atom types compared to a database of reliable high-resolution crystallography structures.

2.6. Analysis of mRNA

The secondary structure of the mRNA was predicted by GeneBee (26), and the mRNA resulting from both sequences was analyzed by the mfold server (27).

2.7. Codon Optimization

The nucleotide sequence of the CE-CA construct was optimized by the GenScript OptimumGene algorithm (www.genescript.com, Piscataway, New Jersey, USA) based on the codon bias of E. coli (28).

2.8. Analysis of Physical and Chemical Properties of the CE-CA Chimeric Protein

Physical and chemical properties were obtained using ExPASy ProtParam (29). They included amino acid composition, molecular weight, pI, grand average of hydropathicity (GRAVY), solubility at neutral pH and at pH ranging from 4 to 12, total number of positive and negative residues, and half-life in prokaryotic and eukaryotic systems.

2.9. Solvent Accessibility Prediction

Protein solubility was predicted by the Proso (30) and SOLpro (31) servers.

2.10. Analysis of Conserved Domains and Protein Localization

Transmembrane and conserved domains were predicted using the DAS server (32). Amino acid conservation was determined by the PRALINE server (33). Conserved functional and structural amino acids were also identified by the ConSurf server (34). Protein localization in eukaryotic and prokaryotic systems was estimated using the CELLO server (35). Membrane protein topology and signal peptides were predicted by the OCTOPUS server (36). In addition, the SignalP 4.1 server (37) was used to predict the presence and location of signal peptide cleavage sites in the amino acid sequence.

2.11. Prediction of B-Cell Epitopes

The linear B-cell epitopes were predicted by the Bcepred, ABCpred, and Bepipred servers. Discotope 1.2 (38-40) and SEPPA servers were used to predict discontinuous B-cell epitopes (41). In addition, the Ellipro server (42) was used to predict both linear and discontinuous B-cell epitopes. Ellipro is a method that predicts epitopes based on solvent accessibility and flexibility.

2.12. Prediction of Cleavage Sites

Proteasome cleavage sites of the chimeric protein were predicted by NetChop 3.1 (43), MAPPP (44), and PCPS (45). The binding affinity of peptides to the TAP protein was computed by TAPPred (46).

2.13. Prediction of MHC Binding Peptides Affinity

The chimeric protein was analyzed for MHC binding peptides. NetCTL (47), SYFPEITHI (48), and CTLpred (49) were used for MHC-presented epitopes and MHC-specific anchor and peptide motifs. The NetMHC server (50) was used to produce a neural network prediction of binding affinities for MHC.

2.14. Prediction of T-Cell Epitopes

For prediction of peptides from the antigenic sequence that bind MHC class I, the Propred-I (51) and nHLAPred (52) servers were used.

For prediction of peptides from the chimeric protein that bind MHC class II, HLApred (52), MHC2Pred (53), and Propred (54) were used.

2.15. Prediction of Post-Translational Modification

Glycation and N-glycosylation sites in the chimeric protein were predicted by the NetGlycate 1.0 (55) and NetNGlyc 1.0 (56) servers, respectively. O-glycosylation sites were predicted by the YinOYang 1.2 (57) and NetOGlyc (58) servers. The Myristoylator server (59) was used for prediction of N-terminal myristoylation sites, and the NetPhos server was used for prediction of phosphorylation sites. Potential C-terminal GPI-modification sites were predicted by the GPI server (60).

3. Results

3.1. The Design and Construction of Chimeric Gene

Two protein fragments, 551 amino acids from CEA and CA19-9, the major colorectal cancer antigens, were selected and combined into a chimeric structure. A Serine-Glycine linker was designed to separate the domains of CEA. Linkers consisting of EAAAK repeats were used to separate the domains of CEA from CA19-9. It has been shown that these linkers stabilize helix formation between different domains. Four repeated EAAAK sequences were introduced between the 2 domains for efficient separation. The EcoRI and HindIII restriction sites for cloning in prokaryotic vectors were introduced at the N- and C-terminal ends of the sequence, respectively. The arrangement of fragment junctions and linker sites is shown in Figure 1A.
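As a hedged illustration of the cloning step, the sketch below flanks a coding sequence with the EcoRI (GAATTC) and HindIII (AAGCTT) recognition sites and refuses sequences that already contain either site internally. The function name and the toy coding sequence are hypothetical; the real CE-CA gene is not reproduced here.

```python
# Minimal sketch: add EcoRI/HindIII sites and reject internal occurrences.
# Both recognition sequences are palindromic, so checking one strand suffices.

ECORI = "GAATTC"
HINDIII = "AAGCTT"


def add_cloning_sites(cds: str) -> str:
    """Return EcoRI + cds + HindIII, or raise if cds already holds a site."""
    cds = cds.upper()
    for name, site in (("EcoRI", ECORI), ("HindIII", HINDIII)):
        if site in cds:
            raise ValueError(f"internal {name} site found in coding sequence")
    return ECORI + cds + HINDIII


toy_cds = "ATGAAACTGACCATTGAA"  # hypothetical fragment, not the CE-CA gene
flanked = add_cloning_sites(toy_cds)
print(flanked)
```

A production check would also scan the junctions created by the added sites and any extra bases needed for efficient digestion; those details are outside this sketch.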

Figure 1. A, schematic model of the construct; B, analysis of chimeric CE-CA protein secondary structure; C, prediction of the 3D structure of the chimeric protein (CE-CA) by the I-TASSER server. The result was viewed with Accelrys Discovery Studio Visualizer 1.7 software.

3.2. Antigenicity and Allergenicity Evaluation

The antigen index calculated by the VaxiJen server for CEA, CA19-9, and the chimeric protein was 0.45, 0.40, and 0.48, respectively. The allergenicity analyses by the APPLE and ALGPred servers classified the antigen as a non-allergen.

3.3. Secondary Structure Prediction

The secondary structure of the chimeric protein was predicted by the GOR server. The results showed that, of the 598 residues in total, alpha helix (15.38%), extended strand (24.25%), and random coil (60.37%) are the structural constituents of the chimeric protein. The secondary structure prediction of the chimeric protein is shown in Figure 1B.
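The reported percentages can be converted back into approximate residue counts as a quick consistency check. The short sketch below uses only the numbers given above; the variable names are illustrative.

```python
# Convert the reported GOR percentages into residue counts for a 598-residue protein.

TOTAL_RESIDUES = 598
percentages = {
    "alpha helix": 15.38,
    "extended strand": 24.25,
    "random coil": 60.37,
}

counts = {state: round(TOTAL_RESIDUES * pct / 100) for state, pct in percentages.items()}

for state, n in counts.items():
    print(f"{state}: {n} residues")
print("sum:", sum(counts.values()))
```

The rounded counts add up to the full sequence length, which suggests the three percentages are internally consistent.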

3.4. Homology Modeling

Phyre2 employs the alignment of hidden Markov models via HHsearch to improve alignment accuracy and detection rate. It also incorporates Poing, an ab initio folding simulation, to model regions of a protein with no detectable homology to known structures.

3.5. Tertiary Structure Prediction

The I-TASSER server was used for the prediction of the tertiary structure of the protein. The predicted model of the chimeric construct showed a protein with 4 domains attached together by linkers (Figure 1C). The confidence score (C-score) for estimating the quality of the predicted models is typically in the range of -5 to 2. The C-score of the model predicted by I-TASSER was -1.48. The z-score of the input structure was within the range of scores typically found for native proteins of similar size. Also, the template modeling (TM) score for this model was 0.53 ± 0.15 and the root-mean-square deviation (RMSD) was 11.2 ± 4 Å.

3.6. Evaluation of Model Stability and Validation

The quality and potential errors of the 3D structure were investigated by the ERRAT server and ProSA-web. The z-score of the structure was -3.15 and the overall quality factor (ERRAT) of the structure was 64.07%. The quality and potential errors of the 3D structure are shown in Figures 2A, 2B, and 2C.

Figure 2. A, z-score plot for the 3D structure of the chimeric protein, displayed against structures determined by NMR spectroscopy (dark blue) and X-ray crystallography (light blue); B, plot of local model quality, showing energies as a function of amino acid sequence position; C, the overall quality factor plot (ERRAT) of the structure (64.07%). The ERRAT plot shows the regions of the 3D structure that can be rejected at the 95% confidence level in gray lines and the regions that can be rejected at the 99% level in black lines; D, analysis of mRNA stability and start codon position in the structure, with free energy details for the mRNA structure by the mfold server.

3.7. mRNA Structure Prediction

The secondary structure of the mRNA was predicted using mfold. The 5' terminus of the gene folded in the manner typical of bacterial gene structures. The minimum free energy of the secondary structure formed by the RNA molecule was predicted. All 42 structural elements obtained in this analysis showed RNA folding. The mRNA structure had a free energy of -527.74 kcal/mol, and the first nucleotides at the 5' end did not form a long stable hairpin or pseudoknot (Figure 2D). The data showed that the mRNA was stable enough for efficient translation in the new host (Figure 2D).

3.8. Codon Optimization Analysis

Life Technologies' "Gene Optimizer" service is a gene optimization technology that can modify both recombinant and natural gene sequences to reach the highest possible level of expression in any expression system. Both the wild type and the optimized construct were analyzed for their codon bias and GC content. The analysis of the sequences encoding the optimized chimeric construct and the wild type gene is shown in Figure 3. The codon adaptation index (CAI) of the chimeric construct was 0.77, while that of the wild type gene was 0.66 (Figure 3). The percentage of codons having a frequency distribution of 90 to 100 in the wild type chimeric gene was 45% for E. coli and 85% for Mus; in the optimized gene sequence it improved to 95% for E. coli and remained at 85% for Mus (Figure 3). The overall GC content was reduced from 56.78% to 51.54%, which should increase the overall stability of the mRNA from the synthetic gene. Within the recombinant chimeric construct, splice sites, polyadenylation signals, instability elements, and all cis-acting sites that may have a negative influence on the expression rate were removed. Furthermore, the necessary restriction sites (EcoRI and HindIII) were placed at the ends of the sequence for cloning purposes.
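For readers unfamiliar with the two metrics reported here, the following sketch shows how a codon adaptation index (geometric mean of per-codon relative adaptiveness) and a GC percentage can be computed. The codon usage table is a small hypothetical example covering three amino acids, not real E. coli data, and the two toy sequences are invented for illustration.

```python
import math

# Hypothetical codon usage counts per synonymous family (illustration only).
TOY_USAGE = {
    "K": {"AAA": 76, "AAG": 24},
    "L": {"CTG": 50, "CTT": 10, "CTC": 10, "CTA": 4, "TTA": 13, "TTG": 13},
    "E": {"GAA": 69, "GAG": 31},
}


def relative_adaptiveness(usage):
    """w(codon) = count / count of the most used synonymous codon."""
    weights = {}
    for family in usage.values():
        best = max(family.values())
        for codon, count in family.items():
            weights[codon] = count / best
    return weights


def cai(seq, weights):
    """Geometric mean of w over the codons present in the weight table."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


weights = relative_adaptiveness(TOY_USAGE)
wild_type = "AAGCTTGAG"   # toy sequence using rarer codons
optimized = "AAACTGGAA"   # same peptide (K-L-E) with the preferred codons
print(cai(wild_type, weights), cai(optimized, weights))
print(gc_percent(wild_type), gc_percent(optimized))
```

Commercial optimizers weigh many more factors (mRNA structure, repeats, cis-acting motifs), so this sketch only explains what the CAI and GC numbers measure.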

Figure 3. Analysis of the Sequence Encoding the Optimized Chimeric Construct and Wild Type Gene; A, codon adaptation index (CAI); B, percentage distribution of codons.
Figure 4. Evaluation of Model Stability Based on Ramachandran Plot
Figure 5. The Results of Glycation Sites on the Chimeric Protein; NetGlycate (A) and YinOYang 1.2 (B).

3.9. Evaluation of Model Stability

The energy minimization profile was calculated by Swiss-PdbViewer (spdbv). The energy value of -23911.445 kcal/mol indicated that the recombinant protein had acceptable stability compared to that of the original structure of each domain. Additionally, the data obtained from the Ramachandran plot confirmed the structural stability of the protein (Figure 4).

3.10. Analysis of Physical and Chemical Properties of the CE-CA Protein

The primary structure analysis of the chimeric protein was performed using ProtParam software. The number of amino acids was 598. The molecular weight of the chimeric protein was about 66.558 kDa. The isoelectric point (pI) was 8.38. The total numbers of negatively (Asp + Glu) and positively (Arg + Lys) charged residues were 53 and 57, respectively. The half-life of this chimeric protein was 30 hours (mammalian reticulocytes, in vitro), > 20 hours (yeast, in vivo), and > 10 hours (Escherichia coli, in vivo). The instability index was computed to be 43.97, which classifies the chimeric protein as unstable. The aliphatic index of the chimeric protein was 74.48. The extinction coefficient of the chimeric protein at 280 nm was 95855 M⁻¹ cm⁻¹. The grand average of hydropathicity (GRAVY) was -0.466.
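Several of the ProtParam quantities above follow simple formulas. The sketch below, written under the assumption that the standard Kyte-Doolittle hydropathy scale and the usual aliphatic index formula are used, computes GRAVY, the aliphatic index, and the charged-residue counts for a short peptide. The peptide is the ITEKNSGLY epitope reported elsewhere in this article; the full CE-CA sequence is not available in the text, so the whole-protein values are not recomputed here.

```python
# Kyte-Doolittle hydropathy values per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    def pct(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def charged_counts(seq):
    """Return (Asp + Glu, Arg + Lys) counts."""
    return seq.count("D") + seq.count("E"), seq.count("R") + seq.count("K")


def is_predicted_stable(instability_index):
    """ProtParam convention: an index below 40 is predicted stable."""
    return instability_index < 40.0


peptide = "ITEKNSGLY"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 2))
print(charged_counts(peptide))
print(is_predicted_stable(43.97))
```

With the reported instability index of 43.97, the last call returns False, matching the classification of the chimeric protein as unstable.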

3.11. Solvent Accessibility Prediction

Solvent accessibility was predicted using the Proso server. The solvent accessibility distribution was characterized using the major hydrophobicity and polarity properties of the residue patterns. These patterns indicated that the mean residue accessible surface area (ASA) gave a high solvent accessibility value, approximately 50% (Table 1).

Table 1.

Accessible Surface Area (ASA) Calculation for CE-CA Protein Complex (A); the Chart of Protein Charge Based on pH (B)

| Probe Radius | Polar Area/Energy | Apolar Area/Energy | Total Area/Energy | Number of Surface Atoms | Number of Buried Atoms |
|---|---|---|---|---|---|
| 1.400 | 10330.50 | 23621.67 | 33952.17 | 3012 | 1683 |

The Proso server was used to check the solubility of the protein and to rule out protein precipitation in the cell during expression. According to the algorithm of the server, proteins with scores above 0.5 are considered soluble. The solubility score of the chimeric protein was 0.842; thus, the recombinant protein is predicted to be highly soluble. The study of protein charge at pH 4 - 10 indicated that the protein is stable and highly soluble at physiologic pH. The protein charge was obtained from the protein calculator server (Table 1).

3.12. Analysis of Conserved Domains and Protein Localization

The subcellular localization of CE-CA in prokaryotic and eukaryotic systems was predicted by CELLO. The prediction classified the chimeric protein as an extracellular protein in both prokaryotic and eukaryotic systems.

3.13. Prediction of B-Cell Epitopes

Different parameters such as hydrophobicity, flexibility, exterior accessibility, exposed surface, and antigenicity were used to predict the chimeric protein epitopes. Epitopes located on the surface of the protein can interact easily with antibodies. Bcepred software was used with different parameters, including hydrophobicity, antigenicity, flexibility, accessibility, polarity, and exposed surface, to determine the continuous B-cell epitopes (Table 2). The results of this analysis included peptides and their corresponding threshold scores. The higher the threshold score, the higher the specificity and binding affinity. Discontinuous B-cell epitopes were predicted by Ellipro software (Table 3). The Ellipro results showed 6 sets of discontinuous B-cell epitopes. The Discotope server was used for the prediction of conformational B-cell epitopes (Table 4). The SEPPA server was also used for conformational B-cell epitope prediction.

Table 2.

Continuous B-Cell Epitopes Predicted in Chimeric Protein by Bcepred Software

| Prediction Parameters | Epitope Positions |
|---|---|
| Hydrophobicity | 11-17, 36-45, 52-59, 97-104, 112-128, 131-145, 173-179, 203-209, 216-222, 279-293, 307-318, 395-411, 487-499, 543-551, 556-566 |
| Flexibility | 33-42, 109-126, 128-135, 172-178, 268-275, 280-293, 327-339, 367-374, 429-435, 446-452, 472-478, 484-497, 526-532, 541-548, 553-563 |
| Accessibility | 11-17, 32-45, 52-71, 81-87, 92-110, 131-143, 149-157, 170-182, 187-195, 199-205, 216-222, 224-235, 243-251, 257-265, 268-278, 287-296, 307-318, 331-349, 357-363, 383-389, 393-399, 405-415, 421-438, 443-455, 475-484, 486-500, 525-551, 553-577 |
| Turns | 129-139, 160-166, 173-181, 203-210, 280-288, 370-376, 562-568, 590-598 |
| Exposed surface | 333-344, 444-453, 489-499, 530-538, 540-550, 556-567 |
| Polarity | 11-17, 33-44, 136-145, 289-295, 307-318, 322-328, 331-344, 381-393, 420-431, 478-501, 513-519, 531-538, 540-550, 555-577, 588-598 |
| Antigenic propensity | 15-31, 44-52, 73-80, 85-97, 105-111, 164-174, 180-187, 190-197, 204-218, 232-241, 275-281, 294-300, 326-332, 349-357, 359-377, 417-426, 452-461, 512-530, 548-557 |
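Position ranges such as those in Table 2 are easier to compare programmatically. As a minimal sketch, the code below parses two of the reported rows (Flexibility and Exposed surface) and lists the residues flagged by both parameters. The helper names are illustrative and the choice of these two rows is arbitrary.

```python
def parse_ranges(text):
    """Turn '33-42, 109-126' into a set of residue positions."""
    positions = set()
    for chunk in text.split(","):
        start, end = (int(x) for x in chunk.strip().rstrip(".").split("-"))
        positions.update(range(start, end + 1))
    return positions


def to_ranges(positions):
    """Collapse a set of positions back into sorted (start, end) tuples."""
    ranges = []
    for pos in sorted(positions):
        if ranges and pos == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], pos)
        else:
            ranges.append((pos, pos))
    return ranges


# Rows copied from the Bcepred results for the chimeric protein.
FLEXIBILITY = ("33-42, 109-126, 128-135, 172-178, 268-275, 280-293, 327-339, "
               "367-374, 429-435, 446-452, 472-478, 484-497, 526-532, 541-548, "
               "553-563")
EXPOSED_SURFACE = "333-344, 444-453, 489-499, 530-538, 540-550, 556-567"

shared = to_ranges(parse_ranges(FLEXIBILITY) & parse_ranges(EXPOSED_SURFACE))
print(shared)
```

The same two helpers can be applied to any pair of rows, or extended to count how many parameters support each residue.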

Table 3.

Discontinuous B-Cell Epitopes Predicted in Chimeric Protein by Ellipro Software

| No | Residues | Number of Residues | Score |
|---|---|---|---|
| 1 | A:D405, A:V406, A:G407, A:N408, A:K409, A:T410, A:T411, A:F441, A:W442, A:G443, A:P444, A:P445, A:S446, A:K447, A:M448, A:Q449, A:K450, A:P451, A:V474, A:P476, A:G477, A:R478, A:M479, A:R480, A:F482, A:D483, A:D484, A:L485, A:F486, A:R487, A:G488, A:E489, A:T490, A:G491, A:K492, A:D493, A:E495, A:K496, A:S497, A:H498, A:S499, A:W500, A:L501, A:S502, A:T503, A:G504, A:W505, A:F506, A:T507, A:M508, A:V509, A:I510, A:A511, A:V512, A:E513, A:L514, A:C515, A:D516, A:H517, A:V518, A:H519, A:M523, A:V524, A:P525, A:P526, A:N527, A:C529, A:S530, A:Q531, A:R532, A:P533, A:R534, A:L535, A:Q536, A:R537, A:M538, A:P539, A:Y540, A:H541, A:Y542, A:Y543, A:E544, A:P545, A:K546, A:G547, A:P548, A:D549, A:E550, A:I555, A:H565, A:H566, A:R567, A:F568, A:I569, A:T570, A:E571, A:K572, A:R573, A:V574, A:F575, A:S576, A:S577, A:W578, A:A579, A:Q580, A:L581, A:Y582, A:G583, A:I584, A:T585, A:F586, A:S587, A:H588, A:P589, A:S590, A:W591, A:H593, A:H594, A:H596, A:H597, A:H598 | 121 | 0.724 |
| 2 | A:M1, A:K2, A:L3, A:T4, A:I5, A:E6, A:S7, A:T8, A:P9, A:F10, A:N11, A:V12, A:A13, A:E14, A:G15, A:K16, A:E17, A:L21, A:V22, A:H23, A:N24, A:L25, A:P26, A:Q27, A:H28, A:L29, A:F30, A:G31, A:Y32, A:S33, A:W34, A:Y35, A:K36, A:G37, A:E38, A:R39, A:V40, A:D41, A:G42, A:N43, A:R44, A:Q45, A:I46, A:I47, A:G48, A:Y49, A:V50, A:I51, A:G52, A:T53, A:Q54, A:Q55, A:A56, A:T57, A:P58, A:G59, A:P60, A:A61, A:Y62, A:S63, A:G64, A:R65, A:E66, A:I67, A:I68, A:Y69, A:P70, A:N71, A:A72, A:S73, A:L74, A:L75, A:I76, A:Q77, A:N78, A:I79, A:I80, A:Q81, A:N82, A:D83, A:T84, A:G85, A:F86, A:Y87, A:T88, A:L89, A:H90, A:V91, A:I92, A:K93, A:S94, A:D95, A:L96, A:V97, A:N98, A:E99, A:E100, A:A101, A:T102, A:G103, A:Q104, A:F105, A:R106, A:V107, A:Y108, A:P109, A:E110, A:L111, A:G112, A:G113, A:G114, A:G115, A:S116, A:G117, A:G118, A:G119, A:G120, A:S121, A:G122, A:G123 | 120 | 0.724 |
| 3 | A:E139, A:D140, A:E141, A:A143, A:W159, A:W160, A:V161, A:N162, A:N163, A:Q164, A:S165, A:L166, A:P167, A:V168, A:S169, A:P170, A:R171, A:L172, A:Q173, A:T182, A:L183, A:L184, A:S185, A:V186, A:T187, A:R188, A:N189, A:D190, A:V191, A:G192, A:P193, A:G197, A:G217, A:P218 | 34 | 0.691 |
| 4 | A:A319, A:R338, A:P339, A:V340, A:N341, A:L342, A:L355, A:G356, A:N357, A:K358, A:T359, A:L360, A:P361, A:S362, A:R363, A:E38 | 16 | 0.657 |
| 5 | A:L174, A:S175, A:N176, A:D177 | 4 | 0.634 |
| 6 | A:F430, A:V431, A:N432, A:R433, A:T434, A:P435, A:V438, A:F439, A:I440 | 9 | 0.599 |

Table 4.

The Prediction of Discontinuous Epitopes of Chimeric Protein by Discotope Server

| Start and End Position | Start and End Position | Start and End Position | Start and End Position | Start and End Position | Start and End Position | Start and End Position | Start and End Position | Start and End Position |
|---|---|---|---|---|---|---|---|---|
| 27-9 | 69-10 | 232-10 | 344-12 | 441-13 | 480-12 | 503-8 | 537-18 | 587-14 |
| 38-21 | 81-16 | 233-10 | 345-16 | 442-9 | 481-14 | 504-8 | 547-23 | 588-12 |
| 39-13 | 82-14 | 243-14 | 346-13 | 443-8 | 482-18 | 516-10 | 548-21 | 589-10 |
| 40-11 | 94-10 | 244-16 | 347-11 | 444-10 | 483-14 | 517-10 | 549-20 | 590-15 |
| 41-13 | 95-9 | 255-12 | 360-10 | 445-12 | 484-13 | 520-17 | 550-17 | 591-14 |
| 42-15 | 133-14 | 256-12 | 399-11 | 446-20 | 485-15 | 521-23 | 551-22 | 592-12 |
| 43-20 | 134-14 | 257-11 | 400-12 | 447-16 | 486-15 | 522-15 | 552-17 | 593-9 |
| 44-15 | 135-16 | 258-12 | 405-19 | 448-17 | 487-15 | 523-20 | 553-23 | 594-8 |
| 50-12 | 151-15 | 259-13 | 406-13 | 449-18 | 488-14 | 524-16 | 555-19 | 595-8 |
| 51-8 | 152-15 | 260-13 | 407-10 | 450-19 | 489-10 | 525-16 | 556-14 | 596-8 |
| 52-6 | 154-16 | 262-14 | 408-16 | 451-20 | 490-19 | 526-9 | 557-13 | 597-11 |
| 53-8 | 163-12 | 285-15 | 409-18 | 452-16 | 491-22 | 527-8 | 558-20 | 598-12 |
| 54-13 | 164-10 | 286-16 | 410-20 | 453-12 | 492-17 | 528-8 | 559-16 | |
| 55-14 | 165-10 | 287-19 | 411-18 | 454-12 | 493-18 | 529-9 | 560-21 | |
| 56-16 | 166-11 | 288-11 | 427-17 | 470-11 | 494-15 | 530-9 | 561-21 | |
| 57-14 | 167-11 | 289-18 | 428-15 | 473-20 | 495-14 | 531-15 | 562-20 | |
| 58-10 | 175-12 | 338-13 | 429-11 | 474-21 | 496-13 | 532-10 | 563-18 | |
| 59-15 | 176-12 | 339-16 | 430-7 | 476-26 | 497-13 | 533-13 | 564-18 | |
| 60-12 | 189-13 | 340-18 | 431-10 | 477-25 | 498-15 | 534-14 | 565-18 | |
| 67-10 | 205-14 | 341-13 | 432-14 | 478-24 | 499-17 | 535-20 | 566-21 | |
| 68-7 | 218-15 | 342-13 | 433-16 | 479-18 | 500-13 | 536-16 | 567-27 | |

3.14. Prediction of Cleavage Sites

The cleavage sites on the construct protein were analyzed by the NetChop server, which produces neural network predictions for cleavage sites of the human proteasome. The number of predicted cleavage sites was 64 (data not shown). The binding affinity of peptides in the chimeric protein to TAP was predicted using the TAPPred server. The TAPPred results showed that 41 peptides have high binding affinity and 171 peptides have intermediate binding affinity to the TAP protein.

3.15. Prediction of T-Cell Epitopes

CTLpred is a direct method for prediction of CTL epitopes. The scores of the epitopes predicted by CTLpred are shown in Table 5.

Table 5.

MHC Restriction of CTL Epitope Prediction by CTLpred Based on Artificial Neural Network in CE-CA

| Peptide Rank | Start Position | Sequence | Score | MHC Restriction |
|---|---|---|---|---|
| 1 | 30 | FGYSWYKGE | 1.000 | HLA-B*2705, HLA-B*5301, HLA-Cw*0401, HLA-B*2703 |
| 2 | 164 | QSLPVSPRL | 1.000 | HLA-Cw*0401, HLA-G |
| 3 | 180 | TLTLLSVTR | 1.000 | HLA-A24, HLA-Cw*0401, HLA-G |

NetCTL 1.2 is a server for prediction of CTL epitopes in a protein sequence; it was applied to the chimeric protein. The server assigns a score to each peptide by integrating the individual prediction methods, and the threshold is chosen according to the desired sensitivity and specificity (Table 6).

Table 6.

NetCTL-1.2 Predictions Using MHC Supertype A1; Threshold 0.750000; CE-CA Chimeric Protein; Number of MHC Ligands Identified, 16; Number of Peptides, 590

| Position | Sequence | aff | aff-Rescale | Cle | Tap | COMB |
|---|---|---|---|---|---|---|
| 270 | ITEKNSGLY | 0.7720 | 3.2778 | 0.9000 | 2.9230 | 3.5590 |
| 206 | HSDPVILNV | 0.4673 | 1.9841 | 0.9763 | -0.0510 | 2.1280 |
| 222 | TISPSYTYY | 0.4306 | 1.8283 | 0.9683 | 2.8830 | 2.1177 |
| 186 | VTRNDVGPY | 0.2312 | 0.9815 | 0.9291 | 3.0680 | 1.2743 |
| 394 | MNDAPTTGY | 0.2323 | 0.9863 | 0.9560 | 2.7750 | 1.2684 |
| 464 | LVFPNMEAY | 0.2112 | 0.8967 | 0.9660 | 3.1490 | 1.1991 |
| 513 | ELCDHVHVY | 0.2114 | 0.8975 | 0.9478 | 2.7960 | 1.1795 |
| 221 | PTISPSYTY | 0.1985 | 0.8428 | 0.9769 | 2.3530 | 1.1070 |
| 574 | VFSSWAQLY | 0.1730 | 0.7344 | 0.9341 | 3.2820 | 1.0386 |
| 242 | AASNPPAQY | 0.1604 | 0.6809 | 0.9684 | 3.0940 | 0.9809 |
| 347 | ITDGYVPIL | 0.1699 | 0.7215 | 0.9633 | 0.8340 | 0.9077 |
| 534 | RLQRMPYHY | 0.1365 | 0.5796 | 0.9757 | 3.0110 | 0.8765 |
| 321 | AKANEVFHY | 0.1168 | 0.4960 | 0.9382 | 3.2890 | 0.8011 |
| 83 | DTGFYTLHV | 0.1573 | 0.6680 | 0.8121 | -0.0450 | 0.7876 |
| 54 | QQATPGPAY | 0.1118 | 0.4747 | 0.9468 | 3.0400 | 0.7687 |
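The COMB column of Table 6 can be approximately reproduced from the other columns. The sketch below assumes the commonly documented NetCTL 1.2 default weights of 0.15 for C-terminal cleavage and 0.05 for TAP transport efficiency; these weights are not stated in this article, so they should be read as an assumption. Only a few rows of the table are included.

```python
# Assumed NetCTL 1.2 default weights (not stated in the article).
W_CLEAVAGE = 0.15
W_TAP = 0.05
THRESHOLD = 0.75  # threshold reported in the Table 6 caption

# (sequence, aff-Rescale, Cle, Tap, reported COMB), copied from Table 6.
ROWS = [
    ("ITEKNSGLY", 3.2778, 0.9000, 2.9230, 3.5590),
    ("HSDPVILNV", 1.9841, 0.9763, -0.0510, 2.1280),
    ("VTRNDVGPY", 0.9815, 0.9291, 3.0680, 1.2743),
    ("AASNPPAQY", 0.6809, 0.9684, 3.0940, 0.9809),
    ("DTGFYTLHV", 0.6680, 0.8121, -0.0450, 0.7876),
]


def combined_score(aff_rescale, cleavage, tap):
    """Weighted sum of rescaled affinity, cleavage, and TAP scores."""
    return aff_rescale + W_CLEAVAGE * cleavage + W_TAP * tap


for seq, aff, cle, tap, reported in ROWS:
    score = combined_score(aff, cle, tap)
    flag = "epitope" if score > THRESHOLD else "below threshold"
    print(f"{seq}: computed {score:.4f}, reported {reported:.4f}, {flag}")
```

Under these assumed weights the computed values agree with the reported COMB column to within rounding, which also supports the column boundaries used when reading the table.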

3.16. Prediction of MHC Binding Peptide

The conserved peptide sequences with the highest binding scores to MHC class I and II were predicted using the PropredI and Propred servers, respectively (Tables 7 and 8). The results of these servers showed that 14 epitopes for MHC class I alleles and 19 epitopes for MHC class II alleles were identified as common T-cell epitopes.

Table 7.

The Result of Prediction for MHCI Epitopes by PropredI Server

| No | Epitope Sequence | Position |
|---|---|---|
| 1 | LVHNLPQHL | 21-29 |
| 2 | IIYPNASLL | 67-75 |
| 3 | IQNDTGFYT | 80-88 |
| 4 | GQFRVYPEL | 103-111 |
| 5 | TCEPEIQNT | 147-155 |
| 6 | EPEIQNTTY | 149-157 |
| 7 | QNTTYLWWV | 153-161 |
| 8 | LLSVTRNDV | 183-191 |
| 9 | HSDPVILNV | 206-214 |
| 10 | ITEKNSGLY | 270-278 |
| 11 | PTTGYSADV | 398-406 |
| 12 | SLVRVIQRA | 454-462 |
| 13 | WLSTGWFTM | 500-508 |
| 14 | RVFSSWAQL | 573-581 |
Table 8.

The Result of Prediction for MHCII Epitopes by Propred Server

No | Epitope Sequence | Position
1 | FNVAEGKEV | 10-18
2 | LVHNLPQHL | 21-29
3 | YVIGTQQAT | 49-57
4 | IYPNASLL | 67-75
5 | FYTLHVIKS | 86-93
6 | WVNNQSLPV | 160-168
7 | YRPGVNLSL | 230-238
8 | YGSLRGRSR | 329-337
9 | YVPILGNKT | 351-359
10 | IVSSSSHLL | 369-377
11 | IRMNDAPTT | 392-400
12 | YRVVAHSSV | 412-420
13 | VFIFWGPP | 438-446
14 | VRVIQRAGL | 456-464
15 | FPNMEAYAV | 466-474
16 | FRGETGKDR | 486-494
17 | FTMVIAVEL | 506-514
18 | YGMVPPNYC | 521-529
19 | WAQLYGITF | 578-586
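
Regions predicted to bind both MHC classes can be found by intersecting the position ranges of Tables 7 and 8. The sketch below is an illustrative post-processing step, not part of ProPred or ProPred-I; the function names are hypothetical and the ranges are copied as printed in the two tables.

```python
# Position ranges (start, end), 1-based and inclusive, copied from Tables 7 and 8.
MHC1 = [(21, 29), (67, 75), (80, 88), (103, 111), (147, 155), (149, 157),
        (153, 161), (183, 191), (206, 214), (270, 278), (398, 406),
        (454, 462), (500, 508), (573, 581)]
MHC2 = [(10, 18), (21, 29), (49, 57), (67, 75), (86, 93), (160, 168),
        (230, 238), (329, 337), (351, 359), (369, 377), (392, 400),
        (412, 420), (438, 446), (456, 464), (466, 474), (486, 494),
        (506, 514), (521, 529), (578, 586)]


def overlap(a, b):
    """Return the shared (start, end) of two inclusive ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def shared_regions(class1, class2):
    """List (class I range, class II range, shared range) for every overlap."""
    hits = []
    for r1 in class1:
        for r2 in class2:
            common = overlap(r1, r2)
            if common is not None:
                hits.append((r1, r2, common))
    return hits


for r1, r2, common in shared_regions(MHC1, MHC2):
    print(f"MHC I {r1} / MHC II {r2}: shared residues {common[0]}-{common[1]}")
```

Ranges that appear in both tables, such as 21-29 and 67-75, are reported as full overlaps; partial overlaps, such as 500-508 with 506-514, mark regions that may carry both class I and class II epitopes.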

3.17. Prediction of Post-Translational Modification

Eight glycation sites were found on the chimeric protein (Figure 5). NetNGlyc predicted 10 asparagine residues, at positions 71, 82, 133, 154, 163, 178, 235, 269, 357, and 408, to be N-glycosylated. YinOYang 1.2 predicted 12 O-GlcNAc glycosylation sites on this protein. Myristoylator found no N-terminal myristoylation site. The NetPhos server predicted 34 phosphorylation sites. The GPI prediction found no GPI lipid anchor site in the sequence.
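
The asparagine positions reported above can be cross-checked against the N-X-S/T sequon (X being any residue except proline) that N-glycosylation requires. The following is a minimal sketch and not NetNGlyc itself: NetNGlyc additionally scores each sequon with neural networks, and because the full CE-CA sequence is not reproduced here, the scan runs only on epitope fragments taken from Tables 7 and 8. The function name and the fragment list are illustrative.

```python
import re

# N-X-S/T sequon where X is any residue except proline; the lookahead lets
# adjacent candidates (for example the two asparagines in "NNQS") both be tested.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(fragment, start):
    """Return 1-based protein positions of sequon asparagines in a fragment.

    `start` is the protein position of the fragment's first residue.
    """
    return [start + m.start() for m in SEQUON.finditer(fragment)]


# (fragment, start position) pairs copied from Tables 7 and 8
fragments = [
    ("IIYPNASLL", 67),
    ("IQNDTGFYT", 80),
    ("QNTTYLWWV", 153),
    ("WVNNQSLPV", 160),
    ("YRPGVNLSL", 230),
    ("ITEKNSGLY", 270),
    ("YVPILGNKT", 351),
]

found = sorted({p for frag, start in fragments
                for p in sequon_positions(frag, start)})
print(found)
```

On these fragments the scan recovers positions 71, 82, 154, 163, 235, and 357, all of which appear in the NetNGlyc list, and it skips N274 in ITEKNSGLY, whose N-S-G context is not a sequon.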

4. Discussion

Colorectal cancer is a leading cause of cancer-related death worldwide (61). The success of any cancer vaccine depends on the selection of a suitable target antigen and presentation pathway (62). No vaccine against colorectal cancer currently exists, so an effective one is urgently needed. Vaccination stimulates the immune system and strengthens adaptive immunity to a disease. Cellular immunity has an important role in cancer vaccines (63, 64), and vaccine efficacy can be assessed by the ability to induce CD8+ or CD4+ T cells. CD8+ cytotoxic T cells (CTLs) are MHC class I restricted, whereas CD4+ helper T cells (TH) are MHC class II restricted. Thus, mapping of B-cell and T-cell epitopes plays a vital role in vaccine design (65).

For over a century, the role of the immune system in controlling cancer was ambiguous. Vaccine strategies against cancer depend on how well the target antigens are defined (66). Recent work on tumor antigens has identified specific molecular targets in colorectal cancer cells (67). These findings indicate how immune responses are generated in patients with cancer and support the development of new vaccine strategies. Tumor-associated antigens (TAAs) are proteins expressed by cancer cells that can be defined by their recognition by T cells (68). In colorectal cancer, several antigens are overexpressed but not mutated; the most extensively studied are CEA and CA19-9 (69), which is why we selected these 2 TAAs. CEA is a member of the immunoglobulin superfamily and a useful vaccine target (70). Several studies have shown that well-differentiated colorectal cancers produce more CEA per gram of total protein. Recent studies showed that CEA is overexpressed in > 90% of colorectal cancers and that this antigen is only weakly recognized by the immune system (71). The CA19-9 level is an important prognostic factor for recurrence of colorectal cancer; its expression has been described in colorectal cancer and increases in advanced stages (72).

An epitope is the part of an antigen that is recognized by the immune system. T-cell epitopes are presented on the surface of an antigen-presenting cell (APC), bound to major histocompatibility complex (MHC) molecules, to induce an immune response. The recognition of epitopes by T cells and the subsequent induction of an immune response play a central role in an individual's immunity. In epitope prediction, the main aim is therefore to assess the binding affinity of antigenic peptides to MHC molecules (73).

A chimeric construct containing CEA and CA19-9 peptides was designed for expression in E. coli (Figure 1A). To design the chimeric protein, we selected epitopes from residues 500 to 700 of CEA and residues 130 to 400 of CA19-9. The chimeric protein requires appropriate linkers to join its domains. Linkers play a critical role in displaying the different domains of a chimeric protein; here, linkers consisting of EAAAK repeats were used (Figure 1A). Codon optimization was performed to improve transcription efficiency and transcript stability and to enhance recombinant protein production. The codon adaptation index (CAI), which ranges from 0 to 1, was the major measure used for gene optimization. An ideally biased gene would have a CAI of 1.0, although no natural bacterial gene reaches this theoretical value. The CAI increased from 0.66 in the wild-type gene to 0.77 in the optimized chimeric gene, indicating that the optimized sequence could be expressed well (Figure 3).

Predicting allergenicity is important when modifying proteins for therapeutic use; the prediction showed that the chimeric protein is not an allergen. The immunogenicity of the chimeric protein was predicted with the VaxiJen server, whose models cover bacterial, viral, tumor, parasite, and fungal antigens and whose accuracy ranges from 70% to 97%. The solubility score of the chimeric protein was 0.842, suggesting that the protein can be purified under normal conditions when expressed in E. coli.

The secondary structure of mRNA has a major role in protein expression, and mfold was used to predict it. Characterizing the overall ΔG and the folding energy around the start codon helps to assess ribosome binding and translation initiation. All 42 predicted structures were folded at 37°C, and the best structure had ΔG = -527.74 kcal/mol. The predicted mRNA structure indicated that the mRNA was stable enough for efficient translation in E. coli (Figure 2D).
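
The CAI described above is the geometric mean of the relative adaptiveness values of the codons in a gene. The following is a minimal sketch of that calculation; the weight table is hypothetical and covers only a few codons, whereas real weights are derived from a reference set of highly expressed genes of the host. The two toy sequences both encode the EAAAK linker motif and are not taken from the actual construct.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of the relative adaptiveness w of each codon in `cds`.

    Codons missing from `weights` (for example stop codons) are skipped.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical relative-adaptiveness values, for illustration only.
toy_weights = {
    "GAA": 1.0, "GAG": 0.3,    # Glu
    "GCG": 1.0, "GCA": 0.6,    # Ala
    "AAA": 1.0, "AAG": 0.25,   # Lys
}

unoptimized = "GAGGCAGCAGCAAAG"  # E A A A K with less-preferred toy codons
optimized = "GAAGCGGCGGCGAAA"    # same peptide with preferred toy codons

print(round(codon_adaptation_index(unoptimized, toy_weights), 3))
print(round(codon_adaptation_index(optimized, toy_weights), 3))
```

Replacing each codon with the synonymous codon of highest weight raises the index toward 1 without changing the encoded peptide, which is the effect that the reported increase from 0.66 to 0.77 reflects.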

The physicochemical parameters of the chimeric protein were analyzed with ProtParam. The pI value (8.38) showed that the protein has a basic nature. The extinction coefficient of CE-CA at 280 nm was high (95855 M-1 cm-1). On the basis of the instability index (43.97), ExPASy ProtParam classifies the chimeric protein as unstable. The grand average of hydropathicity (GRAVY) was -0.466; this negative value implies that CE-CA is hydrophilic and interacts well with water. The GOR IV program was used for secondary structure analysis. The very high coil content of CE-CA (60.37%) was due to its rich content of flexible glycine and hydrophobic proline (Figure 1B).

The three-dimensional (3D) structure of a protein is of major importance for its functional properties. The 3D model of the chimeric protein was generated with the I-TASSER online server, which predicted the fold and produced a model of good resolution (Figure 1C). RMSD and TM-score were used to evaluate the predicted models. The best RMSD value was obtained for our model on the template, which consisted of 598 amino acids. The expected TM-score of 0.53 ± 0.15 supports the correctness of the model, since a TM-score above 0.5 indicates an accurate topology. Model confidence was assessed with the C-score and Z-score. The Z-score measures the deviation of the total energy of the structure from an energy distribution derived from random conformations and thus indicates overall model quality; a Z-score outside the range characteristic of native proteins indicates an erroneous structure. The ProSA-web results showed that the synthetic chimeric protein has features characteristic of native structures. In the Ramachandran plot analysis, 72.8% of the residues were in the favored region, 18.1% in the allowed region, and 9.1% in the outlier region. Thus, based on the Ramachandran plot, our chimeric structure showed desirable protein stability. A negligible 7.4% of the residues were placed in the outlier region, probably because of the chimeric junctions (Figure 4).

Identification of B-cell epitopes is a crucial step in the satisfactory design of vaccines. B-cell epitopes are specific regions on the surface of an antigenic protein. B-cell epitopes of the chimeric protein were predicted on the basis of structure prediction and solvent accessibility. Several methods have been developed for this purpose, based on hydrophobicity, accessibility, antigenicity, flexibility, and secondary structure. All of these methods were applied together to obtain a reliable prediction. The B-cell epitopes on which the methods agreed most closely are listed in the table above.
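
Two of the ProtParam indices discussed above are simple to compute or interpret. The sketch below, which is not ProtParam's own code, computes GRAVY as the mean Kyte-Doolittle hydropathy per residue and applies the conventional instability-index cutoff of 40. The function names are hypothetical, and the example peptide is an epitope from Table 6 rather than the full CE-CA sequence, so its GRAVY differs from the -0.466 reported for the whole protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def stability_class(instability_index):
    """ProtParam convention: an index above 40 is classified as unstable."""
    return "unstable" if instability_index > 40 else "stable"


peptide = "ITEKNSGLY"  # epitope at positions 270-278 (Table 6)
print(f"GRAVY of {peptide}: {gravy(peptide):.3f}")
print(f"Instability index 43.97: {stability_class(43.97)}")
```

A negative GRAVY indicates a net hydrophilic sequence, which is the basis for the statement that CE-CA interacts well with water.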

Glycosylation analysis showed that the construct has a large number of glycosylation sites. Glycosylation may decrease the antigenicity and immunogenicity of the vaccine product. The presence of a myristoylation signal at the N-terminus raises vaccine efficiency (Figure 3). Various methods were used for B-cell epitope prediction. The predictions showed 14 consensus MHC class I binding regions and 19 consensus MHC class II binding regions in the chimeric protein sequence (Tables 7 and 8). CTL epitopes in the chimeric protein were predicted with the NetCTL server, which identified 16 MHC ligands in the CE-CA protein (Table 6). The CTLpred server scored the epitopes in the chimeric protein with a cutoff score of 0.51 (Table 5). The NetMHC 3.4 server predicted peptide binding to different HLA alleles using artificial neural networks (ANNs). Three peptide sequences with high log scores were recognized as strong MHC binders in the CE-CA chimeric protein. All MHC-binding peptides were considered relevant to a suitable immune response. MHC class II binding regions in the chimeric protein were predicted with the ProPred server, which was queried with 57 alleles (Table 8).

4.1. Conclusions

In this study, we designed a novel chimeric vaccine for cancer immunotherapy. Our results showed that epitopes of the chimeric protein could induce B-cell and T-cell mediated immune responses, which are important for a protective vaccine against colorectal cancer.

Acknowledgements

References

  • 1.

    Safaee A, Fatemi SR, Ashtari S, Vahedi M, Moghimi-Dehkordi B, Zali MR. Four years incidence rate of colorectal cancer in Iran: a survey of national cancer registry data - implications for screening. Asian Pac J Cancer Prev. 2012;13(6):2695-8. [PubMed ID: 22938443].

  • 2.

    Ohkado A, Yanabu N, Kimura F, Kamimura K, Sumiyoshi T, Wakuya J, et al. [A case of mesothelioma of the pleura with pneumothorax]. Kyobu Geka. 1989;42(7):565-8. [PubMed ID: 2796096].

  • 3.

    Bretthauer M. Evidence for colorectal cancer screening. Best Pract Res Clin Gastroenterol. 2010;24(4):417-25. [PubMed ID: 20833346]. https://doi.org/10.1016/j.bpg.2010.06.005.

  • 4.

    Irvin G3, Horsley J3, Caruana JJ. The morbidity and mortality of emergent operations for colorectal disease. Ann Surg. 1984;199(5):598-603. [PubMed ID: 6609686].

  • 5.

    Moghimi-Dehkordi B, Safaee A, Zali MR. Prognostic factors in 1,138 Iranian colorectal cancer patients. Int J Colorectal Dis. 2008;23(7):683-8. [PubMed ID: 18330578]. https://doi.org/10.1007/s00384-008-0463-7.

  • 6.

    Fioretti D, Iurescia S, Fazio VM, Rinaldi M. DNA vaccines: developing new strategies against cancer. J Biomed Biotechnol. 2010;2010:174378. [PubMed ID: 20368780]. https://doi.org/10.1155/2010/174378.

  • 7.

    Finn OJ. Cancer vaccines: between the idea and the reality. Nat Rev Immunol. 2003;3(8):630-41. [PubMed ID: 12974478]. https://doi.org/10.1038/nri1150.

  • 8.

    Mellgren SI, Johnsen HJ. [Steroid-induced myopathy. Clinical and muscle pathologic aspects with a case report]. Tidsskr Nor Laegeforen. 1979;99(3):163-4. [PubMed ID: 419494].

  • 9.

    Duffy MJ, van Dalen A, Haglund C, Hansson L, Holinski-Feder E, Klapdor R, et al. Tumour markers in colorectal cancer: European Group on Tumour Markers (EGTM) guidelines for clinical use. Eur J Cancer. 2007;43(9):1348-60. [PubMed ID: 17512720]. https://doi.org/10.1016/j.ejca.2007.03.021.

  • 10.

    Ullenhag GJ, Frodin JE, Jeddi-Tehrani M, Strigard K, Eriksson E, Samanci A, et al. Durable carcinoembryonic antigen (CEA)-specific humoral and cellular immune responses in colorectal carcinoma patients vaccinated with recombinant CEA and granulocyte/macrophage colony-stimulating factor. Clin Cancer Res. 2004;10(10):3273-81. [PubMed ID: 15161680]. https://doi.org/10.1158/1078-0432.CCR-03-0706.

  • 11.

    Dahlquist G, Sterky G, Ivarsson JI, Tengvald K, Wall S. Health problems and care in young families--load of illness and patterns of illness behaviour. Scand J Prim Health Care. 1987;5(2):79-86. [PubMed ID: 3616277].

  • 12.

    Paterson Y, Singh R. Administering a Listeria vaccine with a strain expressing an antigen fused to a truncated LLO protein; cancer vaccines; anticarcinogenic agents. Google Patents; 2010.

  • 13.

    Bos R, van Duikeren S, van Hall T, Lauwen MM, Parrington M, Berinstein NL, et al. Characterization of antigen-specific immune responses induced by canarypox virus vaccines. J Immunol. 2007;179(9):6115-22. [PubMed ID: 17947686].

  • 14.

    Hammarstrom S. The carcinoembryonic antigen (CEA) family: structures, suggested functions and expression in normal and malignant tissues. Semin Cancer Biol. 1999;9(2):67-81. [PubMed ID: 10202129]. https://doi.org/10.1006/scbi.1998.0119.

  • 15.

    Misener S, Krawetz SA. Bioinformatics methods and protocols. 132. Springer Science & Business Media; 1999.

  • 16.

    Gish W, States DJ. Identification of protein coding regions by database similarity search. Nat Genet. 1993;3(3):266-72. [PubMed ID: 8485583]. https://doi.org/10.1038/ng0393-266.

  • 17.

    Casado V, Lluis C, Canela E, Franco R, Mallol J. The distribution of A1 adenosine receptor and 5'-nucleotidase in pig brain cortex subcellular fractions. Neurochem Res. 1992;17(2):129-39. [PubMed ID: 1538830].

  • 18.

    Doytchinova IA, Flower DR. VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinformatics. 2007;8:4. [PubMed ID: 17207271]. https://doi.org/10.1186/1471-2105-8-4.

  • 19.

    Sen TZ, Jernigan RL, Garnier J, Kloczkowski A. GOR V server for protein secondary structure prediction. Bioinformatics. 2005;21(11):2787-8. [PubMed ID: 15797907]. https://doi.org/10.1093/bioinformatics/bti408.

  • 20.

    Lissoni P, Galli MA, Barni S, Colombo V, Pelizzoni F, Fumagalli R, et al. [The cardiovascular toxicity of interleukin-2: the pathogenic mechanisms and treatment]. G Ital Cardiol. 1990;20(7):631-5. [PubMed ID: 2245901].

  • 21.

    Kelley LA, Sternberg MJ. Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009;4(3):363-71. [PubMed ID: 19247286]. https://doi.org/10.1038/nprot.2009.2.

  • 22.

    Lovell SC, Davis IW, Arendall W3, de Bakker PI, Word JM, Prisant MG, et al. Structure validation by Calpha geometry: phi,psi and Cbeta deviation. Proteins. 2003;50(3):437-50. [PubMed ID: 12557186]. https://doi.org/10.1002/prot.10286.

  • 23.

    Wiederstein M, Sippl MJ. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 2007;35(Web Server issue):W407-10. [PubMed ID: 17517781]. https://doi.org/10.1093/nar/gkm290.

  • 24.

    Chambers JE, Yarbrough JD. Parathion and methyl parathion toxicity to insecticide-resistant and susceptible mosquitofish (Gambusia affinis). Bull Environ Contam Toxicol. 1974;11(4):315-20. [PubMed ID: 4433817].

  • 25.

    Flickinger KS, Culp LA. Dermal fibroblasts from Down's syndrome patients share a cycloheximide-induced deficiency in collagen adhesion responses with normal aging cells. Exp Cell Res. 1990;189(2):189-201. [PubMed ID: 2142462].

  • 26.

    Tochilin VI. [Nodular goiter and cancer of the thyroid]. Vestn Khir Im I I Grek. 1989;143(11):18-20. [PubMed ID: 2633425].

  • 27.

    Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31(13):3406-15. [PubMed ID: 12824337].

  • 28.

    Graf M, Deml L, Wagner R. Codon-optimized genes that enable increased heterologous expression in mammalian cells and elicit efficient immune responses in mice after vaccination of naked DNA. Mol Diagnos Infect Dis. 2004:197-210.

  • 29.

    Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, et al. Protein identification and analysis tools on the ExPASy server. Springer; 2005.

  • 30.

    Smialowski P, Martin-Galiano AJ, Mikolajka A, Girschick T, Holak TA, Frishman D. Protein solubility: sequence based prediction and experimental verification. Bioinformatics. 2007;23(19):2536-42. [PubMed ID: 17150993]. https://doi.org/10.1093/bioinformatics/btl623.

  • 31.

    Magnan CN, Randall A, Baldi P. SOLpro: accurate sequence-based prediction of protein solubility. Bioinformatics. 2009;25(17):2200-7. [PubMed ID: 19549632]. https://doi.org/10.1093/bioinformatics/btp386.

  • 32.

    Cserzo M, Wallin E, Simon I, von Heijne G, Elofsson A. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng. 1997;10(6):673-6. [PubMed ID: 9278280].

  • 33.

    Heringa J, Frishman D, Argos P. Chapter 4 Computational methods relating protein sequence and structure. 1997.

  • 34.

    Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 2010;38(Web Server issue):W529-33. [PubMed ID: 20478830]. https://doi.org/10.1093/nar/gkq399.

  • 35.

    Yu CS, Chen YC, Lu CH, Hwang JK. Prediction of protein subcellular localization. Proteins. 2006;64(3):643-51. [PubMed ID: 16752418]. https://doi.org/10.1002/prot.21018.

  • 36.

    Viklund H, Elofsson A. OCTOPUS: improving topology prediction by two-track ANN-based preference scores and an extended topological grammar. Bioinformatics. 2008;24(15):1662-8. [PubMed ID: 18474507]. https://doi.org/10.1093/bioinformatics/btn221.

  • 37.

    Nielsen H, Engelbrecht J, Brunak S, von Heijne G. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 1997;10(1):1-6. [PubMed ID: 9051728].

  • 38.

    El-Manzalawy Y, Dobbs D, Honavar V. Predicting linear B-cell epitopes using string kernels. J Mol Recognit. 2008;21(4):243-55. [PubMed ID: 18496882]. https://doi.org/10.1002/jmr.893.

  • 39.

    Saha S, Raghava GP. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins. 2006;65(1):40-8. [PubMed ID: 16894596]. https://doi.org/10.1002/prot.21078.

  • 40.

    Larsen JE, Lund O, Nielsen M. Improved method for predicting linear B-cell epitopes. Immunome Res. 2006;2:2. [PubMed ID: 16635264]. https://doi.org/10.1186/1745-7580-2-2.

  • 41.

    Sun J, Wu D, Xu T, Wang X, Xu X, Tao L, et al. SEPPA: a computational server for spatial epitope prediction of protein antigens. Nucleic Acids Res. 2009;37(Web Server issue):W612-6. [PubMed ID: 19465377]. https://doi.org/10.1093/nar/gkp417.

  • 42.

    Ponomarenko J, Bui HH, Li W, Fusseder N, Bourne PE, Sette A, et al. ElliPro: a new structure-based tool for the prediction of antibody epitopes. BMC Bioinformatics. 2008;9:514. [PubMed ID: 19055730]. https://doi.org/10.1186/1471-2105-9-514.

  • 43.

    Kesmir C, Nussbaum AK, Schild H, Detours V, Brunak S. Prediction of proteasome cleavage motifs by neural networks. Protein Eng. 2002;15(4):287-96. [PubMed ID: 11983929].

  • 44.

    Holzhutter HG, Kloetzel PM. A kinetic model of vertebrate 20S proteasome accounting for the generation of major proteolytic fragments from oligomeric peptide substrates. Biophys J. 2000;79(3):1196-205. [PubMed ID: 10968984]. https://doi.org/10.1016/S0006-3495(00)76374-0.

  • 45.

    Diez-Rivero CM, Lafuente EM, Reche PA. Computational analysis and modeling of cleavage by the immunoproteasome and the constitutive proteasome. BMC Bioinformatics. 2010;11:479. [PubMed ID: 20863374]. https://doi.org/10.1186/1471-2105-11-479.

  • 46.

    Bhasin M, Raghava GP. Analysis and prediction of affinity of TAP binding peptides using cascade SVM. Protein Sci. 2004;13(3):596-607. [PubMed ID: 14978300]. https://doi.org/10.1110/ps.03373104.

  • 47.

    Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinformatics. 2007;8:424. [PubMed ID: 17973982]. https://doi.org/10.1186/1471-2105-8-424.

  • 48.

    Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanovic S. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics. 1999;50(3-4):213-9. [PubMed ID: 10602881].

  • 49.

    Bhasin M, Raghava GP. Prediction of CTL epitopes using QM, SVM and ANN techniques. Vaccine. 2004;22(23-24):3195-204. [PubMed ID: 15297074]. https://doi.org/10.1016/j.vaccine.2004.02.005.

  • 50.

    Lundegaard C, Lund O, Nielsen M. Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers. Bioinformatics. 2008;24(11):1397-8. [PubMed ID: 18413329]. https://doi.org/10.1093/bioinformatics/btn128.

  • 51.

    Singh H, Raghava GP. ProPred1: prediction of promiscuous MHC Class-I binding sites. Bioinformatics. 2003;19(8):1009-14. [PubMed ID: 12761064].

  • 52.

    Adams HP, Koziol JA. Prediction of binding to MHC class I molecules. J Immunol Methods. 1995;185(2):181-90. [PubMed ID: 7561128].

  • 53.

    Lata S, Bhasin M, Raghava GP. Application of machine learning techniques in predicting MHC binders. Methods Mol Biol. 2007;409:201-15. [PubMed ID: 18450002]. https://doi.org/10.1007/978-1-60327-118-9_14.

  • 54.

    Singh H, Raghava GP. ProPred: prediction of HLA-DR binding sites. Bioinformatics. 2001;17(12):1236-7. [PubMed ID: 11751237].

  • 55.

    Johansen MB, Kiemer L, Brunak S. Analysis and prediction of mammalian protein glycation. Glycobiology. 2006;16(9):844-53. [PubMed ID: 16762979]. https://doi.org/10.1093/glycob/cwl009.

  • 56.

    Gupta R, Jung E, Brunak S. Prediction of N-glycosylation sites in human proteins. 2016.

  • 57.

    Gupta R, Brunak S. Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput. 2002:310-22. [PubMed ID: 11928486].

  • 58.

    Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, et al. Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J. 2013;32(10):1478-88. [PubMed ID: 23584533]. https://doi.org/10.1038/emboj.2013.79.

  • 59.

    Bologna G, Yvon C, Duvaud S, Veuthey AL. N-Terminal myristoylation predictions by ensembles of neural networks. Proteomics. 2004;4(6):1626-32. [PubMed ID: 15174132]. https://doi.org/10.1002/pmic.200300783.

  • 60.

    Eisenhaber B, Bork P, Eisenhaber F. Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase. Protein Eng. 1998;11(12):1155-61. [PubMed ID: 9930665].

  • 61.

    Haggar FA, Boushey RP. Colorectal cancer epidemiology: incidence, mortality, survival, and risk factors. Clin Colon Rectal Surg. 2009;22(4):191-7. [PubMed ID: 21037809]. https://doi.org/10.1055/s-0029-1242458.

  • 62.

    Vigneron N. Human Tumor Antigens and Cancer Immunotherapy. Biomed Res Int. 2015;2015:948501. [PubMed ID: 26161423]. https://doi.org/10.1155/2015/948501.

  • 63.

    Dzivenu OK, O’Donnell-Tormey J, O’Donnell-Tormey J. Cancer and the immune system: the vital connection. Cancer Research Institute; 2003.

  • 64.

    Palucka K, Banchereau J. Cancer immunotherapy via dendritic cells. Nat Rev Cancer. 2012;12(4):265-77. [PubMed ID: 22437871]. https://doi.org/10.1038/nrc3258.

  • 65.

    Koup RA, Douek DC. Vaccine design for CD8 T lymphocyte responses. Cold Spring Harb Perspect Med. 2011;1(1). a007252. [PubMed ID: 22229122]. https://doi.org/10.1101/cshperspect.a007252.

  • 66.

    Dermime S, Armstrong A, Hawkins RE, Stern PL. Cancer vaccines and immunotherapy. Br Med Bull. 2002;62:149-62. [PubMed ID: 12176857].

  • 67.

    Somers VA, Brandwijk RJ, Joosten B, Moerkerk PT, Arends JW, Menheere P, et al. A panel of candidate tumor antigens in colorectal cancer revealed by the serological selection of a phage displayed cDNA expression library. J Immunol. 2002;169(5):2772-80. [PubMed ID: 12193752].

  • 68.

    Sarcevic B, Spagnoli GC, Terracciano L, Schultz-Thater E, Heberer M, Gamulin M, et al. Expression of cancer/testis tumor associated antigens in cervical squamous cell carcinoma. Oncology. 2003;64(4):443-9. [PubMed ID: 12759544]. https://doi.org/10.1159/000070305.

  • 69.

    Perkins GL, Slater ED, Sanders GK, Prichard JG. Serum tumor markers. Am Fam Physician. 2003;68(6):1075-82. [PubMed ID: 14524394].

  • 70.

    Turriziani M, Fantini M, Benvenuto M, Izzi V, Masuelli L, Sacchetti P, et al. Carcinoembryonic antigen (CEA)-based cancer vaccines: recent patents and antitumor effects from experimental models to clinical trials. Recent Pat Anticancer Drug Discov. 2012;7(3):265-96. [PubMed ID: 22630596].

  • 71.

    Ananenko AA, Malinovskaia VV, Pertseva NG, Bezrukov K, Sultanov AT. [State of lipid peroxidation system and antioxidant defense in acute respiratory viral infection in children and the principles of pathogenetic therapy]. Pediatriia. 1989;(1):27-30. [PubMed ID: 2710600].

  • 72.

    Park IJ, Choi GS, Jun SH. Prognostic value of serum tumor antigen CA19-9 after curative resection of colorectal cancer. Anticancer Res. 2009;29(10):4303-8. [PubMed ID: 19846991].

  • 73.

    Ribas A, Butterfield LH, Economou JS. Genetic immunotherapy for cancer. Oncologist. 2000;5(2):87-98. [PubMed ID: 10794799].