1. Background
Despite major advances in cancer research, breast cancer remains the most common type of cancer among women worldwide. Molecular analysis indicates at least 5 subtypes of breast cancer. Patients overexpressing human epidermal growth factor receptor 2 (HER2), who account for nearly 15% to 20% of all breast cancer (BC) patients, have a poor prognosis and a reduced survival rate compared to other subtypes (1). HER2 is a member of the human epidermal growth factor receptor (EGFR) family; it can dimerize with itself or with other HER family members and is unique in that it lacks a known ligand. HER2 adopts a conformation that is permanently accessible for heterodimerization with other family members such as HER1, HER3, and HER4 (2, 3), and it is a preferred partner in dimer formation within the EGFR family. Following dimerization, the intracellular domains of the receptors interact and tyrosine kinase autophosphorylation occurs, inducing signal transduction pathways that regulate cell proliferation, angiogenesis, metastasis, and apoptosis (4). Besides being a biomarker, HER2 is considered a therapeutic target. During the past decade, treatments that specifically target HER2 have significantly improved the survival of patients with HER2-positive breast cancer (1). Screening for highly potent anticancer agents is therefore important in this field.
Antibody-based therapies are emerging anticancer strategies (5), and the use of hundreds of monoclonal antibodies (mAbs) is on the rise in this field (6). Trastuzumab was the first humanized monoclonal antibody targeted against HER2 and is currently used to treat patients with metastatic breast cancer (7, 8). The mechanism of its antitumor activity has not been entirely determined. Induction of antibody-dependent cell-mediated cytotoxicity (ADCC), attenuated DNA repair, prevention of cleavage of the HER2 extracellular domain (ECD), reduced intracellular signal transduction, and anti-angiogenic effects have all been attributed to the administration of this medication (9). However, the efficacy of trastuzumab is not ideal, and it has many side effects, including cardiovascular adverse events, hypotension, leukopenia, anemia, dyspnea, fatigue, fever, headache, diarrhea, and skin rash.
The recurrence rate following treatment with trastuzumab in female patients is about 15%, which may be related to de novo or acquired resistance to trastuzumab and requires further investigation (10). mAbs alone may not be adequately effective and are, consequently, commonly combined with chemotherapy (11). One approach to increasing the potency of mAbs is to conjugate an antibody or antibody fragment to a toxic protein, forming chimeric proteins known as immunotoxins (5). The mAb moiety provides targeting: it binds to the target receptor, and the toxin moiety is then delivered into the cell. For example, the recombinant immunotoxin SS1(dsFv)PE38 contains the variable domains of the murine mAb SS1 fused to PE38; the SS1 mAb binds with high affinity to mesothelin in ovarian and pancreatic cancers (12). The toxins used in immunotoxins include plant and bacterial toxins. Certain toxins inhibit protein synthesis upon internalization into the target cell. For example, Pseudomonas exotoxin A (PE), ricin toxin (RT), and similar toxins are potent even in low quantities (5).
In this study, we used trastuzumab as a ligand for the HER2 receptor, conjugated to Pseudomonas exotoxin A (PE38) and the A subunit of Shiga toxin 2a (Stx2a). This fusion protein is designed to bind selectively to the HER2 receptor and to eradicate the target cells upon uptake.
PE has commonly been used as the cytotoxic segment in targeted cancer therapy. It is a bacterial toxin synthesized by Pseudomonas aeruginosa as a 69 kDa polypeptide, which is processed by removal of a 25-residue N-terminal sequence and secreted as the 66 kDa native toxin (13). PE is a member of the AB toxin family and is composed of 3 main functional domains: (1) a receptor-binding (R) domain (domain I) that facilitates attachment of the toxin to its specific receptors; (2) a translocation (T) domain (domain II), part of the B subunit, that promotes transport of the A subunit into the cytoplasm; and (3) the A subunit (domain III), which contains the catalytic (C) domain with cytotoxic activity. The catalytic domain inactivates eEF2 by transferring an ADP-ribosyl group from NAD+ to its diphthamide residue. Domain III needs a portion of domain Ib to exert complete catalytic activity (14, 15); therefore, the catalytically active region spans amino acids 395 to 613 (16).
Shiga toxins (Stx), the main virulence factors of Stx-producing Escherichia coli (STEC), are members of the AB5 family of toxins, consisting of an enzymatically active A subunit and 5 identical B subunits (7.7 kDa each) that bind to cellular receptors (17). Each B subunit harbors 3 distinct binding sites that interact specifically with the trisaccharide moiety of the glycosphingolipid Gb3 (18). Subsequently, the A subunit (32.2 kDa) is cleaved by the protease furin into an A1 fragment and a small A2 fragment (19). In the cytoplasm, the A1 fragment cleaves the N-glycosidic bond of adenine-4324 in 28S rRNA, which blocks aminoacyl-tRNA binding to the ribosome and thereby prevents protein synthesis (20). Stx comprises 2 main antigenic forms, Stx1 and Stx2; Stx2 is 100 times more potent than Stx1 in mouse models (21). Stx2 is classified into subtypes Stx2a to Stx2g based on nucleotide and amino acid sequences.
Using an in silico approach, we designed 2 chimeric constructs, PE38 (domain Ib-domain II)-VL-VH and Stxa-PE38 (domain Ib-domain II)-VL-VH, in which the toxin sequences are conjugated to the antibody fragment, and we anticipated that these chimeric constructs would induce apoptosis and eliminate breast cancer cells. The chimeric genes were optimized for protein production in a prokaryotic expression system. A high yield of protein expression was related to the ability of the chimeric gene to be expressed properly and confirmed the validity of the bioinformatic analysis. Ultimately, the refined chimeric protein was used for in vitro and in vivo evaluation.
2. Methods
2.1. Construct Design and Gene Optimization
The sequences of PE38, Stxa, and Herceptin were obtained from GenBank. The constituents of the chimeric constructs were PE (accession No. P11439) and Stx2a (accession No. P09385). Recombinant constructs were generated by fusing the toxins to the antibody fragment (VL-VH of Herceptin) through a linker. Several hydrophobic linkers were tested with the GOR4 tool (22) to separate the 2 functional parts of the chimeric protein with no or minimal disruption of their native secondary structure (data not shown). The PE38 (domain Ib-domain II-domain III)-VL-VH and Stxa-PE38 (domain Ib-domain II)-VL-VH sequences were created by fusing the N-terminal of PE38 and the C-terminal of VL (PE38-VL-VH), and the N-terminal of PE38 and the C-terminal of Herceptin (Stxa-PE38-VL-VH), using the hydrophobic amino acid linker ASGGPE. The amino acid linker GGGGSGGGGSGGGGS was used between VL and VH.
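The assembly of such a construct can be illustrated with a minimal Python sketch. Only the two linker sequences are taken from the text; the domain sequences are short hypothetical placeholders (not the real PE38, Stxa, VL, or VH sequences), the function name is illustrative, and the toxin-first domain order is an assumption based on the written construct names.

```python
# Linkers named in the text.
TOXIN_ANTIBODY_LINKER = "ASGGPE"
VL_VH_LINKER = "GGGGS" * 3  # GGGGSGGGGSGGGGS, i.e. (G4S)3


def assemble_fusion(toxin_domains, vl, vh):
    """Concatenate toxin domains, the ASGGPE linker, and a VL-(G4S)3-VH scFv.

    The order (toxin at the N-terminus, scFv at the C-terminus) is an
    assumption taken from the construct names, e.g. Stxa-PE38-VL-VH.
    """
    toxin = "".join(toxin_domains)
    scfv = vl + VL_VH_LINKER + vh
    return toxin + TOXIN_ANTIBODY_LINKER + scfv


# Hypothetical placeholder fragments, for illustration only.
PLACEHOLDER_STXA = "MKCILF"
PLACEHOLDER_PE38 = "PEGGSL"
PLACEHOLDER_VL = "DIQMTQ"
PLACEHOLDER_VH = "EVQLVE"

fusion = assemble_fusion(
    [PLACEHOLDER_STXA, PLACEHOLDER_PE38], PLACEHOLDER_VL, PLACEHOLDER_VH
)
print(fusion)
print(len(fusion))
```

In practice the placeholders would be replaced by the full domain sequences retrieved under the accession numbers given above.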
The new chimeric protein constructs were back-translated and optimized for the bacterial expression host, E. coli. Codon usage was checked with the Java Codon Adaptation Tool (JCat) (http://www.jcat.de/), the OPTIMIZER web server (23, 24), and the GenScript server (http://www.genscript.com/). GC percentage and codon adaptation index (CAI) were then calculated (25).
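For readers unfamiliar with these two metrics, the following Python sketch shows how GC percentage and CAI are commonly computed: CAI is the geometric mean of the relative adaptiveness of each codon, that is, its frequency divided by the frequency of the most used synonymous codon. This is not the implementation used by JCat, OPTIMIZER, or GenScript; the function names are illustrative, and the small usage table holds made-up frequencies rather than real E. coli values.

```python
import math


def gc_percent(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def relative_adaptiveness(usage_by_aa):
    """Map each codon to w = frequency / max frequency among its synonyms."""
    weights = {}
    for codon_freqs in usage_by_aa.values():
        best = max(codon_freqs.values())
        for codon, freq in codon_freqs.items():
            weights[codon] = freq / best
    return weights


def cai(dna, weights):
    """Geometric mean of codon weights; codons without a weight are skipped."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical usage frequencies for illustration only (NOT real E. coli data).
TOY_USAGE = {
    "Gly": {"GGT": 0.30, "GGC": 0.40, "GGA": 0.10, "GGG": 0.20},
    "Ser": {"AGC": 0.50, "TCT": 0.25, "TCC": 0.25},
}
TOY_WEIGHTS = relative_adaptiveness(TOY_USAGE)

print(gc_percent("GGCGGCAGC"))        # GC percentage of a toy gene
print(cai("GGCGGCAGC", TOY_WEIGHTS))  # every codon is the preferred one
print(cai("GGAGGC", TOY_WEIGHTS))     # one rare codon lowers the CAI
```

A CAI close to 1.0 means that nearly every codon in the gene is the one preferred by the host in the reference table.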
The in silico multiparameter optimization and analysis of the synthetic chimeric genes were completed using online databases and tools such as the codon database, the GenBank codon usage database, and the Swiss-Prot reverse translation online tool (25), and stand-alone software such as DNASIS v.2.0 (Hitachi Software Engineering, Yokohama, Japan), Leto v.1.0 (Entelechon GmbH, Germany), and Protein2DNA (DNA 2.0, www.dnatwopointo.com). The chimeric genes were synthesized by Biomatic Company.
2.2. mRNA Structure Prediction
The mfold program was used to analyze the messenger RNA (mRNA) secondary structure of the chimeric genes.
2.3. Protein Primary Structure Properties
The ExPASy ProtParam server (http://us.expasy.org/tools/protparam.html) was used to compute the physicochemical parameters: molecular weight, half-life, isoelectric point (pI), instability index (II), number of positively and negatively charged residues in the sequence, extinction coefficient (ε), aliphatic index (AI), and grand average of hydropathy (GRAVY) (26).
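Three of these parameters are simple enough to reproduce by hand, as the Python sketch below shows. It assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. It is not ProtParam's own code, and the function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """AI = X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_residue_counts(seq):
    """Return (negative, positive) = (Asp + Glu, Arg + Lys) counts."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


# The VL-VH linker from Section 2.1 serves as a small worked input.
linker = "GGGGSGGGGSGGGGS"
print(gravy(linker))                   # negative value: hydrophilic
print(aliphatic_index(linker))         # no Ala, Val, Ile, or Leu
print(charged_residue_counts(linker))  # no charged residues
```

A negative GRAVY value indicates a hydrophilic sequence, and a higher aliphatic index is generally associated with greater thermostability.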
2.4. Protein Secondary Structure Prediction
GOR secondary structure prediction method version IV was used to predict the secondary structure of the proteins (27). In addition, the PredictProtein server was used for structure and sequence analysis and to predict functional properties of the proteins, including regions lacking regular structure, secondary structure, coiled-coil regions, low-complexity regions, solvent accessible surface area (SASA), transmembrane (TM) helices, and the location of disulfide bridges.
2.5. 3D Structure Prediction Using Homology Approach
The I-TASSER online server was used to produce 3D models of the recombinant proteins (28), together with a confidence score (C-score) that reflects the quality of each predicted structure. In addition, Swiss-PdbViewer (aka DeepView) was used for energy minimization to analyze the stability of the 3D structure of the synthetic proteins (29). The online program ASA was used to evaluate the solvent accessibility of the protein residues (30).
2.6. Evaluation of Model Stability
Swiss-PdbViewer, which includes a version of the GROMOS96 43B1 force field, was used for energy minimization. ProSA-web Z-scores and the PROCHECK Ramachandran plot were used to assess the structure and to analyze its stereochemical configuration (31). Swiss-PdbViewer was also used to superimpose the query and template structures in order to visualize the generated models.
2.7. Ligand-Receptor Docking Using GRAMM-X
Docking of the chimeric proteins with HER2 was performed using the GRAMM-X Protein-Protein Docking Web Server v.1.2.0 to study protein-ligand interactions and to explore the use of the models for predicting ligand-binding potency.
3. Results
3.1. Sequence Analysis and Construct Design
The nucleotide sequences of PE and Stx were obtained from online gene banks and fused to Herceptin. Several hydrophobic linkers were tested to select the one that best maintained the functionality and preserved the native structure of the 3 parts of the recombinant protein (data not shown). Eventually, the ASGGPE sequence was selected as the optimal linker between the toxin and the antibody, and the GGGGSGGGGSGGGGS sequence was selected as the linker between VL and VH of Herceptin (Figure 1).
Afterwards, the amino acid sequences (557 aa for s1 and 601 aa for p2) were back-translated, and the codons were optimized according to the codon usage of E. coli as the expression host. For s1, optimization brought the GC content to 53.74% (the GC content of E. coli is about 50%) and the CAI to 0.98.
For p2, optimization brought the GC content to 56.2% and the CAI to 0.99.
3.2. mRNA Structure Prediction
To determine the probable folding of the chimeric genes, mRNA secondary structure prediction was combined with comparative sequence analysis. The folding of the 5' terminal region of the genes was similar to that of bacterial gene structures. The minimum free energy (MFE) of the RNA secondary structure was calculated: ΔG = -599.20 kcal/mol for s1 (Figure 2A) and ΔG = -663.00 kcal/mol for p2 (Figure 2B).
Figure 2. mRNA secondary structure prediction for s1 (A; ΔG = -599.20 kcal/mol) and p2 (B; ΔG = -663.00 kcal/mol). Graphical representation of secondary structure elements in the chimeric proteins s1 (C) and p2 (D). Blue, purple, and red indicate helix, extended strand, and random coil structures, respectively.
The predicted folding of the mRNA for both new chimeric proteins is presented here and was consistent across all 50 structural components predicted in the study. The mRNA constructs were stable enough to be translated efficiently in the new host.
3.3. Primary Structure Property
The physicochemical properties of the s1 and p2 protein sequences were computed with ProtParam and are summarized in Table 1.
Property | s1 | p2 |
---|---|---|
Amino acid residues | 557 | 601 |
Molecular weight (Da) | 59873.40 | 64328.51 |
Isoelectric point (pI) | 5.65 | 5.33 |
Most abundant amino acids | Ser (S) and Gly (G) | Gly (G) and Ala (A) |
Least abundant amino acids | Glu (E) and Asp (D) | Pyl (O) and Sec (U) |
Total number of negatively charged residues (Asp + Glu) | 51 | 65 |
Total number of positively charged residues (Arg + Lys) | 42 | 53 |
Instability index (II) | 40.40 | 41.81 |
Table 1. Physicochemical Properties of the s1 and p2 Sequences
The N-terminal residue of the s1 sequence is glutamic acid (E, Glu). The estimated half-life of this protein is 1 hour in mammalian reticulocytes (in vitro), 30 minutes in yeast (in vivo), and more than 10 hours in E. coli (in vivo).
The N-terminal residue of the p2 sequence is methionine (M, Met). The estimated half-life of this protein is 30 hours in mammalian reticulocytes (in vitro), more than 20 hours in yeast (in vivo), and more than 10 hours in E. coli (in vivo).
3.4. Protein Secondary Structure Prediction
Secondary structure prediction for the s1 chimeric protein with the online software gave 49.19% random coil, 24.06% extended strand, and 26.75% alpha helix, so random coil was the predominant element. This is represented graphically in Figure 2C.
For the p2 chimeric protein, the same prediction gave 54.91% random coil, 19.47% extended strand, and 25.62% alpha helix, again with random coil predominating. This is represented graphically in Figure 2D.
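As a quick consistency check on these figures, the short Python sketch below tabulates the reported fractions, confirms that each set sums to 100%, and identifies the dominant element. The percentages are those given in the text; the dictionary layout and function name are illustrative.

```python
# Secondary structure fractions (%) as reported for s1 and p2.
REPORTED = {
    "s1": {"random coil": 49.19, "extended strand": 24.06, "alpha helix": 26.75},
    "p2": {"random coil": 54.91, "extended strand": 19.47, "alpha helix": 25.62},
}


def summarize(fractions, tolerance=0.05):
    """Return (dominant element, total %) after checking the total is ~100%."""
    total = sum(fractions.values())
    if abs(total - 100.0) > tolerance:
        raise ValueError(f"fractions sum to {total:.2f}%, expected 100%")
    dominant = max(fractions, key=fractions.get)
    return dominant, round(total, 2)


for name, fractions in REPORTED.items():
    print(name, summarize(fractions))
```

Both sets pass the check, and random coil is the largest class in each construct.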
3.5. Tertiary Structural Prediction for the Chimeric Protein
3D models of the chimeric proteins were created with I-TASSER and loaded into Swiss-PdbViewer to visualize the tertiary structures (32).
Five probable tertiary structures were predicted by I-TASSER for s1 (Figure 3A). According to the C-scores calculated by this software, model 1, with a C-score of -2.71, had the highest confidence among the five models. Its expected TM-score and RMSD were 0.40 ± 0.14 and 14.3 ± 3.8 Å, respectively. Likewise, 5 probable tertiary structures were predicted for the p2 construct, and model 3, with a C-score of -1.98, was selected (Figure 3B). Its TM-score and RMSD were 0.30 ± 0.10 and 17.6 ± 2.5 Å, respectively.
3.6. Evaluation of Model Stability
Swiss-PdbViewer (spdbv) was used for energy minimization, giving energies of -6107.159 kcal/mol for s1 and -6205.140 kcal/mol for p2, which indicates that the recombinant proteins had acceptable stability. Furthermore, Ramachandran plot analysis (Figure 4A and B) supported the stability of the chimeric protein structures.
Figure 4. Evaluation of model stability based on Ramachandran plots for s1 (A) and p2 (B). Docking of the s1 chimeric protein (C) and the p2 chimeric protein (D) with the HER2 receptor, using GRAMM-X. The models were used to examine protein-ligand interactions and to predict ligand-binding potency.
3.7. Ligand Docking
Docking was performed with the GRAMM-X server, which computes protein-protein docking. HER2 and the Herceptin fusion protein (as the ligand) were uploaded to the GRAMM-X server as a pair of structures in PDB format, and default parameters were used for the jobs. When the results were viewed in a visualization tool such as SPDBV, the docking between the receptor and the ligand could be clearly observed, as shown in Figure 4C and D.
4. Discussion
Breast cancer is the most common type of cancer among adult women of all ages and races, and the choice of treatment strategy is of great importance for achieving a favorable outcome. Patients with overexpressed HER2 have a poorer prognosis and a reduced survival rate compared to other breast cancer subtypes (1). mAbs that target tumor cell surface antigens have been shown to prevent tumor cell growth (33), and their cytotoxic potential can be enhanced by conjugating cytotoxic agents to them (34). mAbs can bind to the extracellular domain of the human HER2 protein (35). The therapeutic potential of intact mAbs is limited by their large size and immunogenicity. The Fv (antigen-binding domain) of the antibody may be adequate for binding to the receptor. Single-chain antigen-binding proteins (scFv), in which a short peptide linker joins the light (VL) and heavy (VH) chain variable domains of the antibody, can be expressed in bacteria and preserve high-affinity binding to the target (36). To increase antibody efficacy, the antibody can be conjugated to toxic agents. In this study, we designed 2 unique constructs, stxa + Ib + II + vl + vh of Herceptin and III + Ib + II + vl + vh, and compared them in silico as potential new antitumor candidates. Previously, Keshtvarz et al. designed and evaluated the PE38-P4A8 chimeric immunotoxin, which was optimized to a codon adaptation index of 0.94 and a GC content of 54.2% and showed high and stable expression in bacterial cells. Most of its secondary structure consisted of random coils and disordered regions, and the secondary structure of this immunotoxin was stable. The construct was hydrophilic and acidic. The selected tertiary structure model of the fusion protein had the highest C-score, -3.36. In this study, ASGGPE and (G4S)3 linkers were used to separate the different segments (37).
In another study, we designed and analyzed the TGFαL3-SEB fusion protein, which was optimized to a CAI of 0.85 and a GC content of 44.06%, increasing the overall stability of the mRNA. The coil structural content was high. The TGFαL3-SEB fusion protein was hydrophilic, and its computed isoelectric point was 7.72. The selected tertiary structure model had the highest C-score, -0.42 (38). Comparison of these two chimeric proteins suggested that TGFαL3-SEB and PE38-P4A8 were stable fusion proteins with proper affinity to their receptors, which are overexpressed in cancer cells.
Our structural models suggest that attachment of the scFv domain to HER2 on the surface of breast cancer cells leads to uptake of the construct by receptor-mediated endocytosis. Subsequently, the enzymatic domain is released into the cytoplasm, where elongation factor 2 is ADP-ribosylated. This inactivation of elongation factor 2 by the toxin inhibits protein synthesis, leading to tumor cell death by apoptosis. Computational studies were performed to predict the physicochemical properties, structures, stability, and ligand-receptor interactions of these chimeric proteins. Several hydrophobic linkers were tested with the GOR4 tool (22) to separate the 2 functional parts of the chimeric protein with no or minimal disruption of their native secondary structure.
The recombinant stxa + Ib + II + vl + vh and III + Ib + II + vl + vh sequences were constructed by fusing the N-terminal of II and the C-terminal of VL (stxa + Ib + II + vl + vh) and the N-terminal of II and the C-terminal of VL (III + Ib + II + vl + vh), using the hydrophobic amino acid linker ASGGPE. The 12-aa linker GSGGSGGSGGSG was used to connect VL and VH in the 2 proteins. Aria et al. reported that short helical linkers show a multimerizing tendency compared with longer ones, and that a short flexible linker performs more efficiently than a helical linker (39).
The folding of the 2 constructs (stxa + Ib + II + vl + vh) and (III + Ib + II + vl + vh) was analyzed; both displayed accessible surface areas, with their functional regions not buried within the structure. In silico investigations supported efficient transcription and translation, as well as good expression of the new chimeric constructs in the host expression system. The major parameter used for gene optimization was CAI, which ranges from 0 to 1 with an ideal value of 1.0. Since E. coli was used as the expression system, the E. coli codon usage table was employed for back-translating the sequences and optimizing expression of the fusion proteins.
The CAI was raised from 0.5 in the wild-type sequences to 0.95 in the optimized s1 chimeric gene and 0.99 in p2. Moreover, the overall GC content was reduced from 65% to 53.74% for s1 and from 68% to 57.5% for p2, which in turn enhances the stability of the mRNA molecules, a major factor in regulating synthetic gene expression. Furthermore, the required restriction enzyme sites were added to the ends of the designed genes for future assays. Codon optimization was intended to ensure that the synthetic constructs are expressed optimally in the desired host. In this study, the mRNA structure was optimized based on the Gibbs free energy (ΔG°) and the energy of the start codon region in the mRNA, which is associated with ribosome binding and translation initiation. The mfold program was used to assess the mRNA secondary structure of the s1 and p2 chimeric genes with the following settings: linear RNA folding at 5% suboptimality, window = 12, max folds = 50. All 43 structural elements obtained in this investigation showed folding of the RNA constructs at 37°C, with ΔG ranging from -599.20 to -567.05 kcal/mol for s1 and from -663.00 to -650.27 kcal/mol for p2. The best structures for s1 and p2 had ΔG values of -599.20 and -663.00 kcal/mol, respectively. These data indicate that the mRNA was stable enough for efficient translation in the new host.
The GOR method was applied to predict the secondary structure of the 2 chimeric proteins. This method estimates the probable secondary structure of each amino acid while taking into account the influence of neighboring amino acids. The most abundant structure within our fusion proteins was random coil, which could be owing to the presence of a high proportion of small, flexible amino acids such as glycine. According to the physicochemical analysis, both fusion proteins were acidic and had a high extinction coefficient at 280 nm owing to their high content of cysteine, tryptophan, and tyrosine; these fusion proteins could therefore be analyzed by ultraviolet-visible spectrophotometry. Although our fusion proteins are predicted to be partially unstable (instability index slightly above 40), their high estimated aliphatic index suggests stability over a broad range of temperatures.
An important factor in designing novel chimeric proteins is that their molecular function is supported by their three-dimensional (3D) structure. The I-TASSER online server was used to produce 3D models of the recombinant s1 and p2 proteins, which were evaluated by C-score, Z-score, RMSD, and TM-score. Five models were suggested by this server for each chimeric protein. For s1 and p2, model 1 and model 3, respectively, had the highest C-scores and were selected for further examination. Structural evaluation and stability assessment of the fusion proteins were completed using the PROCHECK Ramachandran plot. Energy minimization and analysis of the 3D structural stability of the chimeric proteins were performed with Swiss-PdbViewer.
Ligand-receptor docking was used to study whether Herceptin could retain its binding ability and deliver PE and Stx to tumors overexpressing HER2 (a member of the EGFR family overexpressed in many human tumors). Molecular docking was performed with the GRAMM-X server. A significant feature of GRAMM is its ability to smooth the protein surface representation to account for possible conformational changes upon binding within the rigid-body docking method.
s1 and p2 showed high predicted affinity towards HER2. The results of the present study indicate that s1 and p2 bind strongly enough to their receptor to be proposed as novel antitumor candidates in breast cancer.
In conclusion, based on docking analysis, the binding of Herceptin to its receptor was robust enough for these constructs to be proposed as new antitumor candidates in cancer therapy. The results suggest that s1 and p2 are stable fusion proteins with appropriate affinity to the overexpressed receptor, making them potential candidates for inducing apoptosis in breast cancer cells.