1. Background
Neurological diseases are highly prevalent and can have irreparable consequences for the patient, family, and society. Their complications include stress, anxiety, reduced life expectancy and general health, and increased pressure on patients' caregivers. The economic burden of long-term care also places a major strain on social resources, so prioritizing early diagnosis and effective therapeutic strategies is essential to mitigate this societal impact (1-4).
Psychiatric disorder is a disease related to the brain and nervous system and is one of the severe mental illnesses that disrupt individual, social, and occupational functioning, placing significant pressure on the patient, family, and care providers (5, 6). It is a chronic brain disease with a lifetime prevalence of about one percent that places a great health and medical burden on society and leads to disability worldwide (7, 8). It is also one of the most debilitating and important psychiatric diseases, with many negative effects on the social functioning of those affected and a very high likelihood of a prolonged course (9-11). People with the disorder experience problems in working and long-term memory, attention, social functioning, and processing speed. In an unusual subgroup, the individual may be largely silent, exhibit strange motor behaviors, or display inappropriate anxiety (12). Cognitive deficits are widespread in patients with schizophrenia, and treatment options are limited.
Treatment plans for patients with schizophrenia include hospitalization, medication, electroconvulsive therapy, psychosocial therapies such as behavioral, family, group, and individual therapy and social skills training, and rehabilitation therapies such as cognitive therapy and physical activity (13, 14).
Bipolar disorder (BD) is another disease related to the brain and nervous system (15, 16). It is a chronic, episodic disease with acute attacks and a reported lifetime prevalence of around 2 - 4%. It has been observed in different age groups, including children, teenagers, and adults, and onset at a younger age leads to an increase in hospitalizations, suicides, substance abuse, and behavioral problems (17-19). Bipolar disorder is a common psychiatric disorder that significantly affects quality of life and a person's emotional, professional, and cognitive performance (20). In addition to the physical and mental problems it causes in the patient, BD confronts the patient's family members with basic challenges, including sleep disorders, depression, tension, and feelings of hunger (21, 22). Various genetic and environmental factors contribute to BD; one factor that may be relevant to this disorder and to other emotional problems is attachment style (23, 24). Inappropriate treatment of BD can have serious consequences for the health system: if it prolongs the treatment process, costs increase and the treatment system becomes less effective. Proper treatment therefore depends on early diagnosis of the disease (20, 25). Diagnosing the disease requires identifying the factors that influence its development, and bioinformatics studies are one way to do so (26, 27).
Bioinformatics is a relatively new field that includes data analysis, modeling of biological phenomena, and the development of algorithms and statistical methods, and it has been an important research topic in recent years (28). Given the high heritability of these diseases, molecular genetic methods are likely to reveal the related genes. However, identifying a gene and its location on the chromosome is a difficult task. In recent years, many efforts have been made to identify biomarkers for diagnosis, prevention, and treatment (29).
2. Objectives
This study used a bioinformatics approach to identify the central genes that govern the molecular pathways associated with BD, with the aim of determining gene biomarkers with diagnostic potential.
3. Methods
3.1. Study Design
This analytical study compared gene expression data from BD patients against a control group, with the primary objective of extracting biomarkers related to network structure.
3.2. Data Acquisition and Candidate Gene Selection
The study population comprised patients with BD. Initial network data sources included the NCBI database, SWISS-PROT, and the Diseasome database. Genes involved in the disease were extracted after a comprehensive literature review and searches across bioinformatics databases such as NCBI, GeneCards, SWISS-PROT, and Diseasome. Candidate genes were selected only if they were supported by evidence from at least one method (in vivo, in vitro, or in silico). Gene expression data for these candidate genes were then collected from standard databases using an algorithm written in MATLAB.
3.3. Data Standardization and Network Construction
To facilitate comparison and test the research hypotheses, the gene expression data from each group (case and control) were standardized relative to the control group. Gene communication networks were then constructed independently for the patient cohort and the healthy cohort using MATLAB.
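As an illustration of standardizing expression values relative to the control group, the following minimal Python sketch z-scores one gene's values using the control mean and standard deviation. The z-score formula, the function name `standardize_to_control`, and the toy values are assumptions made for illustration; the text does not specify the exact procedure, and the study itself used MATLAB.

```python
from statistics import mean, stdev


def standardize_to_control(case, control):
    """Z-score one gene's expression values using the control mean and SD.

    Assumption: "standardized relative to the control group" is read here
    as (x - control mean) / control SD. The source does not state this.
    """
    mu = mean(control)
    sd = stdev(control)  # sample standard deviation of the control group
    if sd == 0:
        raise ValueError("control values are constant; cannot standardize")
    case_z = [(x - mu) / sd for x in case]
    control_z = [(x - mu) / sd for x in control]
    return case_z, control_z


# Toy values for a single hypothetical gene, not study data
control_values = [1.0, 3.0]
case_values = [4.0, 2.0]
case_z, control_z = standardize_to_control(case_values, control_values)
```

After this transformation the control values are centered on zero, so each case value reads as a deviation from the control baseline in units of control standard deviations.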
3.4. Analysis and Biomarker Identification
Descriptive and analytical statistical methods, together with machine learning methods based on bioinformatics algorithms, were used to compute the network features. The topological and structural characteristics of the two networks were calculated and compared to identify parameters that differed significantly between the groups. These parameters were then designated as prospective biomarkers.
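The comparison of topological characteristics between the two networks might look like the following minimal Python sketch, which ranks genes by the change in their degree between a case network and a control network. The adjacency-set representation, the function names, the use of degree as the compared parameter, and the gene labels G1 to G3 are all hypothetical; the text does not state which statistical test was used to judge significance, so none is shown here.

```python
def degree(adj):
    """Number of edges incident to each gene in an adjacency-set graph."""
    return {gene: len(neighbors) for gene, neighbors in adj.items()}


def rank_by_degree_change(case_adj, control_adj):
    """Rank genes by absolute change in degree (case minus control)."""
    case_deg = degree(case_adj)
    control_deg = degree(control_adj)
    genes = sorted(set(case_deg) | set(control_deg))
    change = {g: case_deg.get(g, 0) - control_deg.get(g, 0) for g in genes}
    # Largest absolute change first; ties broken alphabetically
    ranked = sorted(genes, key=lambda g: (-abs(change[g]), g))
    return ranked, change


# Hypothetical toy networks, not study data
control_adj = {"G1": {"G2"}, "G2": {"G1", "G3"}, "G3": {"G2"}}
case_adj = {"G1": {"G2", "G3"}, "G2": {"G1", "G3"}, "G3": {"G1", "G2"}}
ranked, change = rank_by_degree_change(case_adj, control_adj)
```

In this toy example the edge between G1 and G3 exists only in the case network, so those two genes rank ahead of G2, whose degree is unchanged.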
3.5. Validation and Ethical Compliance
Accuracy assessment and biological validation of both the constructed networks and the proposed biomarkers were performed using the Reactome and Diseasome databases. Statistical analyses at all stages were carried out in R and MATLAB. The study was approved by the Ethics Committee of Ilam University of Medical Sciences (reference number IR.MEDILAM.REC.1402.152). The principles of confidentiality in publishing articles and reporting data and the guidelines issued by the university's research ethics committee were observed throughout the study.
4. Results
This investigation set out to map the genomic associations underlying BD and to quantify the network’s essential components. First, genes implicated in BD were identified via text mining (Table 1). Expression profiles for these candidate genes were then retrieved from established databases, specifically Entrez Gene and UniProt. The communication network connecting the candidate genes was constructed using the Gephi platform, with edge weights defined from the expression levels of the associated genes and proteins. Structural centrality criteria were then calculated on this network to identify the most essential genes and proteins (Table 2).
Table 1. Genes implicated in BD and their GDA scores

| Number | Gene Name | GDA Score |
|---|---|---|
| 1 | HTR2A | 0.8 |
| 2 | S100B | 0.74 |
| 3 | CACNA1C | 0.7 |
| 4 | ANK3 | 0.7 |
| 5 | COMT | 0.7 |
| 6 | NCAN | 0.68 |
| 7 | SP4 | 0.63 |
| 8 | POLG | 0.62 |
| 9 | ADCY2 | 0.62 |
| 10 | LMAN2L | 0.61 |
| 11 | FADS2 | 0.61 |
| 12 | BDNF | 0.6 |
| 13 | SLC6A4 | 0.6 |
| 14 | CLOCK | 0.6 |
| 15 | GSK3B | 0.6 |
| 16 | NR3C1 | 0.6 |
| 17 | DRD1 | 0.6 |
| 18 | MTHFR | 0.6 |
| 19 | GAD1 | 0.62 |
| 20 | ITIH1 | 0.6 |
| 21 | NDUFV2 | 0.6 |
| 22 | GRIN2A | 0.59 |
| 23 | RELN | 0.58 |
| 24 | ACE | 0.57 |
| 25 | GRK3 | 0.56 |
| 26 | GRM5 | 0.6 |

Table 2. Top 10 genes ranked by each centrality measure

| Number | MNC | Degree | Closeness | Radiality | Betweenness |
|---|---|---|---|---|---|
| 1 | ACE | BDNF | BDNF | BDNF | PPARGC1A |
| 2 | ADCY2 | TP53 | PPARGC1A | PPARGC1A | TP53 |
| 3 | S100B | NR3C1 | TP53 | TP53 | BDNF |
| 4 | DRD1 | PPARGC1A | GSK3B | GSK3B | GAD1 |
| 5 | NR3C1 | SLC6A4 | NR3C1 | NR3C1 | NDUFV2 |
| 6 | MTHFR | SIRT1 | SIRT1 | SIRT1 | GSK3B |
| 7 | RELN | PER2 | PER2 | PER2 | NDUFA9 |
| 8 | PPARGC1A | COMT | COMT | COMT | NDUFB5 |
| 9 | SORT1 | NDUFA9 | GAD1 | GAD1 | NDUFS3 |
| 10 | SIRT1 | NDUFB5 | ACE | ACE | CACNA1C |
4.1. Determining Critical Nodes (Essential Vertices) via Centrality
The concept of ‘hubs’ in network theory, derived from centrality assessments, is utilized to pinpoint the essential vertices within a structural topology. In the context of biological networks, identifying these hub vertices serves to delineate the most influential components, allowing for the proposition of core genes or proteins as diagnostic biomarkers potentially aiding in disease prognosis or therapeutic targeting.
4.2. Network Structural Parameters Employed
The analysis was based on five distinct graph theoretical metrics:
4.2.1. Maximum Neighborhood Component (MNC)
For a vertex a, the MNC score is the size (number of vertices) of the largest connected component of the subgraph formed by the neighbors of a. Based on this metric, the top 10 biomarkers were ranked in descending order of MNC score.
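The MNC score can be sketched in Python as follows, assuming the common convention in which the component is taken over the subgraph induced by the neighbors of vertex a (with a itself excluded). The adjacency-set representation, the function name `mnc`, and the toy vertices A to D are hypothetical and are not the study's implementation.

```python
from collections import deque


def mnc(adj, a):
    """Size of the largest connected component among the neighbors of a."""
    neighbors = set(adj[a])
    seen = set()
    best = 0
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search restricted to the neighborhood of a
        seen.add(start)
        size = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w in neighbors and w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


# Hypothetical toy network: A is linked to B, C, and D; B and C are linked
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
mnc_a = mnc(adj, "A")
```

Here the neighbors of A split into the component {B, C} and the isolated vertex D, so the score for A is the size of the larger component.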
4.2.2. Degree
Defined as the total number of edges incident to a vertex. This fundamental criterion was applied to rank the most effective biomarkers within the established BD network.
4.2.3. Closeness Centrality
Calculated in a connected network as the reciprocal of the sum of the shortest path distances from a given vertex to all other vertices in the network. Its significance in biological network analysis lies in quantifying a gene’s topological proximity to all other network elements.
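A minimal Python sketch of this definition is given below. It assumes an unweighted, connected, undirected network stored as adjacency sets, so distances are hop counts; the study's network is edge-weighted, and the handling of weights is not specified in the text. The function names and the toy path A-B-C are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop-count shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """Reciprocal of the summed shortest-path distances from v."""
    dist = bfs_distances(adj, v)
    if len(dist) != len(adj):
        raise ValueError("graph is not connected")
    total = sum(dist.values())
    return 1.0 / total if total else 0.0


# Hypothetical toy path A - B - C
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
closeness_b = closeness(adj, "B")
closeness_a = closeness(adj, "A")
```

The middle vertex B is one step from both other vertices and therefore has the higher closeness, while the end vertex A must cross B to reach C.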
4.2.4. Radiality
This metric measures how easily a vertex reaches the rest of the network. Each shortest-path distance from the vertex to another vertex is subtracted from the network diameter plus one, and the results are averaged over all other vertices, so vertices that lie close to the rest of the network score highest. The highest radiality scores were calculated for a specific subset of genes.
4.2.5. Betweenness Centrality
This measure quantifies the frequency with which a vertex lies upon the shortest paths connecting any two other vertices. A vertex exhibiting high betweenness centrality is critical for information flow; its removal risks significantly fragmenting overall network connectivity.
4.2.6. Network Representation Context
In the visualized communication network (Figure 1), each node represents a gene, and the communicating edge between nodes signifies a physical or functional interaction substantiated by at least one evidence type (in vivo, in vitro, or in silico study). The final gene scoring was performed based on the GDA criterion.
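The radiality and betweenness measures used in this section can be sketched on an adjacency-set representation of such a gene network. The Python sketch below assumes an unweighted, connected, undirected graph; it uses the standard radiality formula (network diameter plus one, minus each shortest-path distance, averaged over the other vertices) and Brandes' algorithm for betweenness. These are common textbook formulations and are not confirmed to be the study's implementation; all names and the toy path A-B-C are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop-count shortest-path distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def radiality(adj, v):
    """Mean of (diameter + 1 - distance) over all vertices other than v."""
    diameter = max(max(bfs_distances(adj, u).values()) for u in adj)
    dist = bfs_distances(adj, v)
    others = [d for u, d in dist.items() if u != v]
    return sum(diameter + 1 - d for d in others) / (len(adj) - 1)


def betweenness(adj):
    """Brandes' algorithm: shortest-path pairs passing through each vertex."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies from the farthest vertices back
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is counted from both endpoints in an undirected graph
    return {v: b / 2 for v, b in score.items()}


# Hypothetical toy path A - B - C
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
bc = betweenness(adj)
rad_b = radiality(adj, "B")
rad_a = radiality(adj, "A")
```

In the toy path, B lies on the only shortest path between A and C, so it alone has non-zero betweenness, and it also has the highest radiality because it is one step from every other vertex.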
5. Discussion
In the study of Mohammadi et al., the mean age of patients with BD was 11.8 ± 3.78 years, and 83.5% of them were urban residents. According to the findings, 0.29% of men and 0.26% of women had BD, the incidence of BD was higher in urban than in rural residents, and the incidence increased with age (30). In the study by Mohammadi et al., of 3147 patients with mental disorders studied between 2006 and 2010, 3.5% had depressive disorder, 2.6% had BIID disorder, and 6.2% had NOS disorder (30). Bipolar disorder is one of the most important psychiatric diseases and causes educational failure, occupational disability, personality degradation, and disruption of individual, family, and social relationships in sufferers. Beyond the individual, BD also affects the patient's family, disrupting its structure, function, and duties. It increases the problems and psychological burden imposed on families, such as economic problems, emotional and stress reactions resulting from the need to adapt to the disorder, changes in daily family activities, and restrictions on social activities (31-33).
This study aimed to conduct a bioinformatics study of gene biomarkers in BD. Psychiatric disorders are multifactorial and are influenced by multiple genes, environmental factors, and genetic and epigenetic processes. In recent years, studies have reported genetic findings in psychiatric disorders that can provide important assistance in the diagnosis and treatment of these diseases (34-38). Numerous bioinformatics studies have been conducted on various diseases; some of their results are described below. Masoudi et al. studied the ZBTB16 gene in stem cells and analyzed 100 genes related to the ZBTB16 gene. 
According to the results, the expression of the ZBTB16 gene was clearly seen, and if the ZBTB16 gene is deleted, some undesirable pathways can be disrupted (39). Khorasani et al. conducted a bioinformatics study on the miR-200 family and found that the E2F3 gene is a common target of all members of the family. In the same study, miR-200c was reported to interfere in the development of prostate cancer by targeting CDKN1B and miR-429 (40). Genome-wide scans have shown that susceptibility loci on several chromosomes are involved in the development of schizophrenia. One of the most important genes involved in this disease is PRODH, located on chromosome 22q11. This gene is highly expressed in the brain and encodes proline dehydrogenase, which catalyzes the first step in proline degradation (41, 42).
5.1. Conclusions
This study integrated literature-based gene identification with network analysis to examine the structural underpinnings of BD. Candidate genes related to BD were first identified via text mining, yielding GDA scores for genes such as CLOCK and CACNA1C. Expression levels were then extracted from public databases to weight the edges of the network constructed in Gephi, so that the network topology reflected the expression data. Finally, five centrality measures, including betweenness and MNC, were applied to rank and identify the hub genes that are likely to be most critical for the functional integrity of the BD molecular pathway.
