To identify the most effective biomarkers for IgA nephropathy (IgAN), the researchers applied LASSO, MCP, and random forest to their high-dimensional proteomic dataset. Comparison of the number of selected biomarkers (11) with the size of the protein profile (493) indicated that, as expected, most proteins had no role in diagnosis. In terms of prediction accuracy, the three models performed identically, because each correctly differentiated all the IgAN patients from the controls.
The biomarkers selected by the 3 models were somewhat different, which may be due to the small sample size. Compared with random forest (and machine learning techniques generally), LASSO and MCP give more stable results when the number of biomarkers is much larger than the sample size (14, 15). Moreover, penalized methods are more interpretable than machine learning techniques, because they yield quantities such as the odds ratio for each biomarker and the probability of disease for each patient. On the other hand, random forest, as one of the most powerful machine learning techniques, is not restricted to linear associations and could detect any relationship between biomarkers and disease (5).
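For readers less familiar with the penalized methods, the following is a sketch of the standard formulation, assuming a penalized logistic regression (the text mentions odds ratios and disease probabilities but does not state the model explicitly). The symbols $y_i$, $x_i$, $\beta$, $\lambda$, and $\gamma$ are notation introduced here for illustration, and the intercept is omitted for brevity.

```latex
\hat{\beta} = \arg\min_{\beta}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n}
      \left[ y_i\, x_i^{\top}\beta - \log\!\left(1 + e^{x_i^{\top}\beta}\right) \right]
    + \sum_{j=1}^{p} p_{\lambda}\!\left(|\beta_j|\right) \right\}

% LASSO penalty
p_{\lambda}(t) = \lambda t

% MCP penalty, with \gamma > 1
p_{\lambda,\gamma}(t) =
  \begin{cases}
    \lambda t - \dfrac{t^{2}}{2\gamma}, & t \le \gamma\lambda \\[6pt]
    \dfrac{\gamma\lambda^{2}}{2},       & t > \gamma\lambda
  \end{cases}

% Interpretable quantities
\mathrm{OR}_j = e^{\hat{\beta}_j},
\qquad
\Pr(y_i = 1 \mid x_i) = \frac{1}{1 + e^{-x_i^{\top}\hat{\beta}}}
```

Under this formulation, a biomarker is "selected" when its estimated coefficient is non-zero, and the exponentiated coefficient gives its odds ratio.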
Since the sensitivity and specificity of the 3 models were 100%, all eleven significant proteins may be regarded as important in disease development. Nevertheless, the researchers suggest that the urinary proteins with the highest diagnostic value, defined here as those that were significant in at least 2 models, are the most suitable for further validation. Accordingly, the suggested panel is composed of FBLN5, GOLM1, and CD44. The researchers excluded albumin because it is non-specific and is excreted in the urine in all types of kidney disease with proteinuria; however, the fragments of excreted albumin and the amount of excretion might be disease-specific. Therefore, the significance of albumin, one of the most abundant proteins in serum and urine, seems logical in this condition, and future experiments should explore the excretion of specific albumin fragments in IgAN.
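The "significant in at least 2 models" rule can be sketched as a simple vote count. This is a minimal illustration, not the researchers' code: the function name `consensus_panel`, the model keys, and the protein labels `P1` to `P5` are hypothetical placeholders and do not reflect the study's actual per-model selections.

```python
from collections import Counter


def consensus_panel(selections, min_models=2):
    """Return the features selected by at least `min_models` of the models.

    `selections` maps a model name to the list of features it selected.
    """
    counts = Counter()
    for selected in selections.values():
        # set() ensures each model contributes at most one vote per feature
        counts.update(set(selected))
    return sorted(f for f, n in counts.items() if n >= min_models)


# Hypothetical selections for illustration only (NOT the study's results).
selections = {
    "lasso": ["P1", "P2", "P3"],
    "mcp": ["P2", "P3", "P4"],
    "random_forest": ["P3", "P5"],
}

panel = consensus_panel(selections)
print(panel)  # ['P2', 'P3']
```

Raising `min_models` to 3 would keep only features chosen by every model, which trades a smaller panel for stronger agreement.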
All of these 3 suggested diagnostic biomarkers are down-regulated in IgAN patients in the present dataset with a fold change of 11.3, 2.4, and 10.6 for FBLN5, GOLM1, and CD44, respectively.
FBLN5 (fibulin-5) is a multifunctional glycoprotein of the fibulin family that has been reported to be associated with elastic system fibers and contributes to the assembly and organization of the extracellular matrix (ECM) and the regulation of microfibril formation (16-18). Decreased excretion of FBLN5 in IgA nephropathy patients compared with healthy individuals suggests impairment of elastic system fibers and ECM-cell interaction in this disease. One hypothesis for the relationship between decreased urinary excretion of FBLN5 and IgA nephropathy is that accumulation of microfibrils and aberrant remodeling of the ECM during mesangial matrix expansion, both mediated by FBLN5, reduce urinary FBLN5 relative to normal conditions.
GOLM1 is a cis-Golgi membrane protein with unknown function (19). This protein is a known non-invasive biomarker for prostate cancer (20) and is predominantly expressed by cells of the epithelial lineage, especially in the liver and kidney (21). Defects in the GOLM1 gene lead to the development of renal disease, most notably focal segmental glomerulosclerosis and hyaline thrombi (22). This is the first time that GOLM1 has been suggested as a potential biomarker of IgA nephropathy. Further experiments in a larger cohort are essential to validate the urinary changes of GOLM1 in IgAN patients.
The third suggested biomarker for IgAN, CD44, is also involved in cell-cell and cell-matrix interactions (23). CD44 is a marker of activated parietal epithelial cells (PECs) (24), and its expression is markedly enhanced in inflammatory renal diseases (25). Over-expression of CD44 in the renal tissue of IgA nephropathy patients was previously reported by Kim et al. and Florquin et al. (26, 27). They reported a positive correlation between CD44 expression and the degree of proteinuria as well as the degree of renal damage (26). There is also evidence of decreased urinary excretion of CD44 in the advanced stage of IgA nephropathy compared with the primary stage (28). In the current study, decreased urinary excretion of this protein was significant and helped to discriminate cases from controls. The different patterns of change of this biomarker across studies might be due to the different sample types: tissue versus urine (10).
4.1. Conclusion
Because all 3 models correctly differentiated all the IgAN patients from the controls, the researchers suggest that the proposed models could be used for modeling high-dimensional, low-sample-size datasets.