Abstract
Keywords
Supervised Machine Learning; Data Collection; Model Evaluation
References
1. Flasiński M. Symbolic artificial intelligence. Introduction to artificial intelligence: Springer; 2016. p. 15-22.##https://doi.org/10.1007/978-3-319-40022-8_2.
2. Badillo S, Banfai B, Birzele F, Davydov II, Hutchinson L, Kam-Thong T, et al. An introduction to machine learning. Clin Pharmacol Ther 2020; 107: 871-885.
3. Géron A. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: O'Reilly Media, Inc.; 2022.
4. Jain A, Patel H, Nagalapatti L, Gupta N, Mehta S, Guttula S, et al., editors. Overview and importance of data quality for machine learning tasks. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2020.##https://doi.org/10.1145/3394486.3406477.
5. Budach L, Feuerpfeil M, Ihde N, Nathansen A, Noack N, Patzlaff H, et al. The effects of data quality on machine learning performance. arXiv (preprint) 2022.
6. Kariluoto A, Kultanen J, Soininen J, Pärnänen A, Abrahamsson P, editors. Quality of data in machine learning. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C); 2021: IEEE.##https://doi.org/10.1109/QRS-C55045.2021.00040.
7. Sarker IH. Machine learning: Algorithms, real-world applications and research directions. SN Comput Sci 2021; 2: 160.
8. Banko M, Brill E, editors. Mitigating the paucity-of-data problem: Exploring the effect of training corpus size on classifier performance for natural language processing. Proceedings of the First International Conference on Human Language Technology Research; 2001.
9. Jin J, Yin F, Xu Y, Zhang J, editors. Learning a model with the most generality for small-sample problems. Proceedings of the 2022 5th International Conference on Algorithms, Computing and Artificial Intelligence; 2022.##https://doi.org/10.1145/3579731.3579814.
10. Kim JY, Cho SB. An information theoretic approach to reducing algorithmic bias for machine learning. Neurocomputing 2022; 500: 26-38.##https://doi.org/10.1016/j.neucom.2021.09.081.
11. Chen M, Cheng H, Du Y, Xu M, Jiang W, Wang C, editors. Two wrongs don't make a right: Combating confirmation bias in learning with label noise. Proceedings of the AAAI Conference on Artificial Intelligence; 2023.##https://doi.org/10.1609/aaai.v37i12.26725.
12. Akhatov A, Ulugmurodov SA. Training data selection and labeling for machine learning braille recognition models. Int J Contem Sci Tech Res 2023; 15-21.
13. Whang SE, Roh Y, Song H, Lee JG. Data collection and quality challenges in deep learning: A data-centric AI perspective. VLDB J 2023; 32: 791-813.##https://doi.org/10.1007/s00778-022-00775-9.
14. Angloher G, Banik S, Bartolot D, Benato G, Bento A, Bertolini A, et al. Towards an automated data cleaning with deep learning in CRESST. Eur Phys J Plus 2023; 138: 1-11.
15. Chu X, Ilyas IF, Krishnan S, Wang J, editors. Data cleaning: Overview and emerging challenges. Proceedings of the 2016 International Conference on Management of Data; 2016.##https://doi.org/10.1145/2882903.2912574.
16. Singh A, Thakur N, Sharma A, editors. A review of supervised machine learning algorithms. 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom); 2016: IEEE.
17. Eminaga O, Abbas M, Shen J, Laurie M, Brooks JD, Liao JC, et al. PlexusNet: A neural network architectural concept for medical image classification. Comput Biol Med 2023; 154: 106594.
18. Gupta A, Chaithra N, Jha J, Sayal A, Gupta V, Memoria M, editors. Machine learning algorithms for disease diagnosis using medical records: A comparative analysis. 2023 4th International Conference on Intelligent Engineering and Management (ICIEM); 2023: IEEE.##https://doi.org/10.1109/ICIEM59379.2023.10165850.
19. Kaur P, Singh RK. A review on optimization techniques for medical image analysis. Concurr Comput Pract Exp 2023; 35: e7443.##https://doi.org/10.1002/cpe.7443.
20. Shanbehzadeh M, Valinejadi A, Afrah R, Kazemi AH, Orooji A, Kaffashian MR. Comparison of machine-learning algorithms efficiency to build a predictive model for mortality risk in COVID-19 hospitalized patients. Koomesh 2021; 24: 128-138. (Persian).
21. Tanhapour M, Maghooli K, Rostam Niakan Kalhori S. Determining the progression stages of liver fibrosis in patients with chronic hepatitis B. Koomesh 2022; 24: 639-647. (Persian).
22. Ying X, editor. An overview of overfitting and its solutions. Journal of Physics: Conference Series; 2019: IOP Publishing.##https://doi.org/10.1088/1742-6596/1168/2/022022.
23. Nazha A, Elemento O, McWeeney SK, Miles M, Haferlach T. How I read an article that uses machine learning methods. Blood Adv 2023; 2023010140.
24. Jabbar H, Khan RZ. Methods to avoid over-fitting and under-fitting in supervised machine learning (comparative study). Comput Sci Commun Instru Devic 2015; 70: 978-981.##https://doi.org/10.3850/978-981-09-5247-1_017.
25. Swathi P. Analysis on solutions for over-fitting and under-fitting in machine learning algorithms. Int J Innov Res Sci Eng Technol 2018; 7.
26. Uçar MK, Nour M, Sindi H, Polat K. The effect of training and testing process on machine learning in biomedical datasets. Math Probl Eng 2020; 2020.##https://doi.org/10.1155/2020/2836236.
27. Avuçlu E, Elen A. Evaluation of train and test performance of machine learning algorithms and Parkinson diagnosis with statistical measurements. Med Biol Eng Comput 2020; 58: 2775-2788.
28. Anguita D, Ghelardoni L, Ghio A, Oneto L, Ridella S, editors. The 'K' in K-fold cross validation. ESANN 2012.
29. Wong TT, Yeh PY. Reliable accuracy estimates from k-fold cross validation. IEEE Trans Knowl Data Eng 2019; 32: 1586-1594.##https://doi.org/10.1109/TKDE.2019.2912815.
30. Fushiki T. Estimation of prediction error by using K-fold cross-validation. Stat Comput 2011; 21: 137-146.##https://doi.org/10.1007/s11222-009-9153-8.
31. Lewis GA, Bellomo S, Ozkaya I, editors. Characterizing and detecting mismatch in machine-learning-enabled systems. 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN); 2021: IEEE.##https://doi.org/10.1109/WAIN52551.2021.00028.
32. Althnian A, AlSaeed D, Al-Baity H, Samha A, Dris AB, Alzakari N, et al. Impact of dataset size on classification performance: An empirical evaluation in the medical domain. Appl Sci 2021; 11: 796.##https://doi.org/10.3390/app11020796.
33. Kavzoglu T. Increasing the accuracy of neural network classification using refined training data. Environ Model Software 2009; 24: 850-858.##https://doi.org/10.1016/j.envsoft.2008.11.012.
34. Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: A new perspective. Neurocomputing 2018; 300: 70-79.##https://doi.org/10.1016/j.neucom.2017.11.077.
35. Guyon I, Elisseeff A. An introduction to feature extraction. Feature extraction: foundations and applications: Springer; 2006. p. 1-25.##https://doi.org/10.1007/978-3-540-35488-8_1.
36. Veeramachaneni S, Olivetti E, Avesani P, editors. Active sampling for detecting irrelevant features. Proceedings of the 23rd International Conference on Machine Learning; 2006.##https://doi.org/10.1145/1143844.1143965.