Overhead Cross Section Sampling Machine Learning based Cervical Cancer Risk Factors Prediction

A. Peter Soosai Anandaraj, M. Shyamala Devi, J. Amutharaj, M. Dineshkumar

Published: 2021-07-21

Abstract

Most forms of human papillomavirus can cause changes in a woman's cervix that may lead to cervical cancer in the long run, while others can produce genital or epidermal tumors. Cervical cancer is a leading cause of morbidity and mortality among women in low- and middle-income countries. Predicting cervical cancer remains an open challenge because several risk factors affect the cervix. With this in mind, the cervical cancer risk factor dataset from the Kaggle data repository is used to predict cervical cancer risk classes. First, the dataset is normalised by handling incomplete data and applying pattern calibration. Second, interpretive data analysis is carried out, and the distribution of the cervical cancer risk target feature is visualised. Third, several classifiers are fitted to the unprocessed dataset, and their performance is measured before and after feature scaling. Fourth, oversampling methods are applied to the preprocessed dataset. Fifth, the datasets oversampled by the different methods are fitted to all the classifiers, and performance is compared before and after feature scaling. Sixth, performance is analysed using precision, recall, F-score, accuracy, and running time. The code is written in Python and executed in the Spyder environment through Anaconda Navigator. The experimental results show that the Random Forest classifier maintains 96% accuracy before and after scaling on the unprocessed dataset, and the same classifier maintains 98% accuracy with all the oversampling techniques.
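
The oversampling, feature scaling, and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not name the oversampling methods or the scaler used, so plain random oversampling with replacement and min-max scaling are assumed here. The function names (random_oversample, min_max_scale, binary_metrics) and the toy rows and labels are hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def min_max_scale(rows):
    """Rescale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / span if span else 0.0
         for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F-score and accuracy for a binary target."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "accuracy": accuracy}


# Hypothetical toy data: two numeric features, imbalanced risk labels.
rows = [[18, 0], [25, 1], [30, 2], [41, 3], [52, 1], [36, 0]]
labels = [0, 0, 0, 0, 1, 0]

balanced_rows, balanced_labels = random_oversample(rows, labels)
scaled_rows = min_max_scale(balanced_rows)
print(Counter(balanced_labels))

# Hypothetical predictions, used only to exercise the metric function.
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

In a full experiment, the scaled and unscaled versions of each oversampled dataset would be passed to every classifier, and the metrics and running time would be recorded for each combination.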

Issue

Vol. 12 No. 6 (2021)

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
