Aim:
The primary aim of this study is to develop a robust and accurate auxiliary diagnostic system for breast cancer by integrating machine learning techniques with a hybrid data preprocessing strategy.
Abstract:
Breast cancer has emerged as the most common cancer among women globally, posing significant challenges to early diagnosis and effective treatment. This study focuses on developing a robust and efficient breast cancer auxiliary diagnosis model by integrating advanced data processing techniques and machine learning methodologies. To address issues such as sample imbalance, a hybrid data preprocessing strategy is employed, enhancing the quality and separability of the data.
Feature selection methods are applied in a two-step process to reduce dimensionality and improve the model's generalization ability. Various machine learning models are utilized for classification and prediction, with optimal configurations for each model identified. The experiments, conducted on the Wisconsin Diagnostic Breast Cancer dataset, demonstrate that the proposed approach achieves superior performance compared to prior methods, offering a more accurate and efficient solution for breast cancer diagnosis.
Introduction:
According to the estimates in Cancer Statistics, 2023, breast cancer, lung cancer, and colorectal cancer (CRC) account for 52% of all new cancer diagnoses in women, with breast cancer alone accounting for 31% of female cancers. As one of the most common malignant tumors in women, breast cancer has become a focus of public health attention around the world. Early diagnosis is important for successful treatment and patient survival. With the rapid development of machine learning and related technologies, a growing body of research applies these techniques to breast cancer diagnosis and assisted decision making.
Machine learning, an important branch of artificial intelligence, can extract features, discover patterns, and build predictive models from large amounts of medical data. It can assist doctors in identifying high-risk groups during early screening, and it can also support accurate diagnosis and the development of personalized treatment plans.
Existing System:
The existing system employed the Wisconsin Breast Cancer Diagnostic (WBCD) dataset to develop a voting ensemble classifier by integrating four machine learning models: Extra Trees Classifier (ETC), Light Gradient Boosting Machine (LightGBM), Ridge Classifier (RC), and Linear Discriminant Analysis (LDA). Various evaluation metrics were used to assess the performance and efficiency of the ensemble compared to the individual classifiers and other state-of-the-art methods. The findings suggest that the ensemble-based approach delivers superior results, offering a robust solution for breast cancer detection and diagnosis.
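A voting ensemble of this kind can be sketched with scikit-learn. This is a minimal illustration and not the existing system's actual code: it assumes scikit-learn's bundled copy of the Wisconsin diagnostic data (`load_breast_cancer`), swaps LightGBM for scikit-learn's `GradientBoostingClassifier` so that no extra package is needed, and uses hard (majority) voting because `RidgeClassifier` does not produce class probabilities. The split ratio and hyperparameters are assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# scikit-learn's bundled Wisconsin diagnostic data (569 samples, 30 features)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0  # assumed 80/20 split
)

ensemble = VotingClassifier(
    estimators=[
        ("etc", ExtraTreesClassifier(n_estimators=200, random_state=0)),
        # Stand-in for LightGBM (assumption: any gradient-boosted tree model)
        ("gbm", GradientBoostingClassifier(random_state=0)),
        # Ridge is scale-sensitive, so it gets its own scaler
        ("rc", make_pipeline(StandardScaler(), RidgeClassifier())),
        ("lda", LinearDiscriminantAnalysis()),
    ],
    voting="hard",  # majority vote over predicted labels
)
ensemble.fit(X_train, y_train)

accuracy = ensemble.score(X_test, y_test)
print(f"hold-out accuracy: {accuracy:.3f}")
```

With four voters, ties are possible under hard voting; scikit-learn resolves them in favour of the lower class label, which is one reason an odd number of voters or soft voting is sometimes preferred.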
Proposed System:
The incidence and mortality of breast cancer are increasing year by year, and it has become the most common cancer among women worldwide. In the medical field, successful management of breast cancer relies heavily on early detection: the earlier treatment begins, the better the clinical outcome for patients. First, in the preprocessing stage, the categorical values found in the data are encoded as numerical values using factorize.
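The encoding step can be illustrated with `pandas.factorize`. This is a small sketch on a hypothetical toy frame; the column names `diagnosis` and `radius_mean` and the M/B labels are assumptions chosen to resemble the Wisconsin data, not values taken from the study.

```python
import pandas as pd

# Hypothetical toy data: "M" = malignant, "B" = benign (assumed labels)
df = pd.DataFrame({
    "diagnosis": ["M", "B", "B", "M", "B"],
    "radius_mean": [17.99, 13.54, 12.45, 20.57, 11.42],
})

# factorize assigns integer codes in order of first appearance
codes, categories = pd.factorize(df["diagnosis"])
df["diagnosis"] = codes

print(list(categories))          # ['M', 'B']
print(df["diagnosis"].tolist())  # [0, 1, 1, 0, 1]
```

Because the codes follow order of first appearance, the mapping can differ between files; keeping `categories` around makes it possible to decode predictions back to the original labels.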
A combined SMOTE sampling method is then used to address sample imbalance. Next, the features of the dataset are screened using the mutual information method, and recursive feature elimination based on Logistic Regression is applied to derive the best feature subset. Finally, four machine learning models (Random Forest, SVM, KNN, and Gradient Boosting) are used for classification and prediction. The experimental results show that the Random Forest (RF) model gives the best predictions, achieving the highest accuracy of the four models and outperforming previous research methods.
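The steps above can be sketched end to end as follows. This is an illustrative outline under several assumptions, not the study's implementation: it uses scikit-learn's bundled Wisconsin diagnostic data, a hand-written simplified SMOTE (`smote_oversample`, a hypothetical helper standing in for a library such as imbalanced-learn, and without the cleaning step that a combined method would add), standardization before resampling, and arbitrary feature counts (20 after mutual information, 10 after RFE).

```python
from functools import partial

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def smote_oversample(X, y, k=5, seed=0):
    """Simplified binary SMOTE: synthesize minority samples by interpolating
    between a minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    # k + 1 neighbours because each point is its own nearest neighbour
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[neigh] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: standardize first so neighbour distances are comparable
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Resample the training split only, so the test set stays untouched
X_bal, y_bal = smote_oversample(X_train, y_train)

# Step 1: mutual-information filter (k=20 is an assumed cut-off)
mi = SelectKBest(partial(mutual_info_classif, random_state=0), k=20)
mi.fit(X_bal, y_bal)

# Step 2: RFE with logistic regression (10 features is an assumed target)
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
rfe.fit(mi.transform(X_bal), y_bal)


def select(features):
    """Apply both selection steps in order."""
    return rfe.transform(mi.transform(features))


models = {
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(select(X_bal), y_bal)
    scores[name] = accuracy_score(y_test, model.predict(select(X_test)))
    print(f"{name}: {scores[name]:.3f}")
```

Fitting the scaler, the resampler, and both selectors on the training split alone keeps information from the test set out of the pipeline; which model ranks first in this sketch depends on the split and settings and need not match the reported results.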