Aim:
    To propose an approach that improves the accuracy and efficiency of cyberbullying detection in social media text by using the DistilBERT model to address ambiguity and fine-grained classification challenges.
Abstract:
    Cyberbullying detection on social media platforms is challenging due to the informal, ambiguous, and context-dependent nature of language. This project proposes an approach to enhance the accuracy of fine-grained cyberbullying classification using the DistilBERT model. By leveraging DistilBERT, the system is expected to improve classification accuracy on large tweet datasets. The working hypothesis is that this approach can outperform both traditional machine learning models and other deep learning models.
Existing System:
    Previous approaches to cyberbullying detection have primarily utilized BERT (Bidirectional Encoder Representations from Transformers), achieving notable accuracy levels. However, BERT has demonstrated difficulty in handling ambiguous and overlapping cyberbullying categories, leading to potential misclassifications. The limitations of BERT are particularly evident in distinguishing subtle differences in various types of cyberbullying due to the complexities of informal, context-dependent language used on social media.
Problem Definition:
    Cyberbullying detection is inherently difficult due to the informal and context-dependent nature of language on social media platforms. Existing models often struggle with ambiguity, particularly when a text does not clearly fit into predefined categories. This project seeks to explore a novel approach to enhance fine-grained classification and improve detection accuracy, particularly when encountering overlapping categories of cyberbullying-related hate speech.
Proposed System:
    This project proposes the use of the DistilBERT model to classify various types of cyberbullying, with a focus on fine-grained detection. The system relies on DistilBERT's transformer-based contextual representations to handle ambiguous cases. The model will be evaluated on two tweet datasets: one containing 47,000 samples and a larger one containing 100,000 samples. The hypothesis is that DistilBERT can significantly improve cyberbullying detection performance compared to existing models.
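    A minimal sketch of how such a classifier might be set up is given here, assuming the Hugging Face transformers library and the public distilbert-base-uncased checkpoint; the source names neither. The label names, the max_length value, and the helper functions build_label_maps, load_classifier, and predict are hypothetical placeholders, since the project does not list its categories or preprocessing settings.

```python
# Hypothetical label set: the project description does not list its categories.
LABELS = ["not_cyberbullying", "type_a", "type_b", "type_c"]


def build_label_maps(labels):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_classifier(labels, checkpoint="distilbert-base-uncased"):
    """Load a DistilBERT tokenizer and a sequence-classification model.

    The checkpoint name is an assumption. The classification head is newly
    initialised here, so the model must be fine-tuned on labelled tweets
    before its predictions mean anything.
    """
    # Imported lazily so the label helpers work without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


def predict(texts, tokenizer, model):
    """Return the predicted label name for each text in a list of tweets."""
    import torch

    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=128,  # assumed cap; tweets are short
        return_tensors="pt",
    )
    model.eval()
    with torch.no_grad():
        logits = model(**batch).logits
    return [model.config.id2label[i] for i in logits.argmax(dim=-1).tolist()]
```

    After fine-tuning, a call such as predict(["example tweet"], tokenizer, model) would return one label name per input text.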
Advantages:
- Potential for Improved Accuracy: The proposed approach aims to enhance the detection of cyberbullying with DistilBERT.
- Handling Ambiguity: DistilBERT, with its transformer-based architecture, is expected to address ambiguity effectively and improve the model’s ability to detect subtle variations in cyberbullying.
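Because the claimed advantages concern accuracy across fine-grained and potentially overlapping categories, per-class scores can be more informative than overall accuracy alone. The following is a rough sketch of per-class and macro-averaged F1 in plain Python; the choice of F1 as the metric and the function names per_class_f1 and macro_f1 are assumptions, as the source does not specify its evaluation metrics.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Compute the F1 score of each class from two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2TP / (2TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0
```

Comparing these per-class scores between a BERT baseline and DistilBERT on the same test split would show whether any gain comes from the ambiguous categories or only from the easy ones.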