Aim:
To improve the accuracy and efficiency of cyberbullying detection in social media text by using DistilBERT, a compact transformer model, to better handle ambiguous language and overlapping classes.
Abstract:
Cyberbullying detection on social media remains challenging because of informal language, emotional subjectivity, and the subtle overlap between harmful and non-harmful expressions. Existing solutions, including IEEE studies that integrate sentiment analysis with machine learning, still struggle with ambiguity, feature sparsity, and a lack of interpretability. This work introduces an enhanced detection framework that uses DistilBERT (M1) for fine-grained contextual encoding and LIME (M2) for transparent, human-interpretable explanations. DistilBERT improves the separation of overlapping bullying categories, while LIME highlights the tokens that most influence each prediction. Experiments on large Twitter datasets show that the framework outperforms deep learning baselines such as LSTM, BiLSTM, and BERT. The added explainability strengthens trust and supports real-time forensic applications. The approach therefore improves detection accuracy while supplying the interpretability that current sentiment-analysis-driven models lack.
Proposed System:
The proposed system integrates DistilBERT (M1) for contextual encoding and LIME (M2) for explainable visualization of predictions. DistilBERT processes tweets at a deeper semantic level and captures subtle cues, while LIME highlights the tokens that contribute to a cyberbullying classification. Together they address both the accuracy and the interpretability gaps identified in existing transformer and sentiment-analysis approaches. The system comprises a preprocessing stage, fine-tuning of DistilBERT, and a deployable interface for real-time tweet analysis.
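The preprocessing stage is not specified in detail here, so the following is only a minimal sketch of what tweet normalization before tokenization might look like. The function name `preprocess_tweet`, the placeholder strings `<url>` and `<user>`, and every normalization rule below are assumptions chosen for illustration, not the project's actual pipeline.

```python
import re

# Illustrative patterns only; the project's real preprocessing rules are not given.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character
SPACE_RE = re.compile(r"\s+")


def preprocess_tweet(text: str) -> str:
    """Normalize a raw tweet before it is passed to a tokenizer (hypothetical helper)."""
    text = URL_RE.sub("<url>", text)        # replace links with a placeholder
    text = MENTION_RE.sub("<user>", text)   # hide user handles
    text = HASHTAG_RE.sub(r"\1", text)      # keep the hashtag word, drop '#'
    text = REPEAT_RE.sub(r"\1\1", text)     # squeeze "sooooo" to "soo"
    return SPACE_RE.sub(" ", text).strip().lower()


if __name__ == "__main__":
    print(preprocess_tweet("@bob you are sooooo DUMB!!! http://t.co/x #loser"))
```

Replacing handles and links with placeholders keeps the signal that a user was addressed or a link was shared without letting the model memorize specific accounts or URLs. Whether to lowercase depends on the checkpoint being fine-tuned; an uncased DistilBERT tokenizer would lowercase the text anyway.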
Advantages:
- High performance in detecting cyberbullying-related hate speech.
- High accuracy: DistilBERT achieved high accuracy on the 100k-tweet dataset.
- Handling ambiguity: DistilBERT's transformer-based architecture helps resolve ambiguous wording and improves the model's ability to detect subtle variations in cyberbullying.
- Explainable predictions through LIME token-level highlighting.
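To give a feel for the token-level highlighting mentioned in the list above, here is a simplified sketch. It is not LIME itself: LIME fits a local surrogate model over many random perturbations of the input, whereas this example uses plain leave-one-out occlusion, which conveys the same intuition in a few lines. The scorer `toy_bullying_score` and its tiny lexicon are hypothetical stand-ins for the fine-tuned DistilBERT classifier.

```python
from typing import Callable, List, Tuple


def toy_bullying_score(text: str) -> float:
    """Hypothetical stand-in for the classifier: 0.5 per lexicon hit, capped at 1.0."""
    lexicon = {"stupid", "loser", "ugly"}  # illustrative words only
    hits = sum(1 for tok in text.lower().split() if tok in lexicon)
    return min(1.0, 0.5 * hits)


def token_importance(
    text: str, predict: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Rank tokens by how much the score drops when each one is removed."""
    tokens = text.split()
    base = predict(text)
    scores = []
    for i, tok in enumerate(tokens):
        # Rebuild the text without token i and measure the change in score.
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append((tok, base - predict(reduced)))
    # Largest drop first: these are the tokens to highlight.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for tok, weight in token_importance("you are a stupid loser", toy_bullying_score):
        print(f"{tok:>8s}  {weight:+.2f}")
```

In a real deployment, `predict` would wrap the fine-tuned model's probability for the bullying class, and the ranked tokens would drive the highlighting shown in the interface.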





