Aim:
    This study aims to develop an efficient and scalable system for multi-class classification of URLs into Phishing, Benign, Defacement, and Malware categories using the lightweight and context-aware DistilBERT model.
Abstract:
    URLs are common vectors for a range of cyber threats, including phishing, defacement, and malware distribution. Traditional binary detection systems often fall short in recognizing and differentiating between multiple threat types. This research proposes a DistilBERT-based framework that performs multi-class classification of URLs into four categories: Phishing, Benign, Defacement, and Malware. DistilBERT's ability to understand deep contextual relationships in textual data enables it to capture subtle cues embedded in URLs, allowing for accurate and efficient detection without the need for complex or high-overhead models.
Proposed Method:
    This study proposes a multi-class classification system using the DistilBERT model to categorize URLs as Phishing, Benign, Defacement, or Malware. The model is fine-tuned on a labeled dataset of URLs, where each URL belongs to one of the four classes. DistilBERT processes the tokenized URL strings to extract semantic features for effective classification across these threat categories.
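
    The setup described above could be sketched roughly as follows with the Hugging Face transformers library. This is a minimal illustration, not the study's actual implementation: the checkpoint name "distilbert-base-uncased", the label order, the 128-token limit, and the helper names build_classifier and classify are all assumptions made for the example.

```python
# Assumed label order; the study does not specify how classes map to ids.
LABELS = ["benign", "phishing", "defacement", "malware"]
LABEL2ID = {name: idx for idx, name in enumerate(LABELS)}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def build_classifier(checkpoint="distilbert-base-uncased"):
    """Load a tokenizer and a DistilBERT model with a 4-way classification head.

    The checkpoint name is an assumption. The classification head is randomly
    initialized and must be fine-tuned on labeled URLs before use.
    """
    # Imported lazily so the label mappings can be used without the library.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model


def classify(url, tokenizer, model, max_length=128):
    """Return the predicted class name for a single URL string."""
    import torch

    # The URL is tokenized as plain text; long URLs are truncated.
    inputs = tokenizer(
        url, truncation=True, max_length=max_length, return_tensors="pt"
    )
    model.eval()
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 4)
    return ID2LABEL[logits.argmax(dim=-1).item()]
```

    Until the model has been fine-tuned on the labeled URL dataset, classify returns essentially arbitrary labels, because only the DistilBERT encoder weights are pretrained. Fine-tuning would typically minimize cross-entropy loss over the four classes.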





