Model Card for Fine-tuned BERT-Base-Uncased on Phishing Site Classification

Model Details

Model Description

This model is a fine-tuned version of BERT-Base-Uncased for phishing site classification. The model predicts whether a website is classified as "Safe" or "Not Safe" based on textual input.

  • Developed by: shogun-the-great
  • Model type: Binary Classification (Safe vs Not Safe)
  • Language(s): English
  • License: Apache-2.0
  • Finetuned from model: google-bert/bert-base-uncased

Uses

Direct Use

This model can be directly used for phishing detection by classifying text into two categories: "Safe" and "Not Safe." Typical use cases include:

  • Integrating with browser extensions for real-time website classification.
  • Analyzing textual data for phishing indicators.
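For quick integrations such as the use cases above, the standard transformers `pipeline` API is the shortest path. The sketch below is illustrative, not part of this card's official usage: the raw label names ("LABEL_0"/"LABEL_1") and the `verdict` helper are assumptions about this checkpoint's config.

```python
from transformers import pipeline

# Model id from the "How to Get Started" section below; the raw label
# names ("LABEL_0"/"LABEL_1") are an assumption about this checkpoint.
MODEL_ID = "shogun-the-great/finetuned-bert-phishing-site-classification"

def verdict(label: str) -> str:
    """Map a raw classifier label to the card's Safe / Not Safe vocabulary."""
    return "Not Safe" if label.endswith("1") else "Safe"

if __name__ == "__main__":
    classifier = pipeline("text-classification", model=MODEL_ID)
    result = classifier("Verify your account now to avoid suspension!")[0]
    print(verdict(result["label"]), f"score={result['score']:.2f}")
```

A browser extension would typically call this behind an HTTP endpoint rather than loading the model client-side.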

Downstream Use

Users can fine-tune the model further for specific binary classification tasks or for datasets with similar domains.
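A minimal further fine-tuning sketch using the transformers `Trainer` is shown below. The toy dataset, hyperparameters, and `TextDataset` wrapper are all illustrative assumptions; substitute your own labeled corpus and tuned settings.

```python
import numpy as np
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "shogun-the-great/finetuned-bert-phishing-site-classification"

class TextDataset(torch.utils.data.Dataset):
    """Wrap raw texts and integer labels as a torch Dataset for Trainer."""
    def __init__(self, texts, labels, tokenizer):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels); report plain accuracy.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)
    # Toy data for illustration only; use your own labeled corpus.
    train_ds = TextDataset(
        ["Claim your prize now!", "Quarterly report attached."], [1, 0], tokenizer)
    args = TrainingArguments(output_dir="phishing-finetune",
                             num_train_epochs=1,
                             per_device_train_batch_size=2)
    trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                      compute_metrics=compute_metrics)
    trainer.train()
```

Keeping the label convention (0 = Safe, 1 = Not Safe) consistent with the base checkpoint avoids confusing downstream consumers of the model.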

Out-of-Scope Use

This model might not perform well for:

  • Non-English text.
  • Adversarial phishing attacks or heavily obfuscated text.
  • Tasks unrelated to text-based classification.

Bias, Risks, and Limitations

Bias

The model's predictions are shaped by the dataset used during fine-tuning. If the training data contains biases, these may be reflected in the model's predictions.

Risks

  • False positives: Legitimate websites flagged as phishing.
  • False negatives: Some phishing sites might not be detected.
  • Potential vulnerabilities to adversarial examples.

Recommendations

  • Regularly update the dataset and model to stay aligned with emerging phishing patterns.
  • Use in combination with other security measures for robust phishing detection.

How to Get Started with the Model

You can load the fine-tuned model directly from the Hugging Face Hub:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model from the Hugging Face Hub
model_name = "shogun-the-great/finetuned-bert-phishing-site-classification"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "Enter your login credentials to claim a free reward!"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Get the predicted label and its probability
probs = outputs.logits.softmax(dim=-1)
prediction = probs.argmax(dim=-1).item()
confidence = probs[0, prediction].item()
print(f"Prediction: {'Not Safe' if prediction == 1 else 'Safe'} ({confidence:.2f})")
Model size: 109M parameters (F32, Safetensors)