MouezYazidi committed on
Commit a31104c · verified · 1 Parent(s): b118c82

Update README.md

Files changed (1)
  1. README.md +69 -21
README.md CHANGED
@@ -7,28 +7,86 @@ tags:
  model-index:
  - name: modernBERT-base-bilingual-CampingReviewsSentiment
  results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # modernBERT-base-bilingual-CampingReviewsSentiment

- This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.5419

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

@@ -43,20 +101,10 @@ The following hyperparameters were used during training:
  - lr_scheduler_type: linear
  - num_epochs: 5

- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | No log | 1.0 | 400 | 0.4953 |
- | 0.5332 | 2.0 | 800 | 0.3315 |
- | 0.2257 | 3.0 | 1200 | 0.3839 |
- | 0.0637 | 4.0 | 1600 | 0.5029 |
- | 0.0357 | 5.0 | 2000 | 0.5419 |
-

  ### Framework versions

  - Transformers 4.48.0.dev0
  - Pytorch 2.5.0+cu124
  - Datasets 3.1.0
- - Tokenizers 0.21.0
 
  model-index:
  - name: modernBERT-base-bilingual-CampingReviewsSentiment
  results: []
+ datasets:
+ - MouezYazidi/campSentiment-Bilingual
+ language:
+ - fr
+ - en
+ pipeline_tag: text-classification
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # MouezYazidi/modernBERT-base-bilingual-CampingReviewsSentiment

+ **modernBERT-base-bilingual-CampingReviewsSentiment** is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
+ trained on the bilingual sentiment dataset [MouezYazidi/campSentiment-Bilingual](https://huggingface.co/datasets/MouezYazidi/campSentiment-Bilingual).

+ The model supports bilingual sentiment classification (English and French).
+ ## Model Evaluation

+ After fine-tuning, the model was evaluated on the test set of [MouezYazidi/campSentiment-Bilingual](https://huggingface.co/datasets/MouezYazidi/campSentiment-Bilingual):
+ | Class | Precision | Recall | F1-Score | Support |
+ |-------|-----------|--------|----------|---------|
+ | 0 | 0.88 | 0.67 | 0.76 | 102 |
+ | 1 | 0.89 | 0.94 | 0.93 | 298 |
+ | **Accuracy** | | | **0.89** | 400 |
+ | **Macro Avg** | 0.89 | 0.82 | 0.85 | 400 |
+ | **Weighted Avg** | 0.89 | 0.89 | 0.89 | 400 |
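
The table layout resembles a standard classification report. Assuming that convention, the macro average is the unweighted mean of the per-class scores and the weighted average weights each class by its support. The sketch below recomputes both in plain Python from the rounded per-class values in the table; the helper names `macro_avg` and `weighted_avg` are illustrative and not part of any library. Because the inputs are already rounded to two decimals, the recomputed figures need not match the reported averages exactly.

```python
from typing import Sequence

# Rounded per-class values copied from the evaluation table (class 0, class 1).
precision = [0.88, 0.89]
recall = [0.67, 0.94]
f1 = [0.76, 0.93]
support = [102, 298]


def macro_avg(scores: Sequence[float]) -> float:
    """Unweighted mean of the per-class scores."""
    return sum(scores) / len(scores)


def weighted_avg(scores: Sequence[float], weights: Sequence[int]) -> float:
    """Mean of the per-class scores, weighted by class support."""
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total


for name, scores in [("precision", precision), ("recall", recall), ("f1", f1)]:
    print(
        f"{name}: macro={macro_avg(scores):.3f} "
        f"weighted={weighted_avg(scores, support):.3f}"
    )
```

Recomputing the averages this way is a quick consistency check on a hand-copied report.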

+ ## How to use

+ ### Requirements

+ Because **transformers** supports the **ModernBERT** architecture only from version `4.48.0.dev0` onward, use the following
+ command to install the required version:
+
+ ```bash
+ pip install "git+https://github.com/huggingface/transformers.git@6e0515e99c39444caae39472ee1b2fd76ece32f1" --upgrade
+ ```
+
+ Install **FlashAttention** to accelerate inference:
+
+ ```bash
+ pip install flash-attn==2.7.2.post1
+ ```
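
Before loading the model, it can help to confirm that the installed `transformers` release is recent enough. The following is a coarse sketch that compares only the major and minor version numbers against the `4.48` threshold named in the README; the helper names are hypothetical, and a development build that reports `4.48.0.dev0` but predates the pinned commit could still lack ModernBERT support.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Threshold taken from the README: ModernBERT support starts at 4.48.0.dev0.
MIN_MAJOR_MINOR = (4, 48)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.48.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_modernbert(version: str) -> bool:
    """Coarse check: True if major.minor is at least 4.48."""
    return parse_major_minor(version) >= MIN_MAJOR_MINOR


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    else:
        print(f"transformers {found}: ModernBERT-capable = {supports_modernbert(found)}")
```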
+
+ ### Quick start
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ # Set device (GPU if available, else CPU)
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # Load model and tokenizer
+ model_id = "MouezYazidi/modernBERT-base-bilingual-CampingReviewsSentiment"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForSequenceClassification.from_pretrained(model_id, torch_dtype=torch.float16).to(device)
+ model.eval()
+
+ def predict_sentiment(text: str):
+     """Predicts sentiment of the given text using the model."""
+     inputs = tokenizer([text], return_tensors="pt").to(device)
+
+     with torch.no_grad():  # Use no_grad for inference optimization
+         outputs = model(**inputs)
+         prediction = outputs.logits.argmax(dim=-1).item()
+
+     return 'positive' if prediction==1 else 'negative'
+
+ # Example usage
+ text = """
+ Place is amazing. Entertainment is next level brilliant
+ Pool areas excellent
+ Literally no complaints at all. Staff so friendly everywhere. Brought 2 teenagers they had a great time aswell as 3 and 9 year old
+ Fantastic time had by us all
+ """
+ prediction = predict_sentiment(text)
+ print(f"Predicted Sentiment: {prediction}")
+ ```
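
The quick start returns only the argmax label. A confidence score can be derived by applying a softmax to the two logits. The sketch below does this in plain Python so that it runs without downloading the model; the label mapping (1 is positive, 0 is negative) comes from the quick start, while the helper names and the example logits are hypothetical.

```python
import math
from typing import List, Tuple

# Label mapping taken from the quick start: index 1 is positive, index 0 is negative.
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits: List[float]) -> Tuple[str, float]:
    """Return the argmax label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for one review; with the real model they could be
# obtained from outputs.logits[0].float().tolist() inside predict_sentiment.
label, confidence = label_with_confidence([-1.3, 2.1])
print(f"{label} ({confidence:.2f})")
```

Returning the probability alongside the label makes it possible to route low-confidence reviews to manual inspection.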

  ## Training procedure

@@ -43,20 +101,10 @@ The following hyperparameters were used during training:
  - lr_scheduler_type: linear
  - num_epochs: 5
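
With `lr_scheduler_type: linear`, the learning rate decays linearly toward zero over the course of training. The sketch below reproduces that shape in plain Python under stated assumptions: the peak learning rate of `5e-5` and the zero warmup steps are placeholders, since neither value is visible in this hunk, and the total of 2000 steps is taken from the removed training results table (5 epochs of 400 steps each). The function name `linear_lr` is illustrative.

```python
# Assumed values: PEAK_LR and WARMUP_STEPS are placeholders, not taken from the commit.
PEAK_LR = 5e-5
WARMUP_STEPS = 0
# 5 epochs x 400 steps per epoch, per the removed training results table.
TOTAL_STEPS = 2000


def linear_lr(
    step: int,
    peak_lr: float = PEAK_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in range(6):
    step = epoch * 400
    print(f"epoch {epoch} (step {step}): lr = {linear_lr(step):.2e}")
```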

  ### Framework versions

  - Transformers 4.48.0.dev0
  - Pytorch 2.5.0+cu124
  - Datasets 3.1.0
+ - Tokenizers 0.21.0