---
license: apache-2.0
base_model: distilbert-base-multilingual-cased
tags:
- generated_from_trainer
model-index:
- name: privacy-masknig
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# privacy-masknig

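This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on an unspecified dataset (the Trainer recorded no dataset name).
The results below match the epoch 3 checkpoint (step 186561), which had the lowest validation loss on the evaluation set: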
- Loss: 0.3686
- Overall Precision: 0.2885
- Overall Recall: 0.2143
- Overall F1: 0.2459
- Overall Accuracy: 0.8688
- Bod F1: 0.2375
- Building F1: 0.2871
- Cardissuer F1: 0.0
- City F1: 0.2540
- Country F1: 0.3055
- Date F1: 0.2341
- Driverlicense F1: 0.2233
- Email F1: 0.2654
- Geocoord F1: 0.1603
- Givenname1 F1: 0.2161
- Givenname2 F1: 0.1507
- Idcard F1: 0.2472
- Ip F1: 0.1851
- Lastname1 F1: 0.2296
- Lastname2 F1: 0.1305
- Lastname3 F1: 0.1245
- Pass F1: 0.1980
- Passport F1: 0.2792
- Postcode F1: 0.2794
- Secaddress F1: 0.2486
- Sex F1: 0.2933
- Socialnumber F1: 0.2258
- State F1: 0.2921
- Street F1: 0.2177
- Tel F1: 0.2409
- Time F1: 0.2893
- Title F1: 0.2814
- Username F1: 0.2368
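
As a sanity check on the figures above, the overall F1 can be recomputed from the reported overall precision and recall as their harmonic mean. This is a minimal sketch: the helper name `f1_score` is hypothetical, and the inputs are the rounded values from this card, so the last digit of a recomputed score may differ from the reported one in other cases.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in this card.
overall_f1 = f1_score(0.2885, 0.2143)
print(round(overall_f1, 4))  # 0.2459, matching the reported Overall F1
```

A label with no correct predictions, such as Cardissuer here, has precision and recall of zero and therefore an F1 of 0.0.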

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 5
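
To make the warmup and scheduler settings concrete, the sketch below derives the step counts and evaluates the learning rate at a few points. It rests on assumptions not stated in this card: the schedule is taken to be linear warmup followed by cosine decay with hard restarts, the number of cycles is taken to be 1, and the per-epoch step count (62187) is read from the training results table. The function `lr_at` is a hypothetical helper, not a library call.

```python
import math

BASE_LR = 5e-05
STEPS_PER_EPOCH = 62187   # step count at epoch 1.0 in the training results
NUM_EPOCHS = 5
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS   # 310935
# Warmup ratio 0.2, computed in integer arithmetic to avoid float rounding.
WARMUP_STEPS = TOTAL_STEPS * 2 // 10         # 62187, i.e. the whole first epoch
NUM_CYCLES = 1            # assumption: the card does not state a cycle count


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that restarts at the start of each cycle.
    cycle_pos = (NUM_CYCLES * progress) % 1.0
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))


for step in (0, WARMUP_STEPS, 186561, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at the end of epoch 1 and has decayed to half its peak by the end of epoch 3.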

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1  | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|:------:|:-----------:|:-------------:|:-------:|:----------:|:-------:|:----------------:|:--------:|:-----------:|:-------------:|:-------------:|:---------:|:------:|:------------:|:------------:|:------------:|:-------:|:-----------:|:-----------:|:-------------:|:------:|:---------------:|:--------:|:---------:|:------:|:-------:|:--------:|:-----------:|
| 0.4774        | 1.0   | 62187  | 0.4611          | 0.1764            | 0.1017         | 0.1291     | 0.8380           | 0.1353 | 0.1842      | 0.0           | 0.1255  | 0.2337     | 0.1185  | 0.0936           | 0.1261   | 0.0500      | 0.0893        | 0.0506        | 0.1041    | 0.1122 | 0.1241       | 0.0463       | 0.0020       | 0.0486  | 0.1080      | 0.1726      | 0.1540        | 0.2044 | 0.0886          | 0.1588   | 0.1239    | 0.1406 | 0.1667  | 0.1583   | 0.1386      |
| 0.4205        | 2.0   | 124374 | 0.4272          | 0.2372            | 0.1567         | 0.1887     | 0.8542           | 0.1831 | 0.2706      | 0.0           | 0.1923  | 0.2819     | 0.1821  | 0.1521           | 0.1863   | 0.1197      | 0.0997        | 0.0662        | 0.1473    | 0.1512 | 0.1443       | 0.0955       | 0.0527       | 0.1678  | 0.1997      | 0.2469      | 0.2066        | 0.2641 | 0.1827          | 0.2266   | 0.1602    | 0.1879 | 0.2372  | 0.2202   | 0.2069      |
| 0.3367        | 3.0   | 186561 | 0.3686          | 0.2885            | 0.2143         | 0.2459     | 0.8688           | 0.2375 | 0.2871      | 0.0           | 0.2540  | 0.3055     | 0.2341  | 0.2233           | 0.2654   | 0.1603      | 0.2161        | 0.1507        | 0.2472    | 0.1851 | 0.2296       | 0.1305       | 0.1245       | 0.1980  | 0.2792      | 0.2794      | 0.2486        | 0.2933 | 0.2258          | 0.2921   | 0.2177    | 0.2409 | 0.2893  | 0.2814   | 0.2368      |
| 0.301         | 4.0   | 248748 | 0.3734          | 0.3073            | 0.2484         | 0.2747     | 0.8737           | 0.2565 | 0.3272      | 0.1429        | 0.2634  | 0.3355     | 0.2707  | 0.2591           | 0.3032   | 0.2153      | 0.2458        | 0.1847        | 0.2757    | 0.2252 | 0.2594       | 0.1680       | 0.1551       | 0.2410  | 0.3080      | 0.2945      | 0.2488        | 0.3139 | 0.2522          | 0.3007   | 0.2447    | 0.2584 | 0.3107  | 0.2933   | 0.2880      |
| 0.2451        | 5.0   | 310935 | 0.3895          | 0.3091            | 0.2664         | 0.2862     | 0.8744           | 0.2720 | 0.3313      | 0.0           | 0.2773  | 0.3470     | 0.2803  | 0.2732           | 0.3109   | 0.2202      | 0.2554        | 0.1945        | 0.2899    | 0.2382 | 0.2539       | 0.1800       | 0.1651       | 0.2514  | 0.3156      | 0.2982      | 0.2720        | 0.3364 | 0.2695          | 0.3196   | 0.2561    | 0.2732 | 0.3169  | 0.3054   | 0.3020      |
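
The table shows validation loss bottoming out at epoch 3 while overall F1 keeps rising through epoch 5, so the choice of selection metric changes which checkpoint counts as best. The sketch below illustrates this with values copied from the table. The variable names are hypothetical, and the card does not say which criterion was actually used.

```python
# (epoch, validation_loss, overall_f1) copied from the training results table.
history = [
    (1, 0.4611, 0.1291),
    (2, 0.4272, 0.1887),
    (3, 0.3686, 0.2459),
    (4, 0.3734, 0.2747),
    (5, 0.3895, 0.2862),
]

# Lower validation loss is better; higher F1 is better.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[2])

print("best by validation loss: epoch", best_by_loss[0])
print("best by overall F1: epoch", best_by_f1[0])
```

Selecting by validation loss picks epoch 3, while selecting by overall F1 picks epoch 5.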


### Framework versions

- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1