Terjman-Ultra-v2.2-bs-4-lr-0.005-ep-3-wp-0.1-gacc-32-gnm-1.0-mx-512-v2.2
This model is a fine-tuned version of facebook/nllb-200-1.3B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.1788
- Bleu: 22.9949
- Chrf: 42.7982
- Ter: 81.779
- Gen Len: 12.4471
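TER (translation edit rate) counts the word edits needed to turn a hypothesis into the reference, as a percentage of reference length, so the final score of about 82 means roughly 0.82 edits per reference word. A minimal sketch of the idea, using plain Levenshtein distance over word tokens (the real TER metric also allows phrase shifts, so this sketch slightly overestimates the true score):

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance / reference length * 100 (no phrase shifts)."""
    hyp, ref = hypothesis.split(), reference.split()
    # Classic Levenshtein dynamic-programming recurrence over word tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)

print(simple_ter("the cat sat", "the cat sat"))     # identical -> 0.0
print(simple_ter("a cat sat down", "the cat sat"))  # 2 edits over 3 words
```

For reported scores, use an established implementation such as sacrebleu's TER rather than this sketch.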
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
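The hyperparameters above imply an effective batch size of 4 × 32 = 128 and a learning rate that warms up linearly over the first 10% of optimizer steps before decaying linearly to zero. A small sketch of both (the total step count of 8300 is an assumption, based on the last logged step of 8200 at ~2.96 epochs, not a logged value):

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    # 4 * 32 * 1 = 128, matching total_train_batch_size above.
    return per_device_batch * grad_accum_steps * num_devices

def linear_warmup_lr(step: int, total_steps: int, peak_lr: float = 0.005,
                     warmup_ratio: float = 0.1) -> float:
    # Linear warmup to peak_lr over the first warmup_ratio of steps,
    # then linear decay to zero, as in transformers' linear scheduler.
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

print(effective_batch_size(4, 32))              # 128
total = 8300  # assumed optimizer-step count for 3 epochs
print(linear_warmup_lr(830, total))             # 0.005 (end of warmup)
print(linear_warmup_lr(total, total))           # 0.0 (end of training)
```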
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len |
---|---|---|---|---|---|---|---|
43.8491 | 0.0361 | 100 | 2.5695 | 17.2289 | 37.3828 | 89.9572 | 12.6812 |
37.8172 | 0.0723 | 200 | 2.4585 | 18.2812 | 37.9854 | 88.7129 | 12.6447 |
41.0438 | 0.1084 | 300 | 2.5096 | 18.8162 | 38.8708 | 87.3215 | 12.5471 |
42.4993 | 0.1446 | 400 | 2.5739 | 17.0759 | 37.6617 | 92.179 | 12.7612 |
46.4173 | 0.1807 | 500 | 2.5927 | 17.7493 | 37.0782 | 90.7712 | 12.7153 |
50.3046 | 0.2169 | 600 | 2.7970 | 15.8428 | 34.6685 | 92.0702 | 12.5482 |
51.0591 | 0.2530 | 700 | 2.7518 | 13.7852 | 32.3087 | 122.4174 | 12.7729 |
53.5123 | 0.2892 | 800 | 2.8505 | 15.0263 | 33.5287 | 103.083 | 13.0224 |
54.1577 | 0.3253 | 900 | 2.8056 | 15.0231 | 33.3181 | 111.4563 | 13.2682 |
51.5253 | 0.3615 | 1000 | 2.7933 | 15.0815 | 32.7664 | 113.3758 | 13.4812 |
48.9478 | 0.3976 | 1100 | 2.7655 | 16.2933 | 34.243 | 94.6244 | 12.8918 |
47.938 | 0.4338 | 1200 | 2.7612 | 16.456 | 34.2244 | 100.909 | 13.7153 |
45.326 | 0.4699 | 1300 | 2.6848 | 16.6214 | 34.6197 | 98.8592 | 12.8941 |
43.4495 | 0.5061 | 1400 | 2.6833 | 16.4706 | 34.2734 | 93.3621 | 12.7494 |
42.9919 | 0.5422 | 1500 | 2.6799 | 17.0703 | 34.625 | 97.7785 | 12.7 |
41.6122 | 0.5783 | 1600 | 2.6150 | 17.075 | 34.9678 | 95.8607 | 12.8212 |
39.7837 | 0.6145 | 1700 | 2.5757 | 18.1852 | 35.6426 | 95.2936 | 13.0435 |
39.4488 | 0.6506 | 1800 | 2.6387 | 17.3216 | 36.3273 | 101.5896 | 13.5753 |
37.5611 | 0.6868 | 1900 | 2.6270 | 17.7145 | 35.9937 | 88.7923 | 12.7388 |
35.9422 | 0.7229 | 2000 | 2.6597 | 17.6019 | 36.3482 | 92.8667 | 12.8165 |
35.7019 | 0.7591 | 2100 | 2.5452 | 17.7785 | 36.4548 | 106.5095 | 13.5741 |
34.3932 | 0.7952 | 2200 | 2.5148 | 15.9903 | 34.5921 | 98.0078 | 12.8953 |
33.1799 | 0.8314 | 2300 | 2.4486 | 18.1216 | 35.6711 | 87.5631 | 12.2859 |
32.7152 | 0.8675 | 2400 | 2.4996 | 18.7002 | 37.2594 | 90.4309 | 12.8471 |
31.5079 | 0.9037 | 2500 | 2.4359 | 18.1631 | 37.2894 | 88.5531 | 12.5518 |
30.6564 | 0.9398 | 2600 | 2.4656 | 18.9625 | 38.9211 | 88.7146 | 12.7259 |
29.5573 | 0.9760 | 2700 | 2.4242 | 18.6898 | 37.6113 | 88.1869 | 12.4588 |
27.8906 | 1.0119 | 2800 | 2.4335 | 19.9443 | 38.6642 | 95.4597 | 13.2976 |
26.8031 | 1.0481 | 2900 | 2.4132 | 20.279 | 39.0184 | 87.0889 | 12.6753 |
26.3385 | 1.0842 | 3000 | 2.4161 | 19.4296 | 38.2119 | 87.4356 | 12.5035 |
26.1276 | 1.1204 | 3100 | 2.4171 | 19.6523 | 37.982 | 86.3196 | 12.7235 |
25.269 | 1.1565 | 3200 | 2.4185 | 19.2776 | 38.4189 | 88.6809 | 12.7 |
24.6057 | 1.1927 | 3300 | 2.4177 | 18.9888 | 37.8358 | 86.3955 | 12.4953 |
24.2829 | 1.2288 | 3400 | 2.3645 | 19.4833 | 38.1572 | 88.7418 | 12.6929 |
23.7796 | 1.2650 | 3500 | 2.3611 | 19.9181 | 38.7512 | 91.6993 | 12.9671 |
23.3656 | 1.3011 | 3600 | 2.4021 | 20.3132 | 39.1634 | 85.3322 | 12.4976 |
22.9616 | 1.3372 | 3700 | 2.3549 | 19.5902 | 38.7572 | 86.4413 | 12.5306 |
22.6219 | 1.3734 | 3800 | 2.3330 | 20.4724 | 40.3681 | 85.8608 | 12.5459 |
22.1655 | 1.4095 | 3900 | 2.2768 | 21.042 | 40.3518 | 84.8207 | 12.4212 |
21.465 | 1.4457 | 4000 | 2.2969 | 21.1681 | 39.6366 | 85.3886 | 12.4294 |
21.3829 | 1.4818 | 4100 | 2.3233 | 20.0998 | 39.8767 | 85.702 | 12.4871 |
21.2687 | 1.5180 | 4200 | 2.2875 | 21.3475 | 40.1695 | 85.7872 | 12.6141 |
20.4221 | 1.5541 | 4300 | 2.2167 | 21.6864 | 40.2743 | 83.9404 | 12.4353 |
20.4634 | 1.5903 | 4400 | 2.2713 | 20.9198 | 40.4753 | 86.215 | 12.5776 |
19.923 | 1.6264 | 4500 | 2.2585 | 20.287 | 40.0784 | 84.9928 | 12.4341 |
19.5794 | 1.6626 | 4600 | 2.2550 | 20.648 | 40.5857 | 86.0858 | 12.7694 |
19.4394 | 1.6987 | 4700 | 2.2104 | 21.9344 | 41.1899 | 83.1 | 12.4929 |
18.6122 | 1.7349 | 4800 | 2.2093 | 22.767 | 41.4871 | 82.5461 | 12.6235 |
18.6992 | 1.7710 | 4900 | 2.2190 | 20.9887 | 41.0 | 85.8985 | 12.8235 |
18.2776 | 1.8072 | 5000 | 2.2146 | 20.9857 | 41.2827 | 83.8972 | 12.4824 |
17.6287 | 1.8433 | 5100 | 2.1996 | 21.9022 | 41.5448 | 85.0591 | 12.7012 |
17.9071 | 1.8795 | 5200 | 2.2085 | 22.0738 | 41.5302 | 82.5875 | 12.5071 |
16.8171 | 1.9156 | 5300 | 2.2099 | 21.7048 | 41.25 | 82.833 | 12.5141 |
16.9345 | 1.9517 | 5400 | 2.1895 | 22.0361 | 41.4987 | 82.2035 | 12.5094 |
16.7839 | 1.9879 | 5500 | 2.1839 | 21.9444 | 41.5776 | 85.0539 | 12.7247 |
13.937 | 2.0239 | 5600 | 2.2415 | 21.8045 | 41.7823 | 82.9828 | 12.48 |
13.8213 | 2.0600 | 5700 | 2.2454 | 21.8772 | 41.4813 | 83.3942 | 12.5624 |
14.1685 | 2.0962 | 5800 | 2.2389 | 21.463 | 41.9267 | 84.3062 | 12.5435 |
13.8483 | 2.1323 | 5900 | 2.2024 | 22.6018 | 41.8719 | 81.5109 | 12.3553 |
13.5261 | 2.1684 | 6000 | 2.2463 | 22.011 | 41.9225 | 82.6643 | 12.4235 |
13.6918 | 2.2046 | 6100 | 2.2273 | 22.7619 | 42.3633 | 81.7888 | 12.4012 |
13.2274 | 2.2407 | 6200 | 2.2184 | 22.3117 | 41.7462 | 82.1613 | 12.4035 |
13.2595 | 2.2769 | 6300 | 2.2420 | 22.554 | 41.7048 | 82.5436 | 12.4624 |
13.229 | 2.3130 | 6400 | 2.2358 | 22.8454 | 42.2844 | 82.2621 | 12.5412 |
12.8033 | 2.3492 | 6500 | 2.2154 | 23.1289 | 42.5379 | 81.5014 | 12.4094 |
12.8587 | 2.3853 | 6600 | 2.2180 | 21.9732 | 41.935 | 83.0024 | 12.5071 |
12.4347 | 2.4215 | 6700 | 2.2110 | 22.446 | 42.1488 | 81.8244 | 12.4976 |
12.4909 | 2.4576 | 6800 | 2.1824 | 22.3089 | 42.3126 | 81.7373 | 12.4812 |
12.4221 | 2.4938 | 6900 | 2.1763 | 22.7566 | 42.4346 | 82.1514 | 12.5341 |
12.3757 | 2.5299 | 7000 | 2.2033 | 22.7765 | 42.3882 | 84.5552 | 12.7024 |
12.1889 | 2.5661 | 7100 | 2.1985 | 23.0604 | 42.4498 | 81.9895 | 12.4859 |
11.8595 | 2.6022 | 7200 | 2.1895 | 22.7264 | 42.4201 | 81.7662 | 12.5082 |
12.3815 | 2.6384 | 7300 | 2.1769 | 22.531 | 42.608 | 82.6642 | 12.5141 |
12.107 | 2.6745 | 7400 | 2.1906 | 22.9567 | 42.8802 | 81.617 | 12.44 |
11.7338 | 2.7106 | 7500 | 2.1932 | 22.8328 | 42.879 | 81.8005 | 12.4553 |
11.6615 | 2.7468 | 7600 | 2.1772 | 22.9404 | 42.6161 | 81.7249 | 12.4271 |
12.2163 | 2.7829 | 7700 | 2.1904 | 22.9762 | 42.8218 | 81.7552 | 12.4565 |
11.766 | 2.8191 | 7800 | 2.1768 | 22.8033 | 42.8082 | 82.034 | 12.4082 |
11.5269 | 2.8552 | 7900 | 2.1870 | 22.8658 | 42.5434 | 81.7954 | 12.4435 |
11.6161 | 2.8914 | 8000 | 2.1844 | 23.2422 | 42.986 | 81.6856 | 12.4788 |
11.6672 | 2.9275 | 8100 | 2.1809 | 22.9331 | 42.725 | 81.7888 | 12.4353 |
11.6776 | 2.9637 | 8200 | 2.1788 | 22.9949 | 42.7982 | 81.779 | 12.4471 |
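Note that the final checkpoint (step 8200) has the lowest validation loss, but step 8000 logged the highest BLEU (23.2422), so the "best" row depends on the selection criterion. A minimal sketch of picking a checkpoint from log rows like those above (the excerpt below is hardcoded from the last three table rows for illustration):

```python
# Each tuple is (step, validation_loss, bleu), copied from the table above.
rows = [
    (8000, 2.1844, 23.2422),
    (8100, 2.1809, 22.9331),
    (8200, 2.1788, 22.9949),
]

best_by_bleu = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])
print(best_by_bleu[0])  # 8000: highest BLEU
print(best_by_loss[0])  # 8200: lowest validation loss
```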
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for BounharAbdelaziz/Terjman-Ultra-v2.2-bs-4-lr-0.005-ep-3-wp-0.1-gacc-32-gnm-1.0-mx-512-v2.2
- Base model: facebook/nllb-200-1.3B