Terjman-Ultra-v2.2-bs-4-lr-0.005-ep-3-wp-0.1-gacc-32-gnm-1.0-mx-512-v2.2

This model is a fine-tuned version of facebook/nllb-200-1.3B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1788
  • Bleu: 22.9949
  • Chrf: 42.7982
  • Ter: 81.779
  • Gen Len: 12.4471
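The card does not state which scorer produced these metrics (sacrebleu is the common choice for BLEU/chrF/TER). As a rough illustration of what the Bleu column measures, here is a minimal sentence-level BLEU sketch with add-1 smoothing; this is a simplified stand-in, not sacrebleu's exact corpus-level algorithm:

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU (0-100) with add-1 smoothed n-gram precisions."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        precisions.append((overlap + 1) / (total + 1))  # add-1 smoothing
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / max(len(candidate), 1))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 100; shorter or divergent hypotheses are penalized by the brevity penalty and the smoothed precisions.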

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
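Assuming the standard Transformers linear schedule with warmup, the effective batch size and learning-rate shape implied by these settings can be sketched as follows. The `total_steps` value used in the test is an estimate read off the training log (the last logged step is 8200 near epoch 2.96), not a number stated in the card:

```python
# Hyperparameters from the card
learning_rate = 0.005
train_batch_size = 4
gradient_accumulation_steps = 32

# Effective (total) train batch size: 4 x 32 = 128, matching the card.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

def linear_warmup_lr(step, total_steps, peak_lr=learning_rate, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (lr_scheduler_type=linear)."""
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        return peak_lr * step / max(warmup_steps, 1)
    return peak_lr * (total_steps - step) / max(total_steps - warmup_steps, 1)
```

With warmup_ratio 0.1, roughly the first 10% of optimizer steps ramp the learning rate from 0 to 0.005, after which it decays linearly to 0 by the final step.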

Training results

Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len
43.8491 0.0361 100 2.5695 17.2289 37.3828 89.9572 12.6812
37.8172 0.0723 200 2.4585 18.2812 37.9854 88.7129 12.6447
41.0438 0.1084 300 2.5096 18.8162 38.8708 87.3215 12.5471
42.4993 0.1446 400 2.5739 17.0759 37.6617 92.179 12.7612
46.4173 0.1807 500 2.5927 17.7493 37.0782 90.7712 12.7153
50.3046 0.2169 600 2.7970 15.8428 34.6685 92.0702 12.5482
51.0591 0.2530 700 2.7518 13.7852 32.3087 122.4174 12.7729
53.5123 0.2892 800 2.8505 15.0263 33.5287 103.083 13.0224
54.1577 0.3253 900 2.8056 15.0231 33.3181 111.4563 13.2682
51.5253 0.3615 1000 2.7933 15.0815 32.7664 113.3758 13.4812
48.9478 0.3976 1100 2.7655 16.2933 34.243 94.6244 12.8918
47.938 0.4338 1200 2.7612 16.456 34.2244 100.909 13.7153
45.326 0.4699 1300 2.6848 16.6214 34.6197 98.8592 12.8941
43.4495 0.5061 1400 2.6833 16.4706 34.2734 93.3621 12.7494
42.9919 0.5422 1500 2.6799 17.0703 34.625 97.7785 12.7
41.6122 0.5783 1600 2.6150 17.075 34.9678 95.8607 12.8212
39.7837 0.6145 1700 2.5757 18.1852 35.6426 95.2936 13.0435
39.4488 0.6506 1800 2.6387 17.3216 36.3273 101.5896 13.5753
37.5611 0.6868 1900 2.6270 17.7145 35.9937 88.7923 12.7388
35.9422 0.7229 2000 2.6597 17.6019 36.3482 92.8667 12.8165
35.7019 0.7591 2100 2.5452 17.7785 36.4548 106.5095 13.5741
34.3932 0.7952 2200 2.5148 15.9903 34.5921 98.0078 12.8953
33.1799 0.8314 2300 2.4486 18.1216 35.6711 87.5631 12.2859
32.7152 0.8675 2400 2.4996 18.7002 37.2594 90.4309 12.8471
31.5079 0.9037 2500 2.4359 18.1631 37.2894 88.5531 12.5518
30.6564 0.9398 2600 2.4656 18.9625 38.9211 88.7146 12.7259
29.5573 0.9760 2700 2.4242 18.6898 37.6113 88.1869 12.4588
27.8906 1.0119 2800 2.4335 19.9443 38.6642 95.4597 13.2976
26.8031 1.0481 2900 2.4132 20.279 39.0184 87.0889 12.6753
26.3385 1.0842 3000 2.4161 19.4296 38.2119 87.4356 12.5035
26.1276 1.1204 3100 2.4171 19.6523 37.982 86.3196 12.7235
25.269 1.1565 3200 2.4185 19.2776 38.4189 88.6809 12.7
24.6057 1.1927 3300 2.4177 18.9888 37.8358 86.3955 12.4953
24.2829 1.2288 3400 2.3645 19.4833 38.1572 88.7418 12.6929
23.7796 1.2650 3500 2.3611 19.9181 38.7512 91.6993 12.9671
23.3656 1.3011 3600 2.4021 20.3132 39.1634 85.3322 12.4976
22.9616 1.3372 3700 2.3549 19.5902 38.7572 86.4413 12.5306
22.6219 1.3734 3800 2.3330 20.4724 40.3681 85.8608 12.5459
22.1655 1.4095 3900 2.2768 21.042 40.3518 84.8207 12.4212
21.465 1.4457 4000 2.2969 21.1681 39.6366 85.3886 12.4294
21.3829 1.4818 4100 2.3233 20.0998 39.8767 85.702 12.4871
21.2687 1.5180 4200 2.2875 21.3475 40.1695 85.7872 12.6141
20.4221 1.5541 4300 2.2167 21.6864 40.2743 83.9404 12.4353
20.4634 1.5903 4400 2.2713 20.9198 40.4753 86.215 12.5776
19.923 1.6264 4500 2.2585 20.287 40.0784 84.9928 12.4341
19.5794 1.6626 4600 2.2550 20.648 40.5857 86.0858 12.7694
19.4394 1.6987 4700 2.2104 21.9344 41.1899 83.1 12.4929
18.6122 1.7349 4800 2.2093 22.767 41.4871 82.5461 12.6235
18.6992 1.7710 4900 2.2190 20.9887 41.0 85.8985 12.8235
18.2776 1.8072 5000 2.2146 20.9857 41.2827 83.8972 12.4824
17.6287 1.8433 5100 2.1996 21.9022 41.5448 85.0591 12.7012
17.9071 1.8795 5200 2.2085 22.0738 41.5302 82.5875 12.5071
16.8171 1.9156 5300 2.2099 21.7048 41.25 82.833 12.5141
16.9345 1.9517 5400 2.1895 22.0361 41.4987 82.2035 12.5094
16.7839 1.9879 5500 2.1839 21.9444 41.5776 85.0539 12.7247
13.937 2.0239 5600 2.2415 21.8045 41.7823 82.9828 12.48
13.8213 2.0600 5700 2.2454 21.8772 41.4813 83.3942 12.5624
14.1685 2.0962 5800 2.2389 21.463 41.9267 84.3062 12.5435
13.8483 2.1323 5900 2.2024 22.6018 41.8719 81.5109 12.3553
13.5261 2.1684 6000 2.2463 22.011 41.9225 82.6643 12.4235
13.6918 2.2046 6100 2.2273 22.7619 42.3633 81.7888 12.4012
13.2274 2.2407 6200 2.2184 22.3117 41.7462 82.1613 12.4035
13.2595 2.2769 6300 2.2420 22.554 41.7048 82.5436 12.4624
13.229 2.3130 6400 2.2358 22.8454 42.2844 82.2621 12.5412
12.8033 2.3492 6500 2.2154 23.1289 42.5379 81.5014 12.4094
12.8587 2.3853 6600 2.2180 21.9732 41.935 83.0024 12.5071
12.4347 2.4215 6700 2.2110 22.446 42.1488 81.8244 12.4976
12.4909 2.4576 6800 2.1824 22.3089 42.3126 81.7373 12.4812
12.4221 2.4938 6900 2.1763 22.7566 42.4346 82.1514 12.5341
12.3757 2.5299 7000 2.2033 22.7765 42.3882 84.5552 12.7024
12.1889 2.5661 7100 2.1985 23.0604 42.4498 81.9895 12.4859
11.8595 2.6022 7200 2.1895 22.7264 42.4201 81.7662 12.5082
12.3815 2.6384 7300 2.1769 22.531 42.608 82.6642 12.5141
12.107 2.6745 7400 2.1906 22.9567 42.8802 81.617 12.44
11.7338 2.7106 7500 2.1932 22.8328 42.879 81.8005 12.4553
11.6615 2.7468 7600 2.1772 22.9404 42.6161 81.7249 12.4271
12.2163 2.7829 7700 2.1904 22.9762 42.8218 81.7552 12.4565
11.766 2.8191 7800 2.1768 22.8033 42.8082 82.034 12.4082
11.5269 2.8552 7900 2.1870 22.8658 42.5434 81.7954 12.4435
11.6161 2.8914 8000 2.1844 23.2422 42.986 81.6856 12.4788
11.6672 2.9275 8100 2.1809 22.9331 42.725 81.7888 12.4353
11.6776 2.9637 8200 2.1788 22.9949 42.7982 81.779 12.4471
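Read as a whole, the log shows that the lowest validation loss (2.1763 at step 6900) and the highest Bleu (23.2422 at step 8000) do not coincide with the final checkpoint at step 8200. A minimal sketch of selecting best checkpoints, using a few rows excerpted from the table above:

```python
# (step, eval_loss, bleu) rows excerpted from the training log above
log = [
    (100, 2.5695, 17.2289),
    (5500, 2.1839, 21.9444),
    (6900, 2.1763, 22.7566),
    (8000, 2.1844, 23.2422),
    (8200, 2.1788, 22.9949),
]
best_by_bleu = max(log, key=lambda row: row[2])  # step 8000 in this excerpt
best_by_loss = min(log, key=lambda row: row[1])  # step 6900 in this excerpt
```

Which criterion to prefer depends on the use case; for a translation model, checkpoint selection by Bleu (or chrF) is common even when eval loss is marginally worse.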

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size: 1.37B params (Safetensors, BF16)

Model tree for BounharAbdelaziz/Terjman-Ultra-v2.2-bs-4-lr-0.005-ep-3-wp-0.1-gacc-32-gnm-1.0-mx-512-v2.2
