# Terjman-Nano-v2.2-512
This model is a fine-tuned version of atlasia/Terjman-Nano, trained on the BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K dataset. It achieves the following results on the atlasia/TerjamaBench evaluation set:
- Loss: 3.3996
- Bleu: 18.8386
- Chrf: 38.4027
- Ter: 94.7265
- Gen Len: 9.7059
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
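A minimal inference sketch using the transformers `pipeline` API is shown below. The Hub repo id `atlasia/Terjman-Nano-v2.2-512` is assumed from the model name and may differ from the actual repository; adjust it as needed.

```python
# Minimal inference sketch for English -> Moroccan Darija translation.
# NOTE: the repo id below is assumed from the model name; replace it
# with the actual Hugging Face Hub repository id if it differs.
from transformers import pipeline

MODEL_ID = "atlasia/Terjman-Nano-v2.2-512"  # assumed repo id

def translate(text: str, model_id: str = MODEL_ID) -> str:
    """Translate an English sentence to Darija via a translation pipeline."""
    translator = pipeline("translation", model=model_id)
    return translator(text, max_length=512)[0]["translation_text"]

# Usage: translate("Hello, how are you today?")
```

For repeated calls, construct the pipeline once outside the function rather than per call.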
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
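As a sanity check, the effective batch size and approximate step counts implied by these hyperparameters can be reproduced with a little arithmetic. The ~350K dataset size is inferred from the dataset name, so the derived step counts are illustrative only:

```python
# Reproduce the effective batch size and estimate total/warmup steps.
# The dataset size (~350K pairs) is inferred from the dataset name and
# is an approximation, so the derived step counts are illustrative.
train_examples = 350_000
per_device_batch = 64   # train_batch_size
grad_accum = 4          # gradient_accumulation_steps
num_epochs = 5
warmup_ratio = 0.1

effective_batch = per_device_batch * grad_accum          # total_train_batch_size
steps_per_epoch = -(-train_examples // effective_batch)  # ceiling division
total_steps = steps_per_epoch * num_epochs
warmup_steps = int(total_steps * warmup_ratio)

print(effective_batch)  # 256, matching the reported total_train_batch_size
```

With a linear scheduler and warmup_ratio 0.1, the learning rate ramps up over the first ~10% of optimizer steps and then decays linearly to zero.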
### Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len |
---|---|---|---|---|---|---|---|
13.3826 | 0.0723 | 100 | 3.9345 | 9.9676 | 26.2425 | 104.4045 | 10.62 |
12.8584 | 0.1446 | 200 | 3.9364 | 10.2822 | 26.9005 | 118.5236 | 10.3635 |
11.6455 | 0.2169 | 300 | 3.9230 | 11.2322 | 28.4112 | 169.8713 | 11.1788 |
10.8856 | 0.2892 | 400 | 3.9062 | 12.1716 | 29.7281 | 188.6184 | 11.4588 |
9.9054 | 0.3615 | 500 | 3.8526 | 13.7346 | 31.3446 | 173.6084 | 10.9635 |
9.1353 | 0.4338 | 600 | 3.7746 | 14.9908 | 32.8135 | 141.8675 | 10.5882 |
8.4983 | 0.5061 | 700 | 3.7088 | 15.7536 | 34.0031 | 124.8542 | 10.0847 |
7.8704 | 0.5783 | 800 | 3.6546 | 16.1519 | 34.6619 | 125.2093 | 10.0929 |
7.5657 | 0.6506 | 900 | 3.6034 | 16.497 | 35.4358 | 124.2517 | 10.0918 |
7.1648 | 0.7229 | 1000 | 3.5651 | 16.8745 | 36.0108 | 120.2402 | 10.0694 |
6.9344 | 0.7952 | 1100 | 3.5358 | 17.4455 | 36.8006 | 111.3276 | 9.9271 |
6.7933 | 0.8675 | 1200 | 3.5146 | 17.4007 | 36.7547 | 109.1005 | 9.8659 |
6.5663 | 0.9398 | 1300 | 3.4908 | 17.8837 | 37.2366 | 97.7684 | 9.6588 |
6.156 | 1.0116 | 1400 | 3.4978 | 17.885 | 37.1388 | 101.7405 | 9.68 |
6.2336 | 1.0839 | 1500 | 3.4947 | 17.6667 | 37.0401 | 104.102 | 9.7506 |
6.0831 | 1.1562 | 1600 | 3.4840 | 17.9693 | 37.3924 | 103.7544 | 9.8318 |
5.9833 | 1.2284 | 1700 | 3.4736 | 18.0472 | 37.6755 | 100.4721 | 9.7482 |
5.889 | 1.3007 | 1800 | 3.4686 | 18.4178 | 37.7324 | 99.5847 | 9.7494 |
5.831 | 1.3730 | 1900 | 3.4629 | 18.546 | 37.9074 | 105.798 | 9.8329 |
5.7614 | 1.4453 | 2000 | 3.4568 | 18.4814 | 38.1828 | 96.5594 | 9.6847 |
5.6739 | 1.5176 | 2100 | 3.4438 | 18.3851 | 37.8621 | 101.9954 | 9.7859 |
5.6724 | 1.5899 | 2200 | 3.4434 | 18.8482 | 38.0621 | 100.9195 | 9.7612 |
5.5575 | 1.6622 | 2300 | 3.4418 | 18.4627 | 38.0205 | 101.7897 | 9.8024 |
5.5368 | 1.7345 | 2400 | 3.4352 | 18.5974 | 38.1539 | 95.8059 | 9.6659 |
5.4737 | 1.8068 | 2500 | 3.4317 | 18.6153 | 37.9427 | 95.6594 | 9.6541 |
5.5492 | 1.8791 | 2600 | 3.4224 | 18.7484 | 38.2565 | 94.3134 | 9.64 |
5.4826 | 1.9514 | 2700 | 3.4228 | 18.8072 | 38.2437 | 101.5818 | 9.7882 |
5.3407 | 2.0231 | 2800 | 3.4214 | 18.572 | 38.1618 | 101.9617 | 9.7929 |
5.4007 | 2.0954 | 2900 | 3.4154 | 18.6936 | 38.1956 | 100.9953 | 9.7776 |
5.3852 | 2.1677 | 3000 | 3.4138 | 18.8817 | 38.3229 | 93.9953 | 9.6576 |
5.3565 | 2.2400 | 3100 | 3.4136 | 18.7169 | 38.2232 | 101.2653 | 9.7824 |
5.3588 | 2.3123 | 3200 | 3.4117 | 19.0345 | 38.5406 | 93.8038 | 9.6471 |
5.3093 | 2.3846 | 3300 | 3.4096 | 18.8479 | 38.3863 | 94.7586 | 9.6776 |
5.2726 | 2.4569 | 3400 | 3.4082 | 18.793 | 38.4605 | 93.4033 | 9.6294 |
5.3462 | 2.5292 | 3500 | 3.4067 | 18.7692 | 38.5134 | 94.441 | 9.6506 |
5.3179 | 2.6015 | 3600 | 3.4082 | 18.7075 | 38.3506 | 95.021 | 9.6894 |
5.2932 | 2.6738 | 3700 | 3.4058 | 18.8893 | 38.4524 | 94.1772 | 9.6635 |
5.2667 | 2.7461 | 3800 | 3.4101 | 18.7618 | 38.4204 | 95.1496 | 9.6882 |
5.2381 | 2.8184 | 3900 | 3.4058 | 18.7816 | 38.3548 | 95.2425 | 9.6847 |
5.2377 | 2.8907 | 4000 | 3.4025 | 18.7729 | 38.4445 | 94.5543 | 9.6812 |
5.2117 | 2.9629 | 4100 | 3.4036 | 18.7731 | 38.6414 | 94.1615 | 9.6635 |
5.1246 | 3.0347 | 4200 | 3.4048 | 18.8311 | 38.4772 | 94.9196 | 9.6976 |
5.2024 | 3.1070 | 4300 | 3.4019 | 18.8853 | 38.4819 | 94.7683 | 9.6988 |
5.1983 | 3.1793 | 4400 | 3.4031 | 18.7309 | 38.3883 | 95.2313 | 9.7094 |
5.2622 | 3.2516 | 4500 | 3.4001 | 18.7809 | 38.2732 | 94.5622 | 9.6965 |
5.2596 | 3.3239 | 4600 | 3.4000 | 18.8192 | 38.377 | 94.5234 | 9.6882 |
5.2171 | 3.3962 | 4700 | 3.4022 | 18.8882 | 38.4257 | 94.3211 | 9.6882 |
5.2709 | 3.4685 | 4800 | 3.4001 | 18.9978 | 38.4208 | 93.8448 | 9.6706 |
5.2127 | 3.5408 | 4900 | 3.4010 | 18.8523 | 38.4182 | 94.0805 | 9.68 |
5.1972 | 3.6130 | 5000 | 3.4017 | 18.5275 | 38.1393 | 99.7695 | 9.7859 |
5.138 | 3.6853 | 5100 | 3.3996 | 18.624 | 38.2046 | 94.6994 | 9.6929 |
5.2447 | 3.7576 | 5200 | 3.4008 | 18.936 | 38.5847 | 93.9781 | 9.6718 |
5.249 | 3.8299 | 5300 | 3.4007 | 18.8005 | 38.3895 | 94.4262 | 9.6788 |
5.1984 | 3.9022 | 5400 | 3.4000 | 18.7464 | 38.4657 | 94.8043 | 9.6953 |
5.2089 | 3.9745 | 5500 | 3.4013 | 18.9012 | 38.4854 | 94.7817 | 9.6882 |
5.1492 | 4.0463 | 5600 | 3.3997 | 18.7251 | 38.3877 | 94.9903 | 9.7012 |
5.1992 | 4.1186 | 5700 | 3.3981 | 18.7467 | 38.3938 | 94.6414 | 9.6941 |
5.1869 | 4.1909 | 5800 | 3.3987 | 18.8237 | 38.4655 | 94.1302 | 9.6765 |
5.2141 | 4.2631 | 5900 | 3.3992 | 18.8302 | 38.3754 | 100.4665 | 9.8118 |
5.1576 | 4.3354 | 6000 | 3.3993 | 18.6991 | 38.4918 | 94.8632 | 9.7012 |
5.1984 | 4.4077 | 6100 | 3.4002 | 18.6786 | 38.4421 | 94.547 | 9.6718 |
5.1864 | 4.4800 | 6200 | 3.3999 | 18.845 | 38.4827 | 94.5178 | 9.7 |
5.1793 | 4.5523 | 6300 | 3.4000 | 18.6263 | 38.5282 | 94.89 | 9.68 |
5.2021 | 4.6246 | 6400 | 3.3989 | 18.621 | 38.5006 | 94.5353 | 9.6765 |
5.1802 | 4.6969 | 6500 | 3.3991 | 18.6936 | 38.361 | 94.56 | 9.6906 |
5.2364 | 4.7692 | 6600 | 3.3991 | 18.8428 | 38.6199 | 94.6669 | 9.6976 |
5.1949 | 4.8415 | 6700 | 3.3992 | 18.7368 | 38.3819 | 94.5997 | 9.6929 |
5.153 | 4.9138 | 6800 | 3.3994 | 18.8365 | 38.5249 | 94.7585 | 9.6988 |
5.2049 | 4.9861 | 6900 | 3.3996 | 18.8386 | 38.4027 | 94.7265 | 9.7059 |
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0