
Terjman-Nano-v2.2-512

This model is a fine-tuned version of atlasia/Terjman-Nano, trained on the BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K dataset. It achieves the following results on the atlasia/TerjamaBench evaluation set:

  • Loss: 3.3996
  • Bleu: 18.8386
  • Chrf: 38.4027
  • Ter: 94.7265
  • Gen Len: 9.7059
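
For quick inference, the checkpoint can be loaded like any standard Hugging Face seq2seq model. The sketch below is illustrative only: the repository id, the 512-token source limit (inferred from the model name), and the generation settings are assumptions, not values stated in this card.

```python
# Minimal inference sketch (assumes a standard seq2seq checkpoint loadable
# with AutoModelForSeq2SeqLM; the repo id below is an assumption -- adjust
# it to the actual Hub path of this model).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "BounharAbdelaziz/Terjman-Nano-v2.2-512"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "How are you today?"
# 512 is assumed to be the maximum source length, per the model name.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```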

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
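
As a point of reference, the list above roughly maps onto a `Seq2SeqTrainingArguments` configuration as sketched below. Argument names follow the Transformers API; anything not stated in the card (output directory, precision, evaluation and save cadence, `predict_with_generate`) is an assumption.

```python
# Hedged reconstruction of the training configuration listed above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-nano-v2.2-512",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,       # effective batch size: 64 * 4 = 256
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    predict_with_generate=True,          # assumed; needed for BLEU/chrF/TER eval
)
```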

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 13.3826 | 0.0723 | 100 | 3.9345 | 9.9676 | 26.2425 | 104.4045 | 10.62 |
| 12.8584 | 0.1446 | 200 | 3.9364 | 10.2822 | 26.9005 | 118.5236 | 10.3635 |
| 11.6455 | 0.2169 | 300 | 3.9230 | 11.2322 | 28.4112 | 169.8713 | 11.1788 |
| 10.8856 | 0.2892 | 400 | 3.9062 | 12.1716 | 29.7281 | 188.6184 | 11.4588 |
| 9.9054 | 0.3615 | 500 | 3.8526 | 13.7346 | 31.3446 | 173.6084 | 10.9635 |
| 9.1353 | 0.4338 | 600 | 3.7746 | 14.9908 | 32.8135 | 141.8675 | 10.5882 |
| 8.4983 | 0.5061 | 700 | 3.7088 | 15.7536 | 34.0031 | 124.8542 | 10.0847 |
| 7.8704 | 0.5783 | 800 | 3.6546 | 16.1519 | 34.6619 | 125.2093 | 10.0929 |
| 7.5657 | 0.6506 | 900 | 3.6034 | 16.497 | 35.4358 | 124.2517 | 10.0918 |
| 7.1648 | 0.7229 | 1000 | 3.5651 | 16.8745 | 36.0108 | 120.2402 | 10.0694 |
| 6.9344 | 0.7952 | 1100 | 3.5358 | 17.4455 | 36.8006 | 111.3276 | 9.9271 |
| 6.7933 | 0.8675 | 1200 | 3.5146 | 17.4007 | 36.7547 | 109.1005 | 9.8659 |
| 6.5663 | 0.9398 | 1300 | 3.4908 | 17.8837 | 37.2366 | 97.7684 | 9.6588 |
| 6.156 | 1.0116 | 1400 | 3.4978 | 17.885 | 37.1388 | 101.7405 | 9.68 |
| 6.2336 | 1.0839 | 1500 | 3.4947 | 17.6667 | 37.0401 | 104.102 | 9.7506 |
| 6.0831 | 1.1562 | 1600 | 3.4840 | 17.9693 | 37.3924 | 103.7544 | 9.8318 |
| 5.9833 | 1.2284 | 1700 | 3.4736 | 18.0472 | 37.6755 | 100.4721 | 9.7482 |
| 5.889 | 1.3007 | 1800 | 3.4686 | 18.4178 | 37.7324 | 99.5847 | 9.7494 |
| 5.831 | 1.3730 | 1900 | 3.4629 | 18.546 | 37.9074 | 105.798 | 9.8329 |
| 5.7614 | 1.4453 | 2000 | 3.4568 | 18.4814 | 38.1828 | 96.5594 | 9.6847 |
| 5.6739 | 1.5176 | 2100 | 3.4438 | 18.3851 | 37.8621 | 101.9954 | 9.7859 |
| 5.6724 | 1.5899 | 2200 | 3.4434 | 18.8482 | 38.0621 | 100.9195 | 9.7612 |
| 5.5575 | 1.6622 | 2300 | 3.4418 | 18.4627 | 38.0205 | 101.7897 | 9.8024 |
| 5.5368 | 1.7345 | 2400 | 3.4352 | 18.5974 | 38.1539 | 95.8059 | 9.6659 |
| 5.4737 | 1.8068 | 2500 | 3.4317 | 18.6153 | 37.9427 | 95.6594 | 9.6541 |
| 5.5492 | 1.8791 | 2600 | 3.4224 | 18.7484 | 38.2565 | 94.3134 | 9.64 |
| 5.4826 | 1.9514 | 2700 | 3.4228 | 18.8072 | 38.2437 | 101.5818 | 9.7882 |
| 5.3407 | 2.0231 | 2800 | 3.4214 | 18.572 | 38.1618 | 101.9617 | 9.7929 |
| 5.4007 | 2.0954 | 2900 | 3.4154 | 18.6936 | 38.1956 | 100.9953 | 9.7776 |
| 5.3852 | 2.1677 | 3000 | 3.4138 | 18.8817 | 38.3229 | 93.9953 | 9.6576 |
| 5.3565 | 2.2400 | 3100 | 3.4136 | 18.7169 | 38.2232 | 101.2653 | 9.7824 |
| 5.3588 | 2.3123 | 3200 | 3.4117 | 19.0345 | 38.5406 | 93.8038 | 9.6471 |
| 5.3093 | 2.3846 | 3300 | 3.4096 | 18.8479 | 38.3863 | 94.7586 | 9.6776 |
| 5.2726 | 2.4569 | 3400 | 3.4082 | 18.793 | 38.4605 | 93.4033 | 9.6294 |
| 5.3462 | 2.5292 | 3500 | 3.4067 | 18.7692 | 38.5134 | 94.441 | 9.6506 |
| 5.3179 | 2.6015 | 3600 | 3.4082 | 18.7075 | 38.3506 | 95.021 | 9.6894 |
| 5.2932 | 2.6738 | 3700 | 3.4058 | 18.8893 | 38.4524 | 94.1772 | 9.6635 |
| 5.2667 | 2.7461 | 3800 | 3.4101 | 18.7618 | 38.4204 | 95.1496 | 9.6882 |
| 5.2381 | 2.8184 | 3900 | 3.4058 | 18.7816 | 38.3548 | 95.2425 | 9.6847 |
| 5.2377 | 2.8907 | 4000 | 3.4025 | 18.7729 | 38.4445 | 94.5543 | 9.6812 |
| 5.2117 | 2.9629 | 4100 | 3.4036 | 18.7731 | 38.6414 | 94.1615 | 9.6635 |
| 5.1246 | 3.0347 | 4200 | 3.4048 | 18.8311 | 38.4772 | 94.9196 | 9.6976 |
| 5.2024 | 3.1070 | 4300 | 3.4019 | 18.8853 | 38.4819 | 94.7683 | 9.6988 |
| 5.1983 | 3.1793 | 4400 | 3.4031 | 18.7309 | 38.3883 | 95.2313 | 9.7094 |
| 5.2622 | 3.2516 | 4500 | 3.4001 | 18.7809 | 38.2732 | 94.5622 | 9.6965 |
| 5.2596 | 3.3239 | 4600 | 3.4000 | 18.8192 | 38.377 | 94.5234 | 9.6882 |
| 5.2171 | 3.3962 | 4700 | 3.4022 | 18.8882 | 38.4257 | 94.3211 | 9.6882 |
| 5.2709 | 3.4685 | 4800 | 3.4001 | 18.9978 | 38.4208 | 93.8448 | 9.6706 |
| 5.2127 | 3.5408 | 4900 | 3.4010 | 18.8523 | 38.4182 | 94.0805 | 9.68 |
| 5.1972 | 3.6130 | 5000 | 3.4017 | 18.5275 | 38.1393 | 99.7695 | 9.7859 |
| 5.138 | 3.6853 | 5100 | 3.3996 | 18.624 | 38.2046 | 94.6994 | 9.6929 |
| 5.2447 | 3.7576 | 5200 | 3.4008 | 18.936 | 38.5847 | 93.9781 | 9.6718 |
| 5.249 | 3.8299 | 5300 | 3.4007 | 18.8005 | 38.3895 | 94.4262 | 9.6788 |
| 5.1984 | 3.9022 | 5400 | 3.4000 | 18.7464 | 38.4657 | 94.8043 | 9.6953 |
| 5.2089 | 3.9745 | 5500 | 3.4013 | 18.9012 | 38.4854 | 94.7817 | 9.6882 |
| 5.1492 | 4.0463 | 5600 | 3.3997 | 18.7251 | 38.3877 | 94.9903 | 9.7012 |
| 5.1992 | 4.1186 | 5700 | 3.3981 | 18.7467 | 38.3938 | 94.6414 | 9.6941 |
| 5.1869 | 4.1909 | 5800 | 3.3987 | 18.8237 | 38.4655 | 94.1302 | 9.6765 |
| 5.2141 | 4.2631 | 5900 | 3.3992 | 18.8302 | 38.3754 | 100.4665 | 9.8118 |
| 5.1576 | 4.3354 | 6000 | 3.3993 | 18.6991 | 38.4918 | 94.8632 | 9.7012 |
| 5.1984 | 4.4077 | 6100 | 3.4002 | 18.6786 | 38.4421 | 94.547 | 9.6718 |
| 5.1864 | 4.4800 | 6200 | 3.3999 | 18.845 | 38.4827 | 94.5178 | 9.7 |
| 5.1793 | 4.5523 | 6300 | 3.4000 | 18.6263 | 38.5282 | 94.89 | 9.68 |
| 5.2021 | 4.6246 | 6400 | 3.3989 | 18.621 | 38.5006 | 94.5353 | 9.6765 |
| 5.1802 | 4.6969 | 6500 | 3.3991 | 18.6936 | 38.361 | 94.56 | 9.6906 |
| 5.2364 | 4.7692 | 6600 | 3.3991 | 18.8428 | 38.6199 | 94.6669 | 9.6976 |
| 5.1949 | 4.8415 | 6700 | 3.3992 | 18.7368 | 38.3819 | 94.5997 | 9.6929 |
| 5.153 | 4.9138 | 6800 | 3.3994 | 18.8365 | 38.5249 | 94.7585 | 9.6988 |
| 5.2049 | 4.9861 | 6900 | 3.3996 | 18.8386 | 38.4027 | 94.7265 | 9.7059 |
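
The Bleu, Chrf, Ter, and Gen Len columns above are the kind of numbers a `compute_metrics` callback reports at each evaluation step. The sketch below shows one common way such metrics are computed with the `evaluate` library; it illustrates the metrics, not the author's exact evaluation code.

```python
# Hedged sketch of a compute_metrics function for a Seq2SeqTrainer that
# reports sacrebleu, chrF, TER, and mean generated length.
import numpy as np
import evaluate

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
ter = evaluate.load("ter")

def compute_metrics(eval_preds, tokenizer):
    """Decode predictions/labels and score them; `tokenizer` is the model's tokenizer."""
    preds, labels = eval_preds
    # Labels are padded with -100 inside the Trainer; restore pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    references = [[label] for label in decoded_labels]
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {
        "bleu": bleu.compute(predictions=decoded_preds, references=references)["score"],
        "chrf": chrf.compute(predictions=decoded_preds, references=references)["score"],
        "ter": ter.compute(predictions=decoded_preds, references=references)["score"],
        "gen_len": gen_len,
    }
```

With `Seq2SeqTrainer`, this function would typically be passed as `compute_metrics=lambda p: compute_metrics(p, tokenizer)` together with `predict_with_generate=True`.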

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0