tfa_output_2025_m02_d07_t07h_43m_33s

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4660
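
Since the card does not include usage instructions, here is a minimal loading sketch using transformers. The repository id is the one this card belongs to; the prompt and device_map are illustrative assumptions, and the BF16 dtype matches the tensor type listed for the published weights.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "brando/tfa_output_2025_m02_d07_t07h_43m_33s"  # repository id for this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # published weights are stored in BF16
    device_map="auto",           # requires the accelerate package
)

prompt = "The capital of France is"  # illustrative prompt, not from the card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))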

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: OptimizerNames.PAGED_ADAMW with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
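
As a reading aid, the sketch below shows how these values could map onto a transformers TrainingArguments object. The output_dir is a hypothetical placeholder, and "paged_adamw_32bit" is the string alias commonly corresponding to OptimizerNames.PAGED_ADAMW; this is a reconstruction, not the original training script.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tfa_output_2025_m02_d07_t07h_43m_33s",  # hypothetical placeholder
    learning_rate=1e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 2 x 4 = total train batch size of 8 on one device
    optim="paged_adamw_32bit",       # assumed alias for OptimizerNames.PAGED_ADAMW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
)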

Training results

Training Loss | Epoch | Step | Validation Loss
No log 0 0 1.4694
4.604 0.0030 1 1.4694
4.6485 0.0060 2 1.4693
4.6515 0.0091 3 1.4693
4.9659 0.0121 4 1.4692
4.5235 0.0151 5 1.4692
4.7107 0.0181 6 1.4690
4.3335 0.0211 7 1.4686
4.8938 0.0242 8 1.4682
4.8609 0.0272 9 1.4677
4.5648 0.0302 10 1.4667
4.5394 0.0332 11 1.4655
4.7629 0.0363 12 1.4644
4.5808 0.0393 13 1.4632
4.5545 0.0423 14 1.4619
4.4343 0.0453 15 1.4605
4.5662 0.0483 16 1.4591
4.398 0.0514 17 1.4575
4.3894 0.0544 18 1.4550
4.61 0.0574 19 1.4524
4.4373 0.0604 20 1.4501
4.2311 0.0634 21 1.4478
4.3044 0.0665 22 1.4454
4.2496 0.0695 23 1.4431
4.3269 0.0725 24 1.4409
4.2602 0.0755 25 1.4385
4.4063 0.0785 26 1.4362
4.2922 0.0816 27 1.4341
3.8772 0.0846 28 1.4320
4.2066 0.0876 29 1.4298
4.1774 0.0906 30 1.4280
4.1788 0.0937 31 1.4259
4.2413 0.0967 32 1.4240
4.2196 0.0997 33 1.4220
4.4204 0.1027 34 1.4203
4.1017 0.1057 35 1.4185
4.25 0.1088 36 1.4168
4.1705 0.1118 37 1.4152
4.0192 0.1148 38 1.4136
4.1317 0.1178 39 1.4123
4.243 0.1208 40 1.4107
3.9988 0.1239 41 1.4094
3.9451 0.1269 42 1.4081
4.0244 0.1299 43 1.4068
4.0956 0.1329 44 1.4055
3.9313 0.1360 45 1.4043
3.9223 0.1390 46 1.4033
3.7763 0.1420 47 1.4021
3.8153 0.1450 48 1.4011
4.1018 0.1480 49 1.4003
4.2674 0.1511 50 1.3994
4.0947 0.1541 51 1.3984
3.8134 0.1571 52 1.3976
3.7836 0.1601 53 1.3969
3.9645 0.1631 54 1.3961
3.5071 0.1662 55 1.3955
3.9643 0.1692 56 1.3950
3.7689 0.1722 57 1.3944
3.9377 0.1752 58 1.3937
3.9542 0.1782 59 1.3932
3.7511 0.1813 60 1.3926
3.7859 0.1843 61 1.3922
3.7963 0.1873 62 1.3916
3.7728 0.1903 63 1.3913
3.7714 0.1934 64 1.3910
3.9066 0.1964 65 1.3906
3.9718 0.1994 66 1.3903
3.7207 0.2024 67 1.3901
3.7618 0.2054 68 1.3896
3.5737 0.2085 69 1.3895
3.6742 0.2115 70 1.3892
3.6627 0.2145 71 1.3892
3.7246 0.2175 72 1.3889
3.5988 0.2205 73 1.3887
3.5042 0.2236 74 1.3885
3.6754 0.2266 75 1.3884
3.8237 0.2296 76 1.3883
3.6541 0.2326 77 1.3882
3.8442 0.2356 78 1.3881
3.8189 0.2387 79 1.3881
3.4796 0.2417 80 1.3879
3.7061 0.2447 81 1.3880
3.7453 0.2477 82 1.3880
3.4375 0.2508 83 1.3880
3.6748 0.2538 84 1.3881
3.6132 0.2568 85 1.3880
3.6022 0.2598 86 1.3879
3.9084 0.2628 87 1.3879
3.5629 0.2659 88 1.3882
3.6004 0.2689 89 1.3883
3.8498 0.2719 90 1.3883
3.5523 0.2749 91 1.3884
3.7526 0.2779 92 1.3886
3.7638 0.2810 93 1.3887
3.6319 0.2840 94 1.3888
3.551 0.2870 95 1.3888
3.8053 0.2900 96 1.3890
3.6299 0.2931 97 1.3891
3.8778 0.2961 98 1.3891
3.4661 0.2991 99 1.3893
3.6199 0.3021 100 1.3894
3.7169 0.3051 101 1.3897
3.6181 0.3082 102 1.3898
3.712 0.3112 103 1.3900
3.426 0.3142 104 1.3903
3.2462 0.3172 105 1.3905
3.4656 0.3202 106 1.3907
3.5511 0.3233 107 1.3910
3.5268 0.3263 108 1.3913
3.4383 0.3293 109 1.3915
3.5351 0.3323 110 1.3919
3.4221 0.3353 111 1.3922
3.064 0.3384 112 1.3926
3.4006 0.3414 113 1.3928
3.5908 0.3444 114 1.3931
3.5341 0.3474 115 1.3935
3.4771 0.3505 116 1.3938
3.4362 0.3535 117 1.3940
3.5801 0.3565 118 1.3941
3.5304 0.3595 119 1.3943
3.6278 0.3625 120 1.3945
3.5677 0.3656 121 1.3948
3.5208 0.3686 122 1.3950
3.5702 0.3716 123 1.3952
3.4074 0.3746 124 1.3954
3.1593 0.3776 125 1.3956
3.4503 0.3807 126 1.3958
3.6251 0.3837 127 1.3961
3.4879 0.3867 128 1.3965
3.3838 0.3897 129 1.3968
3.411 0.3927 130 1.3972
3.3504 0.3958 131 1.3975
3.2542 0.3988 132 1.3980
3.5616 0.4018 133 1.3984
3.3994 0.4048 134 1.3990
3.3805 0.4079 135 1.3994
3.4415 0.4109 136 1.3997
3.6288 0.4139 137 1.4001
3.2875 0.4169 138 1.4004
3.2699 0.4199 139 1.4007
3.6125 0.4230 140 1.4011
3.4689 0.4260 141 1.4013
3.4061 0.4290 142 1.4015
3.3196 0.4320 143 1.4019
3.4037 0.4350 144 1.4020
3.4214 0.4381 145 1.4024
3.3448 0.4411 146 1.4025
3.4775 0.4441 147 1.4028
3.4709 0.4471 148 1.4032
3.4283 0.4502 149 1.4033
3.4671 0.4532 150 1.4035
3.2426 0.4562 151 1.4038
3.4191 0.4592 152 1.4042
3.4264 0.4622 153 1.4045
3.2461 0.4653 154 1.4050
3.3855 0.4683 155 1.4053
3.313 0.4713 156 1.4056
3.3058 0.4743 157 1.4059
3.5508 0.4773 158 1.4062
3.2109 0.4804 159 1.4066
3.5045 0.4834 160 1.4068
3.4068 0.4864 161 1.4071
3.3438 0.4894 162 1.4075
3.3953 0.4924 163 1.4078
3.2312 0.4955 164 1.4079
3.2971 0.4985 165 1.4084
3.3118 0.5015 166 1.4086
3.4395 0.5045 167 1.4088
3.7162 0.5076 168 1.4089
3.3864 0.5106 169 1.4091
3.0887 0.5136 170 1.4092
2.9898 0.5166 171 1.4095
3.4697 0.5196 172 1.4099
3.2762 0.5227 173 1.4102
3.1383 0.5257 174 1.4106
3.2522 0.5287 175 1.4112
3.309 0.5317 176 1.4117
3.4431 0.5347 177 1.4121
3.1366 0.5378 178 1.4126
3.3094 0.5408 179 1.4131
3.4466 0.5438 180 1.4136
3.3411 0.5468 181 1.4140
3.013 0.5498 182 1.4143
3.4785 0.5529 183 1.4147
3.0358 0.5559 184 1.4152
3.2833 0.5589 185 1.4158
3.2953 0.5619 186 1.4162
3.3485 0.5650 187 1.4167
3.4911 0.5680 188 1.4172
3.3863 0.5710 189 1.4176
3.1944 0.5740 190 1.4180
3.2994 0.5770 191 1.4185
3.4385 0.5801 192 1.4187
3.3346 0.5831 193 1.4191
3.3318 0.5861 194 1.4194
3.4027 0.5891 195 1.4198
3.2532 0.5921 196 1.4201
3.2351 0.5952 197 1.4205
3.4589 0.5982 198 1.4203
3.4375 0.6012 199 1.4204
3.0901 0.6042 200 1.4207
3.3186 0.6073 201 1.4210
3.2891 0.6103 202 1.4212
3.2752 0.6133 203 1.4217
3.3808 0.6163 204 1.4221
3.376 0.6193 205 1.4223
3.4086 0.6224 206 1.4226
3.3506 0.6254 207 1.4228
3.4508 0.6284 208 1.4232
3.4237 0.6314 209 1.4235
3.2154 0.6344 210 1.4240
3.2379 0.6375 211 1.4244
2.8335 0.6405 212 1.4251
3.1927 0.6435 213 1.4251
3.1871 0.6465 214 1.4256
3.1004 0.6495 215 1.4259
3.2405 0.6526 216 1.4264
3.1544 0.6556 217 1.4269
3.1204 0.6586 218 1.4274
3.3257 0.6616 219 1.4280
3.2689 0.6647 220 1.4286
3.0117 0.6677 221 1.4289
3.4276 0.6707 222 1.4295
3.2358 0.6737 223 1.4301
3.1374 0.6767 224 1.4307
3.2972 0.6798 225 1.4312
3.2838 0.6828 226 1.4318
3.2839 0.6858 227 1.4322
3.2228 0.6888 228 1.4328
3.2605 0.6918 229 1.4334
3.2945 0.6949 230 1.4340
3.3155 0.6979 231 1.4344
3.1988 0.7009 232 1.4349
3.2921 0.7039 233 1.4355
2.8752 0.7069 234 1.4359
3.0065 0.7100 235 1.4361
3.1689 0.7130 236 1.4366
3.1959 0.7160 237 1.4370
3.3473 0.7190 238 1.4373
3.2927 0.7221 239 1.4377
2.9934 0.7251 240 1.4379
3.2058 0.7281 241 1.4384
3.1388 0.7311 242 1.4388
3.2384 0.7341 243 1.4388
3.2028 0.7372 244 1.4392
3.3737 0.7402 245 1.4392
3.166 0.7432 246 1.4397
3.0255 0.7462 247 1.4397
3.0979 0.7492 248 1.4401
3.2436 0.7523 249 1.4404
3.1785 0.7553 250 1.4408
3.2052 0.7583 251 1.4411
3.1967 0.7613 252 1.4413
2.9086 0.7644 253 1.4416
3.355 0.7674 254 1.4421
3.4027 0.7704 255 1.4422
2.9307 0.7734 256 1.4428
3.1738 0.7764 257 1.4429
3.3088 0.7795 258 1.4431
3.4942 0.7825 259 1.4432
2.831 0.7855 260 1.4437
3.1675 0.7885 261 1.4444
3.3274 0.7915 262 1.4447
3.0326 0.7946 263 1.4448
3.3138 0.7976 264 1.4454
3.2153 0.8006 265 1.4455
3.3983 0.8036 266 1.4458
3.179 0.8066 267 1.4461
3.2621 0.8097 268 1.4464
3.0191 0.8127 269 1.4468
3.058 0.8157 270 1.4472
3.3188 0.8187 271 1.4478
2.9837 0.8218 272 1.4481
3.2624 0.8248 273 1.4486
3.2701 0.8278 274 1.4492
3.1579 0.8308 275 1.4497
3.3164 0.8338 276 1.4501
2.9827 0.8369 277 1.4507
3.1842 0.8399 278 1.4512
3.2366 0.8429 279 1.4519
3.0562 0.8459 280 1.4520
2.9503 0.8489 281 1.4526
3.0441 0.8520 282 1.4530
3.4535 0.8550 283 1.4534
3.2656 0.8580 284 1.4537
3.3452 0.8610 285 1.4541
3.0958 0.8640 286 1.4548
3.1579 0.8671 287 1.4553
3.1473 0.8701 288 1.4556
3.2825 0.8731 289 1.4559
2.8554 0.8761 290 1.4563
3.2792 0.8792 291 1.4566
3.0977 0.8822 292 1.4567
3.0414 0.8852 293 1.4568
3.2151 0.8882 294 1.4569
3.1287 0.8912 295 1.4571
3.1167 0.8943 296 1.4572
3.1497 0.8973 297 1.4574
3.0451 0.9003 298 1.4576
3.147 0.9033 299 1.4578
3.2183 0.9063 300 1.4582
3.0974 0.9094 301 1.4585
3.1824 0.9124 302 1.4589
3.0607 0.9154 303 1.4593
3.1255 0.9184 304 1.4598
2.6534 0.9215 305 1.4604
2.9006 0.9245 306 1.4605
3.336 0.9275 307 1.4609
3.2408 0.9305 308 1.4609
3.0551 0.9335 309 1.4610
2.8721 0.9366 310 1.4610
3.1009 0.9396 311 1.4610
3.3979 0.9426 312 1.4608
3.133 0.9456 313 1.4609
3.1008 0.9486 314 1.4609
3.2113 0.9517 315 1.4610
3.161 0.9547 316 1.4612
2.968 0.9577 317 1.4614
2.936 0.9607 318 1.4619
3.4561 0.9637 319 1.4622
3.1529 0.9668 320 1.4625
3.1159 0.9698 321 1.4629
3.2588 0.9728 322 1.4630
2.9729 0.9758 323 1.4633
3.2778 0.9789 324 1.4636
2.9019 0.9819 325 1.4638
3.094 0.9849 326 1.4643
3.0259 0.9879 327 1.4647
3.3842 0.9909 328 1.4652
3.217 0.9940 329 1.4655
3.4145 0.9970 330 1.4658
3.3328 1.0 331 1.4660

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
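
To check that a local environment matches these versions, a small illustrative snippet is shown below; it is not part of the original training setup.

import importlib.metadata as md

expected = {
    "transformers": "4.48.0",
    "torch": "2.5.1",       # card lists 2.5.1+cu124
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
for pkg, want in expected.items():
    try:
        have = md.version(pkg)
    except md.PackageNotFoundError:
        have = "not installed"
    print(f"{pkg}: installed {have}, card lists {want}")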

Model size: 8.03B parameters (safetensors, BF16)