# tfa_output_2025_m02_d07_t07h_43m_33s

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.4660
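
A minimal loading sketch with `transformers`, assuming the checkpoint is published under the repo id `brando/tfa_output_2025_m02_d07_t07h_43m_33s` listed on the model page:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id as listed on the model page; adjust if the checkpoint lives elsewhere.
model_id = "brando/tfa_output_2025_m02_d07_t07h_43m_33s"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's dtype (8B params, ~16 GB in bf16)
    device_map="auto",   # requires accelerate; spreads weights over available GPUs
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```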
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-06
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8 (train_batch_size × gradient_accumulation_steps)
- optimizer: PAGED_ADAMW with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
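
For reproducibility, a hedged sketch of how the values above map onto `transformers.TrainingArguments`. The `optim` string is an assumption: the log records only `OptimizerNames.PAGED_ADAMW`, which could be either the 32-bit or 8-bit paged variant (paged optimizers also require `bitsandbytes`):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
args = TrainingArguments(
    output_dir="tfa_output",          # hypothetical output path
    learning_rate=1e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,    # effective train batch size: 2 * 4 = 8
    optim="paged_adamw_32bit",        # assumed mapping of OptimizerNames.PAGED_ADAMW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```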
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
No log | 0 | 0 | 1.4694 |
4.604 | 0.0030 | 1 | 1.4694 |
4.6485 | 0.0060 | 2 | 1.4693 |
4.6515 | 0.0091 | 3 | 1.4693 |
4.9659 | 0.0121 | 4 | 1.4692 |
4.5235 | 0.0151 | 5 | 1.4692 |
4.7107 | 0.0181 | 6 | 1.4690 |
4.3335 | 0.0211 | 7 | 1.4686 |
4.8938 | 0.0242 | 8 | 1.4682 |
4.8609 | 0.0272 | 9 | 1.4677 |
4.5648 | 0.0302 | 10 | 1.4667 |
4.5394 | 0.0332 | 11 | 1.4655 |
4.7629 | 0.0363 | 12 | 1.4644 |
4.5808 | 0.0393 | 13 | 1.4632 |
4.5545 | 0.0423 | 14 | 1.4619 |
4.4343 | 0.0453 | 15 | 1.4605 |
4.5662 | 0.0483 | 16 | 1.4591 |
4.398 | 0.0514 | 17 | 1.4575 |
4.3894 | 0.0544 | 18 | 1.4550 |
4.61 | 0.0574 | 19 | 1.4524 |
4.4373 | 0.0604 | 20 | 1.4501 |
4.2311 | 0.0634 | 21 | 1.4478 |
4.3044 | 0.0665 | 22 | 1.4454 |
4.2496 | 0.0695 | 23 | 1.4431 |
4.3269 | 0.0725 | 24 | 1.4409 |
4.2602 | 0.0755 | 25 | 1.4385 |
4.4063 | 0.0785 | 26 | 1.4362 |
4.2922 | 0.0816 | 27 | 1.4341 |
3.8772 | 0.0846 | 28 | 1.4320 |
4.2066 | 0.0876 | 29 | 1.4298 |
4.1774 | 0.0906 | 30 | 1.4280 |
4.1788 | 0.0937 | 31 | 1.4259 |
4.2413 | 0.0967 | 32 | 1.4240 |
4.2196 | 0.0997 | 33 | 1.4220 |
4.4204 | 0.1027 | 34 | 1.4203 |
4.1017 | 0.1057 | 35 | 1.4185 |
4.25 | 0.1088 | 36 | 1.4168 |
4.1705 | 0.1118 | 37 | 1.4152 |
4.0192 | 0.1148 | 38 | 1.4136 |
4.1317 | 0.1178 | 39 | 1.4123 |
4.243 | 0.1208 | 40 | 1.4107 |
3.9988 | 0.1239 | 41 | 1.4094 |
3.9451 | 0.1269 | 42 | 1.4081 |
4.0244 | 0.1299 | 43 | 1.4068 |
4.0956 | 0.1329 | 44 | 1.4055 |
3.9313 | 0.1360 | 45 | 1.4043 |
3.9223 | 0.1390 | 46 | 1.4033 |
3.7763 | 0.1420 | 47 | 1.4021 |
3.8153 | 0.1450 | 48 | 1.4011 |
4.1018 | 0.1480 | 49 | 1.4003 |
4.2674 | 0.1511 | 50 | 1.3994 |
4.0947 | 0.1541 | 51 | 1.3984 |
3.8134 | 0.1571 | 52 | 1.3976 |
3.7836 | 0.1601 | 53 | 1.3969 |
3.9645 | 0.1631 | 54 | 1.3961 |
3.5071 | 0.1662 | 55 | 1.3955 |
3.9643 | 0.1692 | 56 | 1.3950 |
3.7689 | 0.1722 | 57 | 1.3944 |
3.9377 | 0.1752 | 58 | 1.3937 |
3.9542 | 0.1782 | 59 | 1.3932 |
3.7511 | 0.1813 | 60 | 1.3926 |
3.7859 | 0.1843 | 61 | 1.3922 |
3.7963 | 0.1873 | 62 | 1.3916 |
3.7728 | 0.1903 | 63 | 1.3913 |
3.7714 | 0.1934 | 64 | 1.3910 |
3.9066 | 0.1964 | 65 | 1.3906 |
3.9718 | 0.1994 | 66 | 1.3903 |
3.7207 | 0.2024 | 67 | 1.3901 |
3.7618 | 0.2054 | 68 | 1.3896 |
3.5737 | 0.2085 | 69 | 1.3895 |
3.6742 | 0.2115 | 70 | 1.3892 |
3.6627 | 0.2145 | 71 | 1.3892 |
3.7246 | 0.2175 | 72 | 1.3889 |
3.5988 | 0.2205 | 73 | 1.3887 |
3.5042 | 0.2236 | 74 | 1.3885 |
3.6754 | 0.2266 | 75 | 1.3884 |
3.8237 | 0.2296 | 76 | 1.3883 |
3.6541 | 0.2326 | 77 | 1.3882 |
3.8442 | 0.2356 | 78 | 1.3881 |
3.8189 | 0.2387 | 79 | 1.3881 |
3.4796 | 0.2417 | 80 | 1.3879 |
3.7061 | 0.2447 | 81 | 1.3880 |
3.7453 | 0.2477 | 82 | 1.3880 |
3.4375 | 0.2508 | 83 | 1.3880 |
3.6748 | 0.2538 | 84 | 1.3881 |
3.6132 | 0.2568 | 85 | 1.3880 |
3.6022 | 0.2598 | 86 | 1.3879 |
3.9084 | 0.2628 | 87 | 1.3879 |
3.5629 | 0.2659 | 88 | 1.3882 |
3.6004 | 0.2689 | 89 | 1.3883 |
3.8498 | 0.2719 | 90 | 1.3883 |
3.5523 | 0.2749 | 91 | 1.3884 |
3.7526 | 0.2779 | 92 | 1.3886 |
3.7638 | 0.2810 | 93 | 1.3887 |
3.6319 | 0.2840 | 94 | 1.3888 |
3.551 | 0.2870 | 95 | 1.3888 |
3.8053 | 0.2900 | 96 | 1.3890 |
3.6299 | 0.2931 | 97 | 1.3891 |
3.8778 | 0.2961 | 98 | 1.3891 |
3.4661 | 0.2991 | 99 | 1.3893 |
3.6199 | 0.3021 | 100 | 1.3894 |
3.7169 | 0.3051 | 101 | 1.3897 |
3.6181 | 0.3082 | 102 | 1.3898 |
3.712 | 0.3112 | 103 | 1.3900 |
3.426 | 0.3142 | 104 | 1.3903 |
3.2462 | 0.3172 | 105 | 1.3905 |
3.4656 | 0.3202 | 106 | 1.3907 |
3.5511 | 0.3233 | 107 | 1.3910 |
3.5268 | 0.3263 | 108 | 1.3913 |
3.4383 | 0.3293 | 109 | 1.3915 |
3.5351 | 0.3323 | 110 | 1.3919 |
3.4221 | 0.3353 | 111 | 1.3922 |
3.064 | 0.3384 | 112 | 1.3926 |
3.4006 | 0.3414 | 113 | 1.3928 |
3.5908 | 0.3444 | 114 | 1.3931 |
3.5341 | 0.3474 | 115 | 1.3935 |
3.4771 | 0.3505 | 116 | 1.3938 |
3.4362 | 0.3535 | 117 | 1.3940 |
3.5801 | 0.3565 | 118 | 1.3941 |
3.5304 | 0.3595 | 119 | 1.3943 |
3.6278 | 0.3625 | 120 | 1.3945 |
3.5677 | 0.3656 | 121 | 1.3948 |
3.5208 | 0.3686 | 122 | 1.3950 |
3.5702 | 0.3716 | 123 | 1.3952 |
3.4074 | 0.3746 | 124 | 1.3954 |
3.1593 | 0.3776 | 125 | 1.3956 |
3.4503 | 0.3807 | 126 | 1.3958 |
3.6251 | 0.3837 | 127 | 1.3961 |
3.4879 | 0.3867 | 128 | 1.3965 |
3.3838 | 0.3897 | 129 | 1.3968 |
3.411 | 0.3927 | 130 | 1.3972 |
3.3504 | 0.3958 | 131 | 1.3975 |
3.2542 | 0.3988 | 132 | 1.3980 |
3.5616 | 0.4018 | 133 | 1.3984 |
3.3994 | 0.4048 | 134 | 1.3990 |
3.3805 | 0.4079 | 135 | 1.3994 |
3.4415 | 0.4109 | 136 | 1.3997 |
3.6288 | 0.4139 | 137 | 1.4001 |
3.2875 | 0.4169 | 138 | 1.4004 |
3.2699 | 0.4199 | 139 | 1.4007 |
3.6125 | 0.4230 | 140 | 1.4011 |
3.4689 | 0.4260 | 141 | 1.4013 |
3.4061 | 0.4290 | 142 | 1.4015 |
3.3196 | 0.4320 | 143 | 1.4019 |
3.4037 | 0.4350 | 144 | 1.4020 |
3.4214 | 0.4381 | 145 | 1.4024 |
3.3448 | 0.4411 | 146 | 1.4025 |
3.4775 | 0.4441 | 147 | 1.4028 |
3.4709 | 0.4471 | 148 | 1.4032 |
3.4283 | 0.4502 | 149 | 1.4033 |
3.4671 | 0.4532 | 150 | 1.4035 |
3.2426 | 0.4562 | 151 | 1.4038 |
3.4191 | 0.4592 | 152 | 1.4042 |
3.4264 | 0.4622 | 153 | 1.4045 |
3.2461 | 0.4653 | 154 | 1.4050 |
3.3855 | 0.4683 | 155 | 1.4053 |
3.313 | 0.4713 | 156 | 1.4056 |
3.3058 | 0.4743 | 157 | 1.4059 |
3.5508 | 0.4773 | 158 | 1.4062 |
3.2109 | 0.4804 | 159 | 1.4066 |
3.5045 | 0.4834 | 160 | 1.4068 |
3.4068 | 0.4864 | 161 | 1.4071 |
3.3438 | 0.4894 | 162 | 1.4075 |
3.3953 | 0.4924 | 163 | 1.4078 |
3.2312 | 0.4955 | 164 | 1.4079 |
3.2971 | 0.4985 | 165 | 1.4084 |
3.3118 | 0.5015 | 166 | 1.4086 |
3.4395 | 0.5045 | 167 | 1.4088 |
3.7162 | 0.5076 | 168 | 1.4089 |
3.3864 | 0.5106 | 169 | 1.4091 |
3.0887 | 0.5136 | 170 | 1.4092 |
2.9898 | 0.5166 | 171 | 1.4095 |
3.4697 | 0.5196 | 172 | 1.4099 |
3.2762 | 0.5227 | 173 | 1.4102 |
3.1383 | 0.5257 | 174 | 1.4106 |
3.2522 | 0.5287 | 175 | 1.4112 |
3.309 | 0.5317 | 176 | 1.4117 |
3.4431 | 0.5347 | 177 | 1.4121 |
3.1366 | 0.5378 | 178 | 1.4126 |
3.3094 | 0.5408 | 179 | 1.4131 |
3.4466 | 0.5438 | 180 | 1.4136 |
3.3411 | 0.5468 | 181 | 1.4140 |
3.013 | 0.5498 | 182 | 1.4143 |
3.4785 | 0.5529 | 183 | 1.4147 |
3.0358 | 0.5559 | 184 | 1.4152 |
3.2833 | 0.5589 | 185 | 1.4158 |
3.2953 | 0.5619 | 186 | 1.4162 |
3.3485 | 0.5650 | 187 | 1.4167 |
3.4911 | 0.5680 | 188 | 1.4172 |
3.3863 | 0.5710 | 189 | 1.4176 |
3.1944 | 0.5740 | 190 | 1.4180 |
3.2994 | 0.5770 | 191 | 1.4185 |
3.4385 | 0.5801 | 192 | 1.4187 |
3.3346 | 0.5831 | 193 | 1.4191 |
3.3318 | 0.5861 | 194 | 1.4194 |
3.4027 | 0.5891 | 195 | 1.4198 |
3.2532 | 0.5921 | 196 | 1.4201 |
3.2351 | 0.5952 | 197 | 1.4205 |
3.4589 | 0.5982 | 198 | 1.4203 |
3.4375 | 0.6012 | 199 | 1.4204 |
3.0901 | 0.6042 | 200 | 1.4207 |
3.3186 | 0.6073 | 201 | 1.4210 |
3.2891 | 0.6103 | 202 | 1.4212 |
3.2752 | 0.6133 | 203 | 1.4217 |
3.3808 | 0.6163 | 204 | 1.4221 |
3.376 | 0.6193 | 205 | 1.4223 |
3.4086 | 0.6224 | 206 | 1.4226 |
3.3506 | 0.6254 | 207 | 1.4228 |
3.4508 | 0.6284 | 208 | 1.4232 |
3.4237 | 0.6314 | 209 | 1.4235 |
3.2154 | 0.6344 | 210 | 1.4240 |
3.2379 | 0.6375 | 211 | 1.4244 |
2.8335 | 0.6405 | 212 | 1.4251 |
3.1927 | 0.6435 | 213 | 1.4251 |
3.1871 | 0.6465 | 214 | 1.4256 |
3.1004 | 0.6495 | 215 | 1.4259 |
3.2405 | 0.6526 | 216 | 1.4264 |
3.1544 | 0.6556 | 217 | 1.4269 |
3.1204 | 0.6586 | 218 | 1.4274 |
3.3257 | 0.6616 | 219 | 1.4280 |
3.2689 | 0.6647 | 220 | 1.4286 |
3.0117 | 0.6677 | 221 | 1.4289 |
3.4276 | 0.6707 | 222 | 1.4295 |
3.2358 | 0.6737 | 223 | 1.4301 |
3.1374 | 0.6767 | 224 | 1.4307 |
3.2972 | 0.6798 | 225 | 1.4312 |
3.2838 | 0.6828 | 226 | 1.4318 |
3.2839 | 0.6858 | 227 | 1.4322 |
3.2228 | 0.6888 | 228 | 1.4328 |
3.2605 | 0.6918 | 229 | 1.4334 |
3.2945 | 0.6949 | 230 | 1.4340 |
3.3155 | 0.6979 | 231 | 1.4344 |
3.1988 | 0.7009 | 232 | 1.4349 |
3.2921 | 0.7039 | 233 | 1.4355 |
2.8752 | 0.7069 | 234 | 1.4359 |
3.0065 | 0.7100 | 235 | 1.4361 |
3.1689 | 0.7130 | 236 | 1.4366 |
3.1959 | 0.7160 | 237 | 1.4370 |
3.3473 | 0.7190 | 238 | 1.4373 |
3.2927 | 0.7221 | 239 | 1.4377 |
2.9934 | 0.7251 | 240 | 1.4379 |
3.2058 | 0.7281 | 241 | 1.4384 |
3.1388 | 0.7311 | 242 | 1.4388 |
3.2384 | 0.7341 | 243 | 1.4388 |
3.2028 | 0.7372 | 244 | 1.4392 |
3.3737 | 0.7402 | 245 | 1.4392 |
3.166 | 0.7432 | 246 | 1.4397 |
3.0255 | 0.7462 | 247 | 1.4397 |
3.0979 | 0.7492 | 248 | 1.4401 |
3.2436 | 0.7523 | 249 | 1.4404 |
3.1785 | 0.7553 | 250 | 1.4408 |
3.2052 | 0.7583 | 251 | 1.4411 |
3.1967 | 0.7613 | 252 | 1.4413 |
2.9086 | 0.7644 | 253 | 1.4416 |
3.355 | 0.7674 | 254 | 1.4421 |
3.4027 | 0.7704 | 255 | 1.4422 |
2.9307 | 0.7734 | 256 | 1.4428 |
3.1738 | 0.7764 | 257 | 1.4429 |
3.3088 | 0.7795 | 258 | 1.4431 |
3.4942 | 0.7825 | 259 | 1.4432 |
2.831 | 0.7855 | 260 | 1.4437 |
3.1675 | 0.7885 | 261 | 1.4444 |
3.3274 | 0.7915 | 262 | 1.4447 |
3.0326 | 0.7946 | 263 | 1.4448 |
3.3138 | 0.7976 | 264 | 1.4454 |
3.2153 | 0.8006 | 265 | 1.4455 |
3.3983 | 0.8036 | 266 | 1.4458 |
3.179 | 0.8066 | 267 | 1.4461 |
3.2621 | 0.8097 | 268 | 1.4464 |
3.0191 | 0.8127 | 269 | 1.4468 |
3.058 | 0.8157 | 270 | 1.4472 |
3.3188 | 0.8187 | 271 | 1.4478 |
2.9837 | 0.8218 | 272 | 1.4481 |
3.2624 | 0.8248 | 273 | 1.4486 |
3.2701 | 0.8278 | 274 | 1.4492 |
3.1579 | 0.8308 | 275 | 1.4497 |
3.3164 | 0.8338 | 276 | 1.4501 |
2.9827 | 0.8369 | 277 | 1.4507 |
3.1842 | 0.8399 | 278 | 1.4512 |
3.2366 | 0.8429 | 279 | 1.4519 |
3.0562 | 0.8459 | 280 | 1.4520 |
2.9503 | 0.8489 | 281 | 1.4526 |
3.0441 | 0.8520 | 282 | 1.4530 |
3.4535 | 0.8550 | 283 | 1.4534 |
3.2656 | 0.8580 | 284 | 1.4537 |
3.3452 | 0.8610 | 285 | 1.4541 |
3.0958 | 0.8640 | 286 | 1.4548 |
3.1579 | 0.8671 | 287 | 1.4553 |
3.1473 | 0.8701 | 288 | 1.4556 |
3.2825 | 0.8731 | 289 | 1.4559 |
2.8554 | 0.8761 | 290 | 1.4563 |
3.2792 | 0.8792 | 291 | 1.4566 |
3.0977 | 0.8822 | 292 | 1.4567 |
3.0414 | 0.8852 | 293 | 1.4568 |
3.2151 | 0.8882 | 294 | 1.4569 |
3.1287 | 0.8912 | 295 | 1.4571 |
3.1167 | 0.8943 | 296 | 1.4572 |
3.1497 | 0.8973 | 297 | 1.4574 |
3.0451 | 0.9003 | 298 | 1.4576 |
3.147 | 0.9033 | 299 | 1.4578 |
3.2183 | 0.9063 | 300 | 1.4582 |
3.0974 | 0.9094 | 301 | 1.4585 |
3.1824 | 0.9124 | 302 | 1.4589 |
3.0607 | 0.9154 | 303 | 1.4593 |
3.1255 | 0.9184 | 304 | 1.4598 |
2.6534 | 0.9215 | 305 | 1.4604 |
2.9006 | 0.9245 | 306 | 1.4605 |
3.336 | 0.9275 | 307 | 1.4609 |
3.2408 | 0.9305 | 308 | 1.4609 |
3.0551 | 0.9335 | 309 | 1.4610 |
2.8721 | 0.9366 | 310 | 1.4610 |
3.1009 | 0.9396 | 311 | 1.4610 |
3.3979 | 0.9426 | 312 | 1.4608 |
3.133 | 0.9456 | 313 | 1.4609 |
3.1008 | 0.9486 | 314 | 1.4609 |
3.2113 | 0.9517 | 315 | 1.4610 |
3.161 | 0.9547 | 316 | 1.4612 |
2.968 | 0.9577 | 317 | 1.4614 |
2.936 | 0.9607 | 318 | 1.4619 |
3.4561 | 0.9637 | 319 | 1.4622 |
3.1529 | 0.9668 | 320 | 1.4625 |
3.1159 | 0.9698 | 321 | 1.4629 |
3.2588 | 0.9728 | 322 | 1.4630 |
2.9729 | 0.9758 | 323 | 1.4633 |
3.2778 | 0.9789 | 324 | 1.4636 |
2.9019 | 0.9819 | 325 | 1.4638 |
3.094 | 0.9849 | 326 | 1.4643 |
3.0259 | 0.9879 | 327 | 1.4647 |
3.3842 | 0.9909 | 328 | 1.4652 |
3.217 | 0.9940 | 329 | 1.4655 |
3.4145 | 0.9970 | 330 | 1.4658 |
3.3328 | 1.0 | 331 | 1.4660 |
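
Note that the validation loss bottoms out around 1.3879 near steps 80–87 and then climbs steadily back to 1.4660 by the end of the epoch, so the final weights are not the best checkpoint by validation loss. If rerunning, a sketch of the flags that would retain the best checkpoint instead (an assumed follow-up, not part of the original run):

```python
from transformers import TrainingArguments

# Keep the checkpoint with the lowest validation loss rather than the final
# one. The original run evaluated every step; a coarser interval is cheaper.
args = TrainingArguments(
    output_dir="tfa_output",
    eval_strategy="steps",
    eval_steps=10,
    save_strategy="steps",
    save_steps=10,                      # must be a multiple of eval_steps
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```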
### Framework versions
- Transformers 4.48.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
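
A small sketch for checking that a local environment matches the versions above:

```python
# Quick environment check against the framework versions listed above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.48.0",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"expected {version}"
    print(f"{name}: {found[name]} ({status})")
```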