End of training

Files changed (4)

README.md +59 -12
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9079
 ## Model description
@@ -46,22 +46,69 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
-- training_steps: 1000
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.0206        | 0.28  | 100  | 2.8708          |
-| 2.8007        | 0.56  | 200  | 2.6294          |
-| 2.5551        | 0.85  | 300  | 2.4552          |
-| 2.3554        | 1.13  | 400  | 2.2903          |
-| 2.2233        | 1.41  | 500  | 2.1771          |
-| 2.1114        | 1.69  | 600  | 2.0850          |
-| 2.0211        | 1.97  | 700  | 2.0163          |
-| 1.8995        | 2.25  | 800  | 1.9635          |
-| 1.835         | 2.54  | 900  | 1.9226          |
-| 1.81          | 2.82  | 1000 | 1.9079          |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.9170
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
+- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 2.9836        | 0.09  | 100  | 2.8641          |
+| 2.8536        | 0.17  | 200  | 2.7929          |
+| 2.8051        | 0.26  | 300  | 2.7567          |
+| 2.7782        | 0.35  | 400  | 2.7092          |
+| 2.7542        | 0.44  | 500  | 2.6946          |
+| 2.6978        | 0.52  | 600  | 2.6719          |
+| 2.6833        | 0.61  | 700  | 2.6497          |
+| 2.6504        | 0.7   | 800  | 2.6172          |
+| 2.6228        | 0.78  | 900  | 2.6008          |
+| 2.6219        | 0.87  | 1000 | 2.5802          |
+| 2.5629        | 0.96  | 1100 | 2.5519          |
+| 2.5315        | 1.05  | 1200 | 2.5255          |
+| 2.4813        | 1.13  | 1300 | 2.5156          |
+| 2.4539        | 1.22  | 1400 | 2.4884          |
+| 2.4466        | 1.31  | 1500 | 2.4660          |
+| 2.4205        | 1.39  | 1600 | 2.4431          |
+| 2.3937        | 1.48  | 1700 | 2.4238          |
+| 2.3686        | 1.57  | 1800 | 2.4069          |
+| 2.3209        | 1.66  | 1900 | 2.3826          |
+| 2.3409        | 1.74  | 2000 | 2.3606          |
+| 2.2874        | 1.83  | 2100 | 2.3453          |
+| 2.309         | 1.92  | 2200 | 2.3222          |
+| 2.2676        | 2.01  | 2300 | 2.2981          |
+| 2.1734        | 2.09  | 2400 | 2.2892          |
+| 2.1495        | 2.18  | 2500 | 2.2549          |
+| 2.1163        | 2.27  | 2600 | 2.2401          |
+| 2.1           | 2.35  | 2700 | 2.2317          |
+| 2.1046        | 2.44  | 2800 | 2.2153          |
+| 2.1138        | 2.53  | 2900 | 2.1938          |
+| 2.0691        | 2.62  | 3000 | 2.1775          |
+| 2.0945        | 2.7   | 3100 | 2.1563          |
+| 2.045         | 2.79  | 3200 | 2.1408          |
+| 2.0212        | 2.88  | 3300 | 2.1229          |
+| 2.0011        | 2.96  | 3400 | 2.1156          |
+| 1.983         | 3.05  | 3500 | 2.0942          |
+| 1.9309        | 3.14  | 3600 | 2.0769          |
+| 1.8844        | 3.23  | 3700 | 2.0709          |
+| 1.9085        | 3.31  | 3800 | 2.0589          |
+| 1.8827        | 3.4   | 3900 | 2.0405          |
+| 1.8511        | 3.49  | 4000 | 2.0310          |
+| 1.8807        | 3.57  | 4100 | 2.0170          |
+| 1.8437        | 3.66  | 4200 | 2.0045          |
+| 1.8667        | 3.75  | 4300 | 2.0036          |
+| 1.8081        | 3.84  | 4400 | 1.9886          |
+| 1.8688        | 3.92  | 4500 | 1.9767          |
+| 1.8187        | 4.01  | 4600 | 1.9652          |
+| 1.7511        | 4.1   | 4700 | 1.9592          |
+| 1.7384        | 4.18  | 4800 | 1.9558          |
+| 1.7843        | 4.27  | 4900 | 1.9474          |
+| 1.7389        | 4.36  | 5000 | 1.9412          |
+| 1.7465        | 4.45  | 5100 | 1.9346          |
+| 1.7483        | 4.53  | 5200 | 1.9290          |
+| 1.7149        | 4.62  | 5300 | 1.9246          |
+| 1.7154        | 4.71  | 5400 | 1.9211          |
+| 1.7637        | 4.8   | 5500 | 1.9188          |
+| 1.7559        | 4.88  | 5600 | 1.9181          |
+| 1.7204        | 4.97  | 5700 | 1.9170          |
 ### Framework versions
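
The README change replaces `training_steps: 1000` with `num_epochs: 5`, and the step-to-epoch ratio in the two tables differs as well. The following is a minimal sketch that estimates steps per epoch from the last logged row of each table. The figures are copied from the tables above; the function and variable names are illustrative only, and because the epoch column is rounded to two decimals the results are approximate.

```python
# Last logged (step, epoch, validation loss) row of each run,
# copied from the removed and added tables in this diff.
old_last = (1000, 2.82, 1.9079)   # run configured with training_steps: 1000
new_last = (5700, 4.97, 1.9170)   # run configured with num_epochs: 5


def steps_per_epoch(step: int, epoch: float) -> float:
    """Approximate optimizer steps per epoch from a single logged row."""
    return step / epoch


old_rate = steps_per_epoch(old_last[0], old_last[1])
new_rate = steps_per_epoch(new_last[0], new_last[1])

# Difference in final validation loss (new minus old).
loss_delta = new_last[2] - old_last[2]

print(f"old run: ~{old_rate:.0f} steps/epoch")
print(f"new run: ~{new_rate:.0f} steps/epoch")
print(f"ratio:   ~{new_rate / old_rate:.1f}x")
print(f"final validation loss delta: {loss_delta:+.4f}")
```

If the estimate is right, steps per epoch roughly tripled between the two runs. That would be consistent with a change in batch size, gradient accumulation, or dataset size, but none of those settings appear in the visible hunks, so this remains a guess.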

adapter_config.json CHANGED

@@ -20,10 +20,10 @@
   "revision": null,
   "target_modules": [
     "v_proj",
     "dense",
-    "fc1",
     "q_proj",
-    "k_proj",
     "fc2"
   ],
   "task_type": "CAUSAL_LM",

   "revision": null,
   "target_modules": [
     "v_proj",
+    "k_proj",
     "dense",
     "q_proj",
+    "fc1",
     "fc2"
   ],
   "task_type": "CAUSAL_LM",
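
The `target_modules` hunk moves `fc1` and `k_proj` but does not appear to add or drop any module. A small check, with both lists transcribed by hand from the two sides of the diff above:

```python
# target_modules as listed on the old and new side of the diff.
old_modules = ["v_proj", "dense", "fc1", "q_proj", "k_proj", "fc2"]
new_modules = ["v_proj", "k_proj", "dense", "q_proj", "fc1", "fc2"]

same_order = old_modules == new_modules
same_members = set(old_modules) == set(new_modules)

print(f"same order:   {same_order}")
print(f"same members: {same_members}")
print(f"added:   {sorted(set(new_modules) - set(old_modules))}")
print(f"removed: {sorted(set(old_modules) - set(new_modules))}")
```

Since the membership is identical, the +2 -2 change is probably a serialization-order artifact rather than a different adapter layout. One plausible cause, stated here as an assumption, is that the list is held as an unordered set before being written out.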

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5cd45520bede814b83497b26a30c5678f13df6261fea94d453a5587fc543f176
 size 56660888

 version https://git-lfs.github.com/spec/v1
+oid sha256:32d22604e71c0ed518393d6d1488b9015450e26a985ed7c58708f173bff03707
 size 56660888

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5298e0f030dae0ac073a6d5fa9075cb92719fbbc01f98c6f8eead4f7a77bce57
 size 4920

 version https://git-lfs.github.com/spec/v1
+oid sha256:4d8a6346625b0d0cf6bb61dc0c804b72803ad60c8336003827e98a9d80f15e4f
 size 4920
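
Both binary files are stored as Git LFS pointers, so the diff shows only a new `oid` while `size` stays the same. As a rough sketch, a downloaded file can be checked against its pointer by comparing the byte count and the SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is the new `adapter_model.safetensors` entry from above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New-side pointer for adapter_model.safetensors, copied from the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:32d22604e71c0ed518393d6d1488b9015450e26a985ed7c58708f173bff03707
size 56660888
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then report whether it corresponds to this commit, assuming the file was downloaded in full and not left as a pointer stub.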