Model save

Files changed (6)

README.md +21 -8
adapter_model.safetensors +1 -1
all_results.json +5 -5
runs/Aug24_00-16-04_nova.cs.ucla.edu/events.out.tfevents.1724484907.nova.cs.ucla.edu.2225210.0 +2 -2
train_results.json +5 -5
trainer_state.json +0 -0

README.md CHANGED

@@ -1,14 +1,12 @@
 ---
-license: llama3
 library_name: peft
 tags:
-- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
-base_model: meta-llama/Meta-Llama-3-8B
-datasets:
-- yihanwang617/ultrachat_200k_processed_indicator_0.6_4k
 model-index:
 - name: llama-3-qlora-ultrachat-200k-processed-indicator-0.6
   results: []
@@ -19,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 # llama-3-qlora-ultrachat-200k-processed-indicator-0.6
-This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the yihanwang617/ultrachat_200k_processed_indicator_0.6_4k dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.0235
 ## Model description
@@ -58,7 +56,22 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.008         | 0.9997 | 3247 | 1.0235          |
 ### Framework versions

 ---
+base_model: meta-llama/Meta-Llama-3-8B
 library_name: peft
+license: llama3
 tags:
 - trl
 - sft
+- alignment-handbook
 - generated_from_trainer
 model-index:
 - name: llama-3-qlora-ultrachat-200k-processed-indicator-0.6
   results: []
 # llama-3-qlora-ultrachat-200k-processed-indicator-0.6
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.0200
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.0614        | 0.0616 | 200  | 1.0632          |
+| 1.0689        | 0.1232 | 400  | 1.0476          |
+| 1.0053        | 0.1847 | 600  | 1.0413          |
+| 1.0446        | 0.2463 | 800  | 1.0366          |
+| 1.0091        | 0.3079 | 1000 | 1.0336          |
+| 1.0093        | 0.3695 | 1200 | 1.0310          |
+| 1.0086        | 0.4311 | 1400 | 1.0291          |
+| 1.0362        | 0.4926 | 1600 | 1.0270          |
+| 1.0155        | 0.5542 | 1800 | 1.0256          |
+| 1.0138        | 0.6158 | 2000 | 1.0240          |
+| 1.0392        | 0.6774 | 2200 | 1.0226          |
+| 1.0079        | 0.7389 | 2400 | 1.0216          |
+| 1.0139        | 0.8005 | 2600 | 1.0208          |
+| 0.9857        | 0.8621 | 2800 | 1.0204          |
+| 1.0258        | 0.9237 | 3000 | 1.0201          |
+| 1.0147        | 0.9853 | 3200 | 1.0200          |
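
The added evaluation rows can be sanity-checked with a short script. The following is a minimal sketch, not part of the commit: the `ROWS` list is copied by hand from the table above, and the helper names `is_non_increasing` and `steps_per_epoch` are hypothetical. It checks that validation loss never rises between checkpoints and estimates the number of optimizer steps in one epoch from the last logged row.

```python
# Rows copied from the training-results table added in this commit:
# (training loss, epoch, step, validation loss)
ROWS = [
    (1.0614, 0.0616, 200, 1.0632),
    (1.0689, 0.1232, 400, 1.0476),
    (1.0053, 0.1847, 600, 1.0413),
    (1.0446, 0.2463, 800, 1.0366),
    (1.0091, 0.3079, 1000, 1.0336),
    (1.0093, 0.3695, 1200, 1.0310),
    (1.0086, 0.4311, 1400, 1.0291),
    (1.0362, 0.4926, 1600, 1.0270),
    (1.0155, 0.5542, 1800, 1.0256),
    (1.0138, 0.6158, 2000, 1.0240),
    (1.0392, 0.6774, 2200, 1.0226),
    (1.0079, 0.7389, 2400, 1.0216),
    (1.0139, 0.8005, 2600, 1.0208),
    (0.9857, 0.8621, 2800, 1.0204),
    (1.0258, 0.9237, 3000, 1.0201),
    (1.0147, 0.9853, 3200, 1.0200),
]


def is_non_increasing(values):
    """Return True if no value is larger than the one before it."""
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


def steps_per_epoch(rows):
    """Estimate steps in one epoch from the last logged (epoch, step) pair."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


val_losses = [row[3] for row in ROWS]
print("validation loss non-increasing:", is_non_increasing(val_losses))
print("best validation loss:", min(val_losses))
print("estimated steps per epoch:", round(steps_per_epoch(ROWS)))
```

The epoch column is rounded to four decimals in the table, so the steps-per-epoch figure is only an estimate.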
 ### Framework versions

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3f9022babcfd34e80b4b7cd5ff487d739f4a9f6923cf01f3ba717387cd624e04
 size 2185327392

 version https://git-lfs.github.com/spec/v1
+oid sha256:c7efffb66a421f2b1b787112c4a4ad120b81b7a07960a8e4f38b5376ac182154
 size 2185327392
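
The diff above only shows the Git LFS pointer, not the adapter weights themselves. As a minimal sketch (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is something the reader would supply), a pointer can be parsed and a downloaded file checked against its recorded size and SHA-256 digest like this:

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so a multi-gigabyte file is not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New pointer for adapter_model.safetensors, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c7efffb66a421f2b1b787112c4a4ad120b81b7a07960a8e4f38b5376ac182154
size 2185327392
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Note that the size is unchanged between the two pointers while the digest differs, so only the tensor values changed, not the adapter's shape or storage layout.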

all_results.json CHANGED

@@ -5,10 +5,10 @@
     "eval_samples": 23109,
     "eval_samples_per_second": 6.137,
     "eval_steps_per_second": 0.384,
-    "total_flos": 4.15615026158633e+16,
-    "train_loss": 1.0426253444200522,
-    "train_runtime": 139956.5138,
     "train_samples": 207864,
-    "train_samples_per_second": 1.485,
-    "train_steps_per_second": 0.023
 }

     "eval_samples": 23109,
     "eval_samples_per_second": 6.137,
     "eval_steps_per_second": 0.384,
+    "total_flos": 3.805052032096666e+16,
+    "train_loss": 0.6459755006114042,
+    "train_runtime": 124690.9481,
     "train_samples": 207864,
+    "train_samples_per_second": 1.667,
+    "train_steps_per_second": 0.026
 }
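
The throughput fields in both versions appear to be derived from the sample count and the runtime. The sketch below is an illustration under that assumption, not a statement about how the training library computes them; the `OLD` and `NEW` dicts are copied from the diff above and the helper name `derived_samples_per_second` is hypothetical.

```python
# Values copied from the removed (OLD) and added (NEW) lines of all_results.json.
OLD = {
    "train_runtime": 139956.5138,
    "train_samples": 207864,
    "train_samples_per_second": 1.485,
}
NEW = {
    "train_runtime": 124690.9481,
    "train_samples": 207864,
    "train_samples_per_second": 1.667,
}


def derived_samples_per_second(run):
    """Recompute throughput as samples divided by runtime in seconds."""
    return run["train_samples"] / run["train_runtime"]


for name, run in (("old", OLD), ("new", NEW)):
    derived = derived_samples_per_second(run)
    hours = run["train_runtime"] / 3600
    print(
        f"{name}: reported={run['train_samples_per_second']:.3f} "
        f"derived={derived:.3f} runtime_hours={hours:.1f}"
    )
```

If the assumption holds, the derived and reported values agree to three decimals for both runs, which makes this a quick consistency check when comparing result files across commits.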

runs/Aug24_00-16-04_nova.cs.ucla.edu/events.out.tfevents.1724484907.nova.cs.ucla.edu.2225210.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3dd93d0c11d80dbb8a5370394e7b7a608f98b8eed3811ebc139cbe8854455fb2
-size 92146

 version https://git-lfs.github.com/spec/v1
+oid sha256:0ab6568c69e2f9945244db35c2bda9ed4730fd3361a8bcc690217549b7af04f3
+size 94399

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 0.9997305930800908,
-    "total_flos": 4.15615026158633e+16,
-    "train_loss": 1.0426253444200522,
-    "train_runtime": 139956.5138,
     "train_samples": 207864,
-    "train_samples_per_second": 1.485,
-    "train_steps_per_second": 0.023
 }

 {
     "epoch": 0.9997305930800908,
+    "total_flos": 3.805052032096666e+16,
+    "train_loss": 0.6459755006114042,
+    "train_runtime": 124690.9481,
     "train_samples": 207864,
+    "train_samples_per_second": 1.667,
+    "train_steps_per_second": 0.026
 }

trainer_state.json CHANGED

The diff for this file is too large to render.