mlfoundations-dev/oh-dcft-v3.1-claude-3-5-haiku-20241022-qwen

sedrickkeh committed 28 days ago

Commit 66ef69b · verified · 1 Parent(s): 89e4cbc

Model save

Files changed (1)

README.md +5 -6

@@ -4,7 +4,6 @@ license: apache-2.0
 base_model: Qwen/Qwen2.5-7B
 tags:
 - llama-factory
-- full
 - generated_from_trainer
 model-index:
 - name: oh-dcft-v3.1-claude-3-5-haiku-20241022-qwen
@@ -16,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 # oh-dcft-v3.1-claude-3-5-haiku-20241022-qwen
-This model is a fine-tuned version of [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) on the mlfoundations-dev/oh-dcft-v3.1-claude-3-5-haiku-20241022 dataset.
+This model is a fine-tuned version of [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4279
+- Loss: 0.4277
 ## Model description
@@ -54,9 +53,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.4292        | 1.0   | 1746 | 0.4309          |
-| 0.3757        | 2.0   | 3492 | 0.4209          |
-| 0.3268        | 3.0   | 5238 | 0.4279          |
+| 0.429         | 1.0   | 1746 | 0.4307          |
+| 0.3755        | 2.0   | 3492 | 0.4207          |
+| 0.3266        | 3.0   | 5238 | 0.4277          |
 ### Framework versions
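
A quick sanity check on the updated table, as a minimal Python sketch. The rows are copied by hand from the added (`+`) table lines above, and the variable names are illustrative, not part of the repository. It shows that the reported evaluation loss (0.4277) is the final-epoch value, while the lowest validation loss in the table occurs at epoch 2.

```python
# Rows copied from the updated README table:
# (training_loss, epoch, step, validation_loss)
rows = [
    (0.429, 1.0, 1746, 0.4307),
    (0.3755, 2.0, 3492, 0.4207),
    (0.3266, 3.0, 5238, 0.4277),
]

# Epoch with the lowest validation loss.
best = min(rows, key=lambda r: r[3])

# The last row is the checkpoint whose loss the README reports.
final = rows[-1]

# Steps logged at the end of the first epoch.
steps_per_epoch = rows[0][2]

print(f"best epoch: {best[1]} (validation loss {best[3]})")
print(f"final epoch: {final[1]} (validation loss {final[3]})")
print(f"steps per epoch: {steps_per_epoch}")
```

Training loss keeps falling across all three epochs while validation loss rises slightly between epochs 2 and 3, which is what the comparison above makes explicit.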