Update README.md
README.md (CHANGED)

@@ -27,10 +27,10 @@ tags:
 This model has a slightly different architecture and training style:

 1. The model was followed by continual pretraining (lm_head + embedding layers were tuned).
-2. Base model was
+2. Base model was pretrained on 75k instruction/response pairs and merged.
 3. Similar architecture to the palmer series, but with a smaller context size (8192).

-In short, palmer is now half the size, twice the speed and same overall performance with a
+In short, palmer is now half the size and twice the speed, with the same overall performance and a notable improvement on mmlu and arc challenge instead of winogrande.

 As with all palmer models, the model is biased to respond without any specific prompt; feel free to further fine-tune it for your specific use case.
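Point 1 above describes continual pretraining in which only the lm_head and embedding layers are tuned. A minimal PyTorch sketch of that freezing setup (the `TinyLM` module and its layer names are illustrative stand-ins, not the actual palmer architecture):

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    # Minimal stand-in for a decoder-only LM: embedding -> body -> lm_head.
    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.body = nn.Linear(dim, dim)  # placeholder for the transformer blocks
        self.lm_head = nn.Linear(dim, vocab_size)

def freeze_all_but_embeddings_and_head(model):
    # Leave only embedding and lm_head parameters trainable,
    # mirroring the continual-pretraining setup described above.
    for name, p in model.named_parameters():
        p.requires_grad = name.startswith(("embed", "lm_head"))
    return [p for p in model.parameters() if p.requires_grad]

model = TinyLM()
trainable = freeze_all_but_embeddings_and_head(model)
optimizer = torch.optim.AdamW(trainable, lr=1e-5)  # optimizer sees only the tuned layers
```

On a real checkpoint the same loop would run over the loaded model's `named_parameters()`, with the prefixes adjusted to that architecture's actual layer names.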
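Point 2 says the instruction-pretrained weights were merged; the README does not specify the merge method, so here is a simple linear-interpolation sketch over plain Python lists (function and key names are hypothetical, not from the palmer repo):

```python
def merge_weights(sd_a, sd_b, alpha=0.5):
    # Interpolate two checkpoints' parameters element-wise:
    # alpha = 0.5 is a plain average of the two models.
    return {
        k: [alpha * a + (1 - alpha) * b for a, b in zip(sd_a[k], sd_b[k])]
        for k in sd_a
    }

base = {"lm_head.weight": [1.0, 3.0]}
tuned = {"lm_head.weight": [3.0, 5.0]}
merged = merge_weights(base, tuned)  # averages each weight of the two checkpoints
```

In practice the same interpolation would be applied to tensors in the two models' state dicts rather than Python lists.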