Create README.md
README.md
ADDED
@@ -0,0 +1,112 @@
---
license: afl-3.0
library_name: transformers
tags:
- UNA
- juanako
datasets:
- jondurbin/py-dpo-v0.1
- Replete-AI/code_bagel_hermes-2.5
- mlabonne/orpo-dpo-mix-40k
---

# UNA-ThePitbull 21.4B v2

Introducing the best LLM in the industry: nearly as good as a 70B model at just 21.4B parameters, based on `saltlux/luxia-21.4b-alignment-v1.0`.
![UNA - ThePitbull 21.4B v2](https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1/resolve/main/UNA-ThePitbull.png)

This model has not been poisoned to score high while being useless. We release it because it is the real deal: EQ and IQ together in a powerful, smart, and conversational model. So far it is #1 among such models as of 25/5/2024.

Quant version available at ... soon.

## Difference V1 vs V2

In V2 we implemented a different UNA strategy and partially covered the MLP and attention layers.
We also performed further SFT and further DPO over V1, and we will release some of those models soon as well.

### Changes

1. SFT over V1 with `Replete-AI/code_bagel_hermes-2.5`, learning rate 1.0e-4 down to 5.0e-5
2. DPO with learning rate 1.0e-4 down to min_lr 5.0e-5, using:
   * `mlabonne/orpo-dpo-mix-40k`
   * `jondurbin/py-dpo-v0.1`
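
The learning-rate range quoted in the steps above (1.0e-4 down to a minimum of 5.0e-5) can be pictured with a small sketch. The README does not say which decay shape was used, so the cosine curve below is an assumption, and `lr_at_step` and `total_steps` are hypothetical names chosen only for illustration.

```python
import math


def lr_at_step(step: int, total_steps: int,
               max_lr: float = 1.0e-4, min_lr: float = 5.0e-5) -> float:
    """Cosine decay from max_lr to min_lr.

    The cosine shape is an assumption; the README only gives the two endpoints.
    """
    if total_steps <= 1:
        return max_lr
    # Clamp the step into [0, total_steps - 1] and map it to [0, 1].
    progress = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 999):
        print(s, f"{lr_at_step(s, total):.2e}")
```

With this shape the rate starts at 1.0e-4, never drops below 5.0e-5, and reaches that floor on the last step. A linear decay between the same endpoints would fit the description in the list equally well.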

## Evaluations

This model can only be compared with its non-UNA base model, the original luxia-21.4b, and with ThePitbull-v1.

## UNA v2 (VLLM) Evaluations
```
vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7695|±  |0.0116|+
|              |       |flexible-extract|     5|exact_match|0.7695|±  |0.0116|+
|hellaswag     |      1|none            |    10|acc        |0.8110|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9169|±  |0.0028|+
|winogrande    |      1|none            |     5|acc        |0.8777|±  |0.0092|+
|mmlu          |    N/A|none            |     0|acc        |0.6427|±  |0.0038|-
|arc_challenge |      1|none            |    25|acc        |0.7713|±  |0.0123|
|              |       |none            |    25|acc_norm   |0.7875|±  |0.0120|+
|truthfulqa_mc2|      2|none            |     0|acc        |0.7824|±  |0.0135|-
|mathqa        |      1|none            |     0|acc        |0.4037|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|+
|pubmedqa      |      1|none            |     0|acc        |0.7260|±  | 0.020|+
|boolq         |      2|none            |     0|acc        |0.8602|±  |0.0061|+
```
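
The first line of the block above records the run configuration as a comma-separated `key=value` list. As a minimal sketch, the snippet below parses that string and multiplies the two parallelism settings. `parse_model_args` is a hypothetical helper, not part of any library, and reading the product as the number of GPUs used is an assumption about how the run was launched.

```python
def parse_model_args(arg_string: str) -> dict[str, str]:
    """Split 'k1=v1,k2=v2' into a dict; values are kept as strings."""
    args = {}
    for item in arg_string.split(","):
        key, _, value = item.partition("=")
        args[key.strip()] = value.strip()
    return args


# Copied verbatim from the configuration line of the v2 evaluation output.
MODEL_ARGS = (
    "pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,"
    "gpu_memory_utilization=0.8,max_model_len=2048,"
    "data_parallel_size=2,tensor_parallel_size=4"
)

args = parse_model_args(MODEL_ARGS)

# Assumption: replicas x shards per replica = GPUs in use for this run.
gpus = int(args["data_parallel_size"]) * int(args["tensor_parallel_size"])

if __name__ == "__main__":
    for key, value in args.items():
        print(f"{key:>24} = {value}")
    print("GPUs implied by the parallelism settings:", gpus)
```

Note that `max_model_len=2048` caps the context length during evaluation, which matters when comparing against runs made with a different limit.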

## UNA v1 (VLLM) Evaluations
```
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7566|±  |0.0118|
|              |       |flexible-extract|     5|exact_match|0.7582|±  |0.0118|
|hellaswag     |      1|none            |    10|acc        |0.8168|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9188|±  |0.0027|
|winogrande    |      1|none            |     5|acc        |0.8635|±  |0.0097|
|mmlu          |    N/A|none            |     0|acc        |0.6444|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7747|±  |0.0122|
|              |       |none            |    25|acc_norm   |0.7850|±  |0.0120|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7902|±  |0.0134|
|mathqa        |      1|none            |     0|acc        |0.4030|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|
|pubmedqa      |      1|none            |     0|acc        |0.6860|±  |0.0208|
|boolq         |      2|none            |     0|acc        |0.8401|±  |0.0064|
```

## Original (VLLM) Evaluations
```
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7528|±  |0.0119|
|              |       |flexible-extract|     5|exact_match|0.7521|±  |0.0119|
|hellaswag     |      1|none            |    10|acc        |0.8117|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9167|±  |0.0028|
|winogrande    |      1|none            |     5|acc        |0.8682|±  |0.0095|
|mmlu          |    N/A|none            |     0|acc        |0.6448|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7688|±  |0.0123|
|              |       |none            |    25|acc_norm   |0.7730|±  |0.0122|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7895|±  |0.0133|
|mathqa        |      1|none            |     0|acc        |0.4000|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4003|±  | 0.009|
|pubmedqa      |      1|none            |     0|acc        |0.6680|±  |0.0211|
|boolq         |      2|none            |     0|acc        |0.8346|±  |0.0065|
```
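
To read the three tables side by side, the sketch below takes one metric per task from each table, computes the v2 score minus a baseline, and flags differences larger than the combined standard error. The values are copied from the tables; `compare` and `summarize` are hypothetical helper names, and the combined-error rule is a rough yardstick rather than a proper significance test.

```python
import math

# (value, stderr) pairs copied from the three evaluation tables above.
SCORES = {
    "gsm8k strict-match":     {"v2": (0.7695, 0.0116), "v1": (0.7566, 0.0118), "orig": (0.7528, 0.0119)},
    "hellaswag acc_norm":     {"v2": (0.9169, 0.0028), "v1": (0.9188, 0.0027), "orig": (0.9167, 0.0028)},
    "winogrande acc":         {"v2": (0.8777, 0.0092), "v1": (0.8635, 0.0097), "orig": (0.8682, 0.0095)},
    "mmlu acc":               {"v2": (0.6427, 0.0038), "v1": (0.6444, 0.0038), "orig": (0.6448, 0.0038)},
    "arc_challenge acc_norm": {"v2": (0.7875, 0.0120), "v1": (0.7850, 0.0120), "orig": (0.7730, 0.0122)},
    "truthfulqa_mc2 acc":     {"v2": (0.7824, 0.0135), "v1": (0.7902, 0.0134), "orig": (0.7895, 0.0133)},
    "pubmedqa acc":           {"v2": (0.7260, 0.020),  "v1": (0.6860, 0.0208), "orig": (0.6680, 0.0211)},
    "boolq acc":              {"v2": (0.8602, 0.0061), "v1": (0.8401, 0.0064), "orig": (0.8346, 0.0065)},
}


def compare(a, b):
    """Return (a - b, combined stderr) for two (value, stderr) pairs."""
    delta = a[0] - b[0]
    combined = math.sqrt(a[1] ** 2 + b[1] ** 2)
    return delta, combined


def summarize(baseline: str):
    """List (task, delta, exceeds_combined_stderr) for v2 versus a baseline."""
    rows = []
    for task, runs in SCORES.items():
        delta, combined = compare(runs["v2"], runs[baseline])
        rows.append((task, round(delta, 4), abs(delta) > combined))
    return rows


if __name__ == "__main__":
    for baseline in ("v1", "orig"):
        print(f"v2 vs {baseline}")
        for task, delta, flagged in summarize(baseline):
            mark = "*" if flagged else " "
            print(f"  {task:<24} {delta:+.4f} {mark}")
```

Because all three models are scored on the same benchmark items, the errors are not independent, so the flag should be read only as a hint about which differences are likely to be more than noise.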

## Citations
* saltlux
* mlabonne
* jondurbin & Replete-AI
* bartowski & TheBloke

If you use UNA models, don't forget to cite:
```
@misc{unathepitbull21b,
  title={ThePitbull: Uniform Neural Alignment},
  author={Xavier Murias},
  year={2024},
  publisher = {Juanako.AI},
  journal = {HuggingFace repository},
  howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
}
```