papahawk committed
Commit 69addf3 · 1 Parent(s): 6e920b8

Update README.md

Files changed (1)
  1. README.md +177 -1
README.md CHANGED
@@ -1,4 +1,180 @@
-
+---
+tags:
+- generated_from_trainer
+license: mit
+datasets:
+- HuggingFaceH4/ultrachat_200k
+- HuggingFaceH4/ultrafeedback_binarized
+language:
+- en
+base_model: mistralai/Mistral-7B-v0.1
+widget:
+- text: "<|system|>\nYou are a pirate chatbot who always responds with Arr!</s>\n<|user|>\nThere's a llama on my lawn, how can I get rid of him?</s>\n<|assistant|>\n"
+  output:
+    text: "Arr! 'Tis a puzzlin' matter, me hearty! A llama on yer lawn be a rare sight, but I've got a plan that might help ye get rid of 'im. Ye'll need to gather some carrots and hay, and then lure the llama away with the promise of a tasty treat. Once he's gone, ye can clean up yer lawn and enjoy the peace and quiet once again. But beware, me hearty, for there may be more llamas where that one came from! Arr!"
+pipeline_tag: text-generation
+model-index:
+- name: devi-ai-7b-beta
+  description: "Initial fork of Zephyr 7B β, with plans for GGML integration and further development."
+  results:
+  # AI2 Reasoning Challenge (25-Shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      name: normalized accuracy
+      value: 62.03071672354948
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # HellaSwag (10-shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      name: normalized accuracy
+      value: 84.35570603465445
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # DROP (3-shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Drop (3-Shot)
+      type: drop
+      split: validation
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: f1
+      name: f1 score
+      value: 9.662437080536909
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # TruthfulQA (0-shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 57.44916942762855
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # GSM8k (5-shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      name: accuracy
+      value: 12.736921910538287
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # MMLU (5-Shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      name: accuracy
+      value: 61.07
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # Winogrande (5-shot)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      name: accuracy
+      value: 77.74269928966061
+    source:
+      name: Open LLM Leaderboard
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=HuggingFaceH4/zephyr-7b-beta
+
+  # AlpacaEval (taken from model card)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AlpacaEval
+      type: tatsu-lab/alpaca_eval
+    metrics:
+    - type: unknown
+      name: win rate
+      value: 0.9060
+    source:
+      url: https://tatsu-lab.github.io/alpaca_eval/
+
+  # MT-Bench (taken from model card)
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MT-Bench
+      type: unknown
+    metrics:
+    - type: unknown
+      name: score
+      value: 7.34
+    source:
+      url: https://huggingface.co/spaces/lmsys/mt-bench
+---
 
 <img src="https://alt-web.xyz/images/rainbow.png" alt="Rainbow Solutions" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
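
The `widget` entry in the added frontmatter doubles as a usage example: it shows the Zephyr-style chat format the model expects, with `<|system|>`, `<|user|>`, and `<|assistant|>` turns each closed by `</s>`. Below is a minimal sketch of driving that same prompt through the 🤗 Transformers chat-template API; the repo id `papahawk/devi-ai-7b-beta` is an assumption inferred from this card's model name, so substitute the actual checkpoint id.

```python
# Minimal sketch. The repo id is hypothetical; the tokenizer's chat template is
# assumed to match the <|system|>/<|user|>/<|assistant|> format from the widget.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="papahawk/devi-ai-7b-beta",  # hypothetical id, replace with the real one
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds with Arr!"},
    {"role": "user", "content": "There's a llama on my lawn, how can I get rid of him?"},
]

# Render the chat messages into the model's prompt format, then generate.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```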