mrm8488 committed
Commit ceb302c · verified · 1 Parent(s): b140468

Create README.md

Files changed (1)
  1. README.md +48 -0
README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ license: wtfpl
+ datasets:
+ - teknium/openhermes
+ pipeline_tag: text-generation
+ thumbnail:
+ language:
+ - en
+ ---
+
+ # StableLM 2 (1.6B) fine-tuned on OpenHermes
+
+ <div style="text-align:center;width:250px;height:250px;">
+ <img src="" alt="logo">
+ </div>
+
+
+ ## Base model info
+
+ [StableLM 2 1.6B](https://huggingface.co/stabilityai/stablelm-2-1_6b) is a 1.6 billion parameter decoder-only language model from Stability AI,
+ pre-trained on approximately 2 trillion tokens of diverse multilingual and code data.
+
+ ## Dataset info
+
+ OpenHermes 13B was the first fine-tune in the Hermes series trained on a fully open-source dataset.
+ OpenHermes is composed of 242,000 entries of primarily GPT-4 generated data, drawn from open datasets across the AI landscape, including:
+
+ - GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct datasets, by Teknium
+ - WizardLM (v1, evol_instruct 70k), by the WizardLM Team/nlpxucan
+ - Airoboros GPT-4 (v1.0), by JonDurbin
+ - Camel-AI's domain expert datasets, by the Camel-AI Team
+ - CodeAlpaca, by Sahil2801
+ - GPT4-LLM and Unnatural Instructions, by Microsoft
+
+ Filtering included removing OpenAI refusals, disclaimers, and "As an AI"-style examples, among other cleanups.
+ The base dataset mix is identical to the original Nous-Hermes mix, minus the Nous-Instruct and PDACTL datasets, which were private.
+
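+ As a quick way to inspect the data, the snippet below loads OpenHermes with the 🤗 `datasets` library. The `teknium/openhermes` repo id comes from this card's metadata; the split name and record layout are assumptions and may differ.
+
+ ```python
+ # Minimal sketch: peek at a few OpenHermes records.
+ # Assumes the dataset exposes a "train" split; column names may differ.
+ from datasets import load_dataset
+
+ ds = load_dataset("teknium/openhermes", split="train")
+ print(ds)     # number of rows and column names
+ print(ds[0])  # first record
+ ```
+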
+ ## Usage
+
+ WIP; an unofficial loading sketch follows in the meantime.
+
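+ The repo id `mrm8488/stablelm-2-1_6b-ft-openhermes`, the `trust_remote_code` flag, and the generation settings below are assumptions, not confirmed by this card; adjust them to the actual checkpoint and prompt template.
+
+ ```python
+ # Minimal sketch, not an official example: load the fine-tuned model and generate text.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "mrm8488/stablelm-2-1_6b-ft-openhermes"  # hypothetical repo id
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.float16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ prompt = "Explain what instruction tuning is in two sentences."
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```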
+
+ ## Evaluations
+
+ WIP