---
license: wtfpl
datasets:
- teknium/openhermes
pipeline_tag: text-generation
thumbnail:
language:
- en
---

# **S**tableLM 2 (1.6B) 🐍 fine-tuned on OpenHermes

<div style="text-align:center;width:250px;height:250px;">
<img src="" alt="logo">
</div>

## Base model info

[StableLM 2 1.6B](https://huggingface.co/stabilityai/stablelm-2-1_6b) is a 1.6 billion parameter, decoder-only language model from Stability AI, pre-trained on roughly 2 trillion tokens of diverse multilingual and code data. Its small footprint makes it a practical base for instruction fine-tuning, such as this OpenHermes fine-tune.

## Dataset info

The OpenHermes dataset is composed of 242,000 entries of primarily GPT-4 generated data, drawn from open datasets across the AI landscape. OpenHermes 13B was the first Hermes fine-tune trained on a fully open-source dataset. The sources include:

- GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct Datasets, by Teknium
- WizardLM (v1, evol_instruct 70k), by WizardLM Team/nlpxucan
- Airoboros GPT-4 (v1.0), by JonDurbin
- Camel-AI's domain expert datasets, by the Camel-AI Team
- CodeAlpaca, by Sahil2801
- GPT4-LLM and Unnatural Instructions, by Microsoft

Filtering included removal of OpenAI refusals, disclaimers, and "As an AI"-style examples, among others.
The base dataset mix is identical to the original Nous-Hermes', minus the Nous-Instruct and PDACTL datasets, which were private.
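
A minimal sketch of inspecting the dataset with the 🤗 `datasets` library is shown below; the split name and record layout are assumptions and may differ from the actual dataset version.

```python
# Minimal sketch (not part of the original card): peek at the fine-tuning data.
from datasets import load_dataset

# Dataset id taken from the YAML front matter above; the "train" split is assumed.
ds = load_dataset("teknium/openhermes", split="train")

print(ds)      # number of rows and column names
print(ds[0])   # first record (field names depend on the dataset version)
```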

## Usage

WIP
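
Until proper instructions are added, here is a minimal sketch of loading the model with 🤗 `transformers`; the repository id is a placeholder, and the dtype, device, prompt format, and generation settings are assumptions rather than values from this card.

```python
# Minimal sketch, not the card's official usage example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<user-or-org>/stablelm-2-1_6b-openhermes"  # placeholder repo id

# trust_remote_code is an assumption: StableLM 2 checkpoints may ship a custom tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision on a GPU
    device_map="auto",
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The exact prompt/chat template used during fine-tuning is not documented here, so the plain-text prompt above is only illustrative.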

## Evaluations

WIP