Update README.md
README.md CHANGED
@@ -19,26 +19,10 @@ datasets:
 > OLMoE-1B-7B is a Mixture-of-Experts LLM with 1B active and 7B total parameters released in September 2024 (0924). It yields state-of-the-art performance among models with a similar cost (1B) and is competitive with much larger models like Llama2-13B. OLMoE is 100% open-source.
 
 This information and more can also be found on the [**OLMoE GitHub repository**](https://github.com/allenai/OLMoE).
-
 - **Paper**: (Soon)
-
-- **Pretraining**
-
-- [Code](https://github.com/allenai/OLMo/tree/Muennighoff/MoE)
-- [Data](https://huggingface.co/datasets/allenai/OLMoE-mix-0924)
-- [Logs](https://wandb.ai/ai2-llm/olmoe/reports/OLMoE-1B-7B-0924--Vmlldzo4OTcyMjU3)
-
-- **SFT (Supervised Fine-Tuning)**
-- [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-SFT)
-- [Code](https://github.com/allenai/open-instruct/tree/olmoe-sft)
-- [Data](https://hf.co/datasets/allenai/tulu-v3.1-mix-preview-4096-OLMoE)
-- [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-sft-logs.txt)
-
-- **DPO/KTO (Direct Preference Optimization/Kahneman-Tversky Optimization)**
-- [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct)
-- [Preference Data](https://hf.co/datasets/allenai/ultrafeedback_binarized_cleaned)
-- [DPO code](https://github.com/allenai/open-instruct/tree/olmoe-sft), [KTO code](https://github.com/Muennighoff/kto/blob/master/kto.py)
-- [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt)
+- **Pretraining** [Checkpoints](https://hf.co/allenai/OLMoE-1B-7B-0924), [Code](https://github.com/allenai/OLMo/tree/Muennighoff/MoE), [Data](https://huggingface.co/datasets/allenai/OLMoE-mix-0924) and [Logs](https://wandb.ai/ai2-llm/olmoe/reports/OLMoE-1B-7B-0924--Vmlldzo4OTcyMjU3).
+- **SFT (Supervised Fine-Tuning)** [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-SFT), [Code](https://github.com/allenai/open-instruct/tree/olmoe-sft), [Data](https://hf.co/datasets/allenai/tulu-v3.1-mix-preview-4096-OLMoE) and [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-sft-logs.txt).
+- **DPO/KTO (Direct Preference Optimization/Kahneman-Tversky Optimization)** [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct), [Preference Data](https://hf.co/datasets/allenai/ultrafeedback_binarized_cleaned), [DPO code](https://github.com/allenai/open-instruct/tree/olmoe-sft), [KTO code](https://github.com/Muennighoff/kto/blob/master/kto.py) and [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt).
 
 # Use
 
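The `# Use` context line above opens the model card's usage section, which lies outside this hunk. For orientation only, a minimal sketch of loading the pretraining checkpoint with Hugging Face `transformers` follows; it assumes a `transformers` release with native OLMoE support (4.45 or later) plus `torch`, and is an illustrative example rather than the snippet from the model card itself.

```python
# Illustrative sketch (not part of the diff): load allenai/OLMoE-1B-7B-0924 and
# generate a short completion. Assumes transformers >= 4.45 (native OLMoE
# support) and enough memory for the 7B-total-parameter checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# bfloat16 halves memory relative to fp32; drop torch_dtype to use the default.
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMoE-1B-7B-0924", torch_dtype=torch.bfloat16
).to(device)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-0924")

inputs = tokenizer("Bitcoin is", return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```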