---
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
base_model:
- mistralai/Ministral-8B-Instruct-2410
library_name: transformers
tags:
- reasoning
- hybrid
- gemini-2.0
- deepseek-r1
- synthetic data
- unsloth
- trl
---

![Header](./DeepNeo-Banner.png)

# **DeepNeo: A hybrid model with precision and power**

## **Overview**

DeepNeo is a hybrid model that can be used like any other LLM, but it also has a reasoning mode, inspired by [NousResearch/DeepHermes-3-Llama-3-8B-Preview](https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-8B-Preview), in which the model produces a chain-of-thought (CoT) style response. The mode is toggled through the system prompt. Unlike [NousResearch/DeepHermes-3-Llama-3-8B-Preview](https://huggingface.co/NousResearch/DeepHermes-3-Llama-3-8B-Preview), DeepNeo is available in more than one size: we have introduced an 8B and a 12B model, both based on **Mistral AI's models**.

## **Model Details**

### **DeepNeo 8B key features**

- Developed by: [Spestly (Open-Neo)](https://x.com/Spestly) & [Kazex (Open-Neo)](https://x.com/32GIGABYTES_YT)
- Released under the **Mistral Research License**; reach out to **Mistral AI** for a commercial license
- Trained with a **128k context window** with **interleaved sliding-window attention**
- Trained on a large proportion of **multilingual and synthetic reasoning data**
- Supports **function calling**

| Feature               | Value                     |
|:---------------------:|:-------------------------:|
| **Architecture**      | Dense Transformer         |
| **Parameters**        | 8,019,808,256             |
| **Layers**            | 36                        |
| **Heads**             | 32                        |
| **Dim**               | 4096                      |
| **KV Heads (GQA)**    | 8                         |
| **Hidden Dim**        | 12288                     |
| **Head Dim**          | 128                       |
| **Vocab Size**        | 131,072                   |
| **Context Length**    | 128k                      |
| **Attention Pattern** | Ragged (128k,32k,32k,32k) |

## **Usage**

### **Intuitive mode**

This mode is active by default, so you do not need to change anything.
This means you can use any system prompt. An example is given below.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model from the same repository.
model_id = "open-neo/DeepNeo-1-8B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "What are the most interesting things to do in Paris?"}
]

# Build the prompt with the model's chat template and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
generated_ids = model.generate(input_ids, max_new_tokens=2500, temperature=0.8, do_sample=True)

# generate() returns prompt + completion, so slice off the prompt tokens.
new_tokens = generated_ids[0][input_ids.shape[-1]:]
print(f"Generated Tokens: {new_tokens.shape[-1]}")
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(f"Response: {response}")
```

### **Reasoning mode**

Activating this mode takes one extra step: a system prompt. Almost any system prompt should work as long as it mentions `` and ``. An example system prompt is given below. Please note that it may require tweaking for your specific use case.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model from the same repository.
model_id = "open-neo/DeepNeo-1-8B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [
    # The system prompt is what switches the model into reasoning mode.
    {"role": "system", "content": "You are a deep-thinking AI model. You must put your thoughts in the tags and your output in the tags."},
    {"role": "user", "content": "What are the most interesting things to do in Paris?"}
]

# Build the prompt with the model's chat template and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
generated_ids = model.generate(input_ids, max_new_tokens=2500, temperature=0.8, do_sample=True)

# generate() returns prompt + completion, so slice off the prompt tokens.
new_tokens = generated_ids[0][input_ids.shape[-1]:]
print(f"Generated Tokens: {new_tokens.shape[-1]}")
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(f"Response: {response}")
```

## **Citations**

```bibtex
@misc{deepneo-1,
  title={DeepNeo: A hybrid model with precision and power},
  author={Aayan Mishra and Krish Thumar},
  howpublished={https://huggingface.co/collections/open-neo/deepneo-1-67aea4c0f086ab0f70ed5720},
  year={2025}
}
```
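
## **Example: splitting reasoning from the answer**

In reasoning mode the system prompt asks the model to wrap its thoughts and its final output in separate tags, so downstream code often wants the two parts separately. The following is a minimal sketch of one way to do that with the standard library. The helper `split_reasoning`, the `ParsedResponse` type, and the tag names `think` and `output` in the usage lines are hypothetical and chosen purely for illustration; pass whichever tag names your own system prompt uses. It also assumes the tags appear as plain text in the decoded response.

```python
import re
from typing import NamedTuple, Optional


class ParsedResponse(NamedTuple):
    reasoning: Optional[str]  # text inside the reasoning tag, or None if absent
    answer: str               # text inside the answer tag, or the remaining text


def split_reasoning(text: str, think_tag: str, answer_tag: str) -> ParsedResponse:
    """Separate a tagged reasoning block from the final answer.

    Hypothetical helper: the tag names are supplied by the caller and must
    match whatever the system prompt asked the model to emit.
    """

    def extract(tag: str) -> Optional[str]:
        # Non-greedy match of <tag> ... </tag>, allowed to span newlines.
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, text, flags=re.DOTALL)
        return match.group(1).strip() if match else None

    reasoning = extract(think_tag)
    answer = extract(answer_tag)
    if answer is None:
        # No answer tag found: drop any reasoning block and keep the rest.
        pattern = rf"<{re.escape(think_tag)}>.*?</{re.escape(think_tag)}>"
        answer = re.sub(pattern, "", text, flags=re.DOTALL).strip()
    return ParsedResponse(reasoning, answer)


# Usage with placeholder tag names (an assumption, not the model's documented tags).
sample = "<think>Paris has many museums.</think>\n<output>Visit the Louvre.</output>"
parsed = split_reasoning(sample, think_tag="think", answer_tag="output")
print(parsed.reasoning)
print(parsed.answer)
```

If the model omits the tags, as it would in intuitive mode, the helper returns `None` for the reasoning and the whole text as the answer, so the same call can sit behind both modes.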