---
tags:
- merge
- mergekit
- lazymergekit
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2
- mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1
- Triangle104/DSR1-Distill-Qwen-7B-RP
pipeline_tag: text-generation
library_name: transformers
---

# ZeroXClem/Qwen2.5-7B-DistilPrism

**Qwen2.5-7B-DistilPrism** is a **distillation- and reasoning-focused model merge** that combines several DeepSeek-R1 distillations into a single **refined, high-performance language model**. Using the **Model Stock** merge method, the fusion captures the best attributes of **DeepSeek-R1-Distill-Qwen-7B** and its improved derivatives.

## 🚀 Merged Models

This model is a weighted merge of the following:

- [**huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-7B-abliterated-v2): An uncensored distillation of DeepSeek-R1, tuned to remove refusals and improve usability.
- [**mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1**](https://huggingface.co/mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-7B-v1.1): A refined re-distillation that improves accuracy and robustness across various benchmarks.
- [**Triangle104/DSR1-Distill-Qwen-7B-RP**](https://huggingface.co/Triangle104/DSR1-Distill-Qwen-7B-RP): A composite merge of several distilled DeepSeek variants, serving as an essential ingredient for performance tuning.
- [**deepseek-ai/DeepSeek-R1-Distill-Qwen-7B**](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B): The foundation of this merge, the distilled form of DeepSeek-R1 optimized for efficiency and strong reasoning capabilities.

## 🧩 Merge Configuration

The following **YAML configuration** defines how these models were combined using **Model Stock**, ensuring **balanced contributions** from each source:

```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-DistilPrism using Model Stock
name: ZeroXClem-Qwen2.5-7B-DistilPrism
merge_method: model_stock
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
models:
  # ... (entries for the three derivative models elided)
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
    parameters:
      weight: 0.25
```

### 🔑 Key Parameters

- **Normalization & Rescaling**: Ensures weight distributions remain balanced across all components.
- **Model Stock Merge Method**: Optimizes the contribution from each model so the merge retains the best attributes of each source.
- **Weighted Blending**: The **abliterated** and **re-distilled** models contribute the most, refining both alignment and general usability. A conceptual sketch of this kind of blending is shown below.
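
For intuition, here is a minimal, illustrative sketch of weighted parameter blending in plain PyTorch. This is not mergekit's actual implementation, and Model Stock in particular derives its interpolation ratios from the geometry between fine-tuned and base weights rather than fixed weights; the helper name `blend_state_dicts` is ours, for illustration only. The end product of a merge is, however, a single blended state dict like the one produced here.

```python
# Illustrative only: naive weighted averaging of model parameters.
# Model Stock itself computes interpolation ratios from the angles
# between fine-tuned and base weights instead of fixed weights.
import torch

def blend_state_dicts(state_dicts, weights):
    """Return a state dict whose tensors are the weighted average
    of the matching tensors from each input state dict."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    merged = {}
    for name, tensor in state_dicts[0].items():
        merged[name] = sum(
            w * sd[name].float() for w, sd in zip(weights, state_dicts)
        ).to(tensor.dtype)
    return merged
```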
---

## 🗣️ Inference

You can use the model for text generation as follows:

### Ollama

**[Quickstart to Ollama guide here](https://aidev.zeroxclem.com/blog/08-setting-up-ollama).** Ollama is recommended as a daily driver, since it supports thinking tags.

```bash
ollama run hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism

# For quantized builds, copy the quant's URL and replace 'huggingface.co/' with 'hf.co/'.
```
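
If you prefer to call Ollama from code, the official `ollama` Python package (installed separately with `pip install ollama`) exposes a small chat API. A minimal sketch, assuming the model has already been pulled as above and the Ollama server is running locally:

```python
# Minimal sketch using the ollama Python client (pip install ollama).
# Assumes the model was pulled via `ollama run` and the local server is up.
import ollama

response = ollama.chat(
    model="hf.co/ZeroXClem/Qwen2.5-7B-DistilPrism",
    messages=[
        {"role": "user", "content": "Briefly explain chain-of-thought prompting."}
    ],
)

# The reply may include the model's reasoning inside <think>...</think> tags.
print(response["message"]["content"])
```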

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-DistilPrism"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model in bfloat16, spread across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize the pipeline (dtype and device placement are inherited
# from the already-loaded model)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# Print the generated text
print(outputs[0]["generated_text"])
```
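
Because the parent models follow the DeepSeek-R1 chat format, chat-style inference through the tokenizer's built-in template will generally behave better than raw prompting, with the model's intermediate reasoning appearing inside `<think>...</think>` tags. A minimal sketch, reusing `model` and `tokenizer` from above and assuming the merge inherited that chat template:

```python
# Chat-style inference via the tokenizer's chat template (assumes the
# merge inherited the DeepSeek-R1 template; reasoning appears in <think> tags).
messages = [
    {"role": "user", "content": "What is 17 * 24? Reason step by step."}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```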

---

## 🎯 Use Case & Applications

**Qwen2.5-7B-DistilPrism** is designed for **efficient, high-quality text generation** with strong reasoning capabilities. It is well-suited for:

- **Advanced Reasoning & Problem Solving**: Excels in logic-heavy tasks and multi-step reasoning problems.
- **Conversational AI**: Optimized for **fluid, responsive dialogue**, reducing refusals and improving engagement.
- **Mathematical & Scientific Computation**: Enhanced **math & code generation abilities** compared to standard distillations.
- **Content Creation & Summarization**: Generates coherent and **contextually rich** text suitable for various applications.

---

## 📜 License

This model is released under the **MIT License**.

---

## 📊 Benchmark Results (Coming Soon)

We are currently **quantizing and benchmarking** this model. Stay tuned for performance updates across:

- **IFEval (0-Shot)**
- **BBH (3-Shot)**
- **MATH (4-Shot)**
- **GPQA (0-Shot)**
- **MuSR (0-Shot)**
- **MMLU-PRO (5-Shot)**

---

## 💡 Tags

- `merge`
- `mergekit`
- `model_stock`
- `DeepSeek-R1`
- `Distillation`
- `abliterated`
- `re-distilled`
- `DeepSeek-R1-Distill-Qwen-7B`

---

## 🙏 Special Thanks

This project wouldn't be possible without the incredible contributions from:

- **[@huihui-ai](https://huggingface.co/huihui-ai)** – For developing **DeepSeek-R1-Distill-Qwen-7B-abliterated-v2**, a bold step towards improving model alignment.
- **[@mobiuslabsgmbh](https://huggingface.co/mobiuslabsgmbh)** – For refining distillation techniques with **DeepSeek-R1-ReDistill-Qwen-7B-v1.1**.
- **[@Triangle104](https://huggingface.co/Triangle104)** – For crafting innovative merges like **DSR1-Distill-Qwen-7B-RP**, an essential component in this blend.
- **[@deepseek-ai](https://huggingface.co/deepseek-ai)** – For open-sourcing **DeepSeek-R1-Distill-Qwen-7B**, a foundation for reasoning advancements.

And a heartfelt **thank you** to everyone in the **🤗 & open-source AI community** for their continued research, testing, and support. 💜🚀

---
## 🔗 Additional Resources

- [Hugging Face Model Card](https://huggingface.co/ZeroXClem/Qwen2.5-7B-DistilPrism)
- [MergeKit Repository](https://github.com/ZeroXClem/mergekit)
- [DeepSeek AI Homepage](https://huggingface.co/deepseek-ai)
- [Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)