---
base_model:
  - unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
license: apache-2.0
datasets:
  - Omartificial-Intelligence-Space/Arabic_Reasoning_Dataset
language:
  - ar
pipeline_tag: text-generation
library_name: peft
tags:
  - unsloth
  - arabic
  - deepseek-R1
  - Peft
---

DeepSeek-R1-Distill-Llama-8B (Arabic Reasoning Edition)

Overview

DeepSeek-R1-Distill-Llama-8B (Arabic Reasoning Edition) is a fine-tuned version of the base model unsloth/DeepSeek-R1-Distill-Llama-8B that has been further optimized for Arabic text generation, with a special focus on mathematical reasoning tasks in Arabic. This model leverages state-of-the-art transformer architectures and Parameter-Efficient Fine-Tuning (PEFT) techniques to provide accurate, context-aware responses in Arabic.
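
Because the model is published as a PEFT adapter (library_name: peft in the metadata above), you can inspect the adapter configuration before downloading any model weights. A minimal sketch, assuming the repository Omartificial-Intelligence-Space/Arabic-DeepSeek-R1-Distill-8B (the ID used in the usage example below) hosts the adapter files:

from peft import PeftConfig

# Fetch only the adapter configuration (a small JSON file, no model weights).
config = PeftConfig.from_pretrained(
    "Omartificial-Intelligence-Space/Arabic-DeepSeek-R1-Distill-8B"
)
print(config.peft_type)                # adapter type, e.g. LORA
print(config.base_model_name_or_path)  # base checkpoint the adapter was trained on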

Key Features

  • Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
  • Fine-Tuning Dataset: Omartificial-Intelligence-Space/Arabic_Reasoning_Dataset (a loading sketch follows this list)
  • Target Language: Arabic (ar)
  • Pipeline: Text Generation
  • Optimizations:
    • Fine-tuning using PEFT for efficient adaptation.
    • Optimized for generating responses in Arabic, including complex math reasoning tasks.
  • License: Apache-2.0
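
The fine-tuning data is hosted on the Hugging Face Hub and can be pulled with the datasets library. A minimal sketch; the "train" split and the record layout are assumptions and should be checked against the dataset card:

from datasets import load_dataset

# Download the Arabic reasoning dataset used for fine-tuning.
dataset = load_dataset("Omartificial-Intelligence-Space/Arabic_Reasoning_Dataset")

print(dataset)              # available splits and their sizes
print(dataset["train"][0])  # inspect one record (assumes a "train" split exists)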

Intended Use

This model is intended for:

  • Arabic Text Generation: Generating coherent and contextually relevant Arabic text.
  • Mathematical Reasoning: Solving and explaining mathematical problems in Arabic.
  • Educational Tools: Assisting in learning and tutoring applications that require Arabic language support and reasoning capabilities.

How to Use

Below is an example snippet that loads the model with the Unsloth library (built on top of Transformers) and generates a response:

import torch
from unsloth import FastLanguageModel

def load_model(model_name="Omartificial-Intelligence-Space/Arabic-DeepSeek-R1-Distill-8B"):
    """Loads the fine-tuned model and tokenizer."""
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=2048,
        dtype=None,
        load_in_4bit=True,
    )
    FastLanguageModel.for_inference(model)
    return model, tokenizer

def generate_response(model, tokenizer, instruction, max_new_tokens=256):
    """Generates a response for a given instruction using the model."""
    chat_template = """Below are some instructions that describe some tasks. Write responses in Arabic language only that appropriately complete each request.

### Instruction:
{INPUT}

### Response:
{OUTPUT}
"""

    prompt = chat_template.replace("{INPUT}", instruction).replace("{OUTPUT}", "")
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda" if torch.cuda.is_available() else "cpu")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Load the fine-tuned model
    model, tokenizer = load_model()
    
    # Example prompt
    instruction = "إذا كان لديك 200 ريال، واشتريت شيئًا بـ 75 ريالًا، كم تبقى لديك؟"
    response = generate_response(model, tokenizer, instruction)
    
    print("Generated Response:\n", response)

Evaluation and Comparison

This model has been evaluated on the Arabic Reasoning Dataset from Omartificial-Intelligence-Space. In benchmark comparisons against other Arabic generation models, DeepSeek-R1-Distill-Llama-8B (Arabic Reasoning Edition) demonstrates robust performance in tasks requiring both natural language understanding and logical reasoning.


General Conclusions

Fine-tuned Responses:

  • They use proper and well-organized Arabic language.
  • They provide step-by-step explanations with a clear presentation of the given data and logical calculations.
  • They deliver clear and direct answers that match the correct answer in most cases.

Baseline Responses:

  • They often mix Arabic and English.
  • They include unnecessary text, such as exposed internal reasoning or stray, incomplete symbols.
  • Some examples contain unwanted repetition or overly verbose explanations, which can confuse the user.
  • In one example (calculating the time to cover 12 km at a speed of 4 km/h), both responses were inaccurate; the correct approach is time = distance ÷ speed = 12 km ÷ 4 km/h = 3 hours.

Limitations

  • Domain-Specific: While optimized for Arabic reasoning, the model might not generalize as well to tasks outside of its fine-tuned domain.
  • 4-bit Quantization: Although efficient, 4-bit quantization may slightly degrade the quality of generated text compared to full-precision models; if GPU memory allows, the model can be loaded in higher precision (see the sketch below).
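
A minimal sketch of loading without 4-bit quantization, assuming enough GPU memory is available (an 8B model needs roughly 16 GB of VRAM in 16-bit); the only change from the earlier example is load_in_4bit=False:

from unsloth import FastLanguageModel

# Load in 16-bit precision instead of 4-bit for potentially higher output quality.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Omartificial-Intelligence-Space/Arabic-DeepSeek-R1-Distill-8B",
    max_seq_length=2048,
    dtype=None,          # let Unsloth choose float16/bfloat16 for the GPU
    load_in_4bit=False,  # disable 4-bit quantization
)
FastLanguageModel.for_inference(model)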

Citation

If you use this model in your research or applications, please cite the original base model and fine-tuning methodologies appropriately.

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}, 
      author={DeepSeek-AI and Daya Guo and Dejian Yang and Haowei Zhang and Junxiao Song and Ruoyu Zhang and Runxin Xu and Qihao Zhu and Shirong Ma and Peiyi Wang and Xiao Bi and Xiaokang Zhang and Xingkai Yu and Yu Wu and Z. F. Wu and Zhibin Gou and Zhihong Shao and Zhuoshu Li and Ziyi Gao and Aixin Liu and Bing Xue and Bingxuan Wang and Bochao Wu and Bei Feng and Chengda Lu and Chenggang Zhao and Chengqi Deng and Chenyu Zhang and Chong Ruan and Damai Dai and Deli Chen and Dongjie Ji and Erhang Li and Fangyun Lin and Fucong Dai and Fuli Luo and Guangbo Hao and Guanting Chen and Guowei Li and H. Zhang and Han Bao and Hanwei Xu and Haocheng Wang and Honghui Ding and Huajian Xin and Huazuo Gao and Hui Qu and Hui Li and Jianzhong Guo and Jiashi Li and Jiawei Wang and Jingchang Chen and Jingyang Yuan and Junjie Qiu and Junlong Li and J. L. Cai and Jiaqi Ni and Jian Liang and Jin Chen and Kai Dong and Kai Hu and Kaige Gao and Kang Guan and Kexin Huang and Kuai Yu and Lean Wang and Lecong Zhang and Liang Zhao and Litong Wang and Liyue Zhang and Lei Xu and Leyi Xia and Mingchuan Zhang and Minghua Zhang and Minghui Tang and Meng Li and Miaojun Wang and Mingming Li and Ning Tian and Panpan Huang and Peng Zhang and Qiancheng Wang and Qinyu Chen and Qiushi Du and Ruiqi Ge and Ruisong Zhang and Ruizhe Pan and Runji Wang and R. J. Chen and R. L. Jin and Ruyi Chen and Shanghao Lu and Shangyan Zhou and Shanhuang Chen and Shengfeng Ye and Shiyu Wang and Shuiping Yu and Shunfeng Zhou and Shuting Pan and S. S. Li and Shuang Zhou and Shaoqing Wu and Shengfeng Ye and Tao Yun and Tian Pei and Tianyu Sun and T. Wang and Wangding Zeng and Wanjia Zhao and Wen Liu and Wenfeng Liang and Wenjun Gao and Wenqin Yu and Wentao Zhang and W. L. Xiao and Wei An and Xiaodong Liu and Xiaohan Wang and Xiaokang Chen and Xiaotao Nie and Xin Cheng and Xin Liu and Xin Xie and Xingchao Liu and Xinyu Yang and Xinyuan Li and Xuecheng Su and Xuheng Lin and X. Q. Li and Xiangyue Jin and Xiaojin Shen and Xiaosha Chen and Xiaowen Sun and Xiaoxiang Wang and Xinnan Song and Xinyi Zhou and Xianzu Wang and Xinxia Shan and Y. K. Li and Y. Q. Wang and Y. X. Wei and Yang Zhang and Yanhong Xu and Yao Li and Yao Zhao and Yaofeng Sun and Yaohui Wang and Yi Yu and Yichao Zhang and Yifan Shi and Yiliang Xiong and Ying He and Yishi Piao and Yisong Wang and Yixuan Tan and Yiyang Ma and Yiyuan Liu and Yongqiang Guo and Yuan Ou and Yuduan Wang and Yue Gong and Yuheng Zou and Yujia He and Yunfan Xiong and Yuxiang Luo and Yuxiang You and Yuxuan Liu and Yuyang Zhou and Y. X. Zhu and Yanhong Xu and Yanping Huang and Yaohui Li and Yi Zheng and Yuchen Zhu and Yunxian Ma and Ying Tang and Yukun Zha and Yuting Yan and Z. Z. Ren and Zehui Ren and Zhangli Sha and Zhe Fu and Zhean Xu and Zhenda Xie and Zhengyan Zhang and Zhewen Hao and Zhicheng Ma and Zhigang Yan and Zhiyu Wu and Zihui Gu and Zijia Zhu and Zijun Liu and Zilin Li and Ziwei Xie and Ziyang Song and Zizheng Pan and Zhen Huang and Zhipeng Xu and Zhongyu Zhang and Zhen Zhang},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12948}, 
}