Model Introduction

This early experimental model uses a unique, advanced form of supervised fine-tuning. The training program loads the model, then loads the data from the dataset and supplies it at inference time. The LLM is then trained during inference: after each pass it checks whether the answer or goal has been reached, and if not, it keeps training until the answer or solution is reached.
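The actual training program is not published here; the following is a minimal sketch of one way such a train-until-correct loop could look, assuming a simple substring check for the goal and a plain supervised update step. The base model ID is taken from the card below; the placeholder data, retry cap, and helper logic are illustrative assumptions, not the released code.

# Hypothetical sketch of the train-until-correct loop described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "EpistemeAI/ReasoningCore-Llama-3.2-3B-r1-V1.1"  # base model listed below
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16).to("cuda")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

dataset = [("What is 2 + 2?", "4")]  # placeholder problem/solution pairs

for problem, solution in dataset:
    for _ in range(8):  # cap retries so the sketch always terminates
        # Run inference on the current problem.
        prompt = f"<problem>{problem}</problem>"
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=128)
        text = tokenizer.decode(out[0], skip_special_tokens=True)

        # Stop once the generated text reaches the reference answer.
        if solution in text:
            break

        # Otherwise take one supervised step on the reference solution and retry.
        target = tokenizer(prompt + f"<solution>{solution}</solution>",
                           return_tensors="pt").to(model.device)
        loss = model(**target, labels=target["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()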

Context Window: 128k

Installation

Update to the latest version of transformers:

pip install -U transformers

Suggested system prompt for math:

system_prompt="<problem>...</problem><solution>...</solution>"

Inference

import torch
from transformers import pipeline
model_id = "EpistemeAI/OpenReasoner-Llama-3.2-3B-rs1.0"
pipe = pipeline(
    "text-generation", 
    model=model_id, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)
print(pipe("What is larger 9.9 or 9.11?"))
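To apply the math system prompt suggested above, the same pipeline can also take chat-style messages; the user turn below is only an illustration, and the system prompt string is kept exactly as given earlier.

# Using the suggested math system prompt with the pipeline (illustrative usage).
messages = [
    {"role": "system", "content": "<problem>...</problem><solution>...</solution>"},
    {"role": "user", "content": "What is larger 9.9 or 9.11?"},
]
print(pipe(messages, max_new_tokens=256))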

Reference

Thanks to Hugging Face H4 for the MATH-500 dataset.

We use this dataset as an evaluator only; the model was not trained on it directly, it was used as a test set.

Uploaded model

  • Developed by: EpistemeAI
  • License: apache-2.0
  • Finetuned from model: EpistemeAI/ReasoningCore-Llama-3.2-3B-r1-V1.1

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
