Overview
This model is a fine-tuned version of Qwen/Qwen2-7B-Instruct on the LogicNet-Subnet/Aristole dataset. It achieves the following results on the evaluation set:
- Reliability: 98.53%
- Correctness: 0.9739
Key Details:
- Developed by: LogicNet Team
- License: Apache 2.0
- Base Model: unsloth/qwen2-7b-instruct-bnb-4bit
This fine-tuned Qwen2 model was trained 2x faster using Unsloth and Hugging Face's TRL library.
Model and Training Hyperparameters
Model Configuration:
- dtype: torch.bfloat16
- load_in_4bit: True

Prompt Configuration:
- max_seq_length: 2048

PEFT Model Parameters:
- r: 16
- lora_alpha: 16
- lora_dropout: 0
- bias: "none"
- use_gradient_checkpointing: "unsloth"
- random_state: 3407
- use_rslora: False
- loftq_config: None
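For reference, these settings map onto Unsloth's loading and LoRA APIs roughly as follows. This is a minimal sketch, not the published training script; in particular, `target_modules` is not listed on this card, so the standard Qwen2 attention/MLP projection set is assumed:

```python
import torch
from unsloth import FastLanguageModel

# Load the 4-bit base model with the configuration listed above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2-7b-instruct-bnb-4bit",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)

# Attach LoRA adapters with the PEFT parameters listed above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
    # Not listed on this card; the usual Qwen2 projections are assumed here.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```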
Training Arguments:
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 4
- warmup_steps: 5
- max_steps: 70
- learning_rate: 2e-4
- fp16: not is_bfloat16_supported()
- bf16: is_bfloat16_supported()
- logging_steps: 1
- optim: "adamw_8bit"
- weight_decay: 0.01
- lr_scheduler_type: "linear"
- seed: 3407
- output_dir: "outputs"
Training Results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.4764 | 1.0 | 1150 | 1.1850 |
| 1.3102 | 2.0 | 2050 | 1.1091 |
| 1.1571 | 3.0 | 3100 | 1.0813 |
| 1.0922 | 4.0 | 3970 | 0.9906 |
| 0.9809 | 5.0 | 5010 | 0.9021 |
How To Use
You can use the model for inference with the Transformers library as shown below:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model, placing the weights on the GPU
tokenizer = AutoTokenizer.from_pretrained("LogicNet-Subnet/LogicNet-7B")
model = AutoModelForCausalLM.from_pretrained(
    "LogicNet-Subnet/LogicNet-7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare the input and move it to the same device as the model
inputs = tokenizer(
    [
        "Which is the smallest odd number greater than zero?"  # Example prompt
    ],
    return_tensors="pt",
).to(model.device)

# Generate an output
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode and print the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
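Because the base model is instruction-tuned, wrapping the prompt in the tokenizer's chat template usually yields better-formatted answers. A short sketch, assuming the tokenizer ships Qwen2's chat template (as the upstream Qwen2-Instruct tokenizers do):

```python
# Build a chat-formatted prompt and generate from it
messages = [
    {"role": "user", "content": "Which is the smallest odd number greater than zero?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```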