---
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- gammacorpus
- zurich
- chat
- conversational
license: apache-2.0
language:
- en
datasets:
- rubenroy/GammaCorpus-v2-10k
pipeline_tag: text-generation
library_name: transformers
---

![Zurich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-7B-10k.png)

# Zurich 7B GammaCorpus v2-10k 
*A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*

## Overview
Zurich 7B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 7B Instruct** model. Zurich is designed to outperform other models of similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k).

## Model Details
- **Base Model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
- **Type:** Causal Language Model
- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Number of Parameters:** 7.61B
- **Number of Parameters (Non-Embedding):** 6.53B
- **Number of Layers:** 28
- **Number of Attention Heads (GQA):** 28 for Q and 4 for KV
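
The head counts above imply a fixed sharing ratio under grouped-query attention. The following is a minimal sketch of that arithmetic using only the numbers from the list; the helper name `kv_head_for_query_head` is hypothetical, and the contiguous grouping it uses (consecutive query heads share one KV head) is an assumption about the usual convention, not something this model card states.

```python
# Values copied from the Model Details list above.
NUM_QUERY_HEADS = 28
NUM_KV_HEADS = 4
TOTAL_PARAMS_B = 7.61
NON_EMBEDDING_PARAMS_B = 6.53

# With GQA, several query heads share a single key/value head.
group_size = NUM_QUERY_HEADS // NUM_KV_HEADS


def kv_head_for_query_head(query_head: int) -> int:
    """Map a query head index to its KV head (assumes contiguous groups)."""
    if not 0 <= query_head < NUM_QUERY_HEADS:
        raise ValueError("query head index out of range")
    return query_head // group_size


# Parameters attributable to embeddings, in billions.
embedding_params_b = round(TOTAL_PARAMS_B - NON_EMBEDDING_PARAMS_B, 2)

print(f"{group_size} query heads share each KV head")
print(f"~{embedding_params_b}B parameters are in the embeddings")
```

Sharing KV heads this way shrinks the key/value cache by the same ratio compared with giving every query head its own KV head.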

## Training Details

Zurich-7B-GCv2-10k was fine-tuned on a single T4 GPU for ~20 minutes using the [Unsloth](https://unsloth.ai/) framework. It was trained for **60 Epochs**.

## Usage

### Requirements

We **strongly** recommend you use the latest version of the `transformers` package. You may install it via `pip` as follows:

```
pip install transformers
```
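
If you want to confirm that your environment is recent enough before loading the model, a small check like the one below may help. It is only a sketch: the `4.37.0` floor is an assumption taken from the upstream Qwen2.5 documentation (older releases do not recognise the `qwen2` architecture), not a requirement stated by this model card, and the helper names are hypothetical.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption: minimum release with Qwen2 support, per upstream Qwen2.5 docs.
MIN_VERSION = (4, 37, 0)


def parse_version(text: str) -> tuple:
    """Turn '4.45.2' or '4.37.0rc1' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def transformers_is_recent_enough(installed=None) -> bool:
    """Return True if the given (or installed) version meets MIN_VERSION."""
    if installed is None:
        try:
            installed = version("transformers")
        except PackageNotFoundError:
            return False  # package is not installed at all
    return parse_version(installed) >= MIN_VERSION


print("transformers OK:", transformers_is_recent_enough())
```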

### Quickstart

Here is a code snippet that uses `apply_chat_template` to show how to load the tokenizer and model and generate content:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-7B-GCv2-10k"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 7B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
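
Two steps in the snippet above are easy to miss: what `apply_chat_template` turns the message list into, and why the generated ids are sliced before decoding. The sketch below illustrates both with plain Python and toy data, so it runs without downloading the model. It assumes the ChatML-style layout used by Qwen 2.5 chat models; the real template shipped with the tokenizer is authoritative and handles more cases (for example tools and a default system prompt). The helper names `render_chatml` and `strip_prompt_tokens` are hypothetical.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the ChatML text that the chat template produces."""
    text = ""
    for message in messages:
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Opens the assistant turn so the model continues from here.
        text += "<|im_start|>assistant\n"
    return text


def strip_prompt_tokens(input_ids, output_ids):
    """Drop the echoed prompt: generate() returns prompt + new tokens."""
    return output_ids[len(input_ids):]


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How tall is the Eiffel tower?"},
]
print(render_chatml(messages))

# Toy token ids: the first three are the prompt, the rest are new tokens.
print(strip_prompt_tokens([11, 12, 13], [11, 12, 13, 21, 22]))
```

Without the slicing step, the decoded response would begin with the full prompt text rather than only the model's answer.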

## About GammaCorpus

This model, like all Zurich models, is trained with GammaCorpus. GammaCorpus is a dataset on Hugging Face that consists of structured and filtered multi-turn conversations.
GammaCorpus has 4 versions, each available in different sizes. The versions and sizes are as follows:

### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED

Here is a link to the GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60

### GammaCorpus v2
- **10k  <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
- 50k
- 100k
- 500k
- 1m
- 5m

Here is a link to the GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df

### GammaCorpus CoT
- Math 170k

Here is a link to the GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f

### GammaCorpus QA
- Fact 450k

Here is a link to the GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7

### The link to the full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).

## Known Limitations

- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model might still generate some biased answers.

## Additional Information

### Licensing Information

The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.