mav23 committed on
Commit
d91fc85
·
verified ·
1 Parent(s): c078cda

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
 
 
36
+ gemma-2-27b-it-function-calling-gguf.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,403 @@
1
+ ---
2
+ base_model: google/gemma-2-27b-it
3
+ datasets:
4
+ - DiTy/function-calling
5
+ language:
6
+ - en
7
+ library_name: transformers
8
+ license: apache-2.0
9
+ pipeline_tag: text-generation
10
+ tags:
11
+ - conversational
12
+ - gemma2
13
+ - function-calling
14
+ - trl
15
+ ---
16
+ # DiTy/gemma-2-27b-it-function-calling-GGUF
17
+
18
+ This model is a fine-tuned version of [google/gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it) for the **Function Calling** task on non-synthetic data,
19
+ fully annotated by humans only, on the English version of the <ins>*DiTy/function-calling*</ins> dataset.
20
+ <!-- Provide a quick summary of what the model is/does. -->
21
+
22
+ In addition to **safetensors**, the model is available in **GGUF** format, in which case you only need to download a single file (*[how to run inference with a GGUF model](https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#high-level-api)*):
23
+
24
+ | Filename | Quant type | File Size | Description |
25
+ | -------- | ---------- | --------- | ----------- |
26
+ | [gemma-2-27B-it-function-calling-Q8_0.gguf](https://huggingface.co/DiTy/gemma-2-27b-it-function-calling-GGUF/blob/main/gemma-2-27B-it-function-calling-Q8_0.gguf) | Q8_0 | 28.9GB | Extremely high quality, generally unneeded but max available quant. |
27
+ | [gemma-2-27B-it-function-calling-Q6_K.gguf](https://huggingface.co/DiTy/gemma-2-27b-it-function-calling-GGUF/blob/main/gemma-2-27B-it-function-calling-Q6_K.gguf) | Q6_K | 22.3GB | Very high quality, near perfect, *recommended*. |
28
+ | [gemma-2-27B-it-function-calling-Q5_K_M.gguf](https://huggingface.co/DiTy/gemma-2-27b-it-function-calling-GGUF/blob/main/gemma-2-27B-it-function-calling-Q5_K_M.gguf) | Q5_K_M | 19.4GB | High quality, very usable. |
29
+ | [gemma-2-27B-it-function-calling-Q5_K_S.gguf](https://huggingface.co/DiTy/gemma-2-27b-it-function-calling-GGUF/blob/main/gemma-2-27B-it-function-calling-Q5_K_S.gguf) | Q5_K_S | 18.9GB | High quality, very usable. |
30
+
31
+
32
+ ## Model card tree
33
+
34
+ * [How to prepare your functions (tools) for *Function Calling*](#prepare_func_call)
35
+ * [Just use chat template for generation](#just_chat_template)
36
+ * [Prompt structure and expected content](#roles)
37
+ * [Evaluation of function calling models](#eval)
38
+
39
+ ## Usage (HuggingFace Transformers)
40
+
41
+ Below we share some code snippets on how to get quickly started with running the model. First, install the Transformers library with:
42
+ ```bash
43
+ pip install -U transformers
44
+ ```
45
+
46
+ ### <a name="prepare_func_call"></a>Prepare your functions for *Function Calling*
47
+
48
+ You should write the functions (tools) used by the model in *Python code* and make sure to add *Python docstrings* as in the example below:
49
+ ```python
50
+ def get_weather(city: str):
51
+ """
52
+ A function that returns the weather in a given city.
53
+
54
+ Args:
55
+ city: The city to get the weather for.
56
+ """
57
+ import random
58
+
59
+ return "sunny" if random.random() > 0.5 else "rainy"
60
+ def get_sunrise_sunset_times(city: str):
61
+ """
62
+ A function that returns the time of sunrise and sunset at the present moment, for a given city, in the form of a list: [sunrise_time, sunset_time].
63
+
64
+ Args:
65
+ city: The city to get the sunrise and sunset times for.
66
+ """
67
+
68
+ return ["6:00 AM", "6:00 PM"]
69
+ ```
70
+
71
+ ### <a name="just_chat_template"></a>Just use chat template
72
+
73
+ Next, you need to download the model and tokenizer:
74
+ ```python
75
+ import torch
76
+ from transformers import AutoTokenizer, AutoModelForCausalLM
77
+ model = AutoModelForCausalLM.from_pretrained(
78
+ "DiTy/gemma-2-27b-it-function-calling-GGUF",
79
+ device_map="auto",
80
+ torch_dtype=torch.bfloat16, # use float16 or float32 if bfloat16 is not available to you.
81
+ cache_dir=PATH_TO_MODEL_DIR, # optional
82
+ )
83
+ tokenizer = AutoTokenizer.from_pretrained(
84
+ "DiTy/gemma-2-27b-it-function-calling-GGUF",
85
+ cache_dir=PATH_TO_MODEL_DIR, # optional
86
+ )
87
+ ```
88
+
89
+ To get the result of generation, just use `apply_chat_template`. In order to take into account our written functions (tools),
90
+ we need to pass them as a list through the `tools` argument and also set `add_generation_prompt=True`.
91
+ ```python
92
+ history_messages = [
93
+ {"role": "system", "content": "You are a helpful assistant with access to the following functions. Use them if required - "},
94
+ {"role": "user", "content": "Hi, can you tell me the time of sunrise in Los Angeles?"},
95
+ ]
96
+ inputs = tokenizer.apply_chat_template(
97
+ history_messages,
98
+ tokenize=False,
99
+ add_generation_prompt=True, # adding prompt for generation
100
+ tools=[get_weather, get_sunrise_sunset_times], # our functions (tools)
101
+ )
102
+ print(inputs)
103
+ ```
104
+
105
+ Then our `inputs` will look like this:
106
+ ```
107
+ <bos><start_of_turn>user
108
+ You are a helpful assistant with access to the following functions. Use them if required - {
109
+ "name": "get_weather",
110
+ "description": "A function that returns the weather in a given city.",
111
+ "parameters": {
112
+ "type": "object",
113
+ "properties": {
114
+ "city": {
115
+ "type": "string",
116
+ "description": "The city to get the weather for."
117
+ }
118
+ },
119
+ "required": [
120
+ "city"
121
+ ]
122
+ }
123
+ },
124
+ {
125
+ "name": "get_sunrise_sunset_times",
126
+ "description": "A function that returns the time of sunrise and sunset at the present moment, for a given city, in the form of a list: [sunrise_time, sunset_time].",
127
+ "parameters": {
128
+ "type": "object",
129
+ "properties": {
130
+ "city": {
131
+ "type": "string",
132
+ "description": "The city to get the sunrise and sunset times for."
133
+ }
134
+ },
135
+ "required": [
136
+ "city"
137
+ ]
138
+ }
139
+ }
140
+ Hi, can you tell me the time of sunrise in Los Angeles?<end_of_turn>
141
+ <start_of_turn>model
142
+ ```
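
The JSON descriptions in the prompt are built from each function's signature and docstring, which is why the type hints and the `Args:` section matter. The following is a minimal standard-library sketch of that idea, not the actual Transformers implementation. `sketch_tool_schema` and `_PY_TO_JSON` are hypothetical names introduced here, and the sketch assumes Google-style docstrings with one line per argument.

```python
import inspect
import re

# Hypothetical mapping from Python annotations to JSON schema type names.
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_tool_schema(func):
    """Build a simplified tool description from a function's signature and docstring."""
    doc = inspect.getdoc(func) or ""
    # Everything before "Args:" is the description; the rest documents the arguments.
    summary, _, args_block = doc.partition("Args:")
    arg_docs = dict(re.findall(r"^\s*(\w+):\s*(.+)$", args_block, flags=re.MULTILINE))

    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            # Unknown or missing annotations fall back to "string" in this sketch.
            "type": _PY_TO_JSON.get(param.annotation, "string"),
            "description": arg_docs.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "name": func.__name__,
        "description": summary.strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str):
    """
    A function that returns the weather in a given city.

    Args:
        city: The city to get the weather for.
    """
    return "sunny"  # stub value for illustration


print(sketch_tool_schema(get_weather))
```

If an argument has no type hint or no docstring entry, the description the model sees becomes less informative, so it is worth documenting every parameter.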
143
+
144
+ Now we can generate the model's response.
145
+ Note that `apply_chat_template` has already inserted the special tokens, so they must not be added again during tokenization. Use `add_special_tokens=False`:
146
+ ```python
147
+ terminator_ids = [
148
+ tokenizer.eos_token_id,
149
+ tokenizer.convert_tokens_to_ids("<end_of_turn>"),
150
+ ]
151
+ prompt_ids = tokenizer.encode(inputs, add_special_tokens=False, return_tensors='pt').to(model.device)
152
+ generated_ids = model.generate(
153
+ prompt_ids,
154
+ max_new_tokens=512,
155
+ eos_token_id=terminator_ids,
156
+ bos_token_id=tokenizer.bos_token_id,
157
+ )
158
+ generated_response = tokenizer.decode(generated_ids[0][prompt_ids.shape[-1]:], skip_special_tokens=False) # `skip_special_tokens=False` for debug
159
+ print(generated_response)
160
+ ```
161
+
162
+ We get the generation as a function call:
163
+ ```
164
+ Function call: {"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}<end_of_turn>
165
+ ```
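
Before the function's response can be passed back, the generated text has to be parsed and the named function executed. The following is a minimal sketch of that step, assuming the output has exactly the `Function call: {...}<end_of_turn>` shape shown above. `parse_function_call`, `run_function_call`, `TOOLS`, and the `result_key` parameter are hypothetical names introduced for illustration.

```python
import json

END_OF_TURN = "<end_of_turn>"
CALL_PREFIX = "Function call:"


def parse_function_call(generated_text: str) -> dict:
    """Extract the JSON body from text shaped like 'Function call: {...}<end_of_turn>'."""
    text = generated_text.strip()
    if text.endswith(END_OF_TURN):
        text = text[: -len(END_OF_TURN)].strip()
    if not text.startswith(CALL_PREFIX):
        # The model answered in plain text instead of calling a function.
        raise ValueError("no function call found in the generated text")
    return json.loads(text[len(CALL_PREFIX):])


def run_function_call(generated_text: str, tools: dict, result_key: str) -> str:
    """Call the requested tool and return its result as a JSON string."""
    call = parse_function_call(generated_text)
    result = tools[call["name"]](**call["arguments"])
    return json.dumps({result_key: result})


def get_sunrise_sunset_times(city: str):
    return ["6:00 AM", "6:00 PM"]  # stub value, as in the example above


# Map tool names to the Python callables the model is allowed to request.
TOOLS = {"get_sunrise_sunset_times": get_sunrise_sunset_times}

generated = 'Function call: {"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}<end_of_turn>'
print(run_function_call(generated, TOOLS, result_key="times_list"))
```

Looking tools up by name in an explicit dictionary keeps the model from invoking anything that was not deliberately exposed.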
166
+
167
+ Great, now we can pick up and process the results with our *called function*, and then provide the model with the *function's response*:
168
+ ```python
169
+ history_messages = [
170
+ {"role": "system", "content": "You are a helpful assistant with access to the following functions. Use them if required - "},
171
+ {"role": "user", "content": "Hi, can you tell me the time of sunrise in Los Angeles?"},
172
+ {"role": "function-call", "content": '{"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}'},
173
+ {"role": "function-response", "content": '{"times_list": ["6:00 AM", "6:00 PM"]}'}, # a hypothetical response from our function
174
+ ]
175
+ inputs = tokenizer.apply_chat_template(
176
+ history_messages,
177
+ tokenize=False,
178
+ add_generation_prompt=True, # adding prompt for generation
179
+ tools=[get_weather, get_sunrise_sunset_times], # our functions (tools)
180
+ )
181
+ print(inputs)
182
+ ```
183
+
184
+ Let's make sure the `inputs` are correct:
185
+ ```
186
+ <bos><start_of_turn>user
187
+ You are a helpful assistant with access to the following functions. Use them if required - {
188
+ "name": "get_weather",
189
+ "description": "A function that returns the weather in a given city.",
190
+ "parameters": {
191
+ "type": "object",
192
+ "properties": {
193
+ "city": {
194
+ "type": "string",
195
+ "description": "The city to get the weather for."
196
+ }
197
+ },
198
+ "required": [
199
+ "city"
200
+ ]
201
+ }
202
+ },
203
+ {
204
+ "name": "get_sunrise_sunset_times",
205
+ "description": "A function that returns the time of sunrise and sunset at the present moment, for a given city, in the form of a list: [sunrise_time, sunset_time].",
206
+ "parameters": {
207
+ "type": "object",
208
+ "properties": {
209
+ "city": {
210
+ "type": "string",
211
+ "description": "The city to get the sunrise and sunset times for."
212
+ }
213
+ },
214
+ "required": [
215
+ "city"
216
+ ]
217
+ }
218
+ }
219
+ Hi, can you tell me the time of sunrise in Los Angeles?<end_of_turn>
220
+ <start_of_turn>model
221
+ Function call: {"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}<end_of_turn>
222
+ <start_of_turn>user
223
+ Function response: {"times_list": ["6:00 AM", "6:00 PM"]}<end_of_turn>
224
+ <start_of_turn>model
225
+ ```
226
+
227
+ Similarly, we generate a response from the model:
228
+ ```python
229
+ prompt_ids = tokenizer.encode(inputs, add_special_tokens=False, return_tensors='pt').to(model.device)
230
+ generated_ids = model.generate(
231
+ prompt_ids,
232
+ max_new_tokens=512,
233
+ eos_token_id=terminator_ids,
234
+ bos_token_id=tokenizer.bos_token_id,
235
+ )
236
+ generated_response = tokenizer.decode(generated_ids[0][prompt_ids.shape[-1]:], skip_special_tokens=False) # `skip_special_tokens=False` for debug
237
+ print(generated_response)
238
+ ```
239
+
240
+ As a result, we get the model's response:
241
+ ```
242
+ The sunrise time in Los Angeles is 6:00 AM.<end_of_turn>
243
+ ```
244
+
245
+ ## Usage via transformers `pipeline`
246
+
247
+ <details>
248
+ <summary>
249
+ Generation via pipeline
250
+ </summary>
251
+
252
+ ```python
253
+ import torch
+ from transformers import pipeline
254
+ generation_pipeline = pipeline(
255
+ "text-generation",
256
+ model="DiTy/gemma-2-27b-it-function-calling-GGUF",
257
+ model_kwargs={
258
+ "torch_dtype": torch.bfloat16, # use float16 or float32 if bfloat16 is not supported for you.
259
+ "cache_dir": PATH_TO_MODEL_DIR, # OPTIONAL
260
+ },
261
+ device_map="auto",
262
+ )
263
+ history_messages = [
264
+ {"role": "system", "content": "You are a helpful assistant with access to the following functions. Use them if required - "},
265
+ {"role": "user", "content": "Hi, can you tell me the time of sunrise in Los Angeles?"},
266
+ {"role": "function-call", "content": '{"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}'},
267
+ {"role": "function-response", "content": '{"times_list": ["6:00 AM", "6:00 PM"]}'},
268
+ ]
269
+ inputs = generation_pipeline.tokenizer.apply_chat_template(
270
+ history_messages,
271
+ tokenize=False,
272
+ add_generation_prompt=True,
273
+ tools=[get_weather, get_sunrise_sunset_times],
274
+ )
275
+ terminator_ids = [
276
+ generation_pipeline.tokenizer.eos_token_id,
277
+ generation_pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>")
278
+ ]
279
+ outputs = generation_pipeline(
280
+ inputs,
281
+ max_new_tokens=512,
282
+ eos_token_id=terminator_ids,
283
+ )
284
+ print(outputs[0]["generated_text"][len(inputs):])
285
+ ```
286
+
287
+ </details>
288
+
289
+ ## <a name="roles"></a>Prompt structure and expected content
290
+
291
+ The model works best when prompts are built with `apply_chat_template`.
292
+ The message history must be passed in the following format:
293
+ ```python
294
+ history_messages = [
295
+ {"role": "...", "content": "..."},
296
+ ...
297
+ ]
298
+ ```
299
+
300
+ The following roles are available for use:
301
+
302
+ * `system` - an optional role. Its content is always placed at the very beginning, before the list of functions (tools) available to the model.
303
+ You can always use the standard option that was used during the training: ***"You are a helpful assistant with access to the following functions. Use them if required - "***
304
+ * `user` - the user's request is transmitted through this role.
305
+ * `function-call` - The body of the function call is passed through this role.
306
+ Although the model is trained to generate a function call in the form of ***"Function call: {...}\<end_of_turn\>"***, you should still pass only the body ***"{...}"***
307
+ to the *"content"* field, because `apply_chat_template` adds the *"Function call: "* prefix automatically.
308
+ * `function-response` - in this role, we must pass the response of our function in the *"content"* field as a dictionary ***'{"name_returnable_value": value}'***.
309
+ * `model` - the content under this role is considered to be the generated text of the model.
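
As a rough illustration of the role rules above, the sketch below builds `function-call` and `function-response` messages and checks a history before it is handed to `apply_chat_template`. The helper names (`function_call_message`, `function_response_message`, `check_history`) are hypothetical and not part of the model or of Transformers.

```python
import json

ALLOWED_ROLES = {"system", "user", "function-call", "function-response", "model"}


def function_call_message(name: str, arguments: dict) -> dict:
    """Only the JSON body goes into "content", without the "Function call: " prefix."""
    return {"role": "function-call", "content": json.dumps({"name": name, "arguments": arguments})}


def function_response_message(value_name: str, value) -> dict:
    """The function's result is passed as a one-entry dictionary, serialized to JSON."""
    return {"role": "function-response", "content": json.dumps({value_name: value})}


def check_history(history: list) -> None:
    """Raise ValueError if a message uses an unknown role or a non-JSON function body."""
    for message in history:
        role = message["role"]
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        if role in ("function-call", "function-response"):
            # json.JSONDecodeError is a subclass of ValueError, so a stray
            # "Function call: " prefix in the content is rejected here.
            json.loads(message["content"])


history_messages = [
    {"role": "system", "content": "You are a helpful assistant with access to the following functions. Use them if required - "},
    {"role": "user", "content": "Hi, can you tell me the time of sunrise in Los Angeles?"},
    function_call_message("get_sunrise_sunset_times", {"city": "Los Angeles"}),
    function_response_message("times_list", ["6:00 AM", "6:00 PM"]),
]
check_history(history_messages)
```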
310
+
311
+ ### Chat history with *Function Calling*
312
+
313
+ ```
314
+ [
315
+ {"role": "system", "content": "You are a helpful assistant with access to the following functions. Use them if required - "},
316
+ {"role": "user", "content": "Hi, can you tell me the time of sunrise in Los Angeles?"},
317
+ {"role": "function-call", "content": '{"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}'},
318
+ {"role": "function-response", "content": '{"times_list": ["6:00 AM", "6:00 PM"]}'},
319
+ ]
320
+ ```
321
+
322
+ The rendered prompt looks like this:
323
+ ```
324
+ <bos><start_of_turn>user
325
+ You are a helpful assistant with access to the following functions. Use them if required - {
326
+ "name": "get_weather",
327
+ "description": "A function that returns the weather in a given city.",
328
+ "parameters": {
329
+ "type": "object",
330
+ "properties": {
331
+ "city": {
332
+ "type": "string",
333
+ "description": "The city to get the weather for."
334
+ }
335
+ },
336
+ "required": [
337
+ "city"
338
+ ]
339
+ }
340
+ },
341
+ {
342
+ "name": "get_sunrise_sunset_times",
343
+ "description": "A function that returns the time of sunrise and sunset at the present moment, for a given city, in the form of a list: [sunrise_time, sunset_time].",
344
+ "parameters": {
345
+ "type": "object",
346
+ "properties": {
347
+ "city": {
348
+ "type": "string",
349
+ "description": "The city to get the sunrise and sunset times for."
350
+ }
351
+ },
352
+ "required": [
353
+ "city"
354
+ ]
355
+ }
356
+ }
357
+ Hi, can you tell me the time of sunrise in Los Angeles?<end_of_turn>
358
+ <start_of_turn>model
359
+ Function call: {"name": "get_sunrise_sunset_times", "arguments": {"city": "Los Angeles"}}<end_of_turn>
360
+ <start_of_turn>user
361
+ Function response: {"times_list": ["6:00 AM", "6:00 PM"]}<end_of_turn>
362
+ ```
363
+
364
+
365
+ ### Chat history with a standard user-model template
366
+
367
+ ```
368
+ [
369
+ {"role": "system", "content": "You are a helpful assistant"},
370
+ {"role": "user", "content": "Tell me about California"},
371
+ ]
372
+ ```
373
+
374
+ The rendered prompt looks like this:
375
+ ```
376
+ <bos><start_of_turn>user
377
+ You are a helpful assistant
378
+ Tell me about California<end_of_turn>
379
+ ```
380
+
381
+ ## <a name="eval"></a>Evaluation
382
+
383
+ During training, the validation loss converged to approximately the following values:
384
+
385
+ | **Model** | **Generation Language** | **Approximate Validation Loss** |
386
+ | :-----: | :-----: | :-----: |
387
+ | [**DiTy/gemma-2-27b-it-function-calling-GGUF**](https://huggingface.co/DiTy/gemma-2-27b-it-function-calling-GGUF) | **EN** | **0.47** |
388
+ | [DiTy/gemma-2-9b-it-russian-function-calling-GGUF](https://huggingface.co/DiTy/gemma-2-9b-it-russian-function-calling-GGUF) | RU | 0.57 |
389
+ | [DiTy/gemma-2-9b-it-function-calling-GGUF](https://huggingface.co/DiTy/gemma-2-9b-it-function-calling-GGUF) | EN | 0.5 |
390
+ | [DiTy/gemma-2-2b-it-function-calling](https://huggingface.co/DiTy/gemma-2-2b-it-function-calling) | EN | 0.66 |
391
+
392
+ ## Citation
393
+
394
+ ```none
395
+ @article{gemma_2024,
396
+ title={Gemma},
397
+ url={https://www.kaggle.com/m/3301},
398
+ DOI={10.34740/KAGGLE/M/3301},
399
+ publisher={Kaggle},
400
+ author={Gemma Team},
401
+ year={2024}
402
+ }
403
+ ```
gemma-2-27b-it-function-calling-gguf.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:744ae2c4ada48a987fbdaf68c3a81d216fcbcb61fea2547916009a9c095f9601
3
+ size 15628381152