jpacifico committed · verified · Commit 7f79223 · Parent(s): 27a963c

Update README.md

Files changed (1):
  README.md (+38 -6)
README.md CHANGED
@@ -29,21 +29,53 @@ The recommended usage is by loading the low-rank adapter using unsloth:
 
 ```python
 from unsloth import FastLanguageModel
+from transformers import TextStreamer
+import torch
+
+model_name = "jpacifico/final_model_combined_sft_dpo"
 
-model_name = "jpacifico/Chocolatine-Cook-3B-combined-SFT-DPO-v0.1"
 model, tokenizer = FastLanguageModel.from_pretrained(
-    model_name = model_name,
-    max_seq_length = 2048,
-    dtype = None,
-    load_in_4bit = True,
+    model_name,
+    max_seq_length=2048,
+    dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
+    load_in_4bit=False,
 )
 
 FastLanguageModel.for_inference(model)
+model.eval()
+
+def generate_response(user_question: str):
+    messages = [
+        {"role": "system", "content": "Tu es un assistant IA spécialisé dans le langage culinaire français. Une question te sera posée. Tu dois générer une réponse précise et concise."},
+        {"role": "user", "content": "En cuisine " + user_question},
+    ]
+
+    inputs = tokenizer.apply_chat_template(
+        messages,
+        tokenize=True,
+        add_generation_prompt=True,
+        return_tensors="pt",
+    ).to("cuda")
+
+    attention_mask = (inputs != tokenizer.pad_token_id).long()
+
+    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+
+    with torch.no_grad():
+        _ = model.generate(
+            input_ids=inputs,
+            attention_mask=attention_mask,
+            max_new_tokens=128,
+            use_cache=True,
+            streamer=text_streamer,
+            do_sample=False,
+        )
 ```
 
 ### Limitations
 
-The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
+The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
 It does not have any moderation mechanism.
 
 - **Developed by:** Jonathan Pacifico, 2024
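
For reference, a minimal usage sketch of the `generate_response` helper added in this commit. The system prompt tells the model, in French, to act as a French culinary-language assistant giving precise, concise answers, and the helper prepends "En cuisine " ("In cooking, ") to the question. The question string below is a hypothetical example; the reply streams to stdout via `TextStreamer`:

```python
# Hypothetical example call; assumes the snippet from the updated README has run.
# TextStreamer prints the generated answer token by token, so the call itself
# returns nothing useful.
generate_response("comment réussir une sauce béarnaise ?")
```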