This model was converted to GGUF format from [`nbeerbower/mistral-nemo-gutenberg3-12B`](https://huggingface.co/nbeerbower/mistral-nemo-gutenberg3-12B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/nbeerbower/mistral-nemo-gutenberg3-12B) for more details on the model.

---

## Model details

Mahou-1.5-mistral-nemo-12B-lorablated finetuned on jondurbin/gutenberg-dpo-v0.1, nbeerbower/gutenberg2-dpo, and nbeerbower/gutenberg-moderne-dpo.
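
The card does not show how the three datasets were combined; below is a plausible sketch with the `datasets` library, assuming each dataset ships a single `train` split of DPO-style (prompt, chosen, rejected) records:

```python
from datasets import concatenate_datasets, load_dataset

# Concatenate the three Gutenberg preference datasets into one
# DPO-style training set.
dataset = concatenate_datasets([
    load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train"),
    load_dataset("nbeerbower/gutenberg2-dpo", split="train"),
    load_dataset("nbeerbower/gutenberg-moderne-dpo", split="train"),
])
```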

### Method

ORPO-tuned with 8x A100 for 2 epochs.

QLoRA config:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# Compute dtype for the 4-bit base weights; bfloat16 is assumed here,
# matching the bf16=True flag in the training config below.
torch_dtype = torch.bfloat16

# QLoRA config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

# LoRA config
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["up_proj", "down_proj", "gate_proj", "k_proj", "q_proj", "v_proj", "o_proj"],
)
```
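
For context, here is a minimal sketch of loading the base model under this quantization config; the `nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated` repo id is an assumption based on the base model named above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id for the base model named in this card.
base_model = "nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit NF4 config from above
    device_map="auto",
)
```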

Training config:

```python
from trl import ORPOConfig

# `new_model` is defined earlier in the original training script; the
# value below is inferred from this model's name.
new_model = "mistral-nemo-gutenberg3-12B"

orpo_args = ORPOConfig(
    run_name=new_model,
    learning_rate=8e-6,
    lr_scheduler_type="linear",
    max_length=4096,
    max_prompt_length=2048,
    max_completion_length=2048,
    beta=0.1,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_8bit",
    num_train_epochs=2,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=10,
    max_grad_norm=10,
    report_to="wandb",
    output_dir="./results/",
    bf16=True,
    gradient_checkpointing=True,
)
```
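
How these pieces fit together is not shown in the card; below is a minimal sketch, assuming the `model`, `tokenizer`, `dataset`, and `peft_config` objects from the snippets above and a small held-out eval split:

```python
from trl import ORPOTrainer

# Assumed: carve out a small eval split for the step-based evaluation
# configured above.
split = dataset.train_test_split(test_size=0.01)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=split["train"],
    eval_dataset=split["test"],
    peft_config=peft_config,
    tokenizer=tokenizer,  # newer trl releases rename this to processing_class
)
trainer.train()
trainer.save_model(new_model)
```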

---

## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux).
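
The quantized model can also be driven from Python through the `llama-cpp-python` bindings; a minimal sketch, where the repo id and quant filename are placeholders to replace with the actual `.gguf` file in this repository:

```python
from llama_cpp import Llama

# Placeholder repo id and quant filename; substitute the real ones.
llm = Llama.from_pretrained(
    repo_id="Triangle104/mistral-nemo-gutenberg3-12B-GGUF",
    filename="*q4_k_m.gguf",
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short Victorian-style opening paragraph."}]
)
print(out["choices"][0]["message"]["content"])
```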