kevin009 committed · Commit 30c8720 · verified · Parent: 7985bdf

Create README.md

Files changed (1): README.md added (+145 lines)

# Merged-Llama-Adapters-317-320

A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B language model.

## Model Details

- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Adaptation Method: Merged LoRA
- Source Adapters:
  - https://huggingface.co/kevin009/llama317
  - https://huggingface.co/kevin009/llama318
  - https://huggingface.co/kevin009/llama319
  - https://huggingface.co/kevin009/llama320

## Merge Configuration

### Source Adapters

All source adapters share the following configuration (a matching `LoraConfig` is sketched below):

- Rank (r): 16
- Alpha: 16
- Target Modules:
  - q_proj (query projection)
  - k_proj (key projection)
  - v_proj (value projection)
  - o_proj (attention output projection)
  - up_proj (MLP up projection)
  - down_proj (MLP down projection)
  - gate_proj (MLP gate projection)
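
For reference, the shared settings above correspond roughly to the following PEFT `LoraConfig`. This is a reconstruction from the list above, not the source adapters' actual training configuration; in particular, `task_type` is an assumption based on the causal-LM base model:

```python
from peft import LoraConfig

# Sketch of the shared configuration of the four source adapters.
# task_type is assumed; the adapters' exact training settings may differ.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    task_type="CAUSAL_LM",
)
```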

### Merge Details

- Merge Method: Weighted adapter combination via PEFT's `add_weighted_adapter` (TIES, see below)
- Merge Weights: Equal weights (1.0 each) for all four adapters
- Combined Rank: 16 (maintained from the source adapters)

## Usage

This merged adapter must be used with the base Llama-3.1-8B-Instruct model.

### Loading the Model

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Load the merged LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "path_to_merged_adapter")
```
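
Once loaded, the model can be used for inference like any other causal language model. A minimal generation sketch (the prompt and generation settings below are illustrative and not part of the original card):

```python
prompt = "Explain LoRA adapters in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a short completion with the merged adapter active
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```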

## Limitations and Biases

- This merged adapter inherits limitations and biases from:
  - The base Llama-3.1-8B-Instruct model
  - All four source adapters
- The merging process may result in:
  - Potential loss of specialized capabilities from individual adapters
  - Averaged behavior across different adapter specializations
  - Possible interference between adapter weights

## Merging Process

The adapters were merged using the following approach:

1. Weighted combination of the adapter weights using PEFT's TIES method
2. Equal weighting applied to each source adapter
3. Preservation of the original LoRA rank and architecture

### Method Used

The adapters were merged using the PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature (`add_weighted_adapter`), which combines multiple LoRA adapters into a single adapter using the specified weights and combination method.
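
Conceptually, each LoRA adapter contributes a low-rank update ΔW = B·A to the targeted base weights, and a weighted combination mixes these updates before they are applied. The toy NumPy sketch below illustrates only this weighted-sum idea; the actual TIES combination used here additionally trims and sign-elects the updates:

```python
import numpy as np

d, k, r = 8, 8, 4  # toy dimensions: output size, input size, LoRA rank

# Two toy LoRA adapters, each defined by matrices B (d x r) and A (r x k)
rng = np.random.default_rng(0)
adapters = [(rng.normal(size=(d, r)), rng.normal(size=(r, k))) for _ in range(2)]
weights = [1.0, 1.0]

# Weighted combination of the low-rank updates Delta_W = B @ A
delta_w = sum(w * (B @ A) for w, (B, A) in zip(weights, adapters))
print(delta_w.shape)  # (8, 8) update that would be added to the base weight
```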

### Step-by-Step Merging Process

1. Load the base model and initial adapter:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Load the first adapter as the base PEFT model
peft_model = PeftModel.from_pretrained(model, "llama319", adapter_name="llama319")
```

2. Load additional adapters:

```python
# Load the remaining adapters into the same PEFT model
peft_model.load_adapter("llama320", adapter_name="llama320")
peft_model.load_adapter("llama318", adapter_name="llama318")
peft_model.load_adapter("llama317", adapter_name="llama317")
```
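
As an optional sanity check (not part of the original card), the registered adapter names can be inspected before merging:

```python
# Should list llama319, llama320, llama318, llama317
print(list(peft_model.peft_config.keys()))
```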

3. Configure and execute the merge:

```python
# Define the adapters and their weights
adapters = ["llama319", "llama320", "llama318", "llama317"]
weights = [1.0, 1.0, 1.0, 1.0]  # Equal weights for all adapters

# Merge the adapters into a new adapter named "merge"
peft_model.add_weighted_adapter(
    adapters,
    weights,
    "merge",
    combination_type="ties",  # TIES combination method
    density=0.2,              # Fraction of values retained during trimming
)

# Set the merged adapter as the active adapter
peft_model.set_adapter("merge")

# Save the merged adapter
peft_model.save_pretrained("merged")
```
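
Optionally, the merged adapter can also be folded directly into the base weights and saved as a standalone model using PEFT's `merge_and_unload`. This step is not part of the original workflow above and is shown only as a possible follow-up (output paths are illustrative):

```python
# Fold the active ("merge") adapter into the base weights and drop the PEFT wrappers
merged_model = peft_model.merge_and_unload()

# Save a full standalone checkpoint
merged_model.save_pretrained("merged-full-model")
tokenizer.save_pretrained("merged-full-model")
```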

### Key Parameters

- `combination_type="ties"`: Uses the TIES (TrIm, Elect Sign & Merge) method, which resolves interference between adapters when combining them
- `density=0.2`: Controls the sparsity of the merged weights (the fraction of values retained during the TIES trimming step; a toy sketch of this follows below)
- `weights=[1.0, 1.0, 1.0, 1.0]`: Equal weighting for all adapters
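
To make the `density` parameter more concrete, the toy NumPy sketch below shows magnitude-based trimming, the first stage of TIES: only the largest-magnitude fraction of values in an update is kept. This illustrates the idea only and is not the PEFT implementation:

```python
import numpy as np

def trim(update: np.ndarray, density: float) -> np.ndarray:
    """Keep only the top `density` fraction of values by magnitude, zero the rest."""
    k = max(1, int(round(density * update.size)))
    threshold = np.sort(np.abs(update), axis=None)[-k]
    return np.where(np.abs(update) >= threshold, update, 0.0)

rng = np.random.default_rng(0)
delta = rng.normal(size=(4, 5))          # a toy LoRA weight update
sparse_delta = trim(delta, density=0.2)  # roughly 20% of entries survive
print(np.count_nonzero(sparse_delta), "of", delta.size, "values kept")
```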

### Notes

- The order in which adapters are loaded may affect the final result
- Equal weights were chosen to maintain a balanced influence from each adapter
- The merged adapter maintains the same architecture and rank as the original adapters
- While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore language model capabilities

## License

Licensed under the Apache License 2.0.

This merged adapter is part of independent individual research work. While the code is open source under the Apache 2.0 license, please note:

- You are free to use, modify, and distribute this adapter under the Apache 2.0 license terms
- This work is provided "as is" without warranties or conditions of any kind
- This is an independent research project and is not affiliated with any organization
- Attribution is appreciated but not required
- For full license details, see: https://www.apache.org/licenses/LICENSE-2.0