apepkuss79 committed · verified
Commit c8bf68e · 1 Parent(s): ed6d799

Update README.md

Files changed (1)
  1. README.md +77 -115
README.md CHANGED
@@ -1,116 +1,78 @@
- ---
- model_name: DeepSeek-R1-Distill-Llama-8B
- base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- model_creator: deepseek-ai
- quantized_by: Second State Inc.
- library_name: transformers
- ---
-
- <!-- header start -->
- <!-- 200823 -->
- <div style="width: auto; margin-left: auto; margin-right: auto">
- <img src="https://github.com/LlamaEdge/LlamaEdge/raw/dev/assets/logo.svg" style="width: 100%; min-width: 400px; display: block; margin: auto;">
- </div>
- <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
- <!-- header end -->
-
- # DeepSeek-R1-Distill-Llama-8B-GGUF
-
- ## Original Model
-
- [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
-
- ## Run with LlamaEdge
-
- - LlamaEdge version: coming soon
-
- <!-- - LlamaEdge version: [v0.12.4](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.4) and above
-
- - Prompt template
-
-   - Prompt type for chat: `llama-3-chat`
-
-   - Prompt string
-
-     ```text
-     <|begin_of_text|><|start_header_id|>system<|end_header_id|>
-
-     {{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
-
-     {{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-
-     {{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>
-
-     {{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-     ```
-
-   - Prompt type for tool use: `llama-3-tool`
-
-   - Prompt string
-
-     ```text
-     <|begin_of_text|><|start_header_id|>system<|end_header_id|>
-
-     {system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>
-
-     Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
-
-     Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
-
-     [{"type":"function","function":{"name":"get_current_weather","description":"Get the current weather in a given location","parameters":{"type":"object","properties":{"location":{"type":"string","description":"The city and state, e.g. San Francisco, CA"},"unit":{"type":"string","description":"The temperature unit to use. Infer this from the users location.","enum":["celsius","fahrenheit"]}},"required":["location","unit"]}}}]
-
-     Question: {user_message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-     ``` -->
-
- - Context size: `128000`
-
- <!-- - Run as LlamaEdge service
-
-   - Chat
-
-     ```bash
-     wasmedge --dir .:. --nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
-       llama-api-server.wasm \
-       --prompt-template llama-3-chat \
-       --ctx-size 128000 \
-       --model-name Llama-3.1-8b
-     ```
-
-   - Tool use
-
-     ```bash
-     wasmedge --dir .:. --nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
-       llama-api-server.wasm \
-       --prompt-template llama-3-tool \
-       --ctx-size 128000 \
-       --model-name Llama-3.1-8b
-     ```
-
- - Run as LlamaEdge command app
-
-   ```bash
-   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
-     llama-chat.wasm \
-     --prompt-template llama-3-chat \
-     --ctx-size 128000
-   ``` -->
-
- ## Quantized GGUF Models
-
- | Name | Quant method | Bits | Size | Use case |
- | ---- | ---- | ---- | ---- | ----- |
- | [Meta-Llama-3.1-8B-Instruct-Q2_K.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q2_K.gguf) | Q2_K | 2 | 3.18 GB| smallest, significant quality loss - not recommended for most purposes |
- | [Meta-Llama-3.1-8B-Instruct-Q3_K_L.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_L.gguf) | Q3_K_L | 3 | 4.32 GB| small, substantial quality loss |
- | [Meta-Llama-3.1-8B-Instruct-Q3_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_M.gguf) | Q3_K_M | 3 | 4.02 GB| very small, high quality loss |
- | [Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf) | Q3_K_S | 3 | 3.66 GB| very small, high quality loss |
- | [Meta-Llama-3.1-8B-Instruct-Q4_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_0.gguf) | Q4_0 | 4 | 4.66 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
- | [Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf) | Q4_K_M | 4 | 4.92 GB| medium, balanced quality - recommended |
- | [Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf) | Q4_K_S | 4 | 4.69 GB| small, greater quality loss |
- | [Meta-Llama-3.1-8B-Instruct-Q5_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q5_0.gguf) | Q5_0 | 5 | 5.6 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
- | [Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf) | Q5_K_M | 5 | 5.73 GB| large, very low quality loss - recommended |
- | [Meta-Llama-3.1-8B-Instruct-Q5_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q5_K_S.gguf) | Q5_K_S | 5 | 5.6 GB| large, low quality loss - recommended |
- | [Meta-Llama-3.1-8B-Instruct-Q6_K.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q6_K.gguf) | Q6_K | 6 | 6.6 GB| very large, extremely low quality loss |
- | [Meta-Llama-3.1-8B-Instruct-Q8_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf) | Q8_0 | 8 | 8.54 GB| very large, extremely low quality loss - not recommended |
- | [Meta-Llama-3.1-8B-Instruct-f16.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/Meta-Llama-3.1-8B-Instruct-f16.gguf) | f16 | 16 | 16.1 GB| |
-
+ ---
+ model_name: DeepSeek-R1-Distill-Llama-8B
+ base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+ model_creator: deepseek-ai
+ quantized_by: Second State Inc.
+ library_name: transformers
+ ---
+
+ <!-- header start -->
+ <!-- 200823 -->
+ <div style="width: auto; margin-left: auto; margin-right: auto">
+ <img src="https://github.com/LlamaEdge/LlamaEdge/raw/dev/assets/logo.svg" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+ </div>
+ <hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
+ <!-- header end -->
+
+ # DeepSeek-R1-Distill-Llama-8B-GGUF
+
+ ## Original Model
+
+ [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
+
+ ## Run with LlamaEdge
+
+ - LlamaEdge version: coming soon
+
+ <!-- - LlamaEdge version: [v0.12.4](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.4) and above -->
+
+ - Prompt template
+
+   - Prompt type: `deepseek-chat-25`
+
+   - Prompt string
+
+     ```text
+     <|begin▁of▁sentence|>{system_message}<|User|>{user_message_1}<|Assistant|>{assistant_message_1}<|end▁of▁sentence|><|User|>{user_message_2}<|Assistant|>
+     ```
+
+ - Context size: `128000`
+
+ - Run as LlamaEdge service (see the download and sample-request snippets after this list)
+
+   ```bash
+   wasmedge --dir .:. --nn-preload default:GGML:AUTO:DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf \
+     llama-api-server.wasm \
+     --prompt-template deepseek-chat-25 \
+     --ctx-size 128000 \
+     --model-name DeepSeek-R1-Distill-Llama-8B
+   ```
+
+ - Run as LlamaEdge command app
+
+   ```bash
+   wasmedge --dir .:. --nn-preload default:GGML:AUTO:DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf \
+     llama-chat.wasm \
+     --prompt-template deepseek-chat-25 \
+     --ctx-size 128000
+   ```
+
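+ Both commands expect the selected GGUF file in the current working directory. A minimal sketch of fetching it with `curl`, assuming the repo path used by the links in the table below and Hugging Face's standard `resolve/main` download route:
+
+ ```bash
+ # Fetch the Q5_K_M quant referenced by the commands above
+ # (URL pattern assumed from the table below, with blob/ swapped for resolve/)
+ curl -LO https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/resolve/main/DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf
+ ```
+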
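+ Once the API server is running, it can be exercised with an OpenAI-compatible chat request (a minimal sketch, assuming the server listens on its default `8080` port):
+
+ ```bash
+ # Query the llama-api-server chat endpoint; "model" matches --model-name above
+ curl -X POST http://localhost:8080/v1/chat/completions \
+   -H 'Content-Type: application/json' \
+   -d '{"model": "DeepSeek-R1-Distill-Llama-8B", "messages": [{"role": "user", "content": "What is the capital of France?"}]}'
+ ```
+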
+ ## Quantized GGUF Models
+
+ | Name | Quant method | Bits | Size | Use case |
+ | ---- | ---- | ---- | ---- | ----- |
+ | [DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q2_K.gguf) | Q2_K | 2 | 3.18 GB| smallest, significant quality loss - not recommended for most purposes |
+ | [DeepSeek-R1-Distill-Llama-8B-Q3_K_L.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q3_K_L.gguf) | Q3_K_L | 3 | 4.32 GB| small, substantial quality loss |
+ | [DeepSeek-R1-Distill-Llama-8B-Q3_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q3_K_M.gguf) | Q3_K_M | 3 | 4.02 GB| very small, high quality loss |
+ | [DeepSeek-R1-Distill-Llama-8B-Q3_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q3_K_S.gguf) | Q3_K_S | 3 | 3.66 GB| very small, high quality loss |
+ | [DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q4_0.gguf) | Q4_0 | 4 | 4.66 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
+ | [DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf) | Q4_K_M | 4 | 4.92 GB| medium, balanced quality - recommended |
+ | [DeepSeek-R1-Distill-Llama-8B-Q4_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q4_K_S.gguf) | Q4_K_S | 4 | 4.69 GB| small, greater quality loss |
+ | [DeepSeek-R1-Distill-Llama-8B-Q5_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q5_0.gguf) | Q5_0 | 5 | 5.6 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
+ | [DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q5_K_M.gguf) | Q5_K_M | 5 | 5.73 GB| large, very low quality loss - recommended |
+ | [DeepSeek-R1-Distill-Llama-8B-Q5_K_S.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q5_K_S.gguf) | Q5_K_S | 5 | 5.6 GB| large, low quality loss - recommended |
+ | [DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q6_K.gguf) | Q6_K | 6 | 6.6 GB| very large, extremely low quality loss |
+ | [DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf) | Q8_0 | 8 | 8.54 GB| very large, extremely low quality loss - not recommended |
+ | [DeepSeek-R1-Distill-Llama-8B-f16.gguf](https://huggingface.co/second-state/DeepSeek-R1-Distill-Llama-8B/blob/main/DeepSeek-R1-Distill-Llama-8B-f16.gguf) | f16 | 16 | 16.1 GB| |
+
  *Quantized with llama.cpp b4519*