Tags: Text Generation, GGUF, English, All use cases, reasoning, thoughts, deep thinking, deepseek, creative, creative writing, fiction writing, plot generation, sub-plot generation, story generation, scene continue, storytelling, fiction story, science fiction, romance, all genres, story, writing, vivid writing, fiction, bfloat16, swearing, sillytavern, Lmstudio, backyard, horror, Qwen 2.5, context 128k, mergekit, Inference Endpoints, conversational
Update README.md
README.md CHANGED
@@ -36,7 +36,7 @@ pipeline_tag: text-generation
 
 (quants uploading...)
 
-<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers,
+<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers, 1043 tensors) </h2>
 
 Context : 128k.
 
@@ -52,7 +52,7 @@ The "thinking/reasoning" tech (for the model at this repo) is from the original
 
 [ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]
 
-In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to
+In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 88 layers, 25.5B parameters.
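Since the updated README advertises GGUF quants (still uploading) and a 128k context, here is a minimal sketch of loading one of those quants with llama-cpp-python once they are available. The file name and sampling settings below are placeholders of my own, not actual repo contents or the author's recommended settings; the note about `<think>` output follows the usual DeepSeek-R1 distill convention referenced in the README, not anything specific to this commit.

```python
# Minimal sketch (assumptions: llama-cpp-python installed, a GGUF quant of this
# model downloaded locally; the file name below is a placeholder, since the
# quants were still uploading at the time of this commit).
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-25.5B-Brainstorm40x-Q4_K_M.gguf",  # placeholder
    n_ctx=131072,      # README states 128k context
    n_gpu_layers=-1,   # offload all 88 layers if VRAM allows
)

# DeepSeek-R1 distills typically emit their reasoning in <think>...</think>
# before the final answer, so expect that block in the generated text.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write the opening scene of a slow-burn horror story."}],
    max_tokens=1024,
    temperature=0.8,
)
print(out["choices"][0]["message"]["content"])
```

The same GGUF file should also load in the front ends listed in the tags (LM Studio, SillyTavern via a llama.cpp backend, Backyard), provided the context size is set to fit available memory.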