Text Generation
GGUF
English
All use cases
reasoning
thoughts
deep thinking
deepseek
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid writing
fiction
bfloat16
swearing
sillytavern
Lmstudio
backyard
horror
Qwen 2.5
context 128k
mergekit
Inference Endpoints
conversational
Update README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,62 @@
|
|
1 |
-
|
2 |
-
|
3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
license: apache-2.0
|
2 |
+
language:
|
3 |
+
- en
|
4 |
+
tags:
|
5 |
+
- creative
|
6 |
+
- creative writing
|
7 |
+
- fiction writing
|
8 |
+
- plot generation
|
9 |
+
- sub-plot generation
|
10 |
+
- fiction writing
|
11 |
+
- story generation
|
12 |
+
- scene continue
|
13 |
+
- storytelling
|
14 |
+
- fiction story
|
15 |
+
- science fiction
|
16 |
+
- romance
|
17 |
+
- all genres
|
18 |
+
- story
|
19 |
+
- writing
|
20 |
+
- vivid prosing
|
21 |
+
- vivid writing
|
22 |
+
- fiction
|
23 |
+
- roleplaying
|
24 |
+
- bfloat16
|
25 |
+
- swearing
|
26 |
+
- role play
|
27 |
+
- sillytavern
|
28 |
+
- backyard
|
29 |
+
- horror
|
30 |
+
- llama 3.1
|
31 |
+
- context 128k
|
32 |
+
- mergekit
|
33 |
+
pipeline_tag: text-generation
|
34 |
+
---
|
35 |
+
|
36 |
+
(quants uploading...)
|
37 |
+
|
38 |
+
<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers, 1047 tensors) </h2>
|
39 |
+
|
40 |
+
<img src="deepseek.jpg" style="float:right; width:300px; height:300px; padding:10px;">
|
41 |
+
|
42 |
+
Context : 128k.
|
43 |
+
|
44 |
+
Required: CHATML template.
|
45 |
+
|
46 |
+
Keep in mind this model is experimental and may require one or more regens to work, especially with the "think" system of Deekseek involved here.
|
47 |
+
|
48 |
+
Brainstorm 40x is by DavidAU, and extends the "decision making" and "creativity" of an LLM/AI.
|
49 |
+
|
50 |
+
Higher temps will result in deeper, richer "thoughts"... and frankly more interesting ones too.
|
51 |
+
|
52 |
+
The "thinking/reasoning" tech (for the model at this repo) is from the original Qwen 2.5 "Distill" model from Deepseek:
|
53 |
+
|
54 |
+
[ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]
|
55 |
+
|
56 |
+
In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 72 layers, 16.5B parameters.
|
57 |
+
|
58 |
+
---
|
59 |
+
|
60 |
+
Model card, and examples pending...
|
61 |
+
|
62 |
+
---
|