DavidAU
/

DeepSeek-R1-Distill-Qwen-25.5B-Brainstorm-gguf

Model card Files Files and versions Community

DavidAU commited on 10 days ago

Commit

8e8d35c

·

verified ·

1 Parent(s): 64fe239

Update README.md

Files changed (1) hide show

README.md +62 -3

README.md CHANGED Viewed

@@ -1,3 +1,62 @@
----
-license: apache-2.0
----

+license: apache-2.0
+language:
+- en
+tags:
+- creative
+- creative writing
+- fiction writing
+- plot generation
+- sub-plot generation
+- fiction writing
+- story generation
+- scene continue
+- storytelling
+- fiction story
+- science fiction
+- romance
+- all genres
+- story
+- writing
+- vivid prosing
+- vivid writing
+- fiction
+- roleplaying
+- bfloat16
+- swearing
+- role play
+- sillytavern
+- backyard
+- horror
+- llama 3.1
+- context 128k
+- mergekit
+pipeline_tag: text-generation
+---
+(quants uploading...)
+<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x, (88 layers, 1047 tensors) </h2>
+<img src="deepseek.jpg" style="float:right; width:300px; height:300px; padding:10px;">
+Context : 128k.
+Required: CHATML template.
+Keep in mind this model is experimental and may require one or more regens to work, especially with the "think" system of Deekseek involved here.
+Brainstorm 40x is by DavidAU, and extends the "decision making" and "creativity" of an LLM/AI.
+Higher temps will result in deeper, richer "thoughts"... and frankly more interesting ones too.
+The "thinking/reasoning" tech (for the model at this repo) is from the original Qwen 2.5 "Distill" model from Deepseek:
+[ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]
+In this case, Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Llama-8B" bringing it up to 72 layers, 16.5B parameters.
+---
+Model card, and examples pending...
+---