Commit d84367f by rubenroy · verified · 1 parent: 17f593f

Update README.md

  1. README.md +4 -3
README.md CHANGED
@@ -31,10 +31,11 @@ Zurich 1.5B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instr
 - **Base Model:** [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
 - **Type:** Causal Language Models
 - **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
-- **Number of Parameters:** 7.61B
-- **Number of Paramaters (Non-Embedding):** 6.53B
+- **Number of Parameters:** 1.54B
+- **Number of Paramaters (Non-Embedding)**: 1.31B
 - **Number of Layers:** 28
-- **Number of Attention Heads (GQA):** 28 for Q and 4 for KV
+- **Number of Attention Heads (GQA):** 12 for Q and 2 for KV
+
 
 ## Training Details
 
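
Review note: the corrected figures can be sanity-checked with a little arithmetic. The sketch below is only a rough check, not the author's method. The layer count, head counts, QKV bias, SwiGLU, and RMSNorm come from the README lines in this diff. The hidden size, MLP intermediate size, vocabulary size, and tied input/output embeddings are assumptions taken from the base model's published configuration and do not appear in this commit. The function name `count_parameters` is hypothetical.

```python
# Rough parameter count for the architecture described in the README.
# Values marked "assumed" are NOT in this commit; they are taken from the
# base model's published configuration and should be checked against it.
HIDDEN_SIZE = 1536          # assumed
INTERMEDIATE_SIZE = 8960    # assumed (SwiGLU MLP width)
VOCAB_SIZE = 151936         # assumed
TIED_EMBEDDINGS = True      # assumed: output head shares the embedding matrix
NUM_LAYERS = 28             # from the README
NUM_Q_HEADS = 12            # from the README (after this commit)
NUM_KV_HEADS = 2            # from the README (after this commit)


def count_parameters():
    """Return (total, non_embedding) parameter counts."""
    head_dim = HIDDEN_SIZE // NUM_Q_HEADS
    kv_dim = NUM_KV_HEADS * head_dim

    # Attention: Q, K, V projections carry a bias; the output projection does not.
    q_proj = HIDDEN_SIZE * HIDDEN_SIZE + HIDDEN_SIZE
    k_proj = HIDDEN_SIZE * kv_dim + kv_dim
    v_proj = HIDDEN_SIZE * kv_dim + kv_dim
    o_proj = HIDDEN_SIZE * HIDDEN_SIZE

    # SwiGLU MLP: gate, up, and down projections, no bias.
    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE

    # Two RMSNorm weight vectors per layer.
    norms = 2 * HIDDEN_SIZE

    per_layer = q_proj + k_proj + v_proj + o_proj + mlp + norms
    non_embedding = NUM_LAYERS * per_layer + HIDDEN_SIZE  # plus final RMSNorm

    embedding = VOCAB_SIZE * HIDDEN_SIZE
    if not TIED_EMBEDDINGS:
        embedding *= 2  # separate output head
    return non_embedding + embedding, non_embedding


total, non_embedding = count_parameters()
print(f"total: {total / 1e9:.2f}B, non-embedding: {non_embedding / 1e9:.2f}B")
```

Under these assumptions the totals round to the 1.54B and 1.31B figures introduced by this commit, which suggests the previous 7.61B / 6.53B values and the 28/4 head counts were carried over from a larger model's card.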