Update README.md
README.md
CHANGED
@@ -65,6 +65,16 @@ model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-2.9B-world', trust_r
 tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-2.9B-world', trust_remote_code=True)
 ```
 
+### Training Data
+
+This model was trained on the World v3 dataset, for a total of 3.119 trillion tokens.
+
+#### Training Hyperparameters
+
+- **Training regime:** bfloat16, lr 4e-4 to 1e-5 "delayed" cosine decay, wd 0.1 (with batch sizes increased partway through training)
+- **Final Loss:** 1.8745
+- **Token Count:** 3.119 trillion
+
 ## FAQ
 Q: safetensors metadata is none.
 
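The training regime above mentions a "delayed" cosine decay from 4e-4 to 1e-5 without defining it. The sketch below shows one plausible reading, under the assumption that "delayed" means the learning rate is held at its peak for an initial fraction of training before a standard cosine decay begins. The function name `delayed_cosine_lr` and the parameters `total_steps` and `delay_frac` are hypothetical and do not come from the model card or the RWKV training code; the sketch also does not model the batch size changes or weight decay.

```python
import math


def delayed_cosine_lr(step, total_steps, peak_lr=4e-4, final_lr=1e-5, delay_frac=0.1):
    """Illustrative schedule: hold peak_lr, then cosine-decay to final_lr.

    delay_frac is an assumed value; the source does not state how long
    the decay is delayed.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0.0 <= delay_frac < 1.0:
        raise ValueError("delay_frac must be in [0, 1)")

    # Clamp the step into the valid range.
    step = min(max(step, 0), total_steps)
    delay_steps = int(total_steps * delay_frac)

    # Assumed "delay" phase: learning rate stays at its peak.
    if step <= delay_steps:
        return peak_lr

    # Standard cosine decay over the remaining steps.
    progress = (step - delay_steps) / (total_steps - delay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


if __name__ == "__main__":
    for s in (0, 100, 550, 1000):
        print(s, delayed_cosine_lr(s, total_steps=1000))
```

With `delay_frac=0.0` this reduces to an ordinary cosine decay between the two learning rates, which makes it easy to compare the two shapes.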