JonasGeiping committed · Commit 15aba98 · verified · 1 Parent(s): 57dbfbc

Update README.md

Files changed (1)
  1. README.md +2 -5
README.md CHANGED
@@ -92,12 +92,9 @@ pipeline_tag: text-generation
 # - neuralwork/arxiver
 ---

-# Huginn-0125
-This is Huginn, version 01/25. This is a latent recurrent-depth model with 3.5B parameters, trained for 800B tokens on AMD MI250X machines. This is a proof-of-concept model, but surprisingly capable in reasoning and code given its training budget and size.
-All details on this model can be found in the tech report: "Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach."
+# Huginn-0125-intermediate checkpoints
+This is an intermediate checkpoint from our large-scale training run. Additional intermediate checkpoints are available upon request. All other information can be found at the main checkpoint.

-8 intermediate checkpoints of the model can be found in its collection. Additional intermediate checkpoints are available upon request while we find a place to host all ~350 of them. The data used to train
-this model is publicly available (entirely on Hugging Face), and scripts provided with the pretraining code at https://github.com/seal-rg/recurrent-pretraining can be used to repeat our preprocessing and our entire training run.

 ## Table of Contents