Compiled models train faster, so more of them can be trained to better convergence within a short experiment. 921107d verified SQCU committed 20 days ago
89,301,000-parameter attention_ii model with z-loss, trained for 6,250 steps at batchsize:4*32, device_batchsize:32. 8a69386 verified SQCU committed 22 days ago
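The "z-lossed" model above presumably adds an auxiliary z-loss term that penalizes the squared log-partition function of the logits, keeping their scale from drifting. The exact coefficient and formulation used in this repo are not stated; a minimal NumPy sketch of the common form, with an assumed coefficient of 1e-4:

```python
import numpy as np

def z_loss(logits, coeff=1e-4):
    """Auxiliary z-loss: coeff * mean(logsumexp(logits)**2).

    Penalizes large log-partition values so logits stay well-scaled.
    Coefficient 1e-4 is an assumption, not the repo's actual setting.
    """
    # numerically stabilized logsumexp over the vocabulary axis
    m = logits.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(logits - m).sum(axis=-1))
    return coeff * np.mean(lse ** 2)
```

During training this term is simply added to the cross-entropy loss; for all-zero logits over a vocabulary of size V it evaluates to `coeff * log(V)**2`.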
Sling the illustrious and mysterious "attention_II" models; also some layerwise RMSNorm and QK-projection RMSNorm models, one twice as large as the other. 1f45909 verified SQCU committed 22 days ago
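The "qkprojection rmsnorm" variants presumably normalize the query and key projections before the attention scores are formed (QK-norm), which bounds the logit scale of the softmax. The repo's exact layer layout isn't shown; a single-head NumPy sketch under that assumption:

```python
import numpy as np

def rmsnorm(x, gain, eps=1e-6):
    # rescale each vector so its root-mean-square is ~1, then apply a learned gain
    return gain * x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def qk_normed_attention(x, wq, wk, wv):
    """Hypothetical single-head attention with RMSNorm on the q/k projections.

    Weight names and unit gains are illustrative, not the repo's parameters.
    """
    d = wq.shape[1]
    q = rmsnorm(x @ wq, np.ones(d))   # QK-norm: normalize after projection
    k = rmsnorm(x @ wk, np.ones(d))
    v = x @ wv
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # stabilized softmax
    probs = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return probs @ v
```

Because q and k each have RMS ~1 after normalization, the pre-softmax scores are bounded regardless of how large the projection weights grow.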