pgptlformer-tinystories / dyn_qkrmsnorm_ii-7a038ecd-be98-46cb-abe8-e0f013fd7eed.txt

Commit History

sling the illustrious and mysterious "attention_II" models. also some layerwise rmsnorm, qkprojection rmsnorm models, one twice as large as the other.
1f45909
verified

SQCU committed on