Kaiyue Wen
KaiyueWen
AI & ML interests
None yet
Recent Activity
authored a paper 15 days ago: Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
authored a paper 15 days ago: RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval
authored a paper 15 days ago: Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Organizations
None yet
Papers
3
arxiv:2501.11873
arxiv:2402.18510
arxiv:2307.11007
Models
None public yet
Datasets
None public yet