Alex Havrilla

Dahoas

AI & ML interests

NLP, RL

Recent Activity

updated a dataset 8 days ago
Dahoas/MATH
published a dataset 8 days ago
Dahoas/MATH
updated a dataset about 2 months ago
Dahoas/numina-synthetic

Organizations

CarperAI, DuckAI, Critiquers, An optimal synthetic data sampling strategy for MATH

Articles 1

Illustrating Reinforcement Learning from Human Feedback (RLHF)