kyle PRO

kaikaidai

AI & ML interests

None yet

Recent Activity

liked a Space about 2 hours ago
allenai/reward-bench
updated a model about 20 hours ago
AtlaAI/Selene-1-Mini-Llama-3.1-8B
updated a collection 3 days ago
Selene-1-Mini

Organizations

Blog-explorers, Atla

Posts 1

📈 Early results on the 8B evaluation model we've been training...

@NinaCalvi wrote about the progress we've made this quarter towards training the best 'LLM-as-a-judge' evaluator. We've significantly improved against the baseline and are approaching state-of-the-art evaluation performance with an 8B model.

Next up: training Llama-3.1-70B 👀

Here's the full article: https://www.atla-ai.com/post/evaluating-the-evaluator

Articles 3

Selene 1 Mini: the best small language model-as-a-judge

Models

None public yet

Datasets

None public yet