Phil
phil111
AI & ML interests
None yet
Recent Activity
new activity
12 days ago
mistralai/Mistral-Small-24B-Instruct-2501:This Mistral Small has FAR less knowledge than the last.
liked
a model
25 days ago
deepseek-ai/DeepSeek-R1
new activity
28 days ago
internlm/internlm3-8b-instruct:English tests and tasks are absurdly overfit.
Organizations
None yet
phil111's activity
This Mistral Small has FAR less knowledge than the last.
20
#5 opened 15 days ago
by
phil111
English tests and tasks are absurdly overfit.
21
#8 opened 30 days ago
by
phil111
A heavily filtered corpus simply doesn't work.
4
#19 opened about 1 month ago
by
phil111
I Don't Understand This Model
16
#9 opened about 1 month ago
by
phil111
Notably better than Phi3.5 in many ways, but something is wrong.
8
#5 opened about 2 months ago
by
phil111
Very impressive. Good world knowledge (SimpleQA of 25) despite high math/coding performance.
2
#27 opened about 2 months ago
by
phil111
SimpleQA score
2
#1 opened 2 months ago
by
frappuccino
Exceptional creative writer
5
#1 opened about 2 months ago
by
SubtleOne
Very High English MMLU scores, Yet Extremely Low Broad English Knowledge
2
#8 opened about 2 months ago
by
phil111
How was r7b?
6
#3 opened 2 months ago
by
MRU4913
Add Qwen 2.5 7B & Tulu 3 8B results to OLLM benchmarks
12
#1 opened 2 months ago
by
Fizzarolli
local Llama + GPU(cuda)
7
#34 opened 2 months ago
by
Luciolla
Base Model?
3
#32 opened 2 months ago
by
User8213
Add Hymba-1.5B to the leaderboard
3
#1030 opened 2 months ago
by
pmolchanov
Hallucinates more than Mistral 7b
#13 opened 3 months ago
by
phil111
Looks like not as good as Qwen2.5 7B
9
#5 opened 4 months ago
by
MonolithFoundation
This LLM is hallucinating like crazy. Can someone verify these prompts?
28
#3 opened 4 months ago
by
phil111