
dumball

archit11

https://archit-spec.github.io

AI & ML interests

Small language models. Looking for work; please reach out: [email protected]

Recent Activity

liked a dataset 1 day ago

simplescaling/s1K

upvoted an article 1 day ago

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

upvoted a collection 2 days ago

Scotch & SOTA 🥃 Pt. 7: Human Feedback Datasets 🫣


archit11's activity

liked a dataset 1 day ago

simplescaling/s1K

Viewer • Updated about 3 hours ago • 1k • 343 • 51

upvoted an article 1 day ago

Article

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

Aug 4, 2024 • 29

upvoted 3 collections 2 days ago

updated a model 3 days ago

archit11/smollm350m-grpo

Text Generation • Updated 3 days ago • 13

liked a dataset 4 days ago

AymanTarig/function-calling-v0.2-with-r1-cot

Viewer • Updated 4 days ago • 58k • 163 • 18

published a model 4 days ago

archit11/smollm350m-grpo

Text Generation • Updated 3 days ago • 13

New activity in ubermenchh/SmolLM2-DPO 5 days ago

details pls

#1 opened 5 days ago by archit11

upvoted an article 6 days ago

Article

How to deploy and fine-tune DeepSeek models on AWS

8 days ago • 34

upvoted an article 8 days ago

Article

Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia?

May 7, 2024 • 8

liked a dataset 8 days ago

OnDeviceMedNotes/synthetic-medical-conversations-deepseek-v3

Viewer • Updated 9 days ago • 143k • 260 • 29

upvoted a collection 10 days ago

Deepseek Papers

Collection

Deepseek papers collection • 15 items • Updated 2 days ago • 46

liked a model 15 days ago

openbmb/MiniCPM-o-2_6

Any-to-Any • Updated 11 days ago • 316k • 922

updated a dataset 18 days ago

archit11/arxiv_links

Viewer • Updated 18 days ago • 842 • 27

published a dataset 18 days ago

archit11/arxiv_links

Viewer • Updated 18 days ago • 842 • 27

upvoted an article 20 days ago

Article

Train 400x faster Static Embedding Models with Sentence Transformers

23 days ago • 136

updated 2 datasets about 2 months ago

archit11/uptso3

Preview • Updated Dec 9, 2024 • 18

archit11/uptso2

Updated Dec 9, 2024 • 24

updated a dataset 2 months ago

archit11/uspto

Preview • Updated Dec 8, 2024 • 18