Kudos to you guys, Feeling excited to contribute. Woorth taking a look..! #goHF
ROHITH VENKATA REDDY
knight7561
AI & ML interests
Deep learning, Autonomous Driving
Recent Activity
replied to
albertvillanova's
post
1 day ago
π Introducing @huggingface Open Deep-Researchπ₯
In just 24 hours, we built an open-source agent that:
β
Autonomously browse the web
β
Search, scroll & extract info
β
Download & manipulate files
β
Run calculations on data
55% on GAIA validation set! Help us improve it!π‘
https://huggingface.co/blog/open-deep-research
upvoted
an
article
1 day ago
Open-source DeepResearch β Freeing our search agents
Organizations
knight7561's activity
replied to
albertvillanova's
post
1 day ago
upvoted
an
article
1 day ago
Article
Open-source DeepResearch β Freeing our search agents
β’
630
reacted to
bartowski's
post with π
15 days ago
Post
28496
Switching to
I posted a poll on twitter, and others have mentioned the interest in me using the convention of including the author name in the model path when I upload.
It has a couple advantages, first and foremost of course is ensuring clarity of who uploaded the original model (did Qwen upload Qwen2.6? Or did someone fine tune Qwen2.5 and named it 2.6 for fun?)
The second thing is that it avoids collisions, so if multiple people upload the same model and I try to quant them both, I would normally end up colliding and being unable to upload both
I'll be implementing the change next week, there are just two final details I'm unsure about:
First, should the files also inherit the author's name?
Second, what to do in the case that the author name + model name pushes us past the character limit?
Haven't yet decided how to handle either case, so feedback is welcome, but also just providing this as a "heads up"
author_model-name
I posted a poll on twitter, and others have mentioned the interest in me using the convention of including the author name in the model path when I upload.
It has a couple advantages, first and foremost of course is ensuring clarity of who uploaded the original model (did Qwen upload Qwen2.6? Or did someone fine tune Qwen2.5 and named it 2.6 for fun?)
The second thing is that it avoids collisions, so if multiple people upload the same model and I try to quant them both, I would normally end up colliding and being unable to upload both
I'll be implementing the change next week, there are just two final details I'm unsure about:
First, should the files also inherit the author's name?
Second, what to do in the case that the author name + model name pushes us past the character limit?
Haven't yet decided how to handle either case, so feedback is welcome, but also just providing this as a "heads up"
reacted to
onekq's
post with π₯
15 days ago
Post
2660
This is historical. π
DeepSeek πR1π surpassed OpenAI πo1π on the dual leaderboard. What a year for the open source!
onekq-ai/WebApp1K-models-leaderboard
DeepSeek πR1π surpassed OpenAI πo1π on the dual leaderboard. What a year for the open source!
onekq-ai/WebApp1K-models-leaderboard
upvoted
a
collection
30 days ago
Thank you for the cool tool..!
reacted to
cfahlgren1's
post with β€οΈ
3 months ago
Post
3171
You can clean and format datasets entirely in the browser with a few lines of SQL.
In this post, I replicate the process @mlabonne used to clean the new microsoft/orca-agentinstruct-1M-v1 dataset.
The cleaning process consists of:
- Joining the separate splits together / add split column
- Converting string messages into list of structs
- Removing empty system prompts
https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset
Here's his new cleaned dataset: mlabonne/orca-agentinstruct-1M-v1-cleaned
In this post, I replicate the process @mlabonne used to clean the new microsoft/orca-agentinstruct-1M-v1 dataset.
The cleaning process consists of:
- Joining the separate splits together / add split column
- Converting string messages into list of structs
- Removing empty system prompts
https://huggingface.co/blog/cfahlgren1/the-beginners-guide-to-cleaning-a-dataset
Here's his new cleaned dataset: mlabonne/orca-agentinstruct-1M-v1-cleaned
upvoted
an
article
4 months ago
Article
ColPali: Efficient Document Retrieval with Vision Language Models π
By
β’
β’
196upvoted
an
article
5 months ago
Article
Training and Finetuning Embedding Models with Sentence Transformers v3
β’
176
upvoted
a
paper
5 months ago