
Derek Thomas

derek-thomas

AI & ML interests

None yet

Recent Activity

upvoted an article about 23 hours ago
1 Billion Classifications
published an article 1 day ago
1 Billion Classifications
updated a Space 2 days ago
derek-thomas/classification-analysis

Organizations

open spaced repetition, Hugging Face Success Team, ZamaFace, Sphere Spring 2023 Class, Finance Inc., Hugging Face Time-Series, Core42, SCB, ZeroGPU Explorers, Open Arabic LLM Leaderboard, Reddit Tools on 🤗, Social Post Explorers, MIT Critical Data, Arabic Translation Prompt Engineering, Audio Processing Exploration, Dataset Tools, Success Sandbox

derek-thomas's activity

reacted to erinys's post with ❀️ 5 months ago
We shut down XetHub today after almost 2 years. What we learned from launching our Git-scaled product from scratch:
- Don't make me change my workflow
- Data inertia is real
- ML best practices are still evolving

Closing the door on our public product lets us focus on our new goal of scaling HF Hub's storage backend to improve devX for a larger community. We'd love to hear your thoughts on what experiences we can improve!

Read the full post: https://xethub.com/blog/shutting-down-xethub-learnings-and-takeaways
Β·
reacted to thomwolf's post with πŸ‘β€οΈπŸ”₯ 6 months ago
A Little guide to building Large Language Models in 2024

This is a recording of a 75min lecture I gave two weeks ago on how to train an LLM from scratch in 2024. I tried to keep it short and comprehensive – focusing on concepts that are crucial for training a good LLM but often hidden in tech reports.

In the lecture, I introduce the students to all the important concepts/tools/techniques for training a high-performing LLM (a minimal fine-tuning sketch follows the list):
* finding, preparing, and evaluating web-scale data
* understanding model parallelism and efficient training
* fine-tuning/aligning models
* fast inference
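
To make the fine-tuning step concrete, here is a minimal sketch of supervised fine-tuning with the transformers Trainer; the model, dataset, and hyperparameters are illustrative placeholders, not the lecture's actual recipe:

```python
# Minimal supervised fine-tuning sketch (illustrative choices throughout).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # assumption: any small causal LM works for a demo
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# A tiny slice of a public corpus stands in for carefully curated data.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False -> standard next-token (causal LM) objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```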

There are of course many things and details missing that I should have added, so don't hesitate to tell me your most frustrating omission and I'll add it in a future part. In particular, I think I'll add more focus on how to filter topics well and extensively, and maybe more practical anecdotes and details.

Now that I recorded it, I've been thinking this could be part 1 of a two-part series, with a 2nd fully hands-on video on how to run all these steps with some libraries and recipes we've released recently at HF around LLM training (which could easily be adapted to your framework of choice anyway):
* datatrove for all things web-scale data preparation (sketched below): https://github.com/huggingface/datatrove
* nanotron for lightweight 4D parallelism LLM training: https://github.com/huggingface/nanotron
* lighteval for in-training fast parallel LLM evaluations: https://github.com/huggingface/lighteval
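
As a taste of datatrove, here is a minimal local pipeline sketch based on the patterns in its README; the input path and filter predicate are made-up placeholders, and the exact API may differ between versions:

```python
# Sketch of a datatrove filtering pipeline (paths/predicates are placeholders).
from datatrove.executor import LocalPipelineExecutor
from datatrove.pipeline.filters import LambdaFilter
from datatrove.pipeline.readers import JsonlReader
from datatrove.pipeline.writers import JsonlWriter

executor = LocalPipelineExecutor(
    pipeline=[
        JsonlReader("data/raw/"),                       # read raw .jsonl shards
        LambdaFilter(lambda doc: len(doc.text) > 200),  # drop very short docs
        JsonlWriter("data/filtered/"),                  # write surviving docs
    ],
    tasks=4,  # run 4 parallel tasks locally
)
executor.run()
```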

Here is the link to watch the lecture on YouTube: https://www.youtube.com/watch?v=2-SPH9hIKT8
And here is the link to the Google slides: https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit#slide=id.p

Enjoy, and I'm happy to hear feedback on it and on what to add, correct, or extend in a second part.
reacted to their post with 😎 6 months ago
posted an update 6 months ago
Here is an AI Puzzle!
When you solve it just use a 😎 emoji.
NO SPOILERS
A similar puzzle might have each picture hide a meaning of summer, winter, fall, or spring, and the answer would be seasons.

It's a little dated now (almost a year old), so the bottom right might be tough.

Thanks to @johko for the encouragement to post!
reacted to MohamedRashad's post with πŸ”₯ 9 months ago
reacted to abhishek's post with πŸš€πŸ”₯ 10 months ago
πŸš€πŸš€πŸš€πŸš€ Introducing AutoTrain Configs! πŸš€πŸš€πŸš€πŸš€
Now you can train models using YAML config files! 💥 These configs are easy to understand and are not at all overwhelming. So, even a person with almost zero knowledge of machine learning can train state-of-the-art models without writing any code. Check out the example configs in the config directory of the autotrain-advanced GitHub repo, and feel free to share your own configs by creating a pull request 🤗
Github repo: https://github.com/huggingface/autotrain-advanced
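
To show the flavor of such a config, here is an illustrative sketch of what an SFT config might look like; the exact keys and values below are assumptions, so check the repo's config directory for real, working examples:

```yaml
# Hypothetical AutoTrain-style config sketch; key names are illustrative.
task: llm-sft
base_model: gpt2                  # placeholder base model
project_name: my-autotrain-demo   # placeholder project name

data:
  path: timdettmers/openassistant-guanaco  # placeholder dataset
  train_split: train
  column_mapping:
    text_column: text

params:
  epochs: 1
  batch_size: 2
  lr: 2e-4
```

Such a config would then typically be launched from the CLI, e.g. `autotrain --config my_config.yaml` (again, verify the exact invocation against the repo's docs).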
reacted to andrewrreed's post with πŸ‘ 10 months ago
IMO, the "grounded generation" feature from Cohere's CommandR+ has flown under the radar...

For RAG use cases, responses directly include inline citations, making source attribution an inherent part of generation rather than an afterthought 😎

Who's working on an open dataset with this for the HF community to fine-tune with??

πŸ”—CommandR+ Docs: https://docs.cohere.com/docs/retrieval-augmented-generation-rag

πŸ”—Model on the πŸ€— Hub: CohereForAI/c4ai-command-r-plus
reacted to chiphuyen's post with πŸ‘ 12 months ago
It feels awkward to have my first post be about sharing my own stuff, but this is a weekend project that I really enjoyed working on. I'd love to meet more people interested in random ideas like this.

A hard part of building AI applications is choosing which model to use. What if we don’t have to? What if we can predict the best model for any prompt?

Predictive human preference aims to predict which model users might prefer for a specific query.

https://huyenchip.com/2024/02/28/predictive-human-preference.html

One use case is model routing. If we know in advance that for a prompt, users will prefer Claude Instant’s response over GPT-4, and Claude Instant is cheaper/faster than GPT-4, we can route this prompt to Claude Instant. Model routing has the potential to increase response quality while reducing costs and latency.
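
As a toy illustration of the routing idea, here is a hypothetical sketch; the predictor heuristic, model names, and threshold are all made up for the example:

```python
# Hypothetical preference-based router; every name/number is illustrative.
def predict_strong_win_rate(prompt: str) -> float:
    """Stand-in for a learned preference predictor: pretend longer
    prompts are harder and thus favor the stronger model."""
    return min(1.0, len(prompt.split()) / 50)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Send easy prompts to the cheap/fast model, hard ones to the strong one."""
    if predict_strong_win_rate(prompt) >= threshold:
        return "strong-model"      # e.g. GPT-4 in the post's example
    return "cheap-fast-model"      # e.g. Claude Instant

print(route("hello, how are you?"))  # easy prompt -> cheap-fast-model
```

A real router would replace the heuristic with a model trained on human preference data, but the cost/quality trade-off logic stays the same.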

One pattern is that for simple prompts, weak models can do (nearly) as well as strong models. For more challenging prompts, however, users are more likely to prefer stronger models. Here's a visualization of predicted human preference for an easy prompt ("hello, how are you?") and a challenging prompt ("Explain why Planck length …").

Preference predictors make it possible to create leaderboards unique to any prompt and domain.
Β·
posted an update 12 months ago
reacted to alielfilali01's post with ❀️ 12 months ago
πŸŽ‰πŸ₯³πŸŽ‰
Today, we are thrilled to officially launch the "2A2I" Arabic Artificial Intelligence Initiative. This is a community-driven initiative founded on the philosophy of "Small team, Big work." Our goal is to elevate Arabic AI (LLMs, Diffusion Models, ASR, etc.) to the same level as English (and also Chinese 🐉).

Naturally, our focus today is primarily on datasets. We aim to provide high-quality datasets, especially for LLMs this month, to support our future efforts. In line with this, we're excited to introduce the Arabic version of H4-no_robots, which you can find here: 2A2I/H4_no_robots (and yes, we know it's not "no_robots" anymore 😄). Stay tuned for more exciting, high-quality datasets in the next couple of weeks (+ 4 million rows 🔥)

In parallel, we're also developing a model πŸͺ that we hope will set new high standards for Arabic LLMs. πŸ”₯ This model is planned for release in the coming months.

For more information, please visit our Organization card here: https://huggingface.co/2A2I

If you're interested in Arabic AI and want to help push the wheel as well, fill out this form and let us know your motivation and your exciting ideas 🔥

The form link: https://forms.gle/kZLVuynWFU2FyTm57

If you have any questions, feel free to reach out to us at the email address below.

Additionally, if you believe in this mission as we do and would like to help this community by contributing some compute resources 😉 or any other form of help you can think of, please contact us at the same email address below or reach out to me through LinkedIn 🔥

2A2I Contact Email: [email protected]
My LinkedIn: https://www.linkedin.com/in/alielfilali01/