Rico Ardiansyah

Blane187

AI & ML interests

LMS Voice2Voice Text 2 Image

Organizations

Gradio-Themes-Party · Gradio-Blocks-Party · Literally Me FRFR Research Society · AI Indonesia Community · Blog-explorers · That Time I got Reincarnated as a Hugging Face Organization · MLX Community · Dev Mode Explorers · RED · KindaHex · indonsian LLMs in Hugging Face

Blane187's activity

reacted to onekq's post with 👀 5 months ago
If your plan keeps changing, it's a sign that you are living in the moment.

I just got the pass@1 result of GPT 🍓o1-preview🍓: 0.95!!!

This means my benchmark has been cast into oblivion; I need to up the ante. I am all ears for suggestions: onekq-ai/WebApp1K-models-leaderboard
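For context, pass@1 is usually computed with the standard unbiased pass@k estimator evaluated at k=1 — a minimal sketch (this leaderboard's actual harness may differ):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c of them correct."""
    if n - c < k:
        return 1.0  # fewer failures than k draws: at least one success is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# With a single sample per problem, pass@1 is just the fraction solved:
print(pass_at_k(10, 9, 1))  # 0.9
```

Averaging this per-problem estimate over the benchmark gives the reported score, so a 0.95 means the model solves essentially everything on the first try.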
reacted to Tonic's post with 🔥 5 months ago
So awesome! Now I can deploy a JupyterLab on Hugging Face and deploy Gradio from the JupyterLab.
reacted to singhsidhukuldeep's post with 🔥 5 months ago
This is an absolutely mind-boggling experiment!

@GuangyuRobert (Twitter Handle) from MIT has created Project Sid, which simulates over 1,000 autonomous AI agents collaborating in a Minecraft environment, operating for extended periods without human intervention. This simulation demonstrates unprecedented levels of agent interaction, decision-making, and societal development.

Agents operate independently for hours or days, showcasing advanced decision-making algorithms and goal-oriented behavior.

The simulation produced complex, emergent phenomena, including:
- Economic systems with currency (gems) and trading
- Cultural development and religious practices
- Agents even understood bribing. Priests were moving the most gems to bribe people into following them!
- Governmental structures and democratic processes

Project Sid addresses fundamental challenges in AI research:
- Coherence: Maintaining consistent agent behavior over extended periods.
- Multi-agent Collaboration: Enabling effective communication and coordination among numerous AI entities.
- Long-term Progression: Developing agents capable of learning and evolving over time.

While Minecraft serves as the initial testbed, the underlying AI architecture is designed to be game-agnostic, suggesting potential applications in various digital environments and real-world simulations.

Imagine a policy being debated by the government and how it might affect society; Sid can simulate its impact!

Even if this remains just a game experiment, the project successfully manages 1,000+ agents simultaneously, a feat that requires robust distributed computing and efficient agent architecture.
replied to lunarflu's post 6 months ago

You can tell me if anything is wrong, so I can correct my mistakes.

replied to lunarflu's post 6 months ago

Sorry for the slow response, I was tired.

replied to lunarflu's post 6 months ago
posted an update 6 months ago
reacted to singhsidhukuldeep's post with 😎 6 months ago
What is the best LLM for RAG systems? 🤔

In a business setting, it will be the one that gives the best performance at a great price! 💼💰

And maybe it should be easy to fine-tune, cheap to fine-tune... FREE to fine-tune? 😲✨

That's @Google Gemini 1.5 Flash! 🚀🌟

It now supports fine-tuning, and the inference cost is the same as the base model! *coughs* LoRA adapters 🤭🤖

So the base model must be expensive? 💸
For the base model, the input price is reduced by 78% to $0.075/1 million tokens and the output price by 71% to $0.3/1 million tokens. 📉💵
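Those per-token prices make per-request costs easy to estimate — a quick sketch using the figures quoted above:

```python
# Gemini 1.5 Flash base-model pricing quoted above (USD per 1M tokens)
INPUT_PER_M = 0.075
OUTPUT_PER_M = 0.30

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request at the quoted rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# e.g. a typical RAG call: 10k tokens of retrieved context, 500-token answer
print(round(cost_usd(10_000, 500), 6))  # 0.0009
```

At roughly $0.0009 per call, a thousand RAG queries of that shape cost under a dollar.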

But is it any good? 🤷‍♂️
On the LLM Hallucination Index, Gemini 1.5 Flash achieved great context adherence scores of 0.94, 1, and 0.92 across short, medium, and long contexts. 📊🎯

Google has finally given a model that is free to tune and offers an excellent balance between performance and cost. ⚖️👌

Happy tuning... 🎶🔧

Gemini 1.5 Flash: https://developers.googleblog.com/en/gemini-15-flash-updates-google-ai-studio-gemini-api/ 🔗

LLM Hallucination Index: https://www.rungalileo.io/hallucinationindex 🔗
reacted to lunarflu's post with 🔥 6 months ago
Cool things this week from @huggingface!

🌎AI math olympiad winner NuminaMath is here!
🤗Announcing New Hugging Face and Keras NLP integration
✨UI overhaul to HF tokens!
🧊 Embed our dataset viewer on any webpage!

https://huggingface.co/blog/winning-aimo-progress-prize
https://huggingface.co/blog/keras-nlp-integration
https://huggingface.co/settings/tokens
https://x.com/julien_c/status/1812099420726456457

Check out the full list on our discord! 👇
https://discord.com/invite/JfAtkvEtRb
posted an update 6 months ago
Hello everyone! Today I have been working on a project, Blane187/rvc-demo, a demo of RVC using pip. This project is still a demo though (I don't have a beta tester lol).
reacted to nroggendorff's post with 👍 6 months ago
Datasets are down, I offer a solution

git lfs install

git clone https://huggingface.co/datasets/{dataset/id}

from datasets import load_dataset

dataset = load_dataset("./id")  # load from the local clone instead of the Hub
reacted to merve's post with 🔥 6 months ago
OWLSAM2: text-promptable SAM2 🦉 merve/OWLSAM2

Marrying cutting-edge zero-shot object detector OWLv2 🤝 mask generator SAM2 (small checkpoint)
Zero-shot segmentation with insane precision ⛵️

I also uploaded all models with usage snippets and made a collection of SAM2 models and demos merve/sam2-66ac9deac6fca3bc5482fe30
reacted to m-ric's post with 👀 6 months ago
SAM 2 released: New SOTA on segmentation, by combining synthetic data with human feedback 🚀

It's a model for Object segmentation, for both image and video:
👉 input = a text prompt, or a click on a specific object
👉 output = the model draws a mask around the object. In video segmentation, the mask should follow the object's movements (it is then called a masklet)

💪 SAM 2 is 6x faster than the previous version, now also works on video, and beats SOTA by far on both image and video segmentation tasks.

How did they pull that?

The main blocker for video segmentation was that data is really hard to collect: to build your training dataset, should you manually draw masks on every frame? That would be way too costly! ➡️ As a result, existing video segmentation datasets have a real lack of coverage: few examples, few masklets drawn.

💡 Key idea: the researchers decided to use a segmentation model to help them collect the dataset.

But then it’s a chicken-and-egg problem: you need the model to create the dataset, and vice versa? 🤔

⇒ To solve this, they built a data generation system that they scaled up progressively in 3 successive manual annotation phases:

Step 1: Annotators use only SAM + manual editing tools on each frame ⇒ create 16k masklets across 1.4k videos

Step 2: Then train a first SAM 2, add it in the loop to temporally propagate masks across frames, and correct by re-drawing a mask manually when an error has occurred ⇒ This gets a 5.1x speedup over data collection in phase 1! 🏃 Collect 60k masklets

Step 3: Now that SAM 2 is more powerful and has the “single click” prompting option, annotators can use it with simple clicks to re-annotate data.

They even add a completely automatic step to generate 350k more masklets!
And in turn, the model's performance gradually increases.
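The phases above boil down to a model-in-the-loop bootstrapping loop: annotate with the current model's help, retrain, repeat. A toy sketch of that shape (all names and numbers here are illustrative stand-ins, not the paper's code):

```python
def data_engine(n_rounds, annotate, train):
    """Toy model-in-the-loop data engine: annotate with the current model,
    then retrain on everything collected so far."""
    dataset, model = [], None
    for round_idx in range(n_rounds):
        dataset += annotate(model, round_idx)  # model-assisted annotation
        model = train(dataset)                 # retrain on the growing dataset
    return dataset, model

# Toy stand-ins: annotation yields more masklets per round once a model assists.
annotate = lambda model, r: ["masklet"] * (1 if model is None else 5 * (r + 1))
train = lambda data: f"model-after-{len(data)}-masklets"

dataset, model = data_engine(3, annotate, train)
print(len(dataset))  # 1 + 10 + 15 = 26
```

Each round the annotation step gets cheaper and the dataset richer, which is exactly the flywheel the three phases describe.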

I find this a great example of combining synthetic data generation with human annotation 👏
reacted to gabrielmbmb's post with 🔥 6 months ago
Just dropped magpie-ultra-v0.1! The first open synthetic dataset generated with Llama 3.1 405B. Created with distilabel, it's our most advanced and compute-intensive pipeline to date. We made the GPUs of the cluster go brrrrr 🚀

argilla/magpie-ultra-v0.1

Take a look and tell us what you think! Probably, the models that will get the most out of it are smol models 🤗 We will be improving the dataset in upcoming iterations!
reacted to victor's post with 😎 6 months ago
Famous Hugging Face organisations' activity. Guess which one has the word "Open" in it 😂
reacted to takeraparterer's post with ❤️ 6 months ago
They should make a thing like Google Colab, but where you can have unlimited free access to a whole datacenter. That would be cool. Like if you agree!
reacted to davidberenstein1957's post with 🔥 6 months ago
reacted to 1aurent's post with 🔥 7 months ago
reacted to KingNish's post with 👍 7 months ago
Introducing OpenCHAT mini: a lightweight, fast, and unlimited version of OpenGPT 4o.

KingNish/OpenCHAT-mini2

It has unlimited web search, vision and image generation.

Please take a look and share your review. Thank you! 🤗
reacted to MonsterMMORPG's post with 👍 7 months ago
LivePortrait AI: Transform Static Photos into Talking Videos. Now supporting Video-to-Video conversion and Superior Expression Transfer at Remarkable Speed

A new tutorial is anticipated to showcase the latest changes and features in V3, including Video-to-Video capabilities and additional enhancements.

This post provides information for both Windows (local) and Cloud installations (Massed Compute, RunPod, and free Kaggle Account).

🔗 Windows Local Installation Tutorial ️⤵️
▶️ https://youtu.be/FPtpNrmuwXk

🔗 Cloud (no-GPU) Installations Tutorial for Massed Compute, RunPod and free Kaggle Account ️⤵️
▶️ https://youtu.be/wG7oPp01COg

The V3 update introduces video-to-video functionality. If you're seeking a one-click installation method for LivePortrait, an open-source zero-shot image-to-animation application on Windows, for local use, this tutorial is essential. It introduces the cutting-edge image-to-animation open-source generator Live Portrait. Simply provide a static image and a driving video to create an impressive animation in seconds. LivePortrait is incredibly fast and adept at preserving facial expressions from the input video. The results are truly astonishing.

With the V3 update adding video-to-video functionality, those interested in using LivePortrait but lacking a powerful GPU, using a Mac, or preferring cloud-based solutions will find this tutorial invaluable. It guides you through the one-click installation and usage of LivePortrait on #MassedCompute, #RunPod, and even a free #Kaggle account. After following this tutorial, you'll find running LivePortrait on cloud services as straightforward as running it locally. LivePortrait is the latest state-of-the-art static image to talking animation generator, surpassing even paid services in both speed and quality.
