
Quan Nguyen PRO

qnguyen3

AI & ML interests

None yet

Organizations

Alignment Lab AI, Blog-explorers, Arcee AI, OpenOrca, VILM, BEEspoke Data, DopikAI JSC, Qwen, Vietnamese Mistral, ZeroGPU Explorers, jsonifize, Ontocord.AI, MLX Community, fne, Social Post Explorers, Cognitive Computations, OpenVLM, Sailor2, Hugging Face Discord Community, Arcee Training Org, AI for Vietnam, DopikAI LLM, WARA Media and Language, Sailor2 Evaluation, MidnightLabs

qnguyen3's activity

reacted to Tonic's post with 👀 4 months ago
reacted to KingNish's post with 👀 5 months ago
I am excited to announce a major speed update in Voicee, a superfast voice assistant.

It has now achieved latency below 250 ms, while its average latency is about 500 ms.
KingNish/Voicee

This became possible thanks to the newly launched @sambanovasystems cloud.

You can also use your own API key to get the fastest speed.
You can get one here: https://cloud.sambanova.ai/apis
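
For anyone wiring up their own key, here is a minimal sketch of calling SambaNova Cloud from Python. It assumes the endpoint is OpenAI-compatible; the base URL and model id below are assumptions on my part, so confirm them against the API docs at the link above.

```python
# Minimal sketch: querying SambaNova Cloud with your own API key.
# ASSUMPTIONS: the endpoint is OpenAI-compatible and lives at this
# base_url; the model id is illustrative. Check the API docs to confirm.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SAMBANOVA_API_KEY",        # issued at https://cloud.sambanova.ai/apis
    base_url="https://api.sambanova.ai/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct",      # assumed model id
    messages=[{"role": "user", "content": "Hello, Voicee!"}],
)
print(response.choices[0].message.content)
```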

For optimal performance, use Google Chrome.

Please try Voicee and share your valuable feedback to help me further improve its performance and usability.
Thank you!
reacted to FeYuan's post with 👍 7 months ago
Hi everyone,

I am excited to introduce our latest work, LLaMAX. 😁😁😁

LLaMAX is a powerful language model created specifically for multilingual scenarios. Built upon Meta's LLaMA series models, LLaMAX undergoes extensive training across more than 100 languages.

Remarkably, it enhances its multilingual capabilities without compromising its generalization ability, surpassing existing LLMs.

✨ Highlights:

🎈 LLaMAX supports the 102 languages covered by Flores-101, and its performance in translating between low-resource languages far surpasses other decoder-only LLMs.

🎈 Even for languages not covered in Flores-200, LLaMAX still shows significant improvements in translation performance.

🎈 By performing simple SFT on English task data, LLaMAX demonstrates impressive multilingual transfer abilities in downstream tasks.

🎈 In our paper, we discuss effective methods for enhancing the multilingual capabilities of LLMs during the continued training phase.

We welcome you to use our model and provide feedback.

More Details:

🎉 Code: https://github.com/CONE-MT/LLaMAX/

🎉 Model: https://huggingface.co/LLaMAX/
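
For the curious, here is a minimal sketch of querying a LLaMAX checkpoint for translation with transformers. The checkpoint id and prompt format are assumptions; see the model cards under https://huggingface.co/LLaMAX/ for the exact usage.

```python
# Minimal sketch: translating with a LLaMAX checkpoint via transformers.
# ASSUMPTIONS: the checkpoint id and plain-text prompt format below are
# guesses; check the model cards for the recommended prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLaMAX/LLaMAX3-8B-Alpaca"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Translate the following sentence from English to Vietnamese:\nHello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
# decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```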
replied to their post 7 months ago

Yes, it does. You can check the HF Space to see how to get streaming working.
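
For reference, a minimal sketch of the streaming pattern commonly used in HF Spaces, built on transformers' TextIteratorStreamer; the model id is just a small placeholder, not the Space's actual code.

```python
# Minimal sketch of token streaming with transformers, the pattern most
# HF Spaces use for live output. The model id is a small placeholder.
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "Qwen/Qwen1.5-0.5B-Chat"  # placeholder, not the Space's model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Tell me a short story.", return_tensors="pt")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# generate() blocks, so run it in a thread and consume tokens as they arrive
thread = Thread(target=model.generate,
                kwargs={**inputs, "streamer": streamer, "max_new_tokens": 64})
thread.start()
for token in streamer:
    print(token, end="", flush=True)
thread.join()
```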

reacted to their post with 🔥 7 months ago
posted an update 7 months ago
reacted to their post with 🔥 9 months ago
🎉 Introducing nanoLLaVA, a powerful multimodal AI model that packs the capabilities of a 1B parameter vision language model into just 5GB of VRAM. 🚀 This makes it an ideal choice for edge devices, bringing cutting-edge visual understanding and generation to your devices like never before. 📱💻

Model: qnguyen3/nanoLLaVA 🔍
Spaces: qnguyen3/nanoLLaVA (thanks to @merve)

Under the hood, nanoLLaVA is based on the powerful vilm/Quyen-SE-v0.1 (my Qwen1.5-0.5B finetune) and Google's impressive google/siglip-so400m-patch14-384. 🧠 The model is trained using a data-centric approach to ensure optimal performance. 📊

In the spirit of transparency and collaboration, all code and model weights are open-sourced under the Apache 2.0 license. 🤝
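
A hedged usage sketch follows. The trust_remote_code specifics (process_images, the -200 image-token placeholder) follow the pattern on the model card from memory and may differ between revisions; treat them as assumptions and check qnguyen3/nanoLLaVA for the exact snippet.

```python
# Hedged sketch of running nanoLLaVA locally. The trust_remote_code API
# (process_images, the -200 image-token placeholder) is recalled from the
# model card and may differ between revisions; verify against the card.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "qnguyen3/nanoLLaVA", torch_dtype=torch.float16,
    device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("qnguyen3/nanoLLaVA", trust_remote_code=True)

# build the chat prompt and splice in the image placeholder token
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "<image>\nDescribe this image."}],
    tokenize=False, add_generation_prompt=True)
chunks = [tokenizer(c).input_ids for c in prompt.split("<image>")]
input_ids = torch.tensor(chunks[0] + [-200] + chunks[1]).unsqueeze(0).to(model.device)

image_tensor = model.process_images([Image.open("example.jpg")], model.config).to(
    dtype=model.dtype, device=model.device)
output = model.generate(input_ids, images=image_tensor, max_new_tokens=128, use_cache=True)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```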
reacted to phenixrhyder's post with 🔥 9 months ago
The boy king. This was timeless diffusion, I think, or retrolife. I forget actually, but I like the cartoon effect.
reacted to osanseviero's post with 🔥 10 months ago
Diaries of Open Source. Part 14 🤗

🔥 CohereForAI releases Command R+, an open 104B model with:
- Tool usage capabilities
- Specialized in RAG
- Multilingual
It's one of the first models to surpass GPT-4 in the LMSYS arena; check it out!
Model: CohereForAI/c4ai-command-r-plus
Official demo: https://hf.co/spaces/CohereForAI/c4ai-command-r-plus
Quantized: CohereForAI/c4ai-command-r-plus-4bit
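A minimal sketch of loading the prequantized 4-bit checkpoint with transformers (assuming bitsandbytes and accelerate are installed; the prompt is illustrative):

```python
# Minimal sketch: loading the prequantized 4-bit Command R+ checkpoint.
# Assumes bitsandbytes and accelerate are installed; prompt is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize RAG in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```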

🎉 Google releases a new version of their Gemma instruct models, with improved quality, nicer conversations, and a fancier RL algorithm. The model is similar to Llama 2 70B in the Chat Arena!
Models: google/gemma-release-65d5efbccdbb8c4202ec078b
Try it out in HuggingChat https://hf.co/chat/models/google/gemma-1.1-7b-it

🪄 VoiceCraft, a SOTA open model for speech editing and TTS
Paper: VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild (2403.16973)
Model: pyp1/VoiceCraft

💻 Google released CodeGemma, a family of code generation, completion, and chat models
Blog post: https://hf.co/blog/codegemma
Models: google/codegemma-release-66152ac7b683e2667abdee11
Report: https://storage.googleapis.com/deepmind-media/gemma/codegemma_report.pdf

Misc models:
🦖 T-Rex2, a very powerful object detection model for many applications https://github.com/IDEA-Research/T-Rex
👀 CT-RATE: a 3D dataset paired with text reports ibrahimhamamci/CT-RATE
πŸ™Octopus v2: a Gemma-based model trained for Android API - extremely fast, better than Llama+RAG, great results NexaAIDev/Octopus-v2
posted an update 10 months ago
reacted to yzhuang's post with 🤗 11 months ago
reacted to dhuynh95's post with 👍 11 months ago
🌊 Released #LaVague, a fully open-source AI pipeline to turn natural language into browser actions!

In less than 150 lines of code (RAG with local embeddings + Zephyr-7b-Gemma locally, or Mixtral on the HF Inference API), it generates #Selenium code from a user query. In this GIF you can see it follow user instructions and drive a browser around the HF website!

Try it on Colab: colab.research.google.com/github/dhuynh95/LaVague/blob/main/LaVague.ipynb
GitHub: github.com/dhuynh95/LaVague

Pretty exciting how it becomes possible to create an AI assistant that could perform actions for us, such as logging into government accounts, filling forms, or pulling personal information!

It was quite fun to hack on over the weekend using open-source tools: @huggingface transformers for local embeddings and inference (or the HF Inference API), RAG with @llama_index, and the @MistralAI Mixtral model!

Some challenges: to make it run on Colab for the #GPU Poors, I first resorted to the @huggingface Inference API with Mixtral, as it was the only model good enough (gemma-7b did not make it and refused to produce code). But after some experimentation, I managed to make it work with a local Zephyr-7b-Gemma so that people can run this assistant fully locally!

Because I used an off-the-shelf model, I had to improve performance with few-shot learning and Chain of Thought prompting, which managed to generate appropriate code!

I hope this project will herald a new dawn where transparent, private, and local AI assistants help automate menial but critical tasks, such as filling out tax forms, booking accommodation, or researching information for us.
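
To make the pipeline concrete, here is a heavily simplified sketch of the idea: embed chunks of the page's HTML locally, retrieve the one most relevant to the instruction, and few-shot prompt an LLM to emit Selenium code. The embedding model and prompt are illustrative assumptions, not LaVague's actual implementation.

```python
# Heavily simplified sketch of the retrieve-then-generate idea above;
# model choices and the prompt are illustrative, not LaVague's code.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # local embedding model

html_chunks = [
    '<input id="search" placeholder="Search models...">',
    '<a href="/datasets">Datasets</a>',
    '<button class="login">Sign in</button>',
]
instruction = "Type 'llama' in the search bar"

# retrieve the page chunk most relevant to the user instruction
scores = util.cos_sim(embedder.encode(instruction), embedder.encode(html_chunks))[0]
top_chunk = html_chunks[int(scores.argmax())]

# few-shot prompt for the code model (Zephyr-7b-Gemma or Mixtral in the post)
prompt = f"""You write Selenium code. Example:
# Click the Datasets link
driver.find_element(By.XPATH, "//a[@href='/datasets']").click()

Relevant HTML: {top_chunk}
Instruction: {instruction}
Code:"""
# feeding `prompt` to the LLM should yield something like:
# driver.find_element(By.ID, "search").send_keys("llama")
```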
reacted to Sentdex's post with 🤗 12 months ago
Hi, welcome to my first post here!

I am slowly wrangling about 5 years of Reddit comments (2015-2020). It's a total of billions of samples that can be filtered into comment-reply pairs or chains of discussion, and filtered by subreddit, up/down votes, controversy, sentiment, and more.

Any requests or ideas for curated datasets from here? I'll also tinker with uploading the entire dataset, potentially in chunks, but it's quite a few terabytes in total, so I'll still need to break it up. I have some ideas for datasets I personally want too, but I'm curious if anyone has something they'd really like to see.
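
As a concrete example of the kind of filtering described, here is a minimal sketch with the datasets library; the shard path and field names ("subreddit", "score", "parent_body", "body") are assumptions about how the pairs might be stored.

```python
# Minimal sketch of filtering comment-reply pairs with the datasets
# library; the shard path and field names are assumptions.
from datasets import load_dataset

ds = load_dataset("json", data_files="reddit_pairs/*.jsonl", split="train")
ml_pairs = ds.filter(
    lambda ex: ex["subreddit"] == "MachineLearning" and ex["score"] >= 10)
print(ml_pairs[0]["parent_body"], "->", ml_pairs[0]["body"])
```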