Open-Source AI Meetup

community

AI & ML interests

Open science and open source

SFEvent's activity

ehristoforu 
posted an update about 2 months ago
✒️ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) with the SFT (Supervised Fine-Tuning) method. It consists of over 785 thousand entries in eight languages: English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset solves the problem faced by users when selecting an appropriate dataset for LLM training. It combines various types of data required to enhance the model's skills in areas such as text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For effective use of the dataset, it is recommended to use only the "instruction," "input," and "output" columns and to train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, making it suitable for training various types of LLMs.
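
To make that recommendation concrete, here is a minimal sketch of loading Ultraset with the datasets library and rendering the three recommended columns into the standard Alpaca prompt template; the split name and the exact template wording are assumptions, so check the dataset card before training.

```python
from datasets import load_dataset

# Split name "train" is an assumption; verify it on the dataset card.
dataset = load_dataset("fluently-sets/ultraset", split="train")

def to_alpaca(example):
    """Render one row (instruction / input / output) into an Alpaca-style prompt."""
    if example.get("input"):
        return {"text": (
            "Below is an instruction that describes a task, paired with an input that "
            "provides further context. Write a response that appropriately completes "
            "the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )}
    return {"text": (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )}

# Keep only the rendered prompt column for SFT (e.g. with an SFT trainer).
sft_dataset = dataset.map(to_alpaca, remove_columns=dataset.column_names)
print(sft_dataset[0]["text"][:400])
```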

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.
lunarflu 
posted an update 2 months ago
Taylor658 
posted an update 2 months ago
🌐 The Stanford Institute for Human-Centered AI (https://aiindex.stanford.edu/vibrancy/) has released its 2024 Global AI Vibrancy Tool, a way to explore and compare AI progress across 36 countries.

📊 It measures progress across eight broad pillars: R&D, Responsible AI, Economy, Education, Diversity, Policy and Governance, Public Opinion, and Infrastructure. (Each of these pillars has a number of sub-indices.)

📈 Overall, it is not surprising that the USA held the top spot as of 2023 (AI investment activity is a large part of the Economy pillar, for example, and that in turn drives much of the USA's overall ranking). Drilling into the more strategic macro pillars like Education, Infrastructure, or R&D, however, reveals interesting growth patterns in Asia (particularly China) and Western Europe that I suspect the 2024 metrics will bear out.

🤖 Hopefully the 2024 Global Vibrancy ranking will also break out AI and ML verticals like Computer Vision, NLP, and the AI Agent space, as that could give a global, macro-level indication of what is to come for AI in 2025.
Taylor658 
posted an update 2 months ago
🤖💻 Function Calling is a key component of Agent workflows. To call functions, an LLM needs a way to interact with other systems and run code. This usually means connecting it to a runtime environment that can handle function calls, data, and security.

Per the Berkeley Function-Calling Leaderboard, only 2 of the top 20 models with built-in function calling are fully open source as of 17 Nov 2024 (the only other 2 non-closed-source models in the top 20 carry cc-by-nc-4.0 licenses).
https://gorilla.cs.berkeley.edu/leaderboard.html

The 2 Open Source Models out of the top 20 that currently support function calling are:

meetkai/functionary-medium-v3.1
Team-ACE/ToolACE-8B

This is both a huge disadvantage AND an opportunity for the open-source community as enterprises, small businesses, government agencies, etc. quickly adopt agents and agent workflows over the next few months. Open source will have a lot of catching up to do, because enterprises that initially build their agent workflows on closed-source models in the coming months will be hesitant to switch to an open-source alternative later.

Hopefully more open source models will support function calling in the near future.
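
For readers new to the topic, here is a minimal sketch of what one function-calling round trip looks like, assuming one of the open-source models above is served behind an OpenAI-compatible endpoint (for example via vLLM); the get_weather tool, its schema, and the local server URL are hypothetical illustrations, not something defined by the leaderboard.

```python
import json
from openai import OpenAI

# Any OpenAI-compatible server works here; the URL and model id are assumptions.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
MODEL = "Team-ACE/ToolACE-8B"

# 1. Describe the tool the model is allowed to call (hypothetical example tool).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    # Stub for the runtime side that actually executes the call.
    return json.dumps({"city": city, "temp_c": 21, "conditions": "overcast"})

# 2. The model decides whether and how to call the tool.
messages = [{"role": "user", "content": "What's the weather in Seattle right now?"}]
response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
call = response.choices[0].message.tool_calls[0]
result = get_weather(**json.loads(call.function.arguments))

# 3. Feed the tool result back so the model can write the final answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
final = client.chat.completions.create(model=MODEL, messages=messages)
print(final.choices[0].message.content)
```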
Taylor658 
posted an update 4 months ago
The Mystery Bot 🕵️‍♂️ saga I posted about from earlier this week has been solved...🤗

Cohere for AI has just announced its open-source Aya Expanse multilingual model. The initial release supports 23 languages, with more on the way soon.🌌 🌍

You can also try Aya Expanse via SMS on your mobile phone using the global WhatsApp number or one of the initial set of country specific numbers listed below.⬇️

🌍 WhatsApp: +14313028498
Germany: +49 1771786365
USA: +18332746219
United Kingdom: +44 7418373332
Canada: +1 2044107115
Netherlands: +31 97006520757
Brazil: +55 11950110169
Portugal: +351 923249773
Italy: +39 3399950813
Poland: +48 459050281
Taylor658 
posted an update 4 months ago
Spent the weekend testing out some prompts with 🕵️‍♂️Mystery Bot🕵️‍♂️ on my mobile... exciting things are coming soon for the following languages:

🌐Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese!🌐
Taylor658 
posted an update 5 months ago
Taylor658 
posted an update 5 months ago
💡Andrew Ng recently gave a strong defense of open-source AI models at Stanford GSB, arguing that legislative efforts in the US and the EU to restrict open-source AI innovation need to be slowed down.

🎥See video below
https://youtu.be/yzUdmwlh1sQ?si=bZc690p8iubolXm_
lunarflu 
posted an update 6 months ago
ehristoforu 
posted an update 7 months ago
😏 Hello from the Project Fluently Team!

✨ We can finally share some details about Supple Diffusion. We have been working on it for a long time and there is little left to do; we apologize for having to extend the timeline.

🛠️ Some technical information. The first version will be the Small version (Medium, Large, and Huge will follow, possibly a Tiny as well), and it will be based on the SD1 architecture: one text encoder, a U-Net, and a VAE. On each component: the text encoder will be a CLIP model (perhaps not CLIP-L/14) that we specifically retrained so the model understands very different styles and so prompts can be kept as simple as possible. The U-Net was built in a rather involved way: we first trained different parts (types) of the data with separate U-Nets, then merged them using different methods, then trained with DPO and SPO, and finally looked at the remaining shortcomings and trained the model further; details will come later. The VAE is kept the same as in the SD1 architecture.

🙌 Compatibility. Another goal of the Supple model series is full compatibility with Auto1111 and ComfyUI from the release stage: the model is fully supported by these interfaces and by the diffusers library and does not require adaptation. Your usual sampling methods are also compatible, such as DPM++ 2M Karras, DPM++ SDE, and others.
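
As a rough illustration of that compatibility claim, here is a minimal diffusers sketch for loading an SD1-architecture checkpoint with the DPM++ 2M Karras sampler; the repository id below is a placeholder, since Supple Diffusion Small has not been released yet.

```python
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# Placeholder repo id; swap in the real one once the Small model is published.
pipe = StableDiffusionPipeline.from_pretrained(
    "fluently/supple-diffusion-small",
    torch_dtype=torch.float16,
).to("cuda")

# DPM++ 2M Karras, one of the samplers the post says will work out of the box.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

image = pipe("a cozy cabin in a snowy forest, golden hour",
             num_inference_steps=25).images[0]
image.save("supple_small_demo.png")
```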

🧐 No demo images today (there wasn't much time); final work is underway on the model and we are already preparing to develop the Medium version. The release of the Small version will most likely be in mid-August or earlier.

😻 Feel free to ask questions in the comments below the post; we will be happy to answer them. Have a nice day!
multimodalart 
posted an update 7 months ago
lunarflu 
posted an update 7 months ago
Cool things this week from @huggingface !

🌎AI math olympiad winner NuminaMath is here!
🤗Announcing New Hugging Face and Keras NLP integration
✨UI overhaul to HF tokens!
🧊 Embed our dataset viewer on any webpage!

https://huggingface.co/blog/winning-aimo-progress-prize
https://huggingface.co/blog/keras-nlp-integration
https://huggingface.co/settings/tokens
https://x.com/julien_c/status/1812099420726456457

Check out the full list on our discord! 👇
https://discord.com/invite/JfAtkvEtRb
Taylor658 
posted an update 7 months ago
Researchers from Auburn University and the University of Alberta have explored the limitations of Vision Language Models (VLMs) in their recently published paper "Vision Language Models Are Blind" (arXiv:2407.06581).

Key Findings:🔍
VLMs, including GPT-4o, Gemini-1.5 Pro, Claude-3 Sonnet, and Claude-3.5 Sonnet, struggle with basic visual tasks.
Tasks such as identifying where lines intersect or counting basic shapes are challenging for these models.
The authors noted, "The shockingly poor performance of four state-of-the-art VLMs suggests their vision is, at best, like of a person with myopia seeing fine details as blurry, and at worst, like an intelligent person that is blind making educated guesses" (Vision Language Models Are Blind, 2024).

Human-like Myopia? 👓
VLMs may have a blind spot similar to human myopia.
This limitation makes it difficult for VLMs to perceive details.
Suggests a potential parallel between human and machine vision limitations.

Technical Details: 🔧
The researchers created a new benchmark called BlindTest.
BlindTest consists of simple visual tasks to evaluate VLMs' low-level vision capabilities.
Four VLMs were assessed using BlindTest.
Many shortcomings were revealed in the models' ability to process basic visual information.

Learn More: 🖼️
For a deeper dive into this research, check out the project page: https://vlmsareblind.github.io/
ehristoforu 
posted an update 7 months ago
🤗 Hello from the Project Fluently team!

🥏 We are ready to announce the new Supple Diffusion series, a new generation of diffusion models (about 1-2 weeks left before release).

🦾 The new series aims to take diffusion models to the next level, with performance and versatility as the main goals.

🧐 How will our models be better than others? First, we worked on the CLIP models: they now understand your requests better, so prompts become easier to write. Second, we trained the models to a higher quality than all of our previous ones. Third, you won't have to keep 20 models on your disk; 4-6 will be enough.

🗺️ Roadmap:
1. Create Supple Diffusion Small
2. Create Supple Diffusion Medium
3. Create Supple Diffusion Large

🎆 Our models are universal: they work for realism, cartoons, anime, and caricatures alike.

💖 The project really needs your support, recommendations, and reviews; please do not hesitate to leave comments under this post. Thank you!

🖼️ Below are demo images made with the pre-release version of Supple Diffusion Small.
Taylor658 
posted an update 8 months ago
🌍 Cohere for AI has announced that this July and August, it is inviting researchers from around the world to join Expedition Aya, a global initiative focused on launching projects using multilingual tools like Aya 23 and Aya 101. 🌐

Participants can start by joining the Aya server, where all coordination will take place. They can share ideas and connect with others on Discord and via the signup sheet. Various events will be hosted to help people find potential team members. 🤝

To support the projects, Cohere API credits will be issued. 💰

Over the course of six weeks, weekly check-in calls are also planned to help teams stay on track and receive support with using Aya. 🖥️

The expedition will wrap up at the end of August with a closing event to showcase everyone’s work and plan next steps. Participants who complete the expedition will also receive some Expedition Aya swag. 🎉

Links:
Join the Aya Discord: https://discord.com/invite/q9QRYkjpwk
Visit the Expedition Aya Minisite: https://sites.google.com/cohere.com/expedition-aya/home
Taylor658 
posted an update 8 months ago
🔍 A recently published technical report introduces MINT-1T, a dataset that will considerably expand open-source multimodal data. It features one trillion text tokens and three billion images and is scheduled for release in July 2024.

Researcher Affiliation:

University of Washington
Salesforce Research
Stanford University
University of Texas at Austin
University of California, Berkeley

Paper:
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
https://arxiv.org/pdf/2406.11271v1.pdf

GitHub:
https://github.com/mlfoundations/MINT-1T

Highlights:

MINT-1T Dataset: Largest open-source multimodal interleaved dataset with 1 trillion text tokens & 3 billion images. 📊🖼️
Diverse Sources: Incorporates data from HTML, PDFs, and ArXiv documents. 📄📚
Open Source: Dataset and code will be released at https://github.com/mlfoundations/MINT-1T. 🌐🔓
Broader Domain Representation: Uses diverse data sources for balanced domain representation. 🌍📚
Performance in Multimodal Tasks: The dataset’s scale and diversity should enhance multimodal task performance. 🤖💡

Datasheet Information:

Motivation: Addresses the gap in large-scale open-source multimodal datasets. 🌐📊
Composition: 927.6 million documents, including HTML, PDF, and ArXiv sources. 📄📚
Collection Process: Gathered from CommonCrawl WARC and WAT dumps, with rigorous filtering. 🗂️🔍
Preprocessing/Cleaning: Removal of low-quality text and duplicates, plus anonymization of sensitive information. 🧹🔒
Ethical Considerations: Measures to ensure privacy and avoid bias. ⚖️🔏
Uses: Training multimodal models, generating interleaved image-text sequences, and building retrieval systems. 🤖📖
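
For anyone who wants to poke at the data once it is on the Hub, a minimal streaming sketch might look like the following; the repository id and document schema are assumptions, so consult the GitHub repo for the authoritative release names.

```python
from datasets import load_dataset

# Repo id is an assumption based on the announced release; streaming avoids
# downloading the full trillion-token corpus just to inspect it.
mint_html = load_dataset("mlfoundations/MINT-1T-HTML", split="train", streaming=True)

# Each record is an interleaved image-text document; peek at the first one.
doc = next(iter(mint_html))
print(sorted(doc.keys()))
```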
Taylor658 
posted an update 8 months ago
With the CVPR conference (https://cvpr.thecvf.com) in full swing this week in Seattle 🏙️, the competition details for NeurIPS 2024 have just been released.🚀

Some of the competitions this year include:

🦾 MyoChallenge 2024: Physiological dexterity in bionic humans.
🌌 FAIR Universe: Handling uncertainties in fundamental science.
🧪 BELKA: Chemical assessment through big encoded libraries.
🏆 HAC: Hacker-Cup AI competition.
💰 Large-Scale Auction Challenge: Decision-making in competitive games.
📶 URGENT Challenge: Signal reconstruction and enhancement.
🛡️ LASC 2024: Safety in LLM and AI agents.

For more details, check out: https://blog.neurips.cc/2024/06/04/neurips-2024-competitions-announced
Taylor658 
posted an update 8 months ago
Luma AI has just launched Dream Machine, a Sora and Kling AI-like tool that generates videos from simple text and images. 🎥
Dream Machine is out of beta and offers a free tier to test it out.

I tried this extremely simple prompt with the pic below and thought the way it captured my prompt as a drone-camera-style video was decent:

You are a drone operator. Create a 30-second video from a drone heading eastbound over the western suburbs of Bismarck, North Dakota, looking east towards the city on an overcast summer evening during the golden hour from an altitude of 200 ft.


Dream Machine also has a paid tier. However, as with its paid-tier text-to-image brethren from 2023 (which all fared EXTREMELY badly once good text-to-image capabilities became the norm in open- and closed-source models), time will tell whether the paid tier will work for text- and image-to-video. ⏳

This will be evident in 3 to 5 months once GPT-5, Gemini-2, Mistral-9, Llama 4, et al., all models with enhanced multimodal capabilities, are released. 🚀
Taylor658 
posted an update 8 months ago
Researchers at Carnegie Mellon University have introduced Sotopia, a platform designed to evaluate and enhance AI’s social capabilities. Sotopia focuses on assessing AI’s performance in goal-oriented social interactions, like collaboration, negotiation, and competition.

🔍 Key Findings:
Performance Evaluation: The platform enables testing and comparison of different AI systems, with a specific emphasis on refining Mistral-7B. 🛠️
Benchmarking: Sotopia uses GPT-4 as a benchmark to evaluate other AI systems’ capabilities. 📏

🔧 Technical Points:
Foundation: Sotopia builds upon Mistral-7B, focusing on behavior cloning and self-reinforcement. 🏗️
Multi-Dimensional Assessment: Sotopia evaluates AI performance across 7 social dimensions, including believability, adherence to social norms, and successful goal completion. 🌐
Data Collection: The platform gathers data from human-human, human-AI, and AI-AI interactions. 📂

Sotopia Project Page: https://www.sotopia.world/
Check out the HF space here: cmu-lti/sotopia-space
Additional details are in the HF Collection: cmu-lti/sotopia-65f312c1bd04a8c4a9225e5b

ehristoforu 
posted an update 8 months ago
🦾 Hello, I present Visionix Alpha, a new hyper-realistic model based on SDXL. The main difference from existing realism models is the attention to detail: I improved not only hyperrealism but also overall aesthetics, anatomy, and the beauty of nature, and the model also produces a much wider variety of faces. This model is suitable not only for realistic photos, but also for generating 2.5D anime, realistic cartoons, and more.

🤗 Model on HF: ehristoforu/Visionix-alpha
🥏 Model on CivitAI: https://civitai.com/models/505719
🪄 Playground (with base and inpaint model): ehristoforu/Visionix-Playground

✏️ Inpaint version on HF: ehristoforu/Visionix-alpha-inpainting
🖋️ Inpaint version on CivitAI: https://civitai.com/models/505719?modelVersionId=563519
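
If you want to try it from Python rather than the Playground, a minimal sketch could look like this, assuming the checkpoint loads as a standard SDXL pipeline via diffusers (the prompt and settings are just examples).

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "ehristoforu/Visionix-alpha",      # repo id from the post above
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "photorealistic portrait, natural window light, detailed skin texture",
    num_inference_steps=30,
    guidance_scale=6.0,
).images[0]
image.save("visionix_alpha_demo.png")
```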