Tim Dolan's picture

Tim Dolan

macadeliccc

AI & ML interests

None yet

Recent Activity

liked a model about 2 months ago
CohereForAI/c4ai-command-r7b-12-2024
updated a dataset 3 months ago
macadeliccc/US-SupremeCourt-RAG
View all activity

Organizations

Blog-explorers's profile picture Social Post Explorers's profile picture Cognitive Computations's profile picture Dev Mode Explorers's profile picture Hugging Face for Legal's profile picture

Posts 10

view post
Post
1032
My tool calling playgrounds repo has been updated again to include the use of flux1-schnell or dev image generation. This functionality is similar to using Dall-E 3 via the @ decorator in ChatGPT. Once the function is selected, the model will either extract or improve your prompt (depending on how you ask).

I have also included 2 notebooks that cover different ways to access Flux for your specific use case. The first method covers how to access flux via LitServe from Lightning AI. LitServe is a bare-bones inference engine with a focus on modularity rather than raw performance. LitServe supports text generation models as well as image generation, which is great for some use cases, but does not provide the caching mechanisms from a dedicated image generation solution.

Since dedicated caching mechanisms are so crucial to performance, I also included an example for how to integrate SwarmUI/ComfyUI to utilize a more dedicated infrastructure that may already be running as part of your tech stack. Resulting in a Llama-3.1 capable of utilizing specific ComfyUI JSON configs, and many different settings.

Lastly, I tested the response times for each over a small batch request to simulate a speed test.

It becomes clear quickly how efficient caching mechanisms can greatly reduce the generation time, even in a scenario where another model is called. An average 4.5 second response time is not bad at all when you consider that an 8B model is calling a 12B parameter model for a secondary generation.

Repo: https://github.com/tdolan21/tool-calling-playground
LitServe: https://github.com/Lightning-AI/LitServe
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

Articles 1

Article
3

Deploy hundreds of open source models on one GPU using LoRAX