Piotr Skalski PRO

SkalskiP

AI & ML interests

Computer Vision | Multimodality

Recent Activity

liked a model 10 days ago
Qwen/Qwen2.5-VL-3B-Instruct
liked a Space 10 days ago
deepseek-ai/Janus-Pro-7B
liked a model 10 days ago
deepseek-ai/Janus-Pro-1B

Organizations

Hugging Face Fellows, Gradio-Blocks-Party, Roboflow, ZeroGPU Explorers, Social Post Explorers

Posts 2

Post
YOLO-World: Real-Time, Zero-Shot Object Detection 🔥 🔥 🔥

YOLO-World was designed to solve a limitation of existing zero-shot object detection models: speed. Whereas other state-of-the-art models use Transformers, a powerful but typically slower architecture, YOLO-World uses the faster CNN-based YOLO architecture.

YOLO-World provides three models: small with 13M (re-parametrized 77M), medium with 29M (re-parametrized 92M), and large with 48M (re-parametrized 110M) parameters.

The YOLO-World team benchmarked the models on the LVIS dataset and measured their speed on a V100 GPU without any acceleration mechanisms such as quantization or TensorRT.

According to the paper, YOLO-World reached 35.4 AP with 52.0 FPS for the L version and 26.2 AP with 74.1 FPS for the S version. While the V100 is a powerful GPU, achieving such high FPS on any device is impressive.
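To put those FPS figures in perspective, here is a small sketch that converts them into average per-frame latency. The helper name `frame_latency_ms` and the `REPORTED` dictionary are made up for illustration, and the conversion assumes frames are processed one at a time, which the post does not state.

```python
# Figures quoted in the post above (LVIS, V100, no TensorRT or quantization).
# The dictionary layout is hypothetical; only the numbers come from the post.
REPORTED = {
    "YOLO-World-L": {"ap": 35.4, "fps": 52.0},
    "YOLO-World-S": {"ap": 26.2, "fps": 74.1},
}


def frame_latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given throughput.

    Assumes sequential, single-frame processing (an assumption, not
    something the post confirms).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    for name, stats in REPORTED.items():
        latency = frame_latency_ms(stats["fps"])
        print(f"{name}: {stats['ap']} AP, {stats['fps']} FPS, "
              f"about {latency:.1f} ms per frame")
```

Under that assumption, both variants stay well under the roughly 33 ms per frame needed to keep up with a 30 FPS video stream.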

- 🔗 YOLO-World arXiv paper: https://lnkd.in/ddRBKCCX
- 🔗 my YOLO-World technical report: https://blog.roboflow.com/what-is-yolo-world
- 🤗 YOLO-World space: SkalskiP/YOLO-World

Articles 1

Article
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Datasets

None public yet