Harpreet Sahota PRO

harpreetsahota

AI & ML interests

Deep learning, language models, prompt engineering, agents, multi-agent systems

Recent Activity

updated a dataset 2 days ago

Voxel51/WebUOT-238-Test

liked a dataset 2 days ago

Voxel51/WebUOT-238-Test

published a dataset 3 days ago

Voxel51/WebUOT-238-Test

Organizations

Posts 3

Post

2201

The Coachella of Computer Vision, CVPR, is right around the corner. In anticipation of the conference, I curated a dataset of the papers.

I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.

The dataset consists of the following fields:

- An image of the first page of the paper
- title: The title of the paper
- authors_list: The list of authors
- abstract: The abstract of the paper
- arxiv_link: Link to the paper on arXiv
- other_link: Link to the project page, if found
- category_name: The primary category of this paper, according to the [arXiv taxonomy](https://arxiv.org/category_taxonomy)
- all_categories: All categories this paper falls into, according to arXiv taxonomy
- keywords: Extracted using GPT-4o
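To make the shape of a single record concrete, here is a minimal sketch of the fields above as a Python dataclass. The field types are assumptions, and `image_path` is a hypothetical name, since the post does not name the image field. The sample values are placeholders, not real entries from the dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PaperRecord:
    """One paper, mirroring the fields listed in the post (types are assumed)."""

    image_path: str            # hypothetical name: image of the paper's first page
    title: str
    authors_list: List[str]
    abstract: str
    arxiv_link: str
    other_link: Optional[str] = None   # project page, only if one was found
    category_name: str = ""            # primary arXiv category
    all_categories: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


# Placeholder values for illustration only.
example = PaperRecord(
    image_path="first_pages/example.png",
    title="An Example Paper",
    authors_list=["A. Author", "B. Author"],
    abstract="A placeholder abstract.",
    arxiv_link="https://arxiv.org/abs/0000.00000",
    category_name="cs.CV",
    all_categories=["cs.CV", "cs.LG"],
    keywords=["placeholder"],
)
```

Keeping `other_link` optional reflects the "if found" wording in the field list.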

Here's how I created the dataset 👇🏼

Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).

This dataset was built using the following steps:

- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (a Python wrapper for the arXiv API) to extract the abstract and categories, and download the PDF for each paper
- Use pdf2image to save an image of the paper's first page
- Use GPT-4o to extract keywords from the abstract
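Two of the steps above can be sketched in a few lines. This is not the code from the linked repository. It is a rough illustration under stated assumptions: `find_arxiv_abs_link` and `save_first_page` are hypothetical helper names, the regular expression only covers new-style arXiv identifiers such as `2404.12345`, and the second helper assumes pdf2image (and its Poppler dependency) is installed.

```python
import re
from typing import Iterable, Optional

# Matches new-style arXiv IDs in /abs/ or /pdf/ URLs, ignoring any version suffix.
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE
)


def find_arxiv_abs_link(urls: Iterable[str]) -> Optional[str]:
    """Return a normalized abstract link for the first arXiv URL, else None."""
    for url in urls:
        match = _ARXIV_ID.search(url)
        if match:
            return f"https://arxiv.org/abs/{match.group(1)}"
    return None


def save_first_page(pdf_path: str, out_path: str) -> None:
    """Render only page 1 of a PDF and save it as a PNG image."""
    # Imported lazily so the rest of the sketch runs without pdf2image installed.
    from pdf2image import convert_from_path

    pages = convert_from_path(pdf_path, first_page=1, last_page=1)
    pages[0].save(out_path, "PNG")


# Placeholder search results; in the real pipeline these would come from a web search.
results = ["https://example.com/some-page", "https://arxiv.org/pdf/2404.12345v2"]
link = find_arxiv_abs_link(results)
```

Normalizing every hit to an `/abs/` link drops the version suffix, which makes it easier to recover a bare arXiv identifier for the later lookup step.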

Voxel51/CVPR_2024_Papers

Articles 1

Article

9

The CVPR Survival Guide: Discovering Research That's Interesting to YOU!

Collections 1

spaces 4

AG4DP Example Chatbot

An example chatbot for the AG4DP Course

Chat With Website

RAQA on Chainlit - Chat The Hitchhikers Guide to the Galaxy with a fine-tuned GPT 3.5

Spidey-verse RAQA Application Chainlit Demo

models 8

harpreetsahota/CLIP-IQA

Updated Dec 5, 2024 • 1

harpreetsahota/DCVAI-Example-1

Updated Nov 4, 2024

harpreetsahota/coursera_week1_lesson7

Updated Sep 10, 2024

harpreetsahota/DeciLM-7B-Instruct-gptq-2bit-slim-orca

Text Generation • Updated Jan 30, 2024 • 91 • 1

harpreetsahota/DeciLM-7B-Instruct-gptq-4bit-slim-orca

Text Generation • Updated Jan 30, 2024 • 94 • 1

harpreetsahota/DeciLM-Base-ChatTuned-Blogv0.2

Text Generation • Updated Jan 27, 2024 • 6 • 1

harpreetsahota/DeciLM-6B-qlora-blog-post

Text Generation • Updated Dec 7, 2023 • 10

harpreetsahota/DeciLM-6B-hf-open-instruct-v1-blog-post

Text Generation • Updated Dec 7, 2023 • 16 • 1

datasets 41

harpreetsahota/ml-memes

Viewer • Updated 9 days ago • 57 • 42 • 1

harpreetsahota/memes-dataset

Viewer • Updated 9 days ago • 100 • 35 • 1

harpreetsahota/medium-blogs-example

Viewer • Updated 12 days ago • 13 • 50

harpreetsahota/wonders-of-the-world

Viewer • Updated 15 days ago • 3.75k • 48

harpreetsahota/random_short_videos

Viewer • Updated 29 days ago • 412 • 380

harpreetsahota/marvel-bobbleheads

Viewer • Updated Nov 8, 2024 • 151 • 88 • 2

harpreetsahota/marvel-masterpieces-with-3dmesh

Viewer • Updated Nov 5, 2024 • 510 • 234 • 3

harpreetsahota/marvel-masterpieces

Viewer • Updated Nov 5, 2024 • 255 • 79 • 1

harpreetsahota/videos-to-test-trackers

Preview • Updated Oct 21, 2024 • 77 • 1

harpreetsahota/coursera_week1_lesson7

Viewer • Updated Sep 10, 2024 • 4.16k • 127