|
# VayuBuddy Question Curation |
|
|
|
## π― Aim |
|
The purpose to create this templet is to have the automated interface to collect and manage data analytic questions for VayuBuddy |
|
|
|
## π Folder Structure |
|
|
|
The project is organized as follows: |
|
|
|
```bash |
|
project_root/ |
|
βββ app.py # Main Streamlit application |
|
βββ requirements.txt # Dependencies list |
|
βββ README.md # Documentation |
|
β |
|
βββ data/ |
|
β βββ questions/ # Stores question-related data |
|
β β βββ 0/ # Folder for question ID 0 |
|
β β β βββ question.txt # Question text |
|
β β β βββ answer.txt # Answer text |
|
β β β βββ code.py # Reference code for the question |
|
β β β βββ metadata.json # Metadata for the question |
|
β β βββ 1/ # Folder for question ID 1 |
|
β β β βββ question.txt # Question text |
|
β β β βββ answer.txt # Answer text |
|
β β β βββ code.py # Reference code for the question |
|
β β β βββ metadata.json # Metadata for the question |
|
β ... ... ... # and so on... |
|
β β ... ... |
|
β β |
|
β βββ raw_data/ # Stores the required CSV's |
|
β βββ NCAP_Funding.csv # NCAP Funding Data |
|
β βββ State.csv # States area & population Data |
|
β βββ Data.csv # Main AQI Data |
|
β |
|
βββ pages/ # Streamlit multipage support |
|
β βββ all_question.py # Page to view questions |
|
β βββ execute_code.py # Page to run the code of all questions |
|
β βββ add_question.py # Page to add new questions |
|
β βββ edit_question.py # Page to edit existing questions |
|
β βββ delete_question.py # Page to delete questions |
|
β |
|
βββ utils/ # Utility functions |
|
β βββ load_jsonl.py # Function to load questions a list |
|
β βββ data_to_jsonl.py # Function to convert question folders into JSONL |
|
β βββ jsonl_to_data.py # Function to convert JSONL into question folders |
|
β βββ code_services.py # Handles code formatting & execution |
|
β |
|
βββ output.jsonl # Processed question data in JSONL format |
|
``` |
|
|
|
This structure ensures **modularity** and **maintainability** of the project. π |
|
|
|
|
|
## π How to use this App |
|
|
|
- Add questions through ```Add Questions``` Page |
|
- Edit questions through ```Edit Questions``` Page |
|
- Delete questions through ```Delete Questions``` Page |
|
- The Data will not be saved in-case of missing fields or error in code |
|
|
|
### ```NOTE``` |
|
- while entering Data form code.py in ```Add Questions``` Page or ```Edit Questions``` Page either follow the ```true_code format``` i.e. all code written in the true_code function and true_code function called in the end of it's defination or follow ```No true_code format``` |
|
|
|
#### true_code format |
|
```python |
|
def true_code(): |
|
import pandas as pd |
|
|
|
df = pd.read_csv('data/raw_data/Data.csv', sep=",") |
|
|
|
data = df.groupby(['state','station'])['PM2.5'].mean() |
|
ans = data.idxmax()[0] |
|
print(ans) |
|
|
|
true_code() |
|
``` |
|
|
|
#### No true_code format |
|
```python |
|
import pandas as pd |
|
|
|
df = pd.read_csv('data/raw_data/Data.csv', sep=",") |
|
|
|
data = df.groupby(['state','station'])['PM2.5'].mean() |
|
ans = data.idxmax()[0] |
|
print(ans) |
|
``` |
|
|
|
## 𧩠Sample Question |
|
|
|
### question.txt |
|
```bash |
|
Which state has the highest average PM2.5 concentration across all stations? |
|
``` |
|
|
|
### answer.txt |
|
```bash |
|
Delhi |
|
``` |
|
|
|
### code.py |
|
```python |
|
def true_code(): |
|
import pandas as pd |
|
|
|
df = pd.read_csv('data/raw_data/Data.csv', sep=",") |
|
|
|
data = df.groupby(['state','station'])['PM2.5'].mean() |
|
ans = data.idxmax()[0] |
|
print(ans) |
|
|
|
true_code() |
|
``` |
|
|
|
### metadata.json |
|
```json |
|
{ |
|
"question_id": 0, |
|
"category": "spatial", |
|
"answer_category": "single", |
|
"plot": false, |
|
"libraries": [ |
|
"pandas" |
|
] |
|
} |
|
``` |
|
|
|
|
|
## π οΈ How to Set-Up project |
|
|
|
open the terminal in the empty folder and follow the following steps: |
|
|
|
### 1st step : clone repo |
|
```bash |
|
git clone https://github.com/ratnesh003/VayuBuddy-Question-Curation.git . |
|
``` |
|
|
|
### 2rd step : to install the dependencies to run the codes |
|
```bash |
|
pip install -r requirements.txt |
|
``` |
|
|
|
### 3nd step : to create dummy /data folder from already present output.jsonl |
|
```bash |
|
py .\utils\jsonl_to_data.py |
|
``` |