JeffYang52415 committed on
Commit b01d107 · unverified · 1 Parent(s): fecdc3d

bug: fix minor bugs

Files changed (2)
  1. .github/workflows/huggingface-sync.yml +13 -2
  2. README.md +19 -17
.github/workflows/huggingface-sync.yml CHANGED
@@ -18,6 +18,15 @@ jobs:
           git config --global user.email "github-actions[bot]@users.noreply.github.com"
           git config --global user.name "github-actions[bot]"
 
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.x"
+
+      - name: Install Hugging Face CLI
+        run: |
+          pip install --upgrade huggingface-hub
+
       - name: Login to Hugging Face
         env:
           HF_TOKEN: ${{ secrets.HUGGINGFACE_TOKEN }}
@@ -26,5 +35,7 @@ jobs:
 
       - name: Push to Hugging Face Space
         run: |
-          git remote add space https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser
-          git push space main:main
+          git remote add space https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser || true
+          git fetch space || true
+          # Force push to ensure sync, use with caution
+          git push -f space main:main
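
Review note: in the updated push step, `git remote add` and `git fetch` are followed by `|| true`, so only the final force push can fail the job. The sketch below restates that control flow in Python to make the behavior explicit. The helper names `build_sync_commands` and `run_sync` are hypothetical and are not part of this repository; the URL and git arguments are copied from the workflow.

```python
import subprocess
from typing import Callable, List, Sequence

# URL copied from the workflow; everything else here is an illustrative assumption.
SPACE_URL = "https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser"


def build_sync_commands(space_url: str, branch: str = "main") -> List[List[str]]:
    """Return the three git commands from the push step, in order."""
    return [
        ["git", "remote", "add", "space", space_url],
        ["git", "fetch", "space"],
        ["git", "push", "-f", "space", f"{branch}:{branch}"],
    ]


def _run_git(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_sync(
    commands: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = _run_git,
) -> None:
    """Run the commands; only the final push is allowed to fail the job."""
    *best_effort, push = commands
    for cmd in best_effort:
        run(cmd)  # mirrors `|| true`: the exit code is deliberately ignored
    code = run(push)
    if code != 0:
        raise RuntimeError(f"push failed with exit code {code}")
```

Passing a fake `run` callable makes the flow checkable without touching a real repository; calling `run_sync(build_sync_commands(SPACE_URL))` with the default runner would execute the actual git commands, including the force push.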
README.md CHANGED
@@ -13,10 +13,12 @@ short_description: A collection of parsers for LLM benchmark datasets
 
 **LLMDataParser** is a Python library that provides parsers for benchmark datasets used in evaluating Large Language Models (LLMs). It offers a unified interface for loading and parsing datasets like **MMLU**, **GSM8k**, and others, streamlining dataset preparation for LLM evaluation. The library aims to simplify the process of working with common LLM benchmark datasets through a consistent API.
 
+**Spaces**: You can also try out the online demo on Hugging Face Spaces:
+[LLMEval Dataset Parser Demo](https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser)
+
 ## Features
 
 - **Unified Interface**: Consistent `DatasetParser` for all datasets.
-- **LLM-Agnostic**: Independent of any specific language model.
 - **Easy to Use**: Simple methods and built-in Python types.
 - **Extensible**: Easily add support for new datasets.
 - **Gradio**: Built-in Gradio interface for interactive dataset exploration and testing.
@@ -78,22 +80,22 @@ Poetry manages the virtual environment and dependencies automatically, so you do
 Here's a simple example demonstrating how to use the library:
 
 ```python
-from llmdataparser import ParserRegistry
-# list all available parsers
-ParserRegistry.list_parsers()
-# get a parser
-parser = ParserRegistry.get_parser("mmlu")
-# load the parser
-parser.load() # optional: task_name, split
-# parse the parser
-parser.parse() # optional: split_names
-
-print(parser.task_names)
-print(parser.split_names)
-print(parser.get_dataset_description)
-print(parser.get_huggingface_link)
-print(parser.total_tasks)
-data = parser.get_parsed_data
+from llmdataparser import ParserRegistry
+# list all available parsers
+ParserRegistry.list_parsers()
+# get a parser
+parser = ParserRegistry.get_parser("mmlu")
+# load the parser
+parser.load() # optional: task_name, split
+# parse the parser
+parser.parse() # optional: split_names
+
+print(parser.task_names)
+print(parser.split_names)
+print(parser.get_dataset_description)
+print(parser.get_huggingface_link)
+print(parser.total_tasks)
+data = parser.get_parsed_data
 ```
 
 We also provide a Gradio demo for interactive testing:
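
Review note: the README example in this diff follows a registry pattern: look a parser up by name, then call `load()` and `parse()` on it. The toy sketch below shows how such a calling pattern can work in general. It is not the library's implementation; `ToyRegistry`, `ToyParser`, `ToyMMLU`, the `"toy-mmlu"` key, and the rule that `load()` must precede `parse()` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Type


class ToyParser:
    """Stand-in parser: load() copies raw rows, parse() normalizes them."""

    raw_rows: List[str] = []

    def __init__(self) -> None:
        self._loaded: List[str] = []
        self.parsed: List[Dict[str, str]] = []

    def load(self) -> None:
        self._loaded = list(self.raw_rows)

    def parse(self) -> None:
        # Assumed ordering rule for this toy only.
        if not self._loaded:
            raise RuntimeError("call load() before parse()")
        self.parsed = [{"question": row.strip()} for row in self._loaded]


class ToyRegistry:
    """Maps short names to parser classes (hypothetical, for illustration)."""

    _parsers: Dict[str, Type[ToyParser]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type[ToyParser]], Type[ToyParser]]:
        def decorator(parser_cls: Type[ToyParser]) -> Type[ToyParser]:
            cls._parsers[name] = parser_cls
            return parser_cls

        return decorator

    @classmethod
    def list_parsers(cls) -> List[str]:
        return sorted(cls._parsers)

    @classmethod
    def get_parser(cls, name: str) -> ToyParser:
        if name not in cls._parsers:
            raise ValueError(f"unknown parser: {name!r}")
        return cls._parsers[name]()


@ToyRegistry.register("toy-mmlu")
class ToyMMLU(ToyParser):
    raw_rows = ["  What is 2 + 2?  "]


parser = ToyRegistry.get_parser("toy-mmlu")
parser.load()
parser.parse()
print(ToyRegistry.list_parsers())
print(parser.parsed)
```

Keeping the name-to-class mapping in one place is what lets every dataset share the same lookup, load, and parse calls; adding a dataset in this toy means writing one subclass and one `register` line.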