Improve model card, add link to code
This PR improves the model card by adding a link to the paper [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://huggingface.co/papers/2501.18492).
It also adds a license, changes the pipeline tag to text generation (the model generates text), and links to the GitHub repository.
Please review and merge this PR if everything looks good.
README.md
CHANGED
@@ -1,6 +1,6 @@
 ---
 library_name: transformers
-license:
+license: apache-2.0
 base_model: meta-llama/Llama-3.2-1B
 tags:
 - llama-factory
@@ -9,7 +9,7 @@ tags:
 model-index:
 - name: GuardReasoner 1B
 results: []
-pipeline_tag: text-
+pipeline_tag: text-generation
 language:
 - en
 metrics:
@@ -20,6 +20,8 @@ metrics:
 
 This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) via R-SFT and HS-DPO. It is based on the paper [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://huggingface.co/papers/2501.18492).
 
+Code: https://github.com/yueliu1999/GuardReasoner/
+
 The training data of R-SFT can be found in [GuardReasonerTrain](https://huggingface.co/datasets/yueliu1999/GuardReasonerTrain).
 
 
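For reviewers who want to sanity-check the metadata change, here is a minimal sketch that reads the top-level keys from the card's YAML front matter and confirms the two values this PR sets. The helper `read_front_matter` is hypothetical and is not part of the repository or of any library. It is deliberately not a full YAML parser: it reads only top-level `key: value` pairs and skips lists and nested entries. The sample card text is an assumption too, assembled from the `+` side of the diff, with the lines outside the diff hunks omitted.

```python
# Hypothetical helper (not from the GuardReasoner repo): reads top-level
# scalar keys from a model card's YAML front matter. This is a sketch, not
# a YAML parser; list items and nested or indented entries are skipped.
def read_front_matter(card_text: str) -> dict[str, str]:
    lines = card_text.splitlines()
    # Front matter must open with a '---' line.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter of the front matter
        # Skip blanks, list items, indented lines, and lines without a key.
        if not line or line[0] in " -" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:  # keys that only introduce a list or mapping are skipped
            metadata[key.strip()] = value
    return metadata


# Assumed sample: front matter as shown on the '+' side of the diff.
CARD = """---
library_name: transformers
license: apache-2.0
base_model: meta-llama/Llama-3.2-1B
tags:
- llama-factory
model-index:
- name: GuardReasoner 1B
  results: []
pipeline_tag: text-generation
language:
- en
---
"""

meta = read_front_matter(CARD)
assert meta["license"] == "apache-2.0"
assert meta["pipeline_tag"] == "text-generation"
```

In a real check, a proper YAML library would be the safer choice, since this sketch ignores quoting, multi-line values, and nested structures.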