Merge branch 'main' of https://huggingface.co/erst/xlm-roberta-base-finetuned-nace into main
README.md
CHANGED
@@ -1,10 +1,10 @@
-# Classifying Text into
+# Classifying Text into NACE Codes

 This model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) fine-tuned to classify descriptions of activities into [NACE Rev. 2](https://ec.europa.eu/eurostat/web/nace-rev2) codes.


 ## Data
-The data used to fine-tune the model consist of 2.5 million descriptions of activities from Norwegian and Danish businesses. To improve the model's multilingual performance, random samples were machine translated into the following languages:
+The data used to fine-tune the model consist of 2.5 million descriptions of activities from Norwegian and Danish businesses. To improve the model's multilingual performance, random samples of the Norwegian and Danish descriptions were machine translated into the following languages:
 - English
 - German
 - Spanish
@@ -17,8 +17,8 @@ The data used to fine-tune the model consist of 2.5 million descriptions of acti
 ```python
 from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

-tokenizer = AutoTokenizer.from_pretrained("erst/xlm-roberta-base-finetuned-
-model = AutoModelForSequenceClassification.from_pretrained("erst/xlm-roberta-base-finetuned-
+tokenizer = AutoTokenizer.from_pretrained("erst/xlm-roberta-base-finetuned-nace")
+model = AutoModelForSequenceClassification.from_pretrained("erst/xlm-roberta-base-finetuned-nace")

 pl = pipeline(
 "sentiment-analysis",