erst/xlm-roberta-base-finetuned-nace

CasperEriksen committed on Mar 2, 2021

Commit cbf3afb · 1 Parent(s): 95f26d8

Add README.md

Files changed (1)

README.md +31 -0

README.md ADDED

	@@ -0,0 +1,31 @@

+# Classifying Text into DB07 Codes
+This model is [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) fine-tuned to classify descriptions of activities into [NACE Rev. 2](https://ec.europa.eu/eurostat/web/nace-rev2) codes.
+## Data
+The model was fine-tuned on 2.5 million descriptions of activities from Norwegian and Danish businesses. To improve the model's multilingual performance, random samples of these descriptions were machine-translated into the following languages:
+- English
+- German
+- Spanish
+- French
+- Finnish
+## Quick Start
+```python
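+from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
+
+# Load the fine-tuned tokenizer and classification model from the Hugging Face Hub.
+tokenizer = AutoTokenizer.from_pretrained("erst/xlm-roberta-base-finetuned-db07")
+model = AutoModelForSequenceClassification.from_pretrained("erst/xlm-roberta-base-finetuned-db07")
+
+# "text-classification" is the canonical task name; "sentiment-analysis" is an alias for it.
+pl = pipeline(
+    "text-classification",
+    model=model,
+    tokenizer=tokenizer,
+    return_all_scores=False,  # return only the highest-scoring label
+)
+
+# Classify a short activity description and print the predicted label with its score.
+print(pl("We sell clothes"))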
+```
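To make the `return_all_scores=False` setting in the Quick Start more concrete, here is a minimal pure-Python sketch of roughly what a text-classification pipeline does with the model's raw outputs: it applies a softmax and keeps only the highest-scoring label. The logits and the `LABEL_*` names below are hypothetical placeholders, not values or labels produced by this model, and `top_label` is an illustrative helper rather than part of `transformers`.

```python
import math


def top_label(logits, labels):
    """Apply a softmax to `logits` and return the best label with its score."""
    if not logits or len(logits) != len(labels):
        raise ValueError("logits and labels must be non-empty and the same length")
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    scores = [value / total for value in exps]
    # Keep only the index with the highest probability.
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": labels[best], "score": scores[best]}


# Hypothetical logits and placeholder label names, for illustration only.
example_labels = ["LABEL_0", "LABEL_1", "LABEL_2"]
example_logits = [0.2, 2.5, -1.0]

result = top_label(example_logits, example_labels)
print(result)
```

The returned dictionary mirrors the `label`/`score` shape that the pipeline call in the Quick Start is expected to produce for each input text.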