---
datasets:
- tner/bc5cdr
- tner/bionlp2004
- tner/btc
- tner/conll2003
- tner/fin
- tner/mit_movie_trivia
- tner/mit_restaurant
- tner/multinerd
- tner/ontonotes5
- tner/tweebank_ner
- tner/tweetner7
- tner/wikineural
- tner/wnut2017
language:
- en
metrics:
- accuracy
- f1
pipeline_tag: token-classification
---

# RoBERTa Span Detection

This model is [roberta-large](https://huggingface.co/roberta-large) fine-tuned on a **mixture of NER datasets**.

It detects entity spans (with <u>no differentiation between entity classes</u>). Labels follow the IOB format:

- 'B-TAG': first token of a span
- 'I-TAG': token inside a span
- 'O': token outside any span
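
As an illustration of how the three labels above map back to spans, here is a minimal sketch that groups per-token labels into token ranges. The function name `iob_to_spans`, the example sentence, and the lenient handling of a stray 'I-TAG' are assumptions made for this example; this is not the post-processing used in the linked repository.

```python
from typing import List, Tuple


def iob_to_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Group per-token IOB labels into (start, end) token spans, end exclusive.

    Hypothetical helper for illustration only.
    """
    spans: List[Tuple[int, int]] = []
    start = None  # index of the first token of the currently open span
    for i, label in enumerate(labels):
        if label == "B-TAG":
            # A new span begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif label == "I-TAG":
            # Assumption: an I-TAG with no open span starts a new span.
            if start is None:
                start = i
        else:  # "O"
            if start is not None:
                spans.append((start, i))
                start = None
    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Made-up example: tokens and the labels a span detector might assign.
tokens = ["Barack", "Obama", "visited", "New", "York", "City", "."]
labels = ["B-TAG", "I-TAG", "O", "B-TAG", "I-TAG", "I-TAG", "O"]

spans = iob_to_spans(labels)
span_texts = [" ".join(tokens[s:e]) for s, e in spans]
print(span_texts)  # ['Barack Obama', 'New York City']
```

Because the labels carry no class information, each span is only a token range; assigning an entity type would need a separate step.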

# Usage

This model was trained in an efficient way and therefore cannot be loaded directly from the Hugging Face Hub. To use it, please follow the instructions in this [repo](https://github.com/AntoineBlanot/efficient-llm).

# Data used for training

- tner/bc5cdr
- tner/bionlp2004
- tner/btc
- tner/conll2003
- tner/fin
- tner/mit_movie_trivia
- tner/mit_restaurant
- tner/multinerd
- tner/ontonotes5
- tner/tweebank_ner
- tner/tweetner7
- tner/wikineural
- tner/wnut2017

# Evaluation results

| Data | Accuracy |
|:---:|:---------:|
| validation | 0.972 |