wsd-camembert-base-semcor-wngt-fr : almanach/camembert-base fine-tuned on Semcor+WNGT fr for Word Sense Disambiguation

wsd-camembert-base-semcor-wngt-fr is a Word Sense Disambiguation (WSD) model fine-tuned on the French versions of the Semcor and WNGT datasets, using almanach/camembert-base to provide the pretrained BERT embeddings.

The fine-tuned model achieves the following performance on SemEval 2013 - fr:

| Test F1 (%) | GPUs | Epochs |
|---|---|---|
| 51.28 | 1xV100 32GB | 40 |
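For readers unfamiliar with how an F1 score is usually obtained for WSD, the following is a minimal sketch of the common convention: precision is measured over the instances the system answered, and recall over all gold instances. It is an illustration only, not the scorer used to produce the figure above; the function name `wsd_scores`, the instance IDs, and the example sense keys are all hypothetical.

```python
def wsd_scores(gold, predicted):
    """Compute precision, recall and F1 for a WSD system.

    gold:      dict mapping instance id -> set of acceptable sense keys
    predicted: dict mapping instance id -> the single sense key the system chose
               (instances the system did not answer are simply absent)
    """
    # Only predictions for instances that exist in the gold standard count.
    attempted = [i for i in predicted if i in gold]
    correct = sum(1 for i in attempted if predicted[i] in gold[i])

    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 4 gold instances, 3 answered, 2 of them correct.
gold = {
    "d001.s001.t001": {"bank%1:14:00::"},
    "d001.s001.t002": {"cat%1:05:00::"},
    "d001.s002.t001": {"dog%1:05:00::"},
    "d001.s002.t002": {"bank%1:17:01::"},
}
predicted = {
    "d001.s001.t001": "bank%1:14:00::",  # correct
    "d001.s001.t002": "cat%1:05:00::",   # correct
    "d001.s002.t002": "bank%1:14:00::",  # wrong sense
}

precision, recall, f1 = wsd_scores(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

When a system answers every instance, precision and recall coincide and F1 equals plain accuracy.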

Model Details

The WSD model uses a Transformer encoder-decoder architecture with 6 layers in the encoder and 6 in the decoder, and leverages pretrained BERT embeddings for richer semantic representation.

How to disambiguate a sentence

To disambiguate a sentence, please refer to the official NWSD repository.

Training Details

Training and Test Data

We use Semcor.fr and WNGT.fr, annotated with WordNet 3.0 sense keys, for the train/valid sets:

| | Train | Valid |
|---|---|---|
| # utterances | 143,597 | 4,000 |
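As a reminder of what a WordNet sense key looks like, the sketch below splits one into its documented fields (`lemma%ss_type:lex_filenum:lex_id:head_word:head_id`). It is a small illustrative helper, not part of the NWSD tooling; the names `SenseKey` and `parse_sense_key` are hypothetical.

```python
from typing import NamedTuple

# Synset type codes as defined for WordNet sense keys.
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


class SenseKey(NamedTuple):
    lemma: str
    ss_type: int       # part-of-speech code, see SS_TYPES
    lex_filenum: int   # lexicographer file number
    lex_id: int        # distinguishes senses within a lexicographer file
    head_word: str     # only set for adjective satellites, else ""
    head_id: str       # only set for adjective satellites, else ""


def parse_sense_key(key: str) -> SenseKey:
    """Split a sense key such as 'dog%1:05:00::' into its fields."""
    lemma, sep, lex_sense = key.rpartition("%")
    if not sep or not lemma:
        raise ValueError(f"not a sense key (missing '%'): {key!r}")
    fields = lex_sense.split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 ':'-separated fields: {key!r}")
    ss_type, lex_filenum, lex_id, head_word, head_id = fields
    return SenseKey(lemma, int(ss_type), int(lex_filenum), int(lex_id),
                    head_word, head_id)


parsed = parse_sense_key("dog%1:05:00::")
print(parsed.lemma, SS_TYPES[parsed.ss_type], parsed.lex_filenum)
```

Because the French data is annotated with WordNet 3.0 keys, the lemma part of a key is the WordNet lemma and not necessarily the French surface form it is attached to.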

The semeval2013task12.fr.xml test data is the French version of the SemEval-2013 Task 12 test set, with:

| | Test |
|---|---|
| # utterances | 306 |

Training Procedure and Hyperparameters

We follow the training procedure provided in the NWSD GitHub repository.

Training time

On a single V100 32GB GPU, training took about 4 hours.

Libraries

Disambiguate:

  @inproceedings{vial-etal-2019-sense,
    title = "Sense Vocabulary Compression through the Semantic Knowledge of {W}ord{N}et for Neural Word Sense Disambiguation",
    author = {Vial, Lo{\"i}c  and
      Lecouteux, Benjamin  and
      Schwab, Didier},
    editor = "Vossen, Piek  and
      Fellbaum, Christiane",
    booktitle = "Proceedings of the 10th Global Wordnet Conference",
    month = jul,
    year = "2019",
    address = "Wroclaw, Poland",
    publisher = "Global Wordnet Association",
    url = "https://aclanthology.org/2019.gwc-1.14/",
    pages = "108--117",
}

Information

  • Developed by: Cécile Macaire
  • Funded by: GENCI-IDRIS (Grant 2023-AD011013625R1), PROPICTO ANR-20-CE93-0005
  • Language(s) (NLP): French
  • License: MIT
  • Finetuned from model: almanach/camembert-base