---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
datasets:
- funsd-layoutlmv3
metrics:
- precision
- recall
- f1
- accuracy
base_model: microsoft/layoutlmv3-base
model-index:
- name: OCR-LayoutLMv3
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: funsd-layoutlmv3
      type: funsd-layoutlmv3
      config: funsd
      split: train
      args: funsd
    metrics:
    - type: precision
      value: 0.8988653182042428
      name: Precision
    - type: recall
      value: 0.905116741182315
      name: Recall
    - type: f1
      value: 0.9019801980198019
      name: F1
    - type: accuracy
      value: 0.8403661000832046
      name: Accuracy
---
# OCR-LayoutLMv3
This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on the funsd-layoutlmv3 dataset. It achieves the following results on the evaluation set:
- Loss: 0.9788
- Precision: 0.8989
- Recall: 0.9051
- F1: 0.9020
- Accuracy: 0.8404
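
The precision, recall, and F1 above are the entity-level scores conventionally reported for FUNSD-style BIO tagging. A minimal sketch of how such scores are computed, assuming the usual `seqeval` library (the exact evaluation script is not part of this card), is:

```python
from seqeval.metrics import precision_score, recall_score, f1_score, accuracy_score

# Toy references/predictions in BIO format; FUNSD uses labels such as
# B-HEADER, I-HEADER, B-QUESTION, I-QUESTION, B-ANSWER, I-ANSWER, and O.
references = [["B-QUESTION", "I-QUESTION", "O", "B-ANSWER"]]
predictions = [["B-QUESTION", "I-QUESTION", "O", "O"]]

# seqeval scores at the entity level (a whole span must match to count),
# while accuracy is computed per token.
print("precision:", precision_score(references, predictions))
print("recall:   ", recall_score(references, predictions))
print("f1:       ", f1_score(references, predictions))
print("accuracy: ", accuracy_score(references, predictions))
```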
## Model description
LayoutLMv3 is a pre-trained multimodal Transformer for Document AI with unified text and image masking. The simple unified architecture and training objectives make LayoutLMv3 a general-purpose pre-trained model. For example, LayoutLMv3 can be fine-tuned for both text-centric tasks, including form understanding, receipt understanding, and document visual question answering, and image-centric tasks such as document image classification and document layout analysis.
[LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking](https://arxiv.org/abs/2204.08387), Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei. Preprint, 2022.
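
Below is a minimal inference sketch using the 🤗 Transformers API. The checkpoint id `your-namespace/OCR-LayoutLMv3` and the image path are placeholders, and `apply_ocr=True` assumes Tesseract/pytesseract is installed so the processor can extract words and bounding boxes itself.

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForTokenClassification

# The base processor runs OCR when apply_ocr=True; pass your own words/boxes
# and apply_ocr=False if you already have OCR output.
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base", apply_ocr=True)
# Placeholder repo id for this fine-tuned checkpoint.
model = AutoModelForTokenClassification.from_pretrained("your-namespace/OCR-LayoutLMv3")

image = Image.open("form.png").convert("RGB")  # placeholder document image
encoding = processor(image, return_tensors="pt")

logits = model(**encoding).logits
# Note: positions for special and subword tokens are included here; align
# predictions back to words via encoding.word_ids() in real use.
predicted_ids = logits.argmax(-1).squeeze().tolist()
print([model.config.id2label[i] for i in predicted_ids])
```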
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2000
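
Expressed as 🤗 `TrainingArguments`, these settings correspond roughly to the sketch below; `output_dir` and the evaluation cadence are assumptions (the results table that follows suggests evaluation every 100 steps), and Adam's betas and epsilon match the `TrainingArguments` defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="OCR-LayoutLMv3",      # placeholder, not from the card
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    max_steps=2000,                   # "training_steps" above
    lr_scheduler_type="linear",
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
    evaluation_strategy="steps",      # assumed from the 100-step eval rows below
    eval_steps=100,
)
```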
## Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.33  | 100  | 0.6966          | 0.7418    | 0.8063 | 0.7727 | 0.7801   |
| No log        | 2.67  | 200  | 0.5767          | 0.8104    | 0.8644 | 0.8365 | 0.8117   |
| No log        | 4.0   | 300  | 0.5355          | 0.8246    | 0.8852 | 0.8539 | 0.8295   |
| No log        | 5.33  | 400  | 0.5240          | 0.8706    | 0.8922 | 0.8813 | 0.8427   |
| 0.5326        | 6.67  | 500  | 0.6337          | 0.8528    | 0.8778 | 0.8651 | 0.8260   |
| 0.5326        | 8.0   | 600  | 0.6870          | 0.8698    | 0.8828 | 0.8762 | 0.8240   |
| 0.5326        | 9.33  | 700  | 0.6584          | 0.8723    | 0.9061 | 0.8889 | 0.8342   |
| 0.5326        | 10.67 | 800  | 0.7186          | 0.8868    | 0.9031 | 0.8949 | 0.8335   |
| 0.5326        | 12.0  | 900  | 0.6822          | 0.9040    | 0.9076 | 0.9058 | 0.8526   |
| 0.1248        | 13.33 | 1000 | 0.7042          | 0.8872    | 0.9021 | 0.8946 | 0.8511   |
| 0.1248        | 14.67 | 1100 | 0.7920          | 0.9027    | 0.9036 | 0.9032 | 0.8480   |
| 0.1248        | 16.0  | 1200 | 0.8052          | 0.8964    | 0.9151 | 0.9056 | 0.8389   |
| 0.1248        | 17.33 | 1300 | 0.8932          | 0.8995    | 0.9066 | 0.9030 | 0.8329   |
| 0.1248        | 18.67 | 1400 | 0.8728          | 0.8950    | 0.9061 | 0.9005 | 0.8398   |
| 0.0442        | 20.0  | 1500 | 0.9051          | 0.8960    | 0.9116 | 0.9037 | 0.8347   |
| 0.0442        | 21.33 | 1600 | 0.9587          | 0.8947    | 0.9031 | 0.8989 | 0.8401   |
| 0.0442        | 22.67 | 1700 | 0.9822          | 0.9042    | 0.9046 | 0.9044 | 0.8389   |
| 0.0442        | 24.0  | 1800 | 0.9734          | 0.9043    | 0.9061 | 0.9052 | 0.8391   |
| 0.0442        | 25.33 | 1900 | 0.9842          | 0.9042    | 0.9091 | 0.9066 | 0.8410   |
| 0.0225        | 26.67 | 2000 | 0.9788          | 0.8989    | 0.9051 | 0.9020 | 0.8404   |
## Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.12.1
- Datasets 2.6.1
- Tokenizers 0.13.1