Custom LayoutLM Model for Invoice Processing

This repository hosts a custom LayoutLM-family model fine-tuned to extract key information from invoices. It identifies fields such as amounts, dates, and names in invoice documents.

Model Overview

The model is based on the LayoutLMv2 architecture and has been fine-tuned on a custom dataset of invoices. It performs token classification to extract the following entities:

  • Amount Including Tax
  • Due Date
  • Reference Number
  • Customer Name
  • Vendor Name
  • Issue Date
  • Amount

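The model classifies each token with a custom label set in the BIO scheme: a B- label marks the first token of an entity, an I- label marks a continuation token of the same entity, and O marks tokens outside any entity.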
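Token-level labels with B-, I-, and O prefixes usually need to be merged into whole fields before they are useful. The following is a minimal sketch of that grouping step, not the author's post-processing code: the helper name `group_bio_entities` and the sample tokens are hypothetical, and real LayoutLMv2 output would first need word-piece tokens mapped back to words.

```python
from typing import List, Tuple


def group_bio_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity_type, text) spans.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        # "B-Due Date" -> ("B", "-", "Due Date"); "O" -> ("O", "", "")
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B":
            # A B- tag always starts a new span; flush any open span first.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I" and entity_type == current_type:
            # An I- tag of the same type continues the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open span: close it.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []

    # Flush a span that runs to the end of the sequence.
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Made-up tokens and tags, using label names from this model card.
sample_tokens = ["Acme", "Corp", "Total", "120.00", "EUR"]
sample_tags = [
    "B-Vendor Name",
    "I-Vendor Name",
    "O",
    "B-Amount Including tax",
    "I-Amount Including tax",
]
print(group_bio_entities(sample_tokens, sample_tags))
# [('Vendor Name', 'Acme Corp'), ('Amount Including tax', '120.00 EUR')]
```

This sketch drops an I- tag that has no preceding B- tag of the same type. Treating such a tag as the start of a new span is an equally reasonable choice, depending on how noisy the predictions are.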

Label Mapping

The model has been trained with the following label2id and id2label mappings:

label2id Mapping

label2id = {
    'I-Customer Name': 0,
    'B-Issue Date': 1,
    'I-Issue Date': 2,
    'I-Due Date': 3,
    'I-Amount': 4,
    'B-Due Date': 5,
    'O': 6,
    'B-Amount Including tax': 7,
    'B-Customer Name': 8,
    'B-Amount': 9,
    'I-Amount Including tax': 10,
    'B-Vendor Name': 11,
    'I-Vendor Name': 12,
    'I-Reference Number': 13,
    'B-Reference Number': 14
}

id2label Mapping

id2label = {
    0: 'I-Customer Name',
    1: 'B-Issue Date',
    2: 'I-Issue Date',
    3: 'I-Due Date',
    4: 'I-Amount',
    5: 'B-Due Date',
    6: 'O',
    7: 'B-Amount Including tax',
    8: 'B-Customer Name',
    9: 'B-Amount',
    10: 'I-Amount Including tax',
    11: 'B-Vendor Name',
    12: 'I-Vendor Name',
    13: 'I-Reference Number',
    14: 'B-Reference Number'
}
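Because the two mappings must stay exact inverses of each other, it can be safer to derive one from the other than to maintain both by hand. The sketch below rebuilds `id2label` from `label2id`, checks that the ids are contiguous, and decodes a list of predicted ids. The `decode_predictions` helper and the sample ids are hypothetical and only illustrate the lookup.

```python
from typing import Dict, List

# Copied from the label2id mapping in this model card.
label2id: Dict[str, int] = {
    "I-Customer Name": 0,
    "B-Issue Date": 1,
    "I-Issue Date": 2,
    "I-Due Date": 3,
    "I-Amount": 4,
    "B-Due Date": 5,
    "O": 6,
    "B-Amount Including tax": 7,
    "B-Customer Name": 8,
    "B-Amount": 9,
    "I-Amount Including tax": 10,
    "B-Vendor Name": 11,
    "I-Vendor Name": 12,
    "I-Reference Number": 13,
    "B-Reference Number": 14,
}

# Derive the inverse mapping so the two dictionaries cannot drift apart.
id2label: Dict[int, str] = {idx: label for label, idx in label2id.items()}

# Sanity checks: no duplicate ids, and ids cover 0..N-1 without gaps.
assert len(id2label) == len(label2id)
assert sorted(id2label) == list(range(len(label2id)))


def decode_predictions(pred_ids: List[int]) -> List[str]:
    """Map predicted class ids back to label strings (hypothetical helper)."""
    return [id2label[i] for i in pred_ids]


print(decode_predictions([11, 12, 6, 9]))
# ['B-Vendor Name', 'I-Vendor Name', 'O', 'B-Amount']
```

With Hugging Face Transformers, mappings like these are typically stored in the model configuration's `id2label` and `label2id` fields, so that inference output carries label names instead of raw ids. Whether this repository's configuration already includes them is not stated here.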


Citation

@article{Xu2020LayoutLMv2,
  title={LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding},
  author={Yang Xu and Yiheng Xu and Tengchao Lv and Lei Cui and Furu Wei and Guoxin Wang and Yijuan Lu and Dinei Florencio and Cha Zhang and Wanxiang Che and Min Zhang and Lidong Zhou},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.14740}
}