library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - accuracy
widget:
  - text: "Tax Invoice\nOriginal for Buyer/ Duplicate for Transporter/ Triplicate for Assessee\n\nSupplier Legal_Name: Mahaneadi Coalfields 4rea Code +MOO1\nLimited Area Description Jagannath\nSupplier Addresa . Jagriti Vihar, Buria Inv3ice Number -9100065259\nSambalpur 768020 Invoice Date :Dec 3, 2022\nSupplier City - Sambalpur Con=zact Reference: 3030007756\nSupplier State - Odisha Conzzact type ;Spot Auction\nSupplier Pincode : 768020 Salas Order 11240002677\nP.O - Jagriti Vihar, Burla Supplier GSTIN 2134ABCM5188P1Z3 Sale Order Date :Nov 10, 2022\n\nDistrict : Sambalpur - Supplier Email 7 Vode of Dispatch ;ROAD\n768020 Odasha\n\n+91-663-2542461\n\n+91-663-2542770\nWWH .MAHANADICOAL. IN ae4a8clfeb29d373 2dedeaa25bd3 903 7eec4d£013a425b74bd5801750c2ab132\n\nReceiver (Billed To) Consignee (Shipped to) Details of Dispatch\nName »HINDALCO INDUSTRIES LIMITED Name :\nParty Code 2000000395 Party Code 12000000395 Grade ‘G12\nAddress _ HIRAKUD SAMBALPUR SAMBALPUR Addresa _Within Odisha acv + 3701-4000\n768016 Sire :-100 MM\n\nCity * SAMBALPUR City\nPincode - 768016 Pincode Dispatch date Dec 3, 2022\n\nState code Gdasha State code < Plant S046\nPhone number. 06632481365 Phone number\nGSTIN 21AAACH1201R122 GSTIN\n\nE-Mail ID aaurav panigrahi@adityabirla com E-Mail ID\n\nCompany Name\n\na\n\nPARTICULARS\n\nPricing Description i ih | Rate Per UOM (INR Amount (INR)\n\nSizing Charges Fi\nas\n\n \n\nDMF( 30% of Royalty} 18545.29\n\nCGST( 2 5% ) 122.29\n\n[es\n\nTotal Amount: $17597.31\n\nRemarks/Note/ Declaration\n\nReverse Charge Applicable; No\n\nTotal Bill Value In words SIX LAKH SEVENTEEN THOUSAND PIVE HUNDRED NINETY SEV&N RUPEES THIRTY ONE PAISE\n\nCertified that the particulars givan above are true and correct and the amount indicated reprasents the price actually charged and that\n\nthera is no flow of additional consideration directly or indirectly from the buyer.\n\nArea : Jagannath\n— Rete ” Telephone . 
6760269528\nLe Amount ° Fax Number : 6760269527\n\nAdviaing Bank Name _NA E-Mail Address . so-sales-jaga.mcl@coalindia, in\n\nThis is digitally verified document hence manual/ physical signature 1s not required\n\nAuthorized Signatory\n\n \n\f"
  - text: "Hirakud Administration\n\nFrom: Hirakud Administration\n\nSent: 18 December 2021 12:41 PM\n\nTo: sheela tower\n\nCe: Dusyant Dushkar; Sumit Kumar; Nagesh Pal; SANDEEP TIWARI; Raj Singh; Binay Dash\nSubject: RE: Room booking in hotel\n\nDear Sir,\n\nAs per trailing mail, kindly book 02 more rooms for our guest. Details as under:-\n\n1. Ms Alka Chaubey\n2. Mr Devguru Dash\n\nNote — Host will be Mr. Dusyant Dushkar. Check in date is 19.12.2021. Kindly correct the name of the guest\nin my previous mail, sl no 3 Mr.Subham Goel\n\nRegards\n~ Sandeep Kumar Ranbadia\n\nFrom: sheela tower [mailto:[email protected]]\nSent: 18 December 2021 12:22 PM\n\nTo: Hirakud Administration <[email protected]>\n\nCe: Dusyant Dushkar <[email protected]>; Sumit Kumar <[email protected]>; Nagesh Pal\n<[email protected]>; SANDEEP TIWARI <[email protected]>; Raj Singh\n\n<raj.kumarsingh>; Binay Dash <[email protected]>\n\nSubject: Re: Room booking in hotel\n\n   \n    \n\nCAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the\n‘sender and know the content is:safe.\n\nDear sir\nGreeting from sheelatowers!!!\n\nWe are confirm 04 business plus room for your guest.\n\nSra\n\nmcteCKotenat\n\na\n\na a\n\n \n\nRegards\nRanjt kumar sahoo\n9778403111\n\f"
  - text: "Tax Invoice\nOriginal for Buyer/ Duplicate for Transporter/ Triplicate for Assessee\n\nan ¢\n6 , Mc Supplier Legal Nama, Mahanadi Coalfields Are2 Code .MOO1\n» ta . — i\n\nLimited area Description ;Jagannath 3% Lane a Metareriz ete!\nSupplier Address Jagriti Vihar, Bur-a  |Invoice Number =. 9100067434 area isd Sooner at\nSambalpur 768020 Invoice Date :Dec 19, 2022 UA aarp LBL\n; Supplier City : Sambalpur Contract Reference:3030007756 :\n. L Supplier State : Odisha Contract type iSpot Auction\n+ Supplier Pincode : 7@8020 Sales Order 11240002677\nP.O. - Jagriti Vahar, Burla [Supplier GSTIN . 21 QAABCMS188P123 Sals Order Date «Nov 10, 2022\n\nDistrict : Sambalpur - Supplier Email Nod? of Dispatch .ROAD\n768020 Odasha\n\n+91-663-2542461\n\n*91-663+2542770\nWWW. MAHANADICOAL. IN IRN No: 21829066 2bd3486eef9a903185 74daciafc0«13744361e£93477eBd16aa0bast\n\nReceiver (Billed To} Conaignee (Shipred <a) Details of Dispatch\nName : HINDALCO INDUSTRIES LIMITED Name :\nParty Code ,2000000355 Party Code 7 20000003355 Grade » G12\nAddress . FTRAKUD = SAMBALFUR SAMBALPUR Address : Within Odis.a cv - 3701-4000\n768016 Size .-100 MM\n\nCity sSAMBALPUR city Dispatch date :Dec 19, 2022\nPincode . 768016 Pincode\n\nState code .Cdisha State code i Plant 15046\nPhone number: 06632481365 Phone number :\nGSTIN + 21AKRACH1201R122 GSTIN\n\nE-Mail ID : Saurav. 
[email protected] B-Mail ID\n\nCompany Name;\n\nPricing Desoription 7\n\nEvac Facility Charge 60.00 68384.40\nPoyalty Charges ( 14% of Basic Price) $84.12\n\nNMET Charges( 2% of Royalty)\n\neC\n\nGross Bill Value 5536.61 6309612.40\n\n6309612.40\nRemarks/Note/ Declaration Total Amount:\n\nReverse Charge Applicable No\n\nTotal Bill Value In words SIXTY THREE LAKH NINE THOUSAND SIX HUNDRED TWELVE RJPBES FORTY PAISE\n\nCertified that the particulars given above are true and correct and the amount indicated represents the price actually charged and that\n\nthere 18 no flow of additional consideration directly or indirectly Erom tke bayer.\n\nLC Ref.No NA\nLC Date\n\nLe Amount 90\nAdvising Bank Name - NA\n\nArea : Jagannath\nTelephone + 6760269528\nFax Number : ©760269527\nR-Mail Addresa : so-sales-jaga.mcl@coalindia. in\n\nThis is digitally verified document hence manual/ physical signature is not required\n\nAuthorized Signatory\n\n \n\n \n\f"
  - text: "UNITED MEDICAL STORE Patient Name: KASTURI uENA\n‘EW MARKET, C/O PRAFULLA KUMAR JENA\nHIRAKUD. SAMBALPUR. Dr. Name :\n\nMedicine Advice Slip: MA/2223/0668 “\nPhone :0663-2431670 Prescription Indent:M/2223/06299\n\nDL No. :SAWZ 486 R/487 RC Invoice No. ; 0002785 Date : 21/11/2022\n\nSe|__Qiy. [Pack [Product “Batch [Exp] HSN [ MRP | Table | Dis [5051] CO3i] Amount |\n\n1. 30 TAB] 30'S TELMA H TAB 11/24 | 30049099; 484.00! 432.14 0.001 6.00\nNEOPRIDE TOTAL CAP 7/24 30049099) 445.00) 0,00; 6.00\n\n \n\n \n\n \n\nSUB TOTAL :\n\nSGST\ner rH 2 ROFF :\n— ha GRAND TOTAL\n\nTe & Con itions For UNITED MEDICAL STORE R a ah\nBILL GRAND TOTAL IS CALCULATED ACCORDING TO 1D- 3306 Im- 1220\nMRP PRICE ( INCLUDING ALL GST TAXES ) Q _ 06 (ped)\n\n \n\f"
  - text: "Original for Buyer/\n\nSupplier\n\nSupplier\n\nSupplier\nSupplier\nSupplier\nSupplier\nSupplier\n\nP.O. - Jagriti Vihar, Burla\n\nDistrict : Sambalpur -\n\nLegal _Name:\n\nAddress\n\nCity\nState\nPincode\nGSTIN\nEmail\n\nTax Invoice\nDuplicate for Transporter/ Triplicate for Assessea\n\nMahanadi Coalfields\nLimited\n\nJagriti Vihar, Burla\nSambalpur 768020\n\n: Sambalpur\n\n: Odisha\n\n: 768020\n\n: 21AABCM5186P123\n\nArea Code\n\nArea Description\nInvaice Number\nInvoice Date\n\n:MOO1\nJagannath K Ss pasts\n9100067646 Le\n\nEee yy\n-Dec 21, 2022 4 PSIG Nae\nca ee\n\npar\n\nContract Raference 3030007756\n\nContract typa\nSales Order\n\nSale Order Date\nYods of Dispatch\n\n:Spot Auction\n11240002677\n:Nov 10, 2022\n:ROAD\n\n768020 Odisha\n+91-663-2542461\n+91-663-2542770\nWWW. MAHANADICOAL.IN\nReceiver (Billed To)\n-HINDALCO INDUSTRIES LIMITED\n: 2000000395\nHIRAKUD\n768016\n: SAMBALPUR\nPincoda : 768016\nState code Odisha\nPhone number. 06632481365\nGSTIN . 21AAACH1201R12Z\n\nE-Mail Ip\n\nIRN No. 
b£1b63c27ecdbbbeOedf{d3b343825 £06861 d200cd310b3e3344c6c0de297£635\n\nConsignee (Shipred -o)\n\nDetails of Dispatch\n\nName\nParty Code\n\nAddress\n\nName\nParty Code\n\nAddress\n\n'Gl2\n: 3701-4000\n-100 MM\n\n'Dec 21,\n\nGrade\n@cv\nSize\n\nDispatch date\n\n: 2000000395\n\nSAMBALPUR SAMBALPUR Within Odisha\n\nCity city\n\nPincede\n\nState code\nPhone number :\nGSTIN\n\n2022\n\nPlant : 5046\n\n- [email protected] E-Mail ID\n\nCompany Name.\n\n8001237358 42e00c0000 Bituminous Coal 27011200 1128.42\n\nPricing Description TW ate Pee WONT a ay\n\nSizing Charges 87.00 $8172.54\n\nSTC Charges\n\n60.00 67705,20\n\nEvac Facility Charge\n\n$54.12 625280.909\n\nRoyalty Charges ( 14% of Basic Price)\n\n \n\n11.08 12505.$0\n\nNMET Charges{ 2% of Royalty)\n\n166.24 187584 .03\n\nDLMF({ 30% of Royalty)\n137989.92\n\n \n\nCGST( 2.56}\n\n122.28 137989.92\n\nSGST( 2.5% }\n\n400.00 451368.00\n\nGST Comp Cess\n\nGross Bill Value\n\nNet Value\n\nRemarks/Note/ Declaration\n\nTotal Bill Value In words\n\nLC Ref No NA\nLc Date\n\nLe Amount 6\n\nAdvising Bank Name_: NA\n\n \n\nReverse Charge Applicable: No\n\nSIMTY TWO LAKH FORTY SIX THOUSAND NINE HUNDRED FOPTY FOUR RUPEES SEVENTY SIX PAISE\n\nCertified that the particulars givan above are true and correct and tha amount indicated represents the price\nthere is no flow of additional consideration directly or indirectly from the buyer.\n\nArea . Jagannath\nTelephone - 6760269528\nFax Number : 6760269527\n\nE-Mail Address\n\nThie is digitally verified document hence manual/ physical signature 16 not\n\n5536.01\n\n$536.01 6246944.76\n\nTotal Amount: 6246944.76\n\nactually charged and that\n\n: [email protected]\n\nAuthorized Signatory\n\n \n\f"
pipeline_tag: text-classification
inference: true
base_model: BAAI/bge-small-en-v1.5
model-index:
  - name: SetFit with BAAI/bge-small-en-v1.5
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 1
            name: Accuracy

SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
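Step 1 depends on turning a small labeled set into many sentence pairs: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The sketch below illustrates only that pair construction in plain Python. The helper name `build_contrastive_pairs`, the toy texts, and the toy label assignment are hypothetical, and SetFit's own sampler may differ in detail.

```python
from itertools import combinations


def build_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    Hypothetical helper for illustration: same-label pairs get target 1.0,
    different-label pairs get target 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data; the label values are made up and do not reflect this model's labels.
toy_texts = ["document A1", "document A2", "document B1", "document B2"]
toy_labels = [1, 1, 0, 0]

pairs = build_contrastive_pairs(toy_texts, toy_labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), len(negatives))  # 2 4
```

Even four texts yield six training pairs, which is why this approach suits few-shot settings.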

Model Details

Model Description

Model Sources

Model Labels

Label Examples
  • '. : Pt oh mM\nBaw iS\n\nWw tere\nPr pe 0 ok ji the\nFw: Pending Bills jer\n\n, Ronit Sarangi to: Vinit K Sinha i 22-01-2020 11:37\n
  • 'Tax Invoice\nOriginal for Buyer/ Duplicate for Transporter/ Triplicate for Assessee\n\nSupplier Legal Name; Mahanadi Coalfields Area Code :MO01\n7 Limited Area Description :Jagannath\nSupplier Address , Jagriti Vihar, Bur.a Invaice Number 19100066504\nSambalpur 768020 Involee Date :Dee 15, 2022\nSupplier City : Sambalpur Contract Reference: 3030007756\nSupplier State Odisha Contract type :Spot Auction\nSupplier Pincode : 768020 Salas Order 1240002677\n.P.O. - dJagriti vihar, Burla



Label Accuracy
all 1.0


Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Gopal2002/COAL_INVOICE_ZEON")
# Run inference
preds = model("""UNITED MEDICAL STORE Patient Name: KASTURI uENA

Medicine Advice Slip: MA/2223/0668 “
Phone :0663-2431670 Prescription Indent:M/2223/06299

DL No. :SAWZ 486 R/487 RC Invoice No. ; 0002785 Date : 21/11/2022

Se|__Qiy. [Pack [Product “Batch [Exp] HSN [ MRP | Table | Dis [5051] CO3i] Amount |

1. 30 TAB] 30'S TELMA H TAB 11/24 | 30049099; 484.00! 432.14 0.001 6.00
NEOPRIDE TOTAL CAP 7/24 30049099) 445.00) 0,00; 6.00

er rH 2 ROFF :

Te & Con itions For UNITED MEDICAL STORE R a ah
""")


Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 270.5442 4241

Label Training Sample Count
0 130
1 85

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
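The `batch_size` and `sampling_strategy` values above, together with the per-label sample counts under Training Set Metrics (130 and 85), allow a rough estimate of how many optimisation steps one epoch of contrastive fine-tuning takes. The sketch below rests on two assumptions that the card does not state: same-label pairs are unordered combinations that include each text paired with itself, and oversampling repeats the smaller pool of pairs until it matches the larger one. Under those assumptions the estimate is 761 steps per epoch, which agrees with the Step/Epoch ratio in the Training Results log (for example 1500 / 1.9711).

```python
import math

label_counts = {0: 130, 1: 85}  # from "Training Set Metrics"
batch_size = 32                 # first element of batch_size (embedding phase)

# Assumption: same-label pairs are unordered combinations *with* self-pairs,
# i.e. n * (n + 1) / 2 per label.
positive_pairs = sum(n * (n + 1) // 2 for n in label_counts.values())

# Different-label pairs: every text of one label with every text of the other.
negative_pairs = label_counts[0] * label_counts[1]

# Assumption: "oversampling" repeats the smaller pool to match the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
# 12170 11050 24340 761
```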

Training Results

Epoch Step Training Loss Validation Loss
0.0013 1 0.2394 -
0.0657 50 0.1203 -
0.1314 100 0.0095 -
0.1971 150 0.0029 -
0.2628 200 0.0014 -
0.3285 250 0.0014 -
0.3942 300 0.0011 -
0.4599 350 0.0009 -
0.5256 400 0.0008 -
0.5913 450 0.0007 -
0.6570 500 0.0008 -
0.7227 550 0.0008 -
0.7884 600 0.0006 -
0.8541 650 0.0005 -
0.9198 700 0.0004 -
0.9855 750 0.0005 -
1.0512 800 0.0004 -
1.1170 850 0.0005 -
1.1827 900 0.0004 -
1.2484 950 0.0004 -
1.3141 1000 0.0003 -
1.3798 1050 0.0004 -
1.4455 1100 0.0004 -
1.5112 1150 0.0004 -
1.5769 1200 0.0005 -
1.6426 1250 0.0004 -
1.7083 1300 0.0003 -
1.7740 1350 0.0004 -
1.8397 1400 0.0005 -
1.9054 1450 0.0004 -
1.9711 1500 0.0003 -
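
The Epoch column is the step count divided by the number of steps per epoch, so the log itself reveals the length of an epoch and, with `num_epochs` set to 2, the total length of the run. A small sketch of that arithmetic, using a few rows copied from the table above (the variable names are illustrative):

```python
# (epoch, step) rows copied from the Training Results table
rows = [(0.0657, 50), (0.9855, 750), (1.9711, 1500)]
num_epochs = 2  # first element of num_epochs in Training Hyperparameters

# Each row gives an estimate of steps per epoch; the epoch values are rounded
# to four decimals, so later rows carry the smallest relative error.
estimates = [step / epoch for epoch, step in rows]
steps_per_epoch = round(estimates[-1])
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)  # 761 1522
```

The last logged step (1500) falls just short of the estimated 1522 total because the log is written every 50 steps.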

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.16.1
  • Tokenizers: 0.15.0



