---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
widget:
- text: "Implementing the reform required strong support from all ministries involved. A major effort was required to present the conceptual change to car importers, politicians and the public. A great deal was also invested in public relations to describe the benefits of the tax, which by many was perceived as yet another attempt to increase tax revenues. A number of the most popular car models’ prices were about to increase – mostly large family, luxury and sport cars – but for many models, the retail price was actually reduced."
- text: "Workers in the formal sector. Formal sector workers also face economic risks. A number of them experience income instability due to contractualization, retrenchment, and firm closures. In 2014, contractual workers accounted for 22 percent of the total 4.5 million workers employed in establishments with 20 or more employees."
- text: "Building additional dams and power stations to further develop energy generation potential from the same river flow as well as develop new dam sites on parallel rivers in order to maintain the baseline hydropower electricity generation capacity to levels attainable under a ‘no-climate change’ scenario. Developing and implementing climate change compatible building/construction codes for buildings, roads, airports, airfields, dry ports, railways, bridges, dams and irrigation canals that are safe for human life and minimize economic damage that is likely to result from increasing extremes in flooding."
- text: "Another factor that increases farmer vulnerability is the remoteness of farm villages and lack of adequate road infrastructure. Across the three regions, roads are in a poor state and unevenly distributed, with many villages lacking roads that connect them to other villages. Even the main roads are often accessible only during the dry season. The livelihood implications of this isolation are significant, as farmers have difficulties getting their products to markets as well as obtaining agricultural inputs; in addition, farmers generally have to pay higher prices for agricultural inputs in remote areas, reducing their profit margins"
- text: "This project aims to construct a desalination plant in the capital city in order to respond directly to drinking water supply needs. This new plant, which will have a capacity of 22,500 m3 daily, easily expandable to 45,000 m3, will be fuelled by renewable energy, which is expected to be provided by a wind farm planned for the second phase of the project. Funding: European Union. Rural Community Development and Water Mobilization Project (PRODERMO)."
pipeline_tag: text-classification
inference: false
base_model: sentence-transformers/all-mpnet-base-v2
---

# SetFit with sentence-transformers/all-mpnet-base-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model. A [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
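
As a rough illustration of step 1, the sketch below shows one way labeled examples can be turned into sentence pairs for contrastive fine-tuning: a pair gets target 1.0 when both texts share a label and 0.0 otherwise, which is the kind of target a cosine-similarity loss is trained against. This is not the SetFit implementation. The helper `make_contrastive_pairs`, the toy texts, and the single-label setup are all assumptions made for illustration; the actual trainer samples pairs according to its `sampling_strategy` and handles multi-label targets differently.

```python
from itertools import combinations
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[int]
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples from a few labeled examples.

    Hypothetical helper: target is 1.0 when both texts share a label,
    0.0 otherwise. Every unordered pair is generated exactly once.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Toy data invented for this sketch, not taken from the training set.
toy_texts = [
    "Farmers in remote villages lack road access.",
    "Smallholders cannot reach markets in the rainy season.",
    "Contractual workers face unstable incomes.",
    "Firm closures leave formal workers without pay.",
]
toy_labels = [0, 0, 1, 1]

pairs = make_contrastive_pairs(toy_texts, toy_labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # prints: 6 2 4
```

Even four labeled sentences yield six training pairs, which is part of why this style of fine-tuning can work with few examples. Step 2 then fits a classifier on embeddings produced by the fine-tuned body.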

## Model Details

### Model Description

- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Classification head:** a [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance
- **Maximum Sequence Length:** 384 tokens
- **Number of Classes:** 18 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("leavoigt/vulnerability_multilabel_updated")
# Run inference
preds = model("Workers in the formal sector. Formal sector workers also face economic risks. A number of them experience income instability due to contractualization, retrenchment, and firm closures. In 2014, contractual workers accounted for 22 percent of the total 4.5 million workers employed in establishments with 20 or more employees.")
```

## Training Details

### Training Set Metrics

| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 21  | 72.6472 | 238 |

### Training Hyperparameters

- batch_size: (16, 2)
- num_epochs: (1, 0)
- max_steps: -1
- sampling_strategy: undersampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0006 | 1    | 0.1906        | -               |
| 0.0316 | 50   | 0.1275        | 0.1394          |
| 0.0631 | 100  | 0.0851        | 0.1247          |
| 0.0947 | 150  | 0.0959        | 0.1269          |
| 0.1263 | 200  | 0.1109        | 0.1179          |
| 0.1578 | 250  | 0.0923        | 0.1354          |
| 0.1894 | 300  | 0.063         | 0.1292          |
| 0.2210 | 350  | 0.0555        | 0.1326          |
| 0.2525 | 400  | 0.0362        | 0.1127          |
| 0.2841 | 450  | 0.0582        | 0.132           |
| 0.3157 | 500  | 0.0952        | 0.1339          |
| 0.3472 | 550  | 0.0793        | 0.1171          |
| 0.3788 | 600  | 0.059         | 0.1187          |
| 0.4104 | 650  | 0.0373        | 0.1131          |
| 0.4419 | 700  | 0.0593        | 0.1144          |
| 0.4735 | 750  | 0.0405        | 0.1174          |
| 0.5051 | 800  | 0.0284        | 0.1196          |
| 0.5366 | 850  | 0.0329        | 0.1116          |
| 0.5682 | 900  | 0.0895        | 0.1193          |
| 0.5997 | 950  | 0.0576        | 0.1159          |
| 0.6313 | 1000 | 0.0385        | 0.1203          |
| 0.6629 | 1050 | 0.0842        | 0.1195          |
| 0.6944 | 1100 | 0.0274        | 0.113           |
| 0.7260 | 1150 | 0.0226        | 0.1137          |
| 0.7576 | 1200 | 0.0276        | 0.1204          |
| 0.7891 | 1250 | 0.0355        | 0.1163          |
| 0.8207 | 1300 | 0.077         | 0.1161          |
| 0.8523 | 1350 | 0.0735        | 0.1135          |
| 0.8838 | 1400 | 0.0357        | 0.1175          |
| 0.9154 | 1450 | 0.0313        | 0.1207          |
| 0.9470 | 1500 | 0.0241        | 0.1159          |
| 0.9785 | 1550 | 0.0339        | 0.1161          |

### Framework Versions

- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.3.1
- Transformers: 4.38.1
- PyTorch: 2.1.0+cu121
- Datasets: 2.3.0
- Tokenizers: 0.15.2

## Citation

### BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```