SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a OneVsRestClassifier instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
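
Step 1 works on sentence pairs rather than single sentences: pairs that share a label are pushed together in embedding space, and pairs that do not are pushed apart. The snippet below is a minimal sketch of that pairing idea only. It assumes one label per text, uses made-up toy data, and the helper name make_contrastive_pairs is hypothetical; SetFit's own sampler (and its handling of multilabel targets) differs.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Simplified illustration; not SetFit's actual pair sampler.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so positives and negatives are interleaved.
    random.Random(seed).shuffle(pairs)
    return pairs


# Hypothetical toy data, not taken from this model's training set.
texts = [
    "Airline faces fuel shortage",
    "Carrier hit by fuel supply issue",
    "Startup raises funding round",
    "Company closes Series B",
]
labels = ["supply", "supply", "funding", "funding"]

pairs = make_contrastive_pairs(texts, labels)
num_positive = sum(1 for _, _, target in pairs if target == 1.0)
print(len(pairs), num_positive)
```

Even four labelled texts yield six training pairs, which is one reason this approach suits few-shot settings.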

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("amplyfi/all-mpnet-base-v2_signal-types-training.json_multilabel")
# Run inference
preds = model("Delta Airlines Faces Fuel Supply Issues")
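
Because the head is a OneVsRestClassifier, a prediction for one text is expected to be a vector of 0/1 indicators, one per label. The sketch below shows one way to turn such a vector into label names. The label names and the example row are hypothetical stand-ins, since this card does not list the model's labels, and decode_multilabel is an illustrative helper rather than part of SetFit.

```python
def decode_multilabel(pred_row, label_names):
    """Return the label names whose indicator in pred_row equals 1."""
    if len(pred_row) != len(label_names):
        raise ValueError("prediction length must match the number of labels")
    return [name for name, flag in zip(label_names, pred_row) if int(flag) == 1]


# Hypothetical label set and prediction row, for illustration only.
label_names = ["supply_chain", "financial", "regulatory"]
example_row = [1, 0, 1]

active_labels = decode_multilabel(example_row, label_names)
print(active_labels)
```

In practice, example_row would be replaced by the value returned from model(...) for a single text, and label_names by the label order used during training.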

Training Details

Training Set Metrics

Training set Min Mean Max
Word count 5 11.0995 26

Training Hyperparameters

  • batch_size: (16, 2)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
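
The loss listed above, CosineSimilarityLoss, comes from Sentence Transformers: it takes the cosine similarity of two sentence embeddings and penalises its squared difference from the pair's target (1.0 for similar, 0.0 for dissimilar). The pure-Python sketch below illustrates that computation on toy 2-D vectors; the real loss operates on batches of model embeddings. According to the SetFit documentation, distance_metric and margin apply to triplet-style losses and are ignored for this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings (assumed values, not produced by the model).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical, labelled similar: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, labelled dissimilar: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical, labelled dissimilar: error 1
]

loss = cosine_similarity_loss(toy_pairs)
print(loss)
```

Only the third pair contributes, so the mean over the three pairs is one third.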

Training Results

Epoch Step Training Loss Validation Loss
0.0010 1 0.477 -
0.0524 50 0.274 -
0.1047 100 0.2255 -
0.1571 150 0.1863 -
0.2094 200 0.1642 -
0.2618 250 0.1398 -
0.3141 300 0.1106 -
0.3665 350 0.088 -
0.4188 400 0.082 -
0.4712 450 0.0683 -
0.5236 500 0.0717 -
0.5759 550 0.0653 -
0.6283 600 0.0606 -
0.6806 650 0.0489 -
0.7330 700 0.0507 -
0.7853 750 0.0458 -
0.8377 800 0.0512 -
0.8901 850 0.047 -
0.9424 900 0.0388 -
0.9948 950 0.0414 -
1.0471 1000 0.0351 -
1.0995 1050 0.0383 -
1.1518 1100 0.0335 -
1.2042 1150 0.0325 -
1.2565 1200 0.0328 -
1.3089 1250 0.035 -
1.3613 1300 0.0295 -
1.4136 1350 0.0359 -
1.4660 1400 0.0296 -
1.5183 1450 0.0317 -
1.5707 1500 0.0301 -
1.6230 1550 0.0262 -
1.6754 1600 0.0342 -
1.7277 1650 0.0313 -
1.7801 1700 0.0327 -
1.8325 1750 0.03 -
1.8848 1800 0.0285 -
1.9372 1850 0.027 -
1.9895 1900 0.0277 -
2.0419 1950 0.0251 -
2.0942 2000 0.0302 -
2.1466 2050 0.0237 -
2.1990 2100 0.0233 -
2.2513 2150 0.0256 -
2.3037 2200 0.0246 -
2.3560 2250 0.0266 -
2.4084 2300 0.0322 -
2.4607 2350 0.0259 -
2.5131 2400 0.0262 -
2.5654 2450 0.0257 -
2.6178 2500 0.025 -
2.6702 2550 0.0234 -
2.7225 2600 0.0283 -
2.7749 2650 0.0287 -
2.8272 2700 0.0295 -
2.8796 2750 0.0254 -
2.9319 2800 0.0241 -
2.9843 2850 0.0196 -
3.0366 2900 0.0221 -
3.0890 2950 0.0222 -
3.1414 3000 0.0248 -
3.1937 3050 0.0282 -
3.2461 3100 0.0219 -
3.2984 3150 0.024 -
3.3508 3200 0.0196 -
3.4031 3250 0.0244 -
3.4555 3300 0.0255 -
3.5079 3350 0.0275 -
3.5602 3400 0.0239 -
3.6126 3450 0.0221 -
3.6649 3500 0.0239 -
3.7173 3550 0.0227 -
3.7696 3600 0.0239 -
3.8220 3650 0.0255 -
3.8743 3700 0.0247 -
3.9267 3750 0.0249 -
3.9791 3800 0.0239 -
4.0314 3850 0.0215 -
4.0838 3900 0.022 -
4.1361 3950 0.0206 -
4.1885 4000 0.0224 -
4.2408 4050 0.023 -
4.2932 4100 0.0235 -
4.3455 4150 0.0231 -
4.3979 4200 0.0246 -
4.4503 4250 0.0228 -
4.5026 4300 0.0225 -
4.5550 4350 0.0246 -
4.6073 4400 0.0212 -
4.6597 4450 0.0258 -
4.7120 4500 0.0207 -
4.7644 4550 0.0245 -
4.8168 4600 0.0258 -
4.8691 4650 0.0237 -
4.9215 4700 0.0219 -
4.9738 4750 0.0216 -
5.0262 4800 0.022 -
5.0785 4850 0.022 -
5.1309 4900 0.0187 -
5.1832 4950 0.0227 -
5.2356 5000 0.0212 -
5.2880 5050 0.0183 -
5.3403 5100 0.021 -
5.3927 5150 0.024 -
5.4450 5200 0.021 -
5.4974 5250 0.0227 -
5.5497 5300 0.0253 -
5.6021 5350 0.0229 -
5.6545 5400 0.0265 -
5.7068 5450 0.0198 -
5.7592 5500 0.0252 -
5.8115 5550 0.0242 -
5.8639 5600 0.022 -
5.9162 5650 0.0261 -
5.9686 5700 0.0186 -
6.0209 5750 0.0207 -
6.0733 5800 0.0222 -
6.1257 5850 0.025 -
6.1780 5900 0.0216 -
6.2304 5950 0.0195 -
6.2827 6000 0.0209 -
6.3351 6050 0.0174 -
6.3874 6100 0.0199 -
6.4398 6150 0.0241 -
6.4921 6200 0.0227 -
6.5445 6250 0.0228 -
6.5969 6300 0.0219 -
6.6492 6350 0.0196 -
6.7016 6400 0.0207 -
6.7539 6450 0.02 -
6.8063 6500 0.0232 -
6.8586 6550 0.0218 -
6.9110 6600 0.021 -
6.9634 6650 0.0213 -
7.0157 6700 0.0223 -
7.0681 6750 0.0224 -
7.1204 6800 0.0216 -
7.1728 6850 0.0231 -
7.2251 6900 0.019 -
7.2775 6950 0.0213 -
7.3298 7000 0.0219 -
7.3822 7050 0.0209 -
7.4346 7100 0.0206 -
7.4869 7150 0.0217 -
7.5393 7200 0.0203 -
7.5916 7250 0.0219 -
7.6440 7300 0.0192 -
7.6963 7350 0.0197 -
7.7487 7400 0.0188 -
7.8010 7450 0.0217 -
7.8534 7500 0.02 -
7.9058 7550 0.0224 -
7.9581 7600 0.0232 -
8.0105 7650 0.02 -
8.0628 7700 0.0207 -
8.1152 7750 0.0187 -
8.1675 7800 0.0185 -
8.2199 7850 0.0228 -
8.2723 7900 0.0187 -
8.3246 7950 0.0193 -
8.3770 8000 0.022 -
8.4293 8050 0.024 -
8.4817 8100 0.0186 -
8.5340 8150 0.0218 -
8.5864 8200 0.0169 -
8.6387 8250 0.0234 -
8.6911 8300 0.0218 -
8.7435 8350 0.0206 -
8.7958 8400 0.0229 -
8.8482 8450 0.021 -
8.9005 8500 0.0206 -
8.9529 8550 0.0195 -
9.0052 8600 0.0181 -
9.0576 8650 0.0211 -
9.1099 8700 0.0177 -
9.1623 8750 0.0214 -
9.2147 8800 0.0191 -
9.2670 8850 0.0193 -
9.3194 8900 0.0215 -
9.3717 8950 0.0199 -
9.4241 9000 0.0171 -
9.4764 9050 0.0194 -
9.5288 9100 0.0212 -
9.5812 9150 0.0206 -
9.6335 9200 0.0207 -
9.6859 9250 0.0183 -
9.7382 9300 0.0187 -
9.7906 9350 0.0206 -
9.8429 9400 0.0201 -
9.8953 9450 0.0188 -
9.9476 9500 0.0224 -
10.0 9550 0.0214 -
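
The Epoch column in the table above can be reproduced from the Step column. The final row (epoch 10.0 at step 9550) implies 955 steps per epoch, and with the embedding batch size of 16 that is about 15,280 sentence pairs per epoch. The sketch below works through that arithmetic. The last line additionally assumes that num_iterations = 20 produces 2 × 20 pairs per training example; under that assumption, which this card does not confirm, the training set would hold roughly 382 examples.

```python
# Values read from the hyperparameters and the last row of the results table.
final_step = 9550
final_epoch = 10
embedding_batch_size = 16  # first element of batch_size: (16, 2)
num_iterations = 20

steps_per_epoch = final_step // final_epoch
pairs_per_epoch = steps_per_epoch * embedding_batch_size

# Assumption: each example yields num_iterations positive and
# num_iterations negative pairs, i.e. 2 * num_iterations pairs in total.
implied_examples = pairs_per_epoch // (2 * num_iterations)


def epoch_at(step):
    """Fractional epoch for a given optimisation step, as logged in the table."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, pairs_per_epoch, implied_examples)
print(epoch_at(50), epoch_at(1000))
```

The values returned by epoch_at for steps 50 and 1000 match the logged epochs 0.0524 and 1.0471.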

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.1
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.5.1+cu124
  • Datasets: 3.1.0
  • Tokenizers: 0.21.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}