Model Card for Model ID

Model Details

Model Description

  • Developed by: Declan Bracken, Armando Ordorica, Michael Santorelli, Paul Zhou
  • Model type: Transformer
  • Language(s) (NLP): English
  • Finetuned from model: bert-base-uncased

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

Create a custom class that loads the model, the label encoder, and the BERT tokenizer used for training (bert-base-uncased), as below. Use the tokenizer to tokenize any input string you'd like, then pass the result through the model to get outputs.

    import pickle

    import requests
    import torch
    from transformers import AutoConfig, BertForSequenceClassification, BertTokenizer


    class BERTClassifier:
        def __init__(self, model_identifier):
            # Load the tokenizer used during training (bert-base-uncased)
            self.tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

            # Load the config
            config = AutoConfig.from_pretrained(model_identifier)

            # Load the model
            self.model = BertForSequenceClassification.from_pretrained(model_identifier, config=config)
            self.model.eval()  # Set the model to evaluation mode

            # Load the label encoder (a pickled NumPy array of class labels)
            encoder_url = f'https://huggingface.co/{model_identifier}/resolve/main/model_encoder.pkl'
            self.labels = pickle.loads(requests.get(encoder_url).content)

        def predict_category(self, text):
            # Tokenize the text
            inputs = self.tokenizer(text, return_tensors='pt', truncation=True, padding=True)

            # Predict
            with torch.no_grad():
                outputs = self.model(**inputs)

            # Get the index of the highest-scoring class
            prediction_idx = torch.argmax(outputs.logits, dim=1).item()

            # Decode the prediction index to get the label
            prediction_label = self.labels[prediction_idx]  # Use indexing for a NumPy array

            return prediction_label