# Model Card for Hridayam

## Model Details

### Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** Rohan Pashikanti, Sushma Uppu, Srija Patha, Sreemai Annam
- **Model type:** Text2Text Generation
- **Language(s) (NLP):** English
- **Finetuned from model:** [facebook/blenderbot-400M-distill](https://huggingface.co/facebook/blenderbot-400M-distill)
### Model Sources

- **Repository:** [rdwdaww/Hridayam](https://huggingface.co/rdwdaww/Hridayam)
- **Paper:** [More Information Needed]
- **Demo:** [More Information Needed]
## Uses

### Direct Use

This model performs text-to-text generation and can be used directly for chatbot-style response generation, summarization, and other tasks that transform an input text into an output text.
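For quick experimentation, the high-level `pipeline` API can wrap the model. This is a minimal sketch, assuming the Hub repository id `rdwdaww/Hridayam` (taken from this card's Hub listing) and the `text2text-generation` task:

```python
from transformers import pipeline

# Load the model through the high-level text2text-generation pipeline.
generator = pipeline("text2text-generation", model="rdwdaww/Hridayam")

result = generator("What are common symptoms of dehydration?")
print(result[0]["generated_text"])
```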
### Downstream Use
The model could be further fine-tuned on specific datasets to tailor it to specialized medical, legal, or technical domains, enhancing its utility in domain-specific applications.
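As a rough illustration of such domain adaptation, the sketch below fine-tunes the model with the `Seq2SeqTrainer` API. The file `domain_corpus.csv` and its `input`/`target` columns are hypothetical placeholders, and the hyperparameter values simply reuse those listed under Training Hyperparameters below:

```python
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("rdwdaww/Hridayam")
model = AutoModelForSeq2SeqLM.from_pretrained("rdwdaww/Hridayam")

# Hypothetical domain corpus with "input" and "target" text columns.
dataset = load_dataset("csv", data_files="domain_corpus.csv")

def tokenize(batch):
    # BlenderBot's context window is short, so truncate aggressively.
    enc = tokenizer(batch["input"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["target"], truncation=True, max_length=128)
    enc["labels"] = labels["input_ids"]
    return enc

tokenized = dataset.map(tokenize, batched=True, remove_columns=["input", "target"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="hridayam-domain",
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        num_train_epochs=4,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```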
### Out-of-Scope Use
The model is not suitable for:
- Generating legally binding advice without supervision.
- High-stakes decision-making without human oversight.
## Bias, Risks, and Limitations
The model may inherit biases from the training data or exhibit unpredictable behavior when presented with text outside the scope of its training data.
### Recommendations
Users should validate the model's output for accuracy and bias, especially when used in sensitive contexts. Regular updates and monitoring are recommended to mitigate potential risks.
## How to Get Started with the Model

Use the code below to get started with the model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the tokenizer and model from the Hub repository.
tokenizer = AutoTokenizer.from_pretrained("rdwdaww/Hridayam")
model = AutoModelForSeq2SeqLM.from_pretrained("rdwdaww/Hridayam")

# Tokenize an input, generate a response, and decode it back to text.
input_text = "Example input text"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(inputs["input_ids"])
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Details

### Training Data
The model was fine-tuned on a diverse dataset curated to include a wide range of medical texts, ensuring broad coverage of medical terminology and concepts.
### Training Procedure

#### Preprocessing

Data was preprocessed to remove personally identifiable information (PII) and to normalize medical terminology.
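The exact cleaning pipeline is not published with this card; the sketch below only illustrates the general idea, using hypothetical regex patterns and a toy terminology map:

```python
import re

# Illustrative PII patterns only; the actual pipeline is not documented here.
PII_PATTERNS = {
    r"\b[\w.+-]+@[\w-]+\.[\w.]+\b": "[EMAIL]",
    r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b": "[PHONE]",
}

# Hypothetical normalization map for medical shorthand.
TERM_MAP = {"MI": "myocardial infarction", "BP": "blood pressure"}

def preprocess(text: str) -> str:
    for pattern, token in PII_PATTERNS.items():
        text = re.sub(pattern, token, text)
    for abbrev, full in TERM_MAP.items():
        text = re.sub(rf"\b{abbrev}\b", full, text)
    return text

print(preprocess("Contact dr.smith@example.com re: MI at 555-123-4567"))
```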
#### Training Hyperparameters

- **Optimizer:** AdamW
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 4
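For readers who prefer plain PyTorch over the `Trainer` abstraction, these values map directly onto a standard loop. In this sketch, `model`, `train_dataset`, and `collator` are assumed to come from a setup like the fine-tuning example above:

```python
import torch
from torch.utils.data import DataLoader

# Mirror the listed hyperparameters in a plain PyTorch training loop;
# `model`, `train_dataset`, and `collator` are assumed to exist already.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn=collator)

model.train()
for epoch in range(4):
    for batch in loader:
        optimizer.zero_grad()
        loss = model(**batch).loss  # seq2seq LM loss computed from the labels
        loss.backward()
        optimizer.step()
```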
#### Speeds, Sizes, Times

- **Training time:** approximately 72 hours on NVIDIA V100 GPUs.
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out dataset, unseen during training, to measure how well it generalizes to new data.
#### Factors
Performance was disaggregated by various medical subfields to ensure broad competency across different types of medical texts.
#### Metrics

- Accuracy
- F1 score
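The card does not document how generated text is mapped onto labels for these metrics. Assuming label-style references and predictions are available, they could be computed with scikit-learn as in this sketch:

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical reference and predicted labels for held-out examples.
references = [1, 0, 1, 1, 0]
predictions = [1, 0, 0, 1, 0]

print("Accuracy:", accuracy_score(references, predictions))
print("F1 score:", f1_score(references, predictions))
```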
### Results
The model achieved an accuracy of 85% and an F1 score of 0.82 on the testing dataset.
## Environmental Impact

- **Hardware type:** NVIDIA Tesla V100
- **Hours used:** 72
- **Cloud provider:** AWS
- **Compute region:** US East
- **Carbon emitted:** ~75 kg CO2eq (estimated)
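Figures like these are typically estimates; one way to produce them is to wrap the training run in a tracker such as `codecarbon`. This is a sketch of that approach, not necessarily how the number above was obtained:

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="hridayam-finetune")
tracker.start()
# ... run the training loop here ...
emissions_kg = tracker.stop()  # estimated emissions in kg CO2eq
print(f"Estimated emissions: {emissions_kg:.2f} kg CO2eq")
```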
## Technical Specifications

### Model Architecture and Objective
The model is based on the BlenderBot architecture, adapted for sequence-to-sequence tasks, focusing on generating coherent and contextually appropriate text responses.
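To see the concrete encoder-decoder dimensions inherited from the base checkpoint, the configuration can be loaded directly; this sketch only reads public metadata from the Hub:

```python
from transformers import AutoConfig

# Inspect the seq2seq configuration of the base BlenderBot checkpoint.
config = AutoConfig.from_pretrained("facebook/blenderbot-400M-distill")
print(config.model_type)                             # architecture family
print(config.encoder_layers, config.decoder_layers)  # layer counts
print(config.d_model)                                # hidden size
```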
### Compute Infrastructure

#### Hardware

Training ran on NVIDIA Tesla V100 GPUs, which are optimized for deep learning workloads.
#### Software

The model was trained with PyTorch and Hugging Face's Transformers library.
## Citation

**BibTeX:**

```bibtex
@article{HridayamModel,
  title   = {Hridayam: A Fine-Tuned BlenderBot Model for Medical Text Generation},
  author  = {Pashikanti, Rohan and Uppu, Sushma and Patha, Srija and Annam, Sreemai},
  year    = {2025},
  journal = {Hugging Face Model Hub}
}
```

**APA:**

Pashikanti, R., Uppu, S., Patha, S., & Annam, S. (2025). *Hridayam: A Fine-Tuned BlenderBot Model for Medical Text Generation*. Hugging Face Model Hub.