Model Card for Hridayam

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Rohan Pashikanti, Sushma Uppu, Srija Patha, Sreemai Annam
  • Model type: Text2Text Generation
  • Language(s) (NLP): English
  • Finetuned from model: facebook/blenderbot-400M-distill

Uses

Direct Use

This model is designed for text-to-text generation tasks such as chatbot responses, summarization, and other NLP tasks that transform text input into text output.
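
For quick experimentation, a minimal sketch using the transformers pipeline API (assuming the full Hub repo id rdwdaww/Hridayam; adjust for your own copy of the model):

from transformers import pipeline

# Text2Text Generation pipeline; the repo id is the namespaced form of
# this model's Hub id and may differ for mirrors or local copies.
generator = pipeline("text2text-generation", model="rdwdaww/Hridayam")

reply = generator("I have been feeling dizzy after my morning run.")
print(reply[0]["generated_text"])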

Downstream Use

The model could be further fine-tuned on specific datasets to tailor it to specialized medical, legal, or technical domains, enhancing its utility in domain-specific applications.
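
A minimal sketch of such downstream fine-tuning with Seq2SeqTrainer, assuming a hypothetical CSV corpus with input_text and target_text columns (see Training Hyperparameters below for the values used for Hridayam itself):

from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("rdwdaww/Hridayam")
model = AutoModelForSeq2SeqLM.from_pretrained("rdwdaww/Hridayam")

def tokenize(batch):
    # "input_text" / "target_text" are hypothetical column names;
    # rename them to match your own domain dataset.
    enc = tokenizer(batch["input_text"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(text_target=batch["target_text"],
                              truncation=True, max_length=128)["input_ids"]
    return enc

# "domain_corpus.csv" is a placeholder for your domain-specific data.
dataset = load_dataset("csv", data_files="domain_corpus.csv")["train"]
trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="hridayam-domain"),
    train_dataset=dataset.map(tokenize, batched=True,
                              remove_columns=dataset.column_names),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()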

Out-of-Scope Use

The model is not suitable for:

  • Providing medical or legal advice without professional supervision.
  • High-stakes decision-making without human oversight.

Bias, Risks, and Limitations

The model may inherit biases from the training data or exhibit unpredictable behavior when presented with text outside the scope of its training data.

Recommendations

Users should validate the model's output for accuracy and bias, especially when used in sensitive contexts. Regular updates and monitoring are recommended to mitigate potential risks.

How to Get Started with the Model

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load from the full Hub repo id; the bare name "Hridayam" would only
# resolve against a local directory, not the Hub.
tokenizer = AutoTokenizer.from_pretrained("rdwdaww/Hridayam")
model = AutoModelForSeq2SeqLM.from_pretrained("rdwdaww/Hridayam")

input_text = "Example input text"
inputs = tokenizer(input_text, return_tensors="pt")

# Pass both input_ids and attention_mask via **inputs, then decode,
# stripping special tokens before printing.
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
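
Greedy decoding as above can yield terse replies from BlenderBot-derived models; a variant with beam search and an explicit length budget (these particular values are illustrative, not taken from the card):

outputs = model.generate(**inputs, num_beams=4, min_length=10, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))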

Training Details

Training Data

The model was fine-tuned on a diverse dataset curated to include a wide range of medical texts, ensuring broad coverage of medical terminology and concepts.

Training Procedure

Preprocessing

Data was preprocessed to remove personally identifiable information and to normalize medical terminology.
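
The card does not detail these steps, so the following is only an illustrative sketch of the two operations it names, with hypothetical regexes and a hypothetical synonym table; production de-identification normally relies on dedicated tools:

import re

def scrub_pii(text: str) -> str:
    # Replace obvious identifiers with placeholder tokens.
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)
    return text

def normalize_terms(text: str, synonyms: dict) -> str:
    # `synonyms` maps variant spellings to a canonical medical term,
    # e.g. {"heart attack": "myocardial infarction"} (hypothetical).
    for variant, canonical in synonyms.items():
        text = re.sub(rf"\b{re.escape(variant)}\b", canonical, text,
                      flags=re.IGNORECASE)
    return text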

Training Hyperparameters

  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 16
  • Epochs: 4
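
Expressed as transformers Seq2SeqTrainingArguments (AdamW is the library's default optimizer; output_dir is a placeholder):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="hridayam-finetune",   # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=4,
)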

Speeds, Sizes, Times

  • Training time: Approximately 72 hours on V100 GPUs.

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a held-out dataset, unseen during training, to measure how well it generalizes to new data.

Factors

Performance was disaggregated by various medical subfields to ensure broad competency across different types of medical texts.

Metrics

  • Accuracy
  • F1 Score
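
The card does not state how free-form generations are scored against references; assuming exact string match as the simplest reading, a sketch with scikit-learn (the example strings are hypothetical):

from sklearn.metrics import accuracy_score, f1_score

references = ["rest and hydrate", "see a cardiologist"]   # hypothetical targets
predictions = ["rest and hydrate", "take an aspirin"]     # hypothetical outputs

accuracy = accuracy_score(references, predictions)
# Weighted F1 treats each distinct reference string as a class label.
f1 = f1_score(references, predictions, average="weighted")
print(accuracy, f1)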

Results

The model achieved an accuracy of 85% and an F1 score of 0.82 on the testing dataset.

Environmental Impact

  • Hardware Type: NVIDIA Tesla V100
  • Hours used: 72
  • Cloud Provider: AWS
  • Compute Region: US East
  • Carbon Emitted: Estimated 75 kg CO2eq
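
A back-of-envelope reproduction of that estimate in the style of the ML CO2 Impact methodology; the GPU count and grid intensity below are assumptions (neither is stated on this card), chosen only to show how such a figure is derived:

tdp_kw = 0.30            # NVIDIA Tesla V100 TDP, ~300 W
hours = 72               # from this card
num_gpus = 8             # assumption: not stated on the card
grid_kg_per_kwh = 0.43   # illustrative carbon intensity for AWS US East

energy_kwh = tdp_kw * hours * num_gpus   # 172.8 kWh
co2_kg = energy_kwh * grid_kg_per_kwh    # ~74 kg CO2eq, near the reported 75 kg
print(f"{co2_kg:.0f} kg CO2eq")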

Technical Specifications

Model Architecture and Objective

The model is based on the BlenderBot architecture (roughly 365M parameters, stored as FP32 safetensors), adapted for sequence-to-sequence tasks and focused on generating coherent, contextually appropriate text responses.

Compute Infrastructure

Hardware

Training used NVIDIA Tesla V100 GPUs, optimized for deep learning workloads.

Software

Trained using the PyTorch framework with Hugging Face's Transformers library.

Citation

BibTeX:

@article{HridayamModel,
  title={Hridayam: A Fine-Tuned BlenderBot Model for Medical Text Generation},
  author={Pashikanti, Rohan and Uppu, Sushma and Patha, Srija and Annam, Sreemai},
  year={2025},
  journal={Hugging Face Model Hub}
}

APA:

Pashikanti, R., Uppu, S., Patha, S., & Annam, S. (2025). Hridayam: A Fine-Tuned BlenderBot Model for Medical Text Generation. Hugging Face Model Hub.
