Reduce memory for inference

#28

Remove gradient calculation

Hugging Face TB Research org

Hi there! .generate already runs under torch.no_grad, so no gradients are computed during generation and this change is not necessary. See https://github.com/huggingface/transformers/blob/bc9a6d8302334ae08d505437ab3f361af777956c/src/transformers/generation/utils.py#L1879 for more information.
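To illustrate the point, here is a minimal sketch in plain PyTorch. TinyModel and its generate method are hypothetical stand-ins, not the transformers implementation; the sketch only assumes that the method is wrapped in torch.no_grad (here via the decorator form), which is what the comment above describes.

```python
import torch


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for a model with a generate method."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    @torch.no_grad()  # gradient tracking is disabled for the whole call
    def generate(self, x):
        return self.linear(x)


model = TinyModel()
x = torch.randn(1, 4)

# Calling generate directly: no autograd graph is built.
out_plain = model.generate(x)

# Wrapping the call in another no_grad block changes nothing.
with torch.no_grad():
    out_wrapped = model.generate(x)

# For contrast, a forward pass outside no_grad does track gradients.
tracked = model.linear(x)

print(out_plain.requires_grad, out_wrapped.requires_grad, tracked.requires_grad)
```

Under these assumptions, an extra torch.no_grad block around the call is redundant: the output is the same, and no activations are kept for a backward pass in either case.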

Xenova changed pull request status to closed
