Is it possible to fine-tune the 7b1 model on 4 A100 (80G) GPUs?
#44
by TTCoding - opened
I have tried many configurations to fine-tune 7b1 on four A100s, but unfortunately I get OOM every time. So I am curious about the minimal GPU requirements to fine-tune this model. Could you share your experience?
If you freeze some layers, it is possible even on a single A100.
Check this: https://gitlab.inria.fr/synalp/plm4all/-/tree/main/finetune_accelerate
It is still a draft, but it's running.
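For reference, here is a minimal sketch of the layer-freezing idea, assuming the `bigscience/bloom-7b1` checkpoint and the standard `transformers` API; the checkpoint name, the `model.transformer.h` attribute path (BLOOM architecture), and the number of blocks left trainable are illustrative assumptions, not the exact setup used in the linked repo:

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative checkpoint name; adjust to the 7b1 variant you are fine-tuning.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",
    torch_dtype=torch.bfloat16,
)

# Freeze everything first, then unfreeze only the last few transformer blocks.
# (Keeping 4 blocks trainable is an arbitrary example.)
for param in model.parameters():
    param.requires_grad = False

for block in model.transformer.h[-4:]:
    for param in block.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```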
@hatimbr
Hi, how can I convert the fp32 model to fp16, and can I then fine-tune it in fp16 on a 24 GB RTX 3090 Ti? Also, is this model fp32 or quantized?
Hi @redauzhang, you can pass the parameter torch_dtype=torch.float16 (or, even better, torch_dtype=torch.bfloat16) to the from_pretrained method.
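For example, a minimal sketch of loading the checkpoint in half precision; the model name is an assumption and should be replaced with the 7b1 variant you are actually using:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint name.
model_name = "bigscience/bloom-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# torch_dtype casts the fp32 weights to half precision at load time,
# roughly halving the memory footprint of the weights.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # or torch.float16 on GPUs without bf16 support
)
```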
christopher changed discussion status to closed