This is very useful. Thank you for making it. But it's very large for 3.3B, since it's the full model. I would love to know whether quantization is possible, and whether it affects quality too much for this type of model.