Hi, I'm training the deepseek-ai/deepseek-vl2 model and found that the default top_k method is noaux_tc. However, line 468 in modeling_deepseek.py indicates that noaux_tc is not supported for training. Why is that?