Michael Goin PRO
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
- upvoted a paper 1 day ago: QuEST: Stable Training of LLMs with 1-Bit Weights and Activations
- updated a model 5 days ago: nm-testing/pixtral-12b-FP8-dynamic-all
- updated a model 5 days ago: neuralmagic/pixtral-12b-FP8-dynamic
mgoin's activity
- compressed-tensors MLA support requires fp8 activations and weights in group 'group_0' (2) · #1 opened 8 days ago by samos123
- How to load this model? (2) · #1 opened 7 months ago by Frz614
- Model does not run with VLLM (2) · #3 opened about 2 months ago by aswad546
- Nice model, any info on scripts used to quantize? (1) · #1 opened 2 months ago by RonanMcGovern
- Add config_format and load_format to vLLM args · #5 opened 3 months ago by mgoin
- Update config.json to use null for sliding_window · #4 opened 3 months ago by mgoin
- Adding `safetensors` variant of this model · #1 opened 3 months ago by SFconvertbot
- Is this the standard GPTQ quantization? (1) · #5 opened 3 months ago by molereddy
- Model weights are not loaded (4) · #3 opened 6 months ago by MarvelousMouse
- Update model card · #1 opened 3 months ago by nm-research
- Add chat_template to tokenizer_config.json · #1 opened 3 months ago by nm-research
- Why is the Pixtral activation function "gelu" when the reference code uses "silu"? (2) · #10 opened 4 months ago by mgoin
- Update tokenizer_config.json with chat_template (3) · #11 opened 4 months ago by mgoin
- Any chance your team is working on a 4-bit Llama-3.2-90B-Vision-Instruct-quantized.w4a16 version? (1) · #1 opened 5 months ago by mrhendrey
- Oom with 24g vram (3) · #1 opened 5 months ago by Klopez
- latest vllm docker (v0.6.2) fail to load (2) · #1 opened 4 months ago by choronz333
- Issue with loading model (1) · #1 opened 5 months ago by xSumukhax