Models (20)

Active filters: quark

fxmarty/llama-tiny-testing-quark-indev

Updated Oct 3, 2024 • 5

fxmarty/llama-tiny-int4-per-group-sym

Updated Oct 25, 2024 • 9

fxmarty/llama-tiny-w-fp8-a-fp8

Updated Oct 22, 2024 • 6

fxmarty/llama-tiny-w-fp8-a-fp8-o-fp8

Updated Oct 22, 2024 • 5

fxmarty/llama-tiny-w-int8-per-tensor

Updated Oct 22, 2024 • 7

fxmarty/llama-small-int4-per-group-sym-awq

Updated Oct 29, 2024 • 6

fxmarty/quark-legacy-int8

Updated Oct 10, 2024 • 28

fxmarty/llama-tiny-w-int8-b-int8-per-tensor

Updated Oct 22, 2024 • 8

fxmarty/llama-small-int4-per-group-sym-awq-old

Updated Oct 25, 2024 • 11

amd-quark/llama-tiny-w-int8-per-tensor

Updated Dec 18, 2024 • 223

amd-quark/llama-tiny-w-int8-b-int8-per-tensor

Updated Dec 18, 2024 • 224

amd-quark/llama-tiny-w-fp8-a-fp8

Updated Dec 18, 2024 • 223

amd-quark/llama-tiny-w-fp8-a-fp8-o-fp8

Updated Dec 18, 2024 • 223

amd-quark/llama-tiny-int4-per-group-sym

Updated Dec 18, 2024 • 223

amd-quark/llama-small-int4-per-group-sym-awq

Updated Dec 18, 2024 • 225

amd-quark/quark-legacy-int8

Updated Dec 18, 2024 • 132

amd/Llama-3.1-8B-Instruct-FP8-KV-Quark-test

Updated Jan 7 • 1.15k

amd/Llama-3.1-8B-Instruct-w-int8-a-int8-sym-test

Updated Jan 7 • 49

EmbeddedLLM/Llama-3.1-8B-Instruct-w_fp8_per_channel_sym

Text Generation • Updated 20 days ago • 6

amd-quark/llama-tiny-fp8-quark-quant-method

Updated 4 days ago • 67