compressa-ai/Llama-3-8B-Instruct-OmniQuant

Tags: Text Generation, text-generation-inference, Inference Endpoints, 4-bit precision



2 contributors

History: 7 commits

Latest commit: Vasily Alexeev, "add two stop toks in gen config" (5413035), 10 months ago

File                              Size       LFS  Last commit                                                  Updated
.gitattributes                    1.52 kB    -    initial commit                                               10 months ago
README.md                         6.96 kB    -    add asymm quantized model, add two eos in code sample        10 months ago
compressa-config.json             732 Bytes  -    add asymm quantized model, add two eos in code sample        10 months ago
config.json                       898 Bytes  -    add asymm quantized model, add two eos in code sample        10 months ago
generation_config.json            131 Bytes  -    add two stop toks in gen config                              10 months ago
model-00001-of-00002.safetensors  4.68 GB    LFS  add asymm quantized model, add two eos in code sample        10 months ago
model-00002-of-00002.safetensors  1.05 GB    LFS  add model weights and stuff                                  10 months ago
model.safetensors.index.json      78.5 kB    -    add model weights and stuff                                  10 months ago
quant_config.json                 64 Bytes   -    add asymm quantized model, add two eos in code sample        10 months ago
special_tokens_map.json           301 Bytes  -    add model weights and stuff                                  10 months ago
tokenizer.json                    9.08 MB    -    add model weights and stuff                                  10 months ago
tokenizer_config.json             51.4 kB    -    kinda fix eos token to stop model from chatting with itself  10 months ago