# Quantized Model Information
This repository is an 'FP8-Dynamic' quantized version of meta-llama/Llama-3.3-70B-Instruct, originally released by Meta AI.

For usage instructions, please refer to the original model card for meta-llama/Llama-3.3-70B-Instruct.
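The 'FP8-Dynamic' scheme computes quantization scales on the fly (e.g. per token for activations) instead of relying on fixed calibration scales. A minimal NumPy sketch of the idea follows; it simulates the 8-bit value grid with rounded float32 rather than real FP8 E4M3 numbers, so it illustrates only the dynamic-scaling step, not an actual kernel:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize_fp8_dynamic(x):
    """Sketch of per-row (per-token) dynamic quantization.

    Each row is scaled so that its max |value| maps onto the FP8 E4M3
    range, then rounded. Real FP8 kernels round to actual 8-bit floats;
    here we approximate with a uniform integer grid in float32.
    """
    scale = np.abs(x).max(axis=-1, keepdims=True) / FP8_E4M3_MAX
    scale = np.maximum(scale, 1e-12)            # guard against all-zero rows
    q = np.round(x / scale)                     # simulated quantized values
    q = np.clip(q, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale

def dequantize(q, scale):
    return q * scale

# Round-trip a random activation tensor and measure the error.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)
q, scale = quantize_fp8_dynamic(x)
x_hat = dequantize(q, scale)
err = np.abs(x - x_hat).max()
```

Because the scale is derived from each row's own maximum, no calibration dataset is needed, which is what makes the scheme "dynamic".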
## Performance

All benchmarks were run using the LLM Evaluation Harness.
| Benchmark | Metric | Llama-3.3-70B-Instruct-FP8-Dynamic | Llama-3.3-70B-Instruct (base) | Recovery |
|---|---|---|---|---|
| mmlu | - | xx | xx | xx |
| | xx | xx | xx | |
| hellaswag | acc | 65.69 | - | |
| | acc_stderr | 0.47 | - | |
| | acc_norm | 84.36 | - | |
| | acc_stderr | 0.36 | - | |
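Scores like those above can be reproduced with the LLM Evaluation Harness command-line tool. A sketch, assuming the `lm-evaluation-harness` and `vllm` packages are installed and the machine has enough GPU memory for the 70B FP8 weights:

```shell
# Evaluate this FP8 checkpoint on HellaSwag via the vLLM backend
lm_eval \
  --model vllm \
  --model_args pretrained=just-add-ai/Llama-3.3-70B-Instruct-FP8-Dynamic \
  --tasks hellaswag \
  --batch_size auto
```

Exact scores may differ slightly across harness versions and backends, so compare against the base model under identical settings when computing recovery.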
## Model tree for just-add-ai/Llama-3.3-70B-Instruct-FP8-Dynamic

- Base model: meta-llama/Llama-3.1-70B
- Finetuned: meta-llama/Llama-3.3-70B-Instruct