[Figure: Needle in a Haystack evaluation heatmap]

Model Card for Model ID

This model is a merge of the following models:

DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1 - 66%
meta-llama/Meta-Llama-3-8B-Instruct - 16%
DataGuard/pali-8B-v0.4.3 - 16%

The embedding, norm, and head layers are taken unchanged from DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1.
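
The card does not say which merge method was used, so the following is only a minimal sketch of one possible reading: a plain weighted average of matching tensors, with embedding, norm, and head tensors copied from the DiscoLeo model. The names `merge_state_dicts`, `is_passthrough`, and `PASSTHROUGH_MARKERS` are hypothetical, the name substrings are assumed from Llama-style checkpoints, and tensors are represented as flat lists of floats to keep the example self-contained. Because the listed percentages sum to 98%, the sketch normalises by their total, which is also an assumption.

```python
# Hypothetical sketch: assumes a simple linear (weighted-average) merge.
# The actual merge method is not stated in the card.

BASE = "DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1"

# Proportions as listed in the card (they sum to 0.98, not 1.0).
WEIGHTS = {
    BASE: 0.66,
    "meta-llama/Meta-Llama-3-8B-Instruct": 0.16,
    "DataGuard/pali-8B-v0.4.3": 0.16,
}

# Assumed substrings identifying embedding, norm and head tensors
# in Llama-style parameter names.
PASSTHROUGH_MARKERS = ("embed_tokens", "norm", "lm_head")


def is_passthrough(name):
    """Return True if the tensor should be copied from the base model."""
    return any(marker in name for marker in PASSTHROUGH_MARKERS)


def merge_state_dicts(state_dicts, weights=WEIGHTS, base=BASE):
    """Merge {model_name: {tensor_name: [floats]}} into one state dict."""
    total = sum(weights.values())  # normalise so the weights sum to 1
    merged = {}
    for name, base_tensor in state_dicts[base].items():
        if is_passthrough(name):
            # Embedding, norm and head tensors come from the base unchanged.
            merged[name] = list(base_tensor)
            continue
        # Element-wise weighted average across all source models.
        merged[name] = [
            sum(weights[m] * state_dicts[m][name][i] for m in weights) / total
            for i in range(len(base_tensor))
        ]
    return merged
```

With real checkpoints the same loop would run over framework tensors rather than Python lists, and a dedicated merging tool would normally handle loading and sharding.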

Format: Safetensors
Model size: 8.03B params
Tensor type: BF16

Evaluation results (all self-reported)

judge_match on squad_answerable: 0.639
judge_match on context_has_answer: 0.86
judge_match on jail_break: 0.099
judge_match on harmless_prompt: 0.926
judge_match on harmful_prompt: 0.689
acc on truthfulqa: 0.522
exact_match on gsm8k: 0.616
acc on mmlu: 0.634
