---
license: llama3.1
datasets:
- BAAI/Infinity-Instruct
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
---

We prune Llama-3.1-8B-Instruct down to 1.4B parameters and fine-tune it with the LLM-Neo method, which combines LoRA and knowledge distillation (KD) in a single procedure. The training data consists of 1 million lines sampled from BAAI/Infinity-Instruct.
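
For reference, here is a minimal usage sketch with the `transformers` library. The repository id is a placeholder assumption (adjust it to the actual published path), and we assume the tokenizer retains the Llama-3.1 chat template of the base model.

```python
# Minimal usage sketch; the repo id below is a placeholder assumption,
# replace it with the actual Hugging Face path of this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<your-namespace>/Llama3.1-Neo-1B-100w"  # placeholder, assumed path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumes the tokenizer keeps the Llama-3.1 chat template of the base model.
messages = [{"role": "user", "content": "Briefly explain knowledge distillation."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```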

## Benchmarks

In this section, we report the results for Llama3.1-Neo-1B-100w (the "100w" suffix denotes the 1 million training samples) on standard automatic benchmarks. For all evaluations, we use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
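
The table below can be regenerated with the harness's Python API, as sketched here; the task names correspond to the 0-shot settings reported in the table, and the pretrained path is again a placeholder assumption.

```python
# Sketch of re-running the 0-shot evaluations with lm-evaluation-harness
# (pip install lm-eval); the pretrained path is a placeholder assumption.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<your-namespace>/Llama3.1-Neo-1B-100w",
    tasks=["arc_challenge", "arc_easy", "ceval-valid", "mmlu", "piqa", "winogrande"],
    num_fewshot=0,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```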

### Evaluation results

<table>
  <tr>
    <td><strong>Category</strong></td>
    <td><strong>Benchmark</strong></td>
    <td><strong>Version</strong></td>
    <td><strong>n-shot</strong></td>
    <td><strong>Metric</strong></td>
    <td><strong>Value</strong></td>
    <td><strong>Stderr</strong></td>
  </tr>
  <tr>
    <td rowspan="2">ARC</td>
    <td>ARC-Challenge</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.1920</td>
    <td>± 0.0115</td>
  </tr>
  <tr>
    <td>ARC-Easy</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.3834</td>
    <td>± 0.0100</td>
  </tr>
  <tr>
    <td rowspan="3">CEVAL</td>
    <td>CEVAL (valid)</td>
    <td>N/A</td>
    <td>0</td>
    <td>acc</td>
    <td>0.2370</td>
    <td>± 0.0117</td>
  </tr>
  <tr>
    <td>CEVAL (Accountant)</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.2449</td>
    <td>± 0.0621</td>
  </tr>
  <tr>
    <td>CEVAL (Advanced Mathematics)</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.3158</td>
    <td>± 0.1096</td>
  </tr>
  <tr>
    <td rowspan="2">MMLU</td>
    <td>MMLU</td>
    <td>N/A</td>
    <td>0</td>
    <td>acc</td>
    <td>0.2439</td>
    <td>± 0.0036</td>
  </tr>
  <tr>
    <td>MMLU (Abstract Algebra)</td>
    <td>0</td>
    <td>0</td>
    <td>acc</td>
    <td>0.2500</td>
    <td>± 0.0435</td>
  </tr>
  <tr>
    <td rowspan="2">PIQA</td>
    <td>PIQA</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.5843</td>
    <td>± 0.0115</td>
  </tr>
  <tr>
    <td>PIQA (Normalized)</td>
    <td>1</td>
    <td>0</td>
    <td>acc_norm</td>
    <td>0.5822</td>
    <td>± 0.0115</td>
  </tr>
  <tr>
    <td>Winogrande</td>
    <td>Winogrande</td>
    <td>1</td>
    <td>0</td>
    <td>acc</td>
    <td>0.5249</td>
    <td>± 0.0140</td>
  </tr>
</table>