
Quantization made by Richard Erkhov.

Github | Discord | Request more models

Llama-3.2-1B-it-chinese-kyara - AWQ

Original model description:

library_name: transformers
license: cc-by-nc-4.0
language:
- en
- zh
base_model:
- meta-llama/Llama-3.2-1B-Instruct
pipeline_tag: text-generation

Kyara: Knowledge Yielding Adaptive Retrieval Augmentation for LLM Fine-tuning


🤗 Hugging Face | 🚀 Github | 📑 Paper | 📖 English | 📖 Chinese | 💻 Kaggle Notebook


Kyara (Knowledge Yielding Adaptive Retrieval Augmentation) is an experimental project aimed at improving language models through knowledge-retrieval processes. The project seeks to enhance the model's ability to adapt knowledge and improve language comprehension, particularly in underrepresented languages such as Traditional Chinese. Because Traditional Chinese data is scarce relative to the vast English corpora used for model training, Kyara addresses this gap by expanding the limited corpus for the language.

This is a preview model, with the stable version set to be released soon.
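The card does not include a usage snippet. Below is a minimal sketch of loading the AWQ checkpoint with the `transformers` library; the repo id is a placeholder assumption (it is not stated on this card), and AWQ inference additionally requires the `autoawq` package and a CUDA GPU.

```python
# Sketch only: loading an AWQ-quantized chat model with transformers.
# The repo id below is an ASSUMED placeholder, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RichardErkhov/Llama-3.2-1B-it-chinese-kyara-awq"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# transformers detects the AWQ quantization config stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "請簡單介紹台灣。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The chat template applied here is the one bundled with the base Llama-3.2-1B-Instruct tokenizer.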

Benchmark

All evaluations are conducted in a zero-shot setting.

| Metric | Kyara-1b-it | Llama3.2-1b-it |
|--------|-------------|----------------|
| TMMLUPlus | 31.92 | 30.48 |
| &emsp;- STEM | 32.56 | 29.74 |
| &emsp;- Humanities | 30.60 | 29.89 |
| &emsp;- Other | 31.08 | 30.32 |
| &emsp;- Social-Science | 33.42 | 31.98 |
| MMLU-Redux | 41.40 | 19.62⁺ |
| GSM8K | 31.31 | 31.61 |
| MATH-L5 | 5.55 | 2.91 |
| CRUX | 14 | 11 |
| AlpacaEval | 10.79 | 7.39 |

โบ: Llama3.2-1b-it appears to have failed to follow the output schema of ZeroEval on MMLU, with 45.28% of examples lacking answers, which has resulted in a lower MMLU score.

Downloads last month: 5
Model size: 393M params (Safetensors) · Tensor types: I32, FP16