---
license: llama3.1
datasets:
- agentlans/crash-course
base_model:
- agentlans/Llama3.1-SuperDeepFuse
model-index:
- name: Llama3.1-SuperDeepFuse-CrashCourse12K
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 71.87
      name: averaged accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 31.83
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 17.67
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.39
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.6
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 29.24
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K
      name: Open LLM Leaderboard
---

# Llama3.1-SuperDeepFuse-CrashCourse12K

Llama3.1-SuperDeepFuse-CrashCourse12K is an 8B-parameter language model based on [Llama3.1-SuperDeepFuse](https://huggingface.co/agentlans/Llama3.1-SuperDeepFuse) and further fine-tuned on [agentlans/crash-course](https://huggingface.co/datasets/agentlans/crash-course).

## Model Details

- **Base Model**: Llama3.1-SuperDeepFuse (8B parameters)
- **Fine-tuning Dataset**: 12,000 samples from agentlans/crash-course (drawn from 10 high-quality instruct datasets)
- **Model Type**: Instruction-tuned language model
- **Language(s)**: Multilingual
- **License**: Llama 3.1 Community License

## Training Procedure

### Fine-tuning

- **Method**: LoRA (Low-Rank Adaptation)
- **Optimizer**: AdamW
- **Learning Rate**: 5e-5
- **Batch Size**: 2 per device
- **Gradient Accumulation Steps**: 8 (effective batch size of 16 per device)
- **Training Epochs**: 1
- **Max Sequence Length**: 2048
- **LoRA Configuration**:
  - Rank: 8
  - Alpha: 16
  - Dropout: 0.5
  - Target: all layers
- **Quantization**: 4-bit (bitsandbytes)
- **Precision**: BF16
- **Other Techniques**: NEFTune (noise alpha: 5), RS-LoRA

## Performance and Limitations

This model potentially offers:

- Enhanced multi-task reasoning
- Improved performance on mathematics and coding tasks
- Better instruction-following ability

However:

- Performance may be limited compared to larger model variants
- It can produce misleading or incorrect outputs
- Outputs should be independently verified for critical applications

## Additional Information

- For the original model, see [agentlans/Llama3.1-SuperDeepFuse](https://huggingface.co/agentlans/Llama3.1-SuperDeepFuse).
- For the base Llama 3.1 model, including training data and model architecture, refer to the original [Llama 3.1](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model card.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/agentlans__Llama3.1-SuperDeepFuse-CrashCourse12K-details) and summarized results [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=agentlans%2FLlama3.1-SuperDeepFuse-CrashCourse12K&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc).

| Metric              | Value (%) |
|---------------------|----------:|
| **Average**         |     27.93 |
| IFEval (0-Shot)     |     71.87 |
| BBH (3-Shot)        |     31.83 |
| MATH Lvl 5 (4-Shot) |     17.67 |
| GPQA (0-shot)       |      8.39 |
| MuSR (0-shot)       |      8.60 |
| MMLU-PRO (5-shot)   |     29.24 |
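
## Reproducing the Fine-tuning Configuration

The hyperparameters listed under "Training Procedure" can be expressed as a `peft`/`transformers` configuration. This is a minimal sketch for orientation only, not the actual training script: the `output_dir` is a placeholder, `target_modules="all-linear"` is assumed as the interpretation of "Target: all layers", and dataset loading, tokenization, and the trainer itself are omitted.

```python
# Sketch of the fine-tuning setup described in this card.
# Values mirror the "Training Procedure" section; everything
# structural around them is an assumption.
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantization via bitsandbytes
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

# LoRA: rank 8, alpha 16, dropout 0.5, all linear layers, rank-stabilized
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.5,
    target_modules="all-linear",
    use_rslora=True,          # RS-LoRA
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="out",                # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size of 16 per device
    num_train_epochs=1,
    bf16=True,
    neftune_noise_alpha=5,           # NEFTune
)
```

The max sequence length of 2048 would typically be enforced at tokenization time (e.g. truncating inputs to 2048 tokens) rather than in these configuration objects.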