# CALM-405B: The Largest Open-Source Agentic LLM
## Model Overview
CALM-405B is the largest open-source Conversational Agentic Language Model (CALM) released to date. The model sets a new standard in conversational AI by seamlessly integrating Task-Oriented Dialogue (TOD) capabilities with Language Agent (LA) functionalities. Designed to push the boundaries of open-source agentic LLMs, it excels at multi-turn dialogue, tool usage, reasoning, and API execution, and it is the best-performing fully open-source LLM on the Berkeley Function Calling Leaderboard V3 (BFCL V3), marking a historic leap in open-source AI research.
### Model Sources
- Paper [optional]: [More Information Needed]
- Repository: [More Information Needed]
## Model Details
- Model Name: CALM-405B
- Developed by: A collaboration between the UIUC Conversational AI Lab and Oumi
- License: Apache 2.0
- Architecture: Meta-Llama 3.1-405B Instruct
- Training Data: CALM-IT
- Fine-tuning Framework: Oumi
- Training Hardware: 8 NVIDIA H100 GPUs
- Training Duration: ~6.5 days
- Evaluation Benchmarks: MultiWOZ 2.4, BFCL V3, API-Bank
- Release Date: February 5, 2025
## Why CALM-405B Is a Game-Changer
- Largest Open-Source Agentic LLM: A 405B-parameter model that brings state-of-the-art agentic capabilities into the public domain.
- Best Open-Source Performance on BFCL V3: Outperforms leading proprietary models such as GPT-4o, Gemini, and Claude on function-calling tasks.
- True Zero-Shot Function Calling: Generalizes to unseen API tasks with unmatched accuracy.
- Multi-Turn Dialogue Mastery: Excels at long conversations, task tracking, and complex reasoning.
- API Tool Use and Reasoning: Makes precise API calls, interprets responses, and synthesizes coherent multi-step solutions.
- Fully Open-Source & Reproducible: Released under Apache 2.0, including model weights, training logs, and datasets.
## Benchmark Performance
TODO: Add BFCL results
## Training Process
### Fine-tuning Stages
- TOD Fine-tuning: Optimized for dialogue state tracking (e.g., augmented SNIPS in instruction-tuned format).
- Function Calling Fine-tuning: Trained to generate highly accurate API calls from LA datasets.
- ReAct-based Fine-tuning: Enhances multi-turn conversations with structured thought-action-observation-response reasoning (illustrated below).
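As an illustration of the ReAct-style structure mentioned above, a single training turn interleaves a thought, an API action, the observed result, and a user-facing response. The keys and values in this sketch are purely illustrative and do not reflect the actual CALM-IT schema.

```python
# Purely illustrative ReAct-style turn (NOT the actual CALM-IT schema):
# the model learns to interleave reasoning, an API call, its observed
# result, and the final reply within a single assistant turn.
example_turn = {
    "thought": "The user wants a moderately priced hotel in the centre; query the hotel API.",
    "action": 'find_hotel(area="centre", price_range="moderate")',
    "observation": '[{"name": "Cityroomz", "area": "centre", "price_range": "moderate"}]',
    "response": "Cityroomz is a moderately priced hotel in the centre. Shall I book it?",
}
```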
### Training Hyperparameters
- Base Model: Meta-Llama 3.1-405B Instruct
- LoRA Config: Rank = 16, Scaling Factor = 32 (see the sketch after this list)
- Batch Size: 2
- Learning Rate: 1e-4
- Optimizer: AdamW (betas = 0.9, 0.999, epsilon = 1e-8)
- Precision: q4 (4-bit quantization)
- Warm-up Steps: 500
- Gradient Accumulation Steps: 1
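The hyperparameters above were applied through the Oumi fine-tuning framework. As a rough approximation only, here is how they would map onto a Hugging Face peft/transformers setup; the target modules and output directory are assumptions for illustration, not values taken from the authors' recipe.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Approximate LoRA setup mirroring the listed hyperparameters (sketch only;
# the actual CALM-405B run used the Oumi framework).
lora_config = LoraConfig(
    r=16,                                                      # LoRA rank
    lora_alpha=32,                                             # scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # assumed targets
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="calm-405b-lora",      # assumed output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    learning_rate=1e-4,
    warmup_steps=500,
    optim="adamw_torch",              # AdamW with default betas=(0.9, 0.999), eps=1e-8
    bf16=True,
)
```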
## How to Use CALM-405B
Inference requires 16x NVIDIA H100 GPUs: in BF16, the 405B parameters alone occupy roughly 405B × 2 bytes ≈ 810 GB, so the weights must be sharded across many 80 GB devices.
### How to Load the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CALM-405B")
# bfloat16 + device_map="auto" shards the 405B weights across all available GPUs
model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CALM-405B", torch_dtype=torch.bfloat16, device_map="auto")
```
### Example Inference
TODO
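Until an official example is published, here is a minimal generation sketch. It assumes the model and tokenizer loaded in the snippet above and uses the tokenizer's built-in chat template; the prompt and generation settings are illustrative only and do not reflect CALM's actual tool-calling prompt format.

```python
# Illustrative prompt; CALM's actual tool-calling format may differ.
messages = [
    {"role": "system", "content": "You are a helpful task-oriented assistant."},
    {"role": "user", "content": "Find me a moderately priced hotel in the centre of Cambridge."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding here; sampling parameters are a matter of preference.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```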
More fine-tuning and community-driven optimizations are planned to enhance real-world usability.
## Citation
If you use CALM-405B in your research, please cite:
```bibtex
@article{yourpaper2024,
  title={CALM: Conversational Agentic Language Model},
  author={Your Name and Collaborators},
  journal={Your Conference/Journal},
  year={2024}
}
```
For more details, visit Project Repository or contact [email protected].