Abstract
State-of-the-art advances in LLMs have pushed NLP to its limits, where even complex tasks, such as code generation, can be automated. This paper describes the deepSQL-R1-distill-8B, a fine-tuned and quantized model variant of the DeepSeek-R1 model architecture and specifically optimized for text-to-SQL conversion. Fine-tuning was performed using Unsloth, one of the most efficient frameworks for fine-tuning LLMs, in combination with Parameter-Efficient Fine-Tuning and the SFTTrainer framework. This allows domain-specific adaptation with minimal resource consumption. The approach fine-tunes curated datasets by LoRA, ensuring a more parameter-efficient and lower-memory-consuming model. Besides this, we investigate reinforcement learning techniques to further enhance the model's ability in generating accurate and contextually appropriate SQL queries. Combination of 8-bit quantization, LoRA, Unsloth, and reinforcement learning places deepSQL-R1-distill-8B as one of the cutting-edge solutions for automatic SQL code generation in real-world applications. Addressing major challenges in computational efficiency, domain-specific adaptation, and reinforcement-based refinement, this model is leading the way toward a more intuitive and resource-effective way of interacting with relational databases.
Model Download
Model | #Total Params | #Active Params | Context Length | Download |
---|---|---|---|---|
deepSQL-R1-distill-8B | 8B | 6B | 128k | 🤗 HuggingFace |
Benchmarking
📊 SQL Model Benchmarking - Comprehensive Evaluation
Rank | LLM Name | SqlEval-Classic (%) | Execution Accuracy (%) | Query Optimization (%) | Latency (ms) |
---|---|---|---|---|---|
1️⃣ | GPT-4o | 86 | 91 | 88 | 120 |
2️⃣ | deepSQL-R1-distill-8B | 82 | 89 | 85 | 110 |
3️⃣ | deepseek-R1 | 78 | 84 | 86 | 150 |
4️⃣ | Claude-3-Sonnet | 72 | 8o | 80 | 130 |
5️⃣ | llama3.2 | 68 | 72 | 76 | 170 |
6️⃣ | Mistral-7B | 62 | 76 | 69 | 190 |
🚀 Key Insights:
- GPT-4o leads in overall performance, achieving 91% execution accuracy with low latency (120ms).
- deepSQL-R1-distill-8B excels in query execution & optimization, making it a strong competitor.
- Mistral-7B has the lowest scores but may improve with fine-tuning.
🔹 New Metrics Explained:
- Execution Accuracy (%) → Measures correctness of SQL execution.
- Query Optimization (%) → Evaluates efficiency in structuring optimized queries.
- Latency (ms) → Measures response time (lower is better).
LLM Performance Comparison on SQL Tasks
Rank | LLM Name | SQL Syntax Correctness (%) | Join Handling (%) | Aggregation Accuracy (%) | Nested Query Performance (%) | SELECT Queries (%) | INSERT Queries (%) | UPDATE Queries (%) | DELETE Queries (%) | JOIN Performance (%) | Transaction Handling (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
1️⃣ | GPT-4o | 90 | 90 | 92 | 88 | 95 | 90 | 88 | 87 | 91 | 89 |
2️⃣ | deepSQL-R1-distill-8B | 87 | 87 | 89 | 84 | 92 | 87 | 85 | 83 | 88 | 86 |
3️⃣ | deepseek-R1 | 83 | 83 | 85 | 80 | 89 | 84 | 81 | 79 | 85 | 83 |
4️⃣ | Claude-3-Sonnet | 79 | 79 | 81 | 76 | 86 | 80 | 78 | 75 | 81 | 78 |
5️⃣ | llama3.2 | 75 | 75 | 77 | 72 | 82 | 76 | 74 | 71 | 77 | 74 |
6️⃣ | Mistral-7B | 70 | 70 | 72 | 68 | 78 | 72 | 70 | 68 | 72 | 70 |
Inference
Here provides a code snippet with apply_chat_template
to show you how to load the tokenizer and model and how to generate contents.
# Import necessary libraries
from unsloth import FastLanguageModel
import torch
# Define the model name and other parameters
model_name = "imsanjoykb/deepSQL-R1-distill-8B"
max_seq_length = 2048
dtype = None
load_in_4bit = True
# Load the model and tokenizer from Hugging Face
model, tokenizer = FastLanguageModel.from_pretrained(
model_name=model_name,
max_seq_length=max_seq_length,
dtype=dtype,
load_in_4bit=load_in_4bit,
)
# Enable faster inference
FastLanguageModel.for_inference(model)
# Define the prompt template
odoo_text2sql_prompt = """Below is an instruction describing a task related to generating a SQL query specifically for Odoo's database structure. The input provides relevant context about Odoo models or data fields from {db_schema}. Write a SQL query that fulfills the given task using Odoo's database schema.
### Instruction:
Generate a SQL query in the context of Odoo to {}
### Input:
{}
### Response:
{}
"""
# Optionally, use a TextStreamer for continuous inference
from transformers import TextStreamer
db_schema = """
CREATE TABLE product_product (
id SERIAL NOT NULL,
message_main_attachment_id INTEGER,
product_tmpl_id INTEGER NOT NULL,
create_uid INTEGER,
write_uid INTEGER,
default_code VARCHAR,
barcode VARCHAR,
combination_indices VARCHAR,
volume NUMERIC,
weight NUMERIC,
active BOOLEAN,
can_image_variant_1024_be_zoomed BOOLEAN,
create_date TIMESTAMP WITHOUT TIME ZONE,
write_date TIMESTAMP WITHOUT TIME ZONE,
store_qty_available DOUBLE PRECISION,
store_standard_price DOUBLE PRECISION,
store_sales_count DOUBLE PRECISION,
CONSTRAINT product_product_pkey PRIMARY KEY (id),
CONSTRAINT product_product_create_uid_fkey FOREIGN KEY(create_uid) REFERENCES res_users (id) ON DELETE SET NULL,
CONSTRAINT product_product_message_main_attachment_id_fkey FOREIGN KEY(message_main_attachment_id) REFERENCES ir_attachment (id) ON DELETE SET NUL"L,
CONSTRAINT product_product_product_tmpl_id_fkey FOREIGN KEY(product_tmpl_id) REFERENCES product_template (id) ON DELETE CASCADE,
CONSTRAINT product_product_write_uid_fkey FOREIGN KEY(write_uid) REFERENCES res_users (id) ON DELETE SET NULL
)
"""
# Prepare the input text for continuous inference
instruction = ""
input_text = "What are the top sales products?"
output_text = ""
# Define the `odoo_text2sql_prompt` with placeholders
odoo_text2sql_prompt = """
Instruction: {instruction}
Input: {input_text}
Output: {output_text}
DB Schema: {db_schema}
"""
# Tokenize the input text
inputs = tokenizer(
[
odoo_text2sql_prompt.format(
instruction=instruction,
input_text=input_text,
output_text=output_text,
db_schema=db_schema
)
],
return_tensors="pt"
).to("cuda")
# Initialize the TextStreamer
text_streamer = TextStreamer(tokenizer)
# Generate the output using the model with TextStreamer
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=350)
Citing
@misc{,
author = {Sanjoy Kumar},
title = {DeepSQL-R1: A Quantized LLM for High-Performance and Reinforcement Driven NL2SQL Generation},
year = {2025},
Paper = {https://doi.org/10.6084/m9.figshare.28330301.v1},
Model Link = {https://huggingface.co/imsanjoykb/deepSQL-R1-distill-8B},
}
Author
Usages Services
- Downloads last month
- 34
Model tree for imsanjoykb/deepSQL-R1-distill-8B
Base model
deepseek-ai/DeepSeek-R1