Update README.md
---
license: mit
base_model:
- Qwen/Qwen2-7B-Instruct
tags:
- medical
---

# Diabetica-7B

<div align="center">
<h2>
An adapted large language model facilitates multiple medical tasks in diabetes care
</h2>
</div>

<p align="center">
<a href="https://github.com/waltonfuture/Diabetica" target="_blank">Code</a> | <a href="https://arxiv.org/pdf/2409.13191" target="_blank">Paper</a> <br>
</p>

## Introduction

Hello! Welcome to the Hugging Face repository for [Diabetica](https://arxiv.org/pdf/2409.13191).

Our study introduces a reproducible framework for developing a specialized LLM capable of handling a wide range of diabetes tasks. We present three key contributions:

- High-performance domain-specific model: Compared with previous generic LLMs, our model, Diabetica, shows superior performance across a broad range of diabetes-related tasks, including diagnosis, treatment recommendations, medication management, lifestyle advice, and patient education.

- Reproducible framework: We offer a detailed method for creating specialized medical LLMs from open-source base models, curated disease-specific datasets, and fine-tuning techniques; a loose, generic illustration of this kind of adaptation is sketched below. The approach can be transferred to other medical fields, potentially accelerating the development of AI-assisted care.

- Comprehensive evaluation: We designed comprehensive benchmarks and conducted clinical trials to validate the model's effectiveness in clinical applications. This ensures the model's practical utility and sets a new standard for evaluating AI tools in diabetes care.

Please refer to our [GitHub repo](https://github.com/waltonfuture/Diabetica) for more details.
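
As a hypothetical illustration only (not the authors' actual recipe; the hyperparameters and target modules below are placeholders), adapting an open-source base model with a LoRA adapter via the `peft` library might look like this:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Start from the open-source base model (Diabetica-7B builds on Qwen2-7B-Instruct).
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)

# Placeholder LoRA hyperparameters, not the values used for Diabetica.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Wrap the base model so only the adapter weights are trainable, then hand it
# to your preferred trainer together with a curated disease-specific dataset.
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```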

## Model Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto
model_path = 'WaltonFuture/Diabetica-7B'

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

def model_output(content):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": content}
    ]
    # Render the conversation with the model's chat template.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=2048,
        do_sample=True,
    )
    # Drop the prompt tokens, keeping only the newly generated ones.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

prompt = "Hello! Please tell me something about diabetes."

response = model_output(prompt)
print(response)
```
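
For interactive use you may prefer to see tokens as they are produced rather than waiting for the full reply. As a minimal sketch (not part of the original release), the `TextStreamer` utility from `transformers` can be reused with the `model` and `tokenizer` loaded above:

```python
from transformers import TextStreamer

# Print decoded tokens to stdout as they are generated,
# skipping the echoed prompt and any special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How should I adjust my diet after a type 2 diabetes diagnosis?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# The streamer prints incrementally; generate() still returns the full ids.
_ = model.generate(model_inputs.input_ids, max_new_tokens=512, streamer=streamer)
```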

## Citation

```bibtex
@misc{wei2024adaptedlargelanguagemodel,
      title={An adapted large language model facilitates multiple medical tasks in diabetes care},
      author={Lai Wei and Zhen Ying and Muyang He and Yutong Chen and Qian Yang and Yanzhe Hong and Jiaping Lu and Xiaoying Li and Weiran Huang and Ying Chen},
      year={2024},
      eprint={2409.13191},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2409.13191},
}
```