※ There was a problem with the chat template during ORPO training. Retraining is planned.
Description
- A local English-to-Korean translation model that lets you set a translation style and a glossary.
```
다음 텍스트를 한국어로 번역해 주세요.
번역 스타일: 일반 대중, 반말, 노래 가사, 부드러움, ~하다
단어장: {'the director':'대창역', 'back to normal':'정상화'}
The director finally gets Maple back to normal.

# Output
대창역은 마침내 메이플을 정상화시킨다다.'
```
The style lets you set the target audience, honorific vs. plain speech (존댓말/반말), writing style, sentence endings, and more.
- Type: [명사형 (Nominal), 평서문 (Declarative), 의문문 (Interrogative), 명령문 (Imperative), 감탄문 (Exclamatory), 청유문 (Propositive)]
- Audience: [일반 대중 (General), 전문가 집단 (Specialist), 아동 (Children), 개인 (Individual)]
- Writing style: [격식체 (Formal), 비격식체 (Informal), 딱딱함 (Stiff), 부드러움 (Soft), 친근함 (Friendly), 정중함 (Polite)]
- Field: [학술적 (Academic), 법률적 (Legal), 업무적 (Professional), 기술적 (Technical), 문학적 (Literary), 일상적 (Casual)]
- Speech level: [반말, 존댓말]
- Ending: [~다, ~니다, ~오, ~요, ~해]
- Thanks to the innate capabilities of EXAONE-3.5-7.8B, the model also reflects style settings it was never trained on to some degree.
The glossary must be provided in dictionary form; the model tends to over-apply the supplied entries, so use it with care.
The raw translation quality itself is somewhat lower than expected.
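Taken together, a request is just the fixed instruction line plus optional style and glossary lines. A minimal sketch of composing one (the `build_request` helper is hypothetical; only the `번역 스타일:` and `단어장:` line formats come from the examples above):

```python
def build_request(text, style_tags=None, glossary=None):
    # Fixed instruction line used by this model's prompt format.
    lines = ["다음 텍스트를 한국어로 번역해 주세요."]
    if style_tags:
        # e.g. ["일반 대중", "반말", "~하다"]
        lines.append("번역 스타일: " + ", ".join(style_tags))
    if glossary:
        # e.g. {"back to normal": "정상화"}
        lines.append("단어장: " + str(glossary))
    lines.append(text)
    return "\n".join(lines)

req = build_request(
    "The director finally gets Maple back to normal.",
    style_tags=["일반 대중", "반말", "~하다"],
    glossary={"back to normal": "정상화"},
)
print(req)
```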
Training Method
- SFT
- Collected a total of 3M Korean-English parallel translation pairs from the data sources included in werty1248/Open-KoEn-Parallel-Style-Tag and from AI Hub
- Sampled 10% of the full dataset and generated style tags (300K pairs) with the methodology introduced in werty1248/Open-KoEn-Parallel-Style-Tag
- Trained a LoRA adapter on the full 3M pairs on top of the EXAONE-3.5-7.8B-Instruct model
Axolotl Config
```yaml
base_model: beomi/EXAONE-3.5-7.8B-Instruct-Llamafied
model_type: AutoModelForCausalLM
tokenizer_config: beomi/EXAONE-3.5-7.8B-Instruct-Llamafied
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: werty1248/KoEn-Parallel-Full-Conv
    field_messages: conversations
    train_on_eos: turn
    type: chat_template
    chat_template: tokenizer_default

dataset_prepared_path: ./data_preparation
output_dir: /workspace/data
hf_use_auth_token: true

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true

adapter: lora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.1
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:
peft_use_rslora: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

wandb_project:
#wandb_entity:
#wandb_watch:
wandb_name:
#wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 1
optimizer: paged_ademamix_32bit
lr_scheduler: cosine
learning_rate: 0.000005
weight_decay: 0.1

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
evals_per_epoch: 1
eval_table_size:
deepspeed: ./deepspeed_configs/zero3_bf16.json
```
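For reference, the global batch size implied by this config is micro_batch_size × gradient_accumulation_steps × the data-parallel world size. The GPU count is not stated in the card, so the value below is an assumption:

```python
def effective_batch_size(micro_batch_size, grad_accum_steps, world_size):
    # Global batch size under data parallelism (e.g. DeepSpeed ZeRO-3).
    return micro_batch_size * grad_accum_steps * world_size

# micro_batch_size=1 and gradient_accumulation_steps=2 come from the config;
# world_size=8 is a hypothetical 8-GPU node.
print(effective_batch_size(1, 2, 8))
```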
- RL
- Trained on the werty1248/Open-KoEn-Parallel-Style-Glossary-DPO data using ORPO
- As with SFT, training was performed with LoRA
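The preference data follows the chosen/rejected layout referenced by the `field_*` keys in the config below. A sketch of one record's shape (the concrete texts are placeholders, not real data, and the exact nesting is an assumption):

```python
# Shape of one preference record, mirroring the field_messages /
# field_chosen / field_rejected keys used in the ORPO config below.
record = {
    "messages": [
        {"role": "user", "content": "다음 텍스트를 한국어로 번역해 주세요.\n..."}
    ],
    "chosen": {"role": "assistant", "content": "..."},    # preferred translation
    "rejected": {"role": "assistant", "content": "..."},  # dispreferred translation
}
print(sorted(record))
```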
Axolotl Config
※ Running this config as-is raises an error. I modified Axolotl's ORPO data loading & chat_template code before using it.
```yaml
base_model: werty1248/EXAONE-3.5-7.8B-SFT-Translation-Style-Tag
model_type: AutoModelForCausalLM
tokenizer_config: werty1248/EXAONE-3.5-7.8B-SFT-Translation-Style-Tag
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

rl: orpo
datasets:
  - path: werty1248/Open-KoEn-Parallel-Style-Glossary-DPO
    name: wo_system
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_field_role: role
    message_field_content: content

dataset_prepared_path: ./data_preparation
output_dir: /workspace/data
hf_use_auth_token: true

sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.1
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:
peft_use_rslora: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

wandb_project:
#wandb_entity:
#wandb_watch:
wandb_name:
#wandb_log_model:

gradient_accumulation_steps: 16
micro_batch_size: 1
num_epochs: 1
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.000005

train_on_inputs: false
group_by_length: false
bf16: auto
gradient_checkpointing: true
flash_attention: true

saves_per_epoch: 1
logging_steps: 1
warmup_steps: 20
#deepspeed: ./deepspeed_configs/zero3_bf16.json
```
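For intuition, ORPO adds an odds-ratio preference term to the plain negative log-likelihood on the chosen answer, so no separate reference model is needed. A toy numeric sketch with sequence-level probabilities (the actual Axolotl/TRL implementation works on per-token log-probs, and the λ=0.1 here is illustrative, not the card's setting):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    # Odds of the model generating a response with probability p.
    return p / (1.0 - p)

def orpo_loss(p_chosen, p_rejected, lam=0.1):
    # SFT part: negative log-likelihood of the chosen response.
    nll = -math.log(p_chosen)
    # Preference part: log-sigmoid of the log odds ratio between
    # the chosen and rejected responses.
    log_odds_ratio = math.log(odds(p_chosen) / odds(p_rejected))
    return nll - lam * math.log(sigmoid(log_odds_ratio))

# Widening the gap between chosen and rejected lowers the loss.
print(orpo_loss(0.6, 0.2), orpo_loss(0.3, 0.2))
```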
vLLM Example (basic translation vs. translation with style vs. translation with glossary vs. translation with style & glossary)
!pip install vllm
- Code
```python
from vllm import LLM, SamplingParams

name = "werty1248/EXAONE-3.5-7.8B-SFT-Translation-Style-Tag-DPO"

llm = LLM(model=name, max_model_len=2048)
sampling_params = SamplingParams(temperature=0, max_tokens=512, stop=['[|assistant|]',])  # template issue

normal_request = """다음 텍스트를 한국어로 번역해 주세요.
The director finally gets Maple back to normal."""

style_request = """다음 텍스트를 한국어로 번역해 주세요.
번역 스타일: 일반 대중, 반말, 노래 가사, 부드러움, ~하다
The director finally gets Maple back to normal."""

glossary_request = """다음 텍스트를 한국어로 번역해 주세요.
단어장: {'the director':'대창역', 'back to normal':'정상화'}
The director finally gets Maple back to normal."""

style_glossary_request = """다음 텍스트를 한국어로 번역해 주세요.
번역 스타일: 일반 대중, 반말, 노래 가사, 부드러움, ~하다
단어장: {'the director':'대창역', 'back to normal':'정상화'}
The director finally gets Maple back to normal."""

input_list = [[{"role": "user", "content": normal_request}],
              [{"role": "user", "content": style_request}],
              [{"role": "user", "content": glossary_request}],
              [{"role": "user", "content": style_glossary_request}]]

outputs = llm.chat(input_list, sampling_params)
pred_list = [x.outputs[0].text for x in outputs]

print("English: The director finally gets Maple back to normal.\n")
print("\n".join(pred_list))
```
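The stop list matters because of the template issue flagged at the top: without it, generation can run on into a new `[|assistant|]` turn. The equivalent post-hoc cut looks like this (illustrative only):

```python
def trim_at_stop(text, stop="[|assistant|]"):
    # Mirror SamplingParams(stop=['[|assistant|]']): keep only the text
    # before the first stop marker.
    return text.split(stop, 1)[0]

print(trim_at_stop("번역 결과[|assistant|]다음 턴 찌꺼기"))
```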
- Results
```
English: The director finally gets Maple back to normal.

감독은 마침내 메이플을 정상으로 돌려놓는다. # normal_request
감독은 마침내 메이플을 정상으로 돌려놓다. # style_request
대창역은 마침내 메이플을 정상화시킵니다.' # glossary_request
대창역은 마침내 메이플을 정상화시킨다다.' # style_glossary_request
```