Roleplay-Hermes-3-Llama-3.1-8B


A DPO fine-tune of Hermes-3-Llama-3.1-8B, trained to behave in a more "humanish" way, i.e., to avoid typical AI-assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned over a series of preference datasets (a training sketch follows the list):

  • Undi95/Weyaxi-humanish-dpo-project-noemoji, which teaches the model to react like a human, rejecting assistant-like or overly neutral responses.
  • ResplendentAI/NSFW_RP_Format_DPO, which steers the model towards the *action* format in RP settings. It works best if your first message also uses this format naturally (see the usage example below).
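
For reference, a pass over these preference datasets can be sketched with TRL's DPOTrainer. The card does not publish the actual training script, so the base checkpoint name, hyperparameters, and dataset column handling below are assumptions, not the author's exact recipe:

    from datasets import concatenate_datasets, load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    base = "NousResearch/Hermes-3-Llama-3.1-8B"  # assumed base checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")

    # Assumes both datasets expose the prompt/chosen/rejected columns that
    # DPOTrainer expects; rename/remap columns first if they differ.
    train_ds = concatenate_datasets([
        load_dataset("Undi95/Weyaxi-humanish-dpo-project-noemoji", split="train"),
        load_dataset("ResplendentAI/NSFW_RP_Format_DPO", split="train"),
    ])

    trainer = DPOTrainer(
        model=model,  # a frozen reference copy is created internally when ref_model is omitted
        args=DPOConfig(output_dir="roleplay-hermes-dpo", beta=0.1),  # beta is illustrative
        train_dataset=train_ds,
        processing_class=tokenizer,  # older TRL versions name this argument `tokenizer`
    )
    trainer.train()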

Usage example

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "vicgalle/Roleplay-Hermes-3-Llama-3.1-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    conversation = [{"role": "user", "content": "*With my face blushing in red* Tell me about your favorite film!"}]
    prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)

    # Decode only the newly generated tokens
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

A sample response (sampling is enabled, so outputs vary):

*blushing* Aw, that's a tough one! There are so many great films out there. I'd have to say one of my all-time favorites is "Eternal Sunshine of the Spotless Mind" - it's such a unique and thought-provoking love story. But really, there are so many amazing films! What's your favorite? *I hope mine is at least somewhat decent!*

Note: you can get better results by using a system prompt that describes the persona.
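
For example, a persona-setting conversation could look like this (the persona below is made up for illustration; it reuses the tokenizer from the usage example above):

    conversation = [
        {"role": "system", "content": "You are Mia, a cheerful barista who chats casually and narrates her *actions* between asterisks."},
        {"role": "user", "content": "*walks up to the counter* Hi! What would you recommend?"},
    ]
    prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)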
