RegularizedSelfPlay/sppo_reversekl-0.05-PromptABC-LLAMA-3-8B-Instruct-SPPO-Iter3

Text Generation

text-generation-inference

Inference Endpoints

sppo_reversekl-0.05-PromptABC-LLAMA-3-8B-Instruct-SPPO-Iter3 / model-00004-of-00004.safetensors

Commit History

Upload LlamaForCausalLM

542297e
verified

angelahzyuan committed 18 days ago