🧪 Experimental
An attempt to recover intelligence with a quick train, results are meh
Dumpling-Mistral-Nemo-8B
nbeerbower/mistral-nemo-kartoffel-PRUNE3 finetuned on:
- nbeerbower/GreatFirewall-DPO
- nbeerbower/Schule-DPO
- nbeerbower/Purpura-DPO
- nbeerbower/Arkhaios-DPO
- jondurbin/truthy-dpo-v0.1
- antiven0m/physical-reasoning-dpo
- flammenai/Date-DPO-NoAsterisks
- flammenai/Prude-Phi3-DPO
- Atsunori/HelpSteer2-DPO (1,000 samples)
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo.
Method
QLoRA ORPO tune with 2x RTX 3090 for 2 epochs.
- Downloads last month
- 24
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.