How to turn off the r1 mode when running it with huggingface api?

#9
by securealex - opened

The output is very long and includes the model's thought process. This results in unnecessarily lengthy responses that are often not usable.

Same. If only the model could think in brief.

Just an opinion, but the only options are to prompt it not to think (unlikely to work) or to fine-tune it on your own dataset. When the model skips reasoning, it outputs the answer directly. You could replicate that behavior with your data.

I'm also having this issue. The model takes a long time to run on my hardware, but if I set max_tokens lower, all I get is 5-6 paragraphs of 'thinking' and no actual answer. Surely there is a way to 'turn this off'?

Please let me know if anyone comes up with a decent idea! :)
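One workaround (assuming the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek-R1-style models typically do) is to strip that block client-side after generation. This doesn't save compute, but it keeps the reasoning out of the final response. A minimal sketch:

```python
import re

def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> reasoning blocks from model output.

    Assumes the model wraps its chain of thought in <think> tags
    (the DeepSeek-R1 convention). If the output was cut off by
    max_tokens mid-thought, drop everything after the opening tag.
    """
    # Remove complete <think>...</think> blocks.
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Handle an unclosed <think> tag (output truncated by max_tokens).
    cleaned = re.sub(r"<think>.*", "", cleaned, flags=re.DOTALL)
    return cleaned.strip()

raw = "<think>Let me work this out step by step...</think>The answer is 42."
print(strip_reasoning(raw))  # -> The answer is 42.
```

Note the truncation problem mentioned above still applies: if max_tokens is too low, the answer never gets generated at all, so stripping leaves you with an empty string. You'd need a max_tokens budget large enough to cover the thinking plus the answer.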

Don't use a reasoning model then
