[FEEDBACK] Inference Providers

#49

by julien-c HF staff - opened 20 days ago

julien-c

Hugging Face org 20 days ago

Any inference provider you love, and that you'd like to be able to access directly from the Hub?

reach-vb

Hugging Face org 9 days ago

•

edited 9 days ago

Love that I can call DeepSeek R1 directly from the Hub 🔥

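from huggingface_hub import InferenceClient

# Route the request through an external provider (here: Together).
client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx",  # placeholder, replace with your own key
)

# Chat history in the OpenAI-style message format.
messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500,
)

# Print the assistant's reply message.
print(completion.choices[0].message)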

benhaotang

9 days ago

•

edited 9 days ago

Is it possible to set a monthly payment budget or rate limits for all the external providers? I don't see such options in the billing tab. If a key or session token is stolen, it could be quite dangerous for my thin wallet :(

julien-c

Hugging Face org 9 days ago

@benhaotang you already get spending notifications when crossing important thresholds ($10, $100, $1,000) but we'll add spending limits in the future
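
As an illustration only, here is a minimal sketch of how threshold-crossing notifications like these could be detected from a running spend total. The function name `crossed_thresholds` and its logic are hypothetical, not Hugging Face's billing implementation; only the threshold values come from the reply above.

```python
# Thresholds quoted in the reply above, in US dollars.
THRESHOLDS = (10, 100, 1000)


def crossed_thresholds(previous_spend, current_spend, thresholds=THRESHOLDS):
    """Return the thresholds passed when spend moves from previous to current.

    Hypothetical helper: a threshold counts as crossed when the previous
    total was below it and the current total has reached or exceeded it.
    """
    return [t for t in thresholds if previous_spend < t <= current_spend]


# Example: spend rises from $8.00 to $12.50, passing the $10 threshold.
print(crossed_thresholds(8.0, 12.5))
```

A hard spending limit, as requested in the question, would need to block requests rather than only report a crossing.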

benhaotang

9 days ago

•

edited 9 days ago

Thanks for your quick reply, good to know!

sylanaustin

9 days ago

Would be great if you could add Nebius AI Studio to the list :) New inference provider on the market, with the absolute cheapest prices and the highest rate limits...

Hazzzardous

9 days ago

Could be good to add featherless.ai

teentitan

9 days ago

TitanML !!

57 hidden messages

bharatcoder

2 days ago

Is there a page where I can filter models by the providers that support them? Currently one has to open the model card of every model to find out.

gulyasdavid1999

2 days ago

Hugging Face, please provide documentation that tells us exactly how much an inference costs (example: Stable Diffusion on HF Inference, $0.001/image, etc.).

Moibe

2 days ago

I have been using this new feature with different providers to run Flux and I really like it a lot! But today, February 4, it changed from 20,000 daily inferences to $2 monthly (200 inferences). I know it's fair, and I actually don't use 20,000 inferences, not even 20 daily. But the change from having 20,000 daily inferences to having 6 daily is shocking. You could have made it gradual, or not given 20,000 daily from the beginning. It's hard to assimilate that sudden change 😭
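
The arithmetic behind those figures can be checked with a short sketch. It assumes a 30-day month and a flat per-inference price, neither of which is stated in the thread; the $2 and 200-inference figures are taken from the post above.

```python
MONTHLY_CREDIT_USD = 2.00      # from the post above
INFERENCES_PER_MONTH = 200     # from the post above
DAYS_PER_MONTH = 30            # assumption: average month length

# Implied flat price per inference (assumption: every call costs the same).
cost_per_inference = MONTHLY_CREDIT_USD / INFERENCES_PER_MONTH

# Average number of inferences available per day.
daily_inferences = INFERENCES_PER_MONTH / DAYS_PER_MONTH

print(f"${cost_per_inference:.2f} per inference")
print(f"{daily_inferences:.1f} inferences per day")
```

Under these assumptions the credit works out to about one cent per inference and between six and seven inferences per day, consistent with the "6 daily" quoted above.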

julien-c

Hugging Face org about 4 hours ago

@gulyasdavid1999 this is in progress, we are currently working on it

julien-c

Hugging Face org about 4 hours ago

•

edited about 4 hours ago

@Moibe do you use HF-inference, or external providers?

bharatcoder

about 3 hours ago

@Moibe Right, it was a bit shocking. I would always like to use HF-inference as my first choice. Initially I thought that models available through HF-inference would keep consuming from the existing quota (of 20,000), and that only the third-party providers would draw on the $2 allowance for Pro members. I agree that 20,000 is a bit too much, but the change is rather drastic.

julien-c

Hugging Face org about 3 hours ago

Let me run the numbers, but for HF-inference the change should not be that drastic (most HF-inference requests are priced very cheaply, especially for CPU-based models).

The 20,000 daily figure was a bit unrealistic given that it was "best-effort", meaning the rate of failures was quite high. Note that from now on, we exclude failing requests from those counts (we didn't until now).
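
To make the counting change concrete, here is a minimal sketch of a usage counter that skips failed requests. The helper name `billable_requests` is hypothetical, and treating only HTTP 2xx status codes as successful is an assumption; the thread does not say how failures are identified.

```python
def billable_requests(status_codes):
    """Count requests that would be charged, skipping failed ones.

    Assumption: a request succeeded if its HTTP status code is in the
    2xx range; anything else is treated as a failure and not counted.
    """
    return sum(1 for code in status_codes if 200 <= code < 300)


# Example: five requests, two of which failed (503 and 429).
print(billable_requests([200, 200, 503, 429, 200]))
```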

levalencia

about 1 hour ago

@julien-c I tested it and I love it, super easy!

The first question from my company was:
With inference providers, can we set up something like private endpoints?
