How can I get the logits from an endpoint call?

mfixman · August 26, 2024, 1:19pm

I’m trying to run a query like the one below against the Hugging Face Inference API.

api_url = 'https://huggingface.co/proxy/api-inference.huggingface.co/models/meta-llama/Meta-Llama-3.1-70B-Instruct'
headers = {'Authorization': f'Bearer {token}'}
response = requests.post(api_url, headers = headers, json = {'inputs': 'What is the capital of France? The capital of France is : ')

I’m not just looking for the answer, but also for the logits of the generated tokens: I want to be able to calculate the probability of getting a certain answer.
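For reference, here is a minimal sketch of the calculation in question, assuming the per-step logits are already available as plain lists of floats (one list over the vocabulary per generated token). The function names `log_softmax` and `sequence_log_prob` are made up for this illustration and are not part of any Hugging Face library.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw logits into log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(
    step_logits: Sequence[Sequence[float]], token_ids: Sequence[int]
) -> float:
    """Log-probability of an answer, given the logits at each generation step.

    step_logits[i] holds the logits over the vocabulary at step i, and
    token_ids[i] is the id of the i-th token of the answer being scored.
    """
    if len(step_logits) != len(token_ids):
        raise ValueError("need exactly one logits vector per answer token")
    return sum(
        log_softmax(logits)[tok] for logits, tok in zip(step_logits, token_ids)
    )


# Toy example with a 3-token vocabulary and a 2-token answer.
toy_logits = [[2.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
answer_ids = [0, 2]
answer_prob = math.exp(sequence_log_prob(toy_logits, answer_ids))
```

Summing log-probabilities rather than multiplying probabilities avoids underflow on longer answers.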

I can do this locally with AutoModelForCausalLM, but most big models don’t fit on my GPUs (and a HF Pro subscription is cheaper than another A100).

Is there any way to use the API this way?

nielsr · August 26, 2024, 1:33pm

Hi,

According to the “Detailed parameters” page of the documentation, the serverless Inference API does not support returning logits.

If you want that, you could define a custom handler on Inference Endpoints that returns the logits alongside the generated text.
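A rough sketch of what such a handler could look like is below. It relies on the `handler.py` convention for custom handlers (an `EndpointHandler` class with `__init__(path)` and `__call__(data)`) and on `generate(..., output_scores=True, return_dict_in_generate=True)` from transformers. The response field names (`tokens`, `logit`, `logprob`) are invented for this example, and the sketch assumes the model fits on a single device, which will not hold for a 70B model without extra work.

```python
# handler.py: hypothetical custom handler sketch, not an official implementation.
from typing import Any, Dict


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Imported lazily so this file can be imported without torch installed.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        self.torch = torch
        # Assumption: the whole model fits on one device.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForCausalLM.from_pretrained(path).to(self.device)
        self.model.eval()

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        prompt = data["inputs"]
        params = data.get("parameters") or {}
        max_new_tokens = int(params.get("max_new_tokens", 8))

        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        with self.torch.no_grad():
            out = self.model.generate(
                **inputs,
                max_new_tokens=max_new_tokens,
                do_sample=False,  # greedy decoding
                output_scores=True,  # per-step scores (logits after any logits processors)
                return_dict_in_generate=True,
            )

        # For decoder-only models, sequences contain the prompt followed by new tokens.
        prompt_len = inputs["input_ids"].shape[1]
        new_ids = out.sequences[0, prompt_len:]

        tokens = []
        for step_scores, token_id in zip(out.scores, new_ids):
            tid = int(token_id)
            log_probs = self.torch.log_softmax(step_scores[0].float(), dim=-1)
            tokens.append(
                {
                    "id": tid,
                    "text": self.tokenizer.decode([tid]),
                    "logit": float(step_scores[0, tid]),
                    "logprob": float(log_probs[tid]),
                }
            )

        return {
            "generated_text": self.tokenizer.decode(new_ids, skip_special_tokens=True),
            "tokens": tokens,
        }
```

Returning only the chosen token's logit and log-probability keeps the response small; sending the full vocabulary-sized logits vector for every step would be very large for a model like Llama 3.1.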

mfixman · August 26, 2024, 1:56pm

Just checking: can I call an Inference Endpoint that returns logits directly from an experiment running on my local computer, or do I have to go through Gradio, Spaces, or some other library like that?

kerrmetric · August 30, 2024, 6:03am

+1 I’d be interested in an easy/standard way to do this as well.
