This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

Support Huggingface Inference API #140

Open
sjdthree opened this issue Jul 23, 2023 · 9 comments

Comments

@sjdthree

In addition to OpenAI, I would like to add the ability to call a model via the Hugging Face Inference API.

This would let the deployer select from the models hosted on HF, including Llama 2, the new, well-performing open-source version of Llama.

It needs a Hugging Face API key, similar to OpenAI.

Here is sample (untested) code using axios to fetch results from the "gpt2" model via the Hugging Face API:

import { useEffect, useState } from 'react';
import axios from 'axios';

export default function Home() {
  const [output, setOutput] = useState(null);

  useEffect(() => {
    const callHuggingFaceAPI = async () => {
      try {
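        // POST the prompt under "inputs"; text-generation models return
        // an array of objects with a generated_text field.
        const response = await axios.post('https://api-inference.huggingface.co/models/gpt2', {
          inputs: 'Hello, world!'
        }, {
          headers: {
            // Placeholder only: a real token should stay on the server, not in client code.
            'Authorization': 'Bearer YOUR_HUGGINGFACE_API_TOKEN',
            'Content-Type': 'application/json'
          }
        });

        setOutput(response.data[0].generated_text);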
      } catch (error) {
        console.error('Failed to call Hugging Face API:', error);
      }
    };

    callHuggingFaceAPI();
  }, []);

  return (
    <div>
      <h1>Hugging Face API Output:</h1>
      <p>{output}</p>
    </div>
  );
}
@sjdthree
Author

The huggingface_hub integration in LangChain might provide an alternative route.
https://python.langchain.com/docs/modules/model_io/models/llms/integrations/huggingface_hub

@sjdthree
Author

Here are some TypeScript / Next.js code snippets:

Create a new file, huggingface.ts, in the pages/api directory:

import { NextApiRequest, NextApiResponse } from 'next';
import axios from 'axios';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
    if (req.method === 'POST') {
        try {
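            // Forward the request body to the Inference API. The token is read from
            // an environment variable (name assumed here) so it never reaches the browser.
            const response = await axios.post(
                "https://api-inference.huggingface.co/models/gpt2",
                req.body,
                {
                    headers: { Authorization: `Bearer ${process.env.HUGGINGFACE_API_TOKEN}` },
                }
            );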
            res.status(200).json(response.data);
        } catch (error) {
            res.status(500).json({ error: 'Error calling Hugging Face API' });
        }
    } else {
        res.status(405).json({ error: 'Only POST requests are accepted' });
    }
}

Call this API route from your Next.js pages or components like this:

import axios from 'axios';

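async function query(data: string) {
    // Wrap the prompt in an object: the Inference API expects it under "inputs",
    // and a bare string would not be sent as JSON.
    const response = await axios.post('/api/huggingface', { inputs: data });
    return response.data;
}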

query("Can you please let us know more details about your ").then((response) => {
    console.log(JSON.stringify(response));
});
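
For reference, here is a minimal sketch of how the request used in the snippets above could be assembled in one place. The helper name `buildHfRequest` and the `HfRequest` type are hypothetical. `options.wait_for_model` is, as far as I know, an Inference API option that makes the call wait while a cold model loads instead of failing with a 503, but treat that as an assumption to verify.

```typescript
// Hypothetical helper: builds the URL and fetch options for a
// text-generation request to the Hugging Face Inference API.
interface HfRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildHfRequest(model: string, prompt: string, token: string): HfRequest {
  return {
    url: `https://api-inference.huggingface.co/models/${model}`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      // The prompt goes under "inputs"; wait_for_model is assumed to ask the
      // API to hold the request while the model is loading.
      body: JSON.stringify({ inputs: prompt, options: { wait_for_model: true } }),
    },
  };
}

// Usage (server side only, so the token stays out of the browser):
// const { url, init } = buildHfRequest('gpt2', 'Hello, world!', token);
// const data = await (await fetch(url, init)).json();
```

Keeping the payload construction in one function would make it easier to swap the model id later without touching the API route.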

@miurla
Owner

miurla commented Jul 27, 2023

@sjdthree
I looked into Hugging Face and was able to get it up and running easily. However, to run Llama2 as an API on Hugging Face, we need to host the model on our own account.
That seems feasible for developers to set up, but offering it on the site may incur a cost. With Replicate (for a fixed period) or Llama API, it could be provided for free. What do you think?

https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI

@sjdthree
Author

> @sjdthree I looked into Hugging Face and was able to get it up and running easily. However, to run Llama2 as an API on Hugging Face, we need to host the model on our own account. That seems feasible for developers to set up, but offering it on the site may incur a cost. With Replicate (for a fixed period) or Llama API, it could be provided for free. What do you think?
>
> https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI

Yes, I like it!

I see Replicate.com as similar to Hugging Face: a limited free tier, then pay for speed, performance, etc. Is there a tier difference I'm missing?

I would strongly support offering options, so all three: HF, Replicate, and Llama API.

How should we best architect this to handle all of them?

@miurla
Owner

miurla commented Jul 28, 2023

> I see Replicate.com as similar to Huggingface with a limited free tier then pay-for-speed / performance, etc. Is there a tier difference I'm missing?

No, my understanding is the same as yours.

> I would strongly support options! so all three: hf, replicate and llama api.

Let's support multiple options. On our demo page, we should make it possible to try Llama2 via Replicate.

> How best to architect to handle these?

The LLM calls are made through LangChain, which supports both HF and Replicate. The model selected in the UI is passed as modelName in every call, so creating a function that returns an LLM instance based on modelName seems like a good approach.
As a first step, making the replacement in BabyElfAGI alone should be sufficient.
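
A rough sketch of what such a function could look like. Everything here is an assumption for illustration: the `getLLM` name, the `LLM` stand-in type (the real function would return a LangChain LLM instance instead), and the `hf:` / `replicate:` id prefixes are all hypothetical and not taken from the repository.

```typescript
// Minimal stand-in for an LLM client. In the real app this would be
// a LangChain LLM instance rather than a plain descriptor.
interface LLM {
  provider: 'openai' | 'huggingface' | 'replicate';
  model: string;
}

// Hypothetical id convention: "<prefix>:<model>". Ids without a prefix
// are treated as OpenAI models, so existing ids keep working.
function getLLM(modelName: string): LLM {
  const sep = modelName.indexOf(':');
  if (sep === -1) {
    return { provider: 'openai', model: modelName };
  }
  const prefix = modelName.slice(0, sep);
  // Split on the first colon only, since the model id itself may contain one.
  const model = modelName.slice(sep + 1);
  switch (prefix) {
    case 'hf':
      return { provider: 'huggingface', model };
    case 'replicate':
      return { provider: 'replicate', model };
    default:
      throw new Error(`Unknown provider prefix: ${prefix}`);
  }
}
```

With a single entry point like this, each call site would only need to swap its direct LLM construction for a `getLLM(modelName)` call.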

@sjdthree
Author

This sounds good.

Where would we change the drop-down on the front page?

[Image: screenshot of the model drop-down on the front page]

@miurla
Owner

miurla commented Jul 28, 2023

Thank you for checking it right away. It's defined in /src/utils/constants.ts.
https://github.com/miurla/babyagi-ui/blob/main/src/utils/constants.ts#L8-L20

The name here is used as the display name. In the LLM invocation, the id is passed as modelName.
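
To illustrate the id / name split, here is a small sketch. The `ModelOption` shape, the `MODELS` array contents, and the `hf:` prefix on the second id are assumptions made for this example; the actual entries in constants.ts may carry more fields.

```typescript
// Assumed shape of a drop-down entry: id goes to the LLM call, name is shown in the UI.
interface ModelOption {
  id: string;
  name: string;
}

// Illustrative entries only; the second one is a hypothetical Hugging Face model,
// with an "hf:" prefix used here just to mark the provider.
const MODELS: ModelOption[] = [
  { id: 'gpt-4', name: 'OpenAI gpt-4' },
  { id: 'hf:meta-llama/Llama-2-7b-chat-hf', name: 'Llama 2 7B Chat (Hugging Face)' },
];

// Look up the entry selected in the UI; its id is what would be passed as modelName.
function findModel(id: string): ModelOption | undefined {
  return MODELS.find((m) => m.id === id);
}
```

Adding a new provider's model to the drop-down would then mostly be a matter of appending an entry here.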

@sjdthree
Author

OK, sounds good. Would you like me to make the changes and open a PR for your review?

@miurla
Owner

miurla commented Jul 28, 2023

I'm glad to hear that! It would be extremely helpful!
Could you give it a try? I'm happy to help anytime.
