HuggingFace Transformer Pipeline — Vision: How to Use, Deploy and Serve

Jiuhe Wang
2 min read · Apr 17, 2022

In this tutorial, we will explore how to use the Hugging Face pipeline for image classification, and how to deploy it as a REST API with Pinferencia.

Never used Pinferencia before?
Come on, it’s not too late. Check it out on GitHub now.

Download the model

The model will be automatically downloaded and initialized.
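The download step can be sketched roughly as below. This is a minimal sketch, assuming the transformers package, a backend such as PyTorch, and Pillow are installed. The helper name load_classifier and the caching are my own choices for illustration, not part of the library.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def load_classifier():
    """Build the image-classification pipeline once and reuse it afterwards."""
    # Imported inside the function so that importing this module stays cheap;
    # the heavy transformers import only happens on the first call.
    from transformers import pipeline

    # No model name is given, so the library falls back to its default
    # checkpoint for this task and downloads it on first use.
    return pipeline(task="image-classification")
```

The first call to load_classifier() triggers the download; later calls return the cached pipeline object.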

Let’s try a free image from pixabay.com:

Run the prediction by calling the classifier:
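A call along these lines should work. Treat it as a sketch: IMAGE_URL is a hypothetical placeholder rather than the image used in this post, and classify and top_prediction are helper names I made up. The pipeline accepts an image URL or a local file path and returns a list of dictionaries, each with a "label" and a "score".

```python
from typing import Any, Dict, List

# Hypothetical placeholder: replace with the URL or local path of your image.
IMAGE_URL = "https://example.com/your-image.jpg"


def classify(image: str) -> List[Dict[str, Any]]:
    """Classify one image given as a URL or a local file path."""
    # Lazy import: transformers is only needed when a prediction is requested.
    from transformers import pipeline

    vision_classifier = pipeline(task="image-classification")
    # The pipeline object is callable; pass the image as the first argument.
    return vision_classifier(image)


def top_prediction(predictions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the entry with the highest score from the pipeline output."""
    return max(predictions, key=lambda p: p["score"])
```

Calling classify(IMAGE_URL) returns the list of candidate labels, and top_prediction picks the most confident one.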

Result:

Amazingly easy! Now let’s try to:

Deploy the model

Without deployment, how could a machine learning tutorial be complete?

We will use Pinferencia to deploy the model.

Pinferencia is a great tool for both prototyping and production deployment of models from any framework. Its API is also compatible with other model deployment tools.

First, let’s install Pinferencia.

pip install "pinferencia[uvicorn]"

Now let’s create an app.py file with the following code:
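Here is one way app.py might look. It is a sketch under a few assumptions: that Pinferencia exposes a Server class with a register(model_name=..., model=...) method, that "vision" is an acceptable model name (I picked it for illustration), and that a factory function is a convenient way to build the app. The factory layout is my own structure, not necessarily the one Pinferencia's documentation uses.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_classifier():
    """Load the Hugging Face pipeline on first use and cache it."""
    from transformers import pipeline

    return pipeline(task="image-classification")


def predict(data):
    """Entry point for the service: `data` is an image URL or file path."""
    return get_classifier()(data)


def create_service():
    """Build the Pinferencia server and register the predict function."""
    # Assumption: Server and register() behave as described in the lead-in.
    from pinferencia import Server

    service = Server()
    # "vision" is an illustrative name; it is used to address the model.
    service.register(model_name="vision", model=predict)
    return service
```

With this layout, the server can be started with uvicorn app:create_service --factory. If you prefer a module-level object, add service = create_service() at the bottom and run uvicorn app:service instead.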

Easy, right?

Predict

Curl

Python Requests
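A test.py client could look something like this. It is a sketch that assumes the server runs locally on port 8000, that the model was registered under the name "vision", and that Pinferencia serves predictions at /v1/models/<model name>/predict with a JSON body of the form {"data": ...}. Check these against your Pinferencia version; the helper names are hypothetical.

```python
from typing import Any, Dict

BASE_URL = "http://127.0.0.1:8000"  # assumed local server address
MODEL_NAME = "vision"               # assumed model name used at registration


def predict_endpoint(base_url: str = BASE_URL, model_name: str = MODEL_NAME) -> str:
    """Build the prediction URL (the path layout is an assumption)."""
    return f"{base_url}/v1/models/{model_name}/predict"


def build_payload(image: str) -> Dict[str, Any]:
    """Wrap the image URL or path in the assumed request body shape."""
    return {"data": image}


def request_prediction(image: str, timeout: float = 60.0) -> Dict[str, Any]:
    """POST the image reference to the service and return the parsed JSON."""
    import requests  # third-party HTTP client: pip install requests

    response = requests.post(
        predict_endpoint(), json=build_payload(image), timeout=timeout
    )
    response.raise_for_status()  # surface HTTP errors instead of hiding them
    return response.json()
```

To use it as a script, add a line such as print(request_prediction("<your image URL>")) at the bottom of test.py.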

Run python test.py:

Interactive API Page

Even cooler, go to http://127.0.0.1:8000, and you will find an interactive UI provided by Pinferencia. You can send prediction requests right there!
