GPT2 — Text Generation Transformer: How to Use & Serve with HuggingFace and Pinferencia

Jiuhe Wang
3 min read · Apr 18, 2022

What is text generation? You give the model some text, and it predicts the text that comes next.

Sounds interesting, but the best way to find out is to try the model yourself.

GPT2 is one of the most downloaded models on HuggingFace.

HuggingFace makes it easy to use the pretrained model with just a few lines of code.

Pinferencia makes it easy to serve the model with just three extra lines.

How to Use

The model will be downloaded automatically.
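A minimal sketch of those few lines, assuming the `transformers` library (plus a backend such as PyTorch) is installed. The `get_generator` helper, the optional `generator` argument, and the generation settings are illustrative choices, not something the library requires.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_generator():
    """Create the GPT-2 text-generation pipeline once and reuse it."""
    # Imported lazily so that importing this module does not trigger the download.
    from transformers import pipeline

    # The "gpt2" weights are fetched from the HuggingFace Hub on first use.
    return pipeline("text-generation", model="gpt2")


def predict(text, generator=None):
    """Return several sampled continuations of `text`."""
    if generator is None:
        generator = get_generator()
    # Illustrative settings: sample three continuations of up to 40 new tokens each.
    return generator(
        text,
        max_new_tokens=40,
        do_sample=True,
        num_return_sequences=3,
    )
```

The first call to `predict` is the slow one, because that is when the weights are downloaded and loaded into memory.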

That’s it!

Let’s try it out a little bit:

predict("You look amazing today,")

And the result:
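The text-generation pipeline returns a list of dictionaries, one per generated sequence, each holding its text under the `generated_text` key. As a small sketch, the hypothetical helper `first_text` below pulls out the first one. The `example` list is placeholder data in that shape, not real model output.

```python
def first_text(results):
    """Return the generated text of the first result, or None if the list is empty."""
    if not results:
        return None
    return results[0]["generated_text"]


# Placeholder data in the pipeline's output shape, not real model output.
example = [
    {"generated_text": "You look amazing today, <continuation 1>"},
    {"generated_text": "You look amazing today, <continuation 2>"},
]

print(first_text(example))
```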

Let’s have a look at the first result.

You look amazing today, guys. If you’re still in school and you still have a job where you work in the field… you’re going to look ridiculous by now, you’re going to look really ridiculous.”

He turned to his friends

🤣 That’s the thing we’re looking for! If you run the prediction again, it’ll give different results every time.

How to Deploy

With Pinferencia, just add three more lines and your model goes online!

Never heard of Pinferencia? It’s not too late. Take a look at its GitHub repository, and don’t forget to give it a star.

Install Pinferencia

pip install "pinferencia[uvicorn]"

Serve

Just add three lines to our previous code and save it as app.py.

Now go to the terminal and run:

uvicorn app:service --reload

Your service is online! Go to http://127.0.0.1:8000 and check out the API.

Test the Service

Curl
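The curl call amounts to a JSON POST to the model's predict endpoint. Below is a rough Python equivalent using only the standard library. The route `/v1/models/gpt2/predict` and the `{"data": ...}` request body are assumptions based on Pinferencia's default REST API for a model registered as "gpt2"; check the API page of your running service if they differ.

```python
import json
import urllib.request

# Assumed endpoint: Pinferencia's default route for a model registered as "gpt2".
URL = "http://127.0.0.1:8000/v1/models/gpt2/predict"


def build_request(text, url=URL):
    """Build a JSON POST request carrying the prompt under the "data" key."""
    body = json.dumps({"data": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query(text, url=URL):
    """Send the prompt to the running service and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the uvicorn server running, `query("You look amazing today,")` sends the prompt and returns the service's JSON response.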

Result:

Or just use the interactive UI Pinferencia provides at http://127.0.0.1:
