Deploy a Hugging Face model

Learn how to deploy a Hugging Face model in minutes


In this example we will show how to deploy a Hugging Face model to SlashML. We will load a model with the Hugging Face Transformers library, deploy it to SlashML, and then use the endpoint exposed by SlashML to make predictions against it.

Import Dependencies

from slashml import ModelDeployment
import time
 
# you might have to install transformers and torch
from transformers import pipeline

Train Model

def train_model():
    # Load a pretrained fill-mask pipeline from the Hugging Face Hub
    return pipeline('fill-mask', model='bert-base-uncased')
 
my_model = train_model()
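
You can exercise the pipeline locally before deploying it, which is a quick way to confirm that the model loads and produces sensible fill-mask output. A minimal sanity check (the example sentence is our own):

# optional: quick local sanity check before deploying
local_predictions = my_model('Paris is the [MASK] of France.')
for p in local_predictions:
    print(f"{p['token_str']}: {p['score']:.3f}")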

Deploy Model

# Replace `API_KEY` with your SlashML API token. You can find your API key at https://slashml.com/settings/api-key
 
model = ModelDeployment(api_key='API_KEY')
 
# deploy model
response = model.deploy(model_name='my_model_3', model=my_model)
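
The response returned by `deploy` carries the id of the deployed model version (`response.id`), which the next sections use to poll the deployment status and to run inference. If you prefer not to hard-code the token, one option is to read it from an environment variable; a minimal sketch, assuming you name the variable SLASHML_API_KEY:

import os
 
# keep the API key out of source control; the variable name is just an example
model = ModelDeployment(api_key=os.environ.get('SLASHML_API_KEY'))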

Wait For Results

# give the deployment a moment to start before polling its status
time.sleep(2)
status = model.status(model_version_id=response.id)
 
while status.status != 'READY':
    print(f'status: {status.status}')
    print('trying again in 5 seconds')
    time.sleep(5)
    status = model.status(model_version_id=response.id)
 
    if status.status == 'FAILED':
        raise Exception('Model deployment failed')
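
For longer deployments you may want a bounded wait rather than an open-ended loop. The sketch below wraps the same status calls in a helper with a timeout; the helper name and timeout values are our own choices:

def wait_until_ready(model, model_version_id, timeout_seconds=600, poll_every=5):
    """Poll until the deployment is READY; raise on FAILED or timeout."""
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        status = model.status(model_version_id=model_version_id)
        if status.status == 'READY':
            return status
        if status.status == 'FAILED':
            raise Exception('Model deployment failed')
        print(f'status: {status.status}, retrying in {poll_every} seconds')
        time.sleep(poll_every)
    raise TimeoutError('Timed out waiting for the model to become READY')
 
# usage: wait_until_ready(model, response.id)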

Model Inference

# submit prediction
input_text = 'Steve Jobs is the [MASK] of Apple.'
prediction = model.predict(model_version_id=response.id, model_input=input_text)
print(prediction)
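
The exact shape of the returned prediction depends on the SlashML endpoint; for a fill-mask model it is reasonable to expect something close to the local pipeline output, i.e. a list of candidate tokens with scores. A sketch of how you might inspect it under that assumption:

# inspect the prediction; the response format below is an assumption
# based on the local fill-mask pipeline output (token_str, score, sequence)
if isinstance(prediction, list):
    for candidate in prediction:
        print(candidate.get('token_str'), candidate.get('score'))
else:
    print(prediction)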