Convert Transformers to ONNX with Hugging Face Optimum
There are currently three ways to convert your Hugging Face Transformers models to ONNX. In this section, you will learn how to export distilbert-base-uncased-finetuned-sst-2-english for text-classification using all three methods going from the low-level torch API to the most user-friendly high-level API of optimum. Each method will do exactly the same
Export with torch.onnx (low-level)
torch.onnx enables you to convert model checkpoints to an ONNX graph by the export method. But you have to provide a lot of values like input_names, dynamic_axes, etc.
You’ll first need to install some dependencies:
pip install transformers torch
exporting our checkpoint with export
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
# load model and tokenizer
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
dummy_model_input = tokenizer("This is a sample", return_tensors="pt")
# export
torch.onnx.export(
model,
tuple(dummy_model_input.values()),
f="torch-model.onnx",
input_names=['input_ids', 'attention_mask'],
output_names=['logits'],
dynamic_axes={'input_ids': {0: 'batch_size', 1: 'sequence'},
'attention_mask': {0: 'batch_size', 1: 'sequence'},
'logits': {0: 'batch_size', 1: 'sequence'}},
do_constant_folding=True,
opset_version=13,
)
Export with transformers.onnx (mid-level)
transformers.onnx enables you to convert model checkpoints to an ONNX graph by leveraging configuration objects. That way you don’t have to provide the complex configuration for dynamic_axes etc.
You’ll first need to install some dependencies:
pip install transformers[onnx] torch
Exporting our checkpoint with the transformers.onnx.
from pathlib import Path
import transformers
from transformers.onnx import FeaturesManager
from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification
# load model and tokenizer
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
feature = "sequence-classification"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# load config
model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
onnx_config = model_onnx_config(model.config)
# export
onnx_inputs, onnx_outputs = transformers.onnx.export(
preprocessor=tokenizer,
model=model,
config=onnx_config,
opset=13,
output=Path("trfs-model.onnx")
)
Export with Optimum (high-level)
Optimum Inference includes methods to convert vanilla Transformers models to ONNX using the ORTModelForXxx classes. To convert your Transformers model to ONNX you simply have to pass from_transformers=True to the from_pretrained() method and your model will be loaded and converted to ONNX leveraging the transformers.onnx package under the hood.
You’ll first need to install some dependencies:
pip install optimum[onnxruntime]
Exporting our checkpoint with ORTModelForSequenceClassification
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english",from_transformers=True)
The best part about the conversion with Optimum is that you can immediately use the model to run predictions or load it inside a pipeline.