# HuggingFace runtime for MLServer
This package provides an MLServer runtime compatible with HuggingFace Transformers.
## Usage
You can install the runtime, alongside `mlserver`, as:

```shell
pip install mlserver mlserver-huggingface
```
For further information on how to use MLServer with HuggingFace, you can check out this worked-out example.
## Settings
The HuggingFace runtime exposes a couple of extra parameters which can be used to customise how the runtime behaves.
These settings can be added under the `parameters.extra` section of your `model-settings.json` file, e.g.
```json
{
  "name": "qa",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "question-answering",
      "optimum_model": true
    }
  }
}
```
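Once a model like the one above is deployed, it can be queried through MLServer's V2 inference protocol. The sketch below builds a hypothetical request payload for the `qa` model, assuming the question-answering pipeline accepts `question` and `context` inputs sent as `BYTES` tensors; the local URL in the comment is an assumption about a default deployment, not part of this configuration.

```python
import json

# Hypothetical V2 inference payload for the "qa" model defined above.
# Each input is a BYTES tensor whose name matches a keyword argument
# of the underlying question-answering pipeline (an assumption here).
payload = {
    "inputs": [
        {
            "name": "question",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["What does MLServer do?"],
        },
        {
            "name": "context",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["MLServer is an open-source inference server for machine learning models."],
        },
    ]
}

# With a server running locally (assumed host and port), the request
# would look like:
#   requests.post("http://localhost:8080/v2/models/qa/infer", json=payload)
print(json.dumps(payload, indent=2))
```

The payload mirrors the generic V2 tensor format, so the same shape works for other tasks by swapping the input names and data.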
Note

These settings can also be injected through environment variables prefixed with `MLSERVER_MODEL_HUGGINGFACE_`, e.g.

```shell
MLSERVER_MODEL_HUGGINGFACE_TASK="question-answering"
MLSERVER_MODEL_HUGGINGFACE_OPTIMUM_MODEL=true
```
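To illustrate the naming convention only (this is not MLServer's actual parsing code), the sketch below shows how such prefixed environment variables map back onto the `extra` setting names: strip the prefix and lowercase the remainder.

```python
import os

# Illustration of the env-var naming convention, not MLServer internals.
PREFIX = "MLSERVER_MODEL_HUGGINGFACE_"

os.environ[PREFIX + "TASK"] = "question-answering"
os.environ[PREFIX + "OPTIMUM_MODEL"] = "true"

# Strip the prefix and lowercase the rest to recover the setting names.
extra = {
    key[len(PREFIX):].lower(): value
    for key, value in os.environ.items()
    if key.startswith(PREFIX)
}
print(extra)
# {'task': 'question-answering', 'optimum_model': 'true'}
```

Note that values arrive as strings; the runtime is responsible for coercing them to their declared types (e.g. `optimum_model` to a boolean).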
## Reference
You can find the full reference of the accepted extra settings for the HuggingFace runtime below: