Model Settings¶
In MLServer, each loaded model can be configured separately. This configuration includes model information (e.g. metadata about the accepted inputs), as well as model-specific settings (e.g. the number of parallel workers used to run inference).
This configuration will usually be provided through a model-settings.json file which sits next to the model artifacts.
However, it's also possible to provide this configuration through environment variables prefixed with MLSERVER_MODEL_ (e.g. MLSERVER_MODEL_IMPLEMENTATION). Note that, in the latter case, these environment variables will be shared across all loaded models (unless they get overridden by a model-settings.json file).
Additionally, if no model-settings.json file is found, MLServer will also try to load a "default" model from these environment variables.
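For illustration, a minimal model-settings.json could look like the sketch below. The runtime class and model name are placeholders (mlserver_sklearn.SKLearnModel is used here as an assumed example runtime); any of these fields can equally be supplied through its MLSERVER_MODEL_-prefixed environment variable, e.g. MLSERVER_MODEL_NAME.

```json
{
  "name": "my-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}
```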
Settings¶
- pydantic settings mlserver.settings.ModelSettings¶
- Config:
extra: str = ignore
env_prefix: str = MLSERVER_MODEL_
env_file: str = .env
- Fields:
- field cache_enabled: bool = False¶
Enable or disable caching for this specific model when server-level caching is enabled. If server-level caching is disabled, this parameter has no effect.
- field implementation_: str [Required]¶
- field inputs: List[MetadataTensor] = []¶
Metadata about the inputs accepted by the model.
- field max_batch_size: int = 0¶
When adaptive batching is enabled, maximum number of requests to group together in a single batch.
- field max_batch_time: float = 0.0¶
When adaptive batching is enabled, maximum amount of time (in seconds) to wait for enough requests to build a full batch.
- field name: str = ''¶
Name of the model.
- field outputs: List[MetadataTensor] = []¶
Metadata about the outputs returned by the model.
- field parameters: ModelParameters | None = None¶
Extra parameters for each instance of this model.
- field platform: str = ''¶
Framework used to train and serialise the model (e.g. sklearn).
- field versions: List[str] = []¶
Versions of dependencies used to train the model (e.g. sklearn/0.20.1).
- model_post_init(context: Any, /) → None¶
This method is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that's what pydantic-core passes when calling it.
- Args:
self: The BaseModel instance.
context: The context.
- classmethod model_validate(obj: Any) → ModelSettings¶
Validate a pydantic model instance.
- Args:
obj: The object to validate.
strict: Whether to enforce types strictly.
from_attributes: Whether to extract data from object attributes.
context: Additional context to pass to the validator.
- Raises:
ValidationError: If the object could not be validated.
- Returns:
The validated model instance.
- classmethod parse_file(path: str) → ModelSettings¶
- parallel_workers: int | None¶
Data descriptor used to emit a runtime deprecation warning before accessing a deprecated field.
- Attributes:
msg: The deprecation message to be emitted.
wrapped_property: The property instance if the deprecated field is a computed field, or None.
field_name: The name of the field being deprecated.
- property version: str | None¶
- warm_workers: bool¶
Data descriptor used to emit a runtime deprecation warning before accessing a deprecated field.
- Attributes:
msg: The deprecation message to be emitted.
wrapped_property: The property instance if the deprecated field is a computed field, or None.
field_name: The name of the field being deprecated.
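Putting the fields above together, a hypothetical model-settings.json that enables adaptive batching and caching for a model might look like this (all values are illustrative; the runtime class is an assumed example):

```json
{
  "name": "my-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "max_batch_size": 32,
  "max_batch_time": 0.1,
  "cache_enabled": true
}
```

With these settings, MLServer would group up to 32 requests into a single batch, waiting at most 0.1 seconds for the batch to fill.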
Extra Model Parameters¶
- pydantic settings mlserver.settings.ModelParameters¶
Parameters that apply only to a particular instance of a model. This can include things like model weights, or arbitrary extra parameters particular to the underlying inference runtime. The main difference with respect to ModelSettings is that parameters can change on each instance (e.g. each version) of the model.
- Config:
extra: str = allow
env_prefix: str = MLSERVER_MODEL_
env_file: str = .env
- Fields:
- field content_type: str | None = None¶
Default content type to use for requests and responses.
- field environment_path: str | None = None¶
Path to a directory that contains the python environment to be used to load this model.
- field environment_tarball: str | None = None¶
Path to the environment tarball which should be used to load this model.
- field extra: dict | None = {}¶
Arbitrary settings, dependent on the inference runtime implementation.
- field format: str | None = None¶
Format of the model (only available on certain runtimes).
- field uri: str | None = None¶
URI where the model artifacts can be found. This path must be either absolute or relative to where MLServer is running.
- field version: str | None = None¶
Version of the model.
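Since ModelParameters is configured with extra: allow, runtime-specific settings can be added freely under the extra key. A hypothetical parameters block within a model-settings.json might look like this (values, including the runtime class and the extra flag, are illustrative assumptions):

```json
{
  "name": "my-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib",
    "version": "v0.1.0",
    "content_type": "np",
    "extra": {
      "some_runtime_flag": true
    }
  }
}
```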