# Deployment with Seldon Core

MLServer is used as the [core Python inference server](https://docs.seldon.io/projects/seldon-core/en/latest/graph/protocols.html#v2-kfserving-protocol) in [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/index.html).
Therefore, it should be straightforward to deploy your models either by using one of the
[built-in pre-packaged servers](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/overview.html#two-types-of-model-servers) or by pointing to a
[custom image of MLServer](../../runtimes/custom).

```{note}
This section assumes a basic knowledge of Seldon Core and Kubernetes, as well as
access to a working Kubernetes cluster with Seldon Core installed.

To learn more about [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/) or
[how to install it](https://docs.seldon.io/projects/seldon-core/en/latest/nav/installation.html), please visit the
[Seldon Core documentation](https://docs.seldon.io/projects/seldon-core/en/latest/index.html).
```

## Pre-packaged Servers

Out of the box, Seldon Core comes with a few MLServer runtimes pre-configured to run
straight away. This allows you to deploy an MLServer instance by just pointing to where
your model artifact is and specifying what ML framework was used to train it.

### Usage

To let Seldon Core know what framework was used to train your model, you can use the
`implementation` field of your `SeldonDeployment` manifest. For example, to deploy a
Scikit-Learn artifact stored remotely in GCS, one could do:

```{code-block} yaml
---
emphasize-lines: 6, 11
---
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: my-model
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
```

As you can see highlighted above, all that we need to specify is that:

- Our **inference deployment should use the [V2 inference protocol](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/v2-protocol.html)**, which is done by **setting the `protocol` field to `v2`**.
- Our **model artifact is a serialised Scikit-Learn model**, therefore it should be served using the [MLServer SKLearn runtime](../../runtimes/sklearn), which is done by **setting the `implementation` field to `SKLEARN_SERVER`**.

Note that, while the `protocol` field should always be set to `v2` (i.e. so that models
are served using the [V2 inference protocol](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/v2-protocol.html)), the value of the `implementation` field
will depend on your ML framework. The valid values of the `implementation` field are
[pre-determined by Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/graph/protocols.html#v2-kfserving-protocol). However, it should also be possible to
[configure and add new ones](https://docs.seldon.io/projects/seldon-core/en/latest/servers/custom.html#adding-a-new-inference-server) (e.g. to support a
[custom MLServer runtime](../../runtimes/custom)).

Once you have your `SeldonDeployment` manifest ready, the next step is to apply it to
your cluster. There are multiple ways to do this, but the simplest is probably to apply
it directly through `kubectl`, by running:

```bash
kubectl apply -f my-seldondeployment-manifest.yaml
```
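
Once the manifest has been applied, Seldon Core should create the underlying resources
and expose the model over the V2 inference protocol. As a quick sanity check, you can
query the deployment status and send a test request. The snippet below is only a sketch:
`<ingress-host>`, `<namespace>` and the `input-0` tensor are placeholders that depend on
your own cluster setup and on the example Iris model referenced in the manifest above.

```bash
# Check the SeldonDeployment status (it should eventually report "Available")
kubectl get seldondeployment my-model -o jsonpath='{.status.state}'

# Send a V2 inference request through the Seldon ingress.
# <ingress-host> and <namespace> are placeholders for your own setup.
curl -s -X POST \
  http://<ingress-host>/seldon/<namespace>/my-model/v2/models/classifier/infer \
  -H 'Content-Type: application/json' \
  -d '{
        "inputs": [
          {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [5.1, 3.5, 1.4, 0.2]
          }
        ]
      }'
```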
To consult the supported values of the `implementation` field where MLServer is used,
you can check the support table below.

### Supported Pre-packaged Servers

As mentioned above, pre-packaged servers come built into Seldon Core. Therefore, only a
pre-determined subset of them will be supported for a given release of Seldon Core.

The table below shows a list of the currently supported values of the `implementation`
field. Each row also shows what ML framework they correspond to and what MLServer
runtime will be enabled internally on your model deployment when used.

| Framework    | MLServer Runtime                                 | Seldon Core Pre-packaged Server | Documentation                                                                                |
| ------------ | ------------------------------------------------ | ------------------------------- | -------------------------------------------------------------------------------------------- |
| Scikit-Learn | [MLServer SKLearn](../../runtimes/sklearn)       | `SKLEARN_SERVER`                | [SKLearn Server](https://docs.seldon.io/projects/seldon-core/en/latest/servers/sklearn.html) |
| XGBoost      | [MLServer XGBoost](../../runtimes/xgboost)       | `XGBOOST_SERVER`                | [XGBoost Server](https://docs.seldon.io/projects/seldon-core/en/latest/servers/xgboost.html) |
| MLflow       | [MLServer MLflow](../../runtimes/mlflow)         | `MLFLOW_SERVER`                 | [MLflow Server](https://docs.seldon.io/projects/seldon-core/en/latest/servers/mlflow.html)   |
| Tempo        | [Tempo](https://tempo.readthedocs.io/en/latest/) | `TEMPO_SERVER`                  | [Tempo Server](https://docs.seldon.io/projects/seldon-core/en/latest/servers/tempo.html)     |

Note that, on top of the ones shown above (backed by MLServer), Seldon Core
**also provides a [wider set](https://docs.seldon.io/projects/seldon-core/en/latest/nav/config/servers.html)** of pre-packaged servers. To check the full list, please visit the
[Seldon Core documentation](https://docs.seldon.io/projects/seldon-core/en/latest/nav/config/servers.html).

## Custom Runtimes

There could be cases where the pre-packaged MLServer runtimes supported out-of-the-box
in Seldon Core are not enough for our use case. The framework provided by MLServer makes
it easy to [write custom runtimes](../../runtimes/custom), which can then get packaged up
as images. These images become self-contained model servers with your custom runtime,
which Seldon Core can then deploy into your serving infrastructure just as easily.

### Usage

The `componentSpecs` field of the `SeldonDeployment` manifest allows us to let Seldon
Core know what image should be used to serve a custom model. For example, if we assume
that our custom image has been tagged as `my-custom-server:0.1.0`, we could write our
`SeldonDeployment` manifest as follows:

```{code-block} yaml
---
emphasize-lines: 6, 15
---
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: my-model
spec:
  protocol: v2
  predictors:
    - name: default
      graph:
        name: classifier
      componentSpecs:
        - spec:
            containers:
              - name: classifier
                image: my-custom-server:0.1.0
```

As we can see highlighted on the snippet above, all that's needed to deploy a custom
MLServer image is:

- Letting Seldon Core know that the model deployment will be served through the
  [V2 inference protocol](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/v2-protocol.html), by setting the `protocol` field to `v2`.
- Pointing our model container to our **custom MLServer image**, by specifying it on the
  `image` field of the `componentSpecs` section of the manifest.
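
The custom image referenced in the manifest needs to be built beforehand and pushed to a
registry that your cluster can pull from. One way to produce such an image is MLServer's
own `mlserver build` command, as covered in the [custom runtime docs](../../runtimes/custom);
the folder layout and tag below are illustrative only.

```bash
# Build a self-contained MLServer image from the folder containing your custom
# runtime code and its model-settings.json (folder layout and tag are illustrative)
mlserver build . -t my-custom-server:0.1.0

# Push the image to a registry reachable from your Kubernetes cluster
docker push my-custom-server:0.1.0
```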
Once you have your `SeldonDeployment` manifest ready, the next step is to apply it to
your cluster. There are multiple ways to do this, but the simplest is probably to apply
it directly through `kubectl`, by running:

```bash
kubectl apply -f my-seldondeployment-manifest.yaml
```
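
As before, you can then verify that the custom model has come up correctly. The commands
below are a sketch: the `seldon-deployment-id` label selector is an assumption about the
labels Seldon Core attaches to predictor pods and may differ between versions.

```bash
# Check the SeldonDeployment status (it should eventually report "Available")
kubectl get seldondeployment my-model -o jsonpath='{.status.state}'

# Tail the logs of the custom MLServer container to confirm the runtime loaded.
# The label selector is an assumption and may vary between Seldon Core versions.
kubectl logs -l seldon-deployment-id=my-model -c classifier
```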