# Serving Scikit-Learn models

Out of the box, `mlserver` supports the deployment and serving of `scikit-learn` models. By default, it will assume that these models have been [serialised using `joblib`](https://scikit-learn.org/stable/modules/model_persistence.html).

In this example, we will cover how we can train and serialise a simple model, to then serve it using `mlserver`.

## Training

The first step will be to train a simple `scikit-learn` model. For that, we will use the [MNIST example from the `scikit-learn` documentation](https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html), which trains an SVM model.

```python
# Original source code and more details can be found in:
# https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html

# Import datasets and classifiers
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# The digits dataset
digits = datasets.load_digits()

# To apply a classifier on this data, we need to flatten the images, to
# turn the data into a (samples, features) matrix:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)

# Split data into train and test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# We learn the digits on the first half of the dataset
classifier.fit(X_train, y_train)
```

### Saving our trained model

To save our trained model, we will serialise it using `joblib`. While this is not a perfect approach, it's currently the method recommended by the [`scikit-learn` documentation](https://scikit-learn.org/stable/modules/model_persistence.html) to persist models to disk.

Our model will be persisted as a file named `mnist-svm.joblib`.

```python
import joblib

model_file_name = "mnist-svm.joblib"
joblib.dump(classifier, model_file_name)
```

## Serving

Now that we have trained and saved our model, the next step will be to serve it using `mlserver`. For that, we will need to create 2 configuration files:

- `settings.json`: holds the configuration of our server (e.g. ports, log level, etc.).
- `model-settings.json`: holds the configuration of our model (e.g. input type, runtime to use, etc.).

### `settings.json`

```python
%%writefile settings.json
{
    "debug": "true"
}
```

### `model-settings.json`

```python
%%writefile model-settings.json
{
    "name": "mnist-svm",
    "implementation": "mlserver_sklearn.SKLearnModel",
    "parameters": {
        "uri": "./mnist-svm.joblib",
        "version": "v0.1.0"
    }
}
```

### Start serving our model

Now that we have our config in place, we can start the server by running `mlserver start .`. This needs to be run either from the same directory where our config files are, or pointing to the folder where they are.

```shell
mlserver start .
```

Since this command will start the server and block the terminal, waiting for requests, it will need to be run in the background or on a separate terminal.

### Send test inference request

We now have our model being served by `mlserver`. To make sure that everything is working as expected, let's send a request from our test set.

For that, we can use the Python types that `mlserver` provides out of the box, or we can build our request manually.
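As a minimal sketch of the first option, assuming the `mlserver` package is also installed in the client environment, the same payload can be built with the Pydantic models under `mlserver.types` (serialising via `.dict()` here is an assumption based on these being Pydantic models; newer Pydantic versions prefer `.model_dump()`):

```python
# Sketch only: build the request with mlserver's own types, assuming
# `mlserver` is installed in the client environment.
import requests
from mlserver.types import InferenceRequest, RequestInput

x_0 = X_test[0:1]
inference_request = InferenceRequest(
    inputs=[
        RequestInput(
            name="predict",
            shape=list(x_0.shape),
            datatype="FP32",
            data=x_0.flatten().tolist(),
        )
    ]
)

endpoint = "http://localhost:8080/v2/models/mnist-svm/versions/v0.1.0/infer"
# The request types are Pydantic models, so they can be dumped to a plain
# dict before sending
response = requests.post(endpoint, json=inference_request.dict())
```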
Alternatively, we can build the request manually as a plain dictionary:

```python
import requests

x_0 = X_test[0:1]
inference_request = {
    "inputs": [
        {
            "name": "predict",
            "shape": x_0.shape,
            "datatype": "FP32",
            "data": x_0.tolist()
        }
    ]
}

endpoint = "http://localhost:8080/v2/models/mnist-svm/versions/v0.1.0/infer"
response = requests.post(endpoint, json=inference_request)

response.json()
```

As we can see above, the model predicted the input as the number `8`, which matches what's on the test set.

```python
y_test[0]
```
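We can also check this programmatically rather than by eye. A small sketch, assuming the standard V2 inference response layout, where the predictions come back under `outputs[0]["data"]`:

```python
# Sketch of a programmatic check, assuming the standard V2 response
# layout where the prediction is returned under outputs[0]["data"]
prediction = response.json()["outputs"][0]["data"][0]
print(f"Predicted: {prediction}, actual: {y_test[0]}")
assert prediction == y_test[0]
```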