# Model Repository API

MLServer supports loading and unloading models dynamically from a models repository.
This allows you to enable and disable the models accessible by MLServer on demand.
This extension builds on top of the support for [Multi-Model Serving](../mms/README.md), letting you change at runtime which models MLServer is currently serving.

The API to manage the model repository is modelled after [Triton's Model Repository extension](https://github.com/triton-inference-server/server/blob/master/docs/protocol/extension_model_repository.md) to the V2 Dataplane and is thus fully compatible with it.

This notebook will walk you through an example using the Model Repository API.

## Training

First of all, we will need to train some models.
For that, we will re-use the models we trained previously in the [Multi-Model Serving example](../mms/README.md).
You can check the details on how they are trained by following that notebook.

```python
!cp -r ../mms/models/* ./models
```

## Serving

Next up, we will start our `mlserver` inference server.
Note that, by default, this will **load all our models**.

```shell
mlserver start .
```

## List available models

Now that we've got our inference server up and running, and serving 2 different models, we can start using the Model Repository API.
To get us started, we will first list all available models in the repository.

```python
import requests

response = requests.post("http://localhost:8080/v2/repository/index", json={})
response.json()
```

As we can see, the repository lists 2 models (i.e. `mushroom-xgboost` and `mnist-svm`).
Note that the state for both is set to `READY`.
This means that both models are loaded, and thus ready for inference.

## Unloading our `mushroom-xgboost` model

We will now try to unload one of the 2 models, `mushroom-xgboost`.
This will unload the model from the inference server, but it will keep it available in our model repository.

```python
requests.post("http://localhost:8080/v2/repository/models/mushroom-xgboost/unload")
```

If we now try to list the models available in our repository, we will see that the `mushroom-xgboost` model is flagged as `UNAVAILABLE`.
This means that it's present in the repository, but it's not loaded for inference.

```python
response = requests.post("http://localhost:8080/v2/repository/index", json={})
response.json()
```

## Loading our `mushroom-xgboost` model back

We will now load our model back into our inference server.

```python
requests.post("http://localhost:8080/v2/repository/models/mushroom-xgboost/load")
```

If we now try to list the models again, we will see that our `mushroom-xgboost` model is back, ready for inference.

```python
response = requests.post("http://localhost:8080/v2/repository/index", json={})
response.json()
```
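As an aside, Triton's Model Repository extension (which this API is modelled after) also defines an optional `ready` field on the index request to filter the listing down to loaded models. Assuming your MLServer version supports that field, a minimal sketch looks like this:

```python
import requests

# Ask the repository index to return only models that are currently
# loaded and ready for inference. The "ready" field comes from Triton's
# Model Repository extension; support may vary across MLServer versions.
response = requests.post(
    "http://localhost:8080/v2/repository/index",
    json={"ready": True},
)
response.json()
```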
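Finally, if you are scripting load / unload cycles against this API, you may want to block until a model is actually ready before sending inference requests. Below is a minimal sketch of a hypothetical `wait_until_ready` helper (not part of MLServer itself), assuming the standard V2 Dataplane readiness endpoint `GET /v2/models/{name}/ready`:

```python
import time

import requests

def wait_until_ready(model_name: str, timeout: float = 30.0) -> bool:
    """Poll the V2 readiness endpoint until the model is ready.

    Hypothetical convenience helper, not part of MLServer's API.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # The standard V2 Dataplane returns a 200 status code here once
        # the model has been loaded and is ready for inference.
        response = requests.get(
            f"http://localhost:8080/v2/models/{model_name}/ready"
        )
        if response.status_code == 200:
            return True
        time.sleep(0.5)
    return False

requests.post("http://localhost:8080/v2/repository/models/mushroom-xgboost/load")
wait_until_ready("mushroom-xgboost")
```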