There has been a tremendous rise in machine learning applications lately, but are they really useful to industry? The actual value of these applications is determined by successful deployments and effective production-level operations.

According to a survey by Algorithmia, 55% of companies have never deployed a machine learning model, and 85% of models never make it to production. Some of the main reasons for this failure are a lack of talent, the absence of processes to manage change, and the lack of automated systems. To tackle these challenges, the practices of DevOps and operations must be brought into machine learning development, which is what MLOps is all about.
MLOps, also known as Machine Learning Operations for Production, is a set of standardized practices used to build, deploy, and govern the lifecycle of ML models. In simple words, MLOps is a bunch of technical engineering and operational tasks that allow your machine learning model to be used by other users and applications across the organization.
There are seven stages in an MLOps lifecycle, which execute iteratively, and the success of a machine learning application depends on the success of each of these individual stages. A problem faced at one stage can force backtracking to a previous stage to check for any bugs introduced. Two open-source tools make this lifecycle much easier to manage: PyCaret and MLflow.

PyCaret is an open source, low-code machine learning library in Python that allows you to go from preparing your data to deploying your model within minutes in your choice of notebook environment.

MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. MLflow currently offers four components: MLflow Tracking, MLflow Projects, MLflow Models, and the MLflow Model Registry.
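To get a feel for the Tracking component on its own, here is a minimal sketch of logging a parameter and a metric by hand (the run name and the values are made up for illustration); later, PyCaret will do this kind of logging for us automatically:

import mlflow

# Start a run and log a hypothetical hyperparameter and metric;
# MLflow records these under ./mlruns by default
with mlflow.start_run(run_name="manual-example"):
    mlflow.log_param("alpha", 0.5)
    mlflow.log_metric("accuracy", 0.87)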

It will be easier to understand the MLOps process, PyCaret, and MLflow with an example. For this exercise we'll use the heart disease dataset from https://www.kaggle.com/ronitf/heart-disease-uci . This database contains 76 attributes, but all published experiments use a subset of 14 of them; in particular, the Cleveland database is the only one that has been used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient and is integer-valued from 0 (no presence) to 4. First, we'll install PyCaret, import libraries, and load the data:
!pip install pycaret pandas shap

# Import pandas and PyCaret's classification module
import pandas as pd
from pycaret.classification import *

# Load the heart disease dataset and preview the first rows
df = pd.read_csv('heart.csv')
df.head()

Common to all modules in PyCaret, setup is the first and only mandatory step in any machine learning experiment using PyCaret. This function takes care of all the data preparation required prior to training models. Here we will pass log_experiment = True and experiment_name = 'diamond'; this tells PyCaret to automatically log all the metrics, hyperparameters, and model artifacts behind the scenes as you progress through the modeling phase. This is possible due to its integration with MLflow.
# Declare the categorical features, then initialize the experiment
cat_features = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'thal']
experiment = setup(df, target='target', categorical_features=cat_features,
                   log_experiment=True, experiment_name='diamond')
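If you want to inspect what setup produced, PyCaret's get_config helper exposes the internal objects it created; a small sketch (the 'X_train' and 'y_train' keys follow PyCaret's documented configuration names):

# Retrieve the transformed training features and labels created by setup
X_train = get_config('X_train')
y_train = get_config('y_train')
X_train.head()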

Now that the data is ready, let's train the models using the compare_models function. It trains all the algorithms available in the model library and evaluates multiple performance metrics using k-fold cross-validation.
best_model = compare_models()
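compare_models prints a leaderboard and returns the best estimator. If you also want the comparison table itself as a DataFrame, for example to export or log it, PyCaret's pull function returns the last displayed scoring grid (a sketch; the filename here is arbitrary):

# Grab the leaderboard from the last command as a pandas DataFrame
results = pull()
results.to_csv('model_comparison.csv', index=False)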

Let's now finalize the best model, i.e. train the best model on the entire dataset including the hold-out set, and then save the pipeline as a pickle file. The finalize_model function refits on the full dataset, and save_model saves the entire pipeline (including the model) as a pickle file on your local disk.
# Refit the best pipeline on the full dataset, then persist it to disk
final_model = finalize_model(best_model)
save_model(final_model, model_name='ridge-model')

Remember that we passed log_experiment = True in the setup function along with experiment_name = 'diamond'. Now we can launch the MLflow UI to see the logs of all the models and pipelines.
mlflow ui
Now open your browser and go to localhost:5000. The MLflow tracking UI lists every training run under the 'diamond' experiment along with its metrics, hyperparameters, and artifacts.
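The same logs can also be queried programmatically, which is handy in automated pipelines. A minimal sketch using MLflow's client API (assuming the experiment was logged under the name 'diamond', as above):

from mlflow.tracking import MlflowClient

# Look up the experiment by name and fetch its logged runs
client = MlflowClient()
experiment = client.get_experiment_by_name('diamond')
runs = client.search_runs([experiment.experiment_id])
for run in runs:
    print(run.info.run_id, run.data.metrics)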

We can now load this pipeline at any time and generate predictions on new data:

# Load the saved pipeline from disk
model = load_model('ridge-model')

# Score a few rows with PyCaret's predict_model, which applies the
# full preprocessing pipeline before predicting
predictions = predict_model(model, data=df.tail())
predictions

And that's how an end-to-end machine learning pipeline is trained, logged, and saved, ready to be used by other users and applications across the organization.
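To make the saved pipeline consumable by other applications, one common pattern is to wrap it in a small REST service. Here is a minimal sketch using Flask (the route, port, and payload format are assumptions for illustration, not part of PyCaret):

# serve.py - a minimal Flask scoring service around the saved pipeline
from flask import Flask, request, Response
import pandas as pd
from pycaret.classification import load_model, predict_model

app = Flask(__name__)
model = load_model('ridge-model')  # loads ridge-model.pkl from the working directory

@app.route('/predict', methods=['POST'])
def predict():
    # Expect a JSON list of records with the same feature columns as heart.csv
    data = pd.DataFrame(request.get_json())
    preds = predict_model(model, data=data)
    # pandas' to_json handles numpy dtypes that the default JSON encoder would reject
    return Response(preds.to_json(orient='records'), mimetype='application/json')

if __name__ == '__main__':
    app.run(port=8000)

A client can then POST feature rows as JSON and receive back the original columns plus PyCaret's prediction columns.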