A data scientist’s job revolves chiefly around data, data analysis and modeling. But what comes after modeling? What is the purpose of a saved model? Today we will answer these questions in the simplest way possible: by deploying one.
Deployment is not a simple task. It requires development skills as well as a good understanding of the machine learning model, the data and the task the model is built for. It means setting up an environment that can not only produce a prediction for given inputs, but also serve that prediction as a service to the people who need it.
In this article, we will create a simple Fastai model to predict the price of a used car and deploy it as a service that users can access through a browser. We will not focus on the efficiency of the algorithm; instead, we will walk through the complete process of deploying the model in the simplest way possible.
The data used here is taken from MachineHack’s Predicting The Costs Of Used Cars – Hackathon.
To download the dataset, head to MachineHack, sign up and start the course.
Here are the features of the provided dataset:
Size of training set: 6,019 records
Size of test set: 1,234 records
- Name: The brand and model of the car.
- Location: The location in which the car is being sold or is available for purchase.
- Year: The year or edition of the model.
- Kilometers_Driven: The total kilometres driven in the car by the previous owner(s) in KM.
- Fuel_Type: The type of fuel used by the car.
- Transmission: The type of transmission used by the car.
- Owner_Type: Whether the ownership is first-hand, second-hand or other.
- Mileage: The standard mileage offered by the car company in kmpl or km/kg.
- Engine: The displacement volume of the engine in cc.
- Power: The maximum power of the engine in bhp.
- Seats: The number of seats in the car.
- New_Price: The price of a new car of the same model.
- Price: The price of the used car in INR Lakhs (target).
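To make the column list above concrete, here is a minimal Python sketch of the schema. The column names come from the list; the split into categorical and numeric groups and the helper `missing_columns` are illustrative assumptions, not part of the hackathon material.

```python
# Column names as listed in the dataset description above.
# Splitting them into categorical and numeric groups is an assumption
# made for illustration; the raw files may store some values as text.
CATEGORICAL = ["Name", "Location", "Fuel_Type", "Transmission", "Owner_Type"]
NUMERIC = ["Year", "Kilometers_Driven", "Mileage", "Engine", "Power",
           "Seats", "New_Price"]
TARGET = "Price"

FEATURES = CATEGORICAL + NUMERIC


def missing_columns(header, include_target=True):
    """Return the expected columns that are absent from a CSV header."""
    expected = FEATURES + ([TARGET] if include_target else [])
    present = set(header)
    return [col for col in expected if col not in present]
```

Calling `missing_columns(header)` on the header row of a downloaded file gives a quick check that nothing was lost; pass `include_target=False` for a file that is not expected to contain Price.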
Setting Up The Project
Creating A Virtual Environment
We will create a Python virtual environment for our project using Anaconda.
You can download and install Anaconda here.
Once your Anaconda distribution is installed, a conda virtual environment can be created using the following command (Click here for detailed instructions):
conda create -n yourenvname python=x.x anaconda
The above command may take some time to set up the complete environment. Alternatively, we can make use of the virtualenv module, which can be installed using pip install virtualenv. To create a virtual environment, execute the following commands at your terminal in the directory where you want the project to reside:
#Activates your anaconda environment:
#Check your environment
If all of the above modules belong to the Anaconda distribution, we can proceed:
#Creating a virtual environment
The above command will create a new directory with necessary python dependencies.
Now that we have our virtual environment we will deactivate Anaconda and activate the new environment.
#Move in to the virtual environment
#type the following command to activate the virtual environment
It’s highly recommended to use the Anaconda distribution to create the virtual environment, especially for macOS users, as dependency issues may arise while installing packages like Fastai.
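The "check your environment" step can also be done from inside Python. The sketch below uses only the standard library; the helper names `in_virtual_env` and `describe_interpreter` are made up for this example. Note that it detects environments created with virtualenv or venv. A conda environment is not flagged this way, so for conda look at the reported prefix path instead.

```python
import sys


def in_virtual_env():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


def describe_interpreter():
    """Summarise which Python is currently active."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtual_env": in_virtual_env(),
    }


print(describe_interpreter())
```

If the printed executable path does not point into the project's environment directory, the environment is probably not activated.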
Setting Up The Project Structure
Before we begin let’s structure our project directory:
Inside the virtual environment we will have the following files and directories:
- app.py: This is where we will write the Flask API that uses our saved model to predict the cost of used cars and serves the prediction.
- model: The directory that stores the saved model.pkl file.
- requirements.txt: This file contains all the modules required for the project. We will use pip to install the modules from this file.
- resources: This directory will contain all the resources, including the datasets and the notebooks used to train, save and export the model into a pickle file.
- templates: This directory contains the HTML template page.
To create the above structure, type and execute the following commands within your python virtual environment directory.
mkdir model/ resources/ templates/
touch app.py requirements.txt
We will look into each of the above files and directories in the following sections.
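For readers on a system without mkdir and touch, the same skeleton can be created from Python. This is a minimal sketch using the standard library's pathlib; the function name `create_project_skeleton` is made up for this example.

```python
from pathlib import Path


def create_project_skeleton(root):
    """Create the directories and empty files used in this walkthrough."""
    root = Path(root)
    for directory in ("model", "resources", "templates"):
        # exist_ok avoids an error when the directory is already there
        (root / directory).mkdir(parents=True, exist_ok=True)
    for filename in ("app.py", "requirements.txt"):
        # touch() creates the file if missing and leaves existing content alone
        (root / filename).touch()
    return sorted(p.name for p in root.iterdir())
```

Running `create_project_skeleton(".")` from inside the virtual environment directory should leave the same layout as the two shell commands above.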
Installing The Requirements
We will start by building the foundation for our project. We will need the following libraries to make this work. Copy the following block into the requirements.txt file we created above and save it.
To install the dependencies, open the terminal, change directory to the environment and type the following commands:
#activate the environment
When the environment is activated, you will see the environment name at the beginning of each terminal line.
#install the requirements in requirements.txt file using pip
pip install -r requirements.txt
Voila! Our project environment is ready! Now we can make and deploy our model.
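A quick way to confirm that the installation worked is to ask Python whether the packages can be found. The sketch below is one way to do that with the standard library; the package names in `REQUIRED` are an assumption based on the libraries this article mentions, so adjust the list to match your requirements.txt.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed top-level packages for this project; adjust to match
# whatever is actually listed in requirements.txt.
REQUIRED = ["flask", "fastai", "pandas"]

print("Missing:", missing_modules(REQUIRED) or "nothing")
```

Anything reported as missing usually means pip ran against a different interpreter than the one in the activated environment.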
Deploying The Model
We will deploy a machine learning model that predicts the cost of a used car when the user inputs certain information such as the brand, year, etc.
Building a Machine Learning Model
We tackled this problem in one of our previous articles using Fastai and will use the same model here. We will build the model, train it, and then save and package it by adding just a couple of lines.
Since the solution to the Predict The Cost Of Used Cars Hackathon has already been explained, we will go through it only briefly.
Download the datasets from MachineHack and move them to the resources/ directory in the virtual environment. Now launch Jupyter Notebook and create a new notebook called modeling.ipynb.
To launch Jupyter Notebook, you must first activate the Anaconda environment by typing conda activate.
All of the above steps have been explained in detail here.
After running the above notebook, we will have a model.pkl file and a models folder in the resources directory.
Just copy the model.pkl file into our virtual environment’s model/ directory.
Great! We have our model ready and we can start writing our API.
We can now go back to our project directory and start writing the Flask API in the app.py file we created earlier.
Setting the Working Directories
import os
cwd = os.getcwd()
path = os.path.join(cwd, 'model')
Initializing Flask API
from flask import Flask, request, render_template
app = Flask(__name__)
Loading The ML Model
#Loading the saved model using Fastai's load_learner method (Fastai v1 API)
from fastai.basic_train import load_learner
model = load_learner(path, 'model.pkl')
A Home Page For The Web Service
#Defining the home page for the web service
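A home page route could look like the following minimal sketch. It assumes the input form lives in templates/index.html, as described in the project structure, and it repeats the app creation so that the snippet stands on its own. The function name `home` is an assumption.

```python
from flask import Flask, render_template

app = Flask(__name__)


@app.route("/")
def home():
    # Serve templates/index.html, the input form for the model
    return render_template("index.html")
```

Flask looks for index.html in a templates directory next to app.py by default, which is why the directory name matters.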
API For Prediction
This block of code provides the core functionality of our application: it predicts with the loaded machine learning model from the inputs provided by the user and returns the prediction to the user.
The code block defines a function that reads the values submitted by the user from Flask's request object, converts the inputs into a pandas Series (a Fastai learner can predict directly from a pandas Series) and uses the loaded model to predict the price. The result is then returned to the HTML page using the render_template method.
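The flow described above can be sketched as follows. This is not the article's original code: the form field names, the `/predict` URL, the `prediction_text` template variable and the helper `form_to_series` are all assumptions, and a `StubModel` stands in for the Fastai learner so the sketch runs without a trained model. A Fastai v1 learner's predict method is assumed to return a tuple whose first item is the prediction.

```python
import pandas as pd
from flask import Flask, render_template, request

app = Flask(__name__)

# Hypothetical form field names; they must match the inputs in index.html
# and the columns the model was trained on.
TEXT_FIELDS = ["Name", "Location", "Fuel_Type", "Transmission", "Owner_Type"]
NUMBER_FIELDS = ["Year", "Kilometers_Driven", "Mileage", "Engine", "Power",
                 "Seats"]


class StubModel:
    """Stand-in for the Fastai learner so the sketch runs on its own."""

    def predict(self, row):
        # Mimics the assumed return shape: (prediction, ..., ...)
        return (0.0, None, None)


model = StubModel()  # in app.py this would be the learner from load_learner


def form_to_series(form):
    """Convert submitted form values into a pandas Series for the model."""
    values = {name: form[name] for name in TEXT_FIELDS}
    # Form values arrive as strings, so numeric inputs are converted here
    values.update({name: float(form[name]) for name in NUMBER_FIELDS})
    return pd.Series(values)


@app.route("/predict", methods=["POST"])
def predict():
    row = form_to_series(request.form)
    prediction = model.predict(row)[0]
    # prediction_text is an assumed template variable name
    return render_template(
        "index.html",
        prediction_text=f"Predicted price: {prediction} Lakhs",
    )
```

Keeping the conversion in its own function makes it easy to test without starting the server, and a missing or non-numeric field fails early with a clear error instead of reaching the model.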
The API is now ready to serve.
Before serving our model, we will create a very simple HTML page to collect inputs for the model from users. Go to the templates directory that we created in our project environment and create an index.html file.
Copy the following contents into the index.html file and save the changes.
All Ready To Serve
Open up the terminal and change directory to the project environment. Type and execute the following commands:
Open up a browser and head to http://127.0.0.1:5000/
Let’s try it out:
All the source code can be found here.