Artificial Intelligence (AI) is no longer just a buzzword but an important part of an organisation's value chain. Tech giants are pouring resources and capital into building state-of-the-art AI solutions for core business functions. This race has produced a wave of tools that make working with AI feel like child's play.

Today, AI frameworks are accessible to almost anyone with a computer and an internet connection. The frameworks and libraries that have emerged in recent years have proven their value in making AI easy to implement.

Contents

    Introduction To Fastai
    Why is it important?
    Machine Learning with Fastai
        Getting The Dataset
        Installing fastai
        Getting Started With Regression
    Complete Code

Introduction To Fastai
In this article, we will learn about an emerging framework called fastai. Fastai is a deep learning library that simplifies the implementation of deep learning networks and makes them more accessible.

The library supports the major deep learning application areas: vision, text, tabular data, and collaborative filtering.

Find the official GitHub repository here.

Why is it important?
In machine learning, time and accuracy matter most. The faster a model can produce results that are close to accurate, the better. Optimising a model to produce accurate results quickly is not an easy task, and this is where frameworks play a big role. The objective of any ML framework is simplicity in implementation and optimisation. Fastai offers both.

In the following section, we will see how fastai can produce impressive results without the need for extensive tuning or optimisation.

Machine Learning with Fastai
Getting The Dataset
For this illustration, I am using MachineHack's Predicting The Costs Of Used Cars Hackathon dataset.
Head to www.machinehack.com and sign up for the hackathon to download the datasets.

Installing fastai
To install fastai, run pip install fastai on your command line. If you are using a conda distribution, activate your environment with conda activate and then either install with pip as above or run conda install -c pytorch -c fastai fastai. The code in this article uses the fastai v1 API (TabularList, DataBunch), so if a newer major version gets installed, pin the release with pip install "fastai==1.*".

For more information, visit the official GitHub page here.

Getting Started With Regression
Regression with fastai in 7 simple steps:

    Importing the libraries
    Creating a TabularList
    Initialising the neural network
    Training the model
    Evaluating the model
    A simple analysis of the predictions on the validation set
    Predicting using the network

For the complete code, including data preprocessing, check the last section of the article.

Step 1. Importing The Libraries
import pandas as pd
import numpy as np
from fastai.tabular import *

The fastai.tabular package includes all the operations required for transforming tabular data.

Step 2. Creating A TabularList
TabularList creates a list of inputs in items for tabular data. cat_names and cont_names are the names of the categorical and continuous variables respectively. processor will be applied to the inputs, or one will be created from the transforms in procs.

The procs argument specifies the transformations required for the dataset. In the code for this step, this single argument takes care of all the data preprocessing stages, such as filling missing values, encoding categorical features and normalising.

Code Summary:

    Setting the parameters for TabularList, such as path, dep_var, cat_names, cont_names and procs.
    Setting the index for the validation set. The start and end indices are chosen so that the last 20% of the training data is used for validation.
    Creating a TabularList for the validation set from train_data.
    Creating a TabularList for the test set from test_data.
    Creating a DataBunch for the network. DataBunch is a class that binds train_dl, valid_dl and test_dl in a data object.

# Display the data batch
data.show_batch(rows=10)

The above code displays a batch of processed data from the DataBunch.

Step 3. Initialising Neural Network
# Initialising the network
learn = tabular_learner(data, layers=[300, 200, 100, 50], metrics=[rmse, r2_score])

The above line of code initialises a neural network with four hidden layers of 300, 200, 100 and 50 nodes respectively.

The network will use two primary metrics for evaluation:

    Root Mean Squared Error (RMSE)
    R-Squared

# Show the complete summary of the model
learn.summary()

Step 4. Training The Model
# Exploring the learning rates
learn.lr_find(start_lr=1e-05, end_lr=1e+05, num_it=100)
learn.recorder.plot()

The learning rate is a hyperparameter that controls how much the weights of the network are adjusted with respect to the loss gradient. The lr_find method explores learning rates in a specified range, and the resulting plot shows how the loss varies with the learning rate.

# Fitting the data and training the network
learn.fit_one_cycle(25)

The above line trains the network for 25 epochs.

Step 5. Evaluating The Model
# Display predictions on training data
learn.show_results(ds_type=DatasetType.Train, rows=5)
# Display predictions on validation data
learn.show_results(ds_type=DatasetType.Valid)

The show_results method displays rows from the chosen dataset along with their predicted values.
The first call limits the training-set display to five rows, while the second uses the default row count for the validation set.

Fetching The Metrics
# Getting the training and validation errors
tr = learn.validate(learn.data.train_dl)
va = learn.validate(learn.data.valid_dl)
print("The Metrics used In Evaluating The Network:", str(learn.metrics))
print("\nThe calculated RMSE & R-Squared For The Training Set :", tr[1:])
print("\nThe calculated RMSE & R-Squared For The Validation Set :", va[1:])

The code block above fetches and prints the calculated metrics for the training and validation data. learn.validate returns the loss followed by the metrics, so the slice [1:] keeps only the RMSE and R-Squared values.

Summary:

The Root Mean Squared Error is the square root of the mean of the squared errors (residuals), which is roughly their standard deviation. It indicates a model's goodness of fit: the lower the RMSE, the better the model.

The R-Squared metric, also called the coefficient of determination, measures the proportion of the variation in the dependent variable (y) that is explained by the independent variables (X). The closer the value of R-Squared is to one, the better the model.

In this run, the network attained an RMSE of 1.4678 and an R-Squared of 0.9726 on the training set, and an RMSE of 3.1737 and an R-Squared of 0.9107 on the validation set.

Plotting The Losses
# Plotting the losses for training and validation
learn.recorder.plot_losses()

The above code plots the training and validation losses.

The resulting graph shows the change in loss over the course of training. At the beginning of training, the loss is high. As the network learned from the data, the loss dropped until it could no longer improve. The validation loss stays relatively consistent and low.
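To make the RMSE and R-Squared definitions from the Summary concrete, here is a minimal sketch that computes both by hand in plain Python. The function names and the price values are made up for illustration; they do not come from fastai or from the hackathon dataset.

```python
import math

def rmse_by_hand(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared_by_hand(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up prices, not taken from the hackathon dataset
actual = [3.0, 5.0, 7.5, 10.0]
predicted = [2.5, 5.5, 7.0, 11.0]

print(round(rmse_by_hand(actual, predicted), 4))       # 0.6614
print(round(r_squared_by_hand(actual, predicted), 4))  # 0.9368
```

A perfect set of predictions would give an RMSE of 0 and an R-Squared of 1, which is why a lower RMSE and an R-Squared closer to one both point to a better fit.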
Note: The validation losses are only calculated once per epoch, whereas training losses are calculated after each batch.

Plotting The Learning Rate, Momentum And Metrics
# Plotting momentum and learning rate
learn.recorder.plot_lr(show_moms=True)
# Plotting the evaluation metrics
learn.recorder.plot_metrics()

Step 6. A Simple Analysis On The Predictions Of The Validation Set
# Plotting the average price for a given car brand: actual vs predicted
# val is assumed to be the validation DataFrame with 'Price' and 'Predicted' columns (see the complete code)
import matplotlib.pyplot as plt
plt.figure(figsize=(30, 3))
plt.plot(val.groupby('Brand')['Price'].mean(), linewidth=3, label='Actual')
plt.plot(val.groupby('Brand')['Predicted'].mean(), linewidth=5, ls='--', label='Predicted')
plt.title('Average Price By Brands')
plt.xlabel('Brands')
plt.ylabel('Price In Lacs')
plt.legend()
plt.show()

The above graph compares the average actual price with the average predicted price for each brand.

Step 7. Predicting Using The Network
# Predicting for a single observation
# Test set data for row 0
test_data.iloc[0]

# Predicting for the complete test set
# get_preds returns predictions and targets; keep only the predictions
test_predictions = learn.get_preds(ds_type=DatasetType.Test)[0]
# Converting the tensor output to a flat list of predicted values
test_predictions = [i[0] for i in test_predictions.tolist()]
# Converting the predictions to a DataFrame
test_predictions = pd.DataFrame(test_predictions, columns=['Price'])
# Writing the predictions to an Excel file
test_predictions.to_excel("Fast_ai_solution.xlsx", index=False)

Submit the above file here to find out your score. Good luck!

Complete Code
Click here for notebook.
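Step 2 holds out the last 20% of the training rows for validation. As a rough sketch of how such an index range might be computed, the snippet below uses a hypothetical helper name and a hypothetical row count; neither is taken from the notebook.

```python
def last_fraction_range(n_rows, fraction=0.2):
    """Row positions covering the last `fraction` of a dataset with n_rows rows."""
    n_valid = int(round(n_rows * fraction))
    return range(n_rows - n_valid, n_rows)

# Hypothetical row count, used only for illustration
valid_idx = last_fraction_range(1000)
print(valid_idx.start, valid_idx.stop, len(valid_idx))  # 800 1000 200
```

In fastai v1, a range like this can be passed to split_by_idx on the TabularList, which is one way the validation split described in Step 2 could be wired up. The notebook may do it differently.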