In the introductory lesson of this course, we learned the basic concepts behind Supervised Machine Learning. In this lesson, we will discuss why Python is commonly preferred for Machine Learning along with Python libraries for Machine Learning. We will learn how to install some essential libraries and tools available in Python that will be useful for us in solving Machine Learning problems.
Why is Python preferred for Machine Learning?
Python is one of the most popular programming languages of 2020 and it has a large global community of adopters.
Because so many people use it and contribute to it, Python now offers a wide spectrum of libraries for data extraction, pre-processing, visualization and modeling. With these libraries, we can build, train and evaluate many Machine Learning models in just four or five lines of code. This is a large part of why Python is so popular in the data science community.
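To make the "four or five lines of code" claim concrete, here is a minimal sketch. It assumes scikit-learn is already installed (installation is covered later in this lesson). The choice of the built-in Iris dataset, a k-nearest-neighbors classifier and n_neighbors=3 is ours, purely for illustration; it is not a method prescribed by this course.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: X holds the features, y the class labels.
X, y = load_iris(return_X_y=True)

# Hold out part of the data so the model is evaluated on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Build and train the model (illustrative choice of algorithm and setting).
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Evaluate: fraction of held-out samples that are classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Loading the data, splitting it, training and evaluating each take one line, which is the pattern most scikit-learn workflows follow.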
Installing necessary Python libraries and tools
For this course, we will install a few Python libraries that help us build Machine Learning models and work effectively with data. We recommend installing them inside a virtual environment, so that they stay separate from other Python projects on your machine.
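One possible way to set up a virtual environment is sketched below using Python's standard-library venv module, which is the same module that the command python -m venv runs. The function name create_environment and the directory name used in the note afterwards are hypothetical, chosen only for this example.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its path."""
    env_path = Path(env_dir)
    # with_pip=True also installs pip into the new environment,
    # so packages can be installed into it afterwards.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

For example, create_environment("ml-env") creates a directory named ml-env in the current folder. The environment still has to be activated from the command prompt/terminal before installing packages: source ml-env/bin/activate on macOS and Linux, or ml-env\Scripts\activate on Windows.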
The Python libraries needed for Machine Learning in this course are listed below:
- NumPy is one of the fundamental libraries in Python containing functionality for working with multidimensional arrays, mathematical functions, and operations.
- SciPy contains a collection of functions for scientific computing in Python. It provides functionality for optimization, linear algebra, signal processing, image processing, etc. and is built on top of NumPy.
- Matplotlib provides functions for plotting line charts, histograms, scatter plots, etc. for visualizing data and its distributions.
- Pandas provides data structures, most notably the DataFrame, for loading, cleaning, manipulating and analyzing tabular data.
- scikit-learn is the most popular and commonly used library for building and evaluating Machine Learning models in Python. It provides several state-of-the-art machine learning algorithms and tools that can be easily used in a few lines of code. The scikit-learn library is built on top of NumPy and SciPy.
- Jupyter Notebook is a browser-based interactive programming environment and a great tool for data analysis and visualizations.
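As a small taste of how two of these libraries fit together, the sketch below builds a NumPy array and wraps it in a pandas DataFrame. It assumes NumPy and pandas are installed; the numbers and the column names (height, width, area) are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: a 2-D array with three rows (samples) and two columns (features).
features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Mean of each column, computed across the rows.
column_means = features.mean(axis=0)

# pandas: wrap the same values in a labeled table (a DataFrame).
table = pd.DataFrame(features, columns=["height", "width"])

# Add a derived column computed element-wise from two existing columns.
table["area"] = table["height"] * table["width"]

print(column_means)
print(table)
```

NumPy handles the raw numerical arrays, while pandas adds labels and table-style operations on top of them.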
If you don’t have Python installed yet, please complete that installation first. You can then use pip to install all of these packages by running the following command in the command prompt/terminal:
$ pip install numpy scipy matplotlib scikit-learn pandas ipython jupyterlab
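To confirm that the installation worked, a short check like the following can help. It is a sketch that uses only Python's standard library (importlib.metadata) and the package names from the pip command above; the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names exactly as passed to pip in the command above.
PACKAGES = [
    "numpy",
    "scipy",
    "matplotlib",
    "scikit-learn",
    "pandas",
    "ipython",
    "jupyterlab",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(PACKAGES)
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```

Any package reported as not installed can be installed again with pip before continuing.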
Before moving on to the next lesson, we suggest that you open a Jupyter Notebook and set up your coding environment. You can do so by running the command jupyter lab in the command prompt/terminal from your working directory. If you are a more advanced Python user and have your own preferences, feel free to use the IDE you prefer. However, all of the coding examples in this course are written to be run in Jupyter Notebook cells.
Now that everything is set up, we are ready to start learning Machine Learning with Python. See you in the next lesson!