Welcome to your first lesson on Pandas for Data Science!
In this lesson, you will learn about Pandas along with how to install and import it in Python. You will also learn how to check the version of the installed Pandas library.
What is Pandas?
According to the official documentation, Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.
It is built on top of the NumPy library and provides flexible data structures for manipulating numerical tables and time series. Its broader goal is to become the most powerful and flexible open-source data analysis and manipulation tool available in any language.
Using only two kinds of data structures, Pandas Series and Pandas DataFrame, the library can handle the majority of data used in finance, statistics, and various other fields alike. You will be learning about these data structures in upcoming lessons.
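As a quick preview of these two structures, here is a minimal sketch. The labels and numbers are made up for illustration, and later lessons cover both structures properly.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (illustrative values only).
prices = pd.Series([101.5, 102.0, 99.8], index=["Mon", "Tue", "Wed"], name="price")

# A DataFrame is a two-dimensional labeled table; each column is a Series.
table = pd.DataFrame(
    {"price": [101.5, 102.0, 99.8], "volume": [1200, 950, 1430]},
    index=["Mon", "Tue", "Wed"],
)

print(prices["Tue"])         # look up a value by its label
print(table.shape)           # (number of rows, number of columns)
print(type(table["price"]))  # selecting one column gives back a Series
```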
How to install Pandas in Python?
Before installing Pandas, make sure you have an updated version of Python installed on your device. If not, you may learn how to do so from this article on installing Python.
Once Python is installed, Pandas can be installed from the command line using Python's package installer, pip, as follows:
pip install pandas
Note: If you are using Anaconda, Pandas comes pre-installed.
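If you are unsure whether Pandas is already available in your environment, one way to check from Python itself is sketched below. The helper name is_installed is hypothetical; it relies only on the standard library's importlib.util.find_spec and does not import the package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Reports whether the current environment can find pandas.
print("pandas installed:", is_installed("pandas"))
```

If this reports False, run the pip command above in the same environment and try again.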
How to import Pandas in Python?
From this section onward, we will use Jupyter Notebook as our interactive Python development environment. If you do not have Jupyter Notebook installed, you can follow this guide to install and learn more about it.
To import Pandas and check the currently installed version, enter the following code in a new Jupyter Notebook cell and run it (Shift+Enter):
# Library import convention
import pandas as pd

# Check version of pandas installed
print(pd.__version__)
If your Pandas version is greater than 1.0.0, then everything is now ready!
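If you would rather have Python do the comparison for you, a rough sketch follows. The helper version_at_least is hypothetical, it looks only at the major and minor numbers, and it treats 1.0 itself as acceptable, which is an assumption about the intended threshold.

```python
import pandas as pd


def version_at_least(version: str, minimum: tuple) -> bool:
    """Compare the leading major.minor numbers of a version string."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


# True when the installed pandas is version 1.0 or newer.
print(version_at_least(pd.__version__, (1, 0)))
```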
Note that by writing the line import pandas as pd, we are importing the pandas library under the name pd. The name pd is an arbitrary alias used by convention, similar to how np is used for NumPy. Following the convention simply keeps code uniform and easy for others to read.
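To see that the alias is only another name for the same library, you can run a small check like the one below. This is an illustrative sketch, not something you need in everyday code.

```python
import pandas
import pandas as pd

# Both names refer to the very same module object.
print(pd is pandas)

# The module's real name is unchanged by the alias.
print(pd.__name__)
```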
Now that you’ve installed Pandas and imported it in Python, head over to the next chapter where you will learn about one of the fundamental data structures used in Pandas called a Pandas Series.