In this lesson, you will learn what a Pandas Series is, how to create one in Python, and how to perform basic arithmetic operations on it.
A Pandas Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floats, etc.). It has two main components: the index and the data values.
Consider the following example of a Pandas Series:

    0    London
    1    New York
    2    Tokyo
    3    Paris
    4    Beijing
Here, the values on the left side (0, 1, 2, 3, 4) are the 'index' of the Pandas Series and the values on the right (London, New York, Tokyo, Paris, Beijing) are the actual 'data values' of the Series. As you can see, a Pandas Series closely resembles a single column or row of a table, where each value can be identified by an index that, by default, starts from 0.
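The two components described above can be inspected directly. The following is a minimal sketch that rebuilds the example Series; the variable name `cities` is chosen here purely for illustration.

```python
import pandas as pd

# Rebuild the example Series; with no index given, Pandas uses 0, 1, 2, ...
cities = pd.Series(["London", "New York", "Tokyo", "Paris", "Beijing"])

print(cities.index.tolist())  # the index labels: [0, 1, 2, 3, 4]
print(list(cities.values))    # the data values
print(cities.loc[2])          # value stored under index label 2 -> Tokyo
```

The `index` and `values` attributes expose the two halves of the Series separately, while `.loc` looks up a value by its index label.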
A Pandas Series can be created from a Python list, a dictionary, a NumPy array, a scalar value, and so on. Here, we will look at a few ways to create a Pandas Series.
The general syntax for creating a Pandas Series is:
    import pandas as pd

    series = pd.Series(data)
Here, data can be of different types, such as a Python list, a Python dictionary, a one-dimensional NumPy array, or a scalar value (like 10). In real-world usage, the data is generally loaded from an external source such as a database, a CSV file, or an Excel file.
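The scalar case is the least obvious of these, so here is a small sketch of it. It assumes you want the same value repeated for several labels; the labels 'a', 'b', 'c' and the name `scalar_series` are made up for illustration.

```python
import pandas as pd

# A scalar is repeated once for every label in the index
scalar_series = pd.Series(10, index=["a", "b", "c"])
print(scalar_series)

# Without an explicit index, a scalar produces a Series with a single element
single = pd.Series(10)
print(len(single))  # 1
```

When the data is a scalar, the length of the Series is determined by the index, which is why the index is usually supplied in this case.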
A Python list can be passed to the pd.Series() constructor to create a Pandas Series, as shown in the following example:
    import pandas as pd

    # Initializing a Python list
    lst = ['Python', 'Java', 'C', 'Ruby']

    # Creating a series
    series = pd.Series(lst)
    print(series)

    0    Python
    1      Java
    2         C
    3      Ruby
    dtype: object
Here, we did not specify an index when defining the series, so Pandas assigned integers increasing from 0 as the index values. To create a Series with a more meaningful index, pass the index parameter when creating it. Similarly, you can give the Series a name with the name parameter.
    import pandas as pd

    # Initializing a Python list
    lst = ['Python', 'Java', 'C', 'Ruby']

    # Creating a series with a custom index and a name
    series = pd.Series(lst, index=['1st', '2nd', '3rd', '4th'], name="Programming Languages")
    print(series)

    1st    Python
    2nd      Java
    3rd         C
    4th      Ruby
    Name: Programming Languages, dtype: object
Conversely, we can create a Python List from a Pandas Series as follows:
    # Converting Pandas Series into Python list
    series_to_list = series.tolist()
    series_to_list

    ['Python', 'Java', 'C', 'Ruby']
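The custom index labels used earlier are also how individual values are looked up. The sketch below rebuilds the labeled Series under the illustrative name `languages` and reads values by label and by position.

```python
import pandas as pd

languages = pd.Series(
    ["Python", "Java", "C", "Ruby"],
    index=["1st", "2nd", "3rd", "4th"],
    name="Programming Languages",
)

print(languages["2nd"])      # lookup by index label -> Java
print(languages.loc["4th"])  # explicit label-based lookup -> Ruby
print(languages.iloc[0])     # position-based lookup -> Python
print(languages.name)        # the name given to the Series
```

Using `.loc` for labels and `.iloc` for positions keeps the intent explicit, which helps when the index labels are themselves integers.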
Just as with a Python list, a NumPy array can be passed to the pd.Series() constructor to create a Pandas Series, as shown in the following example:
    import pandas as pd
    import numpy as np  # importing the NumPy module

    # Initializing a NumPy array
    numpy_array = np.array(['Apple', 'Mango', 'Grapes', 'Pineapple'])

    # Creating Pandas Series from NumPy array
    series_from_np = pd.Series(numpy_array, name="Fruits")
    series_from_np

    0        Apple
    1        Mango
    2       Grapes
    3    Pineapple
    Name: Fruits, dtype: object
Conversely, we can create a NumPy array from a Pandas Series as follows:
    # Converting Pandas Series into NumPy array
    series_to_numpy = series_from_np.to_numpy()
    series_to_numpy

    array(['Apple', 'Mango', 'Grapes', 'Pineapple'], dtype=object)
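The same round trip works for numeric arrays, where the Series keeps the array's numeric dtype rather than falling back to object. The following is a minimal sketch; the array contents and the names `heights` and `heights_series` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# A float array keeps its float64 dtype inside the Series
heights = np.array([1.5, 2.0, 3.25])
heights_series = pd.Series(heights, name="Heights")
print(heights_series.dtype)  # float64

# Converting back yields an array with the same values
round_trip = heights_series.to_numpy()
print(np.array_equal(round_trip, heights))  # True
```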
To create a Pandas Series from a Python dictionary, create the dictionary and pass it as the data parameter of the pd.Series() constructor. In this case, the keys of the dictionary become the index of the Pandas Series, and the data values are the corresponding values of the Python dictionary.
    import pandas as pd

    # Initialize a Python dictionary
    dictionary = {'1st': 'One', '2nd': 'Two', '3rd': 'Three', '4th': 'Four'}

    # Creating a Pandas Series from dictionary
    series_from_dict = pd.Series(dictionary)
    series_from_dict

    1st      One
    2nd      Two
    3rd    Three
    4th     Four
    dtype: object
Conversely, we can create a Python dictionary from a Pandas Series as follows:
    # Converting Pandas Series into dictionary
    series_to_dictionary = series_from_dict.to_dict()
    series_to_dictionary

    {'1st': 'One', '2nd': 'Two', '3rd': 'Three', '4th': 'Four'}
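Because dictionary keys become index labels, the index parameter can also be combined with a dictionary to select and reorder entries. The sketch below uses a made-up `prices` dictionary; a label that is not a key in the dictionary ends up with a missing value (NaN).

```python
import pandas as pd

# Hypothetical data, used only to illustrate the behaviour
prices = {"apple": 1.2, "mango": 2.5, "grapes": 3.0}

# Only the requested labels are kept, in the requested order
selected = pd.Series(prices, index=["mango", "apple", "kiwi"])
print(selected)

print(pd.isna(selected["kiwi"]))  # True, since "kiwi" is not a key in prices
```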
Arithmetic operations such as addition, subtraction, multiplication, and division can be performed on a Pandas Series, and NumPy functions can be applied to it element-wise. Here are some examples:
    import pandas as pd
    import numpy as np

    # Creating a Pandas Series
    s = pd.Series([1, 8, 5, 25, 6, 7])
    print("\ns:\n", s)

    # Arithmetic operations in series
    print("\ns+s:\n", s+s)  # element-wise addition
    print("\ns-s:\n", s-s)  # element-wise subtraction
    print("\ns*2:\n", s*2)  # multiplying each element by a scalar
    print("\ns/2:\n", s/2)  # dividing each element by a scalar
    print("\ne^s:\n", np.exp(s))  # element-wise exponential
    print("\nlog(s):\n", np.log(s))  # element-wise natural logarithm

    s:
    0     1
    1     8
    2     5
    3    25
    4     6
    5     7
    dtype: int64

    s+s:
    0     2
    1    16
    2    10
    3    50
    4    12
    5    14
    dtype: int64

    s-s:
    0    0
    1    0
    2    0
    3    0
    4    0
    5    0
    dtype: int64

    s*2:
    0     2
    1    16
    2    10
    3    50
    4    12
    5    14
    dtype: int64

    s/2:
    0     0.5
    1     4.0
    2     2.5
    3    12.5
    4     3.0
    5     3.5
    dtype: float64

    e^s:
    0    2.718282e+00
    1    2.980958e+03
    2    1.484132e+02
    3    7.200490e+10
    4    4.034288e+02
    5    1.096633e+03
    dtype: float64

    log(s):
    0    0.000000
    1    2.079442
    2    1.609438
    3    3.218876
    4    1.791759
    5    1.945910
    dtype: float64
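In the examples above both operands share the same index, so the results line up row by row. When two Series have different indexes, Pandas matches values by index label rather than by position. The following is a small sketch of that behaviour; the Series `a` and `b` and their labels are invented for illustration.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# Values are paired by label; labels present in only one Series give NaN
total = a + b
print(total)

# The add() method accepts fill_value to treat a missing label as 0 instead
filled = a.add(b, fill_value=0)
print(filled)
```

Here only 'y' and 'z' appear in both Series, so only those two labels get a numeric result from `a + b`.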
Now you know how to create a Pandas Series and perform various operations on it. The next chapter will introduce you to another commonly used data structure in Pandas, the Pandas DataFrame.