Up to this chapter, we have created DataFrames from data that we generated ourselves. In this chapter, we will learn to read data into Pandas from external sources such as CSV, JSON, and Excel files. We will also learn to write a Pandas DataFrame to external files.
Pandas works with a variety of file formats (such as Excel, JSON, and CSV) for both loading and writing data. Pandas contains a set of data reader functions, such as read_csv, read_json, and read_excel, that generally return a pandas data structure, i.e., a DataFrame or a Series. The corresponding writer functions, such as to_csv, to_json, and to_excel, are object methods that convert a pandas object into the specified format.
Handling CSV & Text Files in Pandas
CSV & text files (flat files) can be easily loaded as a pandas DataFrame using the reader function read_csv(). The general syntax for it is:
dataFrameName = pandas.read_csv("path_of_file_stored")
In the syntax above:
- path_of_file_stored can be a local file path or a web link to the file
- other optional parameters can be specified, such as index_col, nrows, header, skiprows, etc.
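As a minimal sketch of how these optional parameters behave, the snippet below reads a small in-memory CSV instead of a file on disk. The sample text and its column names (id, name, score) are made up for illustration; io.StringIO simply stands in for a file path.

```python
import io

import pandas as pd

# Hypothetical sample data: one junk line, a header row, three records
csv_text = """exported sample file
id,name,score
1,Ana,91
2,Ben,78
3,Cy,85
"""

df = pd.read_csv(
    io.StringIO(csv_text),  # a file path or URL would normally go here
    skiprows=1,             # skip the junk line at the top
    header=0,               # the first remaining line holds the column names
    index_col="id",         # use the "id" column as the row index
    nrows=2,                # read only the first two data rows
)
print(df)
```

Here skiprows is applied before header, so header=0 refers to the first line that remains after skipping.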
A DataFrame can also be written to a CSV or a text file using the to_csv() method. The general syntax for it is:
dataFrameName.to_csv("path_to_file_storage")
EXAMPLE:
# Making necessary imports
import pandas as pd

# Read the CSV file into a DataFrame, then write it to a new CSV file
df = pd.read_csv("../test.csv")
df.to_csv("../new_test.csv")
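One detail worth seeing in practice is that to_csv() writes the row index as an extra first column by default. The sketch below uses a small made-up DataFrame and calls to_csv() without a path, which returns the CSV text as a string, so that the two outputs can be compared directly.

```python
import pandas as pd

# Hypothetical sample data used only for illustration
df = pd.DataFrame({"name": ["Ana", "Ben"], "score": [91, 78]})

# With no path argument, to_csv() returns the CSV text instead of writing a file
csv_with_index = df.to_csv()
csv_without_index = df.to_csv(index=False)

print(csv_with_index)
print(csv_without_index)
```

Passing index=False is a common choice when the index is just the default 0, 1, 2, ... numbering and is not needed in the file.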
Handling JSON Files in Pandas
JSON files can be imported using the reader function read_json(). The general syntax for it is:
dataFrameName = pandas.read_json('path_of_file_stored')
Similar to CSV files, path_of_file_stored can be a local file path or a web link to the JSON file.
Conversely, you can also convert a pandas DataFrame to a JSON file using the to_json() function. The syntax for it is:
dataFrameName.to_json('path_to_file_storage')
EXAMPLE:
# Making necessary imports
import pandas as pd

# Read the JSON file into a DataFrame, then write it to a new JSON file
df = pd.read_json("../test.json")
df.to_json("../new_test.json")
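The layout of the JSON that to_json() produces depends on its orient parameter. As a rough illustration, the sketch below round-trips a small made-up DataFrame through JSON text using orient="records", which stores one JSON object per row. The data is hypothetical, and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data used only for illustration
df = pd.DataFrame({"name": ["Ana", "Ben"], "score": [91, 78]})

# With no path argument, to_json() returns the JSON text as a string
json_text = df.to_json(orient="records")
print(json_text)

# Read the same text back; a file path or URL would normally go here
restored = pd.read_json(io.StringIO(json_text), orient="records")
print(restored)
```

Using the same orient value for writing and reading keeps the round trip predictable.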
Handling Excel Files in Pandas
You can easily import an Excel file in pandas using the reader function read_excel(). The general syntax for it is:
dataFrameName = pandas.read_excel('path_to_excelFile.xlsx', sheet_name='excel_sheet_name')
Conversely, you can also convert a pandas DataFrame to an Excel file using to_excel(). Syntax:
dataFrameName.to_excel('path_to_store_excelFile')
EXAMPLE:
# Making necessary imports
import pandas as pd

# Read one sheet of the Excel file, then write it to a new Excel file
df = pd.read_excel("../test.xlsx", sheet_name="sheet1")
df.to_excel("../new_test.xlsx")
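Reading and writing .xlsx files relies on an optional Excel engine that is installed separately from pandas (openpyxl is a common one). The sketch below assumes such an engine may or may not be present: it writes a small made-up DataFrame to an in-memory buffer under a sheet named "scores" (a name chosen here for illustration), reads that sheet back, and falls back gracefully if no engine is available.

```python
import io

import pandas as pd

# Hypothetical sample data used only for illustration
df = pd.DataFrame({"name": ["Ana", "Ben"], "score": [91, 78]})

buffer = io.BytesIO()  # stands in for a path such as "scores.xlsx"
try:
    # Write the DataFrame to a named sheet, leaving out the row index
    df.to_excel(buffer, sheet_name="scores", index=False)
    buffer.seek(0)
    # Read the same sheet back by name
    restored = pd.read_excel(buffer, sheet_name="scores")
    print(restored)
except ImportError:
    # Raised when the optional Excel engine is not installed
    restored = None
    print("No Excel engine found; try installing openpyxl.")
```

The sheet_name passed to read_excel() has to match a sheet that actually exists in the workbook, which is why the same name is used for both calls.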
Now you should be familiar with loading different data files in Pandas. In the next two chapters, you will learn to manipulate and visualize the data in Pandas DataFrames.