Python Pandas Tutorial: Everything Beginners Need to Know about Python Pandas

Updated on 25 November, 2022

This article looks at Pandas, one of the most popular Python libraries for data professionals. You’ll learn its basics as well as the core operations it supports.

Let’s get started. 

What is Pandas?

Python Pandas is popular for many reasons. Its primary applications are data manipulation, analysis, and cleaning. You can use it with various kinds of data sets, including unlabelled data and ordered time-series data. Put simply, Pandas is your data’s home: once your data is loaded, you can perform numerous operations on it with this tool.

You can convert a file from one data format to another, merge two data sets, run calculations, visualize the results with the help of Matplotlib, and more. With so many capabilities, it’s a popular choice among data professionals, which is why learning it is essential. You can’t use it well without understanding how it works, so that’s what this Python Pandas tutorial focuses on.

Role of Pandas in Data Science

The Pandas library is an integral part of any data professional’s arsenal. It’s based on NumPy, which is another popular Python library. A lot of NumPy’s structure is present in Pandas, so if you’re familiar with the former, you wouldn’t have any difficulty in getting familiar with the latter. 

Data professionals often use Pandas to prepare data and feed it into SciPy for statistical analysis. They also pass this data to Matplotlib for plotting or to Scikit-learn for machine learning.
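
To make the NumPy connection concrete, here is a minimal sketch showing that a data frame column can be handed over to NumPy as an ordinary array. The column names and values are made up for illustration and are not from any real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"height_cm": [150.0, 160.0, 170.0],
                   "weight_kg": [50.0, 60.0, 70.0]})

# to_numpy() returns the column's values as a NumPy array
heights = df["height_cm"].to_numpy()

print(type(heights))      # <class 'numpy.ndarray'>
print(np.mean(heights))   # 160.0
```

Libraries such as SciPy and Scikit-learn generally accept arrays like this, which is one reason Pandas fits so easily into the rest of the Python data stack.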

Prerequisites

Before we discuss how Python Pandas works and what it can do, let’s be clear about what you need to know first. You should be familiar with Python fundamentals and with NumPy.

The first one, Python fundamentals, is vital for obvious reasons: without knowing how Python code works, you won’t be able to follow the examples or try them out yourself.

The second one, NumPy, matters because Pandas is built on it. An understanding of NumPy will help you considerably in getting familiar with Pandas.

You can learn about Python through our blogs on data science and Python. We have many helpful guides and articles that can make you familiar with the basics. It’s free, and if you have any doubts, you can write them down in the comment section. 

If you’re familiar with both of the topics we mentioned, let’s take a closer look at Pandas.

Installing Pandas

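To use Pandas, you’ll first have to install it. Installation is straightforward: open the command line (Command Prompt on Windows, Terminal on macOS or Linux) and run the command that matches your package manager. The choice depends on how you installed Python, not on your operating system.

If you use pip: pip install pandas

If you use Anaconda or Miniconda: conda install pandas

Once it’s installed, import it at the top of your script with import pandas as pd. That is where the pd prefix in the examples below comes from.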
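
A quick way to confirm that the installation worked is to import the library and print its version. This is only a sanity check; the version number you see depends on your machine.

```python
import pandas as pd

# If this import succeeds, Pandas is installed.
# The exact version string varies from machine to machine.
installed_version = pd.__version__
print(installed_version)
```

If the import fails with a ModuleNotFoundError, the package was most likely installed into a different Python environment than the one you are running.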

In Pandas, you’ll be dealing with Series and DataFrames. A Series is a one-dimensional labelled array, which you can think of as a single column, while a DataFrame is a two-dimensional table made up of multiple Series that share an index. Let’s now take a look at the operations you can perform in Pandas.
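
The difference between the two structures is easiest to see in code. The sketch below builds one of each from made-up values (the names visitors and traffic are illustrative only).

```python
import pandas as pd

# A Series is a single labelled column of values
visitors = pd.Series([200, 100, 230], name="Visitors")

# A DataFrame is a table made of one or more columns
traffic = pd.DataFrame({"Day": [1, 2, 3], "Visitors": [200, 100, 230]})

# Selecting one column of a DataFrame gives back a Series
column = traffic["Visitors"]

print(type(column).__name__)   # Series
print(traffic.shape)           # (3, 2)
```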

Operations in Pandas

Now that we’ve covered what Pandas is and why it matters, let’s look at the operations you can perform with it. Pandas provides a large number of functions; we cover some of the most common ones below. The examples assume your data has already been loaded into a data frame named file1.

Data viewing

When you start working with a data set, it helps to print a few of its rows as a visual reference. You can do so with the .head() function.

file1.head()

This function gives you the first five rows of the data frame. If you want more rows, pass the required number to the function. For example, to get the first 15 rows of the data frame, you’d write the following code:

file1.head(15)

You also have the option of viewing the last five rows of the data frame. You can do so by using the .tail() function. And just like the .head() function, the .tail() function can also accept a number and give you the required quantity of rows.

file1.tail(20)

This code would give you the last 20 rows of your data frame. 
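
Here is a small runnable sketch of both functions. Since the tutorial’s file1 is not shown, the example builds a hypothetical 30-row stand-in for it.

```python
import pandas as pd

# Hypothetical stand-in for file1: 30 rows numbered 0 to 29
file1 = pd.DataFrame({"row_id": range(30),
                      "value": [i * 2 for i in range(30)]})

first_five = file1.head()        # default is 5 rows
first_fifteen = file1.head(15)
last_twenty = file1.tail(20)

print(len(first_five), len(first_fifteen), len(last_twenty))   # 5 15 20
print(last_twenty["row_id"].iloc[0])                            # 10
```

Both functions return a new data frame, so the original file1 is left unchanged.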

Getting Information

One of the first functions data scientists use with Pandas is .info(). That’s because it displays information about the data frame and gives you a deeper understanding of what you’re working with. Here’s how you use it in Pandas:

file1.info()

It provides you with a lot of useful information about the dataset, such as the number of rows, the number of non-null values in each column, the type of data present in each column, etc.

Knowing the datatype of your data frame’s values is essential in many cases. Suppose you need to perform arithmetic operations on a column, but it contains strings. When you run your mathematical operations, you’ll get an error or an unexpected result (adding two strings, for instance, joins them instead of summing them). If, on the other hand, you use the .info() function before doing any operations, you’ll already know that you have strings.

While the .info() function shows you general information about your dataset, the .shape attribute gives you a tuple of the form (rows, columns). It tells you how many rows and columns your dataset has. You can use it in the following way:

file1.shape

.shape doesn’t take parentheses because it’s an attribute, not a function. You’ll be using it quite often while cleaning your data, for example to check how many rows remain after you drop some.
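
The sketch below puts .info() and .shape together on a hypothetical data frame whose price column was read in as text, then converts that column so arithmetic works. The data and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical data: "price" holds text rather than numbers
file1 = pd.DataFrame({"item": ["pen", "book", "bag"],
                      "price": ["10", "25", "40"]})

file1.info()          # prints column names, non-null counts and dtypes
print(file1.shape)    # (3, 2): 3 rows, 2 columns

# Convert the text column to numbers before doing arithmetic
file1["price"] = pd.to_numeric(file1["price"])
total = file1["price"].sum()
print(total)          # 75
```

Here pd.to_numeric is one way to do the conversion; which approach suits your data depends on how the text values are formatted.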

Concatenation

Let’s now discuss concatenation in this Python Pandas tutorial. Concatenation refers to joining two or more things together. With the pd.concat() function, you can combine two datasets without modifying their values or data points in any way; they’re joined as is. By default, the rows of the second data frame are stacked below the rows of the first. Here’s how:

result = pd.concat([file1, file2])

This combines the file1 and file2 dataframes and returns them as a single data frame.

import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

concat = pd.concat([df1, df2])

print(concat)

The output of the above code (with a recent version of Pandas, the columns appear in the order in which they were defined):

      HPI  Int_Rate  IND_GDP
2001   80         2       50
2002   90         1       45
2003   70         2       45
2004   60         3       67
2005   80         2       50
2006   90         1       45
2007   70         2       45
2008   60         3       67

You must’ve noticed how the .concat() function has combined the two dataframes and converted them into one. 
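
When the data frames you combine use the default numeric index, the result keeps each frame’s original labels, so the same label can appear more than once. The following sketch, using two small hypothetical frames, shows how the ignore_index option renumbers the rows instead.

```python
import pandas as pd

# Hypothetical frames that both use the default 0..n-1 index
df_a = pd.DataFrame({"HPI": [80, 90], "Int_Rate": [2, 1]})
df_b = pd.DataFrame({"HPI": [70, 60], "Int_Rate": [2, 3]})

stacked = pd.concat([df_a, df_b])                         # keeps original labels
renumbered = pd.concat([df_a, df_b], ignore_index=True)   # builds a fresh index

print(list(stacked.index))      # [0, 1, 0, 1]
print(list(renumbered.index))   # [0, 1, 2, 3]
```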

Changing the Index

You can change the index of your data frame as well. For that purpose, use the .set_index() function and pass it the name of the column you want to use as the new index. Take a look at the following example to understand it better.

import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4], "Visitors": [200, 100, 230, 300], "Bounce_Rate": [20, 45, 60, 10]})

df.set_index("Day", inplace=True)

print(df)

The output of the above code:

     Visitors  Bounce_Rate
Day
1         200           20
2         100           45
3         230           60
4         300           10

You can see that the Day column has become the index of the data frame, so each row is now labelled by its day.
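
As a variation, here is a minimal sketch that sets the index without inplace=True and then undoes the change. The names by_day and restored are illustrative, not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4],
                   "Visitors": [200, 100, 230, 300]})

# Without inplace=True, set_index returns a new DataFrame and leaves df unchanged
by_day = df.set_index("Day")
print(by_day.loc[3, "Visitors"])   # 230

# reset_index moves the index back into a regular column
restored = by_day.reset_index()
print(list(restored.columns))      # ['Day', 'Visitors']
```

Returning a new data frame rather than modifying the existing one makes it easier to keep the original around for comparison.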

Changing the Column Headers

You can change the column headers in Python Pandas as well. All you have to do is use the .rename() function. Pass its columns argument a dictionary that maps each existing column name to the new name you want.

Suppose you have a table with a column header ‘Time,’ and you want to change it to ‘Hours.’ You can rename this column with the following code:

df = df.rename(columns={"Time": "Hours"})

This code changes the column header from ‘Time’ to ‘Hours.’ Clear, consistent column names make the rest of your code easier to read. Let’s take a look at how you can convert the formats of your data.
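
The same dictionary can rename several columns at once. The sketch below uses a hypothetical table; the Visitors column and its new name are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical table with a "Time" column
df = pd.DataFrame({"Time": [1, 2, 3], "Visitors": [200, 100, 230]})

# rename returns a new DataFrame, so assign the result back
df = df.rename(columns={"Time": "Hours", "Visitors": "Visits"})

print(list(df.columns))   # ['Hours', 'Visits']
```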

Data Munging

With data munging, you transform data into a form that’s easier to work with. One simple case is converting data from one file format to another: you can convert a .csv file into an .html file, or vice versa. Here’s an example of how you can do so:

import pandas as pd

country = pd.read_csv(r"D:\Users\User1\Downloads\world-bank-youth-unemployment\API_ILO_country_YU.csv", index_col=0)

country.to_html("file1.html")

After you run this code, Pandas creates an HTML file named file1.html, which you can open in your browser. Format conversion like this comes in handy in many situations.
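
If you don’t have that CSV file on your machine, the following self-contained sketch performs the same conversion on made-up data. It writes both files to a temporary folder; the file names, column names, and values are all hypothetical.

```python
import os
import tempfile

import pandas as pd

# Made-up data standing in for the World Bank file
sample = pd.DataFrame({"Country": ["Country A", "Country B"],
                       "Rate": [10.0, 20.0]})

with tempfile.TemporaryDirectory() as folder:
    csv_path = os.path.join(folder, "sample.csv")
    html_path = os.path.join(folder, "sample.html")

    sample.to_csv(csv_path, index=False)            # write the CSV file
    country = pd.read_csv(csv_path, index_col=0)    # first column becomes the index
    country.to_html(html_path)                      # write the HTML table

    with open(html_path, encoding="utf-8") as handle:
        html_text = handle.read()

print("<table" in html_text)   # True
```

The temporary folder is deleted when the with block ends, so nothing is left behind on disk.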

Conclusion

And now, we have reached the end of this Python Pandas tutorial. We hope you found it useful and informative. Python Pandas is a vast topic, and with the numerous functions it has, it would take some time for one to get familiar with it completely. 

If you’re interested in learning more about Python, its various libraries, including Pandas, and its application in data science, check out IIIT-B & upGrad’s PG Diploma in Data Science which is created for working professionals and offers 10+ case studies & projects, practical hands-on workshops, mentorship with industry experts, 1-on-1 with industry mentors, 400+ hours of learning and job assistance with top firms.

Frequently Asked Questions (FAQs)

1. Do I need to know Python for using Pandas?

Pandas is a package built for Python, so you need a firm grip on the basics and syntax of Python programming to use it with ease. For working with tabular data in Python, Pandas is widely considered the best choice.
You don’t need to spend a huge amount of time on Python first, though. Put in just enough time to get comfortable with the basic syntax, and you can start on tasks involving Pandas.

2. How long does it take to learn Pandas in Python?

Pandas is the most widely used Python library for dealing with tabular data. You can use it for many of the tasks you might otherwise do in Excel. If you already know Python programming and its syntax, you can get familiar with how Pandas works in about two weeks. When you’re starting out, begin with basic data manipulation projects to get a grip on it.
As you progress further, you’ll notice that Pandas is a very useful data science tool that can be a key factor driving business decisions in several industries.

3. Should I learn NumPy or Pandas first?

It’s better to learn NumPy before Pandas because NumPy is the most fundamental module in Python for scientific computing. It provides highly optimized multidimensional arrays, which are the basic data structure behind most machine learning algorithms.
Once you’ve learned NumPy, move on to Pandas, which is built on top of it: the underlying code of Pandas uses the NumPy library extensively.