In the previous lecture, you saw some of the issues in raw data and understood the need for data cleaning. Let's learn how to correct some of these issues at the level of rows and columns.
Let’s summarise what you learnt in the form of a checklist. Make sure that you correctly identify these issues and resolve them before moving on to the next stage of data cleaning.
Checklist for fixing rows
Delete incorrect rows: header rows and footer rows.
Delete extra rows: column-number rows, indicator rows, blank rows and page-number rows.
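To make the row checklist concrete, here is a minimal sketch that uses only Python's standard library. The sample text, the `fix_rows` function and the `header_index` parameter are all hypothetical and only illustrate the idea. The rules it applies are simple heuristics: a row with the wrong number of fields is treated as a footer or page-number row, and a row reading 1, 2, 3, ... is treated as a column-number row. Check them against your own file before relying on them.

```python
import csv
import io

# Hypothetical raw export: a title line, the real header, a column-number row,
# a blank row, two data rows and a page-number footer.
RAW = """Monthly sales report
region,units,revenue
1,2,3

North,120,3400
South,95,2710
Page 1 of 1
"""

def fix_rows(text, header_index=1):
    """Return (header, data_rows) with incorrect and extra rows removed."""
    rows = list(csv.reader(io.StringIO(text)))
    # Everything above header_index (here, the title line) is discarded.
    header = rows[header_index]
    column_numbers = [str(i) for i in range(1, len(header) + 1)]
    cleaned = []
    for row in rows[header_index + 1:]:
        # Blank rows: csv.reader yields an empty list for an empty line.
        if not any(cell.strip() for cell in row):
            continue
        # Footer and page-number rows: assumed to have the wrong field count.
        if len(row) != len(header):
            continue
        # Column-number row: assumed to read exactly 1, 2, 3, ...
        if row == column_numbers:
            continue
        cleaned.append(row)
    return header, cleaned

header, data = fix_rows(RAW)
print(header)
print(data)
```

If you work in pandas instead, `pandas.read_csv` accepts `skiprows` and `skipfooter` for the same purpose, and `DataFrame.dropna(how="all")` removes fully blank rows.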
Checklist for fixing columns
Add missing column names.
Rename columns consistently, replacing abbreviations and encoded names with clear, uniform ones.
Delete unnecessary columns.
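The column checklist can be sketched in the same spirit. The table, the `fix_columns` function and its three arguments (`missing_names`, `renames`, `drop`) are hypothetical, and the chosen column names are just one possible convention.

```python
# Hypothetical table: the second column has no name, the names follow
# different conventions, and "Notes" is not needed for the analysis.
header = ["Cust ID", "", "rev_USD", "Notes"]
rows = [
    ["C001", "North", "3400", "call back"],
    ["C002", "South", "2710", ""],
]

def fix_columns(header, rows, missing_names, renames, drop):
    """Name blank columns, rename the rest consistently, drop unneeded ones."""
    names = []
    for i, name in enumerate(header):
        name = name.strip()
        if not name:
            # Add a missing column name, looked up by column position.
            # Raises KeyError if a blank column was not given a name.
            name = missing_names[i]
        # Rename to the agreed convention; unmapped names stay as they are.
        names.append(renames.get(name, name))
    # Keep only the positions of columns that are not marked for deletion.
    keep = [i for i, name in enumerate(names) if name not in drop]
    new_header = [names[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows

new_header, new_rows = fix_columns(
    header,
    rows,
    missing_names={1: "region"},
    renames={"Cust ID": "customer_id", "rev_USD": "revenue_usd", "Notes": "notes"},
    drop={"notes"},
)
print(new_header)
print(new_rows)
```

In pandas, the corresponding calls are `DataFrame.rename(columns=...)` and `DataFrame.drop(columns=...)`.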
Save this checklist for future reference.
In the next segment, we will talk about another very important concept: missing values.