An important concept in Pandas dataframes is that of the row and column indices. By default, rows are labelled with integers starting from 0, which are displayed to the left of the dataframe. For columns, the first row in the file (CSV, text, etc.) is taken as the column header. If the file has no header row and you pass header=None, the columns are labelled in the same way as the rows, with integers starting from 0.
The Pandas library lets you set the row index and the column names of a dataframe manually. Let us now learn how to change the default indices and replace them with more meaningful ones. The required notebook is the same as the one used in the previous segment.
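As a minimal sketch of the default labelling, the snippet below reads a small header-less CSV from memory. The data and the name `csv_text` are made up for illustration; a real file path would work the same way.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk (no header row).
csv_text = "swift,2.0,petrol\ncity,3.5,diesel\n"

# header=None tells pandas that the first line is data, not column names,
# so the columns receive integer labels starting from 0, just like the rows.
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(df_no_header.index.tolist())    # default row labels
print(df_no_header.columns.tolist())  # default column labels
```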
You can change the row indices by assigning a new list of labels (one per row) to the index attribute:
dataframe_name.index = list_of_row_labels
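For instance, a small sketch of replacing the default row labels might look like this. The dataframe contents and the labels "car_a" and "car_b" are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical toy dataframe; it starts with the default labels 0 and 1.
cars = pd.DataFrame({"model": ["swift", "city"], "price": [2.0, 3.5]})

# Assign one new label per row; the list length must match the number of rows.
cars.index = ["car_a", "car_b"]

# Rows can now be selected by the new labels.
print(cars.loc["car_b", "price"])
```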
To set the index while loading the data from a file, you can use the 'index_col' parameter of read_csv:
pd.read_csv(filepath, index_col = column_number)
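The following sketch shows index_col in use on an in-memory CSV, so that it runs without a file on disk. The column names and values are invented for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content with a header row.
csv_text = "model,price,fuel\nswift,2.0,petrol\ncity,3.5,diesel\n"

# index_col=0 uses the first column ("model") as the row index.
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(cars.index.tolist())    # row labels taken from the "model" column
print(cars.columns.tolist())  # remaining columns
```

index_col also accepts a column name, so index_col="model" should give the same result here.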
Note: In recent versions of pandas, the command del cars.index.name no longer works for removing the name of the index; use cars.index.name = None instead.
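A short sketch of clearing the index name is given below. The cars dataframe here is a made-up stand-in for the one used in the notebook.

```python
import pandas as pd

# Hypothetical data; set_index moves "model" into the index and keeps its name.
cars = pd.DataFrame({"model": ["swift", "city"], "price": [2.0, 3.5]})
cars = cars.set_index("model")
name_before = cars.index.name  # the index is named after the original column

# Remove the index name; the row labels themselves are unchanged.
cars.index.name = None
```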
It is also possible to create multilevel indexing for your dataframe; this is known as hierarchical indexing. Let’s watch the following video and learn how to do it.
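As one possible illustration of hierarchical indexing, the sketch below builds a two-level index by passing a list of columns to set_index. The columns "fuel" and "model" and all values are assumptions, not data from the notebook.

```python
import pandas as pd

# Hypothetical toy data with two columns that together identify a row.
cars = pd.DataFrame({
    "fuel": ["petrol", "petrol", "diesel"],
    "model": ["swift", "city", "city"],
    "price": [2.0, 3.5, 4.0],
})

# A list of columns creates a MultiIndex: "fuel" is the outer level,
# "model" the inner level. Sorting keeps label lookups efficient.
cars_multi = cars.set_index(["fuel", "model"]).sort_index()

# Select every row under one outer label.
petrol_cars = cars_multi.loc["petrol"]

# Select a single value with a (outer, inner) tuple.
city_diesel_price = cars_multi.loc[("diesel", "city"), "price"]
```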
For the column headers, you can specify the column names using the following code:
dataframe_name.columns = list_of_column_names
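A minimal sketch of renaming the columns, assuming a dataframe that was created without headers; the names "model" and "price" are illustrative.

```python
import pandas as pd

# Hypothetical dataframe built without column names, so the labels are 0 and 1.
cars = pd.DataFrame([["swift", 2.0], ["city", 3.5]])

# Assign one name per column; the list length must match the number of columns.
cars.columns = ["model", "price"]

print(cars.columns.tolist())
```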
In the next segment, you will learn about different operations that can be performed over 1D NumPy arrays.