Explore Courses
Liverpool Business SchoolLiverpool Business SchoolMBA by Liverpool Business School
  • 18 Months
Bestseller
Golden Gate UniversityGolden Gate UniversityMBA (Master of Business Administration)
  • 15 Months
Popular
O.P.Jindal Global UniversityO.P.Jindal Global UniversityMaster of Business Administration (MBA)
  • 12 Months
New
Birla Institute of Management Technology Birla Institute of Management Technology Post Graduate Diploma in Management (BIMTECH)
  • 24 Months
Liverpool John Moores UniversityLiverpool John Moores UniversityMS in Data Science
  • 18 Months
Popular
IIIT BangaloreIIIT BangalorePost Graduate Programme in Data Science & AI (Executive)
  • 12 Months
Bestseller
Golden Gate UniversityGolden Gate UniversityDBA in Emerging Technologies with concentration in Generative AI
  • 3 Years
upGradupGradData Science Bootcamp with AI
  • 6 Months
New
University of MarylandIIIT BangalorePost Graduate Certificate in Data Science & AI (Executive)
  • 8-8.5 Months
upGradupGradData Science Bootcamp with AI
  • 6 months
Popular
upGrad KnowledgeHutupGrad KnowledgeHutData Engineer Bootcamp
  • Self-Paced
upGradupGradCertificate Course in Business Analytics & Consulting in association with PwC India
  • 06 Months
OP Jindal Global UniversityOP Jindal Global UniversityMaster of Design in User Experience Design
  • 12 Months
Popular
WoolfWoolfMaster of Science in Computer Science
  • 18 Months
New
Jindal Global UniversityJindal Global UniversityMaster of Design in User Experience
  • 12 Months
New
Rushford, GenevaRushford Business SchoolDBA Doctorate in Technology (Computer Science)
  • 36 Months
IIIT BangaloreIIIT BangaloreCloud Computing and DevOps Program (Executive)
  • 8 Months
New
upGrad KnowledgeHutupGrad KnowledgeHutAWS Solutions Architect Certification
  • 32 Hours
upGradupGradFull Stack Software Development Bootcamp
  • 6 Months
Popular
upGradupGradUI/UX Bootcamp
  • 3 Months
upGradupGradCloud Computing Bootcamp
  • 7.5 Months
Golden Gate University Golden Gate University Doctor of Business Administration in Digital Leadership
  • 36 Months
New
Jindal Global UniversityJindal Global UniversityMaster of Design in User Experience
  • 12 Months
New
Golden Gate University Golden Gate University Doctor of Business Administration (DBA)
  • 36 Months
Bestseller
Ecole Supérieure de Gestion et Commerce International ParisEcole Supérieure de Gestion et Commerce International ParisDoctorate of Business Administration (DBA)
  • 36 Months
Rushford, GenevaRushford Business SchoolDoctorate of Business Administration (DBA)
  • 36 Months
KnowledgeHut upGradKnowledgeHut upGradSAFe® 6.0 Certified ScrumMaster (SSM) Training
  • Self-Paced
KnowledgeHut upGradKnowledgeHut upGradPMP® certification
  • Self-Paced
IIM KozhikodeIIM KozhikodeProfessional Certification in HR Management and Analytics
  • 6 Months
Bestseller
Duke CEDuke CEPost Graduate Certificate in Product Management
  • 4-8 Months
Bestseller
upGrad KnowledgeHutupGrad KnowledgeHutLeading SAFe® 6.0 Certification
  • 16 Hours
Popular
upGrad KnowledgeHutupGrad KnowledgeHutCertified ScrumMaster®(CSM) Training
  • 16 Hours
Bestseller
PwCupGrad CampusCertification Program in Financial Modelling & Analysis in association with PwC India
  • 4 Months
upGrad KnowledgeHutupGrad KnowledgeHutSAFe® 6.0 POPM Certification
  • 16 Hours
O.P.Jindal Global UniversityO.P.Jindal Global UniversityMaster of Science in Artificial Intelligence and Data Science
  • 12 Months
Bestseller
Liverpool John Moores University Liverpool John Moores University MS in Machine Learning & AI
  • 18 Months
Popular
Golden Gate UniversityGolden Gate UniversityDBA in Emerging Technologies with concentration in Generative AI
  • 3 Years
IIIT BangaloreIIIT BangaloreExecutive Post Graduate Programme in Machine Learning & AI
  • 13 Months
Bestseller
IIITBIIITBExecutive Program in Generative AI for Leaders
  • 4 Months
upGradupGradAdvanced Certificate Program in GenerativeAI
  • 4 Months
New
IIIT BangaloreIIIT BangalorePost Graduate Certificate in Machine Learning & Deep Learning (Executive)
  • 8 Months
Bestseller
Jindal Global UniversityJindal Global UniversityMaster of Design in User Experience
  • 12 Months
New
Liverpool Business SchoolLiverpool Business SchoolMBA with Marketing Concentration
  • 18 Months
Bestseller
Golden Gate UniversityGolden Gate UniversityMBA with Marketing Concentration
  • 15 Months
Popular
MICAMICAAdvanced Certificate in Digital Marketing and Communication
  • 6 Months
Bestseller
MICAMICAAdvanced Certificate in Brand Communication Management
  • 5 Months
Popular
upGradupGradDigital Marketing Accelerator Program
  • 05 Months
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Corporate & Financial Law
  • 12 Months
Bestseller
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in AI and Emerging Technologies (Blended Learning Program)
  • 12 Months
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Intellectual Property & Technology Law
  • 12 Months
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Dispute Resolution
  • 12 Months
upGradupGradContract Law Certificate Program
  • Self paced
New
ESGCI, ParisESGCI, ParisDoctorate of Business Administration (DBA) from ESGCI, Paris
  • 36 Months
Golden Gate University Golden Gate University Doctor of Business Administration From Golden Gate University, San Francisco
  • 36 Months
Rushford Business SchoolRushford Business SchoolDoctor of Business Administration from Rushford Business School, Switzerland)
  • 36 Months
Edgewood CollegeEdgewood CollegeDoctorate of Business Administration from Edgewood College
  • 24 Months
Golden Gate UniversityGolden Gate UniversityDBA in Emerging Technologies with Concentration in Generative AI
  • 36 Months
Golden Gate University Golden Gate University DBA in Digital Leadership from Golden Gate University, San Francisco
  • 36 Months
Liverpool Business SchoolLiverpool Business SchoolMBA by Liverpool Business School
  • 18 Months
Bestseller
Golden Gate UniversityGolden Gate UniversityMBA (Master of Business Administration)
  • 15 Months
Popular
O.P.Jindal Global UniversityO.P.Jindal Global UniversityMaster of Business Administration (MBA)
  • 12 Months
New
Deakin Business School and Institute of Management Technology, GhaziabadDeakin Business School and IMT, GhaziabadMBA (Master of Business Administration)
  • 12 Months
Liverpool John Moores UniversityLiverpool John Moores UniversityMS in Data Science
  • 18 Months
Bestseller
O.P.Jindal Global UniversityO.P.Jindal Global UniversityMaster of Science in Artificial Intelligence and Data Science
  • 12 Months
Bestseller
IIIT BangaloreIIIT BangalorePost Graduate Programme in Data Science (Executive)
  • 12 Months
Bestseller
O.P.Jindal Global UniversityO.P.Jindal Global UniversityO.P.Jindal Global University
  • 12 Months
WoolfWoolfMaster of Science in Computer Science
  • 18 Months
New
Liverpool John Moores University Liverpool John Moores University MS in Machine Learning & AI
  • 18 Months
Popular
Golden Gate UniversityGolden Gate UniversityDBA in Emerging Technologies with concentration in Generative AI
  • 3 Years
Rushford, GenevaRushford Business SchoolDoctorate of Business Administration (AI/ML)
  • 36 Months
Ecole Supérieure de Gestion et Commerce International ParisEcole Supérieure de Gestion et Commerce International ParisDBA Specialisation in AI & ML
  • 36 Months
Golden Gate University Golden Gate University Doctor of Business Administration (DBA)
  • 36 Months
Bestseller
Ecole Supérieure de Gestion et Commerce International ParisEcole Supérieure de Gestion et Commerce International ParisDoctorate of Business Administration (DBA)
  • 36 Months
Rushford, GenevaRushford Business SchoolDoctorate of Business Administration (DBA)
  • 36 Months
Liverpool Business SchoolLiverpool Business SchoolMBA with Marketing Concentration
  • 18 Months
Bestseller
Golden Gate UniversityGolden Gate UniversityMBA with Marketing Concentration
  • 15 Months
Popular
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Corporate & Financial Law
  • 12 Months
Bestseller
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Intellectual Property & Technology Law
  • 12 Months
Jindal Global Law SchoolJindal Global Law SchoolLL.M. in Dispute Resolution
  • 12 Months
IIITBIIITBExecutive Program in Generative AI for Leaders
  • 4 Months
New
IIIT BangaloreIIIT BangaloreExecutive Post Graduate Programme in Machine Learning & AI
  • 13 Months
Bestseller
upGradupGradData Science Bootcamp with AI
  • 6 Months
New
upGradupGradAdvanced Certificate Program in GenerativeAI
  • 4 Months
New
KnowledgeHut upGradKnowledgeHut upGradSAFe® 6.0 Certified ScrumMaster (SSM) Training
  • Self-Paced
upGrad KnowledgeHutupGrad KnowledgeHutCertified ScrumMaster®(CSM) Training
  • 16 Hours
upGrad KnowledgeHutupGrad KnowledgeHutLeading SAFe® 6.0 Certification
  • 16 Hours
KnowledgeHut upGradKnowledgeHut upGradPMP® certification
  • Self-Paced
upGrad KnowledgeHutupGrad KnowledgeHutAWS Solutions Architect Certification
  • 32 Hours
upGrad KnowledgeHutupGrad KnowledgeHutAzure Administrator Certification (AZ-104)
  • 24 Hours
KnowledgeHut upGradKnowledgeHut upGradAWS Cloud Practioner Essentials Certification
  • 1 Week
KnowledgeHut upGradKnowledgeHut upGradAzure Data Engineering Training (DP-203)
  • 1 Week
MICAMICAAdvanced Certificate in Digital Marketing and Communication
  • 6 Months
Bestseller
MICAMICAAdvanced Certificate in Brand Communication Management
  • 5 Months
Popular
IIM KozhikodeIIM KozhikodeProfessional Certification in HR Management and Analytics
  • 6 Months
Bestseller
Duke CEDuke CEPost Graduate Certificate in Product Management
  • 4-8 Months
Bestseller
Loyola Institute of Business Administration (LIBA)Loyola Institute of Business Administration (LIBA)Executive PG Programme in Human Resource Management
  • 11 Months
Popular
Goa Institute of ManagementGoa Institute of ManagementExecutive PG Program in Healthcare Management
  • 11 Months
IMT GhaziabadIMT GhaziabadAdvanced General Management Program
  • 11 Months
Golden Gate UniversityGolden Gate UniversityProfessional Certificate in Global Business Management
  • 6-8 Months
upGradupGradContract Law Certificate Program
  • Self paced
New
IU, GermanyIU, GermanyMaster of Business Administration (90 ECTS)
  • 18 Months
Bestseller
IU, GermanyIU, GermanyMaster in International Management (120 ECTS)
  • 24 Months
Popular
IU, GermanyIU, GermanyB.Sc. Computer Science (180 ECTS)
  • 36 Months
Clark UniversityClark UniversityMaster of Business Administration
  • 23 Months
New
Golden Gate UniversityGolden Gate UniversityMaster of Business Administration
  • 20 Months
Clark University, USClark University, USMS in Project Management
  • 20 Months
New
Edgewood CollegeEdgewood CollegeMaster of Business Administration
  • 23 Months
The American Business SchoolThe American Business SchoolMBA with specialization
  • 23 Months
New
Aivancity ParisAivancity ParisMSc Artificial Intelligence Engineering
  • 24 Months
Aivancity ParisAivancity ParisMSc Data Engineering
  • 24 Months
The American Business SchoolThe American Business SchoolMBA with specialization
  • 23 Months
New
Aivancity ParisAivancity ParisMSc Artificial Intelligence Engineering
  • 24 Months
Aivancity ParisAivancity ParisMSc Data Engineering
  • 24 Months
upGradupGradData Science Bootcamp with AI
  • 6 Months
Popular
upGrad KnowledgeHutupGrad KnowledgeHutData Engineer Bootcamp
  • Self-Paced
upGradupGradFull Stack Software Development Bootcamp
  • 6 Months
Bestseller
upGradupGradUI/UX Bootcamp
  • 3 Months
upGradupGradCloud Computing Bootcamp
  • 7.5 Months
PwCupGrad CampusCertification Program in Financial Modelling & Analysis in association with PwC India
  • 5 Months
upGrad KnowledgeHutupGrad KnowledgeHutSAFe® 6.0 POPM Certification
  • 16 Hours
upGradupGradDigital Marketing Accelerator Program
  • 05 Months
upGradupGradAdvanced Certificate Program in GenerativeAI
  • 4 Months
New
upGradupGradData Science Bootcamp with AI
  • 6 Months
Popular
upGradupGradFull Stack Software Development Bootcamp
  • 6 Months
Bestseller
upGradupGradUI/UX Bootcamp
  • 3 Months
PwCupGrad CampusCertification Program in Financial Modelling & Analysis in association with PwC India
  • 4 Months
upGradupGradCertificate Course in Business Analytics & Consulting in association with PwC India
  • 06 Months
upGradupGradDigital Marketing Accelerator Program
  • 05 Months

What is Hashing in Data Structure? Explore Hashing Techniques, Benefits, Limitations, and More

By Rohan Vats

Updated on Feb 18, 2025 | 26 min read

Share:

Hashing in data structure is a technique that assigns each piece of data (often called a key) to a specific index in a hash table. This mapping happens through a function known as a hash function, which converts the key into a fixed-size code. Once the data lands at its correct spot, you can access it on average in constant time.

In simpler terms, hashing speeds up lookups. It makes life easier when you need quick results from large lists of items — such as user passwords, dictionary words, or catalog records. 

In this blog, you’ll see why hashing is considered a powerful method for fast data handling. You’ll learn how it works, which hashing techniques are common, how to build a custom hash table, and much more. Let’s get started!

What Is Hashing in Data Structure, and What are its Core Components?

Hashing in data structure assigns each data element, called a key, to a slot in a hash table through a function that converts the key into a numeric output. This output, or hash code, points to the exact place in the table for quick searches and updates. The goal is to keep operations (like inserting or finding items) at a constant average time, even when the dataset grows large. 

Each key gives the same code every time, which makes the system consistent. The technique is generally one-way, so you cannot work backwards from a hash code to recover the original key.

Let’s explore the components of hashing in detail now.

1. Key 

A key is the piece of data you want to store and retrieve. It can be a number, text, or any unique identifier. The system treats each key as the starting point for generating a hash code. 

When you use meaningful keys, locating or updating entries becomes much easier. A key must remain consistent across operations so that the same input always produces the same code.

2. Hash Function

The hash function converts each key into a numeric output called a hash code. A well-designed hash function distributes keys evenly across the hash table and reduces collisions. 

Simple versions might use modulo arithmetic (key % tableSize), while advanced ones use more complex math or encryption-based formulas. 

A good function should do the following things:

  • Run quickly for large volumes of keys.
  • Return distinct codes as often as possible to avoid collisions.
  • Produce consistent outputs so the same key always goes to the same code.

3. Hash Table

A hash table is an array or similar structure that reserves spots for each hash code. The table’s size depends on factors like expected data volume and desired efficiency. Each index in the table points to a location where data is kept. 

When a new key arrives, the hash function determines which slot the data should occupy. If a collision occurs, the system uses a separate strategy — like chaining or open addressing — to store items without discarding anything.

You can also check out upGrad’s free tutorial, Data Structures and Algorithms. Explore categories and algorithms. Master all the key operations in data handling and decipher DSA like never before.

Also Read: Hash Tables and Hash Maps in Python

How Does Hashing in Data Structure Work?

Each component in hashing works in unison: the key defines the data, the hash function determines where to place it, and the table holds it for instant lookups. This setup allows efficient insertion, retrieval, and deletion in large databases or any data-heavy scenario.

Here’s a step-by-step guide on how hashing works:

  • Receive the Key: You have a distinct identifier, such as a student ID, product code, or string, that needs storage.
  • Apply the Hash Function: The key is passed into the function, which calculates a number representing the key’s position in the table.
  • Obtain the Hash Code: The function outputs an integer (hash code) that stays the same each time you use the same key.
  • Place Data in the Hash Table: You use the hash code as an index. Data goes into the corresponding slot, ready for quick access.
  • Handle Collisions if Needed: When two different keys yield the same code, the system stores them without overwriting anything, often through methods like linking items in a list or probing the next free slot.

Let’s understand this better through two examples that illustrate how hashing works.

Example 1: Numeric Hashing

Suppose you have three numbers: 14, 19, and 23, along with a simple hash function h(key) = key mod   5. 

The table below shows how each key maps to a slot in a hash table of size 5 (indexes 0 through 4):

Key

H (key) =keymod 5

Resulting Index

14 14 % 5 = 4 4
19 19 % 5 = 4 4 (collision)
23 23 % 5 = 3 3

You place 14 at index 4, but 19 also maps to index 4, which causes a collision. The last key, 23, goes to index 3 with no issue. Collisions get handled through methods like chaining or open addressing, which you’ll learn more about later.

Example 2: String Hashing

When hashing a string like "CAT," you might add the ASCII values of each character (C=67, A=65, T=84) for a total of 216. If you apply 216 % 10, you get 6 as the final index. 

Below is a table for three strings:

String

Sum of ASCII Values

Computation

Final Index

CAT 67 + 65 + 84 = 216 216 % 10 = 6 6
DOG 68 + 79 + 71 = 218 218 % 10 = 8 8
KEY 75 + 69 + 89 = 233 233 % 10 = 3 3

Each string generates an index in the table, although collisions can still occur if different strings produce the same final value. The main advantage is that hashing can point you directly to each item's spot instead of forcing you to check every element in the table.

Take the leap into Data Science success – Explore upGrad's comprehensive data science courses and boost your career now!

Also Read: Sorting in Data Structure: Categories & Types [With Examples]

What Are Some Commonly-used Hashing Techniques in Data Structure?

Hash functions follow different approaches, or hashing techniques, to turn keys into compact codes. While all hashing techniques in a data structure aim to minimize collisions, they handle numeric or string inputs in unique ways.

  • Some focus on simple arithmetic
  • Others rely on more complex formulas

Below is a closer look at widely used hashing techniques, complete with examples to show how keys transform into hash codes.

1. Division Method

The division method is one of the simplest techniques. You take a key and divide it by the size of the hash table, then use the remainder as the index. This approach is quick because it involves a single division operation, but picking the right table size is important to reduce collisions.

How Does It Work?

  • Formula: h(key) = key mod   tableSize
  • Choice of tableSize: A prime number or a value unrelated to your data patterns helps spread out entries more evenly.

Let’s understand this through an example: Suppose you have keys [13, 27, 18, 42] and a hash table of size 10:

Key

Computation

Remainder

Index

13 13 ÷ 10 3 3
27 27 ÷ 10 7 7
18 18 ÷ 10 8 8
42 42 ÷ 10 2 2

You place each key at its remainder index. Even if different keys produce the same remainder, collisions can still happen, so you must have a plan to handle them (for example, chaining or open addressing).

2. Mid Square Method

The mid-square method squares the key and extracts the middle digits to form the index. This technique can work well if the squared values vary enough to reduce collisions.

How Does It Work?

  1. Square the key
  2. Extract a set number of middle digits from the result
  3. Convert those digits into a valid index (often through a final modulo operation)

Example: Use a table size of 100 for clarity. Let’s take the key ‘56’:

  • Square 56 to get 3136.
  • Choose the two middle digits, which are 13.
  • Store the data at index 13 in the table.

If the key is 123, then:

  1. 123 * 123 = 15129
  2. Middle digits could be 12 (from 15129).
  3. You store the data at index 12 in the table.

This approach distributes keys better than a basic division if the squared numbers vary enough. However, repeated patterns can still result in collisions.

This method taps into more of the key's digits by squaring it, often spreading values. However, if certain patterns repeat, collisions can still occur.

3. Folding Method

In folding, you slice the key into equal parts (except for the last piece if the key length isn't divisible evenly) and sum those parts to get the index. This method is useful when keys contain multiple segments, such as phone numbers or identification codes.

How Does It Work?

  1. Break the key into fixed-size chunks.
  2. Sum the numeric values of those chunks.
  3. If needed, take the sum modulo the table size to get the final index.

Example: Take a phone number-like key, such as 123456, and split it into two-digit segments: 12, 34, 56. 

  • Sum: 12+34+56=102
  • Index: If the table size is 10, then 102 mod  10 = 2

If the key had odd digits, the last chunk might have fewer digits. You still add it to the sum.

4. Multiplication Method

The multiplication method multiplies the key by a constant (often a fraction between 0 and 1), then extracts the fractional part and multiplies it by the table size. The goal is to spread out keys more uniformly than basic division.

How Does It Work?

  1. Pick a constant A between 0 and 1 (some approaches use the golden ratio fraction ~0.618).
  2. Multiply the key by A.
  3. Take the fractional part of the product.
  4. Multiply that fraction by the table size to find the index.

Example: Let’s use A = 0.7 and a table size of 10 for the key 45:

  1. 45×0.7=31.5
  2. Fractional part = 0.5
  3. Index = 0.5×10=5

The multiplication method can still collide but often handles data better than a plain division approach.

5. Universal Hashing

Universal hashing picks a random function from a set of possible options. This makes it harder for any adversary or pattern to consistently cause collisions. It can be more secure for certain applications, though it can add complexity.

How Does It Work?

  1. Maintain a family of functions (e.g., different multipliers or constants).
  2. Pick one at random when the system starts or on demand.
  3. Apply that chosen function to each key to compute the index.

Example: Say we keep a set of arithmetic functions like the ones listed below:

  • h1(key) = (a1 * key + b1) % tableSize
  • h2(key) = (a2 * key + b2) % tableSize

We pick one function at run time. Since it’s random, a worst-case scenario for one function may not apply to the next one.

6. Cryptographic Hashing

Though often used for security (like passwords or checksums), cryptographic hashing techniques can also serve in data structures if you need robust collision resistance. 

How Does It Work?

  • It’s a one-way operation (you can’t recover the original input from the hash).
  • Collision resistance is generally stronger, making it tough to find two keys with the same output.
  • Common examples include MD5, SHA-1, and SHA-256.

Example: When you apply SHA-256 to “apple,” you get a 64-character hexadecimal string. It’s more expensive computationally than simpler methods but offers greater security.

A data structure can be utilized in a variety of ways and can be studied through different means. For instance, if you would like to learn data analysis patterns, join upGrad’s free Analyzing Patterns in Data and Storytelling course right away! 

Also Read: What is MD5 Algorithm? How Does it Work?

How Is Hashing Implemented Across Different Programming Languages?

Many languages include built-in data structures that rely on hashing for quick operations. These structures typically accept a key, compute a hash code behind the scenes, and store the value in a table. 

Below are some short Python, Java, and C++ examples that illustrate how each handles key-value pairs.

Python Hash Method

Python’s built-in dictionary uses a hash function on each key. 

  • When you insert an item, Python decides the index based on that hash code.
  • Collisions are resolved automatically. 

Below is a short code snippet that stores and retrieves data.

def demo_python_dict():
    # Create a dictionary
    phonebook = {
        "Asha": 12345,
        "Rahul": 67890
    }
    
    # Insert a new key-value pair
    phonebook["Meera"] = 55555
    
    # Retrieve values
    print("Asha's number:", phonebook["Asha"])
    print("Rahul's number:", phonebook["Rahul"])
    print("Meera's number:", phonebook["Meera"])

if __name__ == "__main__":
    demo_python_dict()

Output:

Asha's number: 12345
Rahul's number: 67890
Meera's number: 55555

Python’s internal hash function generates an index for each name. When you call phonebook["Asha"], it uses that hash code to locate her number in constant average time.

Java Hash Method

Java provides the HashMap class, which maps keys to values using hashing. It automatically resizes the table when the load factor becomes high and handles collisions by linking entries in buckets.

import java.util.HashMap;

public class DemoHashMap {
    public static void main(String[] args) {
        HashMap<String, Integer> phonebook = new HashMap<>();
        
        // Insert entries
        phonebook.put("Asha", 12345);
        phonebook.put("Rahul", 67890);
        
        // Insert a new one
        phonebook.put("Meera", 55555);
        
        // Retrieve values
        System.out.println("Asha's number: " + phonebook.get("Asha"));
        System.out.println("Rahul's number: " + phonebook.get("Rahul"));
        System.out.println("Meera's number: " + phonebook.get("Meera"));
    }
}

Output:

Asha's number: 12345
Rahul's number: 67890
Meera's number: 55555

When you add a key, HashMap turns that key into a hash code and places the data in the corresponding bucket. If two keys end up in the same bucket, Java uses a linked list or tree to differentiate them internally.

Also Read: What is Hashtable in Java? Explained with Examples

C++ Hash Method

C++ provides an unordered_map for hashing. When you insert a key, it figures out the hash code and finds a spot for the value. Collisions are often handled by chaining.

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    std::unordered_map<std::string, int> phonebook;
    
    // Insert entries
    phonebook["Asha"] = 12345;
    phonebook["Rahul"] = 67890;
    
    // Insert a new one
    phonebook["Meera"] = 55555;
    
    // Retrieve values
    std::cout << "Asha's number: " << phonebook["Asha"] << std::endl;
    std::cout << "Rahul's number: " << phonebook["Rahul"] << std::endl;
    std::cout << "Meera's number: " << phonebook["Meera"] << std::endl;

    return 0;
}

Output:

Asha's number: 12345
Rahul's number: 67890
Meera's number: 55555

unordered_map calculates a hash code for each name. When you call phonebook["Asha"], it looks up the code and jumps to the correct spot. This method keeps data retrieval at a constant average time.

Also Read: Types of Data Structures in Python: List, Tuple, Sets & Dictionary

Why Is Hashing Important for Fast Data Retrieval?

Hashing makes large-scale lookups feel instant. You can store thousands or even millions of records and still reach the item you want in constant average time. Many data-driven systems rely on this technique for real-time performance, whether it’s checking passwords or searching entries in a phone directory. 

The idea is straightforward: convert each key into a numeric code, then use that code to jump directly to the data’s location. You skip repetitive scans through entire lists, which boosts speed dramatically. By cutting down the overhead of sequential searches, hashing keeps data operations efficient, even as your collection expands.

Below are some key reasons why hashing offers strong benefits for data retrieval:

  • Constant Average-Time Access: Most hash-based lookups complete in O(1) time, letting you insert, delete, or search data without scanning everything in memory.
  • Scalability With Large Datasets: You can scale up your storage while keeping the same time complexity, which allows systems to grow without a big performance hit.
  • Reduced Overhead: You avoid the heavy comparison loops that many other data structures need. This lightens the load on your processor and speeds up responses.
  • Flexibility for Different Types of DataNumeric inputs, strings, and other complex keys all map quickly to hash codes, which makes hashing suitable for a broad range of use cases.
  • Efficient Collision Handling: Hashing techniques include collision resolution methods like chaining or open addressing to store new entries without losing existing ones.

Why Do Collisions Occur and How to Handle Them?

Collisions happen when two or more distinct keys map to the same slot in a hash table. The table has a finite number of indices, but the possible set of keys might be far larger, so different inputs can generate identical hash codes. 

When this occurs, one slot tries to hold multiple items. 

Collisions are normal in hashing — no function guarantees absolute uniqueness in every case. However, you can reduce them by choosing a strong hash function and adjusting the table size. Even with those measures, you still need a proper plan to store colliding entries without losing data.

Here are some common reasons why collisions happen:

  • Limited Range of Hash Values: A table might hold fewer slots than your overall key space, forcing some keys to share the same index.
  • Poor Hash Function: If the function isn’t well-designed, it may map unrelated keys to similar values.
  • Unexpected Data Patterns: Certain inputs (e.g., all keys ending with the same number) can cause repeated results if the function doesn’t account for them.

Now, let’s explore the common collision-resolution methods you can use.

1. Separate Chaining (Open Hashing)

Separate chaining attaches items with the same hash index in a linked list. Each slot in the hash table can point to a separate chain of entries. This helps when collisions appear frequently because items keep extending the list instead of overwriting each other. 

Adding or searching for an element takes extra time if a chain becomes long, but it’s straightforward to implement and scales naturally.

How Does It Work?

  • Every index stores a pointer to a linked list.
  • When a key collides, the new item goes at the end (or the start) of that list.
  • Searching requires walking through the list until you find the matching key or confirm it’s not present.

2. Open Addressing (Closed Hashing)

Open addressing keeps all data within the main array. When a collision happens, the hash table probes other slots until it finds an empty space. The table never expands beyond its original array, which helps if you prefer a contiguous structure. However, overfilling can increase collision chains, so a balanced load factor is crucial.

Below are three probing techniques under open addressing:

  • Linear Probing: You check the next slot in a linear manner (index + 1, index + 2, and so on) until you find a free position or the item itself. This method is simple but can create “clusters” of occupied slots.
  • Quadratic Probing: The probing distance grows by a square value (1, 4, 9, 16...). This approach aims to reduce consecutive collisions by skipping slots more aggressively.
  • Double Hashing: A second hash function generates the step size for each subsequent probe. That second function makes it less likely that multiple keys will keep colliding at the same rate.

How Does it Work (Linear Probing Example)?

  • Let’s say key1 and key2 both land on index 3.
  • You store key1 at index 3.
  • When key2 arrives and finds index 3 occupied, it tries index 4, then index 0 if needed, until it finds an empty spot.
  • Lookups follow the same path: check index 3, then 4, etc., until the item is found or an empty slot appears.

How to Implement a Custom Hash Table?

A custom hash table gives you full control over how data gets stored and how collisions are resolved. You can pick a hash function, decide on the table size, and choose whether to use chaining or open addressing. 

Below is a step-by-step approach, followed by an example in pseudo-code.

Step 1. Choose a Hash Function

Select a function that spreads your keys. A simple choice is the division method:

index = key % tableSize

If you need a better distribution, you can choose more advanced approaches (mid-square, multiplication). Consider data patterns before deciding.

Step 2. Decide on Table Size

A table that is too small leads to excessive collisions.  Depending on the function, a prime number or power of two is common. Ensure you keep an eye on the load factor — if too many items fill up your table, your performance may decline.

Step 3. Select a Collision Resolution Method

  • Separate Chaining: Each slot holds a linked list for any keys that share the same index.
  • Open Addressing: You store each item directly in the array. If collisions occur, you probe for the next available slot. Methods include linear probing, quadratic probing, and double hashing.

Step 4. Implement Basic Operations

  1. Insert:
    • Compute the index using the hash function.
    • If the slot is empty, place the item there.
    • If it’s occupied, use your chosen collision strategy (e.g., add to the linked list for chaining, or probe until you find an open slot).
  2. Search:
    • Compute the index from the key.
    • Check that slot (or linked list/probed slots) until you find the key or confirm it doesn’t exist.
  3. Delete:
    • Compute the index.
    • Remove the key if present.
    • In open addressing, you might mark the slot as a “deleted” placeholder to keep the search chain intact.

Step 5. Manage the Load Factor and Rehashing

  • The load factor is (number of stored items) / (tableSize).
  • If it exceeds a certain threshold (commonly 0.7 or 0.75), consider doubling the table size and rehash all keys into the new slots. This prevents a buildup of collisions over time.

Example: Custom Hash Table (Using Chaining)

Below is a short pseudo-code showing how you might implement insertion and search using separate chaining.

Initialize array hashTable of size N
For i in [0..N-1]:
    hashTable[i] = EmptyLinkedList()

function hashFunction(key):
    return key % N

function insert(key, value):
    index = hashFunction(key)
    // Insert at the head or tail of the linked list
    hashTable[index].addNode(key, value)

function search(key):
    index = hashFunction(key)
    // Traverse linked list at hashTable[index]
    for node in hashTable[index]:
        if node.key == key:
            return node.value
    return null // Not found

How Does It Work?

  1. Initialization: You create an array of linked lists, each corresponding to a slot.
  2. hashFunction: A simple modulo operation.
  3. Insert: You compute the index, then add the new entry to the corresponding list. If more than one key maps to the same index, those keys line up in that list.
  4. Search: You compute the index, then walk through the list until you find the matching key or reach the end.

What About Collision Handling?
Here, collisions occur when more than one key lands on the same index. The linked list approach stores them all in that slot. This keeps the insert logic simple, although searches can take longer if the list grows too big. Balancing the table size and keeping an eye on the load factor help maintain a fast average case.

What Makes a Hash Function Good?

A solid hash function lies at the center of efficient data handling. It controls how keys turn into numeric codes, which affects how well a table avoids collisions and maintains quick lookups. 

If the function repeatedly sends different keys to the same index, you lose most of hashing’s benefits. By focusing on certain qualities — such as balanced distribution and low collision rates — you ensure that each stored item remains easy to access.

Below are the main features that define an effective hash function:

  • Uniform Distribution: The function should spread keys evenly across the table. When no single index collects too many items, lookups stay fast, and collisions remain rare.
  • Deterministic Outputs: Each time you apply the function to the same key, you must get the same code. Consistency keeps data retrieval accurate and prevents missed matches.
  • Low Collision Probability: Although collisions cannot be avoided entirely, a good hash function minimizes them, reducing overhead when applying collision-resolution methods.
  • Speed and Simplicity: The function needs to run quickly, especially in large-scale settings. Simple arithmetic operations (like modulo) and efficient algorithms (like multiplication-based hashing) often perform well.
  • Resistance to Patterns: Certain datasets have predictable patterns — like sequential IDs. A function that shifts or scrambles inputs in different ways helps dodge clusters of collisions.
  • Adaptability: In some cases, you can choose a function that suits your data. For instance, if you store strings of fixed length, you might pick a function that adds character codes and applies a final modulo. When the input type changes, you can adjust accordingly.

A carefully chosen hash function keeps operations near constant time. While no single formula works perfectly for every scenario, testing different approaches and measuring collision rates helps you pick the right one for your data.

When Not to Use Hashing?

Hashing techniques excel at direct lookups but do not handle every scenario equally well. If you rely on sequential order or frequently run range queries, you might find that a self-balancing tree or another data structure delivers smoother performance.  

Hash-based tables also lack a natural way to support prefix searches for text. In those cases, specialized structures like tries can handle data more effectively.

Below is a quick overview of situations where hashing may not be your best bet, along with alternative solutions:

Scenario

Why Hashing Falls Short?

Suggested Alternative

You need sorted data for easy traversal Hash tables store elements in arbitrary positions, losing any natural ordering. Self-balancing BST (e.g., AVL, Red-Black), or a B-tree
You run frequent range queries (like 5 < x < 20) Hash lookups require exact key matches; they are not built to handle range-based searches. BST structures, segment trees, or skip lists
You need prefix searches on strings Hashing locates exact matches by a key, but does not easily support partial matches. Trie (prefix tree)
You require floor and ceiling operations Hashing cannot quickly find the immediate smaller or bigger element if exact keys are missing. Balanced BST (e.g., AVL, Red-Black)
You must preserve insertion order A standard hash table may reorder items internally, losing the sequence in which they were added. Linked list, or specialized data structures that track order

What Are the Advantages of Hashing Data Structures?

Hashing techniques are known for delivering quick access to data, whether it’s a single lookup or a frequent insert. They let you manage large collections without scanning the entire dataset, making them a reliable choice for performance-focused applications. 

When your keys vary in structure or arrive in huge volumes, a well-chosen hash function and table size can keep operations from slowing down.

Below are some specific advantages you gain by using hashing:

  • Constant Average-Time Access: Standard operations (insertion, deletion, retrieval) complete in O(1) on average, which means your performance stays consistent even with many entries.
  • Efficient Memory Usage: You only need a fixed-size table (plus possible overhead for collision handling). This can be more space-friendly than keeping pointers everywhere, like in trees.
  • Simplicity of Implementation: Basic hashing methods — such as the division approach — are straightforward, allowing for quick prototypes and maintenance.
  • Flexibility in Key Types: You can hash integers, strings, or complex objects by providing a suitable function, which gives hashing broad applicability.
  • Stable Performance Under Growth: Rehashing and load-factor management keep lookups from degrading heavily. When your collection expands, you can resize the table and disperse items again.
  • Easy Integration: Many programming languages have built-in hash-based structures (like Python’s dictionaries or Java’s HashMap), so you can tap these benefits with minimal overhead.

Are There Any Disadvantages or Limitations of Hashing?

Hashing in data structure delivers swift lookups, but it isn’t a universal fix for data management. Certain operations — like finding the smallest element — may be slow if your structure relies solely on hash codes. 

Some functions can produce frequent collisions, reducing the benefits of constant-time access. Once you go beyond direct lookups, you start to see the trade-offs that come with a hash-based system.

Below are some key drawbacks to consider:

  • Collisions Remain Inevitable: Even with a well-designed function, different keys can map to the same index. Handling those collisions adds extra overhead.
  • No Natural Ordering: Hash tables don’t store items in a sorted manner. You'll struggle with typical hashing approaches if you rely on range queries or want the smallest/largest item quickly.
  • Rehashing Overhead: As your table fills up, you might need to increase its size and redistribute every entry. This process can spike CPU usage until it’s complete.
  • Quality of Hash Function: A poorly chosen algorithm might cluster many keys at the same index. That undercuts the speed benefits and can degrade performance to linear time in the worst case.
  • Fixed Table Size, Initially: You pick a size based on expected data volume, but real-world demands can exceed that. Dynamic resizing helps but introduces complexity.
  • Security Concerns: Some hashing methods are vulnerable if attackers can guess patterns. Cryptographic hash functions address this risk but add more processing costs.

What Are Real-World Applications of Hashing?

Hashing underpins everything from password validation in online services to database indexing at large scale. By converting keys into fixed-size outputs, hashing allows systems to locate, verify, or compare data in constant average time, even when volumes soar.

Below is a quick overview of how various industries and technologies rely on hashing:

Application

How Hashing Helps

Database Indexing Hash-based indices let you jump directly to rows without scanning entire tables.
Password Storage Storing hashed (and salted) passwords prevents anyone who sees the database from reading plain-text details.
Message Integrity Hash codes (like checksums) detect tampering or corruption, since any slight modification changes the result.
Caching Hashing speeds up lookups by computing a code for the requested item and returning its cached data if present.
Blockchain Each block references the previous block’s hash, so altering any block breaks the chain’s consistency.
Symbol Tables A hash table can store variable and function names, mapping them to memory addresses or attribute details.
File Integrity Hashes (like MD5, SHA-256) confirm that the file you receive matches the sender’s original version.
Load Balancing Consistent hashing assigns each incoming call to a server based on a hash value, which avoids excessive overlaps.

How Does Hashing Compare to Encryption?

Both hashing and encryption transform data, yet they serve distinct purposes. Hashing aims to produce a fixed code that helps confirm identity or integrity without revealing the original input. Encryption, on the other hand, encodes data in a way that can be reversed if the recipient knows the key. 

Below is a quick contrast of these two approaches:

Aspect

Hashing

Encryption

Reversibility Irreversible in practice; cannot retrieve original data. Reversible with the proper key
Primary Purpose Data integrity, authentication, or quick lookups. Protecting data confidentiality and privacy.
Output Length Typically a fixed-size string or numeric code. Often as large as or larger than the original data.
Keys Not required for basic hashing; salted hashes use extra data. Requires a key (public/private or shared) to lock and unlock data.
Collision Resistance Aims to minimize two distinct inputs producing the same output. Focuses on preventing unauthorized decryption rather than collisions.
Use Cases Password storage, checksums, verifying file integrity. Encrypting messages, securing files in transit, ensuring secrecy.

What Is the Role of the Load Factor and Rehashing?

A hash table works great at first, but it can slow down if too many items pile into the same space. The load factor tracks how many elements fill the table relative to its capacity. As that ratio grows, collisions become more likely, which makes it harder to keep lookups at constant time. 

To fix this problem, you can expand or rebuild the table — an action known as rehashing. 

Below is a closer look at how these two ideas keep hashing efficient.

Load Factor

The load factor is the fraction of occupied slots over the total table size. If you have 75 items in a table of 100 slots, the load factor is 0.75. A higher number means more collisions and longer searches because each index handles more entries.

  • Calculation: (Number of stored items) / (Table size)
  • Impact: Once it crosses a certain point (often 0.7 or 0.75), the chance of collisions soars.
  • Monitoring: By watching the load factor, you know when to resize or redistribute before performance takes a hit.

Rehashing

Rehashing involves creating a new table, often bigger than the old one, and computing fresh indices for every key. You repeat the hash function (or pick a new one) so items spread out again in a larger space.

  • When to Do It: Trigger rehashing if the load factor is too high. Some systems also do it when the table drops below a certain fill level.
  • Process: Allocate a new table of the chosen size. For each key in the old table, run the hash function and place it into the new table.
  • Result: Collisions drop, and lookups stay efficient. The main cost is that rehashing requires time to move everything over, so it’s done sparingly.

Load factor and rehashing go hand in hand. One checks the table’s capacity, and the other fixes collisions by spreading data across a roomier table. By managing both, you keep operations at constant average time, even as your dataset scales.

How Can upGrad Help You?

upGrad’s data science and machine learning courses equip you with the skills to master regression analysis through various courses. Here are some of the most popular courses with positive learner outcomes which you can try:

Ready to advance your career in data? Get personalized counseling from upGrad’s experts to help you choose the right program for your goals. You can also visit your nearest upGrad Career Center to kickstart your future!

Related Blogs You Might Like:

Boost your career with our popular Software Engineering courses, offering hands-on training and expert guidance to turn you into a skilled software developer.

Master in-demand Software Development skills like coding, system design, DevOps, and agile methodologies to excel in today’s competitive tech industry.

Stay informed with our widely-read Software Development articles, covering everything from coding techniques to the latest advancements in software engineering.

Frequently Asked Questions (FAQs)

1. Why is it called hashing?

2. What is hashing and hashmap?

3. Why is hashing used?

4. What is an example of hashing?

5. What is heap and hashing?

6. What is the principle of hashing?

7. What is the application of hashing?

8. What is the formula for a hash table?

9. What is the difference between hashing and hash tables?

10. Which hash has 32 characters?

11. What are the 3 main properties of hash function?

Rohan Vats

419 articles published

Get Free Consultation

+91

By submitting, I accept the T&C and
Privacy Policy

India’s #1 Tech University

Executive PG Certification in AI-Powered Full Stack Development

77%

seats filled

View Program

Top Resources

Recommended Programs

Suggested Blogs