Normalization in SQL: 1NF, 2NF, 3NF & BCNF

By Rohan Vats

Updated on Jul 03, 2023 | 10 min read | 7.8k views

Normalization is a systematic process for ensuring that a relational database model is efficient, suitable for general-purpose querying, and free of undesirable characteristics such as insertion, update, and deletion anomalies, which can compromise the integrity of the data. It also helps eliminate data redundancy and reduces the chance of inconsistency after insert, update, or delete operations.

Understanding Normalization in SQL 

Normalization is a process in database design that organizes data into logical and efficient structures. It ensures that data is stored in a way that reduces redundancy and minimizes update, insertion, and deletion anomalies. SQL (Structured Query Language) is a popular language for managing and manipulating relational databases. Normalization in SQL Server, as in any relational database, is a way of organizing the data stored in tables to improve the consistency of the data and the accuracy of queries.

Uses of Normalization in SQL 

Normalization involves breaking data into its smallest logical units and creating relationships between them. This reduces duplication and makes updates cheaper and safer, because each fact is stored in one place. It also helps protect the integrity of the database by ensuring that unrelated facts are not stored together in one table. For example, if a customer's address is repeated in every row that mentions that customer, a change of address requires every one of those rows to be correctly identified and updated. With normalization, the address is stored once in its own table and referenced from the other tables, making it easier to update and manage.

Normalization in SQL Server can also help reduce data storage costs, as redundant data is eliminated. Although a normalized design usually has more tables, each table is smaller and holds one kind of fact, so the database remains better organized. As an example, consider a table that stores customer information such as name, address, phone number and email address. By applying the principles of normalization, this table could be broken down into three separate tables linked by a customer ID (one for customers, one for addresses and one for contact details), so that a customer with several addresses or phone numbers does not force the name to be repeated. This reduces the risk of update errors caused by inconsistent copies of the same data, although some queries will need joins to bring the data back together. Understanding how normalization works in SQL is essential for creating databases that stay consistent and perform well when retrieving data.

For a better understanding, consider the following schema: Student (Name, Address, Subject, Grade)

There are a few problems or inefficiencies in this schema.

1) Redundancy: The student's Address is repeated for each Subject the student is registered for.

2) Update anomaly: We may update the Address in one tuple (row) while leaving it unchanged in the other rows. The student would then no longer have a single, consistent address.

3) Insertion anomaly: We cannot record a student's Address until the student registers for at least one Subject. Similarly, when a student enrols for a new Subject, a different Address may be inserted by mistake.

4) Deletion anomaly: If a student discontinues all enrolled subjects, the student's Address is lost when those rows are deleted.

Thus, it is important to represent the user data by relations that do not create anomalies following tuple add, delete, or update operations. This can only be achieved by a careful analysis of the integrity constraints, especially the database’s data dependencies.
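
The update and deletion anomalies described above can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `student` and the sample rows are hypothetical and serve only as an illustration.

```python
import sqlite3

# Hypothetical sample data for the Student (Name, Address, Subject, Grade) schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, address TEXT, subject TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("Asha", "12 Park Road", "Maths", "A"),
        ("Asha", "12 Park Road", "Physics", "B"),
    ],
)

# Update anomaly: the address is changed in only one of the two rows.
conn.execute(
    "UPDATE student SET address = '7 Lake View' "
    "WHERE name = 'Asha' AND subject = 'Maths'"
)
addresses = sorted(
    row[0]
    for row in conn.execute("SELECT DISTINCT address FROM student WHERE name = 'Asha'")
)
print(addresses)  # ['12 Park Road', '7 Lake View'] -> two conflicting addresses

# Deletion anomaly: removing every enrolment also removes the address.
conn.execute("DELETE FROM student WHERE name = 'Asha'")
remaining = conn.execute(
    "SELECT COUNT(*) FROM student WHERE name = 'Asha'"
).fetchone()[0]
print(remaining)  # 0 -> nothing is left that records where Asha lives
```

Because the address lives in the same rows as the enrolments, nothing in the schema stops the two copies from drifting apart or disappearing together.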

The relations should be designed so that only those attributes are grouped that exist naturally together. This can mostly be done by a basic understanding of the meaning of all data attributes. However, we still need some formal measure to ensure our design goal.

Normalization is that formal measure. It answers the question of why a particular grouping of attributes will be better than any other.

The following normal forms have been defined:

  • First Normal Form (1NF)
  • Second Normal Form (2NF)
  • Third Normal Form (3NF)
  • Boyce-Codd Normal Form (BCNF)
  • Fourth Normal Form (4NF)
  • Fifth Normal Form (5NF)
  • Sixth Normal Form (6NF)
  • Domain-Key Normal Form (DKNF), a separate form that is sometimes confused with 6NF

First Normal Form (1NF or Minimal Form)

A table is in 1NF if it satisfies the following conditions:

  • There is no top-to-bottom ordering to the rows and no left-to-right ordering to the columns.
  • There are no duplicate rows.
  • Every row-and-column intersection contains exactly one value from the applicable domain, or a null value. In other words, every column value must be atomic (scalar), holding only a single value. Repeating groups, such as the same kind of value spread over several columns, are not allowed.
  • All columns are regular (i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps).

Let’s take an example of a schema that is not normalized. Suppose a designer wishes to record the names and telephone numbers of customers. They define a customer table as shown:

Customer ID | First Name | Surname | Telephone Numbers
123         | Bimal      | Saha    | 555-861-2025
456         | Kapil      | Khanna  | 555-403-1659, 555-776-4100
789         | Kabita     | Roy     | 555-808-9633

This table is not in 1NF. The Telephone Numbers column is not atomic: for customer 456 it holds more than one value, which 1NF does not allow.

To Make It 1 NF

  • We’ll first break (decompose) our single table into two.
  • Each table should have information about only one entity.

Customer ID | First Name | Surname
123         | Bimal      | Saha
456         | Kapil      | Khanna
789         | Kabita     | Roy

Customer ID | Telephone Number
123         | 555-861-2025
456         | 555-403-1659
456         | 555-776-4100
789         | 555-808-9633

Repeating groups of telephone numbers do not occur in this design. Instead, each Customer-to-Telephone Number link appears on its own record.
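
The two-table design above could be expressed in SQL roughly as follows. This is a sketch run through Python's sqlite3 module; the table and column names (`customer`, `customer_phone`, `telephone`) are assumptions chosen for illustration, not names from the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL,
        surname     TEXT NOT NULL
    );
    -- One row per customer-to-telephone link, so every value is atomic.
    CREATE TABLE customer_phone (
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        telephone   TEXT NOT NULL,
        PRIMARY KEY (customer_id, telephone)
    );
    """
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(123, "Bimal", "Saha"), (456, "Kapil", "Khanna"), (789, "Kabita", "Roy")],
)
conn.executemany(
    "INSERT INTO customer_phone VALUES (?, ?)",
    [
        (123, "555-861-2025"),
        (456, "555-403-1659"),
        (456, "555-776-4100"),
        (789, "555-808-9633"),
    ],
)

# A join brings the name and all telephone numbers back together.
kapil_numbers = [
    row[0]
    for row in conn.execute(
        "SELECT p.telephone FROM customer AS c "
        "JOIN customer_phone AS p ON p.customer_id = c.customer_id "
        "WHERE c.first_name = 'Kapil' ORDER BY p.telephone"
    )
]
print(kapil_numbers)  # ['555-403-1659', '555-776-4100']
```

The composite primary key on `customer_phone` also rules out recording the same number twice for one customer.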

Second Normal Form

Each normal form has more constraining criteria than its predecessor. So any table that is in second normal form (2NF) or higher is, by definition, also in 1NF. On the other hand, a table that is in 1NF may or may not be in 2NF; if it is in 2NF, it may or may not be in 3NF, and so on.

A 1NF table is said to be in 2NF if and only if none of its nonprime attributes is functionally dependent on a part (proper subset) of a candidate key. (A nonprime attribute does not belong to any candidate key.)

Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF.

Benefits of Normalization in SQL Server

  • Reduces redundancy and data anomalies
  • Can improve the performance of inserts and updates by eliminating duplicate data (read queries may need extra joins)
  • Protects the integrity of the database by keeping unrelated facts in separate tables
  • Reduces storage costs, because duplicate data is eliminated
  • Makes updating easier, as a change needs to be made in only one place.

Overall, normalization in SQL Server is a vital part of creating an accurate and maintainable database. By understanding how normalization works and applying it correctly, developers can keep their databases consistent and well organized. Normalization reduces redundant storage and helps maintain the system's integrity, at the cost of additional joins in some queries. These are all important considerations for any business organization or application developer.

Check Whether a Relation R (A, B, C, D, E) with FD Set { BC → D, AC → BE, B → E } Is in 2NF

  • By applying the membership algorithm, the closure of AC is (AC)+ = {A, C, B, E, D}. No proper subset of AC determines all attributes of the relation, so AC is a candidate key. Moreover, neither A nor C appears on the right-hand side of any FD, so both must belong to every candidate key, and {A, C} is therefore the only candidate key.
  • Here A and C are the prime attributes, and B, D and E are the nonprime attributes.
  • We assume the relation R is already in first normal form, i.e. it has no multi-valued or composite attributes.

BC → D satisfies 2NF because BC is not a proper subset of the candidate key AC,

AC → BE satisfies 2NF because AC itself is the candidate key, and

B → E satisfies 2NF because B is not a proper subset of the candidate key AC.

Thus the given relation R is in second normal form.
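
The closure computation and the 2NF test used in this example can be checked mechanically. The following is a minimal Python sketch; the function names `closure`, `candidate_keys` and `is_2nf` are hypothetical helpers written for this illustration, and the key search is brute force, so it is only suitable for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for the minimal sets that determine every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            is_superkey = closure(candidate, fds) == attributes
            if is_superkey and not any(key < candidate for key in keys):
                keys.append(candidate)
    return keys


def is_2nf(attributes, fds):
    """True if no nonprime attribute depends on a proper subset of a candidate key."""
    keys = candidate_keys(attributes, fds)
    nonprime = attributes - set().union(*keys)
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if closure(set(part), fds) & nonprime:
                    return False
    return True


# The relation and FD set from the example: BC -> D, AC -> BE, B -> E.
R = set("ABCDE")
FDS = [(set("BC"), set("D")), (set("AC"), set("BE")), (set("B"), set("E"))]

print(sorted(closure(set("AC"), FDS)))                 # ['A', 'B', 'C', 'D', 'E']
print([sorted(key) for key in candidate_keys(R, FDS)])  # [['A', 'C']]
print(is_2nf(R, FDS))                                   # True
```

The check looks at the closures of A and of C alone, which contain no nonprime attribute, and so it agrees with the manual argument.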

Third Normal Form

A table is said to be in 3NF if and only if, for each of its functional dependencies X → A, at least one of the following conditions holds:

  • X contains A (that is, X → A is a trivial functional dependency), or
  • X is a super key, or
  • A is a prime attribute (i.e., A is present within a candidate key)

Another definition of 3NF states that every non-key attribute of R is non-transitively dependent (i.e. directly dependent) on every candidate key of R. This means that no nonprime attribute (one that is not part of any candidate key) is functionally dependent on another nonprime attribute. If there are two dependencies A → B and B → C, then from these FDs we may derive A → C. This dependency A → C is transitive.
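
The three conditions can be turned into a small check. The sketch below applies them to a relation R(A, B, C) with the two dependencies A → B and B → C mentioned above; it assumes A is the only candidate key, and it inspects only the listed FDs. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, fds, keys):
    """List the pairs (X, A) for which X -> A fails all three 3NF conditions."""
    prime = set().union(*keys)
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attributes:
            continue  # condition 2: X is a super key
        for attr in sorted(rhs - lhs):  # rhs - lhs skips the trivial part (condition 1)
            if attr not in prime:       # condition 3: A is a prime attribute
                violations.append(("".join(sorted(lhs)), attr))
    return violations


R = set("ABC")
FDS = [(set("A"), set("B")), (set("B"), set("C"))]
KEYS = [set("A")]  # assumption: A is the only candidate key of R

derived = "C" in closure(set("A"), FDS)
violations = third_nf_violations(R, FDS, KEYS)
print(derived)     # True: A -> C follows from A -> B and B -> C
print(violations)  # [('B', 'C')]: B is not a super key and C is not prime
```

The reported violation B → C is exactly the link that makes A → C transitive.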

Example of 3NF:

Consider the following relation Order (Order#, Part, Supplier, UnitPrice, QtyOrdered) with the given set of FDs:

Order# → {Part, Supplier, QtyOrdered} and {Supplier, Part} → UnitPrice

Here Order# is the key of the relation.

Using Armstrong's axioms, we get

Order# → Part, Order# → Supplier, and Order# → QtyOrdered.

Order# → {Part, Supplier} and {Supplier, Part} → UnitPrice together give Order# → UnitPrice.

Thus, all nonprime attributes depend on the key (Order#). However, the dependency of UnitPrice on Order# is transitive: it holds only through {Part, Supplier}. So this relation is not in 3NF. How do we bring it into 3NF?

One consequence of the current design is that we cannot store the UnitPrice of a Part supplied by a Supplier unless someone places an order for that Part. So we decompose the table as follows to make it follow 3NF.

Order (Order#, Part, Supplier, QtyOrdered) and Price Master (Part, Supplier, UnitPrice).

Now there are no transitive dependencies present, and both relations are in 3NF.
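
The decomposition can be sketched in SQL as follows, again run through Python's sqlite3 module. The table names `orders` and `price_master`, the column names and the sample rows are assumptions made for this illustration (`orders` is used because ORDER is a reserved word in SQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE price_master (
        part       TEXT NOT NULL,
        supplier   TEXT NOT NULL,
        unit_price REAL NOT NULL,
        PRIMARY KEY (part, supplier)
    );
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,
        part        TEXT NOT NULL,
        supplier    TEXT NOT NULL,
        qty_ordered INTEGER NOT NULL,
        FOREIGN KEY (part, supplier) REFERENCES price_master (part, supplier)
    );
    """
)

# A price can now be recorded before any order exists for that part.
conn.execute("INSERT INTO price_master VALUES ('Bolt', 'Acme', 2.5)")
conn.execute("INSERT INTO price_master VALUES ('Nut', 'Acme', 1.0)")
conn.execute("INSERT INTO orders VALUES (1, 'Bolt', 'Acme', 10)")

# Joining the two tables reconstructs the columns of the original relation.
rows = conn.execute(
    "SELECT o.order_no, o.part, o.supplier, p.unit_price, o.qty_ordered "
    "FROM orders AS o JOIN price_master AS p "
    "ON p.part = o.part AND p.supplier = o.supplier"
).fetchall()
print(rows)  # [(1, 'Bolt', 'Acme', 2.5, 10)]

priced_parts = conn.execute("SELECT COUNT(*) FROM price_master").fetchone()[0]
print(priced_parts)  # 2, although only one part has been ordered
```

Each unit price is stored once per (Part, Supplier) pair, so changing a price no longer means touching every order row.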

Conclusion

There's more to normalization, such as BCNF, 4NF, 5NF and 6NF. In short, BCNF is a stricter version of 3NF in which the last rule of 3NF (A is a prime attribute) no longer applies: for every nontrivial functional dependency X → A, the left-hand side X must be a super key. (BCNF is also called 3.5NF.) Normal forms from 4NF onwards, however, are rarely applied in regular practice.
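
To illustrate how BCNF differs from 3NF, here is a small Python sketch on a hypothetical relation R(A, B, C) with the FDs AB → C and C → B. This relation is not taken from the article. Its candidate keys are AB and AC, so every attribute is prime and the relation is in 3NF, yet C → B has a left-hand side that is not a super key. The function name `bcnf_violations` is an assumption, and only the listed FDs are inspected.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the listed nontrivial FDs whose left-hand side is not a super key."""
    return [
        ("".join(sorted(lhs)), "".join(sorted(rhs)))
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]


# Hypothetical relation, used only for illustration: AB -> C and C -> B.
R = set("ABC")
FDS = [(set("AB"), set("C")), (set("C"), set("B"))]

violations = bcnf_violations(R, FDS)
print(violations)  # [('C', 'B')]
```

Under 3NF the dependency C → B is acceptable because B is a prime attribute; BCNF drops that exception, so the same dependency is reported as a violation.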
