
SQL Tutorial: Learn Structured Query Language Basics

SQL DISTINCT: A Comprehensive Guide

Updated on 23/06/2024

Introduction

In the world of databases, the DISTINCT clause is a tool for filtering duplicates out of your query results. It ensures that each unique value, or combination of values, appears only once when you run a query. This can be very helpful when you are trying to make sense of the information in a database.

Let’s take a closer look and understand the DISTINCT clause and its use. 

What is a Distinct Clause?

What does the Distinct Clause mean? Imagine you have a long list of things in a database, like a list of students' names. The same name might appear more than once: in a class, there can be two people named John.

Using the Distinct Clause is like telling the database, “Hey, show me each name only once, even if it appears multiple times in the list.”

Syntax

Using the Distinct Clause in SQL is quite simple. When writing a query to fetch data from the database, place the keyword DISTINCT right after SELECT. Then list the column or columns whose values you want to be unique.

For example,

SELECT DISTINCT column1, column2, …
FROM table_name;

In the syntax above:

  • ‘column1’, ‘column2’, … represent the columns from which you wish to collect the values.
  • ‘table_name’ is the name of the table that contains the data.

Now, let’s look at a few examples that show the practical usage of the Distinct Clause:

Basic Usage

The first example shows the most common use of the Distinct Clause: fetching the names of unique departments from the ‘employees’ table. Suppose you manage a database for the HR department of an organization and need to generate a report that shows the different departments in the organization.

The ‘employees’ table contains multiple entries for each department, since several employees belong to each one. Use this SQL query to get a clean list of distinct departments without duplicates:

SELECT DISTINCT department
FROM employees;

This query tells the database to look at the ‘employees’ table and fetch the unique values from the ‘department’ column. Duplicate entries are filtered out, leaving you with a list of distinct department names.

This is useful for organizing and categorizing data, as it ensures that each department is represented only once, regardless of the number of employees associated with it.
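To try the query end to end, here is a minimal sketch that runs it against an in-memory SQLite database through Python's built-in sqlite3 module. The ‘employees’ table layout and the sample rows are assumptions made up for illustration; only the SELECT DISTINCT statement comes from the example above.

```python
import sqlite3

# Hypothetical sample data: five employees spread over three departments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [
        ("John", "Sales"),
        ("Maria", "Sales"),
        ("Aisha", "Engineering"),
        ("Wei", "Engineering"),
        ("Tom", "HR"),
    ],
)

# A plain SELECT returns one row per employee, so departments repeat.
all_rows = conn.execute("SELECT department FROM employees").fetchall()

# SELECT DISTINCT returns each department only once.
departments = [
    row[0] for row in conn.execute("SELECT DISTINCT department FROM employees")
]

print(f"{len(all_rows)} rows, {len(departments)} distinct departments")
conn.close()
```

Without an ORDER BY, the order in which the distinct departments come back is not guaranteed, so the sketch only compares counts.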

Let’s look at a second practical use of the Distinct Clause:

Combining Columns

In the second example, we go a step further and retrieve unique combinations of employee names and departments from the ‘employees’ table.

Suppose you are asked to create an employee directory organized by department. Instead of just listing the names of the departments, you want to include the employees’ names under each department. To get this result, apply the Distinct Clause to several columns:

SELECT DISTINCT name, department
FROM employees;

The above query returns every unique combination of employee name and department from the ‘employees’ table. DISTINCT applies to the whole row of selected columns, not to each column separately, so each pair of employee name and department appears only once in the result set. Two employees who share a name but work in different departments still appear as two rows.

This will help create organized and comprehensive lists, such as employee directories or team rosters.
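The sketch below illustrates how DISTINCT behaves across two columns, again using an in-memory SQLite database via Python's sqlite3 module. The sample rows are hypothetical: one row is an exact duplicate, and one employee name appears in two different departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [
        ("John", "Sales"),
        ("John", "Sales"),   # exact duplicate of the row above
        ("John", "HR"),      # same name, different department
        ("Maria", "Sales"),
    ],
)

# DISTINCT compares the (name, department) pair as a whole.
pairs = conn.execute("SELECT DISTINCT name, department FROM employees").fetchall()

for name, department in sorted(pairs):
    print(name, department)
conn.close()
```

Only the exact duplicate is removed. ‘John’ still appears twice because the two remaining rows differ in the department column.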

Ordering Results

In the third scenario, you need a unique list of departments from the ‘employees’ table, sorted alphabetically by department name.

To get this, you have to use the Distinct Clause in conjunction with the ORDER BY clause:

SELECT DISTINCT department
FROM employees
ORDER BY department ASC;

In a query like this, the Distinct Clause ensures that each department name is returned only once, while the ‘ORDER BY’ clause specifies that the results should be sorted in ascending order based on the ‘department’ column.

Additionally, doing this will not only allow you to eliminate duplicates but also present the unique department names in a well-structured and organized manner, thus making the information easier to read and interpret.
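As a quick check of the combined query, this sketch runs it on a small in-memory SQLite table through Python's sqlite3 module. The rows are hypothetical and deliberately inserted out of alphabetical order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [
        ("John", "Sales"),
        ("Aisha", "Engineering"),
        ("Tom", "HR"),
        ("Maria", "Sales"),
    ],
)

# Duplicates are removed and the remaining values are sorted A to Z.
departments = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT department FROM employees ORDER BY department ASC"
    )
]

print(departments)  # ['Engineering', 'HR', 'Sales']
conn.close()
```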

Joining Tables

The last example shows that the Distinct Clause can also be used when joining multiple tables. Take a scenario where you want to fetch unique combinations of employee names and their corresponding department names from the ‘employees’ and ‘departments’ tables. To do this, you can use a SQL join operation along with the Distinct Clause.

SELECT DISTINCT e.name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

The above example shows how flexible and useful the Distinct Clause is in different situations.
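Here is a small runnable sketch of the join scenario, using an in-memory SQLite database via Python's sqlite3 module. The column layout of both tables and all sample rows are assumptions chosen to match the column names used in the join query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (department_id INTEGER, department_name TEXT)")
conn.execute("CREATE TABLE employees (name TEXT, department_id INTEGER)")
conn.executemany(
    "INSERT INTO departments (department_id, department_name) VALUES (?, ?)",
    [(1, "Sales"), (2, "HR")],
)
conn.executemany(
    "INSERT INTO employees (name, department_id) VALUES (?, ?)",
    [("John", 1), ("John", 1), ("Maria", 2)],  # John is recorded twice
)

# The join produces three rows; DISTINCT removes the repeated John/Sales pair.
pairs = conn.execute(
    """
    SELECT DISTINCT e.name, d.department_name
    FROM employees e
    JOIN departments d ON e.department_id = d.department_id
    """
).fetchall()

print(sorted(pairs))  # [('John', 'Sales'), ('Maria', 'HR')]
conn.close()
```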

Now, let’s understand the Benefits of the Distinct Clause

What are the benefits of the Distinct Clause?

Data Accuracy

When dealing with databases, accuracy is important. The Distinct Clause helps ensure that the data presented in the query results is not inflated by repeated rows. Removing duplicate rows prevents the misleading interpretations that can arise from redundant information.

For example, while analyzing sales data, the Distinct Clause ensures that each unique product or customer is represented exactly once, thus reflecting a true state of affairs without distortion.

Clearer Results

Have you ever looked at a long list of data and struggled to make sense of it? That’s where the Distinct Clause comes in handy. By filtering out duplicate entries, the Distinct Clause makes the query results shorter and easier to read, for example for a marketing team examining customer demographics and campaign performance. Having concise data is essential for making an informed decision.

Efficiency

In addition to improving data accuracy and clarity, the Distinct Clause can improve the efficiency of the steps that follow it. Consider a scenario where you are working with a large dataset containing thousands or even millions of records.

By using the Distinct Clause to eliminate duplicate rows early, you reduce the amount of data that later steps have to process. This can lead to a performance gain in resource-intensive operations like data aggregation, although removing the duplicates has a cost of its own, as the Performance consideration below explains.

Simplified Reporting

Reporting is a crucial aspect of data analysis and decision-making in organizations. Whether you are preparing financial statements, operational reports, or performance metrics, your data needs to be presented accurately.

The SQL Distinct keyword helps you simplify a report by ensuring that each unique entity is represented only once in the output. This not only streamlines the reporting process but also lessens the risk of confusion.

Now, let’s understand the considerations of the SQL Distinct keyword.

What are the Considerations of the SQL Distinct Keyword?

Performance

Using the DISTINCT keyword can make a query slower, especially with large data, because the database has to do extra work, typically sorting or hashing the rows, to find and remove all duplicate entries.

Data Types

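DISTINCT compares values according to their data types. If the data in a column is stored inconsistently, you might get unexpected results or errors. For instance, when numbers and text are mixed, the number 1 and the text ‘1’ may be treated as two different values, depending on the database.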
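The sketch below shows one way mixed numbers and text can surprise you. It relies on SQLite-specific behavior (a column declared without a type keeps each value's original storage class), so other databases may behave differently or reject the data outright. The ‘readings’ table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No declared column type: SQLite stores each value as given (integer or text).
conn.execute("CREATE TABLE readings (value)")
conn.executemany(
    "INSERT INTO readings (value) VALUES (?)",
    [(1,), ("1",), (1,), ("1",)],
)

# The integer 1 and the text '1' look alike but count as different values.
mixed = conn.execute("SELECT DISTINCT value, typeof(value) FROM readings").fetchall()

# Converting to a single type before DISTINCT collapses them into one value.
normalized = conn.execute(
    "SELECT DISTINCT CAST(value AS INTEGER) FROM readings"
).fetchall()

print(len(mixed), len(normalized))  # 2 1
conn.close()
```

Cleaning the data so that a column holds one type, or casting inside the query as shown, avoids this kind of hidden duplicate.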

Memory Usage

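DISTINCT needs extra memory to keep track of the values it has already seen, especially with a lot of data. If memory runs short, the database may have to fall back on temporary disk storage, which slows the query down. Monitor resource usage when running DISTINCT on large tables.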

Indexing

Having the right indexes in place can speed up DISTINCT queries, because the database may be able to read the unique values directly from the index. Missing or poorly chosen indexes can slow things down, so ensure a proper indexing strategy.
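One way to see whether an index is being used is to ask the database for its query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module; other databases have their own equivalents with different syntax and output. The table, rows, and the index name idx_employees_department are hypothetical, and the exact plan text varies between SQLite versions, so the sketch prints the plans rather than assuming their wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("John", "Sales"), ("Maria", "Sales"), ("Aisha", "Engineering"), ("Tom", "HR")],
)

query = "SELECT DISTINCT department FROM employees"


def plan_details(connection, sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return [row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]


plan_without_index = plan_details(conn, query)
result_without_index = sorted(row[0] for row in conn.execute(query))

# Add an index on the column that DISTINCT operates on.
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")

plan_with_index = plan_details(conn, query)
result_with_index = sorted(row[0] for row in conn.execute(query))

print("Without index:", plan_without_index)
print("With index:   ", plan_with_index)
conn.close()
```

The query returns the same departments either way. The index changes only how the database finds them, which is what the two printed plans let you compare.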

Handling NULL

DISTINCT treats all NULL values as duplicates of one another, so they collapse into a single NULL in the results. If you want to exclude NULL values or handle them differently, do so explicitly in the query, for example with a WHERE ... IS NOT NULL condition.
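A quick way to check how NULL values behave is to run DISTINCT on a column that contains several of them. In this sketch (in-memory SQLite via Python's sqlite3 module, with hypothetical rows), two employees have no department, which Python represents as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("John", "Sales"), ("Maria", None), ("Wei", None), ("Tom", "Sales")],
)

# Both NULL departments come back as a single NULL row.
with_nulls = conn.execute("SELECT DISTINCT department FROM employees").fetchall()

# Filtering with IS NOT NULL removes the NULL row from the result.
without_nulls = conn.execute(
    "SELECT DISTINCT department FROM employees WHERE department IS NOT NULL"
).fetchall()

print(len(with_nulls), len(without_nulls))  # 2 1
conn.close()
```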

Wrapping Up!

By understanding these factors and the nuances of DISTINCT usage, SQL developers and analysts can use this feature effectively while maintaining good performance and data accuracy in their SQL queries.

To sum it up, the SQL Distinct keyword is a handy tool for filtering duplicate data out of your query results. It is easy to use and helps provide accurate, clear results.

But remember to keep an eye on performance by considering the data types and indexes that you are working with.

Last but not least, a little understanding and practice can make the DISTINCT keyword your go-to solution for getting unique insights from your database.

FAQs

  1. What is Distinct in SQL?

    Distinct is a keyword in SQL that is used to return unique values from a column or a set of columns in a query result.
  2. Can I do SELECT Distinct in SQL?

    Yes. SELECT DISTINCT returns the unique rows from a table in SQL.
  3. How can you SELECT unique values in SQL?

    You can use SELECT DISTINCT to return only the unique values from a column or a combination of columns in SQL.
  4. What is the difference between SELECT and SELECT DISTINCT?

    SELECT returns all matching rows, including duplicates, while SELECT DISTINCT returns only the unique rows.
  5. What is an example of a Distinct Clause?

    One example of a Distinct Clause would be fetching unique product categories from a table that contains product information.
  6. Can we apply Distinct on multiple columns?

    Yes, we can use Distinct on multiple columns to fetch unique combinations of values from those columns.
  7. How to Select Distinct Count?

    Place DISTINCT inside the COUNT function, as in COUNT(DISTINCT column_name), to count the number of unique values in a column.
  8. What is the difference between Unique and Distinct?

    UNIQUE is a constraint that prevents duplicate values from being stored in a column, while DISTINCT removes duplicates from the results of a query.
  9. Can we use Distinct two times?

    DISTINCT can appear only once directly after SELECT, where it applies to all the selected columns. It can appear again elsewhere in the same query, for example inside an aggregate function such as COUNT(DISTINCT column_name) or in a subquery.
  10. Is Distinct costly in SQL?

    Distinct can introduce a performance overhead, especially when you are working with large datasets or complex queries, as it requires additional computation to identify and remove duplicates.
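
For the question about counting unique values, here is a minimal sketch that compares COUNT(column) with COUNT(DISTINCT column) on an in-memory SQLite table via Python's sqlite3 module. The table and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [
        ("John", "Sales"),
        ("Maria", "Sales"),
        ("Aisha", "Engineering"),
        ("Wei", "Engineering"),
        ("Tom", "HR"),
    ],
)

# COUNT(department) counts every non-NULL value;
# COUNT(DISTINCT department) counts each unique value once.
total, unique = conn.execute(
    "SELECT COUNT(department), COUNT(DISTINCT department) FROM employees"
).fetchone()

print(total, unique)  # 5 3
conn.close()
```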
Abhimita Debnath

Abhimita Debnath is a student in the upGrad Big Data Engineering program with BITS Pilani. She is a Senior Software Engineer at Infosys.
