K Means Clustering Matlab [With Source Code]

Updated on 23 September, 2022

K-means clustering is one of the most commonly used techniques among data professionals. Because the algorithm is simple and effective, it is used across many industries and applications.

A data scientist’s work involves clustering at many stages, and many large-scale projects rely on clustering algorithms.

This article covers one of those algorithms, K-means clustering, and its implementation in MATLAB with source code.

Before getting into the topic, let’s take a quick look at what clustering is, why it matters, and how it can be applied in real life. By the end of the post, you will see how useful this algorithm is for understanding large data sets.

What is Clustering?

Data is the most critical component of any application, and a cluster is simply a group of similar data points. As the name suggests, clustering is the process of dividing a large data set into subgroups, or clusters, based on patterns in the data.

In machine learning, clustering is applied when no predefined labels are available. The aim is to group data into classes with high intra-class similarity and low inter-class similarity.

Clustering is used to explore data. Real-life examples include market segmentation to find customers with similar behaviours, image segmentation and compression, and clustering documents by topic.

It can also serve as a preliminary step that identifies homogeneous groups before supervised models are built. K-means clustering is an unsupervised learning algorithm: it groups similar observations into distinct clusters without using labelled outcomes.

Let’s take a look at the K-means algorithm, which is one of the simplest and most widely applied clustering algorithms.

K-Means Clustering

K-means clustering is one of the most widely used unsupervised machine learning algorithms.

Unsupervised algorithms draw inferences from datasets using input vectors alone, without referring to labelled outcomes.

It is an iterative, centroid-based (distance-based) algorithm that partitions the dataset into K distinct, non-overlapping subgroups (clusters), where each data point belongs to exactly one cluster. Points within a cluster are made as similar as possible, while the clusters themselves are kept as far apart as possible.

In K-means, each cluster is associated with a centroid. The primary aim is to minimise the distance, typically the squared Euclidean distance, between each point and the centroid of its cluster.
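
As a minimal sketch of the distance idea, the snippet below uses only base MATLAB to compute the squared Euclidean distance from a few points to two centroids, assign each point to its nearest centroid, and add up the resulting distances. The points, the centroids, and the variable names are illustrative assumptions, not data used elsewhere in this article.

```matlab
% Four illustrative 2-D points (one per row) and two illustrative centroids
X = [1 1; 1.5 2; 8 8; 9 9.5];
C = [1 1.5; 8.5 9];

% Squared Euclidean distance from every point to every centroid:
% D(i,j) = ||X(i,:) - C(j,:)||^2
D = zeros(size(X,1), size(C,1));
for j = 1:size(C,1)
    diffs = X - C(j,:);            % implicit expansion (R2016b or later)
    D(:,j) = sum(diffs.^2, 2);
end

% Each point goes to its closest centroid
[minDist, idx] = min(D, [], 2);

% Quantity that K-means tries to make as small as possible
wcss = sum(minDist);

disp(idx');
fprintf('Within-cluster sum of squares: %.4f\n', wcss);
```

Here the first two points fall into cluster 1 and the last two into cluster 2, and the within-cluster sum of squares is 2.5.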

How Does K-Means Clustering Work?

K-means is an iterative algorithm: it repeats the same operations, assigning points and updating centroids, until the clusters stabilise. Here is a step-by-step explanation of the way it works:

Step 1: Define the number of clusters ‘K’. If there are 2 clusters, the value of ‘K’ will be 2.

Step 2: Initialise the centroids by picking K data points at random, one for each cluster.

Step 3: Calculate the squared distance between each data point and each centroid.

Step 4: Allocate each data point to the cluster with the closest centroid.

Step 5: Recompute each centroid as the average of the data points assigned to its cluster.

Step 6: Repeat Steps 3 to 5 until the assignment of data points to clusters no longer changes.

Steps 3 to 5 make up a single iteration: points are assigned to clusters based on their distance from the centroids, and the centroids are then recomputed. The process stops once the assignments, and therefore the centroids, stop changing.
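
The iteration described above can be sketched in base MATLAB, without the built-in kmeans function. This is a simplified illustration under stated assumptions: the data is synthetic (two well-separated blobs), the initial centroids are random data points, the cap of 100 iterations is an arbitrary safety limit, and all variable names are made up for the example.

```matlab
rng(1);                                   % fixed seed, for reproducibility
X = [randn(50,2)*0.5 + 2;                 % illustrative blob around (2, 2)
     randn(50,2)*0.5 - 2];                % illustrative blob around (-2, -2)
K = 2;                                    % number of clusters
n = size(X,1);

C = X(randperm(n, K), :);                 % K random data points as initial centroids
idx = zeros(n,1);                         % current cluster of each point

for iter = 1:100                          % safety cap on the number of iterations
    % Squared Euclidean distance from each point to each centroid
    D = zeros(n, K);
    for k = 1:K
        D(:,k) = sum((X - C(k,:)).^2, 2); % implicit expansion (R2016b or later)
    end

    % Assign each point to its nearest centroid
    [~, newIdx] = min(D, [], 2);

    % Stop once the assignments no longer change
    if isequal(newIdx, idx)
        break;
    end
    idx = newIdx;

    % Move each centroid to the mean of the points assigned to it
    for k = 1:K
        if any(idx == k)                  % an empty cluster keeps its old centroid
            C(k,:) = mean(X(idx == k, :), 1);
        end
    end
end

disp(C);
```

In practice the built-in kmeans function is preferable, since it adds better initialisation, several distance measures, and repeated runs.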

An Illustrative Example Depicting the Implementation of K-Means Clustering

Statement: McDonald’s, one of the best-known food chains, wants to open a chain of outlets across California and wants to find the locations that will bring in the most revenue.

What Does McDonald’s Already Have?

  • A strong e-commerce presence

  • Online customer data for analysing the locations from which orders are placed most frequently

Possible challenges they could face

  • Analysing the areas from which orders are placed frequently.
  • Working out how many outlets to open in each area.
  • Choosing outlet locations within each area so that the distance between the store and its delivery points is minimal.

All of these points require a good deal of analysis and mathematics.

How can the K-means Clustering Method be used here?

With a predefined value of K, the K-means algorithm can be applied in the following steps:

  • Partition the order locations into K non-empty subsets, one per planned outlet.
  • Determine the centroid of each subset; these are the candidate outlet locations.
  • Calculate the distance from each order location to every centroid.
  • Assign each order location to the cluster whose centroid (outlet) is closest.
  • After each iteration, recompute the centroid of every newly formed cluster, and repeat until the assignments stop changing.
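
A rough sketch of this scenario in MATLAB might look as follows. Everything in it is hypothetical: the order coordinates are synthetic points scattered around three made-up hotspots, K = 3 is an assumed number of outlets, and plain Euclidean distance on longitude and latitude is only an approximation of real travel distance. It relies on the kmeans function from the Statistics and Machine Learning Toolbox.

```matlab
rng default;                               % for reproducibility

% Hypothetical order locations as [longitude, latitude] pairs:
% 200 synthetic orders scattered around each of three made-up hotspots.
hotspots = [-122.4 37.8; -118.2 34.1; -121.5 38.6];
orders = repelem(hotspots, 200, 1) + 0.1*randn(600, 2);

K = 3;                                     % assumed number of outlets
[area, outlets] = kmeans(orders, K, 'Replicates', 5);

% Each row of outlets is a candidate outlet location (a cluster centroid);
% area(i) is the outlet that order i is assigned to.
ordersPerOutlet = accumarray(area, 1);

disp(outlets);
disp(ordersPerOutlet');
```

Each centroid sits near the middle of the orders it serves, which is what keeps the distance between a store and its delivery points small.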

Likewise, the K-means clustering algorithm can be applied to a wide variety of problems at different scales, in the hospitality industry, crime investigation, and image compression, to name a few.

The K-means algorithm can be implemented in many languages, such as R, Python, and MATLAB. In the next section, we will look at how K-means clustering is done in MATLAB.

K-Means Algorithm Using MATLAB

K-means is widely used by professionals working in data science, machine learning, artificial intelligence, cryptography, and cybersecurity.

The core objective of the algorithm is to find the centroid of each cluster in data that is typically heterogeneous. The following MATLAB code clusters a sample data set, plots each cluster together with its centroid, and prints the coordinates of each centroid:

Code:

rng default; % For reproducibility

% Sample data: two Gaussian blobs of 100 points each
X = [randn(100,2)*0.75+ones(100,2);
    randn(100,2)*0.5-ones(100,2)];

% Display the final result of each replicate
opts = statset('Display','final');

% Cluster into 4 groups using city-block distance; keep the best of 5 runs
[idx,C] = kmeans(X,4,'Distance','cityblock','Replicates',5,'Options',opts);

% Plot each cluster in its own colour
plot(X(idx==1,1),X(idx==1,2),'r.','MarkerSize',12);
hold on;
plot(X(idx==2,1),X(idx==2,2),'b.','MarkerSize',12);
plot(X(idx==3,1),X(idx==3,2),'g.','MarkerSize',12);
plot(X(idx==4,1),X(idx==4,2),'y.','MarkerSize',12);

% Mark the centroids with black crosses
plot(C(:,1),C(:,2),'kx','MarkerSize',15,'LineWidth',3);
legend('Cluster 1','Cluster 2','Cluster 3','Cluster 4','Centroids','Location','NW');
title('Cluster Assignments and centroids');
hold off;

% Print the coordinates of each centroid
for i = 1:size(C,1)
    disp(['Centroid ', num2str(i), ': X1 = ', num2str(C(i,1)), '; X2 = ', num2str(C(i,2))]);
end

 Output:

MATLAB Window Showing Four Clusters and Respective Centroids

Results:

The centroids obtained are as follows:

  1. The value of X1 & X2 for Centroid 1: 1.3661; 1.7232
  2. The value of X1 & X2 for Centroid 2: -1.015; -1.053
  3. The value of X1 & X2 for Centroid 3: 1.6565; 0.36376
  4. The value of X1 & X2 for Centroid 4: 0.35134; 0.85358
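
Beyond the centroid coordinates, kmeans can return a third output with the within-cluster sum of point-to-centroid distances for each cluster, which gives a rough sense of how tight each cluster is. The sketch below repeats the same data generation and clustering call and prints that value per cluster; the variable name sumd and the printed format are choices made for this example.

```matlab
rng default;  % For reproducibility
X = [randn(100,2)*0.75+ones(100,2);
     randn(100,2)*0.5-ones(100,2)];

% Third output: within-cluster sum of point-to-centroid distances (one per cluster)
[idx, C, sumd] = kmeans(X, 4, 'Distance', 'cityblock', 'Replicates', 5);

for i = 1:size(C,1)
    fprintf('Cluster %d: %d points, total distance %.3f\n', ...
        i, sum(idx == i), sumd(i));
end
fprintf('Total over all clusters: %.3f\n', sum(sumd));
```

A lower total indicates tighter clusters, and it is this total that the 'Replicates' option uses to pick the best of the repeated runs.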

Some business areas where K-Means clustering can be implemented

K-means clustering is a versatile algorithm and can be used for many business use cases for any type of grouping. Some examples are:

Behavioural segmentation:

  • Segment customers by purchase history
  • Segment by activity on an application, website, or platform
  • Define customer personas based on interests
  • Build profiles from monitored activity

Image processing:

  • Image compression

Sensor measurements:

  • Detect activity types in motion sensor data
  • Group images
  • Separate audio
  • Identify groups in health monitoring data

Bot and anomaly detection:

  • Separate valid activity groups from bots
  • Group valid activity to clean up outlier detection

Inventory categorisation:

  • Group inventory by sales activity
  • Group inventory by manufacturing metrics

Advantages of K-Means Clustering

There’s a reason why many professionals prefer the K-means clustering algorithm. Some of the benefits it offers:

  • It is fast and easy to understand and implement.
  • It is computationally efficient: the cost of each iteration grows linearly with the number of data points, clusters, and dimensions.
  • It gives its best results when the clusters are distinct and well separated from each other.
  • With a large number of variables, K-means is often faster than hierarchical clustering, provided K is small.
  • The clusters it produces are often tighter than those of hierarchical clustering, especially when the clusters are roughly spherical.

Conclusion

K-means clustering is a widely used approach for analysing clusters in data. Once you get the hang of it, it is easy to understand and apply, and it delivers results quickly.

We hope this article has introduced you to this analysis technique. For any queries regarding the K-means algorithm, feel free to comment below.

Frequently Asked Questions (FAQs)

1. What is K Means clustering in machine learning?

K-means is a popular clustering algorithm used in unsupervised machine learning. It starts by choosing K centroids at random. It then iterates, trying to minimise the overall within-cluster distance, which in turn keeps the clusters well separated. In each iteration, the algorithm assigns each observation to the closest of the current K centroids (the K means) and then recomputes each centroid. The distance that matters is the one between an observation and a centroid, not the distance between individual observations in different clusters. The centroid of a cluster is defined as the average of all the observations in the cluster.

2. What are the limitations of the K Means clustering algorithm?

There are some limitations of K-means that you will want to keep in mind when using it. K-means is not robust to outliers: because each centroid is a mean, data points that lie far from the rest pull the centroid towards them, which biases the assignment of other data points to clusters. The algorithm works best when clusters are roughly spherical and similar in size. K-means does not guarantee a unique solution: the number of clusters K is fixed in advance, but because the initial centroids are chosen at random, different runs can produce different cluster assignments and can settle in a local optimum rather than the best solution. Running the algorithm several times from different starting points, as the ‘Replicates’ option does in MATLAB, reduces this risk. Finally, K has to be chosen before the algorithm is run.
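
The sensitivity to outliers follows directly from the centroid being a mean. As a small illustration with made-up numbers, the base MATLAB snippet below shows how a single distant point drags the centroid of an otherwise tight group away from it.

```matlab
cluster = [1 1; 2 1; 1 2; 2 2];            % four tightly grouped points
centroidClean = mean(cluster, 1);          % centroid of the tight group

withOutlier = [cluster; 50 50];            % add one illustrative outlier
centroidShifted = mean(withOutlier, 1);    % centroid after adding the outlier

disp(centroidClean);
disp(centroidShifted);
```

The centroid moves from (1.5, 1.5) to (11.2, 11.2), far from every one of the original four points.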

3. What are the advantages of K Means clustering?

K-means is effective for both single and multiple dimensions; it is not limited to two or three. It also scales to data sets with many observations. Each cluster is represented by its centroid, which is the mean of the data points in the cluster. Because the algorithm is distance-based, variables measured on very different scales are usually standardised first: the mean is subtracted from each value and the result is divided by the standard deviation, so that no single variable dominates the distance.
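
Subtracting the mean and dividing by the standard deviation is commonly applied to the data before clustering, so that every variable contributes comparably to the distance. A minimal base MATLAB sketch, using made-up measurements in two very different units, could look like this:

```matlab
% Illustrative data: column 1 in metres, column 2 in millimetres
X = [1.2 1500; 0.8 900; 1.0 1200; 1.5 2100];

mu = mean(X, 1);             % per-column mean
sigma = std(X, 0, 1);        % per-column standard deviation
Z = (X - mu) ./ sigma;       % implicit expansion (R2016b or later)

disp(mean(Z, 1));            % approximately 0 for each column
disp(std(Z, 0, 1));          % 1 for each column
```

The standardised matrix Z, rather than X, would then be passed to the clustering step.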