What Are Self Organizing Maps: Beginner’s Guide
Updated on Feb 12, 2025 | 11 min read | 6.7k views
Do you ever feel like you’re swimming in data but don’t know how to make sense of it? Rapid advances in data science, machine learning, and artificial intelligence are making data more readable and less complex.
Self organizing maps are one such advancement: they reduce the dimensionality of data to reveal correlations that would otherwise be difficult to decipher. Self organizing maps (SOMs) use an unsupervised learning approach to cluster and map data, helping to unravel complex problems. With machine learning expected to reach a two trillion dollar valuation by 2030, this is the right time to upskill and learn about SOM in machine learning.
If you want a head start in understanding self organizing maps, you are in the right place. Read on to know more!
Self organizing maps (SOMs) were introduced in the 1980s by the Finnish computer scientist Teuvo Kalevi Kohonen and are also known as Kohonen maps after him. They are a type of artificial neural network that reduces data dimensionality through self-organisation and supports knowledge-based processing. Neural networks draw inspiration from the structure and functioning of the human nervous system and learn to untangle complex patterns, correlations, and problems.
Self organizing maps are neural networks trained through unsupervised, competitive learning. The networks develop their classifications without any external or specified target output. Hence, they are ‘self-organizing.’
The maps consist of two layers: the input layer and the output layer. By clustering and mapping, they take higher dimensional data sets and reduce them to a lower dimensional, discretised representation, usually two-dimensional, called a map. This simplifies multidimensional, complex data while preserving the topological properties of the input space.
Self organizing maps are useful because the discretised representation of multidimensional training data simplifies complex issues. Transforming a higher dimensional dataset into a lower dimensional representation is the key to uncomplicating training data, and it makes data visualization easier for the human eye.
Any dimensionality reduction discards some detail, but self organizing maps limit the loss of structure: inputs that are close together in the original space are mapped to the same or neighbouring nodes. Unlike Principal Component Analysis (PCA), which is a linear projection, self organizing maps can retain topological or structural information in the training data that PCA may lose. Therefore, in cases where all dimensions are essential, they can still be represented in the Kohonen map even though the data is reduced to a two-dimensional output space.
Further, self organizing maps are used in seismic facies analysis, which recognises and develops organised relational clusters or groups by identifying different individual features. Self organizing maps act as a calibration method that relates these clusters to physical reality in the absence of physical analogs.
Additionally, self organizing maps aid in text clustering. Text clustering is an important preprocessing step for checking how text can be converted into a mathematical expression, which can then be analysed and processed further through SOM. Moreover, SOM helps in exploratory data analysis by revealing underlying and hidden patterns, relationships, and groups within training data through clustering and visualisation.
Consequently, SOM in machine learning and artificial intelligence has many applications across fields, including pattern recognition, medical applications, telecommunications, robotics, product management, and data mining and processing.
The architecture of self organizing maps is essential in understanding what they do and how they do it. SOMs consist of two layers of nodes: the input layer and the output layer (also called the Kohonen layer or the SOM layer). The two layers are directly connected. The input layer consists of source nodes that express features, attributes, or variables. They are represented as m-dimensional input vectors, x = (x₁, x₂…xₘ). The output layer, or the Kohonen layer, has nodes arranged in a topological architecture, which is usually two-dimensional with a grid organisation consisting of rows and columns.
Each node has a specific location in the grid and its own weight vector, w = (w₁, w₂…wₘ), with the same dimension as the input vectors. The number of nodes sets the maximum number of clusters that can be formed from the input data. The adjacency of nodes depicts similarity between clusters; the map encodes which clusters are neighbours rather than precise distances between them. The map can take different shapes, typically forming rectangular or hexagonal grids. Each topological structure has specific properties, and the hexagonal grid is often preferred.
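To make the two-layer structure concrete, here is a minimal NumPy sketch of the data structures described above. The sizes (3 input features, a 4×5 rectangular grid) and the variable names are illustrative assumptions, not values from this article.

```python
import numpy as np

rng = np.random.default_rng(0)

m = 3                # number of input features (dimension of x); assumed value
rows, cols = 4, 5    # size of the Kohonen layer; assumed value

# Input layer: one m-dimensional input vector x = (x1, ..., xm)
x = rng.random(m)

# Output layer: one weight vector per node, each with the same dimension m
weights = rng.random((rows, cols, m))

# Grid position (row, col) of every node in the output layer
positions = np.stack(
    np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
)

print(weights.shape)     # (4, 5, 3)
print(positions[2, 3])   # [2 3]
print(rows * cols)       # 20 nodes, so at most 20 clusters
```

Storing the weights as a (rows, cols, m) array keeps each node's grid location and its weight vector under the same index, which is convenient when computing neighbourhoods on the grid.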
Now that we know the basic architecture of self organizing maps, let’s understand how they function. We will break down the functionality of SOMs into the following steps:
a. Initialisation: The first step in the mapping process is the initialisation of the weight vectors. Random values are selected for the initial weight vectors (wₒ).
b. Sampling: A sample of the input training vector (x) is chosen randomly from the input space.
c. Competition: Nodes compete to be activated. The similarity between the input vector and each node’s weight vector is computed using a distance measure, and the node whose weight vector is closest to the input vector wins. The most commonly used distance measure [d₀ (t)] is the Euclidean distance. The winning node is called the Best Matching Unit (BMU).
d. Cooperation: The topological neighbourhood radius [nr(t)] of the BMU [c(t)] is identified. This determines which nodes around the BMU will be updated along with it.
e. Adaptation: The weight vectors of the BMU and of the nodes that fall within its neighbourhood in the output space are updated using the weight update equation. This helps nodes in the output space closely resemble and represent the features of the input space. Two parameters are essential: learning rate [α(t)] and neighbourhood size.
f. Iteration: The process from step b onwards is repeated for N iterations until the feature map stops changing and takes on an identifiable shape.
Source: Example of how Self Organizing Maps work
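The competition, cooperation, and adaptation steps above can be written compactly as equations. The block below is a sketch of one common formulation rather than the only possible one: it assumes a Gaussian neighbourhood function, and the symbols r_j (grid position of node j), h_{j,c}(t) (neighbourhood strength), and σ(t) (standing in for the neighbourhood radius nr(t)) are introduced here for illustration.

```latex
% Competition: the BMU c(t) is the node whose weight vector is closest to x(t)
c(t) = \arg\min_{j} \lVert x(t) - w_j(t) \rVert

% Cooperation: Gaussian neighbourhood around the BMU, measured on the grid
h_{j,c}(t) = \exp\!\left( -\frac{\lVert r_j - r_{c(t)} \rVert^{2}}{2\,\sigma(t)^{2}} \right)

% Adaptation: move each weight vector towards the input
w_j(t+1) = w_j(t) + \alpha(t)\, h_{j,c}(t)\, \bigl( x(t) - w_j(t) \bigr)
```

Both α(t) and σ(t) are usually decreased as training proceeds, so that early iterations arrange the map coarsely and later iterations fine-tune it.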
Self-organising maps are valuable in simplifying data to reveal the underlying patterns and relationships. There are several advantages of SOMs.
Despite these advantages, self organizing maps also have certain disadvantages.
Here is an example code for implementing a SOM using Python:
import numpy as np
from matplotlib import pyplot as plt


class SOM:
    def __init__(self, input_shape, output_shape, learning_rate=0.1, sigma=1.0):
        self.input_shape = input_shape      # number of input features
        self.output_shape = output_shape    # (rows, cols) of the map
        self.learning_rate = learning_rate
        self.sigma = sigma                  # width of the Gaussian neighbourhood
        # One randomly initialised weight vector per grid node
        self.grid = np.random.randn(*output_shape, input_shape)

    def train(self, data, num_epochs):
        for epoch in range(num_epochs):
            for x in data:
                winner = self._find_winner(x)
                self._update_weights(x, winner)

    def _find_winner(self, x):
        # Euclidean distance from x to every node's weight vector
        distances = np.linalg.norm(self.grid - x, axis=-1)
        return np.unravel_index(np.argmin(distances), self.output_shape)

    def _update_weights(self, x, winner):
        # Distance of every node from the winner, measured on the grid
        distances = np.linalg.norm(
            np.indices(self.output_shape) - np.array(winner)[:, np.newaxis, np.newaxis],
            axis=0,
        )
        influence = np.exp(-distances ** 2 / (2 * self.sigma ** 2))
        # Pull each node's own weight vector towards x, scaled by its influence
        self.grid += self.learning_rate * influence[..., np.newaxis] * (x - self.grid)

    def get_map(self):
        # Flatten the grid into a (rows * cols, input_shape) array of weight vectors
        return self.grid.reshape(-1, self.input_shape)
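Let’s go through this code step by step. The constructor stores the input dimension, the grid shape, the learning rate, and the neighbourhood width (sigma), and initialises the weight grid with random values. The train method loops over the data for the given number of epochs, finding the winning node for each sample and updating the weights. The _find_winner method returns the grid index of the node whose weight vector is closest (in Euclidean distance) to the sample. The _update_weights method applies a Gaussian neighbourhood centred on the winner when adjusting the weights, and get_map flattens the grid into a list of weight vectors.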
Here’s an example of how you can use this SOM implementation to cluster a dataset:
# Generate some sample data
data = np.random.randn(100, 2)
# Create a SOM with a 10×10 grid
som = SOM(input_shape=2, output_shape=(10, 10))
# Train the SOM for 100 epochs
som.train(data, num_epochs=100)
# Get the map of the SOM (named som_map to avoid shadowing Python's built-in map)
som_map = som.get_map()
# Plot the data and the SOM
plt.scatter(data[:, 0], data[:, 1], color='blue')
plt.scatter(som_map[:, 0], som_map[:, 1], color='red')
plt.show()
In this example, we generate a dataset of 100 2-dimensional vectors using NumPy’s randn function. We create a SOM with a 10×10 grid and train it on the dataset for 100 epochs. Finally, we get the map of the SOM and plot it along with the original data using Matplotlib.
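The plot shows where the nodes ended up, but clustering also needs each sample assigned to a node. The sketch below shows one way to do that, along with the quantization error (mean distance from each sample to its closest node), a common rough measure of fit. The helper names `best_matching_units` and `quantization_error` are hypothetical, and the functions assume a weight array of shape (rows, cols, features), such as `som.grid` in the class above.

```python
import numpy as np


def _sample_to_node_distances(weights, data):
    # Distances from every sample to every node: shape (n_samples, rows, cols)
    diff = weights[np.newaxis] - data[:, np.newaxis, np.newaxis, :]
    return np.linalg.norm(diff, axis=-1)


def best_matching_units(weights, data):
    """Return the (row, col) grid index of the closest node for each sample."""
    rows, cols, _ = weights.shape
    dists = _sample_to_node_distances(weights, data)
    flat = dists.reshape(len(data), -1).argmin(axis=1)
    return np.stack(np.unravel_index(flat, (rows, cols)), axis=1)


def quantization_error(weights, data):
    """Mean Euclidean distance between each sample and its closest node."""
    dists = _sample_to_node_distances(weights, data)
    return dists.reshape(len(data), -1).min(axis=1).mean()


# Toy 2×2 map whose weight vectors sit on the corners of the unit square
toy_weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                        [[1.0, 0.0], [1.0, 1.0]]])
toy_data = np.array([[0.1, 0.9], [0.9, 0.1]])

print(best_matching_units(toy_weights, toy_data))  # rows: [0 1] and [1 0]
print(quantization_error(toy_weights, toy_data))   # about 0.1414
```

Samples that share a best matching unit, or whose units are adjacent on the grid, can then be treated as belonging to the same or to related clusters.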
This should give you a good starting point for implementing SOMs in Python. The basic SOM algorithm has many variations and extensions, so feel free to experiment and explore!
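One common variation is to shrink the learning rate and the neighbourhood width as training progresses, whereas the implementation above keeps both fixed. The following is a minimal sketch under stated assumptions: the exponential schedule, the function names `exponential_decay` and `train_with_decay`, and the default values are illustrative choices, not part of the code above.

```python
import numpy as np


def exponential_decay(initial, step, total_steps, final_fraction=0.01):
    """Shrink `initial` smoothly to `initial * final_fraction` over total_steps."""
    return initial * final_fraction ** (step / total_steps)


def train_with_decay(weights, data, num_epochs, lr0=0.5, sigma0=3.0, seed=0):
    """Train a (rows, cols, features) weight grid with decaying lr and sigma."""
    rng = np.random.default_rng(seed)
    weights = np.array(weights, dtype=float)  # work on a float copy
    rows, cols, _ = weights.shape
    # Grid coordinates of every node, shape (rows, cols, 2)
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total_steps = num_epochs * len(data)
    step = 0
    for _ in range(num_epochs):
        for x in rng.permutation(data):  # visit samples in random order
            lr = exponential_decay(lr0, step, total_steps)
            sigma = exponential_decay(sigma0, step, total_steps)
            # Competition: find the best matching unit
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), (rows, cols))
            # Cooperation: Gaussian influence based on squared grid distance
            grid_dist_sq = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            influence = np.exp(-grid_dist_sq / (2 * sigma ** 2))
            # Adaptation: move every node towards x, scaled by its influence
            weights += lr * influence[..., np.newaxis] * (x - weights)
            step += 1
    return weights


# Example: train a 5×5 map on 200 random 2-dimensional points
demo_rng = np.random.default_rng(42)
demo_data = demo_rng.normal(size=(200, 2))
trained = train_with_decay(demo_rng.normal(size=(5, 5, 2)), demo_data, num_epochs=10)
print(trained.shape)  # (5, 5, 2)
```

Starting with a wide neighbourhood lets the whole map organise itself coarsely, while the small final values let individual nodes settle onto local detail.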
The immensity of available data today can make recognizing the existing correlations, relationships, and patterns challenging. Converting this data into an easily readable and digestible model can be critical to forming actionable insights to solve real-world problems. Self-Organizing Maps are one tool for doing so. Clustering and mapping higher-dimensional data into lower-dimensional models while preserving the topological properties of the input makes Self-Organizing Maps in machine learning highly applicable across fields.
If you want to be a part of the ever-evolving and advancing field of Machine Learning, look no further.
Step into the future with upGrad.
Sign up today for upGrad’s Advanced Certificate Programme in Machine Learning and NLP. Offered by IIIT Bangalore, the 8-month course from India’s #1 Technical University (Private) will jumpstart your career in the industry. The comprehensive curriculum includes subjects like Machine Learning, Natural Language Processing, Machine Translation, and Git, taught by an experienced faculty.
Enrol now and elevate your career by becoming part of an illustrious alumni network!
You can also check out our free courses offered by upGrad in Management, Data Science, Machine Learning, Digital Marketing, and Technology. All of these courses have top-notch learning resources, weekly live lectures, industry assignments, and a certificate of course completion – all free of cost!