Python Program for Merge Sort
Updated on Dec 20, 2023 | 8 min read | 6.9k views
As a multi-paradigm programming language with a structured, object-oriented design approach and a simple, uncluttered syntax, Python is rapidly emerging as the language of choice for programmers working on projects of varying complexity and scale.
Python provides a modular library of pre-built algorithms that allows its users to perform various operations, each of which may accomplish a task on its own or serve as a step towards a larger, more complex goal. One of the more popular such algorithms is Merge Sort.
It is a general-purpose sorting technique that takes a dataset of any comparable type, from any source, and divides it in repeated stages until it is broken down into its individual components. This recursive technique is commonly referred to as the ‘divide and conquer’ method.
The algorithm then puts the individual components back together, again in repeated stages, but sorts them into a pre-decided, logical sequence at each stage along the way. It does so by comparing elements and placing the smaller one first, until the entire data series is reconstituted in the desired sequence.
Take, for instance, a random dataset of letters of the alphabet: N, H, V, B, Q, D, Z, R.
Step 1: The original dataset first gets broken down into two groups as follows:
N, H, V, B | Q, D, Z, R
Step 2: Both the resulting arrays get further sub-divided as follows:
N, H | V, B | Q, D | Z, R
Step 3: Finally, all four arrays are further split up until the entire data series gets broken down into its individual components:
N | H | V | B | Q | D | Z | R
The process then reverses, and the individual data points now begin to merge in a stage-wise manner. Over the course of this merging process, the elements of each pair of sub-arrays are compared and placed in order, so that every merged sub-array comes out in its logical sequence (alphabetical order), as follows:
Step 4: Individual elements merge into pairs, swapping positions as required to form the correct sequence:
H, N | B, V | D, Q | R, Z
Step 5: The recursive process of merging and sorting continues to the next iteration:
B, H, N, V | D, Q, R, Z
Step 6: The entire data series is finally reconstituted in its logical alphabetical order:
B, D, H, N, Q, R, V, Z
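The merging in Steps 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration of the merge step only, not the full algorithm; the helper name merge_sorted is an assumption made for this example, and it expects two lists that are already sorted.

```python
def merge_sorted(left, right):
    """Merge two already-sorted lists into one sorted list (hypothetical helper)."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element of the two lists
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; append whatever remains of the other
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Step 4: single elements merge into sorted pairs
pair_1 = merge_sorted(["N"], ["H"])  # ['H', 'N']
pair_2 = merge_sorted(["V"], ["B"])  # ['B', 'V']

# Step 5: sorted pairs merge into a sorted group of four
left_half = merge_sorted(pair_1, pair_2)
print(left_half)  # ['B', 'H', 'N', 'V']
```

Because both inputs are already sorted, only the front elements ever need to be compared, which is why each merge runs in time proportional to the number of elements involved.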
There are two approaches to implementing Merge Sort in Python: the top-down approach and the bottom-up approach.
The more commonly used top-down approach is the one described above. It is recursive, so it pays for extra function calls and temporary sub-lists at every level of division. This overhead makes it comparatively inefficient on smaller datasets. However, its recursion depth grows only logarithmically with the size of the input, so it remains reliable when applied to large datasets.
Input code:
def merge_sort(inp_arr):
    """Sort inp_arr in place using top-down (recursive) merge sort."""
    size = len(inp_arr)
    if size > 1:
        middle = size // 2
        left_arr = inp_arr[:middle]
        right_arr = inp_arr[middle:]

        # Recursively sort each half
        merge_sort(left_arr)
        merge_sort(right_arr)

        # i and j are the iterators for traversing the left and right
        # halves of the data series, respectively, and k is the iterator
        # of the overall data series
        i = 0
        j = 0
        k = 0
        left_size = len(left_arr)
        right_size = len(right_arr)

        # Copy the smaller front element back into inp_arr
        while i < left_size and j < right_size:
            if left_arr[i] < right_arr[j]:
                inp_arr[k] = left_arr[i]
                i += 1
            else:
                inp_arr[k] = right_arr[j]
                j += 1
            k += 1

        # Copy any elements left over in the left half
        while i < left_size:
            inp_arr[k] = left_arr[i]
            i += 1
            k += 1

        # Copy any elements left over in the right half
        while j < right_size:
            inp_arr[k] = right_arr[j]
            j += 1
            k += 1


inp_arr = ["N", "H", "V", "B", "Q", "D", "Z", "R"]
print("Input Array:\n")
print(inp_arr)
merge_sort(inp_arr)
print("Sorted Array:\n")
print(inp_arr)
Output:
Input Array:

['N', 'H', 'V', 'B', 'Q', 'D', 'Z', 'R']
Sorted Array:

['B', 'D', 'H', 'N', 'Q', 'R', 'V', 'Z']
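The top-down function above sorts the list in place. Where the original order must be preserved, one possible variant returns a new list instead. The sketch below is an illustration under assumptions, not the listing above: merge_sort_copy is a hypothetical name, and the merge step is delegated to heapq.merge from the standard library, which combines already-sorted inputs.

```python
import heapq


def merge_sort_copy(items):
    """Return a new sorted list, leaving items untouched (hypothetical helper)."""
    if len(items) <= 1:
        return list(items)  # base case: nothing left to split
    middle = len(items) // 2
    left = merge_sort_copy(items[:middle])
    right = merge_sort_copy(items[middle:])
    # heapq.merge lazily merges already-sorted inputs into one sorted stream
    return list(heapq.merge(left, right))


letters = ["N", "H", "V", "B", "Q", "D", "Z", "R"]
result = merge_sort_copy(letters)
print(result)   # ['B', 'D', 'H', 'N', 'Q', 'R', 'V', 'Z']
print(letters)  # ['N', 'H', 'V', 'B', 'Q', 'D', 'Z', 'R']
```

Returning a new list costs an extra copy but avoids surprising callers who still need the unsorted data.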
Bottom-up approach:
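The bottom-up approach is iterative rather than recursive: it starts with sub-arrays of one element and merges adjacent sub-arrays of doubling size (1, 2, 4, and so on) until the whole series is sorted. Because it avoids recursive function calls, it tends to be quicker and use less memory, and it works efficiently with smaller datasets. It is nevertheless less frequently used than the top-down approach.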
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i, j = 0, 0
    for _ in range(0, len(left) + len(right)):
        if i == len(left):               # if at the end of 1st half,
            result.append(right[j])      # add all values of 2nd half
            j += 1
        elif j == len(right):            # if at the end of 2nd half,
            result.append(left[i])       # add all values of 1st half
            i += 1
        elif right[j] < left[i]:
            result.append(right[j])
            j += 1
        else:
            result.append(left[i])
            i += 1
    return result


def mergesort(ar_list):
    """Sort ar_list in place with bottom-up merge sort and return it."""
    length = len(ar_list)
    size = 1
    while size < length:
        size += size  # double the merged run size on each pass: 2, 4, 8, ...
        for pos in range(0, length, size):
            start = pos
            mid = pos + size // 2
            end = pos + size
            left = ar_list[start:mid]
            right = ar_list[mid:end]
            ar_list[start:end] = merge(left, right)
    return ar_list


ar_list = ["N", "H", "V", "B", "Q", "D", "Z", "R"]
print(mergesort(ar_list))
Output:
['B', 'D', 'H', 'N', 'Q', 'R', 'V', 'Z']
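To see how the bottom-up approach progresses, it can help to look at the list after each pass. The sketch below is a hypothetical tracing helper, not part of the listing above: bottom_up_passes is an assumed name, and it uses heapq.merge from the standard library for the merge step so that the example stays short.

```python
import heapq


def bottom_up_passes(items):
    """Yield a snapshot of the list after each bottom-up merge pass."""
    data = list(items)
    width = 1  # length of the sorted runs being merged in this pass
    while width < len(data):
        for start in range(0, len(data), 2 * width):
            mid = start + width
            end = start + 2 * width
            # Slices that run past the end of the list are shorter or empty
            merged = list(heapq.merge(data[start:mid], data[mid:end]))
            data[start:end] = merged
        width *= 2
        yield list(data)


passes = list(bottom_up_passes(["N", "H", "V", "B", "Q", "D", "Z", "R"]))
for snapshot in passes:
    print(snapshot)
# ['H', 'N', 'B', 'V', 'D', 'Q', 'R', 'Z']
# ['B', 'H', 'N', 'V', 'D', 'Q', 'R', 'Z']
# ['B', 'D', 'H', 'N', 'Q', 'R', 'V', 'Z']
```

Eight elements need three passes, matching Steps 4 to 6 of the walkthrough: the number of passes grows with the logarithm of the list length.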
Let’s apply the top-down approach to four random off-road vehicles in India:
Brand | Model | Ex-showroom price in Rs Crore |
Jeep | Wrangler | 0.58 |
Ford | Endeavour | 0.35 |
Jaguar Land Rover | Range Rover Sport | 1.76 |
Mercedes Benz | G-class | 2.42 |
Input code:
class Car:
    def __init__(self, brand, model, price):
        self.brand = brand
        self.model = model
        self.price = price

    def __str__(self):
        return "Brand: {}, Model: {}, Price: {}".format(
            self.brand, self.model, self.price)


def merge(list1, i, j, k, comp_fun):
    # Merge the sorted runs list1[i..k] and list1[k+1..j] (inclusive bounds)
    left_copy = list1[i:k + 1]
    j_sublist = list1[k + 1:j + 1]
    left_copy_index = 0
    j_sublist_index = 0
    sorted_index = i

    while (left_copy_index < len(left_copy)
           and j_sublist_index < len(j_sublist)):
        # comp_fun returns True when its first argument should come first
        if comp_fun(left_copy[left_copy_index], j_sublist[j_sublist_index]):
            list1[sorted_index] = left_copy[left_copy_index]
            left_copy_index = left_copy_index + 1
        else:
            list1[sorted_index] = j_sublist[j_sublist_index]
            j_sublist_index = j_sublist_index + 1
        sorted_index = sorted_index + 1

    # Copy whatever remains in the left run
    while left_copy_index < len(left_copy):
        list1[sorted_index] = left_copy[left_copy_index]
        left_copy_index = left_copy_index + 1
        sorted_index = sorted_index + 1

    # Copy whatever remains in the right run
    while j_sublist_index < len(j_sublist):
        list1[sorted_index] = j_sublist[j_sublist_index]
        j_sublist_index = j_sublist_index + 1
        sorted_index = sorted_index + 1


def merge_sort(list1, i, j, comp_fun):
    # Sort list1[i..j] in place; a run of zero or one element is already sorted
    if i >= j:
        return
    k = (i + j) // 2
    merge_sort(list1, i, k, comp_fun)
    merge_sort(list1, k + 1, j, comp_fun)
    merge(list1, i, j, k, comp_fun)


car1 = Car("Jeep", "Wrangler", 0.58)
car2 = Car("Ford", "Endeavour", 0.35)
car3 = Car("Jaguar Land Rover", "Range Rover Sport", 1.76)
car4 = Car("Mercedes Benz", "G-class", 2.42)
list1 = [car1, car2, car3, car4]

merge_sort(list1, 0, len(list1) - 1, lambda carA, carB: carA.brand < carB.brand)
print("Cars sorted by brand:")
for car in list1:
    print(car)
print()

merge_sort(list1, 0, len(list1) - 1, lambda carA, carB: carA.price < carB.price)
print("Cars sorted by price:")
for car in list1:
    print(car)
Output:
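Cars sorted by brand:
Brand: Ford, Model: Endeavour, Price: 0.35
Brand: Jaguar Land Rover, Model: Range Rover Sport, Price: 1.76
Brand: Jeep, Model: Wrangler, Price: 0.58
Brand: Mercedes Benz, Model: G-class, Price: 2.42

Cars sorted by price:
Brand: Ford, Model: Endeavour, Price: 0.35
Brand: Jeep, Model: Wrangler, Price: 0.58
Brand: Jaguar Land Rover, Model: Range Rover Sport, Price: 1.76
Brand: Mercedes Benz, Model: G-class, Price: 2.42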
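As a cross-check on the ordering above, the same records can be sorted with Python's built-in sorted function and a key. This is a sketch under assumptions rather than a replacement for the listing: it redefines Car as a NamedTuple purely to keep the example self-contained, and it reuses the prices from the code listing above. The built-in sort uses Timsort, a hybrid that draws on both merge sort and insertion sort.

```python
from operator import attrgetter
from typing import NamedTuple


class Car(NamedTuple):
    brand: str
    model: str
    price: float  # ex-showroom price in Rs crore, as in the code listing


cars = [
    Car("Jeep", "Wrangler", 0.58),
    Car("Ford", "Endeavour", 0.35),
    Car("Jaguar Land Rover", "Range Rover Sport", 1.76),
    Car("Mercedes Benz", "G-class", 2.42),
]

# attrgetter("brand") builds a function that returns car.brand for each car
by_brand = sorted(cars, key=attrgetter("brand"))
by_price = sorted(cars, key=attrgetter("price"))

print([car.brand for car in by_brand])
print([car.brand for car in by_price])
```

A key function extracts the value to compare once per element, whereas the comp_fun style used above compares two whole objects at a time; both lead to the same ordering here.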
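Insertion sort and merge sort are sometimes confused with one another, but they are different algorithms. Insertion sort builds the sorted array one element at a time, shifting larger elements to the right to make room for each new element, and takes O(n^2) time in the worst case. Merge sort splits the array, sorts the halves, and merges them in O(n log n) time. For comparison, here is an insertion sort, written in C: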
#include <stdio.h>

void insertionSort(int arr[], int n) {
    int i, key, j;
    for (i = 1; i < n; i++) {
        key = arr[i];
        j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key;
    }
}

void printArray(int arr[], int n) {
    int i;
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

int main() {
    int arr[] = {12, 11, 13, 5, 6};
    int n = sizeof(arr) / sizeof(arr[0]);
    printf("Given array is \n");
    printArray(arr, n);
    insertionSort(arr, n);
    printf("\nSorted array is \n");
    printArray(arr, n);
    return 0;
}
Output:
Given array is
12 11 13 5 6
Sorted array is
5 6 11 12 13
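Since the rest of this article uses Python, the same insertion sort can be sketched in Python for a like-for-like comparison with the merge sort listings. This is a direct translation of the C logic above; the function name insertion_sort is chosen for this example.

```python
def insertion_sort(arr):
    """Sort arr in place by inserting each element into the sorted prefix."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger elements one position right to make room for key
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key


numbers = [12, 11, 13, 5, 6]
insertion_sort(numbers)
print(numbers)  # [5, 6, 11, 12, 13]
```

Unlike merge sort, this works entirely within the original list and needs no extra array, but the nested loops mean the running time grows quadratically on unsorted input.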
Merge sort in Python lends itself to recursion, which makes sorting the numbers in a big array manageable. Each call to merge sort calls itself twice, once on each smaller half of the array. Each of those two calls in turn makes two calls of its own, so the next level down involves four calls in total, and so on.
We keep passing the problem on, but at some point something concrete has to be delivered. That stopping point (the base case) is reached when you are asked to sort an array holding a single number, which is already sorted and can be returned as it is. Where this stopping point lies is completely in your control when using recursion.
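The way recursive calls multiply, and where they stop, can be made concrete by counting them. The sketch below is an illustration only: count_calls is a hypothetical helper that mirrors the splitting pattern of a top-down merge sort (middle = n // 2) without sorting anything.

```python
def count_calls(n):
    """Return how many merge sort calls are made for a list of n elements."""
    if n <= 1:
        return 1  # base case: this call returns without recursing further
    middle = n // 2
    # One call for this level plus the calls made for each half
    return 1 + count_calls(middle) + count_calls(n - middle)


for n in (1, 2, 4, 8):
    print(n, count_calls(n))
# 1 1
# 2 3
# 4 7
# 8 15
```

For the eight-letter example, that is one call on the full list, two on the halves, four on the quarters, and eight on the single elements, where the recursion stops.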
The merge sort algorithm in Python has a time complexity of O(n log n) in the best, worst, and average cases. This is because the algorithm always divides the array into two halves, which produces about log n levels of division, and the halves at each level are merged in linear time. As for space, an extra array is needed to hold the resultant sorted array. Therefore, the space complexity is O(n).
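One way to get a feel for the O(n log n) bound is to count comparisons on inputs of growing size. The sketch below is an informal experiment, not a proof: merge_sort_count is a hypothetical instrumented variant of top-down merge sort, and the input sizes and random seed are arbitrary choices for this example.

```python
import math
import random


def merge_sort_count(items):
    """Return (sorted_list, comparisons) for a top-down merge sort."""
    if len(items) <= 1:
        return list(items), 0
    middle = len(items) // 2
    left, left_count = merge_sort_count(items[:middle])
    right, right_count = merge_sort_count(items[middle:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1  # one element comparison per loop iteration
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


random.seed(0)  # fixed seed so repeated runs print the same numbers
for n in (8, 64, 512, 4096):
    data = [random.random() for _ in range(n)]
    result, comparisons = merge_sort_count(data)
    assert result == sorted(data)
    # Columns: input size, comparisons made, n * log2(n) for reference
    print(n, comparisons, round(n * math.log2(n)))
```

For these power-of-two sizes the comparison count stays below n * log2(n), and it grows at roughly that rate as n increases.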
You can learn both theoretical and practical aspects of Python with upGrad’s Professional Degree in Data Science. This course helps you learn Python from scratch. Even if you are new to programming and coding, upGrad will offer you a two-week preparatory course so that you can pick up the basics of programming. You will learn about various tools like Python and SQL while working on multiple industry projects.