COURSES
MBAData Science & AnalyticsDoctorate Software & Tech AI | ML MarketingManagement
Professional Certificate Programme in HR Management and AnalyticsPost Graduate Certificate in Product ManagementExecutive Post Graduate Program in Healthcare ManagementExecutive PG Programme in Human Resource ManagementMBA in International Finance (integrated with ACCA, UK)Global Master Certificate in Integrated Supply Chain ManagementAdvanced General Management ProgramManagement EssentialsLeadership and Management in New Age BusinessProduct Management Online Certificate ProgramStrategic Human Resources Leadership Cornell Certificate ProgramHuman Resources Management Certificate Program for Indian ExecutivesGlobal Professional Certificate in Effective Leadership and ManagementCSM® Certification TrainingCSPO® Certification TrainingLeading SAFe® 5.1 Training (SAFe® Agilist Certification)SAFe® 5.1 POPM CertificationSAFe® 5.1 Scrum Master Certification (SSM)Implementing SAFe® 5.1 with SPC CertificationSAFe® 5 Release Train Engineer (RTE) CertificationPMP® Certification TrainingPRINCE2® Foundation and Practitioner Certification
Law
Job Linked
Bootcamps
Study Abroad
MS in Data AnalyticsMS in Project ManagementMS in Information TechnologyMasters Degree in Data Analytics and VisualizationMasters Degree in Artificial IntelligenceMBS in Entrepreneurship and MarketingMSc in Data AnalyticsMS in Data AnalyticsMS in Computer ScienceMaster of Science in Business AnalyticsMaster of Business Administration MS in Data ScienceMS in Information TechnologyMaster of Business AdministrationMS in Applied Data ScienceMaster of Business Administration | STEMMS in Data AnalyticsM.Sc. Data Science (60 ECTS)Master of Business AdministrationMS in Information Technology and Administrative Management MS in Computer Science Master of Business Administration Master of Business Administration-90 ECTSMSc International Business ManagementMS Data Science Master of Business Administration MSc Business Intelligence and Data ScienceMS Data Analytics MS in Management Information SystemsMSc International Business and ManagementMS Engineering ManagementMS in Machine Learning EngineeringMS in Engineering ManagementMSc Data EngineeringMSc Artificial Intelligence EngineeringMPS in InformaticsMPS in Applied Machine IntelligenceMS in Project ManagementMPS in AnalyticsMS in Project ManagementMS in Organizational LeadershipMPS in Analytics - NEU CanadaMBA with specializationMPS in Informatics - NEU Canada Master in Business AdministrationMS in Digital Marketing and MediaMSc Sustainable Tourism and Event ManagementMSc in Circular Economy and Sustainable InnovationMSc in Impact Finance and Fintech ManagementMS Computer ScienceMS in Applied StatisticsMaster in Computer Information SystemsMBA in Technology, Innovation and EntrepreneurshipMSc Data Science with Work PlacementMSc Global Business Management with Work Placement MBA with Work PlacementMS in Robotics and Autonomous SystemsMS in Civil EngineeringMS in Internet of ThingsMSc International Logistics and Supply Chain ManagementMBA- Business InformaticsMSc International ManagementMBA in Strategic Data Driven ManagementMSc Digital MarketingMBA Business and MarketingMaster of Business AdministrationMSc in Sustainable Global Supply Chain ManagementMSc Digital Business Analytics MSc in International HospitalityMSc Luxury and Innovation ManagementMaster of Business Administration-International Business ManagementMS in Computer EngineeringMS in Industrial and Systems EngineeringMSc International Business ManagementMaster in ManagementMSc MarketingMSc Business ManagementMSc Global Supply Chain ManagementMS in Information Systems and Technology with Business Intelligence and Analytics ConcentrationMSc Corporate FinanceMSc Data Analytics for BusinessMaster of Business AdministrationMaster of Business AdministrationMaster of Business AdministrationMSc in International FinanceMSc in International Management and Global LeadershipMaster of Business AdministrationBachelor of BusinessMaster of Business Administration 60 ECTSMaster of Business Administration 90 ECTSMaster of Business Administration 90 ECTSBachelor of Business AnalyticsBachelor of Information TechnologyMaster of Business AdministrationMBA Business AnalyticsMSc in Marketing Analytics and Data IntelligenceMS Biotechnology Management and EntrepreneurshipMSc in Luxury and Fashion ManagementMaster of Business Administration (90 ECTS)Bachelor of Business Administration (180 ECTS)B.Sc. Computer Science (180 ECTS) MSc in International Corporate Finance MSc in Sustainable Luxury and Creative IndustriesMSc Digital MarketingMSc Global Supply Chain Management (PGMP)MSc Marketing (PGMP)MSc Corporate Finance (PGMP)MSc Data Analytics for Business (PGMP)MS Business AnalyticsMaster of Business AdministrationMS Quantitative FinanceMS Fintech Management
For College Students

Analysis of Merge Sort in Data Structure

$$/$$

 

In the following video, we will analyse the Merge Sort algorithm. Now that we have used ‘divide and conquer’ in its implementation, it should ideally showcase an improved time efficiency compared to the previous algorithms. Let’s see if it does so.

$$/$$

Video Transcript

 

Now let us look at the time complexity of merge sort. So if you look at this code carefully and if you look at the first few steps of this pseudocode, you would see that here I am merely calculating my n. Here I'm only calculating my mid value which is n by two, or here I am just declaring my L and arrays. So, how much time do you think this will take? Will it really depend on the length of my input array? Suppose that my input array was of length 1 billion. So how much time would it take for me to calculate mid? Well, it would take me one single mathematical operation to calculate mid. And moreover, it would take me one single declaration to declare L and R. So basically, these entire all of these processes are going to take a constant amount of time because they involve a simple mathematical calculation where I'm calculating mid is equal to N by two and simply declaring my LNR array. Note that when it comes to inserting values in the LNR, we are doing it later. But here I'm merely just declaring my LNR array. So these steps are going to take a constant amount of time, which I'll denote by C. Now, moving on to these four loops here, I'm inserting all the elements from zero till mid minus one into L and I'm. Inserting all the remaining elements from mid till N minus one, into R. So basically, I have a large array of size N, and I'm inserting those N elements into two separate subarrays L and R. So how many steps does it involve? Well, it involves a total of n number of steps together. So writing down values in my left subarray and right subarray are total going to take N number of steps. So here both of these four loops together are going to take N amount of steps. Now, if you are still confused as to why these two for loops are taking a total N amount of time, so let us recall that in the beginning we had an array like this of eight elements and we divided it into elements of length for each.

 

Now, for dividing and filling values in these arrays, what I did was that I iterated from this position to this position and filled this array. So this took me a total number of four steps. Similarly, when I filled in the remaining elements into this array, this again took me a four number of steps. So how many steps did it take for me in total? It took me four plus four eight number of steps which was the length of A. And hence I say that it takes me a total of N number of steps for inserting all the values in my L and R array. Now, coming over to the next step, we see. I'm calling the merge sort function here on my L array and my merge sort function on my R array. So, if I denote the time complexity of this entire merge sort by T of N, note that T of N is merge sort on A, which is my larger array. So how much time is it going to take to do these merge sort operations on L and R? Now, since L and R both are of length N by two, so in terms of T of N, I can write them that they are going to take a time of T of N by two and N by two each.

 

It is because when I called merge sort of A, it took me T an amount of time. Now, when I call merge sort on array, which is half the length of a it will take me t? N by two amount of time which is for merge sort on l and it will take me t of n by two amount of time, which is merge sort on r now, coming over to the last step. I'm doing a merge operation here. So if you recall how I merged my two arrays. Well, I had these four arrays, these two arrays of length for each. And what I did was that I sequentially compared all the elements in these two smaller subarrays and I combined them into one. So I was beginning from these two points, comparing these two elements and writing them here. Then I was writing my next element here till when did I do it? Well, I did it till I reached the last element of my A array. So, if my length of A array is N, how many steps will it take for me to merge these L and R into my final A array? Yes, it is going to take me a total number of N steps, since I am merely taking elements from L and R, comparing them and putting them inside A. So this process that is this merging operation is going to take me a total number of N steps since I am only combining the elements from L and R and putting it into a final larger array a with which comprises of double elements of L and R. That is, it will take a total amount of N number of steps to merge these two smaller arrays.

 

Video Recap

 

 Merge sort is a sorting algorithm that utilizes the concept of divide and conquer.

  • A merge function is defined which takes in an input array, the starting index of that input array, the middle element, the mid index of that input array, and the ending index.

  • The length of the left and right subarrays are calculated.

  • Two new subarrays, i and r, are declared and populated with the respective elements of the left and right subarrays.

  • A merging process is executed where the smaller of the two elements is copied into a new array and the respective pointer is incremented.

  • If one of the subarrays has been exhausted, then the remaining elements of the other subarray are copied into the new array.

  • A sorting function is defined that takes in an input array, the starting index, and the ending index.

  • The sorting function calculates the mid index and calls itself recursively on the smaller subarrays.

  • Finally, the merge function is called on the left and right subarrays to sort and merge them.

  • The code is tested and verified to work correctly.

 

Till now you have learnt the breakdown of how many steps each part of the Merge Sort code takes. Let's use this information in the next video to calculate the time complexity of the algorithm.

$$/$$

Merge sort analysis in brief :

To analyse the time complexity, we can simply see what best divide and conquer can do to merge sort, to make it efficient.

            1     2       3     ………………………………........................................     n-1  n                         

 
             

 

  

.

      

Step 1: Divide

For dividing, we will compute the middle of the given ‘n’ number of elements. This would take a constant time, let’s say D(n)= Θ(1)

Step 2: Conquer

In this step, the sequence of ‘n’ elements is divided into n/2 elements each and now applying recursion, the n/2 elements would be further divided into n/2 which will continue until a single element is left. Therefore, we are recursively solving two sequences of n/2 elements that give us time complexity, let’s say 2T(n/2).

Step 3: Merge/Combine

Combining the subarrays of ‘n’ elements would take time C(n)= Θ(n).

 

T(n) = 0, if n<=1
T(n) = T(n/2) + T(n/2) + D(n)+ C(n), otherwise

 

T(n)= 2*T(n/2) + Θ(n) + Θ(1)      

// while finding the Big O /Theta we ignore the constants, hence theta(1) is ignored in the next step. And theta(n) is taken to be n because even in the worst case it can have an impact of ‘n’ only.

 

T(n)= 2*T(n/2) + n  ………………………………………… (i)

 

T(n/2) = 2*(2*T(n/4) + n/2) =T(n/)+n   ……………………………… (ii)

// In this step n in equation(i) is replaced by n/2 therefore, wherever there is n we have n/2, when we n/2 it is replaced with n/4 ([n/2]/2= n/4) and so on...

T(n) =  *T(n/2) + 2n ……………………. (iii)

// In this step ‘n’ in equation(ii), is replaced by 2n to make the same equation of the form T(n).

.

.

.

.

.

           = (2*T(n/2) + kn

  // just like equation (iii), we have obtained a general term for T(n)

Suppose n=2k , therefore k=logn

T()=nT(n/n)+logn*n

           = n*T(1) + nlogn

But T(1) = 0

T(n) = O(nlogn)

Some relief! Merge sort finally brings us out of O(N). The efficiency for merge sort comes out to be O(N log N). 

$$/$$