COURSES
MBAData Science & AnalyticsDoctorate Software & Tech AI | ML MarketingManagement
Professional Certificate Programme in HR Management and AnalyticsPost Graduate Certificate in Product ManagementExecutive Post Graduate Program in Healthcare ManagementExecutive PG Programme in Human Resource ManagementMBA in International Finance (integrated with ACCA, UK)Global Master Certificate in Integrated Supply Chain ManagementAdvanced General Management ProgramManagement EssentialsLeadership and Management in New Age BusinessProduct Management Online Certificate ProgramStrategic Human Resources Leadership Cornell Certificate ProgramHuman Resources Management Certificate Program for Indian ExecutivesGlobal Professional Certificate in Effective Leadership and ManagementCSM® Certification TrainingCSPO® Certification TrainingLeading SAFe® 5.1 Training (SAFe® Agilist Certification)SAFe® 5.1 POPM CertificationSAFe® 5.1 Scrum Master Certification (SSM)Implementing SAFe® 5.1 with SPC CertificationSAFe® 5 Release Train Engineer (RTE) CertificationPMP® Certification TrainingPRINCE2® Foundation and Practitioner Certification
Law
Job Linked
Bootcamps
Study Abroad
MS in Data AnalyticsMS in Project ManagementMS in Information TechnologyMasters Degree in Data Analytics and VisualizationMasters Degree in Artificial IntelligenceMBS in Entrepreneurship and MarketingMSc in Data AnalyticsMS in Data AnalyticsMS in Computer ScienceMaster of Science in Business AnalyticsMaster of Business Administration MS in Data ScienceMS in Information TechnologyMaster of Business AdministrationMS in Applied Data ScienceMaster of Business Administration | STEMMS in Data AnalyticsMaster of Business AdministrationMS in Information Technology and Administrative Management MS in Computer Science Master of Business Administration Master of Business Administration-90 ECTSMSc International Business ManagementMS Data Science Master of Business Administration MSc Business Intelligence and Data ScienceMS Data Analytics MS in Management Information SystemsMSc International Business and ManagementMS Engineering ManagementMS in Machine Learning EngineeringMS in Engineering ManagementMSc Data EngineeringMSc Artificial Intelligence EngineeringMPS in InformaticsMPS in Applied Machine IntelligenceMS in Project ManagementMPS in AnalyticsMS in Project ManagementMS in Organizational LeadershipMPS in Analytics - NEU CanadaMBA with specializationMPS in Informatics - NEU Canada Master in Business AdministrationMS in Digital Marketing and MediaMSc Sustainable Tourism and Event ManagementMSc in Circular Economy and Sustainable InnovationMSc in Impact Finance and Fintech ManagementMS Computer ScienceMBA in Technology, Innovation and EntrepreneurshipMSc Data Science with Work PlacementMSc Global Business Management with Work Placement MBA with Work PlacementMS in Robotics and Autonomous SystemsMS in Civil EngineeringMS in Internet of ThingsMSc International Logistics and Supply Chain ManagementMBA- Business InformaticsMSc International ManagementMBA in Strategic Data Driven ManagementMSc Digital MarketingMBA Business and MarketingMSc in Sustainable Global Supply Chain ManagementMSc Digital Business Analytics MSc in International HospitalityMSc Luxury and Innovation ManagementMaster of Business Administration-International Business ManagementMS in Computer EngineeringMS in Industrial and Systems EngineeringMaster in ManagementMSc MarketingMSc Global Supply Chain ManagementMS in Information Systems and Technology with Business Intelligence and Analytics ConcentrationMSc Corporate FinanceMSc Data Analytics for BusinessMaster of Business AdministrationMaster of Business AdministrationMaster of Business AdministrationMSc in International FinanceMSc in International Management and Global LeadershipMaster of Business AdministrationBachelor of BusinessBachelor of Business AnalyticsBachelor of Information TechnologyMaster of Business AdministrationMBA Business AnalyticsMSc in Marketing Analytics and Data IntelligenceMS Biotechnology Management and EntrepreneurshipMSc in Luxury and Fashion ManagementMaster of Business Administration (90 ECTS)Bachelor of Business Administration (180 ECTS)B.Sc. Computer Science (180 ECTS) MSc in International Corporate Finance MSc in Sustainable Luxury and Creative IndustriesMSc Digital MarketingMSc Global Supply Chain Management (PGMP)MSc Marketing (PGMP)MSc Corporate Finance (PGMP)MSc Data Analytics for Business (PGMP)MS Business AnalyticsMaster of Business AdministrationMS Quantitative FinanceMS Fintech ManagementMS Business Analytics PGMPState University of New York Bachelors Program - STEM MSc Business Intelligence and Data Science (PGMP)MSc International Logistics and Supply Chain Management ( PGMP)MSc International Management (PGMP)MSc Psychology & Management (PGMP)
For College Students

The Title Problem - How to Solve Them in Data Structures

$$/$$

Now that we have gone through all the search algorithms, let’s try to solve the first part of the problem we discussed at the beginning of the module. The first query was to get the score of a player using his/her email ID. Basically, you search for the email ID of the player and fetch the score corresponding to it. Let’s see how this is done.

$$/$$

Video Transcript

 

Now that we have understood important searching algorithms, we are in a good position to solve the first part of the leaderboard problem we discussed in the beginning of the module. The first query in the leaderboard example was to get the score of the user by their email address. So we need to search the user by their email address and we need to do it as fast as possible. Based on what you have learned so far, can you think of an approach which will make this operation as fast as possible? We can search for email naively and use a linear search, but linear search is order in. Can we do better? Yes, we can remember about Banner Research which we learned earlier. Banner Research has an order login time complexity. To put it more concretely, with say 1 million users in the system, we can find a player using his email ID in just 20 steps in worst case using binary search versus 1 million steps in the worst case using a linear search. This is a huge difference and binary search will help us look up the score of a player much more quickly than a naive linear search. That should be fast enough. To apply binary search we need to store our players in an array. But how do we perform binary search on email IDs which are essentially strings? The catch here is that strings also have a lexographical ordering that is nothing but alphabetical ordering. And in alphabetical ordering A is less than B, AB is less than AC, a is less than double A, and in our case if you consider email IDs, then ankit Gurg@gmail.com will be less than ankit Maheshwari@gmail.com. We will use this lexographical ordering to keep all the players sorted in our array, starting from the email ID starting with A's to the email ID starting with Z's. Now let's see how we can apply binary search to find a player in our array. Can you think of the steps which you will follow to search for the right player using binary search on their email IDs? We will first define the player object which will store in our sorted array.

 

So we have a class player in which we have three fields. The first is email ID which is a string. The second is score of the user, which is an integer, and the third would be created at timestamp. We will look at this third field created at timestamp later. Now let's go ahead and write the pseudocode for our binary search. Recall that we do binary search by recursively calling the same function with a smaller sub problem which is half the size of the previous one. Therefore we will need four parameters in the recursive function. The first parameter is L, which is the array of player objects in ascending histographical order of their email IDs. The second parameter is the email ID. To search for both of these parameters will carry forward as is to the sub problem in the recursive function call rest two parameters are the start and end indices of the subarray on which the search is being performed in the current function call.

 

In our functional call first we have the base condition where we check for the start pointer if it's greater than the end pointer. If the start pointer is greater than endpointer we return minus one that represents that the match is not found. And then we set the value of mid as the average of start and end pointer. Then we compare the input email with the email present at the midpointer. If the input email is less than the email present at the midpointer then we move towards left. In that case we'll keep the start pointer as same and we'll decrease the midpointer by one. On the other hand, if email is greater than the email present at the midpointer then we move towards the right. In that case we'll make start as mid plus one and we'll keep the end as same. But if the email is equal to the email at the midpointer then we have found the match and in that case we'll just return the middle email. Now you might wonder why did we have this particular base condition where you compare the start pointer with the end pointer and return minus one when start pointer is greater than the end pointer? In every function call we are either increasing the start pointer or decreasing the end pointer. So if the email does not exist inside the array then at certain function call our start pointer is bound to get greater than the end pointer. That is our base condition. Let's understand this base condition logic better through an example. Let's say we have an array of three strings a, m and z and we are searching for C in this array. If we apply our pseudocode on this array, we'll begin with start pointer at zero and end pointer as two and our mid will be one. And since C is clearly less than m so we move to the left of the mid. In a second iteration our start pointer will be zero and end pointer will also be zero. Hence our midpointer will also be zero. And since clearly C is greater than A, so we move to the right of the mid. In the third iteration our start pointer will become one and end pointer will become zero. There we have hit our base conditions. That means C does not exist inside this array.

 

Video Recap

 

  • The first query in the problem is to find a user's score by their email address.

  • Naive linear search is not efficient, so binary search is recommended.

  • Binary search has an order login time complexity, making it faster than linear search.

  • Strings have a lexicographical ordering that can be used to sort players in an array by their email addresses.

  • Pseudocode is provided for performing binary search on email IDs to find the right player.

  • The pseudocode uses recursive function calls with four parameters: L, the array of player objects in ascending lexicographical order; the email ID; and the start and end indices of the subarray being searched.

  • The function checks for the base condition where the start pointer is greater than the end pointer, returning  if the match is not found.

  • The base condition logic is explained through an example of searching for C in an array of strings a, m, and z.

 

Now that you are clear about the approach you need to follow, let’s move to the business end of the problem, the Java implementation.

$$/$$

Great! You just saw the code that searches for the email ID and fetches the score corresponding to it. Keep in mind that since you’re using binary search, you need a sorted list of email IDs. Here, the IDs were sorted in the lexicographical order. You will learn more about sorting in the next session.