Companies like Amazon and Flipkart have an extremely large amount of data - so large that it becomes difficult to analyse on a single computer. You need a whole different infrastructure to deal with it. This huge amount of data is termed 'big data', and analysing it is termed big data analytics.
Let's have Ujjyaini introduce you to the basics of big data analytics.
You learnt that big data is characterised by 3 Vs - Volume, Velocity and Variety. Volume refers to the size of the data, velocity refers to the rate at which the data is received, and variety refers to the different types of data that you may get - images, text, numbers, speech, videos, etc.
Big data requires a different architecture to be stored on disk. It cannot be stored like any other (relatively small) dataset, and that brings with it a completely different set of complications.
Parallel computing is, quite simply, the distribution of tasks among different machines - much like we distribute tasks among different people and then collate the results to get the desired output.
Say you want to add the first 100 positive numbers. One way is to add the first two numbers, then add the next number to the running sum, and keep repeating till you reach 100. However, when you have huge data - say you want to add the first thousand million (one billion) numbers - this sequential approach takes a huge amount of time. In such a case, you split the data into a thousand subsets, each containing one million numbers. The sum of each subset is calculated separately, and in parallel. You now have a thousand such 'sums', which are then added together to arrive at the final result. You will be amazed to know that even your smartphone employs these concepts to achieve faster results: it distributes work among its multiple processor cores and then collates the results.
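To see this pattern in code, here is a minimal sketch in Python (not from the original lesson) that splits the billion numbers into a thousand chunks, sums each chunk on a separate worker process using the standard-library multiprocessing module, and then collates the partial sums; the chunk count and variable names are illustrative assumptions.

```python
from multiprocessing import Pool

def chunk_sum(bounds):
    # Sum the integers in the half-open range [start, end) for one chunk.
    start, end = bounds
    return sum(range(start, end))

if __name__ == "__main__":
    N = 1_000_000_000   # add the first thousand million (one billion) numbers
    n_chunks = 1_000    # split into a thousand subsets of one million each
    size = N // n_chunks
    chunks = [(i * size + 1, (i + 1) * size + 1) for i in range(n_chunks)]

    # Distribute the chunks among worker processes (one per CPU core by
    # default), just as you would distribute tasks among different people.
    with Pool() as pool:
        partial_sums = pool.map(chunk_sum, chunks)

    # Collate: add the thousand partial 'sums' to get the final result.
    total = sum(partial_sums)
    print(total)        # equals N * (N + 1) // 2 = 500000000500000000
```

This split-compute-collate structure is the same idea that underlies many big data frameworks: since the work on each subset is independent, it can run on different cores, or even different machines, at the same time.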
Here, you can read more about Big Data and Parallel Computing.