“We are excited to partner with upGrad and launch this course on Machine Learning and Cloud. We are looking forward to teaching and training the young, talented minds enrolling in the program.” - Prof. Janakiram D.
Analyze movie data from the past hundred years and find out various insights to determine what makes a movie do well. Use data manipulation, data slicing, and various other data-frame operations to successfully find usable insights from the movies.
Analysis of the Uber dataset to understand the demand and supply of cab services using data visualization
As part of this case study, students must use the Boto3 SDK and AWS services to build an application that scans an image stored in S3 for celebrities.
Using EDA, help Stack Overflow implement features in its web application, such as sending notifications to relevant users when a question is raised and estimating the time within which the user will receive an answer. Perform a time series analysis to study the trend of data science and non-data-science tags.
Like the BFSI and healthcare industries, the telecommunications industry deals with big data situations. These situations involve collecting, storing and analyzing data to generate insights. This case study uses tools such as Sqoop, Hive and Oozie for jobs such as ingestion, analysis and scheduling.
Analyze a public clickstream dataset of a cosmetics store to extract the kind of insights that data engineers typically produce at an e-retail company. Use effective querying, query optimization, and file partitioning techniques to improve the performance of your queries.
Perform real-time hashtag analysis of tweets using the Twitter APIs. Get real-time tweets from the API and then filter out the hashtags using Spark Streaming.
Using the ALS algorithm, predict the top 20 recommendations for each user who has rated certain movies, based on his/her likes and dislikes. The recommendation system is built on the MovieLens dataset using a collaborative filtering technique, i.e. the ALS algorithm.
Implement market basket analysis using Spark with the help of the Apriori algorithm. The assignment tests business acumen by implementing multiple strategies over the results of the Apriori algorithm.
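To give a feel for what the Apriori step produces, here is a minimal single-machine sketch in plain Python rather than Spark. The baskets, the item names and the `min_support` value are made up for illustration, and the `apriori` helper is hypothetical, not part of the course material or of any Spark library.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]
frequent = apriori(baskets, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

The frequent itemsets are the raw material for the business strategies the assignment asks for, such as deciding which products to bundle or place together.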
An e-commerce company is evaluating its customer experience. It runs a survey among its customers to understand their feedback. The feedback shows that late delivery is one of the most frequent problems customers face. However, the consumer experience manager says that the late delivery percentage is similar to the industry average. Hypothesis testing is used to establish that the organization has a poor record in delivery. To solve this issue, the company decides to build an intelligent system that predicts the delivery date by taking many factors into account. Once the algorithm is built, it is rolled out to 10% of the customers to check whether it is successful. Use A/B testing to compare the late delivery complaint percentage before and after the algorithm was rolled out.
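One common way to run such an A/B comparison is a two-proportion z-test. The sketch below is only an illustration under stated assumptions: the order and complaint counts are invented, the 90/10 split mirrors the rollout described above, and `two_proportion_z_test` is a hypothetical helper, not the method prescribed by the case study.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided z-test for H1: proportion 1 is greater than proportion 2.

    x1, x2 are complaint counts; n1, n2 are the group sizes.
    Returns the z statistic and the one-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 90% of orders without the new system (control),
# 10% of orders with it (treatment).
control_complaints, control_orders = 1080, 9000
treatment_complaints, treatment_orders = 80, 1000

z, p_value = two_proportion_z_test(
    control_complaints, control_orders, treatment_complaints, treatment_orders
)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.5f}")
```

A small p-value would suggest that the late delivery complaint rate is lower for customers served by the new system than for the rest.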
A common challenge in the real estate industry includes predicting the price of a flat given various factors such as area, bedrooms, parking, etc. Machine learning enables you to answer this question by making use of the relationship between these variables.
Build a pricing model to predict the fare of a taxi ride given a set of attributes such as date-time and the coordinates for pickup and drop-off. Data understanding, outlier analysis, and effective feature engineering are essential in any modelling exercise.
The objective of the case study is to predict whether a customer leaves the network or not. A logistic regression model is built with the scikit-learn library to predict the churn of individual customers. The data is divided into three different tables, which need to be combined. A lot of importance is given to feature selection and to the metrics used to evaluate the model.
The dataset contains information about advertisements shown to customers. The complete dataset is anonymised for privacy reasons. The objective of the case study is to build a logistic regression model that can predict whether an advertisement will be clicked or not. The case study includes building a logistic regression model in PySpark. To do this, EDA and all the pre-processing are done in PySpark itself. Insights about handling anonymised data are provided. The business implications of selecting the hyperparameter (threshold) are also covered. Performance optimization for Spark is also discussed.
Learn how predictive analytics can be used to decide the creditworthiness of customers and whether they can be issued a credit card or not. Incorporating business-relevant techniques and taking business-driven decisions are crucial while building models, especially in the finance industry.
You will be required to build a tree to predict the income of a given population, which is labelled as <= 50K and >50K. The attributes (predictors) are age, working-class type, marital status, gender, race, etc. Build a decision tree in Python. Visualization and hyperparameter tuning are included.
The LibSVM format is used to represent sparse data, in which many feature values are absent, in a compact form. Build a regression decision tree in Spark using Spark libraries.
Correctly predict whether a person has heart disease or not based on: id, age, gender, height, weight, ap_hi, ap_lo, cholesterol, gluc, smoke, alco, active, cardio. Build a decision tree in Python using sklearn. Also requires hyperparameter tuning and visualizing the tree.
Correctly predict whether a transaction is fraudulent or not based on previous records. Requires building a classification decision tree.
Use a random forest model to recommend an FBI code based on the different attributes of the crime for the Chicago police.
Analyse the data of different matches to understand what kind of strategy works better to improve the ranking.
In this case study, we have built a clustering model for customer segmentation. The data contains various attributes such as InvoiceDate, UnitPrice, CustomerID etc. On the basis of this data, the supermarket store wants to segment its customers into clusters based on their shopping patterns.
Implement a movie recommendation system (item-based filtering) by reducing the dimensionality of the movies vs. tags dataset using the PCA algorithm. PCA acts as a dimensionality reduction technique, compressing the data into 100 dimensions instead of the 1000 dimensions (tags) in the original dataset. Using these 100 dimensions, we can find the 5 nearest movies to a particular movie, which turned out to be a very effective item-based recommendation system.
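The idea can be sketched with NumPy as follows. This is an illustration under assumptions, not the case study's implementation: the movie-by-tag matrix is random synthetic data, cosine similarity is assumed as the notion of "nearest", and `top_k_similar` is a hypothetical helper. Only the 1000 tags, the 100 components and the top 5 neighbours come from the description above.

```python
import numpy as np

# Synthetic stand-in for the movies vs. tags matrix (rows: movies, columns: tags).
rng = np.random.default_rng(0)
n_movies, n_tags, n_components = 200, 1000, 100
X = rng.random((n_movies, n_tags))

# PCA via SVD: centre each tag column, then project the movies onto the
# first n_components right singular vectors (the principal directions).
X_centred = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centred, full_matrices=False)
Z = X_centred @ Vt[:n_components].T  # shape: (n_movies, n_components)


def top_k_similar(embeddings, movie_idx, k=5):
    """Indices of the k movies closest to movie_idx by cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1)
    sims = (embeddings @ embeddings[movie_idx]) / (norms * norms[movie_idx])
    sims[movie_idx] = -np.inf  # never recommend the movie itself
    return np.argsort(sims)[::-1][:k]


neighbours = top_k_similar(Z, movie_idx=0, k=5)
print("Movies most similar to movie 0:", neighbours.tolist())
```

Searching in the 100-dimensional space instead of the original 1000 tags keeps the similarity computation cheaper while retaining the directions of largest variance.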
Over 2,300 students have completed this course and started working at their dream jobs. What's stopping you?
The content will be a mix of interactive lectures from industry leaders as well as world-renowned faculty. Additionally, the program comprises live lectures dedicated to solving your academic queries and reinforcing learning. Offline upGrad BaseCamps will also facilitate peer-to-peer interactions.
Experience in areas of Data Analytics / Software Development / Cloud Management / Database Management / Machine Learning, etc.
Knowledge of at least one programming language such as R/Python/Java/C/C++
Undergraduate student in disciplines such as engineering, Maths/Statistics.
The program will benefit you in different ways depending on your prior experiences:
The program will familiarise you with the advancements in ML and Cloud. It will also help you understand how machine learning models are deployed using the cloud so that you can transition to a Data Scientist / Machine Learning Engineer / Data Engineer / Data Analyst (using Cloud) role.
This program is designed for working professionals looking to pick up skills in advanced concepts such as Machine Learning, Cloud, Data Warehousing, Data Management, Big Data Processing, and Deployment of Machine Learning Models. This program demands consistent work and time commitment over the entire duration of 12 months.
After you successfully complete the program, you will receive an Advanced Certification in Machine Learning and Cloud from IIT Madras.
upGrad, IIT Madras' world-class faculty, and many industry leaders have committed a lot of time to conceptualising and creating this program so that candidates receive the best possible learning experience. Hence, we want to make sure that the participants of this program also show a very high level of commitment and passion for Machine Learning and Cloud.
Applicants will undergo a profile evaluation, a mandatory selection test designed to check mathematical and programming abilities, and a personal interview.
Refund Policy: (Programs with prep-session component)