BITS Pilani
In association with

PG Program in Big Data Engineering

Online | 11 Months | Starts March 2018 | Rs. 2,25,000 (Incl. Taxes)

Rigorous Post-Graduate Program by BITS Pilani

Successfully complete all courses and be eligible to receive a Post-Graduate program certificate and transcript from BITS Pilani

Why Big Data?

High Impact Projects

Real-time Aadhaar verification, Amazon's user recommendations, Facebook's news feed suggestions: all are made possible by Big Data!

Wide-ranging Applications

From manufacturing and e-commerce to the public sector, healthcare and agriculture, Big Data is applicable everywhere, and its use cases are growing rapidly!

High demand for skills

Thousands of job openings from top companies with 30-60% salary hikes for skilled Big Data professionals

From BITS Pilani

Over the years, BITS has provided the highest quality technical education to students from all over India admitted on the basis of merit. Its graduates may be found throughout the world in all areas of engineering, science and commerce. The primary motive of BITS is to "train young men and women able and eager to create and put into action such ideas, methods, techniques and information".

“This program presents a unique opportunity to create a long term career in one of the fastest growing industries in the country.”

Prof Nayan Khare

Program Co-ordinator,

BITS Pilani

Learn from Big Data Experts

Each course in this program is taught by top academicians from BITS and leading industry experts in big data.

Prof. S Balasubramaniam

Dean - Academics and Resource Planning

Shankar Radhakrishnan

Director of Big Data Engineering


Sourabh Mukherjee

Big Data Leader & Industry Veteran, Ex-IBM & Cognizant

Dr. Chittaranjan Hota

Associate Dean - Admissions; Prof., Dept. of Computer Science

Prof Nayan Khare

Asst. Prof, Dept. of Computer Science

Joy Mukherjee

Head of Engineering and Operations, Sparkline Data, Ex-Yahoo, American Express

Prof. Vimal SP

Asst. Prof, Dept. of Computer Science

Shakun Gupta

Senior Big Data Engineer, Founder and CTO of Slassy



Karthikeyan Sankaran

Director, Data Science and Machine Learning, LatentView Analytics

Dhanashree NP

Lecturer, Dept. of Computer Science


Program Syllabus

The program curriculum has been developed in collaboration with BITS faculty and leading Big Data companies from across sectors. Most of the courses also have an independent project for you to work on, which is sourced from the industry and adds immensely to your learning.

If you don't have previous experience in programming or databases (SQL), don't worry! By enrolling in the program, you get access to free pre-program preparatory sessions that will strengthen your skills in fundamental Computer Science concepts.

Topics Covered:

  • Object-Oriented Programming (OOP) using Java
  • Data Structures
  • Design and Analysis of Algorithms
  • Relational Database Management Systems (SQL)

Prep Sessions will be available to enrolled students from 15th Sept. 2017.


Duration: 8 weeks

In this course you will be given an introduction to Big Data and its common industry applications. You will also develop important foundations in data structures and algorithms that form the basis of the Big Data Systems used in the industry.

Topics Covered:

  • Introduction to Big Data and its Applications
  • Data Abstraction
  • Linear data structures like Hashtables, Hashmaps, Bloom Filters
  • Non-linear data structures like Binary Search Trees, KD Trees
  • Distributed Algorithm Design
  • Algorithm Design using MapReduce
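To give a flavour of one of the structures listed above, here is a minimal sketch of a Bloom filter in Java. It is an illustration only, not course material: the class name `SimpleBloomFilter`, the bit-array size, the number of hash functions and the double-hashing scheme built on `String.hashCode()` are all assumptions chosen for brevity.

```java
import java.util.BitSet;

// Hypothetical, simplified Bloom filter: answers "possibly present" or
// "definitely absent" for strings, using a fixed-size bit array.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    // This hashing scheme is an assumption made for the sketch.
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive probes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    // Set every bit position derived from the item.
    public void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(item, i));
        }
    }

    // False means the item was never added; true may be a false positive.
    public boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("hadoop");
        filter.add("spark");
        System.out.println(filter.mightContain("hadoop")); // prints "true"
        System.out.println(filter.mightContain("storm"));
    }
}
```

The trade-off is that the filter never stores the items themselves, so it stays small even for very large data sets, at the cost of occasional false positives.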

Course Outcomes:

You will be able to select and implement appropriate data structures to solve big data problems and also write Map and Reduce code for distributed processing of data.

Programming Language Used: Java
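As a rough idea of what Map and Reduce code looks like, here is a minimal word-count sketch in plain Java. It deliberately does not use the Hadoop API: the map, shuffle and reduce phases are simulated in memory on a single machine, and the class name `WordCountSketch` and its method names are assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine simulation of the MapReduce word-count pattern.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all the counts emitted for one word.
    public static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map on every line, group values by key (the "shuffle"),
    // then run reduce once per key.
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data is big", "data is everywhere");
        System.out.println(run(lines)); // prints {big=2, data=2, everywhere=1, is=2}
    }
}
```

In a real distributed framework, the map calls would run in parallel on different machines and the grouping step would move data across the network; the structure of the two functions stays the same.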

Duration: 8 weeks

In this course you will be exposed to the different platforms used for processing Big Data. You will also learn how to set up a virtual machine for processing Big Data on your own computer as well as on the cloud.

Topics Covered:

  • Distributed Computing Environment for Big Data
  • NoSQL databases for Big Data Storage Applications (HBase)
  • Distributed Processing of data using MapReduce & Pig
  • In-memory distributed processing using Apache Spark
  • Data Storage on Cloud (Amazon S3 & DynamoDB)

Course Outcomes:

You will be able to perform batch processing operations on Big Data on your own computer as well as on an Amazon EC2 instance. You will be able to retrieve and store data in HDFS & HBase using MapReduce & Apache Pig.

Tools & Technologies Used: Hadoop, HBase, Apache Pig, Apache Spark, Amazon S3 & DynamoDB

Duration: 7 weeks

Learn about collecting and processing structured and unstructured data by performing ETL operations. Use workflow manager tools to learn how to automate task flows.

Topics Covered:

  • Performing ETL Operations
  • Concepts in Data Warehousing and its Relevance for Big Data
  • Ingesting data into Big Data Platforms using Apache Sqoop & Flume
  • Workflow management for Hadoop using Oozie
  • Batch Processing on Cloud

Course Outcomes:

You will learn to choose and use tools to ingest structured and unstructured data into big data processing systems and use Hive to perform data transformations. You will also be able to process Big Data on the cloud using Amazon EMR and use Oozie to manage your workflow.

Tools & Technologies Used: Sqoop, Apache Flume, Apache Hive, HBase, Amazon EMR

Duration: 4 weeks

Ever wondered how you receive a notification based on your location? The answer lies in exploiting Real Time & Streaming Data. This course will expose you to the exciting world of processing real time data.

Topics Covered:

  • Applications of Streaming Data in Industry
  • Sourcing Streaming data using Apache Flume
  • Building a real-time data pipeline using Apache Storm
  • Streaming on Apache Spark

Course Outcomes:

You will be able to build real-time data processing systems using Apache Storm and Apache Spark.

Tools & Technologies Used: Apache Storm, Apache Flume, Apache Spark

Duration: 5 weeks

In this course you will be introduced to the field of Big Data Analytics and you will learn about the libraries in Apache Spark used to perform regression, classification and clustering on Big Data.

Topics Covered:

  • Regression, Clustering & Classification using Spark MLlib
  • Building visualizations using Big Data
  • Case Studies on applications of Big Data Analytics

Course Outcomes:

  • You will be able to perform analytics on big data using Spark MLlib and become familiar with tools for visualizing the results.
  • Interested students will also have an opportunity to learn the basics of functional programming in Scala*

Tools & Technologies used:

Spark (MLlib) and Scala*

Duration: 6 weeks

Apply the lessons learnt in the program to an industry-relevant project by ingesting, processing and analyzing data on a big data platform in the cloud.


* signifies optional/additional learning material for interested students


Program Vitals

Course Duration

Mar'18 - Feb'19 | Online, 11 months

Time Commitment

8-10 hours per week

Program Fee

Rs. 2,25,000 (Incl. taxes) | Flexible EMI options available