BITS Pilani
In association with

PG Program in Big Data Engineering

Online | 11 Months | Starts Nov 2017 | Rs. 2,25,000 (Incl. Taxes)

Learn. Experience. Master.

Learn from the experts

Learn Big Data concepts from leading BITS Pilani faculty and their applications from seasoned industry professionals.



Career Support & Learning Experience

Prepare for interviews and apply for suitable Big Data jobs through UpGrad’s network.


Master concepts through projects

Solve industry-relevant projects and build an impressive portfolio. You also get a chance to work on a Capstone Project.


Program Vitals

Course Duration

Nov'17 - Oct'18 (Online, 11 months)

Time Commitment

8-10 hours per week

Program Fee

Rs. 2,25,000 (Incl. taxes). Flexible EMI options available.

Program Syllabus

If you don't have previous experience in programming or databases (SQL), don't worry! By enrolling in the program, you get access to free pre-program preparatory sessions that will strengthen your skills in fundamental Computer Science concepts.

Topics Covered:

  • Object Oriented Programming (OOP) using JAVA
  • Data Structures
  • Design and Analysis of Algorithms
  • Relational Database Management Systems (SQL)

Prep Sessions will be available to enrolled students from 15th Sept. 2017.

Duration: 8 weeks

In this course, you will be introduced to Big Data and its common industry applications. You will also build foundations in the data structures and algorithms that underlie the Big Data systems used in industry.

Topics Covered:

  • Introduction to Big Data and its Applications
  • Data Abstraction
  • Linear data structures such as Hashtables, Hashmaps, and Bloom Filters
  • Non-linear data structures such as Binary Search Trees and KD Trees
  • Distributed Algorithm Design
  • Algorithm Design using MapReduce

Course Outcomes:

You will be able to select and implement appropriate data structures to solve Big Data problems, and to write Map and Reduce code for distributed processing of data.

Programming Language Used: Java
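
As a rough illustration of what "Map and Reduce code" means, here is a minimal word-count sketch in plain Java. It simulates the map, shuffle, and reduce phases in memory on one machine and does not use the Hadoop API. The class name WordCountSketch and its method names are hypothetical, chosen only for this example, and are not taken from the course material.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-machine sketch of the MapReduce pattern (not Hadoop code).
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: combine all values for one key into a single total.
    static int reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    // Driver: run map on every line, shuffle the results, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data is big", "data is everywhere");
        System.out.println(wordCount(lines));
    }
}
```

In a real distributed framework, the map and reduce functions would run in parallel on different machines and the framework itself would handle the shuffle; the sketch only shows how the three phases fit together.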

Duration: 8 weeks

In this course, you will be exposed to the different platforms used for processing Big Data. You will also learn how to set up a virtual machine for processing Big Data on your own computer as well as on the cloud.

Topics Covered:

  • Distributed Computing Environment for Big Data
  • NoSQL databases for Big Data Storage Applications (HBase)
  • Distributed Processing of data using MapReduce & Pig
  • In-memory distributed processing using Apache Spark
  • Data Storage on Cloud (Amazon S3 & DynamoDB)

Course Outcomes:

You will be able to perform batch processing operations on Big Data on your own computer as well as on an Amazon EC2 instance. You will also be able to store and retrieve data in HDFS and HBase using MapReduce and Apache Pig.

Tools & Technologies Used: Hadoop, HBase, Apache Pig, Apache Spark, Amazon S3 & DynamoDB

Duration: 7 weeks

Learn about collecting and processing structured and unstructured data by performing ETL operations. Use workflow management tools to automate task flows.

Topics Covered:

  • Performing ETL Operations
  • Concepts in Data Warehousing and its Relevance for Big Data
  • Ingesting data into Big Data Platforms using Apache Sqoop & Flume
  • Workflow management for Hadoop using Oozie
  • Batch Processing on Cloud

Course Outcomes:

You will learn to choose and use tools to ingest structured and unstructured data into Big Data processing systems, and to use Hive to perform data transformations. You will also be able to process Big Data on the cloud using Amazon EMR and use Oozie to manage your workflows.

Tools & Technologies Used: Sqoop, Apache Flume, Apache Hive, HBase, Amazon EMR

Duration: 4 weeks

Ever wondered how you receive a notification based on your location? The answer lies in exploiting real-time and streaming data. This course will expose you to the exciting world of processing real-time data.

Topics Covered:

  • Applications of Streaming Data in Industry
  • Sourcing Streaming Data using Apache Flume
  • Building real-time data pipelines using Apache Storm
  • Streaming on Apache Spark

Course Outcomes:

You will be able to build real-time data processing systems using Apache Storm and Apache Spark.

Tools & Technologies Used: Apache Storm, Apache Flume, Apache Spark

Duration: 5 weeks

In this course, you will be introduced to the field of Big Data Analytics and learn about the Apache Spark libraries used to perform regression, classification, and clustering on Big Data.

Topics Covered:

  • Regression, Clustering & Classification using Spark MLlib
  • Building visualizations using Big Data
  • Case Studies on applications of Big Data Analytics

Course Outcomes:

  • You will be able to perform analytics on Big Data using Spark MLlib and become familiar with tools for visualizing results.
  • Interested students will also have an opportunity to learn the basics of functional programming in Scala*.

Tools & Technologies Used: Spark (MLlib) and Scala*

Duration: 6 weeks

Apply the lessons learnt in the program to an industry-relevant project by ingesting, processing, and analyzing data on a Big Data platform in the cloud.

* signifies optional/additional learning material for interested students
