Big Data (Tech Orda Voucher)
Striving to gain market-oriented knowledge and skills to jumpstart your career in IT? Apply for this program and shape your professional path with EPAM experts.
Registration closed
600000 KZT
Program start
Nov 2022
25 weeks

Study alongside the best EPAM practitioners with a FREE voucher provided by the Tech Orda Program! To get into the program, you have to go through a competitive selection process.

Big Data is a variety of tools and approaches for processing both structured and unstructured data to use it for specific tasks and purposes. Big Data specialists help analyse data for insights that improve decisions and give confidence for making strategic business moves.


  • Learn about Data Engineer responsibilities, what data is, types of data, how to process and work with data, specifics of data systems and architectures
  • Understand the specifics of working with data in analytic, streaming and ML systems (including such frameworks as Spark, Hive, Hadoop-based ecosystems and streaming tools – Kafka, Flink, Spark), learn to build pipelines and work with this toolset
  • Gain grounded knowledge in cloud computing, understand when to use on-premises or cloud technologies, and get to know the Data Engineer's cloud toolset, including Databricks, HDInsight, and Event Hubs in Azure; Dataflow and Dataproc in GCP; and Glue, Athena, and EMR in AWS


  • Real Projects. You will not only study, but learn how to perform. You will gain real project experience and prepare for a technical interview for your dream job
  • Accessible Content. We make complex things simple. We boil sophisticated concepts down into digestible, bite-sized content without sacrificing quality or the learning experience
  • Hands-on Learning. You will not only study, but also learn by doing. You will assemble a toolkit that will help you become a best-in-class creator, capable of keeping up with the fast pace of technology
  • Career Services. You will receive all the tools to prepare for your dream job: from resume preparation to mock technical interviews that get you ready for the real ones
How to get started?
  1. Register on this page
  2. Specify your interest in Big Data in the Additional information block of your profile (including reasons for choosing this training course)
  3. Take an English test by November 6 (it's available in your profile)
  4. Pass the Engineering Assessment by November 30 with a result of 70% or above (if you successfully pass the English test and meet the other requirements, we will send you a link to this assessment by November 7)
  5. Pass an interview with the recruiter by December 9
  6. Start learning on December 15
What will you learn?

Introduction to Data

You will get acquainted with the data engineering workflow and data product stages. Also, you will review the latest trends in data engineering alongside the key Big Data tools and applications. You will look at the characteristics of successful Big Data solutions and get familiar with the general architecture. Due attention will be paid to data governance and security issues. Moreover, you will get a comparison of the major cloud services providers.


This submodule will introduce you to the world of Hadoop, which is the number one choice when it comes to storing and processing Big Data. In the upcoming lessons, you will discover why this platform gained its massive popularity, what benefits it brings to businesses, and how Hadoop addresses high-speed processing and reliable storage of Big Data. Furthermore, you will get an overview of its ecosystem, get acquainted with its main features, and figure out its possibilities.


In the upcoming submodule, you will take a deep dive into Hive features and prepare yourself for work on real-time Big Data and Hive projects. The following lessons will introduce you to User Defined Functions and explain why they are considered a powerful feature that allows users to extend the Hive Query Language. You will also learn about using transactions with ACID semantics and how exactly they work in Hive. What is more, you will get acquainted with Hive statistics and understand why generating them is crucial. Finally, you will take a detailed look at Hive optimization techniques in order to understand how Hive's performance can be increased.


The upcoming lessons will familiarize you with the basics of the open source distributed processing system for Big Data workloads. In addition to getting acquainted with the key components, architecture, and various applications of Spark, you will discover the wealth of operations Spark offers, learn about extract, transform, load (ETL), and the three sets of APIs available in Spark. The following lessons will extend your knowledge about the Catalyst optimizer and introduce you to Project Tungsten, with the goal of building an understanding of how to improve the efficiency of Spark applications. You will also become familiar with Spark Streaming and how to use it for real-time analysis. 


You will learn about Apache Kafka and discover why Kafka can no longer be considered only a messaging system, as it was when it was first handed over to the community. You will also get an overview of Kafka's main capabilities, advantages, and drawbacks. What is more, you will get acquainted with the Kafka Connect framework and the Kafka Streams library and figure out which roles they play within the Kafka architecture. In addition, you will learn how to optimize Kafka in order to achieve the service goals set for your project, and become familiar with the monitoring process and the key metrics for analyzing Kafka performance.


You have the option to attend career services webinars that help you create a resume and learn job search techniques. Our team will connect you with resources to successfully land your first job in your new career. Take advantage of 1:1 career advisory sessions to ask any questions and gain support!

Data Movement

You will become familiar with Flow-Based Programming and its main concepts, and will be introduced to the terms "dataflow" and "data pipeline". Then, you will explore the Apache NiFi framework with a special focus on its main features and Web UI. In addition, you will get acquainted with the StreamSets Data Collector engine and figure out how it can be used for effective Data Movement.


According to analysts, up to 60% of Big Data projects fail because they cannot scale at the enterprise level. Fortunately, taking a step-by-step approach to workflow orchestration can help you succeed. Big Data workflows can be managed with workflow tools, and in the upcoming submodule you will become familiar with these tools.


You will discover NoSQL databases, focusing on their advantages, disadvantages, and peculiarities. You will also study different types of NoSQL databases: document, key-value, column, and graph. Then, you will examine the most popular NoSQL databases: MongoDB, HBase, and Cassandra. You will find out how these databases work, how to use them, and how their data models and architecture are organized. Finally, you will get tips and advice on which NoSQL database is the most suitable for different types of projects.


In the upcoming submodule, you will continue exploring engines that are commonly used in real-life Big Data projects. You will be introduced to Elasticsearch, which is specifically designed to solve a common but non-trivial problem in software development: search that works without surprises.


Big data and cloud computing are two distinct notions, but lately, they have become closely intertwined and almost inseparable. When it comes to Big Data, at some point, you will run into the need for ways to extract data on a much larger scale and better methods to process and analyze this data. Merging Big Data with cloud computing is a powerful combination that can completely transform your organization. 

Training · Online · Beginner
*Price includes VAT assessed at 1%