Myths about Big Data, or Welcome to the Premier League
10 Jun 2020

This currently trending field emerged… from the need to solve a problem. In 2003, so-called business card sites became exceedingly popular, and their numbers grew so quickly that Google’s search engine faced an overwhelming load. This, in particular, gave impetus to the creation of distributed computing technology.

Today, Big Data solutions are what you turn to when standard data processing is not enough and you need to get results faster. Most often, these solutions are used to study consumer behavior, analyze consumers’ needs and motivations, optimize prices, create personalized offers and more. It is Big Data that enables Starbucks to send customers coupons tailored to each person’s taste, McDonald’s to design restaurants around the specifics of local markets, and Heineken to generate personalized advertising messages.

But Big Data does not live by marketing alone. Properly organized and processed data makes it possible to calculate potential risks to merchandise, as Amazon does, map new flight routes, determine the effectiveness of medical treatment, and even predict hurricanes and other natural disasters a few days before they become a threat.

So, now that we have established the benefits of Big Data for the modern world, how about mastering the profession of Big Data Engineer, which many aspiring IT specialists are interested in?


 

The path into the world of Big Data is surrounded by a large number of myths. With the help of Marian Fediv, Senior Big Data Engineer from the EPAM Lviv office, we are going to debunk or confirm them. Marian has been working in Big Data since 2016 and is actively involved in training specialists in this area.

Myth №1. It is extremely difficult to take the first steps in Big Data, and the requirements for beginners are “stratospheric”

Yes, that is true, if only because it is impossible to enter the specialization without knowing three programming languages: Java, Scala and Python. So as not to be too intimidating from the start, let's clarify: you must know one of the languages thoroughly and know the other two well enough to understand the syntax and read code.

Why is this important? Imagine a Big Data project as a set of mutually dependent blocks, each based on its own technology. Some of them may be written in Python and others in Java, and in order to add any functionality, you need to know those languages.

You can add knowledge of SQL and of at least one of the three most powerful cloud providers (AWS, Google Cloud or Azure) to the list of requirements straight away.

I won't even mention that your level of English should be no lower than B1+, because that is a basic requirement for every IT specialist, regardless of specialization.

I started with Java and later retrained as a Big Data Engineer. From my own experience, I can say that after a certain stage it becomes easier, because many technologies and tools are similar. Still, there is a good reason why the head of the Big Data practice at the EPAM Lviv office greets new people on the team with the words “Welcome to the Premier League”.

Myth №2. A Big Data Engineer needs in-depth knowledge of mathematics

In fact, this notion stems from confusing the roles of Big Data Engineer and Data Scientist. Data Scientists work with much smaller amounts of data, and they need a thorough knowledge of mathematics to build mathematical models on the data collected and organized by Big Data Engineers. These specializations are completely different, both in terms of tasks and technology stack, so in real life the two roles are almost never performed by one person.

Big Data is not so much about the amount of data as the value it brings to business and society when properly collected, systematized, processed and analyzed. It is worth noting that worldwide the term Big Data is gradually becoming obsolete because the same amount of data may be large for one company, but insignificant for another. Instead, the terms Data project, Data engineer, etc. are gaining popularity. 

Myth №3. Big Data engineers provide ready-made solutions for business owners based on data analysis 

In reality, 70% of a Big Data Engineer's job is gathering customer data and combining scattered bits of information into a single system. Imagine, for example, the merger of two retail chains, each of which has its own system of physical and online stores. The task of a Big Data Engineer is to integrate all existing systems, collect all available data and bring it to a common denominator. They identify all accounts and pull in the purchase history for each one, so that this data can later be used to build an advertising campaign for the products most likely to interest a particular customer.
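
To make the retail merger example a little more concrete, here is a minimal Python sketch of bringing two customer lists "to a common denominator". Everything in it is an assumption made for illustration: the field names (`email`, `name`, `purchases`), the sample records, and the idea of matching accounts by a normalized email address. A real project would work at a much larger scale, typically with distributed tools, and would need far more careful account matching.

```python
from collections import defaultdict


def normalize_email(email):
    """Reduce an email to a canonical form so the same account matches across systems."""
    return email.strip().lower()


def unify_customers(chain_a, chain_b):
    """Merge customer records from two chains, keyed by normalized email.

    Each record is a dict with hypothetical keys: 'email', 'name', 'purchases'.
    Returns a dict: email -> {'names': set of names, 'purchases': list of (source, item)}.
    """
    unified = defaultdict(lambda: {"names": set(), "purchases": []})
    for source, records in (("chain_a", chain_a), ("chain_b", chain_b)):
        for record in records:
            key = normalize_email(record["email"])
            unified[key]["names"].add(record["name"])
            # Keep track of which system each purchase came from.
            for item in record["purchases"]:
                unified[key]["purchases"].append((source, item))
    return dict(unified)


# Made-up sample data: the same customer appears in both chains
# with differently formatted emails and names.
chain_a = [
    {"email": "Ann@Example.com ", "name": "Ann", "purchases": ["coffee"]},
    {"email": "bob@example.com", "name": "Bob", "purchases": ["tea"]},
]
chain_b = [
    {"email": "ann@example.com", "name": "Ann K.", "purchases": ["mug", "coffee"]},
]

unified = unify_customers(chain_a, chain_b)
for email in sorted(unified):
    print(email, unified[email]["purchases"])
```

In this sketch, the two records for Ann collapse into one account with a combined purchase history, which is the kind of unified view a later advertising campaign could be built on.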

So, if you want to learn more about Big Data, you might want to watch this short but insightful VIDEO, which will give you the gist of it. For those who have already decided on the profession, I recommend reading the book “Hadoop: The Definitive Guide”.