There’s a Data Engineer Shortage: 4 Reasons Why
Since 2020, investments in data – and data teams – have gone up. Way up. Unsurprisingly, over the past two years, digital transformation strategies accelerated and the digital economy grew at a rate never recorded before in modern history. In fact, according to Dataspace, hiring in data-focused fields is up 63% versus this time last year. And the hiring is occurring across all industries, from Financial Services to Data Consulting, Automotive, Healthcare, and more.
While many of these organizations are looking for data scientists, the role HBR called the “sexiest job of the 21st century,” there’s a new data-focused role emerging which may usurp the data scientist for the “sexiest job” title: the data engineer.
What is a data engineer?
According to Coursera, data engineers “work in a variety of settings to build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret.” Their goal is to make data accessible across their organizations so decision makers can use it to drive business decisions and boost performance.
Now that many organizations have their data scientists in place, they are realizing that they need data engineers, too. Organizations hire data engineers when they are looking to embrace DataOps as a discipline across their organization. By connecting data engineering, data integration, data quality, and data security/privacy, DataOps helps organizations deliver data that not only accelerates analytics but also enables analytics that were previously deemed impossible. And it helps data teams streamline the process of deploying code, without the worry of breaking what’s already in production.
As a result of the data explosion caused by the unprecedented speed of digital transformation and adoption of cloud technology, organizations are realizing that DataOps and data engineers are critical to the success of their data programs.
But as more and more companies look to embrace DataOps, they are finding it difficult to find the data engineering talent to support it.
Data engineers in demand
According to DICE, data engineering is the fastest-growing tech job, having grown by 50% in 2019. This aggressive growth in demand for data engineers is fueled by the ever-increasing needs sparked by big data.
To put this demand into perspective, an analysis of LinkedIn data engineer job listings in February 2020 showed that there were, on average, 2.5 candidates per data engineer job listing. Compare this to 4.76 candidates per data scientist job listing, 10.8 candidates per web developer job listing, and 53.79 candidates per marketing manager job listing, and you can see the challenge.
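To make the gap concrete, the short sketch below expresses each role's candidate pool as a multiple of the data engineer figure. The numbers are the ones quoted above; the function and variable names are illustrative only.

```python
# Candidates per job listing, as quoted above (LinkedIn analysis, February 2020).
candidates_per_listing = {
    "data engineer": 2.5,
    "data scientist": 4.76,
    "web developer": 10.8,
    "marketing manager": 53.79,
}


def relative_to_data_engineer(ratios):
    """Express each role's candidates-per-listing as a multiple of the data engineer figure."""
    baseline = ratios["data engineer"]
    return {role: round(value / baseline, 1) for role, value in ratios.items()}


multiples = relative_to_data_engineer(candidates_per_listing)
for role, multiple in multiples.items():
    print(f"{role}: {multiple}x the data engineer candidate pool per listing")
```

By this measure, a marketing manager listing draws roughly 21.5 times as many candidates as a data engineer listing.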
And while it’s clear that data engineers are in short supply, the real question is why?
There are many reasons why companies struggle to find data engineers to fill their open roles, but we see these as the top four.
4 reasons why it’s hard to find a data engineer
There are four primary factors driving the shortage of data engineers: education, skills, salary, and burnout.
- Education
Do a quick search for undergraduate data engineering programs, and you’ll struggle to find more than just a few. As was the case with data science just a few years ago, few colleges and universities offer undergraduate programs in data engineering.
Today, many institutions offer data science programs at the undergraduate and graduate levels, giving students a way to learn the skills needed to thrive in these roles. But because comparable programs are scarce for data engineers, many of them are self-taught. They learn their skills from a mash-up of online courses offered by providers such as edX, Coursera, and Udemy. That is a challenge when trying to get hired, since their “education” doesn’t come in the form of a formal degree or certificate.
Until colleges and universities begin to deliver programs that support data engineering in a meaningful way, companies should consider the skill sets a candidate possesses, not the credential or degree on their resume. Which brings us to our next point.
- Skills
Data engineers possess a unique blend of skills that differs from that of data scientists. They need to know programming languages, understand how databases work, and know how to design data pipelines.
They need to possess knowledge of the following, as well:
- Programming languages, like Python and SQL
- SQL and NoSQL databases
- ETL/ELT technologies such as dbt, Matillion, and Fivetran
- Streaming platforms, such as Apache Kafka
- Infrastructure, including cloud infrastructure
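As a rough illustration of how the first two items combine in practice, here is a minimal extract-transform-load sketch that uses only Python and SQL (through the standard-library `sqlite3` module). The sample records, table name, and function names are hypothetical, and a production pipeline would typically rely on dedicated tooling such as the ETL/ELT products listed above.

```python
import sqlite3

# Hypothetical raw records, standing in for data pulled from an upstream source.
RAW_ORDERS = [
    {"order_id": "1", "customer": " Alice ", "amount": "19.99"},
    {"order_id": "2", "customer": "bob", "amount": "5.00"},
    {"order_id": "3", "customer": "", "amount": "not-a-number"},  # malformed row
]


def extract():
    """Extract: return the raw records (a real pipeline would read a file, API, or queue)."""
    return list(RAW_ORDERS)


def transform(rows):
    """Transform: clean and type each row, dropping rows that fail validation."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        if not customer:
            continue  # skip rows with no customer name
        cleaned.append((int(row["order_id"]), customer, amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run all three stages and return (row count, total amount) as a sanity check."""
    load(conn, transform(extract()))
    return conn.execute(
        "SELECT COUNT(*), ROUND(SUM(amount), 2) FROM orders"
    ).fetchone()


conn = sqlite3.connect(":memory:")
row_count, total_amount = run_pipeline(conn)
print(row_count, total_amount)
```

Keeping extract, transform, and load as separate functions is one common design choice: each stage can be tested on its own, and a bad row is rejected before it reaches the database.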
Unlike data scientists, who often “reskill” from a wide range of other roles, data engineers require a more specialized skill set that can be harder to obtain given the limited number of programs and certifications available. Many data engineers reskill from traditional ETL developer roles, and that pivot is a logical one to make. But keep in mind that when hiring a data engineer, skills are more important than credentials.
- Salary
Data engineering roles are highly paid. In fact, in some cases, data engineers earn more than data scientists. But many businesses looking for data engineers offer salaries well below market value.
As the shortage of data engineers continues, companies should only expect these salaries to rise – and for candidates to demand more from the companies who wish to hire them. If companies want to hire top data engineering talent, they’ll need to pay. And pay well.
- Burnout
Because data engineers are in such high demand, there is another factor that comes into play: burnout. At many organizations, data engineers are the data firefighters. When a data problem arises, the business calls the data engineers. Sometimes the business even blames them for the problem. And it expects the data engineers to work around the clock until the data problem is resolved.
This constant demand to react quickly to every data problem that comes up is exhausting, and it is causing many data engineers to burn out.
In fact, according to DataKitchen, 97% of data engineers are experiencing burnout, 70% say they will likely look for a new job in the next 12 months, and four out of five are considering leaving their career entirely. At a time when demand far outweighs supply, this is not good news.
To retain scarce data engineering talent, organizations must commit to DataOps. It is essential to helping data engineers better manage the data pipelines and processes that feed information to the business. It helps them embrace continuous improvement, so they can regularly scan the data processing environment for constraints and bottlenecks. And it helps relieve the pressure by bringing rigor to the development and management of data pipelines, enabling CI/CD across the data ecosystem.
Clearly, the demand for data engineers is growing. And as DataOps gains traction, that demand will grow even more.
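One way that rigor can show up in practice is an automated data check that runs in a CI job before a pipeline change is deployed, so problems surface before they become an emergency. The sketch below is a minimal, hypothetical example: the field names, sample batches, and `check_batch` function are assumptions for illustration, not part of any particular DataOps product.

```python
def check_batch(rows, required_fields=("id", "email")):
    """Return a list of human-readable problems found in a batch of records."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        # Flag any required field that is absent or empty.
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {index}: missing {field}")
        # Flag ids that have already appeared earlier in the batch.
        row_id = row.get("id")
        if row_id is not None:
            if row_id in seen_ids:
                problems.append(f"row {index}: duplicate id {row_id}")
            seen_ids.add(row_id)
    return problems


# Hypothetical sample batches: one clean, one with an empty email and a repeated id.
good_batch = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
bad_batch = [
    {"id": 1, "email": "a@example.com"},
    {"id": 1, "email": ""},
]

print(check_batch(good_batch))
print(check_batch(bad_batch))
```

A CI job could fail the build whenever `check_batch` returns a non-empty list, which turns a late-night data incident into a blocked deployment.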
If you’re struggling to find data engineering talent, take a look at your operations. If you haven’t embraced DataOps, we strongly suggest that you do so. You can quickly become an expert by reading our guide. When making an offer to a data engineer, make sure it aligns with market expectations. And prioritize skills over a degree.
Don’t let a data engineer shortage – or any other pitfall – derail your efforts to become data-driven. By taking these factors into consideration, you can help your organization better compete for top data engineering talent.