THE 5 MAIN DATA PROFILES IN 2021

For newcomers, or for companies just joining the Data Science train, it can be hard to understand which profiles make up a complete data team and which of them are available and sought after in the market.

Data Engineer

Data engineers play a key role in any data-driven organisation, although they are often overlooked, particularly in smaller companies or those just starting out. Their role is to manage all the data collection steps and to make sure the data ends up neatly organised in a database, data warehouse or data lake.

A robust and well-designed data collection pipeline is the key to efficient and reliable downstream data analyses. For example, if you are losing some data in the process, you won’t be able to trust any of the analyses performed on this data, and if the system is very inefficient, analyses might take two or three times longer.
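
One simple way to detect the data loss mentioned above is to reconcile what the source emitted with what actually landed in the destination. The snippet below is only a minimal sketch: the event IDs and the `find_missing_records` helper are hypothetical, and a real pipeline would read both sides from its own systems.

```python
from collections import Counter


def find_missing_records(source_ids, loaded_ids):
    """Return the IDs present in the source but absent from the destination."""
    # Counter subtraction keeps only positive counts, so duplicates that
    # were sent twice but loaded once are also reported.
    missing = Counter(source_ids) - Counter(loaded_ids)
    return sorted(missing.elements())


# Hypothetical event IDs emitted by an application (source) and the IDs
# that actually arrived in the warehouse table (destination).
source_ids = ["e1", "e2", "e3", "e4", "e5"]
loaded_ids = ["e1", "e2", "e4"]

missing = find_missing_records(source_ids, loaded_ids)
loss_rate = len(missing) / len(source_ids)

print(f"missing records: {missing}")
print(f"loss rate: {loss_rate:.0%}")
```

A check like this is usually run on a schedule, with an alert when the loss rate rises above an agreed threshold.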

Data engineering skills become even more important as the volume of data a company handles grows. Although someone with a different profile can manage a system with very small volumes, you need a good data engineer to deal with higher volumes coming from multiple sources.

Data engineers usually have a stronger Computer Science background and are proficient with Unix systems, languages like Python, SQL or Scala, and frameworks like Spark.

Business Intelligence Developer

Business Intelligence Developers (also referred to as BI Developers) are responsible for developing automated reporting solutions, usually in the form of dashboards. They rely on the work of data engineers but will often need to develop additional data models or data cubes to make their dashboards faster and allow further exploration by end users.
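
To illustrate what such an additional data model can look like, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns and the `monthly_revenue` summary table are made up for the example; the point is only that a dashboard can read a small pre-aggregated table instead of scanning every raw row.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2021-01-05", "EU", 120.0),
        ("2021-01-17", "EU", 80.0),
        ("2021-01-20", "US", 200.0),
        ("2021-02-02", "EU", 50.0),
    ],
)

# Pre-aggregate raw orders by month and region so that dashboard queries
# hit a few summary rows rather than the full order history.
conn.execute(
    """
    CREATE TABLE monthly_revenue AS
    SELECT substr(order_date, 1, 7) AS month,
           region,
           SUM(amount) AS revenue,
           COUNT(*) AS n_orders
    FROM orders
    GROUP BY substr(order_date, 1, 7), region
    """
)

rows = conn.execute(
    "SELECT month, region, revenue, n_orders "
    "FROM monthly_revenue ORDER BY month, region"
).fetchall()
for row in rows:
    print(row)
```

In practice the summary table would be refreshed by a scheduled ETL job, and the BI tool would be pointed at it rather than at the raw table.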

Like data engineers, they require a good computer science background, in particular a strong level of SQL and a good understanding of data modelling and the design of ETL processes. However, they also have close ties with the business side of the company, as their solutions (e.g. dashboards) will be used by different departments, from finance to operations, product or business managers. For this reason, they also need good interpersonal skills, strict attention to detail and a good eye for visualisations.

In addition to SQL or other data-oriented programming languages, BI developers usually use tools such as Tableau, Power BI, Looker or other Business Intelligence platforms.

Data Analyst

Data analysts are perhaps the new data scientists, and can also be referred to as business analysts or product analysts. The expectations for this role can be quite broad, but they generally involve the statistical analysis of data to answer a specific, often one-off, business question. Less automation is generally expected than for BI developers, and the focus is more on business-driven requests such as analysing the impact of an ad campaign or investigating why a given metric has suddenly decreased.

For these reasons, the required computer science background is generally lighter than for other roles, although data analysts still need a good level of SQL and often one language like R or Python. That said, it is not uncommon to see data analysts using mostly Excel or BI tools like Tableau. They also need a decent knowledge of statistics, as they will be expected to use statistical models (e.g. linear regressions) to analyse their data when relevant.
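
As an illustration of the kind of statistical model mentioned above, the following sketch fits a simple linear regression by ordinary least squares using only plain Python. The ad spend and sales figures are invented for the example, and `fit_line` is a hypothetical helper; in practice an analyst would more likely rely on R or a Python statistics library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (the 1/n factors cancel in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up weekly figures: ad spend and sales, both in thousands.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 15.0, 17.0, 20.0, 21.0]

slope, intercept = fit_line(ad_spend, sales)
print(f"estimated extra sales per unit of ad spend: {slope:.2f}")
print(f"estimated baseline sales without ads: {intercept:.2f}")
```

The slope is the quantity a business stakeholder usually cares about: the estimated change in sales for each additional unit of ad spend, under the assumption that the relationship is roughly linear.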

In addition, they need strong business acumen, as they work very closely with business or product managers and must develop a solid understanding of the company's activity and the market it operates in to deliver relevant and actionable insights.

The data analyst role is still quite broad, as the data scientist role used to be as well, and it is not uncommon for data analysts to be expected to develop some machine learning models, set up some ETLs or build dashboards.

Data Scientist

The data scientist can be seen as a data analyst on steroids, with deeper technical skills and broader knowledge. Like data analysts, they will need to carry out analytical projects to understand how to improve their business, but they can also be expected to design and analyse experiments, develop machine learning models or build full-fledged data products. They are not expected to know everything (they simply can’t) but to have a strong enough background to quickly learn new concepts and apply them to a business case. As with data analysts, a strong understanding of the business is critical to thriving in the role, as they will have to interact with multiple stakeholders from departments as diverse as product, finance or leadership.
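
To give a flavour of what analysing an experiment can involve, here is a minimal sketch of a two-sided two-proportion z-test for an A/B test, written with the standard library only. The conversion counts are invented, `two_proportion_z_test` is a hypothetical helper, and the test itself is just one common choice among several.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up results: 200 of 2000 users converted on the control page,
# 260 of 2000 on the new variant.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone, but deciding whether the uplift is worth shipping remains a business question.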

To achieve this, the skill set of a data scientist is usually very broad. They have a deeper Computer Science background (e.g. algorithm complexity, data structures), advanced knowledge of and experience with statistics and machine learning, and often a good understanding of other fields like Artificial Intelligence or Natural Language Processing, to name a few. SQL and Python and/or R are must-have languages for data scientists.

These profiles are very broad, and data scientist roles are harder for applicants to grasp because many companies use the title with very different meanings. In some cases, the job duties are closer to those of a data analyst, while in others, often in start-ups, data scientists are recruited when the skill set needed is actually that of a data engineer. This can generate some frustration when the selected candidate discovers after a few weeks that they have been recruited for something totally different from what they imagined.

AI/ML Engineer

Artificial Intelligence (AI) or Machine Learning (ML) Engineer roles are among the most recent additions to the data jobs family. As the name suggests, they are closer to engineering than to business roles. Their duty is to develop AI or ML models, sometimes in close collaboration with data scientists, and to build systems allowing these models to be used in web or mobile applications serving millions of users. In addition to the model-building part, they also need strong software engineering skills, such as building APIs and microservices or creating and managing distributed applications.

As mentioned above, the skills needed overlap heavily with those of software engineers, e.g. back-end developers or DevOps engineers. Python is a common language mastered by AI/ML engineers, although they might also use JavaScript, Go, Java or any other back-end friendly programming language. They often work with tools and technologies such as Kubernetes, Docker, AWS and APIs. However, unlike data scientists or analysts, they do not work closely with business stakeholders, as their prime focus is on delivering data products.
