Statisticians vs. Data Scientists

The term “data scientist” is relatively new, but the reality is that people have been doing data science tasks for many years. In the absence of a data scientist, engineers and analysts would figure out how their specific databases worked, often by consulting an IT department, and learn basic tools to extract that data, often by adapting legacy knowledge at the company. In fact, when I worked as an engineer, about half of my job was what today could be called “data science.” I spent much of my time extracting data from databases, analyzing the data, and then presenting the data or using that data to make decisions.

A data scientist in today’s world should be equipped to work with a variety of databases and data sources. They should be adept at cleaning data and getting it into a usable form no matter the source. Data scientists should also clearly understand their customers’ requirements, what data is available, and how best to access and use it. They should have a working knowledge of statistics to perform standard analyses, which will vary by industry and work function. It is also imperative that they be skilled at presenting data and communicating clearly with their stakeholders.

A statistician, by contrast, would likely have some working knowledge of the specific databases used by their company or their stakeholders. They would likely have tools to run common queries, but may struggle to extract data that is not already cleaned. The statistician would have many more tools for analyzing nonstandard data and would deal with complex statistical issues much more adeptly than a typical data scientist. I would expect most data scientists to consult statisticians on statistically complex issues. Statisticians, like data scientists, should also be skilled at presenting data and clearly communicating how best to interpret it to make business decisions.

On a personal level, I do not yet feel like I can claim to be a statistician or a data scientist. I have been working on my programming skills and feel like I’m getting better at extracting data from various sources, but I still have major programming gaps compared with a data scientist. I also have a long way to go before I can call myself a true statistician. I do think that, with my previous industry experience, I have a strong skill set when it comes to analyzing and presenting data and knowing how to use that data to make improvements within an organization. I hope to close some of the gaps between myself and both statisticians and data scientists through this program at NC State.

