The Big Data era has created new kinds of professions, all defined by how they work with data. Notable examples are Data Scientist, Machine Learning Engineer and IoT Engineer. In this article, we focus on the Data Scientist and answer the following questions: What is a Data Scientist? What does a Data Scientist do? How can you become a Data Scientist? Where does a Data Scientist work?
What is a Data Scientist?
A Data Scientist is an expert in analyzing data using techniques from statistics, linear algebra and machine learning. By analyzing data they can solve complex problems for all kinds of businesses, which makes the Data Scientist a sought-after professional in many industries.
Using the right techniques, they extract knowledge from the data. This knowledge helps businesses to, for example, increase revenue or optimize production.
Where does a Data Scientist work?
A Data Scientist can work in almost any field. Their goal is to solve the everyday problems businesses face. For example, a Data Scientist might estimate when it is a good time to invest in the stock market, or predict your customers' tastes and preferences in order to show them the product that fits them best.
A Data Scientist has many different skills, which is why they can work in all kinds of fields. They need:
- a solid mathematical and statistical background
- the ability to code well in several programming languages
- the ability to access databases and work with large amounts of data
- an understanding of how a typical company works, in order to offer it appropriate solutions
Mathematics
A Data Scientist must master linear algebra, calculus and statistics in order to understand the main kinds of algorithms that Machine Learning uses.
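To make this concrete, here is a minimal sketch of how statistics sits underneath one of the simplest Machine Learning algorithms, linear regression. It fits a straight line by ordinary least squares using only means, a covariance and a variance. The function name `fit_line` and the sample data are assumptions made up for this illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # because the 1/n factors cancel in the ratio below).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Libraries normally do this for you, but knowing the formula helps you judge when a linear model is a reasonable choice.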
Programming & Databases
They must know how to code in Python, R and SQL. The first two are well suited to cleaning data and applying machine learning techniques to it. SQL is a query language for managing data within databases, so anyone who wants to extract data from a database must know it. Combining Python and SQL makes it easy to access databases. However, with very large amounts of data these tools are not always the best option. In that case they can use big data tools such as Spark, Hive or Hadoop, which make such data easier to manage.
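As a small illustration of the Python + SQL combination, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `sales` table, its columns and its rows are hypothetical and exist only for this example. A real project would connect to the company's own database instead.

```python
import sqlite3

# Hypothetical in-memory database and table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("ana", "book", 12.5),
        ("ana", "pen", 2.0),
        ("ben", "book", 12.5),
        ("ben", "lamp", 30.0),
    ],
)


def revenue_by_product(connection):
    """Return (product, total revenue) pairs, highest revenue first."""
    query = """
        SELECT product, SUM(amount) AS revenue
        FROM sales
        GROUP BY product
        ORDER BY revenue DESC
    """
    return connection.execute(query).fetchall()


rows = revenue_by_product(conn)
for product, revenue in rows:
    print(product, revenue)
```

The aggregation happens inside the database, and Python only receives the summarized rows. This division of work is the usual reason for pairing the two.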
Company Structure
Knowing how a company works is a skill every Data Scientist must have. You can be the best data cleaner ever, but if you do not know how to apply your skills to real-world problems you will never be a good Data Scientist.
Some other skills are, in some cases, harder to learn. Communication is one of them. A Data Scientist must be a good communicator in order to share ideas and different points of view with the team. In fact, this may be more important than any of the skills mentioned above. If you cannot share or explain your ideas you will have to work on your own, and that will cause you problems in the long term.
To conclude the answer to this question, here are some kinds of organizations that have Data Scientists on their teams: IT companies, banks, supermarkets, airlines and labs, among others.
How can I become a Data Scientist?
Nowadays, many companies that want to hire a Data Scientist ask for an MSc or PhD in Data Science, Machine Learning, Computer Science, Physics, Statistics or Mathematics. However, in recent years companies have given more weight to self-taught skills relative to university degrees. Does this mean that you have to quit your studies? NO! But my advice is to study something beyond what you are taught at school or college.
For example, I studied Physics at college while teaching myself Machine Learning, AI and Data Science. This combination of studies gives you a wide variety of skills, and those skills will help you become a great Data Scientist.
There is a saying: ‘In order to succeed you have to be like a Swiss Army knife. You have to be prepared to confront any kind of setback’.
You will need the discipline to study on your own but, above all, you have to be prepared to keep studying for your entire life if you decide to become a Data Scientist, because these fields are continuously changing.
What kinds of jobs does a Data Scientist do?
We will cover four different kinds of jobs a Data Scientist can do in a company.
There are some companies where being a Data Scientist is synonymous with being a data analyst. Your job might consist of tasks like pulling data out of SQL databases, becoming an Excel or Tableau master, and producing basic data visualizations and reporting dashboards. You may on occasion analyze the results of an A/B test or take the lead on your company’s Google Analytics account. These kinds of tasks will help you learn the ropes and expand your skills.
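As an example of what analyzing an A/B test can involve, here is a minimal sketch of a two-proportion z-test, one common way (among several) to check whether two conversion rates really differ. The function name and the visitor and conversion counts are hypothetical, and the test assumes samples large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: variant A converts 200 of 5000 visitors,
# variant B converts 250 of 5000.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone. Whether that difference matters to the business is a separate judgment.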
Some companies get to the point where they have a lot of traffic (and an increasingly large amount of data), and they start looking for someone to set up a lot of the data infrastructure that the company will need moving forward. They’re also looking for someone to provide analysis. You’ll see job postings listed under both “Data Scientist” and “Data Engineer” for this type of position. Since you’d be (one of) the first data hires, heavy statistics and machine learning expertise is less important than strong software engineering skills. As a result, you’ll have great opportunities to shine and grow via trial by fire, but there will be less guidance and you may face a greater risk of flopping or stagnating.
There are a number of companies for whom their data (or their data analysis platform) is their product. In this case, the data analysis or machine learning going on can be pretty intense. This is probably the ideal situation for someone who has a formal Mathematics, Statistics, or Physics background and is hoping to continue down a more academic path. Companies that fall into this group could be consumer-facing companies with massive amounts of data or companies that are offering a data-based service.
A lot of companies are looking for a generalist to join an established team of other data scientists. The company you’re interviewing with cares about data but probably isn’t a data company. It’s equally important that you can perform analysis, touch production code, visualize data, etc. Generally, these companies are either looking for generalists or they’re looking to fill a specific niche where they feel their team is lacking, such as data visualization or machine learning.
To sum up, which of these four kinds of jobs you do depends on the kind of company that hires you. You can tailor your resume to be hired for any one of them. Nonetheless, I recommend building a resume that covers all of them.
The following image shows the skills you need to succeed in Data Science and related professions:
Image 1. source
Image 2. Comparison between Data Science, Data Analytics and Big Data
Conclusions
Data Science is neither boring nor repetitive. You will face many different problems and situations in which you have to learn while you solve the task, so you need to be creative and proactive.
Remember that anyone can learn to manage large amounts of data, but not everyone can extract useful information from it to help a company improve and grow.