What is data science?
Data science is a field that focuses on extracting knowledge from data sets, which are typically very large. For example, consider the comments and reviews left on products listed on Amazon or any other e-commerce website. From this large volume of data, we try to extract knowledge about customers' buying behavior: what they like and what they prefer most. This knowledge is then used to improve the products and give customers what they want. This whole process is known as data science.
Who is a data scientist?
A data scientist is a person who uses data to understand and explain the events happening around them, or to help organizations make better decisions.
The main job of a data scientist is to understand the various ways to gather data that is meaningful.
So, if you like playing with numbers, love statistics, and can find patterns in everything, this is the right job for you.
Where to start to become a data scientist?
1: Learn coding
Programming is used at almost every step of data science, so start by learning a programming language.
If you do not know programming at all, start with C++. It may sound odd, but it will help you build basic logic. Try to write simple programs on your own.
Once you know a little programming, learn Python, because it is a language widely used in data science. It is simple to use and, more importantly, easy to learn. Learn the important functions and libraries that are used for data visualization in Python.
This was your first step in your journey.
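To give a feel for what simple Python code looks like, here is a minimal sketch that counts how often each word appears in a few product reviews. The review texts and the function name `count_words` are made up for illustration; real review data would need more careful cleaning.

```python
from collections import Counter


def count_words(reviews):
    """Count how often each lowercase word appears across all reviews."""
    counts = Counter()
    for review in reviews:
        # Lowercase the text, split on whitespace, and strip basic punctuation.
        words = [word.strip(".,!?") for word in review.lower().split()]
        counts.update(word for word in words if word)
    return counts


# Hypothetical sample reviews, invented for this example.
reviews = [
    "Great battery, great price!",
    "Battery died after a week.",
    "Good price, fast delivery.",
]

word_counts = count_words(reviews)
print(word_counts.most_common(3))
```

Even a tiny script like this hints at what customers talk about most, which is the kind of knowledge data science tries to extract at a much larger scale.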
2: Learn mathematics and statistics
Let’s start from the basics. First understand the basic concepts of statistics, such as the mean, median, and mode, and then slowly raise your level. Learn to plot simple graphs, and study the normal distribution, optimization, and gradient descent, along with some basics of differentiation and integration. Try to understand what these concepts actually mean; don’t just memorize them or learn to solve problems mechanically.
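The ideas above are easier to understand once you try them in code. The sketch below uses Python's built-in `statistics` module for the mean, median, and mode, and then runs a very small gradient descent loop. The order data, the function name `gradient_descent`, and the example function f(x) = (x - 3)^2 are assumptions chosen only for illustration.

```python
import statistics

# Hypothetical data: number of items bought per order.
orders = [1, 2, 2, 3, 4, 2, 10]

mean_value = statistics.mean(orders)      # arithmetic average
median_value = statistics.median(orders)  # middle value of the sorted data
mode_value = statistics.mode(orders)      # most frequent value

print(mean_value, median_value, mode_value)


def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * gradient(x)
    return x


# Minimize f(x) = (x - 3)^2, whose derivative is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 4))
```

Notice that the single large order of 10 pulls the mean above the median. Seeing that for yourself is more useful than memorizing the formulas. The gradient descent loop shows why differentiation matters: the derivative tells the loop which direction to move in.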
3: Learn data visualization
Once you can understand data and write code, you need to learn data visualization, because you cannot just show your code when someone asks about the data. You need to give them a visual presentation of it.
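Some commonly used data visualization tools are Microsoft Excel and Tableau. Hadoop is also worth knowing, although it is a framework for storing and processing large data sets rather than a visualization tool.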
4: Machine learning and deep learning
Once you are comfortable working with data in Python, you need to learn machine learning concepts. You can start with freely available sources such as YouTube and Google.
You can learn Python libraries such as scikit-learn and TensorFlow. Learn to build a neural network.
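Before reaching for a library, it can help to see the smallest possible building block of a neural network. The sketch below trains a single artificial neuron with gradient descent to learn the logical OR function, using only plain Python. The names `train_neuron` and `predict`, the learning rate, and the number of epochs are illustrative assumptions; in practice a library such as scikit-learn or TensorFlow would do this work for you.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, learning_rate=0.5, epochs=2000):
    """Train one neuron with two inputs using gradient descent on log loss."""
    w1, w2, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            prediction = sigmoid(w1 * x1 + w2 * x2 + bias)
            # For log loss, the gradient with respect to the weighted sum
            # is simply (prediction - target).
            error = prediction - target
            w1 -= learning_rate * error * x1
            w2 -= learning_rate * error * x2
            bias -= learning_rate * error
    return w1, w2, bias


def predict(weights, x1, x2):
    """Return 1 if the neuron's output is at least 0.5, otherwise 0."""
    w1, w2, bias = weights
    return 1 if sigmoid(w1 * x1 + w2 * x2 + bias) >= 0.5 else 0


# Truth table for logical OR: ((input1, input2), expected output).
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights = train_neuron(or_samples)
predictions = [predict(weights, x1, x2) for (x1, x2), _ in or_samples]
print(predictions)
```

A real neural network stacks many such neurons in layers, but the training idea is the same: make a prediction, measure the error, and nudge the weights to reduce it.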
5: Learn Linux
You need to learn Linux because of its speed and because it is in high demand for data scientist roles and other software roles. Many data science companies use Linux because of the advantages it offers for analyzing data. Many data scientists develop and deploy their code on Linux.
Other important tips
Build good communication skills, including spoken English.
Update yourself regularly and keep learning.
Use freely available resources; don’t underestimate them.
Keep an eye on the latest data science technologies.
Try to stay ahead of the competition.
Domain knowledge is also important.