Starting off a Data Science Journey.

Rishabh Choubey
3 min read · Dec 18, 2020

→ From a Professional Cricketer to Starting off a Data Science Journey.

I was a national-level cricketer who represented my state team in BCCI tournaments for the past 8 years. My journey in that profession has been a roller coaster ride. But seeing the world move towards data, and seeing how proper insights and visualization of data can influence decision making to a great extent, I was drawn to this field. During my cricketing years I was fascinated by how sports analysts could change the course of a game by analyzing the numbers of an opposing team or an individual player. That is where my inclination towards Data Science grew, and I decided to step into this ever-growing field.

What is Data Science?

Data Science is a field that combines statistics, mathematics (including calculus), data analysis and domain knowledge, applied to a dataset in order to draw inferences and produce reports, predictive models and visual dashboards.

Stream Independent:-

What attracted me most about this field is that it is stream independent, i.e. it doesn't matter whether you come from commerce, arts or any other background. If you are interested in statistics and fascinated by data, this field is a huge pond to dive into and get a taste of its waters.

Data Science Essentials :-

Languages:-

Python → Python is a programming language with simple syntax and a lot of libraries, some of which provide ready-made Machine Learning algorithms. Libraries like NumPy, Pandas, Matplotlib and SciPy are widely used in Data Science

R → R is a programming language designed especially for statistical computing and data mining

Julia → It is a high-level, high-performance, dynamic programming language used for numerical analysis and computational science

Frameworks:-

TensorFlow → It is a framework to create machine learning models. Creator :- Google

PyTorch → It is a framework to create machine learning models. Creator :- Facebook

Jupyter Notebook → It is an interactive platform for Python that allows faster experimentation

Visualization Tools:-

Plotly → It is a library that provides a wide range of interactive charts and visualizations

Tableau → A platform that provides various tools for visualizing data in the form of a dashboard and creating a report at the same time.

Power BI → It is a business analytics service, widely used for reporting and dashboard creation. Created by :- Microsoft

Platform:-

Anaconda :- One of the most widely used platforms, as it provides a comprehensive, free and open-source distribution of Python and R

MY EXPERIENCE WITH TOOLS AND PLATFORMS AS A BEGINNER :

Platform :- Anaconda

Language :- Python , SQL

Framework :- Notebook (Google Colab, Jupyter)

DBMS :- MySQL

Visualization Tools :- Tableau, Power BI

We all know that Microsoft Excel is a widely used program for data analysis and documentation. But as the amount of data grows, doing everything manually becomes a tedious task. VBA makes the task easier, but VBA is itself tricky to master, which is not the case with the basics of Python, SQL and data analysis.

As a beginner I found working in Jupyter Notebook as easy as it can be. It allows interactive experimentation, since Python code can be run and checked one cell at a time. Python is easy to get started with for any amount of data, and a program written to carry out a particular process on one dataset can be reused on any number of datasets as required. As we grow into this field, Python remains a simple language to use, because it has many libraries for complex mathematical functions and statistical decision making.
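The idea of writing a process once and repeating it over several datasets can be sketched in a few lines of Python. This is only a minimal illustration: the function name `summarize` and the batting scores are made up for the example, and it uses only Python's built-in `statistics` module rather than any of the libraries listed earlier.

```python
import statistics

def summarize(scores):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "highest": max(scores),
    }

# Hypothetical batting scores for two players (made-up numbers).
datasets = {
    "player_a": [45, 12, 78, 33, 102],
    "player_b": [5, 60, 41, 0, 88, 23],
}

# The same function is applied, unchanged, to every dataset.
summaries = {name: summarize(scores) for name, scores in datasets.items()}

for name, summary in summaries.items():
    print(name, summary)
```

Adding a third dataset only needs one more entry in `datasets`; the analysis code itself stays the same.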

MySQL, on the other hand, is an easy-to-learn Database Management System, with simple queries to create, alter, manipulate and fetch data.

MySQL also works well for handling large databases and large amounts of data, with added security and integrity features.
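To give a feel for the create, alter, manipulate and fetch queries mentioned above, here is a small sketch run from Python. As an assumption for the sake of a self-contained example, it uses Python's built-in `sqlite3` module with an in-memory database instead of a MySQL server; the basic SQL statements shown look much the same in MySQL, although connection setup and placeholder style differ. The table name `matches` and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table (hypothetical schema).
cur.execute(
    "CREATE TABLE matches (id INTEGER PRIMARY KEY, opponent TEXT, runs INTEGER)"
)

# Alter: add a new column to the existing table.
cur.execute("ALTER TABLE matches ADD COLUMN venue TEXT")

# Manipulate: insert rows, then update one of them (made-up data).
cur.executemany(
    "INSERT INTO matches (opponent, runs, venue) VALUES (?, ?, ?)",
    [("Team A", 72, "Home"), ("Team B", 48, "Away"), ("Team C", 15, "Home")],
)
cur.execute("UPDATE matches SET runs = 50 WHERE opponent = 'Team B'")
conn.commit()

# Fetch: read back the matches with 50 or more runs, highest first.
rows = cur.execute(
    "SELECT opponent, runs FROM matches WHERE runs >= 50 ORDER BY runs DESC"
).fetchall()
print(rows)

conn.close()
```

Each step is a single short statement, which is a large part of why SQL feels approachable to a beginner.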

To start off, I think the details above should be useful enough. I will write again in a few days about using Python and MySQL together, and the wonders they do when used hand in hand.

Rishabh Choubey

An engineer and a professional cricketer turned towards Data Science. Check out some of my work at :- https://github.com/rishabh2812