Data Science Essentials [Draft]

Pradeep Ankem
3 min read · Jul 8, 2021


Sandbox

Logged date: 24th Aug, 2021

What is the quiz for the day: here

What is the challenge ? (Name finder for male and female)

What is the checklist of the Trainer ?

- Ear Plugs
- Mouse

What are the meeting details ?

Meeting details here

Advanced Visuals

Name some Advanced Visuals ?

🖌 Take this quiz here [Google Forms]

add Gapminder link

- WordCloud
- Histogram
- Pareto Chart
- Sunburst Chart
- Complexity Network Chart
- Sankey Chart
- Maps Chart

What are Sparklines ?

Sparklines are inline graphs used to show the trend of a row.

Example: _ _ ▄ █ ░ ▄ █ █ █ █
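A sparkline like this can be generated from a row of numbers in a few lines. The sketch below is only an illustration: the function name `sparkline` and the choice of eight block characters are assumptions, not part of the class material.

```python
# Eight block characters, ordered from the lowest bar to the highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are equal: draw a flat line at mid height.
        return BARS[len(BARS) // 2] * len(values)
    # Scale every value to an index between 0 and len(BARS) - 1.
    return "".join(
        BARS[round((v - lo) / span * (len(BARS) - 1))] for v in values
    )


# Sample values for one row (illustrative only).
print(sparkline([2, 9, 10, 9, 14, 11, 8, 10, 14, 19, 16, 17]))
```

Each value is scaled against the minimum and maximum of its own row, so the sparkline shows the shape of the trend rather than absolute size.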

How to learn Data Science in 45 Hours ?

Follow this thread

What tools have been used so far ?

Excel, Power BI or Google Data Studio, Azure ML Studio, and Google Colab

What are the *rules* in the class ?

- If there is no data connection 3 times, the class gets cancelled
- A laptop/desktop is a must
- Do assignments on time
- Expect hands-on work, quizzes and micro challenges
- Less PPT business
- Ācārya dēvō bhava (revere the teacher as divine)

Contents

- Basics
- Break the ice
- Expectation setting / Project take-up
- What research have you done so far on this topic
- Difference between AI, Machine Learning and Deep Learning

Let’s begin..

         __
_(\ |@@|
(__/\__ \--/ __
\___|----| | __
\ }{ /\ )_ / _\
/\__/\ \__O (__
(--/\--) \__/
_)( )(_
`---''---`

Notes:

NRE21 Draw Robot

Tool-kit

QA Session:

What are the best books to refer to ?

NRE43. What are the three subjects involved in Data Science ?

[a] Stats, Code and SME
[b] Math, Code and SME
[c] Math, Social and Physics
[d] Biology, Math and SME

NRE12. What are the things that can be done with a Machine, if you are a Data Scientist ?

* Prediction
* Anomaly detection
* Clustering
* Classification
* Natural Language Understanding
* Reinforcement learning

Which is the most widely used Dataset format ?

[a] .tsv
[b] .csv
[c] .txt
[d] .xlsx

Which Programming language is widely used for Data Science ?

[a] Python
[b] R
[c] Java
[d] JavaScript

Which Notebook environment is widely used by Data Scientists ?

[a] Jupyter Notebook
[b] Google Colab
[c] Datalore
[d] None of the above

Who are some of the notable personalities in the Data Science world ?

* Ian Goodfellow
* Yann LeCun
* Andrew Ng
* Geoffrey Hinton
* Jake VanderPlas

What are some well-known applications used in Data Science ?

* Alexa, Siri, and their friends
* Tesla Autopilot

What is the difference between Supervised and Unsupervised Learning ?
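One way to see the difference is to run both kinds of learning on the same numbers. The sketch below is a toy illustration under stated assumptions: the heights and labels are made up, and the helper names (`fit_centroids`, `predict`, `two_means`) are hypothetical. The supervised part learns from labelled examples. The unsupervised part gets no labels and only groups similar values.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Supervised: learn the average value for each known label."""
    return {
        lab: mean([v for v, l in zip(values, labels) if l == lab])
        for lab in set(labels)
    }


def predict(centroids, value):
    """Return the label whose learned average is closest to the value."""
    return min(centroids, key=lambda lab: abs(centroids[lab] - value))


def two_means(values, steps=10):
    """Unsupervised: split unlabelled values into two groups (1-D k-means, k=2).

    Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)  # initial group centres
    for _ in range(steps):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(low), mean(high)
    return low, high


# Made-up heights in cm, for illustration only.
heights = [150, 152, 155, 178, 180, 183]
labels = ["short", "short", "short", "tall", "tall", "tall"]

# Supervised: labels are given, so a new value can be classified.
print(predict(fit_centroids(heights, labels), 176))

# Unsupervised: no labels, the groups are discovered from the data.
print(two_means(heights))
```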

Day 2: Visuals (only basics)

Tool-kit:

Dataset: bit.do/titanicdf
https://quickchart.io/
Study on Infographics
Book: Data Visualization For Dummies
https://sites.google.com/view/robotic-future/visualization
Data Science Course link
Visual Cheat Sheet link

What is the difference between univariate and bivariate analysis ?
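A small sketch can make the distinction concrete. Univariate analysis summarises one variable on its own, while bivariate analysis looks at how two variables move together. The numbers below are made up for illustration, and `pearson` is a hand-written helper rather than a library call.

```python
from math import sqrt
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only.
age = [22, 38, 26, 35, 54, 2, 27, 14]
fare = [7.25, 71.28, 7.92, 53.10, 51.86, 21.08, 11.13, 30.07]

# Univariate: describe a single variable.
print("mean age:", mean(age), "median age:", median(age))

# Bivariate: describe the relationship between two variables.
print("age vs fare correlation:", round(pearson(age, fare), 2))
```

A histogram of Age is a univariate chart. A scatter plot of Age against Fare is a bivariate one.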

A sample bar chart that can be used as a timeline:

▅▆▂▃▂▂▂▅▂▂▅▇▂▂▂▃▆▆▆▅▃▂▂▂▁▂▂▆▁▃

Stacked Chart

10 |
| ▄ █ ░
5 |▄ █ █ █ █
+- - - - - - - - - - - ->
J F M A M J J A S O N D

Bar Chart

_ _ ▄ █ ░ ▄ █ █ █ █

The Power of visuals:

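years = {2000: 2, 2001: 9, 2002: 10, 2003: 9, 2004: 14, 2005: 11, 2006: 8, 2007: 10, 2008: 14, 2009: 19, 2010: 16, 2011: 17}

# Python: print one '|' per unit next to each year
for y in years:
    print(y, years[y] * '|')

// JavaScript: symbols (k), slice fractions (v) and radius (r), apparently the inputs for the ASCII pie chart in the output below
var k = ["#", "+", "$", "X"];
var v = [0.2, 0.4, 0.15, 0.25];
var r = 10;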

Output :

       $$$XXXX      
$$$$$XXXXXX
$$$$$$XXXXXXX
$$$$$$$XXXXXXXX
+$$$$$$$XXXXXXXXX
++$$$$$$XXXXXXXXX
+++++$$$$XXXXXXXXXX
++++++$$$XXXXXXXXXX
+++++++$$XXXXXXXXXX
+++++++++XXXXXXXXXX
++++++++++#########
+++++++++++########
+++++++++++########
++++++++++#######
+++++++++++######
++++++++++#####
+++++++++####
++++++++###
+++++++
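The script that produced this pie chart is not shown, so here is a minimal Python sketch of one way such a chart could be drawn from the same `k`, `v` and `r` values. The function name `ascii_pie` and the angle convention are assumptions, and the result is not guaranteed to match the output above character for character.

```python
import math
from itertools import accumulate


def ascii_pie(symbols, fractions, radius):
    """Return text rows that draw a pie chart with one symbol per slice."""
    bounds = list(accumulate(fractions))  # cumulative upper bound of each slice
    rows = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            if x * x + y * y > radius * radius:
                row.append(" ")  # cell lies outside the circle
                continue
            # Angle of the cell as a fraction of a full turn, in [0, 1).
            turn = (math.atan2(y, x) / (2 * math.pi)) % 1.0
            # First slice whose cumulative bound exceeds the angle;
            # fall back to the last slice if rounding pushes past the end.
            index = next(
                (i for i, b in enumerate(bounds) if turn < b),
                len(symbols) - 1,
            )
            row.append(symbols[index])
        rows.append("".join(row).rstrip())
    return rows


k = ["#", "+", "$", "X"]
v = [0.2, 0.4, 0.15, 0.25]
r = 10

print("\n".join(ascii_pie(k, v, r)))
```

Every character cell inside the circle is assigned to a slice by its angle, so each symbol covers a share of the disc roughly equal to its fraction in `v`.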

Histogram Example:
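A histogram groups values into equal-width bins and shows how many values fall into each bin. Here is a minimal text version. The ages are made up for illustration (they are not taken from the Titanic dataset), and `text_histogram` is a hypothetical helper name.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Group values into equal-width bins and return one text bar per bin."""
    counts = Counter(int(v // bin_width) for v in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        start = b * bin_width
        label = f"{start:>3}-{start + bin_width - 1:<3}"
        # Counter returns 0 for empty bins, so they show up as empty bars.
        lines.append(f"{label} {'█' * counts[b]}")
    return lines


# Made-up ages, for illustration only.
ages = [2, 9, 18, 21, 22, 24, 26, 27, 29, 30,
        31, 35, 36, 38, 40, 45, 54, 58, 62, 71]

for line in text_histogram(ages, bin_width=10):
    print(line)
```

Unlike a bar chart, where each bar is a category, the bars here are consecutive numeric ranges, so the overall shape shows how the values are distributed.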

Quiz

Use Google Data Studio/Google Sheets

Dataset: Titanic

Draw a Histogram on Age ?
Draw a Bar Chart on PClass ?
Draw a Pie Chart on Embarked ?
Draw a Correlation Matrix Plot for all the variables ?

Create a pivot table for Survived vs Gender and Pclass vs Survived ?

Add a text box showing the count of missing values in Age ?

Draw a Timeline chart for Asian Paints for the last 7 days ?

Create Infographics for your Resume (use visme) ?

Create an interactive data visualisation chart (do your own research) ?
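The quiz is meant to be done in Google Data Studio or Google Sheets, but the pivot table and missing-value counts can be cross-checked in code. The sketch below is only a possible approach: the rows are made up for illustration, the column names follow the quiz wording, and `pivot_count` and `missing_count` are hypothetical helpers.

```python
from collections import Counter

# Made-up rows, for illustration only (not the real Titanic data).
rows = [
    {"Gender": "male", "Pclass": 3, "Survived": 0, "Age": 22},
    {"Gender": "female", "Pclass": 1, "Survived": 1, "Age": 38},
    {"Gender": "female", "Pclass": 3, "Survived": 1, "Age": None},
    {"Gender": "male", "Pclass": 1, "Survived": 0, "Age": 54},
    {"Gender": "male", "Pclass": 3, "Survived": 0, "Age": None},
    {"Gender": "female", "Pclass": 2, "Survived": 1, "Age": 27},
]


def pivot_count(rows, row_key, col_key):
    """Count rows for every (row_key, col_key) pair, like a pivot table of counts."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def missing_count(rows, key):
    """Count rows where the given column has no value."""
    return sum(1 for r in rows if r[key] is None)


print(pivot_count(rows, "Gender", "Survived"))
print(pivot_count(rows, "Pclass", "Survived"))
print("missing Age values:", missing_count(rows, "Age"))
```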

┌──────────────────────┐
│                      │
│       Dataset        │
│   (Google Sheets)    │
│                      │
└──────────┬───────────┘
           │
           │
           │
┌──────────▼───────────┐
│                      │
│        Google        │
│         Data         │
│        Studio        │
│                      │
└──────────────────────┘

Day 3: Exploratory Data Analysis

The End
