### Know more about Data Science

**What is Data Science?**

Data Science is a blend of tools, algorithms, and machine learning principles whose goal is to discover hidden patterns in raw data. But how is this different from what analysts have been doing for years?

The answer lies in the difference between explanation and prediction.

As you can see from the picture above, a Data Analyst usually explains what is going on by processing the history of the data.

A Data Scientist, on the other hand, not only does the exploratory analysis to discover insights from the data, but also uses advanced machine learning algorithms to predict the occurrence of a particular event in the future.

A Data Scientist will look at the data from many angles, sometimes angles not known earlier.

Traditionally, the data we had was mostly structured and small in size, and could be analyzed using simple BI tools.

Unlike the data in traditional systems, which was mostly structured, most of today's data is unstructured or semi-structured.

Let's look at the data trends in the picture below, which shows that by 2020 more than 80% of the data will be unstructured.

*Figure: Total data stored*

This isn’t the only reason why Data Science has become so popular. Let’s dig deeper and see how Data Science is being used in various domains.

## Machine learning for pattern discovery

If you don’t have the parameters on which to base predictions, you need to find the hidden patterns within the dataset to be able to make meaningful predictions. This is the unsupervised model, since you don’t have any predefined labels for grouping. The most common algorithm used for pattern discovery is clustering.
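To make the idea of clustering concrete, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. The article does not name a specific algorithm, so the choice of k-means, the function name `k_means`, and the sample points are all assumptions made for illustration.

```python
import random


def k_means(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters with plain k-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    # Start from k distinct points picked at random from the data.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Hypothetical, unlabeled data: two visually separate groups.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.2), (8.1, 7.9), (7.8, 8.0)]
labels, centroids = k_means(data, k=2)
print(labels)
print(centroids)
```

No labels are supplied anywhere: the algorithm only sees the coordinates and still separates the first three points from the last three. Which group ends up as cluster 0 or cluster 1 depends on the random start.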

## Who is a Data Scientist?

There are many definitions of a Data Scientist. In simple words, a Data Scientist is one who practices the art of Data Science.

The term “Data Scientist” was coined after considering the fact that a Data Scientist draws a lot of information from scientific fields and applications, whether it is statistics or mathematics.

Data scientists are those who crack complex data problems with their strong expertise in certain scientific disciplines.

## Introduction to the Data Science syllabus

The syllabus is organized into the following key modules:

#### Module 1: Python

Python is the most important and fundamental topic that every data scientist should know. In this section, our instructors will take you through the basics of Python and the areas where it can be used. You will learn how to use some of the current tools such as NumPy, Pandas, and Matplotlib. Module 1 includes –

- Environment set-up
- Jupyter overview
- Python NumPy
- Python Pandas
- Python Matplotlib

#### Module 2: R

Used for statistical computing and data analysis, the R programming language is one of the advanced statistical languages used in Data Science. This module shows you how to explore data sets using R. Here you will learn –

- An introduction to R
- Data structures in R
- Data visualization with R
- Data analysis with R

#### Module 3: Statistics

When working with data, a knowledge of statistics is essential and an important skill to have. In this module, you will learn –

- Important statistical concepts used in data science
- Difference between population and sample
- Types of variables
- Measures of central tendency
- Measures of variability
- Coefficient of variation
- Skewness and Kurtosis
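As a rough illustration of the descriptive measures listed above, the following sketch uses only Python's built-in `statistics` module. The sample values are made up, and the skewness formula shown is the simple population (moment-based) version, which is one of several conventions.

```python
import statistics

# Hypothetical sample: daily order counts for one week.
sample = [12, 15, 11, 15, 40, 14, 13]

# Measures of central tendency.
mean = statistics.mean(sample)      # arithmetic mean
median = statistics.median(sample)  # middle value of the sorted data
mode = statistics.mode(sample)      # most frequent value

# Measures of variability.
sample_sd = statistics.stdev(sample)      # sample standard deviation (n - 1)
value_range = max(sample) - min(sample)   # spread between the extremes

# Coefficient of variation: standard deviation relative to the mean.
coeff_of_variation = sample_sd / mean

# Population skewness: third central moment divided by sd cubed.
pop_sd = statistics.pstdev(sample)
skewness = sum((x - mean) ** 3 for x in sample) / (len(sample) * pop_sd ** 3)

print(mean, median, mode)
print(sample_sd, value_range, coeff_of_variation, skewness)
```

The single large value (40) pulls the mean above the median and produces a positive skewness, which is why the median is often the more robust summary for lopsided data.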

#### Module 4: Inferential statistics

Inferential statistics is used to make generalizations about populations from the samples drawn from them. This is another branch of statistics, which helps you learn to analyze representative samples of large data sets. In this module, you will learn –

- Normal distribution
- Hypothesis testing
- Central limit theorem
- Confidence interval
- T-test
- Type I and II errors
- Student’s T distribution
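The topics above can be tied together with a small sketch: estimate a population mean from a sample, attach a 95% confidence interval, and compute a one-sample t statistic. Everything here is an assumption for illustration: the data is simulated, the hypothesized mean of 50 is arbitrary, and the interval uses the normal approximation (reasonable for 100 observations) because Python's standard library has no Student's t distribution.

```python
import math
import random
import statistics

rng = random.Random(42)
# Hypothetical sample: 100 simulated measurements (true mean 50, sd 5).
sample = [rng.gauss(50, 5) for _ in range(100)]

n = len(sample)
sample_mean = statistics.fmean(sample)
# Standard error of the mean: sample sd divided by sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# 95% confidence interval for the mean, using the normal approximation
# that the central limit theorem justifies for a sample this large.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

# One-sample t statistic for the null hypothesis "the population mean is 50".
t_statistic = (sample_mean - 50) / standard_error

print(f"mean = {sample_mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"t = {t_statistic:.2f}")
```

For small samples the multiplier `z` would be replaced by a critical value from Student's t distribution with `n - 1` degrees of freedom, which gives a wider interval.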

#### Module 5: Regression and ANOVA

This topic will help you understand how to establish a relationship between two or more variables. ANOVA, or analysis of variance, is used to analyze the differences among sample groups. Here you will learn –

- Regression
- ANOVA
- R-squared
- Correlation and causation
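The sketch below shows, under simple assumptions, what the first three items look like in code: an ordinary least squares line fit, the R-squared of that fit, and the F statistic of a one-way ANOVA. The function names and all data values are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def one_way_anova_f(groups):
    """F statistic: between-group variance over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)
r2 = r_squared(hours, scores, slope, intercept)

# Hypothetical data: the same measurement taken in three groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

print(slope, intercept, r2, f_stat)
```

A high R-squared only says the line fits these points well. It does not show that studying causes higher scores, which is the distinction the last item in the list is about.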

#### Module 6: Exploratory data analysis

In this lesson you will learn –

- Data visualization
- Missing value analysis
- The correlation matrix
- Outlier detection analysis
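Two of these steps, missing value analysis and outlier detection, are easy to sketch with the standard library alone. The readings below are invented, `None` is assumed to mark a missing entry, and the 1.5 × IQR rule is just one common convention for flagging outliers.

```python
import statistics

# Hypothetical column of readings; None marks a missing value.
readings = [10.2, 9.8, None, 10.5, 10.1, 35.0, 9.9, None, 10.3, 10.0]

# Missing value analysis: how many entries are absent, and what share.
present = [x for x in readings if x is not None]
missing_count = len(readings) - len(present)
missing_share = missing_count / len(readings)

# Outlier detection with the 1.5 * IQR rule (Tukey's fences).
q1, _, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in present if x < lower or x > upper]

print(f"missing: {missing_count} of {len(readings)} ({missing_share:.0%})")
print(f"fences: ({lower:.3f}, {upper:.3f}), outliers: {outliers}")
```

In practice the same checks are usually run per column with Pandas, but the logic is the same: count what is absent first, then look for values that sit far outside the bulk of the data.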

#### Module 7: Supervised machine learning

This is a comprehensive module to help you understand how to make machines or computers interpret human language. You will learn –

- Python scikit-learn library
- Neural networks
- Support vector machine
- Logistic and linear regression
- Decision tree classifier
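To give a feel for what "supervised" means, here is a minimal sketch of logistic regression trained by gradient descent on labeled examples. It is written in plain Python rather than with the Scikit tool the module lists, purely so that it runs without extra installs. The function names, learning rate, epoch count, and data are all assumptions for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical labeled data: hours studied -> passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(hours, passed)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in hours]
print(predictions)
```

Unlike clustering, the model here is given the correct answer (`passed`) for every training example and adjusts `w` and `b` to reproduce those answers. A library such as scikit-learn wraps the same idea behind a fit-and-predict interface.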

#### Module 9: Machine learning on cloud

In this lesson, you will learn –

- ML on cloud platform
- ML on AWS
- ML on Microsoft Azure