Sponsored Post.
Summarize Data Make New Columns Combine Data Sets df'w'.valuecounts Count number of rows with each unique value of variable len(df) # of rows in DataFrame. EDA Cheat Sheet(s) Why we EDA Things to keep in mind The plan Wrangling Basic things to do Helpful packages Data quality assessment and profiling Basic things to do Questions to consider Helpful packages Example backlog Exploration 1.
One of the most important parts of any Machine Learning (ML) project is performing Exploratory Data Analysis (EDA) to make sure the data is valid and that there are no obvious problems. EDA also helps you provide data-driven insights to business stakeholders before the project starts to ensure you’re asking the right questions.
In this tutorial, you’ll use Python and Pandas to:
- Explore a dataset and create visual distributions
- Identify and eliminate outliers
- Uncover correlations between two datasets
Creating an EDA is one of the first steps to building cleaner, more efficient machine learning and AI models. Read the tutorial and try it for yourself!
Data Mining¶
- Data Mining. PDF only.
Importing Data¶
- Importing Data. PDF.
Keras¶
- Keras. PDF.
Linear Algebra (with Numpy)¶
- Linear Algebra. PDF only.
- SciPy Linear Algebra. PDF.
Machine Learning¶
Machine Learning. PDF only.
- Supervised Learning;
- Unsupervised Learning;
- Deep Learning;
- Machine Learning Tips and Tricks;
- Probabilities and Statistics;
- Linear Algebra and Calculus.
Super pense-bête Machine Learning. PDF only.
Microsoft Azure Machine Learning. PDF.
- scikit-learn. PDF.
.
Numpy¶
- NumPy/SciPy/Pandas Cheat Sheet. PDF.
- Numpy. PDF.
Pandas¶
Eda Python Cheat Sheet Pdf
- Pandas DataFrame Notes. PDF only.
- Pandas. PDF.
- Pandas. PDF.
- Data Wrangling with Pandas. PDF.
Spark¶
- PySpark. PDF.
- PySpark SQL. PDF.
Visualization¶
Bokeh¶
- Bokeh. PDF.
Folium¶
- Folium. PDF.
Matplotlib¶
- Matplotlib Notes. PDF only.
- Matplotlib. PDF.
Cached
Plotly¶
- Plotly. PDF only.
Seaborn¶
- Seaborn. PDF.