2015
O'Reilly Media / Infinite Skills
Allen B. Downey
3:30
English
If you're a fledgling data scientist with only cursory statistical training and little experience with real-world data sets, you may feel like you're stumbling around in the dark when you're asked to interpret and present data to decision makers. How do you validate the data? What analytic model should you use? How do you differentiate between correlation and causation? How do you ensure that your data is solid and your conclusions are on target?
Allen Downey, Professor of Computer Science at Olin College of Engineering, author of Think Stats, Think Python, and Think Complexity, provides safe passage around the common pitfalls of exploratory data analysis, so you can manage, analyze, and present data with confidence.
- Learn the fundamental tools and methodologies used in data science
- Discover best practices regarding the ETL (Extract, Transform, and Load) process and data validation
- Use the open science framework: practice version control, replication, and data pipelining
- Grasp the effectiveness of CDFs (Cumulative Distribution Functions) in visualizing distributions
- Choose the correct analytic model for your data
- Comprehend statistical inference, effect size, confidence intervals, and hypothesis testing
- Discern the relationship between variables: understand scatter plots and scatter plot alternatives
- Understand correlation, linear least squares, linear regression, and logistic regression
- Master the Zen of testing your data and your conclusions
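As a rough illustration of the CDF topic listed above, the following minimal sketch computes an empirical cumulative distribution function using only the Python standard library. It is not taken from the course materials; the function name `empirical_cdf` and the sample values are hypothetical, chosen only to show the idea that a CDF maps a value to the fraction of the sample at or below it.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def cdf(x):
        # bisect_right gives the count of sorted values that are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical sample, invented for illustration only
weights = [2.9, 3.1, 3.4, 3.4, 3.6, 4.0, 4.2, 4.5]
cdf = empirical_cdf(weights)
print(cdf(3.4))  # fraction of values at or below 3.4
```

Plotting the sorted values against their cumulative fractions gives a view of the whole distribution without having to choose histogram bin widths, which is one reason CDFs are often preferred for comparing distributions.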
01. Introduction to Data Exploration
0101 Opportunities and Goals
0102 The State of Data
0103 Data Optimism
02. Getting Started
0201 Software Setup, IPython, and Import and Validation
0202 Data Organization
03. Visualizing Distributions
0301 PMFs and CDFs
04. Relationships Between Variables
0401 Scatterplots
0402 Correlation and Least Squares
05. Statistical Inference
0501 Introduction to Statistical Inference
0502 Effect Size
0503 Effect Size, Difference in Proportions
0504 Quantifying Precision
0505 Hypothesis Testing
06. Regression
0601 Linear Regression
0602 Logistic Regression
07. Modeling Distributions
0701 Modeling Distributions
08. Survival Analysis
0801 Survival Analysis
09. Inspection Paradox
0901 Inspection Paradox
oreilly.com, infiniteskills.com/training/data-exploration-in-python.html