Head First Statistics
Book description
Wouldn't it be great if there were a statistics book that made histograms, probability distributions, and chi square analysis more enjoyable than going to the dentist? Head First Statistics brings this typically dry subject to life, teaching you everything you want and need to know about statistics through engaging, interactive, and thought-provoking material, full of puzzles, stories, quizzes, visual aids, and real-world examples.
Whether you're a student, a professional, or just curious about statistical analysis, Head First's brain-friendly formula helps you get a firm grasp of statistics so you can understand key points and actually use them. Learn to present data visually with charts and plots; discover how the mean, median, and mode differ as measures of average, and why the difference matters; learn how to calculate probability and expectation; and much more.
Head First Statistics is ideal for high school and college students taking statistics and satisfies the requirements for passing the College Board's Advanced Placement (AP) Statistics Exam. With this book, you'll:
- Study the full range of topics covered in first-year statistics
- Tackle tough statistical concepts using Head First's dynamic, visually rich format proven to stimulate learning and help you retain knowledge
- Explore real-world scenarios, ranging from casino gambling to prescription drug testing, to bring statistical principles to life
- Discover how to measure spread, calculate odds through probability, and understand the normal, binomial, geometric, and Poisson distributions
- Conduct sampling, use correlation and regression, do hypothesis testing, perform chi square analysis, and more
Before you know it, you'll not only have mastered statistics, you'll also see how they work in the real world. Head First Statistics will help you pass your statistics course, and give you a firm understanding of the subject so you can apply the knowledge throughout your life.
Table of contents
- Dedication
- A Note Regarding Supplemental Files
- Advance Praise for Head First Statistics
- Praise for other Head First books
- Author of Head First Statistics
- How to use this Book: Intro
1. Visualizing Information: First Impressions
- Statistics are everywhere
- But why learn statistics?
- A tale of two charts
- Manic Mango needs some charts
- The humble pie chart
- Chart failure
- Bar charts can allow for more accuracy
- Vertical bar charts
- Horizontal bar charts
- It’s a matter of scale
- Using frequency scales
- Dealing with multiple sets of data
- Your bar charts rock
- Categories vs. numbers
- Dealing with grouped data
- To make a histogram, start by finding bar widths
- Manic Mango needs another chart
- Make the area of histogram bars proportional to frequency
- Step 1: Find the bar widths
- Step 2: Find the bar heights
- Step 3: Draw your chart—a histogram
- Histograms can’t do everything
- Introducing cumulative frequency
- Drawing the cumulative frequency graph
- Choosing the right chart
- Manic Mango conquered the games market!
2. Measuring Central Tendency: The Middle Way
- Welcome to the Health Club
- A common measure of average is the mean
- Mean math
- Dealing with unknowns
- Back to the mean
- Handling frequencies
- Back to the Health Club
- Everybody was Kung Fu fighting
- Our data has outliers
- The butler outliers did it
- Watercooler conversation
- Finding the median
- Business is booming
- The Little Ducklings swimming class
- Frequency Magnets
- Frequency Magnets
- What went wrong with the mean and median?
- Introducing the mode
- Congratulations!
3. Measuring Variability and Spread: Power Ranges
- Wanted: one player
- We need to compare player scores
- Use the range to differentiate between data sets
- The problem with outliers
- We need to get away from outliers
- Quartiles come to the rescue
- The interquartile range excludes outliers
- Quartile anatomy
- We’re not just limited to quartiles
- So what are percentiles?
- Box and whisker plots let you visualize ranges
- Variability is more than just spread
- Calculating average distances
- We can calculate variation with the variance...
- ...but standard deviation is a more intuitive measure
- A quicker calculation for variance
- What if we need a baseline for comparison?
- Use standard scores to compare values across data sets
- Interpreting standard scores
- Statsville All Stars win the league!
4. Calculating Probabilities: Taking Chances
- Fat Dan’s Grand Slam
- Roll up for roulette!
- Your very own roulette board
- Place your bets now!
- What are the chances?
- Find roulette probabilities
- You can visualize probabilities with a Venn diagram
- It’s time to play!
- And the winning number is...
- Let’s bet on an even more likely event
- You can also add probabilities
- You win!
- Time for another bet
- Exclusive events and intersecting events
- Problems at the intersection
- Some more notation
- Another unlucky spin...
- ...but it’s time for another bet
- Conditions apply
- Find conditional probabilities
- You can visualize conditional probabilities with a probability tree
- Trees also help you calculate conditional probabilities
- Bad luck!
- We can find P(Black | Even) using the probabilities we already have
- Step 1: Finding P(Black ∩ Even)
- So where does this get us?
- Step 2: Finding P(Even)
- Step 3: Finding P(Black | Even)
- These results can be generalized to other problems
- Use the Law of Total Probability to find P(B)
- Introducing Bayes’ Theorem
- We have a winner!
- It’s time for one last bet
- If events affect each other, they are dependent
- If events do not affect each other, they are independent
- More on calculating probability for independent events
- Winner! Winner!
5. Using Discrete Probability Distributions: Manage Your Expectations
- Back at Fat Dan’s Casino
- We can compose a probability distribution for the slot machine
- Expectation gives you a prediction of the results...
- ... and variance tells you about the spread of the results
- Variances and probability distributions
- Let’s calculate the slot machine’s variance
- Fat Dan changed his prices
- There’s a linear relationship between E(X) and E(Y)
- Slot machine transformations
- General formulas for linear transforms
- Every pull of the lever is an independent observation
- Observation shortcuts
- New slot machine on the block
- Add E(X) and E(Y) to get E(X + Y)...
- ... and subtract E(X) and E(Y) to get E(X – Y)
- You can also add and subtract linear transformations
- Jackpot!
6. Permutations and Combinations: Making Arrangements
- The Statsville Derby
- It’s a three-horse race
- How many ways can they cross the finish line?
- Calculate the number of arrangements
- Going round in circles
- It’s time for the novelty race
- Arranging by individuals is different than arranging by type
- We need to arrange animals by type
- Generalize a formula for arranging duplicates
- It’s time for the twenty-horse race
- How many ways can we fill the top three positions?
- Examining permutations
- What if horse order doesn’t matter
- Examining combinations
- It’s the end of the race
7. Geometric, Binomial, and Poisson Distributions: Keeping Things Discrete
- Meet Chad, the hapless snowboarder
- We need to find Chad’s probability distribution
- There’s a pattern to this probability distribution
- The probability distribution can be represented algebraically
- The pattern of expectations for the geometric distribution
- Expectation is 1/p
- Finding the variance for our distribution
- You’ve mastered the geometric distribution
- Should you play, or walk away?
- Generalizing the probability for three questions
- Let’s generalize the probability further
- What’s the expectation and variance?
- Binomial expectation and variance
- The Statsville Cinema has a problem
- Expectation and variance for the Poisson distribution
- So what’s the probability distribution?
- Combine Poisson variables
- The Poisson in disguise
- Anyone for popcorn?
8. Using the Normal Distribution: Being Normal
- Discrete data takes exact values...
- ... but not all numeric data is discrete
- What’s the delay?
- We need a probability distribution for continuous data
- Probability density functions can be used for continuous data
- Probability = area
- To calculate probability, start by finding f(x)...
- ... then find probability by finding the area
- We’ve found the probability
- Searching for a soul sole mate
- Male modelling
- The normal distribution is an “ideal” model for continuous data
- So how do we find normal probabilities?
- Three steps to calculating normal probabilities
- Step 1: Determine your distribution
- Step 2: Standardize to N(0, 1)
- To standardize, first move the mean...
- ... then squash the width
- Now find Z for the specific value you want to find probability for
- Step 3: Look up the probability in your handy table
- Julie’s probability is in the table
- And they all lived happily ever after
9. Using the Normal Distribution ii: Beyond Normal
- Love is a roller coaster
- All aboard the Love Train
- Normal bride + normal groom
- It’s still just weight
- How’s the combined weight distributed?
- Finding probabilities
- More people want the Love Train
- Linear transforms describe underlying changes in values...
- ...and independent observations describe how many values you have
- Expectation and variance for independent observations
- Should we play, or walk away?
- Normal distribution to the rescue
- When to approximate the binomial distribution with the normal
- Revisiting the normal approximation
- The binomial is discrete, but the normal is continuous
- Apply a continuity correction before calculating the approximation
- All aboard the Love Train
- When to approximate the binomial distribution with the normal
- A runaway success!
10. Using Statistical Sampling: Taking Samples
- The Mighty Gumball taste test
- They’re running out of gumballs
- Test a gumball sample, not the whole gumball population
- How sampling works
- When sampling goes wrong
- How to design a sample
- Define your sampling frame
- Sometimes samples can be biased
- Sources of bias
- How to choose your sample
- Simple random sampling
- How to choose a simple random sample
- There are other types of sampling
- We can use stratified sampling...
- ...or we can use cluster sampling...
- ...or even systematic sampling
- Mighty Gumball has a sample
11. Estimating Populations and Samples: Making Predictions
- So how long does flavor really last for?
- Let’s start by estimating the population mean
- Point estimators can approximate population parameters
- Let’s estimate the population variance
- We need a different point estimator than sample variance
- Which formula’s which?
- Mighty Gumball has done more sampling
- It’s a question of proportion
- Buy your gumballs here!
- So how does this relate to sampling?
- The sampling distribution of proportions
- So what’s the expectation of Ps?
- And what’s the variance of Ps?
- Find the distribution of Ps
- Ps follows a normal distribution
- How many gumballs?
- We need probabilities for the sample mean
- The sampling distribution of the mean
- Find the expectation for X̄
- What about the variance of X̄?
- So how is X̄ distributed?
- If n is large, X̄ can still be approximated by the normal distribution
- Using the central limit theorem
- Sampling saves the day!
12. Constructing Confidence Intervals: Guessing with Confidence
- Mighty Gumball is in trouble
- The problem with precision
- Introducing confidence intervals
- Four steps for finding confidence intervals
- Step 1: Choose your population statistic
- Step 2: Find its sampling distribution
- Point estimators to the rescue
- We’ve found the distribution for X̄
- Step 3: Decide on the level of confidence
- How to select an appropriate confidence level
- Step 4: Find the confidence limits
- Start by finding Z
- Rewrite the inequality in terms of μ
- Finally, find the value of X̄
- You’ve found the confidence interval
- Let’s summarize the steps
- Handy shortcuts for confidence intervals
- Just one more problem...
- Step 1: Choose your population statistic
- Step 2: Find its sampling distribution
- X̄ follows the t-distribution when the sample is small
- Find the standard score for the t-distribution
- Step 3: Decide on the level of confidence
- Step 4: Find the confidence limits
- Using t-distribution probability tables
- The t-distribution vs. the normal distribution
- You’ve found the confidence intervals!
13. Using Hypothesis Tests: Look At The Evidence
- Statsville’s new miracle drug
- So what’s the problem?
- Resolving the conflict from 50,000 feet
- The six steps for hypothesis testing
- Step 1: Decide on the hypothesis
- So what’s the alternative?
- Step 2: Choose your test statistic
- Step 3: Determine the critical region
- To find the critical region, first decide on the significance level
- Step 4: Find the p-value
- We’ve found the p-value
- Step 5: Is the sample result in the critical region?
- Step 6: Make your decision
- So what did we just do?
- What if the sample size is larger?
- Let’s conduct another hypothesis test
- Step 1: Decide on the hypotheses
- Step 2: Choose the test statistic
- Use the normal to approximate the binomial in our test statistic
- Step 3: Find the critical region
- SnoreCull failed the test
- Mistakes can happen
- Let’s start with Type I errors
- What about Type II errors?
- Finding errors for SnoreCull
- We need to find the range of values
- Find P(Type II error)
- Introducing power
- The doctor’s happy
14. The χ² Distribution: There’s Something Going On...
- There may be trouble ahead at Fat Dan’s Casino
- Let’s start with the slot machines
- The χ² test assesses difference
- So what does the test statistic represent?
- Two main uses of the χ² distribution
- ν represents degrees of freedom
- What’s the significance?
- Hypothesis testing with χ²
- You’ve solved the slot machine mystery
- Fat Dan has another problem
- The χ² distribution can test for independence
- You can find the expected frequencies using probability
- So what are the frequencies?
- We still need to calculate degrees of freedom
- Generalizing the degrees of freedom
- And the formula is...
- You’ve saved the casino
15. Correlation and Regression: What’s My Line?
- Never trust the weather
- Let’s analyze sunshine and attendance
- Exploring types of data
- Visualizing bivariate data
- Scatter diagrams show you patterns
- Correlation vs. causation
- Predict values with a line of best fit
- Your best guess is still a guess
- We need to minimize the errors
- Introducing the sum of squared errors
- Find the equation for the line of best fit
- Finding the slope for the line of best fit
- Finding the slope for the line of best fit, part ii
- We’ve found b, but what about a?
- You’ve made the connection
- Let’s look at some correlations
- The correlation coefficient measures how well the line fits the data
- There’s a formula for calculating the correlation coefficient, r
- Find r for the concert data
- Find r for the concert data, continued
- You’ve saved the day!
- Leaving town...
- It’s been great having you here in Statsville!
A. Leftovers: The Top Ten Things (we didn’t cover)
- #1. Other ways of presenting data
- #2. Distribution anatomy
- #3. Experiments
- Designing your experiment
- #4. Least square regression alternate notation
- #5. The coefficient of determination
- #6. Non-linear relationships
- #7. The confidence interval for the slope of a regression line
- #8. Sampling distributions – the difference between two means
- #9. Sampling distributions – the difference between two proportions
- #10. E(X) and Var(X) for continuous probability distributions
- Finding E(X)
- Finding Var(X)
- B. Statistics Tables: Looking Things Up
- Index
- About the Author
- Copyright
Product information
- Title: Head First Statistics
- Author(s): Dawn Griffiths
- Release date: August 2008
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9780596527587