Thursday, May 18, 2006
Deviations and Baseball Shapes
In the first part of class, we introduced the idea of a deviation.
- When we look at the ages of the Yankee players, we can summarize them by a mean.
- A deviation is the difference between an age and the mean (the age minus the mean). Randy Johnson's deviation is 7, which means he is seven years older than the mean age.
- To measure spread, we can use a typical deviation size. The standard deviation, or s for short, has a complicated formula, but it essentially tells us what a typical deviation is.
- We have already talked about baseball player shapes -- for example, Babe Ruth had an interesting physique. But here we are talking about shapes of baseball data.
- Most baseball statistics are counts of things like runs, hits, walks, home runs, etc. Distributions of counts tend to be right-skewed. Here's a graph of home runs of all 2005 regulars.

- Other baseball statistics are "derived", meaning they are computed by a formula. Examples of derived statistics are batting averages, slugging percentages, OPS, etc. Batches of these statistics (for groups of regular players) tend to be symmetric. For example, here are the OBPs for all 2005 regulars.

- We'll talk later about a special distribution shape called the normal curve. A normal curve is easy to summarize once we know its mean and standard deviation.
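
To make the deviation idea from the first part of class concrete, here is a minimal Python sketch. The ages below are made up for illustration and are not the actual Yankee roster. The sketch also assumes that s is the usual sample standard deviation, which divides the sum of squared deviations by n - 1 before taking the square root.

```python
import math
import statistics

# Hypothetical ages, invented for illustration (NOT the real Yankee ages).
ages = [24, 26, 27, 29, 30, 31, 33, 38, 41]

# Summarize the batch by its mean.
mean_age = statistics.mean(ages)

# A deviation is an age minus the mean: positive means older than average,
# negative means younger than average.
deviations = [age - mean_age for age in ages]

# Sample standard deviation (assumed definition of s):
# square the deviations, add them up, divide by n - 1, take the square root.
n = len(ages)
s = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))

print("mean age:", mean_age)
print("deviations:", deviations)
print("standard deviation s:", round(s, 2))
```

The deviations always add up to zero, which is why the formula squares them before averaging. The value of s computed this way matches what `statistics.stdev(ages)` returns.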