Monday, May 15, 2006
Getting started
Today we got started looking at baseball from a statistical point of view. Here are some things we covered.
- There is a difference between statistics and Statistics. The small-s statistics are the data we collect; the big-S Statistics is the science of learning from data.
- Looking at Bernie Williams's baseball card, we saw examples of categorical variables, such as his fielding position, and quantitative variables, such as his batting average in 2000.
- Statistics always begins with a question, such as "Was Bernie Williams a big home run hitter?"
- In exploring data, we start by making a graph. Three useful graphs we discussed were a dotplot, a stemplot, and a time series plot.
- A graph gives you a picture of a data distribution. We describe a distribution by talking about its shape, its center, its spread, and any unusual characteristics such as outliers.
- We saw from the dotplot that Bernie averaged about 18 home runs a season. Looking at the time series plot, we saw that Bernie's home run hitting peaked in the middle of his career and has declined in recent years.
- Looking at Barry Bonds' walks, we saw that Bonds tends to walk a lot, especially in recent years. Pitchers are genuinely afraid of Bonds' home run talent.
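
For anyone who wants to try these graphs at home, here is a rough Python sketch of a stemplot, a dotplot, and a time series plot drawn as plain text, along with a center and a spread. The season totals in `home_runs` are made up for illustration and are not any real player's numbers, and the helper names (`stemplot`, `dotplot`, `time_series`) are my own, not something we used in class.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical home run totals for 15 consecutive seasons.
# Invented for illustration only; not any real player's record.
home_runs = [4, 7, 11, 14, 17, 24, 20, 27, 23, 28, 25, 18, 16, 21, 13]


def stemplot(values):
    """Return stemplot lines: tens digit is the stem, ones digits are leaves."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    low, high = min(leaves), max(leaves)
    # Include every stem between the lowest and highest so gaps stay visible.
    return [
        f"{stem} | {''.join(str(d) for d in leaves[stem])}"
        for stem in range(low, high + 1)
    ]


def dotplot(values):
    """Return one line per integer on the axis, with a dot per observation."""
    counts = Counter(values)
    return [
        f"{v:>3} {'.' * counts[v]}"
        for v in range(min(values), max(values) + 1)
    ]


def time_series(values):
    """Return one line per season in order, with a bar as long as the value."""
    return [
        f"season {i:>2} {'#' * v} {v}"
        for i, v in enumerate(values, start=1)
    ]


print("\n".join(stemplot(home_runs)))
print()
print("\n".join(dotplot(home_runs)))
print()
print("\n".join(time_series(home_runs)))
print()
print("mean:  ", round(mean(home_runs), 1))              # one measure of center
print("median:", median(home_runs))                      # another measure of center
print("range: ", max(home_runs) - min(home_runs))        # a simple measure of spread
```

The stemplot and dotplot show the shape, center, and spread of the distribution, while the time series plot keeps the seasons in order so you can see where the peak falls in the career.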