College football is coming to a close in the final season of the BCS. While watching Ohio State and Clemson play in the Orange Bowl, I realized that there are a lot of people out there who understand football but struggle with the concepts and definitions of statistics. I figured I could work through a simple problem and define some terms along the way.
A common belief among college football fans is that the teams from the Southeastern Conference (SEC) are better at football than teams from other conferences. Using mathematics and statistics, we can check whether this belief holds up. To test it mathematically, we first have to define what exactly the belief is. If the goal were to do rigorous mathematics, I could build a model of what it means for the SEC to be “better” and do a complex analysis; however, since this is simply a blog, I will limit the analysis to attempting to answer the question “Are SEC teams better at winning Bowl Championship Series (BCS) bowl games than Big 10 teams?”
First, we need to decide what our populations and samples are. Here I am going to make a simplifying assumption and treat both the SEC and Big 10 as homogeneous entities. That is, each conference has a “skill level” that is unknown to us: some inherent percentage of BCS bowls that it would win if these bowls were played over and over indefinitely. Then we have our samples, the observed records of SEC and Big 10 teams in BCS bowl games. The observed records (through the games of Jan. 3, 2014) are SEC 17-9 and Big 10 13-15.
Since we have populations and observations, it is possible to set up a hypothesis and test it. In testing a hypothesis, there must be a proposed “null” hypothesis and an “alternative” hypothesis. I believe many elementary texts make the mistake of placing too much emphasis on the null hypothesis when generally the alternative hypothesis is more interesting and supported by the data. In this case the null hypothesis will be that the Big 10 and SEC conferences have an equal chance of winning BCS bowl games and the alternative will be that the SEC has a greater chance of winning BCS bowl games.
This is a point where students often become confused. We have the records; can’t we look at the records and see that the SEC wins BCS bowl games at a higher rate than the Big 10? The short answer is no, we can’t. Even if the SEC and Big 10 are equally good at winning bowl games, there is some chance of observing a result like the one we have. This chance (or probability) is the p-value; more precisely, the p-value is the probability of observing a result at least as extreme as the one we have if the null hypothesis is true.
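To make that idea concrete, here is a rough simulation sketch in Python. It is not the method used in the rest of this post (that is the Z-table approach), and it rests on an assumption of my own: that both conferences share a single win probability equal to their combined win rate. The function name, the number of trials, and the random seed are all illustrative choices. It simply replays the bowl games many times and counts how often the SEC comes out at least as far ahead as it actually did.

```python
import random

def simulate_p_value(wins1, games1, wins2, games2, trials=20000, seed=2014):
    """Estimate how often two equally skilled conferences would show a
    win-rate gap at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed_gap = wins1 / games1 - wins2 / games2
    # Assumption: under the null, both share the combined (pooled) win rate.
    shared_rate = (wins1 + wins2) / (games1 + games2)
    hits = 0
    for _ in range(trials):
        # Replay every bowl game for each conference with the shared rate.
        sim1 = sum(rng.random() < shared_rate for _ in range(games1))
        sim2 = sum(rng.random() < shared_rate for _ in range(games2))
        if sim1 / games1 - sim2 / games2 >= observed_gap:
            hits += 1
    return hits / trials

# Records from the post: SEC 17-9 (26 games), Big 10 13-15 (28 games).
estimate = simulate_p_value(17, 26, 13, 28)
print(f"Simulated chance of a gap at least this large: {estimate:.3f}")
```

Because this is a simulation with a different setup, the number it prints should land in the same neighborhood as a table-based p-value but will not match it exactly.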
This leads to the obvious question of “How can we find this p-value?” The most straightforward way to approach this is to convert the win-loss records into percentages. For students who have studied statistics, I will define a couple of variables here to make this look more like what you may have done in your textbook.
Population 1: SEC football teams playing in BCS bowls
Population 2: Big 10 football teams playing in BCS bowls
p1 = proportion of SEC football teams playing in BCS bowls that win the game
p2 = proportion of Big 10 football teams playing in BCS bowls that win the game
Now we want to perform the hypothesis test:
H0 : p1 ≤ p2
H1 : p1 > p2
A quick note here is that many introductory texts will write the null hypothesis as p1 = p2, but including the less than or equal to is formally correct as the null and alternative hypotheses should include all possible outcomes.
If we knew p1 and p2 this would be a trivial problem, but we don’t; all we have are the observations. We can use the observations to estimate the parameters p1 and p2. Note that parameters always belong to the population. In real world problems, these are rarely if ever known and we must estimate them from the data.
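Written out from the records given earlier (SEC 17-9 in 26 games, Big 10 13-15 in 28 games), the estimates would be as follows. The hat notation is my own labeling here, a common textbook convention for "estimated from the sample."

```latex
\hat{p}_1 = \frac{17}{26} \approx 0.654, \qquad \hat{p}_2 = \frac{13}{28} \approx 0.464
```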
These are the proportions of BCS bowl games played that each conference has won. For the purposes of this blog, wins that were later vacated are counted as wins. At this point I am going to make another simplifying assumption: that SEC teams winning BCS bowl games is independent of Big 10 teams winning BCS bowl games. This is clearly not correct, as the two conferences sometimes play each other, but this is not published research; it is simply a blog showing an example of what we can calculate.
Now we are ready to find the ever popular test statistic.
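As a sketch of the calculation: one standard form of the two-proportion z statistic uses the unpooled standard error, and I am assuming that form here because it reproduces the z = 1.43 used in the rest of this post. The symbols are my own labels: p̂1 = 17/26 and p̂2 = 13/28 are the observed win proportions, and n1 = 26 and n2 = 28 are the number of BCS bowl games each conference has played.

```latex
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}
  = \frac{0.654 - 0.464}
         {\sqrt{\dfrac{(0.654)(0.346)}{26} + \dfrac{(0.464)(0.536)}{28}}}
  \approx \frac{0.190}{0.133} \approx 1.43
```

A pooled standard error, which some textbooks use for this test, would give a slightly smaller value of about 1.40.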
What this test statistic allows us to do is look up how likely a result like ours would be if the null hypothesis is true. That is, we look on the Z-table and see that the probability corresponding to z = 1.43 is 0.9236. This table value is the probability that a result would be less extreme than what we observed; however, a p-value is the probability of observing an outcome at least as extreme as the observation, so the p-value is found as 1 - 0.9236, or 0.0764.
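If you would rather not dig out a Z-table, the same lookup can be sketched in a few lines of Python using only the standard library. The function names are my own, and the unpooled standard error is an assumption on my part, chosen because it matches the z = 1.43 quoted above. Rounding z to two decimals mimics reading a printed table.

```python
import math

def two_proportion_z(wins1, games1, wins2, games2):
    """z statistic for the difference of two sample proportions.
    Assumption: unpooled standard error, treating the samples as independent."""
    p1_hat = wins1 / games1
    p2_hat = wins2 / games2
    std_err = math.sqrt(p1_hat * (1 - p1_hat) / games1
                        + p2_hat * (1 - p2_hat) / games2)
    return (p1_hat - p2_hat) / std_err

def upper_tail(z):
    """P(Z >= z) for a standard normal variable, via the complementary
    error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Records from the post: SEC 17-9 (26 games), Big 10 13-15 (28 games).
z = two_proportion_z(17, 26, 13, 28)
z_table = round(z, 2)               # a printed Z-table is indexed to 2 decimals
table_value = 1 - upper_tail(z_table)  # what the table shows for that z
p_value = upper_tail(z_table)          # one-sided p-value for H1: p1 > p2

print(f"z = {z_table:.2f}")
print(f"table value = {table_value:.4f}")
print(f"p-value = {p_value:.4f}")
```

Run on the records above, this should agree with the hand calculation to the four decimal places shown in the text.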
Now, what does this p-value of 0.0764 mean? It means we don’t know. Typically, when the p-value is less than 0.05 we reject the null hypothesis, because a result this extreme would occur less than 5% of the time if the null hypothesis were true; when the p-value is greater than 0.1 we do not reject the null hypothesis, because there would be a greater than 10% chance of incorrectly rejecting a true null hypothesis. When it is between 0.05 and 0.1, we cannot be certain how to proceed. Someone once told me (I think he was joking) that when the p-value is between 0.05 and 0.1 you should consider who is paying for the study and draw the corresponding conclusions. The only conclusion I will draw here is that, if the Big 10 and SEC are equally skilled at winning bowl games, there is a 7.6% chance that BCS-era results at least this favorable to the SEC would occur by random chance.
The three important statistics terms used above (population, sample, and p-value) are defined below.
Population: A population is a large set of objects of similar nature.
Sample: A sample is a subset or some portion of a population. A sample is chosen to make inferences about a population.
P-value: A p-value is the probability of obtaining an observation at least as extreme as your present observation if the null hypothesis is true. That is, it is the chance that a result like this would arise from random chance alone if the null hypothesis held.