It is very important to correct the misuse of statistics regardless of the identity of the perpetrator. Sometimes, it may be even more important to correct well-known individuals because their erroneous statistical argument will have a much more substantial influence than if it had been committed by an average blogger.
One such case is that of Rebecca Watson (who has arguably done more than anyone to highlight important issues related to feminism in the skeptical community) and her analysis of the ages of specific female movie stars and the ages of the men playing their love interests in a selection of their movies. The background to her blog post, entitled "Leading Women Age, Too", is that an article posted on Vulture showed data suggesting that male movie stars increase in age, whereas the ages of the women playing their love interests stay within roughly the same range regardless of the age of the male actor (this is, as we shall see below, erroneous). Someone suggested doing a similar thing for female movie stars who have been making movies for a long time. Since Watson is apparently “a party animal” she “got totally crazy and spent like an hour on IMDB just to satisfy your curiosity”.
The basic idea was to compare the age of the female movie star (Watson picked Meg Ryan, Julia Roberts and Meryl Streep) in different movies with the age of whoever she believed to be the female character's love interest, and to see if there is a difference before and after the actress turns 40. Here is the logic of her statistical analysis:
If you’re interested, Meg’s mean age in this chart is 35.6 and her costar’s mean age is 39.9 (a difference of 4.3 years). Prior to the age of 40, her mean age is 31.8 and her love interest’s mean age is 37.8 (a difference of 6 years).
See the problem? Watson apparently thinks she can just average the age of the female movie star across all movies, then compare it with the average age of the male love interest across all movies. However, this approach is only appropriate for unpaired data; it is statistically inappropriate for paired data.
Simplified, unpaired data is obtained when there is no connection between a specific data point in one group and a specific data point in the other, e.g., one treatment group and one placebo group. Paired data, on the other hand, is what you get when such a connection exists, e.g., when measuring blood pressure in the same patient before and after treatment. Clearly, in the latter case, you do not average the blood pressure for every patient before treatment and compare it with the average for every patient after treatment. Rather, you first calculate the difference before and after treatment for each individual, then average that difference.
Therefore, it is easy to understand that the more appropriate way to analyze the data would be to take the age difference between the female star and the male love interest for each movie, split the movies into those made before and those made after the actress turned 40, then average the differences within each group separately and compare.
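As a rough illustration of the per-movie procedure just described, here is a minimal Python sketch. The age pairs are made up purely for demonstration and are not Watson's figures; the name `gap_summary` and the choice to count age 40 itself in the "40 and over" group are likewise assumptions of this sketch. It computes the age gap for each movie, splits the movies by the actress's age, and reports the mean gap in each group together with its spread.

```python
from statistics import mean, stdev

# Hypothetical (actress_age, costar_age) pairs, one pair per movie.
# These values are invented for illustration; they are NOT Watson's data.
movies = [
    (27, 35),
    (30, 34),
    (33, 41),
    (36, 40),
    (41, 44),
    (44, 45),
    (47, 52),
]


def gap_summary(pairs):
    """Return (mean, sample standard deviation) of the per-movie age gaps."""
    # Keep the pairing: one gap per movie, costar age minus actress age.
    gaps = [costar - actress for actress, costar in pairs]
    return mean(gaps), stdev(gaps)


# Assumption: a movie made at exactly age 40 counts as "40 and over".
before_40 = [pair for pair in movies if pair[0] < 40]
from_40 = [pair for pair in movies if pair[0] >= 40]

mean_before, sd_before = gap_summary(before_40)
mean_after, sd_after = gap_summary(from_40)

print(f"before 40:   mean gap {mean_before:.2f} years "
      f"(sd {sd_before:.2f}, n={len(before_40)})")
print(f"40 and over: mean gap {mean_after:.2f} years "
      f"(sd {sd_after:.2f}, n={len(from_40)})")
```

Working with one gap per movie keeps each actress matched to her actual costar, and it also yields a measure of how much the gap varies from movie to movie, which two separately computed averages cannot provide.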
I pointed this out in a comment on her blog post, where I also noted that the original claim in the Vulture post was false: there was often a moderate correlation between the age of the male movie star and the age of his love interest as the male movie star got older (here is a webcite in case those comments happen to go missing).
Now, a rational person would understand and accept these rebuttals and correct their errors. This is what Phil Plait did after he made the statistical errors I described in a previous post. A commenter by the name of Jack99 then made the following irrelevant objection:
Jack99 clearly misses the point. My objections were that Rebecca Watson treated the data as if it were unpaired when it was, in fact, paired, and that the original Vulture claim that the love interests of male movie stars do not age as the stars themselves age is wrong. Here is my response:
Here is where Rebecca Watson joins the conversation. Does she acknowledge her error and fix it? Not even close:
I then asked why she does not fix it, and here is her stunning reply:
Note how Watson attempts to deflect and sweep her errors under the rug by suggesting that my arguments are ragy and incomprehensible. It is also fascinating to see that Watson apparently thinks that it is better to leave statistical errors uncorrected because she would have to “spend several more hours redoing chart” for something that was only “a joke post” to begin with.
Not only does Rebecca Watson not understand basic statistics, she also does not have sufficient intellectual integrity to correct her errors. Again, compare this with Phil Plait, whom I took to task in a previous post. He corrected every single statistical error he made (which were greater in both number and complexity) within 24 hours of having them pointed out to him.
Phil Plait has a great deal of intellectual integrity and honesty and cares about statistical accuracy. When it came to that issue, he was more or less at the summit of the skeptical ideal. In stark contrast, Rebecca Watson seems to prefer to chill out at the base camp.