# Computing statistical significance of HPV vaccine effectiveness data in the ATHENA baseline HPV study

Table 3, first two rows (age group 21-24), of "The ATHENA human papillomavirus study: design, methods, and baseline results" (PubMed 21944226, full text) gives some results for HPV vaccine effectiveness, but it gives no confidence interval and does not discuss statistical significance.

Yet some people are arguing, based on this data, that the HPV vaccine is not effective. So let's see if we can tell whether these results are statistically significant.

I'm terrible at statistics, but BMJ's "Statistics at Square One" has some handy tips about chi-squared tests. Let's try them out.

First, construct a null hypothesis:

Null hypothesis: HPV16 prevalence is the same in vaccinated and unvaccinated women (that is, vaccination does not reduce prevalence of HPV16)

Then, organize the results as described in the section "Fourfold tables":
```
                HPV16+   HPV16-         total
vaccinated      58       (720-58)=662   720
unvaccinated    428      4493           (428+4493)=4921
total           486      5155           5641
```
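A quick way to catch arithmetic slips in a fourfold table is to recompute its margins from the four cells. Here is a minimal sketch of that check; the helper name `margins` is made up for illustration, and the cell counts are the ones plugged into the chi-squared formula in this write-up.

```python
def margins(a, b, c, d):
    """Return (row totals, column totals, grand total) of a fourfold table.

    a, b are the first row (here: vaccinated HPV16+, vaccinated HPV16-),
    c, d are the second row (here: unvaccinated HPV16+, unvaccinated HPV16-).
    """
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return rows, cols, a + b + c + d


# Cell counts as used in the chi-squared calculation in this write-up.
rows, cols, total = margins(58, 662, 428, 4493)
print("row totals:   ", rows)
print("column totals:", cols)
print("grand total:  ", total)

# Share of each group that tested HPV16+.
print("vaccinated HPV16+ share:   %.4f" % (58 / rows[0]))
print("unvaccinated HPV16+ share: %.4f" % (428 / rows[1]))
```

The two shares are the quantities the null hypothesis says should be equal, apart from sampling noise.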
Then apply the formula (here, I use Python to do it, but an online chi-squared calculator works great, too),
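```
# Cell counts of the fourfold table (floats, so division is never truncated)
a = 58.0    # row 1 (vaccinated), HPV16+
b = 662.0   # row 1 (vaccinated), HPV16-
c = 428.0   # row 2 (unvaccinated), HPV16+
d = 4493.0  # row 2 (unvaccinated), HPV16-
# Shortcut chi-squared formula for a fourfold table
chi2 = (a*d - b*c)**2.0 * (a + b + c + d) / ((a+b)*(c+d)*(b+d)*(a+c))
print(chi2)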
```
I get a chi^2 of about 0.33.
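As a cross-check, the shortcut formula should agree with the textbook definition of chi-squared: the sum over all four cells of (observed - expected)^2 / expected, where each expected count is row total times column total divided by the grand total. The sketch below compares the two, assuming the same four cell counts as above; the function names are mine, not from the BMJ text.

```python
def chi2_shortcut(a, b, c, d):
    """Chi-squared for a fourfold table via the shortcut formula."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (b + d) * (a + c))


def chi2_from_expected(a, b, c, d):
    """Chi-squared as the sum of (observed - expected)^2 / expected."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    total = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if the null hypothesis holds.
            expected = row_totals[i] * col_totals[j] / n
            total += (observed[i][j] - expected) ** 2 / expected
    return total


shortcut = chi2_shortcut(58, 662, 428, 4493)
from_expected = chi2_from_expected(58, 662, 428, 4493)
print("shortcut formula:     %.4f" % shortcut)
print("sum over four cells:  %.4f" % from_expected)
```

Neither version applies a continuity correction, so both should print the same value up to floating-point rounding.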

Plugging chi^2 = 0.33 and degrees of freedom = 1 into a P-value table gives

P = 0.5657

and thus the results are not statistically significant. (Rather far from it; by the usual convention, P has to be under 0.05 to count as significant at all, and under 0.01 to count as strongly significant.)
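The table lookup can also be done in code. For one degree of freedom, the upper-tail probability of chi-squared has a closed form in terms of the complementary error function, P = erfc(sqrt(chi^2 / 2)), so the standard library is enough. This is a minimal sketch that only covers the df = 1 case; the function name `p_value_df1` is made up for illustration.

```python
import math


def p_value_df1(chi2):
    """Upper-tail P-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# The rounded chi-squared value from the calculation above.
p = p_value_df1(0.33)
print("P = %.4f" % p)

# Compare against the conventional thresholds mentioned above.
print("below 0.05:", p < 0.05)
print("below 0.01:", p < 0.01)
```

For tables with more than one degree of freedom this shortcut does not apply, and a statistics library or a printed table is the easier route.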