P Testing

From Lesswrongwiki

P testing (significance testing with p-values) is used in statistical analysis to judge how surprising an experimental result would be if a null hypothesis were true.

It is most frequently encountered when an article reports that a study met or failed to meet a significance test of p <= 0.05.

A p-value less than 0.05 means that the actual experimental result, or a more extreme one (what "more extreme" means depends on one's choice of the null hypothesis and a few other things), would happen with probability less than 0.05 if the null hypothesis were true. It does not follow that the proposed explanation is better than the explanation that the results were obtained by chance.
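As a rough illustration of the definition above, the following sketch computes a one-sided p-value for a coin-flipping experiment. The scenario (60 heads in 100 flips), the null hypothesis of a fair coin, the choice of "at least this many heads" as the meaning of "more extreme", and the function name one_sided_p_value are all assumptions made for this example and do not come from the article.

```python
from math import comb


def one_sided_p_value(heads: int, flips: int, p_heads: float = 0.5) -> float:
    """Return P(at least `heads` heads in `flips` flips | H0).

    H0 (an assumption of this sketch) says each flip lands heads
    with probability `p_heads`, independently of the others.
    """
    # Sum the binomial probabilities of the observed count and of
    # every more extreme count (here: every larger number of heads).
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


if __name__ == "__main__":
    p = one_sided_p_value(60, 100)
    print(f"P(60 or more heads in 100 flips | fair coin) = {p:.4f}")
    print("meets p <= 0.05:", p <= 0.05)
```

Note that nothing in the calculation refers to any alternative explanation of the result; only the null hypothesis and the data enter into it.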

Note that:

1: The p-value depends on the null hypothesis H0 and the results but it does not depend on the tested explanation (in fact there is no explanation causally linked to the test except "the null hypothesis is true/false").

2: The p-value is equal to P(result or more extreme | H0), which is equal to neither P(H0 | result) nor P(~H0 | result) (and of course not P(an explanation different from H0 | result)), nor is it related to any of them by a unique relation (even if we forget the "or more extreme" part). Another quantity, typically the prior P(H0), is needed to calculate the posterior probability of H0 after observing the result.

3: The sentences "the obtained result is 27% likely due to chance" and "the result is 27% likely to happen by chance" sound similar, but the former is more likely to be understood as "having obtained this very result, we conclude that there is a 27% probability that no mechanism distinct from chance has caused it", while the latter is likely to be understood as "assuming no mechanism distinct from chance is at work, this result is obtained with probability 27%". Since humans often misunderstand analogous probabilistic statements, it is wise to be very careful with formulations.
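
The second note can be made concrete with Bayes' theorem. The sketch below is a simplified illustration, not a standard procedure: it ignores the "or more extreme" part, treats "not H0" as a single alternative hypothesis, and uses made-up likelihoods (0.05 under H0, 0.5 under the alternative). The function name posterior_h0 and all numbers are assumptions chosen for the example.

```python
def posterior_h0(prior_h0: float,
                 p_result_given_h0: float,
                 p_result_given_alt: float) -> float:
    """Return P(H0 | result) via Bayes' theorem.

    Assumes exactly two hypotheses: H0 and one alternative whose
    prior probability is 1 - prior_h0.
    """
    joint_h0 = prior_h0 * p_result_given_h0
    joint_alt = (1 - prior_h0) * p_result_given_alt
    return joint_h0 / (joint_h0 + joint_alt)


if __name__ == "__main__":
    # Same data and same P(result | H0) = 0.05 in every row;
    # only the prior P(H0) changes.
    for prior in (0.1, 0.5, 0.99):
        post = posterior_h0(prior, 0.05, 0.5)
        print(f"prior P(H0) = {prior:.2f} -> P(H0 | result) = {post:.3f}")
```

With these illustrative numbers the posterior probability of H0 ranges from about 0.01 to about 0.91 depending on the prior, even though P(result | H0) is 0.05 throughout. This is why the p-value alone does not determine how likely the null hypothesis is.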