The Naked Scientists

The Naked Scientists Forum

Author Topic: Statistical tests  (Read 3401 times)

Offline jonjo

  • First timers
  • *
  • Posts: 1
Statistical tests
« on: 28/03/2005 19:55:04 »
Hi everyone, this is my first post here! I hope this is the right place for this topic.

Something has been bugging me about statistical tests. I have been reading that when you analyse a data set you should state upfront what questions you want to answer. OK so far. But if you later find you would like to look at some other aspect of the data, I've read that this is a no-no. The reasoning, I think, is that if you do multiple tests at the 5% significance level, then on average 1 in 20 of the tests will lead you to declare significance falsely.
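
To check I've understood that arithmetic, I tried a quick simulation (a sketch in Python with numpy and scipy; the 20 tests and samples of size 30 are just arbitrary choices I made up):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, alpha = 20, 0.05

# Every "experiment" is pure noise, so the null hypothesis is true each time
false_hits = 0
for _ in range(n_tests):
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:
        false_hits += 1

print(false_hits, "of", n_tests, "tests on pure noise came out 'significant'")

Run it a few times and, on average, about 1 of the 20 comes up 'significant'.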

My point is: if I don't look, why should that make me right? What if I had chosen the second question the first time round? I hope this makes sense.
jonjo


 

Offline gsmollin

  • Hero Member
  • *****
  • Posts: 749
Re: Statistical tests
« Reply #1 on: 28/03/2005 22:03:32 »
I think there is some misunderstanding here. I can't imagine how computing a different statistic on a random variable would change the random variable. It just makes no sense.

For example, suppose you have a random variable A and you decide to compute its mean and variance. Later you decide to compute its third moment. Are you saying that would change the first two answers? I think not.

I can only guess that there are some other unstated conditions involved here. In general, there is an unlimited number of statistics you could compute on a random variable without affecting the variable. There is no kind of "uncertainty principle" involved, except perhaps at the measurement level.
 

Offline enidjane

  • First timers
  • *
  • Posts: 3
Re: Statistical tests
« Reply #2 on: 29/03/2005 22:15:38 »
The essence of the scientific method is that a scientist forms a hypothesis, collects data to investigate it, and then runs statistical tests to see whether the hypothesized relationship holds.

It is true, jonjo, that if you just keep running estimation after estimation after estimation, you will eventually get SOMETHING to come out significant. This violates the scientific method and is considered bad science. Statistical estimation is meant to provide empirical support for theoretical predictions.
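
To see how quickly 'eventually' arrives, here is a little sketch (Python; it assumes each estimation is an independent test on fresh, pure noise):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, tries = 0.05, 0

# Keep testing fresh noise until a test happens to come back significant
while True:
    tries += 1
    sample = rng.normal(size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:
        break

print("got a 'significant' result on try", tries)

On average it takes about 1/0.05 = 20 tries, and that spurious 'finding' looks exactly like a real one.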
 

Offline chimera

  • Sr. Member
  • ****
  • Posts: 475
Re: Statistical tests
« Reply #3 on: 30/03/2005 15:32:52 »
quote:
Originally posted by enidjane

It is true, jonjo, that if you just keep running estimation after estimation after estimation, you will eventually get SOMETHING to come out significant. This violates the scientific method and is considered bad science. Statistical estimation is meant to provide empirical support for theoretical predictions.

Yet most real scientific discoveries are of exactly this anathematized nature. In retrospect, and partly for PR reasons, this is largely glossed over. Got to keep up appearances... :)
 

Offline Calum

  • First timers
  • *
  • Posts: 3
Re: Statistical tests
« Reply #4 on: 02/04/2005 14:56:23 »
It's true that 'data dredging', as it's sometimes called, is not good practice. You should set out your hypothesis (or hypotheses) and calculate your sample size based on your a priori question (funders, journals and, often, ethics committees insist on this). If you find something interesting in your analyses afterwards, there is nothing to stop you reporting it in any publication, but you should clearly state that it was not your original hypothesis.
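
For the sample size part, a common back-of-the-envelope calculation uses the normal approximation (a Python sketch; the effect size d = 0.5, 80% power and 5% significance level are just illustrative assumptions):

code:
from scipy import stats

# n per group ~= 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2
alpha, power, d = 0.05, 0.80, 0.5  # d is the assumed standardised effect size

z_alpha = stats.norm.ppf(1 - alpha / 2)
z_power = stats.norm.ppf(power)
n_per_group = 2 * (z_alpha + z_power) ** 2 / d ** 2

print("roughly", round(n_per_group), "subjects per group")  # about 63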

As to your second point: statistical significance has conventionally been set at 5%, i.e. you accept a 5% probability that a statistical test gives you a 'false positive' result when the null hypothesis is true. This does mean that, as you say, on average 1 in 20 such tests will give a false result, so if you start doing multiple tests you increase your chances of rejecting a null hypothesis that is in fact true (a type 1 error).
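
To put numbers on that (a quick sketch; it assumes the tests are independent):

code:
# Chance of at least one false positive among k independent tests at the 5% level
alpha = 0.05
for k in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** k
    print(k, "tests -> chance of at least one false positive:", round(fwer, 3))

So with 20 independent tests you have roughly a 64% chance of at least one spurious 'significant' result.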

In practice, though, there has been a move away from the somewhat arbitrary 5% significance threshold (you should really report the actual p value that your stats package gives you) and towards using confidence intervals instead.
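
For example, rather than just declaring 'p < 0.05', you would report the exact p value and a confidence interval, something like this (a Python sketch on made-up data):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.4, scale=1.0, size=30)  # made-up data for illustration

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)

# 95% confidence interval for the mean, from the t distribution
mean = sample.mean()
half_width = stats.t.ppf(0.975, df=len(sample) - 1) * stats.sem(sample)

print("p =", round(p_value, 4))
print("95% CI for the mean:", (round(mean - half_width, 3), round(mean + half_width, 3)))

The interval tells the reader how big the effect plausibly is, not just whether it cleared an arbitrary threshold.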

Anyway, that was my first post too. Hope it makes sense as well!
 

Offline genegenie

  • Full Member
  • ***
  • Posts: 85
Re: Statistical tests
« Reply #5 on: 02/04/2005 15:52:50 »
You can also adjust your alpha level to account for multiple testing. One common method is the Bonferroni correction: you simply divide your alpha level by the number of tests you are doing.
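
For example, with four tests the corrected level is 0.05 / 4 = 0.0125 (a quick sketch; the p values are made up):

code:
# Bonferroni: judge each of m tests at alpha / m rather than alpha
alpha = 0.05
p_values = [0.003, 0.012, 0.021, 0.040]  # made-up p values from m = 4 tests
corrected = alpha / len(p_values)  # 0.0125

for p in p_values:
    verdict = "significant" if p < corrected else "not significant"
    print(p, "->", verdict)

This keeps the overall chance of any false positive at or below 5%, at the cost of some power.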

But I agree with Calum, best to report your p value.


Science is organized knowledge. Wisdom is organized life.
 
