The Naked Scientists

The Naked Scientists Forum

Author Topic: Statistical tests  (Read 3401 times)

Offline jonjo

  • First timers
  • *
  • Posts: 1
Statistical tests
« on: 28/03/2005 19:55:04 »
Hi everyone, this is my first post here! I hope this is the right place for this topic.

Something has been bugging me about statistical tests. I have been reading that when you analyse a data set you should state upfront what questions you want to answer. OK so far. But if you later find you would like to look at some other aspect of the data, I've read that this is a no-no. The reasoning, I think, is that if you do multiple tests at the 5% significance level, then on average 1 in 20 of the tests will lead you to declare significance falsely.
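
To check I've understood that arithmetic, I tried a quick simulation (a sketch in Python with numpy and scipy; the 20 tests and samples of size 30 are just arbitrary choices I made up):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, alpha = 20, 0.05

# Every "experiment" is pure noise, so the null hypothesis is true each time
false_hits = 0
for _ in range(n_tests):
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:
        false_hits += 1

print(false_hits, "of", n_tests, "tests on pure noise came out 'significant'")

Run it a few times and, on average, about 1 of the 20 comes up 'significant'.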

My point is: if I don't look, why should that make me right? What if I had chosen the second question the first time round? I hope this makes sense.
jonjo


 

Offline gsmollin

  • Hero Member
  • *****
  • Posts: 749
Re: Statistical tests
« Reply #1 on: 28/03/2005 22:03:32 »
I think there is some misunderstanding here. I can't imagine how computing a different statistic on a random variable would change the random variable. It just makes no sense.

For example, suppose you have a random variable A and you decide to compute its mean and variance. Later you decide to compute its third moment. Are you saying that would change the first two answers? I think not.

I can only guess that there are some other unstated conditions involved here. In general, there is an unlimited number of statistics you could compute on a random variable without affecting the variable. There is no kind of "uncertainty principle" involved, except perhaps at the measurement level.
 

Offline enidjane

  • First timers
  • *
  • Posts: 3
Re: Statistical tests
« Reply #2 on: 29/03/2005 22:15:38 »
The essence of the scientific method is that a scientist forms a hypothesis, collects data to investigate it, and then runs statistical tests to see whether the hypothesized relationship holds.

It is true, jonjo, that if you just keep running estimation after estimation after estimation, you will eventually get SOMETHING to come out significant. This violates the scientific method and is considered bad science. Statistical estimation is meant to provide empirical support for theoretical predictions.
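
To see how quickly 'eventually' arrives, here is a little sketch (Python; it assumes each estimation is an independent test on fresh, pure noise):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, tries = 0.05, 0

# Keep testing fresh noise until a test happens to come back significant
while True:
    tries += 1
    sample = rng.normal(size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:
        break

print("got a 'significant' result on try", tries)

On average it takes about 1/0.05 = 20 tries, and that spurious 'finding' looks exactly like a real one.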
 

Offline chimera

  • Sr. Member
  • ****
  • Posts: 475
Re: Statistical tests
« Reply #3 on: 30/03/2005 15:32:52 »
quote:
Originally posted by enidjane

It is true, jonjo, that if you just keep running estimation after estimation after estimation, you will eventually get SOMETHING to come out significant. This violates the scientific method and is considered bad science. Statistical estimation is meant to provide empirical support for theoretical predictions.

Yet most real scientific discoveries are of exactly this anathematized nature. In retrospect, and partly for PR reasons, this is largely glossed over. Got to keep up appearances... :)
 

Offline Calum

  • First timers
  • *
  • Posts: 3
Re: Statistical tests
« Reply #4 on: 02/04/2005 14:56:23 »
It's true that 'data dredging', as it's sometimes called, is not good practice. You should set out your hypothesis (or hypotheses) and calculate your sample size based on your a priori question (funders, journals and, often, ethics committees insist on this). If you find something interesting in your analyses afterwards, there is nothing to stop you reporting it in any publication, but you should clearly state that it was not your original hypothesis.
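
For the sample size part, a common back-of-the-envelope calculation uses the normal approximation (a Python sketch; the effect size d = 0.5, 80% power and 5% significance level are just illustrative assumptions):

code:
from scipy import stats

# n per group ~= 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2
alpha, power, d = 0.05, 0.80, 0.5  # d is the assumed standardised effect size

z_alpha = stats.norm.ppf(1 - alpha / 2)
z_power = stats.norm.ppf(power)
n_per_group = 2 * (z_alpha + z_power) ** 2 / d ** 2

print("roughly", round(n_per_group), "subjects per group")  # about 63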

As to your second point: statistical significance has conventionally been set at 5%, i.e. you accept a 5% probability that a statistical test gives you a 'false positive' result when the null hypothesis is true. This does mean that, as you say, on average 1 in 20 such tests will give a false result, so if you start doing multiple tests you increase your chances of rejecting a null hypothesis that is in fact true (a type 1 error).
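
To put numbers on that (a quick sketch; it assumes the tests are independent):

code:
# Chance of at least one false positive among k independent tests at the 5% level
alpha = 0.05
for k in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** k
    print(k, "tests -> chance of at least one false positive:", round(fwer, 3))

So with 20 independent tests you have roughly a 64% chance of at least one spurious 'significant' result.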

In practice, though, there has been a move away from the somewhat arbitrary 5% significance threshold (you should really report the actual p value that your stats package gives you) and towards using confidence intervals instead.
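
For example, rather than just declaring 'p < 0.05', you would report the exact p value and a confidence interval, something like this (a Python sketch on made-up data):

code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.4, scale=1.0, size=30)  # made-up data for illustration

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)

# 95% confidence interval for the mean, from the t distribution
mean = sample.mean()
half_width = stats.t.ppf(0.975, df=len(sample) - 1) * stats.sem(sample)

print("p =", round(p_value, 4))
print("95% CI for the mean:", (round(mean - half_width, 3), round(mean + half_width, 3)))

The interval tells the reader how big the effect plausibly is, not just whether it cleared an arbitrary threshold.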

Anyway, that was my first post too. Hope it makes sense as well!
 

Offline genegenie

  • Full Member
  • ***
  • Posts: 85
Re: Statistical tests
« Reply #5 on: 02/04/2005 15:52:50 »
You can also adjust your alpha level to account for multiple testing. One common method is the Bonferroni correction: you simply divide your alpha level by the number of tests you are doing.
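
For example, with four tests the corrected level is 0.05 / 4 = 0.0125 (a quick sketch; the p values are made up):

code:
# Bonferroni: judge each of m tests at alpha / m rather than alpha
alpha = 0.05
p_values = [0.003, 0.012, 0.021, 0.040]  # made-up p values from m = 4 tests
corrected = alpha / len(p_values)  # 0.0125

for p in p_values:
    verdict = "significant" if p < corrected else "not significant"
    print(p, "->", verdict)

This keeps the overall chance of any false positive at or below 5%, at the cost of some power.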

But I agree with Calum, best to report your p value.


Science is organized knowledge. Wisdom is organized life.
 
