It's true that 'data dredging', as it's sometimes called, is not good practice. You should set out your hypothesis (or hypotheses) in advance and calculate your sample size based on your a priori question (funders, journals and, often, ethics committees insist on this). If you find something interesting in your analyses afterwards, there is nothing to stop you reporting it in a publication, but you should state clearly that it was not your original hypothesis.

As to your second point: statistical significance has conventionally been set at 5%, i.e. you accept that a statistical test will give you a 'false positive' result 5% of the time when the null hypothesis is actually true. This does mean that, as you say, roughly 1 in 20 such tests will give a false positive, so if you start doing multiple tests you increase your chances of rejecting a null hypothesis that is in fact true (a type 1 error).
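To see how quickly the error rate builds up, here's a minimal sketch in Python (the function name is just mine for illustration) of the family-wise error rate for n independent tests, each run at the 5% level:

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    # Probability of at least one false positive across n_tests
    # independent tests, each with significance level alpha:
    # 1 minus the probability that every test correctly fails to reject.
    return 1 - (1 - alpha) ** n_tests

# One test at 5%: a 5% chance of a false positive.
print(f"{family_wise_error_rate(1):.2f}")   # 0.05

# Twenty independent tests: the chance of at least one
# false positive climbs to about 64%.
print(f"{family_wise_error_rate(20):.2f}")  # 0.64
```

This assumes the tests are independent, which they often aren't in practice, but it shows why corrections for multiple comparisons (e.g. Bonferroni) exist.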

In practice, though, there has been a move away from the somewhat arbitrary 5% significance threshold (you should really report the actual p value that your stats package gives you) towards using confidence intervals instead, since an interval tells you about the size and precision of the effect rather than just whether it crossed a cut-off.
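For what it's worth, a 95% confidence interval for a mean is easy to compute by hand. A rough sketch using only the Python standard library and the normal approximation (the data below are made up purely for illustration; for small samples you'd want a t-multiplier rather than 1.96):

```python
import math
import statistics

def normal_ci_95(sample):
    # 95% CI for the mean using the normal approximation:
    # mean +/- 1.96 * standard error of the mean.
    n = len(sample)
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # standard error
    return mean - 1.96 * sem, mean + 1.96 * sem

# Hypothetical measurements, for illustration only.
data = [4.1, 5.2, 6.3, 5.0, 4.8, 5.5, 5.9, 4.7]
lo, hi = normal_ci_95(data)
print(f"mean = {statistics.mean(data):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

If the interval for a difference between groups excludes zero, that corresponds to p < 0.05, but the interval also tells you how big the effect plausibly is.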

Anyway, that was my first post too. Hope it makes sense as well!