Naked Science Forum
General Science => General Science => Topic started by: scientizscht on 13/10/2021 22:55:11
-
Hello!
I am given two different forecasts and at the end of each day, I get the numbers of what actually happened.
What analysis can I use to compare these two forecasts on how well they predict?
I was thinking regression or ANOVA but I don't think they will work because the mean variance will be zero if day by day, the forecast predicts +20% then -20% then +20% etc.
I need to know how many times each forecast is off and by how much on average.
Next, I need to combine the forecasts with a formula to improve the predictions, is that possible?
Thanks!
-
A weather forecast is a many-splendored thing (or multi-dimensional, if you prefer).
Checking its accuracy is also a multi-dimensional problem.
You would need to compare the accuracy:
- At every point in the area of the forecast
- For temperature
- For air pressure
- For wind speed
- For precipitation
- For cloud cover (if sunny days are important to you)
And since forecasts necessarily cover a period of time, you may need to assess the accuracy of a single forecast at multiple lead times (e.g. 1 day, 2 days, 4 days, 1 week, etc). You would expect the accuracy to degrade the further into the future you predict.
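To illustrate the lead-time point, here is a rough sketch that groups forecast errors by how far ahead the forecast was made and reports the mean absolute error for each lead time. The temperature values and the `mae_by_lead_time` helper are made up for illustration and do not come from any real forecast.

```python
from collections import defaultdict

# Hypothetical records: (lead_time_days, forecast_temperature, observed_temperature)
records = [
    (1, 20.0, 21.0),
    (1, 18.0, 18.5),
    (2, 22.0, 20.0),
    (2, 17.0, 19.0),
    (4, 25.0, 21.0),
    (4, 15.0, 19.0),
]

def mae_by_lead_time(rows):
    """Return {lead_time: mean absolute error} for the given rows."""
    errors = defaultdict(list)
    for lead, forecast, observed in rows:
        errors[lead].append(abs(forecast - observed))
    return {lead: sum(errs) / len(errs) for lead, errs in sorted(errors.items())}

result = mae_by_lead_time(records)
for lead, mae in result.items():
    print(f"{lead}-day lead: MAE = {mae:.2f}")
```

The same grouping could be repeated per location and per quantity (temperature, pressure, and so on) if those dimensions matter.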
-
Each forecast should be compared separately against the actual values, giving an error for each one.
The acceptable error should be stated.
ANOVA assumes independent random samples, which might not fit here, because the forecasts are produced from training data rather than from random sampling.
1. Value verification (multidimensional, as mentioned by @evan_au (https://www.thenakedscientists.com/forum/index.php?action=profile;u=26889)) might be done with RMSE (Root Mean Squared Error) for each predicted value.
2. Dichotomous (yes/no) forecasts.
3. "In machine learning applications where logistic regression is used for binary classification, the MLE minimises the cross-entropy loss function." - Wikipedia (https://en.wikipedia.org/wiki/Logistic_regression).
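For the yes/no case in items 2 and 3, a minimal sketch might look like the following. It assumes the forecast gives a probability for the event (say, rain) and that the outcome is recorded as 1 or 0. The data, the 0.5 threshold, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical data: forecast probability of the event, and whether it happened (1) or not (0)
probabilities = [0.9, 0.2, 0.7, 0.1]
outcomes = [1, 0, 0, 0]

def accuracy(probs, actual, threshold=0.5):
    """Fraction of days where the yes/no call (prob >= threshold) matched the outcome."""
    calls = [1 if p >= threshold else 0 for p in probs]
    return sum(c == a for c, a in zip(calls, actual)) / len(actual)

def cross_entropy(probs, actual, eps=1e-12):
    """Mean cross-entropy (log loss); lower is better. eps guards against log(0)."""
    total = 0.0
    for p, a in zip(probs, actual):
        p = min(max(p, eps), 1 - eps)  # clip so the logarithm stays finite
        total += -(a * math.log(p) + (1 - a) * math.log(1 - p))
    return total / len(actual)

acc = accuracy(probabilities, outcomes)
loss = cross_entropy(probabilities, outcomes)
print(f"accuracy = {acc:.2f}, cross-entropy = {loss:.4f}")
```

Accuracy only counts right and wrong calls, while cross-entropy also penalises a forecast for being confidently wrong, so the two can rank forecasts differently.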
Combining the forecasts with a formula does not look possible, assuming the forecasts were produced by different models.
A model can be improved by running the prediction on already known values and checking the variance.
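As a rough sketch of item 1 and the "acceptable error" idea, the snippet below scores two forecasts against the same known values. It reports RMSE, the mean absolute error (how far off on average), and how many days each forecast missed by more than a chosen tolerance. All numbers, the tolerance of 2, and the function names are placeholders for illustration.

```python
import math

# Hypothetical daily values; replace with the real series
actual     = [100, 102,  98, 105, 110]
forecast_a = [101, 100,  99, 108, 107]
forecast_b = [ 95, 104, 100, 100, 112]

def rmse(forecast, observed):
    """Root mean squared error: penalises large misses more heavily."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))

def mae(forecast, observed):
    """Mean absolute error: the average size of the miss."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)

def count_off(forecast, observed, tolerance):
    """Number of days where the miss exceeds the acceptable error."""
    return sum(abs(f - o) > tolerance for f, o in zip(forecast, observed))

TOLERANCE = 2  # assumed acceptable error, in the same units as the data

for name, forecast in (("A", forecast_a), ("B", forecast_b)):
    print(
        f"Forecast {name}: RMSE = {rmse(forecast, actual):.2f}, "
        f"MAE = {mae(forecast, actual):.2f}, "
        f"days off by more than {TOLERANCE}: {count_off(forecast, actual, TOLERANCE)}"
    )
```

Because the errors are squared or taken as absolute values before averaging, a forecast that alternates between +20% and -20% does not average out to zero error.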
-
"the mean variance will be zero if day by day, the forecast predicts +20% then -20% then +20% etc."
Do you know what a variance is?
If your data is 80,120,80,120... and so on, then the mean is 100, the standard deviation is about 20 and the variance is about 400.
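That arithmetic can be checked with Python's standard `statistics` module. The sketch uses the population formulas (`pstdev`, `pvariance`) on one short stretch of the alternating series.

```python
import statistics

# A short stretch of the alternating series from the post
data = [80, 120, 80, 120]

mean = statistics.mean(data)           # 100
std_dev = statistics.pstdev(data)      # population standard deviation: 20.0
variance = statistics.pvariance(data)  # population variance: 400

print(mean, std_dev, variance)
```

Every value sits 20 away from the mean, so the squared deviation is always 400. The sample versions (`statistics.stdev`, `statistics.variance`) come out a little higher on a short series like this one and approach the same values as the series gets longer.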