Naked Science Forum

Life Sciences => The Environment => Topic started by: Marius_92 on 11/11/2020 09:32:56

Title: How can we make average photovoltaic data look more realistic?
Post by: Marius_92 on 11/11/2020 09:32:56
Hello everyone,

We are building a tool to estimate the costs that can be saved by using a specific optimization tool for integrated power management. For this we scrape photovoltaic (PV) power data from a website and compare it with real data from the sensors of our own PV plant at the same location. In the attached plot, you can see both curves. The orange curve shows the data from the website, the blue curve the data from our plant. The orange curve is the same for every day, since the website only provides monthly average data, which we resampled to days. The differences between the curves can be explained by several factors, such as cloudiness, dust, precipitation, humidity, dirt on the panels, and others.

Our question is: How can we make the average data (orange curve) look more realistic, i.e. more like the blue curve? Our idea is to use a neural network, but we are not sure whether this is a good idea. We want to first calculate the monthly averages of the data of the blue curve. Then we want to train the network on these averages and the actual values. After training, it should be capable of deriving more realistic values from the average data represented by the orange curve.

Do you think this can work? We are not experienced with neural networks, and this is just a first idea. Do you have other ideas for how we can achieve our goal? Do we need to know the influencing factors mentioned above (cloudiness etc.) anyway? Or will the neural network approach work even without knowing the actual reasons for the differences between the curves? Suppose we could identify the most influential factors: could we then complete our task without neural networks? How could this be done? Unfortunately we are not meteorological experts, but maybe there are some here?

Thank you in advance for your help!

Kind Regards

Marius
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: alancalverd on 11/11/2020 15:00:58
You need to know more about the supposed "monthly average".

Was it measured above the cloud line? In the UK, for instance, there is very little cloud above 5000 ft (1500 m) most days, but very little sunshine on the ground (most of the country is below 1000 ft) with continuous cloud cover about half the time.

Collector and transfer efficiency is important. You have plotted "power" on the vertical axis, but the power delivered from a PV system depends on how it is matched to the load. It is possible that the "average" graph came from a small solar follower system feeding a constant load and your installation has fixed geometry and/or variable load.

However, the day-to-day variation of the blue curve looks entirely realistic, with a warm front (gradually thickening cloud) approaching in the last three days. So you can simply integrate the area under the blue curve over any given month to produce a daily average for that month and year, and repeat over several years to get a meaningful monthly prediction based on your local microclimate.

What you have demonstrated is the fundamental weakness of all "zero impact" renewables: the absence of day-to-day predictability of supply, which therefore requires something like 5 days' storage capacity (in temperate latitudes) if you are to use it as a replacement for fossil or nuclear fuel. If you set up shop above the arctic circle, of course, you need 6 months' storage capacity!
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: evan_au on 11/11/2020 20:42:58
Quote from: Marius_92
How can we make the average data (orange curve) look more realistic resp. more like the blue curve?
It's a matter of information.

When you take a monthly average (orange curve), you throw away the day-to-day variation within the month (while retaining the month-to-month variation). You have thrown away information.
- If their data is an aggregate of solar panels spread over 100 miles radius, you have discarded even more information
- But the orange figure is so perfect that they may have just fitted a half-sine curve to the actual measured data (hence throwing away almost all information).

The blue curve retains the information of weather fronts (as Alan identified) and even individual clouds passing over your solar panels (sharp dips) or even gaps in the clouds (sharp spikes).

What you are asking is to restore the discarded information to the monthly average.
- When you don't have the discarded information for their test site(s)
- So you can't do it with real information, but you could "make up" something that looks like it has more realistic minute-by-minute and day-by-day variation
- But why bother? You can't validly claim that this is official data from source X, since you monkeyed with it.

What you can validly do is to discard an equal amount of information from your data collection, and then compare the two.
- Take a monthly average of your data, and see how it looks.
- Note that if your data is just for a single site, you will see the effect of individual clouds, but if their data is an aggregate of solar panels spread over 100 miles radius, you won't get the degree of variation you see with a single solar panel.
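
The averaging step above could be sketched in Python roughly as follows. This is only a minimal sketch: it assumes the plant data is available as (timestamp, power) pairs, and the function name `average_day_profile` and the sample values are made up for illustration. It reduces the measurements to one "average day" per month, keyed by month and hour, which is the same shape as the website's data.

```python
from collections import defaultdict
from datetime import datetime


def average_day_profile(samples):
    """Return {(month, hour): mean power} from (timestamp, power) pairs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, power in samples:
        key = (timestamp.month, timestamp.hour)
        sums[key] += power
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Made-up readings, only to show the call.
samples = [
    (datetime(2020, 11, 1, 12), 4.0),
    (datetime(2020, 11, 2, 12), 2.0),
    (datetime(2020, 11, 1, 13), 3.0),
]
profile = average_day_profile(samples)
```

Comparing `profile` with the orange curve then compares like with like, since both have had the same day-to-day information removed.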

To see how to corrupt the data:
- Extract the data patterns out of your data: The half-sine of the Sun moving across the sky (the duration of sunlight and angle to your solar panel varies from month to month)
- Extract the long-term variations (spanning multiple days)
- Extract the short term variations (spanning hours and minutes)
- These parameters will typically be fractal-like behaviors
- You can then produce a Markov Chain or Brownian random-walk model that simulates all of the above.
- Just be aware that you are creating a realistic-looking work of fiction, not restoring discarded data to an official source.

See: https://en.wikipedia.org/wiki/Markov_chain
https://en.wikipedia.org/wiki/Random_walk
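
A very rough Python sketch of the Markov chain idea, under loud assumptions: the three sky states, the transition probabilities, the attenuation factors and the sunrise/sunset hours are all invented here for illustration and would have to be estimated from your own data. It only covers the day-to-day variation on top of a half-sine daily shape, not the short-term spikes and dips.

```python
import math
import random

STATES = ["clear", "partly", "overcast"]
# Hypothetical day-to-day transition probabilities (each row sums to 1).
TRANSITIONS = {
    "clear": [0.6, 0.3, 0.1],
    "partly": [0.3, 0.4, 0.3],
    "overcast": [0.1, 0.3, 0.6],
}
# Hypothetical fraction of clear-sky output delivered in each state.
ATTENUATION = {"clear": 1.0, "partly": 0.6, "overcast": 0.25}


def half_sine(hour, sunrise=7.0, sunset=17.0, peak=1.0):
    """Idealised clear-sky output: a half-sine between sunrise and sunset."""
    if hour <= sunrise or hour >= sunset:
        return 0.0
    return peak * math.sin(math.pi * (hour - sunrise) / (sunset - sunrise))


def simulate(days, rng, start="clear"):
    """Return one list of 24 hourly values per simulated day."""
    state = start
    series = []
    for _ in range(days):
        series.append([ATTENUATION[state] * half_sine(h) for h in range(24)])
        # Draw tomorrow's sky state from the row of today's state.
        state = rng.choices(STATES, weights=TRANSITIONS[state])[0]
    return series


series = simulate(5, random.Random(0))
```

The result is still a realistic-looking work of fiction: the sequence of good and bad days is random and will not match the real weather record.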

Quote from: alancalverd
the absence of day-to-day predictability of supply
It's true that a cloud layer reflects sunlight back into space, reducing solar power intensity on the ground.
- But the effect is not as great as you might expect, since flat-panel collectors can collect sunlight from all over the sky, so they still work fairly well with the diffuse light received on overcast days.
- This is unlike focused solar arrays, which rely on a clear view of the Sun to concentrate power on the collector element. These arrays work much better in a cloudless environment (e.g. a desert).
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: Marius_92 on 12/11/2020 08:52:58
Thank you so much for your detailed and quick replies! I am working on this project only part-time and therefore will not be able to have a closer look and answer properly before Monday. One question I can answer very quickly:
The orange curves are derived from data from the website Global Solar Atlas.
The site doesn't provide daily or even hourly data. Instead, it only gives the monthly average for each month, calculated from observations over several years. On this site, if you enter a location, the bottom right shows hourly values over one average day for each month. We just took these average values for one day and plotted them for all days of the month, to at least get a whole month of data. This is why the orange curve looks so perfect: it is the same set of values plotted for each day.
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: evan_au on 12/11/2020 19:44:17
It seems that you want to simulate a realistic-looking pattern of solar panel output.
- If someone obtains the map coordinates of your test site, and looks back at the weather records for that location, they will find days where your simulation reports very high solar panel output, while the weather records show heavy rain (and the reverse: simulated low output on a sunny day with no cloud cover).
- Alternatively, you could use the actual weather records for a data-driven simulation. Here you look up weather records for cloud cover, humidity or rainfall, and use this as an input to your simulation.

One of the computational techniques for analyzing and generating fractal series is Fractional Brownian motion.
- One of the inputs is the Hurst Parameter (H), which determines whether the data has short or long term correlation
- I expect that weather has long-term correlation, as some weather patterns last days to weeks
- Be careful in using traditional statistical tools like standard deviation to analyze your solar panel output - if H > 0.5, the standard deviation is undefined (or infinite, if you prefer).
- But the results will still look fairly reasonable, even if you do calculate a standard deviation on (say) data every 5 minutes from your solar panel.
See: https://en.wikipedia.org/wiki/Fractional_Brownian_motion
https://en.wikipedia.org/wiki/Hurst_exponent
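
For anyone wanting to experiment, here is a small pure-Python sketch of generating fractional Brownian motion for a given Hurst parameter. It uses the Cholesky method on the covariance of the increments (fractional Gaussian noise), which is one standard way to do it but only practical for short series; the function names are my own, and H = 0.8 is an arbitrary example value, not something estimated from solar data.

```python
import math
import random


def fgn_autocovariance(lag, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at a lag."""
    k = abs(lag)
    two_h = 2 * hurst
    return 0.5 * ((k + 1) ** two_h - 2 * k ** two_h + abs(k - 1) ** two_h)


def cholesky(matrix):
    """Lower-triangular factor L with L * L^T == matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def fractional_gaussian_noise(n, hurst, rng):
    """Correlated increments: L times a vector of independent normals."""
    cov = [[fgn_autocovariance(i - j, hurst) for j in range(n)] for i in range(n)]
    lower = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][k] * white[k] for k in range(i + 1)) for i in range(n)]


def fractional_brownian_motion(n, hurst, rng):
    """Cumulative sum of the correlated increments."""
    path, total = [], 0.0
    for step in fractional_gaussian_noise(n, hurst, rng):
        total += step
        path.append(total)
    return path


path = fractional_brownian_motion(30, 0.8, random.Random(1))
```

With H = 0.5 the increments become uncorrelated and this reduces to an ordinary random walk; with H above 0.5 neighbouring increments are positively correlated, which gives the longer-lasting runs.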
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: alancalverd on 13/11/2020 10:58:48
My approach to estimating cost savings would be to measure your "actual available" and multiply by "actual demand" over whatever historic data you have. The result may be disappointing if you are looking at domestic consumption, because the demand is least when the availability is highest (midday), but could be encouraging for some non-time-sensitive industrial applications or commercial food production (the demand in UK hospitals used to peak between 11:00 and 12:30, when the kitchens were at full blast).

The daily symmetry of the "website" figures is an obvious lie. In an early high-pressure weather system, sunlight intensity increases slowly as the morning mist evaporates and reaches a peak just after midday, as shown by the first two blue curves. The third and fourth days show the rapid early rise when the ground has dried (mist dissipates quickly) followed by the formation of convective cloud. There is a small "evening kick" as the cumulus clouds dissipate. Day 4 shows the early appearance of cirrus, with the sharp rise characteristic of a dry surface but a reduced peak, and another evening kick. The warm sector stratus dominated day 5, and day 6 looks pretty much the same - depressed and symmetrical with a couple of breaks.

You might expect repetitive symmetrical yields in some dry deserts, but not anywhere in western Europe.

So your "actuals" cover the most probable range of daily cloud from "dry zero" to continuous cover and rain. All you need to do is to multiply each point by the frequency of that cloud condition in each month, and by the seasonal azimuth factors if you don't have data for every month.
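
As a rough Python sketch of that weighting, assuming you have one representative measured daily profile per cloud condition and a count of how many days of each condition occur in the month: the condition names, profiles and day counts below are invented for illustration, and the seasonal azimuth factors are left out.

```python
def expected_profile(profiles, frequencies):
    """Frequency-weighted mean of per-condition daily profiles."""
    total = sum(frequencies.values())
    length = len(next(iter(profiles.values())))
    return [
        sum(frequencies[cond] / total * profiles[cond][i] for cond in frequencies)
        for i in range(length)
    ]


# Hypothetical three-point "days" and day counts for one month.
profiles = {"clear": [0.0, 4.0, 0.0], "overcast": [0.0, 1.0, 0.0]}
frequencies = {"clear": 10, "overcast": 20}
expected = expected_profile(profiles, frequencies)
```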

The good news is that this question originated from a German ISP. The dominant weather evolutions, particularly in the north, are similar to the UK pattern I have used, and your local airport or gliding club will probably have immaculate historic records from which you can recreate any day's cloud base and cover in 20 minute samples. If not, try allmetsat or www.dwd.de for historic 4-hour synopses.
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: Marius_92 on 16/11/2020 11:09:40
Thank you very much for your answers and for the links!
For now, we have decided to take a quite simple route: for each time step, we calculate the relative deviation of the measured value (blue curve) from the monthly average of the blue curve. Then we just 'add' this percentage deviation to the orange curve. This approach is simple and easy to implement, and the data doesn't have to be very accurate anyway. It also lets us sidestep the fact that factors other than clouds (such as dust, precipitation, wind speed, ...) apparently influence the solar power as well.
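
In Python, the idea looks roughly like the sketch below. It assumes three aligned lists with one value per time step (website average, our measurement, and our own monthly average for that time step); the function names and numbers are only illustrative. Treating time steps with a zero average (night) as having zero deviation is our assumption, to avoid dividing by zero.

```python
def relative_deviation(measured, average):
    """Relative deviation of a measurement from its average (0 if average is 0)."""
    if average == 0:
        return 0.0
    return (measured - average) / average


def apply_deviation(website, measured, average):
    """Scale each website value by the deviation seen in our own data."""
    return [
        w * (1.0 + relative_deviation(m, a))
        for w, m, a in zip(website, measured, average)
    ]


# Illustrative values: night, then two daytime steps 50 % above average.
website = [0.0, 2.0, 4.0]
measured = [0.0, 1.5, 3.0]
average = [0.0, 1.0, 2.0]
adjusted = apply_deviation(website, measured, average)
```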

@evan_au For later, more sophisticated implementations, the proposed Markov chain, random walk or fractional Brownian motion look very promising for building more accurate models. Unfortunately, it is not yet entirely clear to us how to implement and use them for the simulation. They seem rather complicated, and since we are no experts in this field we first need to gather more information.

@alancalverd Your idea sounds very suitable. Do I understand it correctly? You propose something like multipliers according to cloud coverage? How would you determine these multipliers? Can they be derived from actual cloud data?
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: alancalverd on 16/11/2020 11:40:05
Here's what allmetsat.com says about RAF Fairford as I write this

Current weather observation
The report was made 24 minutes ago, at 11:01 UTC
Wind 12 kt from the West/Southwest
Temperature 11°C
Humidity 71%
Pressure 1014 hPa
Visibility 10 km or more
Broken clouds at a height of 4400 ft
Broken clouds at a height of 12000 ft


"Few" = 1/8 to 2/8 of sky covered
"Scattered" = 3/8 - 4/8
"Broken" = 5/8 -7/8
"Overcast" = sky fully covered with cloud

If your local airport reports high (> 12,000 ft) and low cloud, you can get an even better idea of what's happening.

Now associate your actual power readings with whatever METAR data they provide, over several days, and look for correlations. Wikipedia "METAR" gives a fairly comprehensive interpretation of the raw codes and you will quickly get to spot the significant bits.
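
A minimal Python sketch of that correlation step, with assumptions flagged: it converts the cloud-cover words to a number of eighths using the midpoint of each range listed above (treating a clear sky as 0 is my addition), and the power readings are invented purely to show the mechanics. Real METAR reports would first need to be parsed and matched to your readings by time.

```python
import math

# Midpoints, in eighths of sky covered, of the ranges listed above.
OKTAS = {"clear": 0.0, "few": 1.5, "scattered": 3.5, "broken": 6.0, "overcast": 8.0}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented (cloud cover, power) pairs taken at matching times.
observations = [("few", 8.5), ("scattered", 6.5), ("broken", 4.0), ("overcast", 2.0)]
cover = [OKTAS[word] for word, _ in observations]
power = [reading for _, reading in observations]
r = pearson(cover, power)
```

A strongly negative `r` on real data would suggest that cloud cover alone explains much of the variation; a weak one would mean other factors need a look.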



 
Title: Re: How can we make average photovoltaic data look more realistic?
Post by: Marius_92 on 16/11/2020 14:50:14
Ah ok, now I get it. Thank you very much, this will help us a lot!