EPW File source:

Hi guys, I was checking out the System Advisor Model (SAM) by NREL. It eventually led me to this page here, and when I followed it I got here.
I realized I could download epw files from this site for almost every place I was keen to analyse. Until now I had only been using epw files from epwmap.

Is it correct to say the epw files on the website above, from the Photovoltaic Geographical Information System (PVGIS), are just as good as epwmap files?

Hi @gmandevhana,
Nice share. One way to review it would be to download the epw for a location for which EnergyPlus also provides a weather file, and see how the two differ.
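
A minimal sketch of that comparison in Python, assuming both files follow the standard EPW layout (8 header lines, then hourly rows with the month in the second field and dry bulb temperature in the seventh). The function names are hypothetical and only dry bulb temperature is compared.

```python
import csv
import io
from collections import defaultdict

HEADER_LINES = 8   # standard EPW header length (assumed for both files)
MONTH_COL = 1      # 0-based column of the month
DRY_BULB_COL = 6   # 0-based column of dry bulb temperature (deg C)


def monthly_mean_dry_bulb(epw_text):
    """Return {month: mean dry bulb temperature} for the text of one EPW file."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    rows = csv.reader(io.StringIO(epw_text))
    for i, row in enumerate(rows):
        if i < HEADER_LINES or len(row) <= DRY_BULB_COL:
            continue  # skip header lines and blank or truncated rows
        month = int(row[MONTH_COL])
        sums[month] += float(row[DRY_BULB_COL])
        counts[month] += 1
    return {m: sums[m] / counts[m] for m in sorted(sums)}


def monthly_difference(epw_text_a, epw_text_b):
    """Return {month: mean of A minus mean of B} for months present in both."""
    a = monthly_mean_dry_bulb(epw_text_a)
    b = monthly_mean_dry_bulb(epw_text_b)
    return {m: a[m] - b[m] for m in a if m in b}
```

To use it, read each file with `open(path).read()` and pass the two strings to `monthly_difference`. Changing `DRY_BULB_COL` to another column index should let you compare other variables the same way.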

Thanks for posting, @gmandevhana and this is a useful resource to know about.

We can say pretty definitively that these files are not as robust as the ones on EPWmap but they may be good enough for your purposes depending on what you need. It seems the primary purpose of SAM is to give information on the potential of renewables in different locations and so these epw files seem suitable enough for that purpose. However, if you are using an EPW file for a building energy simulation, you are better off using a nearby file in EPWmap if there is one available. I can say this for a few reasons:

All of the epw files on EPWmap come either from the US DoE database or the Onebuilding Database. The files in these databases are built using on-ground measurements with at least 18 years of data. The files from the SAM interface seem to be built using only 10 years of data at most, so they may not be a very good representation of the climate of a given location over longer periods of time. For example, droughts in some parts of the world can last as long as a decade.

Also, I would infer that the vast majority of the data in the SAM epw files comes from satellite measurements instead of on-ground measurements. I’m 99% sure that the solar radiation data in these epws is from satellite measurements and this may be the case for the air temperature measurements too. Understandably, satellite-derived data has much larger error bars around it than ground-observed data. For example, to get estimates of air temperature from satellites, one must typically start from long-wave radiant pictures of the earth, convert that to a surface temperature, and then make some assumptions about the heat transfer between the surface and the air just above it to arrive at an air temperature. You can check this result against a few air temperature measurements on the ground to help calibrate everything, and this can improve the accuracy of the method, but it’s still not as good as the on-ground measurement itself. I don’t know if SAM lists the uncertainty of the data set anywhere but it could be off by a few degrees if it’s using a method to estimate air temperature like the one I just described.

All of this said, an epw that is off by a few degrees is better than no epw at all. So I would feel free to use that site whenever you can’t find an available EPW in epwmap or another on-ground source for data.

Thanks a lot @chris!! I had already started checking the files against epwmap files for certain cities and yes you are right, they use a 10yr data set which is obviously not as reliable as what DoE or Onebuilding provides. I am still checking their website to find out if they have uncertainty info somewhere. But as you said, a not-so-accurate epw file is slightly better than none at all, and my areas of interest are largely in Southern Africa, so I will just have to be more cautious when I use the SAM files. Thanks again.

Thanks a lot @devang, I had already started on the exercise :slight_smile: I will see how they vary.

Hi Chris, I want to create an EPW file. I got weather data from the ECMWF website (ERA5 data). How can I convert the 1990-2020 weather data into a single year (12 months) of data to make an EPW file? Is there a statistical method, such as taking the average for each hour of each month? I would be very grateful for an answer. Thank you.

Responding to an old thread, but just wanted to mention that gridded data computed using a combination of physics-based models and satellite data is considered a more reliable source than weather station data for (at minimum) solar and wind data for use in renewable energy modeling. This would explain why it’s appropriate for SAM to link to data from the NSRDB, rather than some collection of EPW files.

I only learned this myself from @josephyang’s posts here, where he explains exactly why it is better, and also the argument for reanalysis data for BEM: Looking for Beta Testers for Weather Data Service - #16 by josephyang

Also, I would infer that the vast majority of the data in the SAM epw files comes from satellite measurements instead of on-ground measurements. I’m 99% sure that the solar radiation data in these epws is from satellite measurements and this may be the case for the air temperature measurements too. Understandably, satellite-derived data has much larger error bars around it than ground-observed data.

This is probably true, but I believe the integration with physics-based modeling and the interpolation from real data makes the gridded data ultimately more reliable than raw satellite and weather station data. The fact that Climate One Building partially updated its weather files with ERA5 data for radiation suggests that this is the case for radiation values at least.

That’s what the mon_per_hour output of the LB Time Interval Operation component gives you.
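
For anyone who wants to see the arithmetic outside of Grasshopper, here is a rough Python sketch of a month-per-hour average over several years. It is not the Ladybug implementation; the function name and the `(month, hour, value)` record layout are assumptions for illustration.

```python
from collections import defaultdict


def mean_per_month_per_hour(records):
    """Average values for each (month, hour) pair across all years.

    records: iterable of (month, hour, value) tuples, e.g. one tuple per
    hourly reading over 1990-2020. Returns {(month, hour): mean}, which
    has at most 12 * 24 = 288 entries.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, hour, value in records:
        sums[(month, hour)] += value
        counts[(month, hour)] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}
```

Note that this gives one average day per month (288 values per variable), not a full 8760-hour year, and averaging smooths out the extremes.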

@chris @ilayda

Is the idea to use the average month per hour values to select appropriate “Typical Meteorological Months” (TMM) to assemble the TMY file? I’m only familiar with the original TMY method at a very high level (Sandia Method[1][2]), but it seems like a month per hour measure would miss the impact of the varying distributions, right? The Sandia Method, in contrast, uses multiple statistical methods to capture the mean and distribution of weather data over input data spanning multiple years to select the TMM.

[1] https://www.nrel.gov/docs/fy08osti/43156.pdf (Description in Section 2.1)
[2] Generation of a typical meteorological year (Conference) | OSTI.GOV (Original Sandia paper, but no pdf link).
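
As far as I understand it, the distribution matching in the Sandia method is done with the Finkelstein-Schafer (FS) statistic, which compares the cumulative distribution of daily values in a candidate month against the long-term distribution for that calendar month. Below is a simplified single-variable sketch in Python. The actual method combines FS values for several weather variables with weights and adds persistence checks, all of which are omitted here, and the function names and input layout are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Fraction of values in sorted_values that are <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def fs_statistic(candidate_days, long_term_days):
    """Mean absolute difference between the candidate-month CDF and the
    long-term CDF, evaluated at each daily value of the candidate month."""
    cand = sorted(candidate_days)
    long_term = sorted(long_term_days)
    total = sum(
        abs(empirical_cdf(long_term, x) - empirical_cdf(cand, x)) for x in cand
    )
    return total / len(cand)


def most_typical_year(daily_by_year):
    """daily_by_year: {year: [daily values for one calendar month]}.
    Returns the year whose month has the lowest FS statistic."""
    pooled = [v for days in daily_by_year.values() for v in days]
    return min(daily_by_year, key=lambda y: fs_statistic(daily_by_year[y], pooled))
```

A month whose daily values are spread like the long-term record scores close to zero, which is the part a plain month-per-hour mean cannot see.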

You probably know more about this than I do, @SaeranVasanthakumar, but I’m pretty confident that they don’t determine the typical months in TMY files solely by looking at the mean values per hour. Maybe the mean-per-hour is one of several statistics that are used.

I was just answering the technical question of how to do that type of data analysis with Ladybug.

Instead of weather station data vs. satellite data, it’s more like:

  • Option 1: (weather station data) + (empirical regression model) → weather file
  • Option 2: (weather station data + satellite data + weather balloon data + aircraft sensor data) + (numerical weather prediction model) → weather file

I’ve been suggesting using Option 2 because it has the benefits of using the state of the art methods in meteorological science, rather than depending on an empirical regression model developed in the 80’s (Perez method) when radiation data just wasn’t available.

In any case, I think if you’re used to getting EPW files from Climate One Building, you’ll probably soon be using this data anyhow. :slight_smile:

@josephyang

I’ve been suggesting using Option 2 because it has the benefits of using the state of the art methods in meteorological science, rather than depending on an empirical regression model developed in the 80’s (Perez method) when radiation data just wasn’t available.

Personally, I’m interested in testing out Option 2 (especially with sites where I expect the local microclimate to differ significantly from the airport weather station) but the biggest bottleneck is that creating a TMY file from multiple years of raw ERA5 reanalysis data is a nontrivial process. As I briefly summarized in a couple of posts above, the statistical computation is not technically difficult, but requires more work and QA than I have time for. Do you know of any libraries/tools that take care of the TMY conversion?

@kevinkircher Check out http://Climate.OneBuilding.org . We have a recent set of TMYx files based on 2004-2018 for 13000 locations worldwide. We’re getting ready to release completely revised TMYx files with data through 2020. Courtesy of @oikoweather we have a global source of satellite solar radiation data.

Congratulations and great stuff! And amazing to see Climate One Building keep pushing state-of-the-art weather representation. I understand (at a very high level) the critique of solar radiation data derived from weather stations and the Perez sky model, so this makes a lot of sense. Can I ask what Climate One Building’s reasoning was behind limiting the update to just the solar radiation data, and not also incorporating other weather data (i.e. dry bulb, dew point temperature, wind speed etc.) from ERA5 reanalysis? Or is your argument for using Option 2 for modeling/forecasting limited to solar radiation values?

Personally, I’m interested in testing out Option 2 (especially with sites where I expect the local microclimate to differ significantly from the airport weather station) but the biggest bottleneck is that creating a TMY file from multiple years of raw ERA5 reanalysis data is a nontrivial process. As I briefly summarized in a couple of posts above, the statistical computation is not technically difficult, but requires more work and QA than I have time for. Do you know of any libraries/tools that take care of the TMY conversion?

I don’t unfortunately. I’m in the same boat as you - the conversion process for generating TMY is rather simple but I don’t have the bandwidth to create one at the moment. I thought about perhaps asking Dru and Linda to get more information but haven’t gotten around to it.

Congratulations and great stuff! And amazing to see Climate One Building keep pushing state-of-the-art weather representation. I understand (at a very high level) the critique of solar radiation data derived from weather stations and the Perez sky model, so this makes a lot of sense. Can I ask what Climate One Building’s reasoning was behind limiting the update to just the solar radiation data, and not also incorporating other weather data (i.e. dry bulb, dew point temperature, wind speed etc.) from ERA5 reanalysis? Or is your argument for using Option 2 for modeling/forecasting limited to solar radiation values?

The original aim was to ensure that they have better solar data, but Linda did also download all the meteorological parameters needed to create an EPW file, including surface pressure, temperature, dew-point temperature, relative humidity, wind speed, wind direction, etc. Dru mentioned once that they are using these to fill in any missing information they have, but I’m not too sure how much they have ended up using. Given that the locations they have are already specific to the airport weather stations, I assume that it could make sense.

My suggestion does also apply to other weather data variables and I would have used the reanalysis data myself. On the other hand, if I were in their place, I would probably be cautious about abandoning airport data all at once. For the parameters that are directly measured, treating weather station data as the ground truth for that specific location is still a valid assumption. I’m not sure how close they are to publishing the updated weather files but once they do, I’ll touch base with them on how it all went and would be happy to share any lessons learned.

I’m not sure how close they are to publishing the updated weather files but once they do, I’ll touch base with them on how it all went and would be happy to share any lessons learned.

Please do, that would be great. I wonder if they’re modifying the way they create the TMY file based on changes in underlying assumptions due to the use of gridded data.

Hi @chris @ilayda @SaeranVasanthakumar

The method for setting up a TRY (Test Reference Year) is well described in ISO 15927-4:2005.
This method is a little bit different from setting up a TMY.

This might get you further in terms of setting up a Python script: TMY-/Code at master · smurphy-11/TMY- · GitHub

However, as e.g. Dru Crawley mentions here: PROCEEDINGS.cdr (ibpsa.org) it might be better to use a combination of e.g. two climate data methods and make two sets.

It would be nice to have “one ring to rule them all”, but I believe that a combination of a TMY and an XMY file would maybe be a better way to go if one wants to both size HVAC equipment and look at the energy consumption.

In the figure below I’ve plotted both a TMY and a TRY file for Oslo (1991-2020) together with the variation in the measurements (max/min), to visualize the difference we got. As you can see, neither methodology necessarily captures the spring/summer in a good manner (for evaluating cooling and indoor thermal comfort in buildings during this period).

@EspenHansen

Thanks for the resources and references, they are definitely useful.

In the figure below I’ve plotted both a TMY and a TRY file for Oslo (1991-2020) together with the variation in the measurements (max/min), to visualize the difference we got. As you can see, neither methodology necessarily captures the spring/summer in a good manner (for evaluating cooling and indoor thermal comfort in buildings during this period).

Actually, I don’t see why you think they aren’t capturing the spring/summer in a good manner. What should I be looking at?

Hi @SaeranVasanthakumar

I’ve tried to visualize the end part of the duration diagram, where you can see that both the TMY and the TRY behave a bit differently from what we want. The red dotted line illustrates what I personally would like to define as ideal.
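
For anyone who wants to reproduce this kind of comparison, a duration diagram is just the hourly values sorted from highest to lowest. Here is a minimal Python sketch, assuming the end of interest is the warm one. The function names and the default 5% cut-off are arbitrary choices for illustration, not part of any standard.

```python
def duration_curve(hourly_values):
    """Hourly values sorted from highest to lowest."""
    return sorted(hourly_values, reverse=True)


def warm_tail_mean(hourly_values, fraction=0.05):
    """Mean of the warmest `fraction` of hours (at least one hour)."""
    curve = duration_curve(hourly_values)
    n = max(1, int(len(curve) * fraction))
    return sum(curve[:n]) / n
```

Running `warm_tail_mean` on the dry bulb temperatures of the TMY file, the TRY file and each measured year gives one number per data set for the warm end of the curve, which makes it easier to see how far the typical years sit from the measured range.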