Total sky cover issue with climate data from One Building - TMY3, TMYx, and TMYx.2004-2018

I noticed that the Total Sky Cover (cloud cover) data in some EPW files appears wrong. I’m curious if anyone can give me direction on how to know which sources to trust.

I evaluated the TMYx.2004-2018 files hosted by One Building and noticed that many are much cloudier than the corresponding TMY3 files hosted on the epwmap. A closer look at the EPW data shows that some sky cover values (in tenths) have 0 hours recorded even though the file contains 8760 hours in total. This is unrealistic, as you’d expect a smooth distribution across all possible tenth values. See below.

Washington DC - Reagan TMYx.2004-2018

Washington DC - Reagan TMY3

Notice the huge difference this makes to the number of clear hours. My current solution is just to use the TMY3 files, which appear correct. Did Ladybug Tools evaluate EPW quality before posting files to the epwmap? Is this why the TMYx.2004-2018 files are not included? Any insight into EPW file quality and data errors would be much appreciated.
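For anyone who wants to run the same check on their own files, here is a minimal Python sketch that counts the hours recorded at each tenth of Total Sky Cover. It assumes the standard EPW layout (8 header lines, then comma-separated hourly rows with Total Sky Cover as the 23rd field); the function names are just illustrative and not part of any Ladybug Tools API.

```python
import csv
import io
from collections import Counter

EPW_HEADER_LINES = 8        # assumption: standard EPW header block
TOTAL_SKY_COVER_INDEX = 22  # assumption: 23rd field, zero-based index 22


def sky_cover_histogram(epw_text):
    """Count hours at each Total Sky Cover value (0-10 tenths)."""
    counts = Counter()
    for i, row in enumerate(csv.reader(io.StringIO(epw_text))):
        # Skip the header block and any short or blank rows.
        if i < EPW_HEADER_LINES or len(row) <= TOTAL_SKY_COVER_INDEX:
            continue
        counts[int(float(row[TOTAL_SKY_COVER_INDEX]))] += 1
    # Report only 0-10; anything else (e.g. the 99 missing-data flag) is dropped.
    return {tenth: counts.get(tenth, 0) for tenth in range(11)}


def missing_tenths(hist):
    """Return the tenth values that have zero recorded hours."""
    return [tenth for tenth, hours in hist.items() if hours == 0]
```

To use it, read a file into a string (for example `open("my_file.epw").read()`, where the path is a placeholder) and pass it to `sky_cover_histogram`. A non-empty result from `missing_tenths` flags the kind of gap described above.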

I think this is probably the airport cloud cover reporting translated directly into numerical values.

Cloud cover conditions are reported as SKC (sky clear), FEW (trace), SCT (scattered), BKN (broken), and OVC (overcast). Cover is reported in eighths of the sky: 1-2/8 is FEW, 3-4/8 is SCT, 5-7/8 is BKN, and 8/8 is OVC.
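To illustrate why a category-based source would leave gaps in the tenths distribution, here is a rough Python sketch. The eighths ranges come from the description above; picking the upper bound of each range as the representative value is purely my assumption for illustration, and is not necessarily how any particular EPW was produced.

```python
# Eighths-of-sky ranges for each reported category, as listed above.
CATEGORY_EIGHTHS = {
    "SKC": (0, 0),
    "FEW": (1, 2),
    "SCT": (3, 4),
    "BKN": (5, 7),
    "OVC": (8, 8),
}


def eighths_to_tenths(eighths):
    """Convert eighths of sky cover to tenths, rounding half up."""
    return (eighths * 10 + 4) // 8


def category_to_tenths(category):
    """Map a category to tenths using the upper bound of its range (assumption)."""
    _, upper = CATEGORY_EIGHTHS[category]
    return eighths_to_tenths(upper)


# Tenth values that can occur if only the five categories are available.
reachable = sorted({category_to_tenths(c) for c in CATEGORY_EIGHTHS})
unreachable = [t for t in range(11) if t not in reachable]

# Even with every eighth value 0-8 available, some tenths never occur.
from_all_eighths = sorted({eighths_to_tenths(e) for e in range(9)})
```

With five categories, at most five of the eleven tenth values can ever appear, so the remaining ones would show 0 hours regardless of the actual weather. A different choice of representative value per category would move the gaps but not remove them.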


Here’s the total cloud coverage distribution for Jan 1, 2004 to Dec 31, 2018 for Washington DC:

I spoke with Linda and Dru from OneBuilding. They were super friendly and helpful. They sent the attached Word doc explaining the cloud cover data interpretation process. Cloud-data-findings.docx (17.2 KB)

@josephyang I think you are correct that these values have been translated from a different categorization system.

That makes sense. Thanks for the follow-up, @LelandCurtis.

Now that Numerical Weather Prediction (NWP) model data are more widely available, I think these processes could perhaps be updated to use radiation and cloud cover data directly from meteorological models (e.g. GFS, ECMWF, ERA5), so that they are corrected for atmospheric physics as noted on this thread.