Hi @chris.
First of all, thank you for this thorough answer. Things are much clearer to me now and I agree time representation in EPW is a pain (and I added my thumb up to that GH issue).
I tend to forget the first intent of Ladybug here was to import/modify/export EPW files, not to create files from scratch, and I missed the fact that the last item is moved to first position on import (because I don’t use the import).
I’ve been thinking this over and over for several hours now. I had trouble understanding because I was thinking in terms of time intervals (first step being [0:00 Jan 1, 2020 - 1:00 Jan 1, 2020]) rather than points in time, and I thought the first interval was the same everywhere, except that EPW would express it with the end time while everyone else would express it with the start time. I realize I was wrong and your message above makes much more sense to me now.
It is not only a time representation issue. It is not about replacing 24:00 with 00:00. In fact, we don’t care much about the serialization of the date because AFAIU, the timestamp on each line is ignored and only the analysis period matters. It is about having the analysis period bounds correct.
The issue with the current solution is that the representation in the Ladybug world is kinda wrong, with the first value of the list being the last of the simulation. So we have an internal representation format that is twisted to account for a specificity of the external format.
I think this is what you address in your second paragraph:
In Legacy Ladybug, we tried following the EPW structure and made 1:00 the first time of the year (instead of 0:00) and we made the start hour-of-the-year (HOY) equal to 1 instead of 0. But we now see this as a mistake since it ultimately created several cases of mismatched datetimes across the plugin. So we knew that the start datetime definitely needed to be 0:00.
Defining the analysis period starting at 1 would allow keeping values in order, with datetimes being consistently expressed. I can’t imagine the issues “across the plugin” because I’m only using a small subpart of it, so I’ll trust you on that one. Well, at least if the data is meant to be used in another simulation tool (I saw Radiance mentioned) and that tool starts at 0, I can see trouble coming.
Anyway, your explanation clearly presents the implementation choice as a compromise and my partial understanding hardly allows me to challenge it.
I still have a concern.
The docstring for from_missing_values reads:
from ladybug.epw import EPW
from ladybug.location import Location
epw = EPW.from_missing_values()
epw.location = Location('Denver Golden', 'CO', 'USA', 39.74, -105.18, -7.0, 1829.0)
epw.dry_bulb_temperature.values = [20] * 8760
IIUC, when doing so, the dry_bulb_temperature values are wrong because the last value should be moved to first position, both to avoid discrepancies with other values and because it will be moved back on export. I didn’t see any setter in the code doing this automatically. Obviously, it doesn’t matter in this example since the value is constant, but you get the idea.
In fact, again IIUC, this makes for a terrible API because the user is allowed to manipulate the values but, without explicit knowledge of the internals, can’t imagine the first value will be sent to the end. This can only work if the array that is passed comes from an EPW import.
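To make the concern concrete, here is a minimal sketch in plain Python of the reordering as I understand it. The two helper names are hypothetical and are not part of Ladybug; they only mimic the "last value moved to first position on import, first value moved to the end on export" behaviour discussed above.

```python
def epw_to_internal(values):
    """Mimic the import step: the last EPW row becomes the first item."""
    values = list(values)
    return values[-1:] + values[:-1]


def internal_to_epw(values):
    """Mimic the export step: the first item is sent back to the end."""
    values = list(values)
    return values[1:] + values[:1]


# Four stand-in rows in EPW file order (a real file has 8760).
epw_rows = [1, 2, 3, 4]
internal = epw_to_internal(epw_rows)  # [4, 1, 2, 3]
assert internal_to_epw(internal) == epw_rows
```

If my reading is right, a list built from scratch in file order and assigned directly would come out of the export shifted by one row, unless something like the first helper is applied beforehand.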
My personal use case being the creation of files from scratch, I can live with this and
- query my weather database starting at 1:00 and ending at 0:00 inclusive
- modify the export function to not move the first value to the end
But I’m interested in your feedback because either I’m still misunderstanding, or the internal representation is problematic and in fact should probably be hidden from the user.
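For the first workaround, the timestamps I would query look roughly like the sketch below. It only uses the standard library; the function name is hypothetical, and I use 2021 in the example because it is not a leap year, so the series has the usual 8760 hourly steps.

```python
from datetime import datetime, timedelta


def epw_order_timestamps(year):
    """Hourly timestamps from 1:00 Jan 1 to 0:00 Jan 1 of the next year, inclusive."""
    start = datetime(year, 1, 1, 1)
    end = datetime(year + 1, 1, 1, 0)
    n_steps = int((end - start) / timedelta(hours=1)) + 1
    return [start + timedelta(hours=i) for i in range(n_steps)]


stamps = epw_order_timestamps(2021)
print(len(stamps), stamps[0], stamps[-1])
```

Values fetched in that order would already be in EPW file order, which is why the export step would then have to leave them untouched.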
The second, unrelated aspect of the problem when merging data from different sources, which I initially overlooked, is knowing which time period a timestamp covers.
In EPW, all values are point in time except illuminance/radiation which are aggregations over the last time step, which is equivalent to half-hour (from #55 and EnergyPlus Weather File (EPW) Data Dictionary: Auxiliary Programs — EnergyPlus 8.3).
My data source is Oikolab. It is unclear to me what exact time interval is represented by 0:00 Jan 1, 2020. From their docs (https://docs.oikolab.com), it looks like all their values represent a point in time, except radiation/illuminance (and precipitation, but we don’t need that), which are aggregated over the last hour. This seems to match EPW. It might not be a coincidence: either this is common practice, or they did it to match EPW. EPW export is quoted as a use case in their FAQ and they recommend Ladybug for the job. I sent them an email for confirmation/clarification.
Hopefully, all values match already and I’ll get away with it. Otherwise, I might have to shift some values. Or maybe there’ll be only a 30-minute shift (if the value is point in time in Oikolab and half-hour in EPW), in which case I may interpolate or perhaps just let it go.
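If it does come down to a 30-minute shift, the interpolation I have in mind is just a linear average of neighbouring hourly values, as in this rough sketch. The function name is hypothetical, the direction of the shift (half a step earlier) is an assumption until Oikolab confirms their convention, and the first value is kept as-is because it has no earlier neighbour.

```python
def shift_back_half_step(values):
    """Estimate values half a time step earlier by averaging neighbours."""
    if not values:
        return []
    # No earlier sample exists for the first item, so keep it unchanged.
    shifted = [float(values[0])]
    for previous, current in zip(values, values[1:]):
        shifted.append((previous + current) / 2)
    return shifted


hourly = [0, 2, 4]
print(shift_back_half_step(hourly))
```

Linear averaging is only one option; for something as peaky as radiation around sunrise and sunset it may well be too crude.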
I can’t believe how much time I spent trying to understand this. I won’t blame it on Ladybug, rather on EPW and myself. It’s not a total loss, as I understand much more clearly what I’m doing now.