Glad that you’re getting closer to having something that works.
I would not recommend averaging each hour over the 5 years, as this will create a flattened data set that misses the peak conditions that occur in a typical year. The real EPWs in the DOE database are made by statistically analyzing each month over the collection period. They then determine the most “typical” January, the most “typical” February, etc. This is done by looking at how closely each candidate month's statistics, such as average monthly temperature and standard deviation, match the long-term record for that month. Then, they string these months together to make a full year. Finally, they interpolate across the “transition” days between months so that you don’t have a sharp jump in the data.
I would recommend doing something like this with your data set. Alternatively, you could just pick whichever of the 5 years is most typical overall (instead of doing a monthly analysis). Whatever you do, just don’t average each hour over the 5 years (for the reason stated earlier).
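If it helps, here is a rough Python sketch of the monthly selection idea. It is only an illustration under some assumptions: the function name `pick_typical_months` and the input layout (a dict keyed by `(year, month)` holding hourly dry-bulb temperatures) are made up for this example, and the scoring is a simple distance on mean and standard deviation of temperature only. The actual typical-year procedure weighs several weather variables, so treat this as a starting point rather than a reproduction of it. It also does not smooth the seams between months.

```python
from statistics import mean, pstdev


def pick_typical_months(hourly_temps):
    """Pick, for each calendar month, the year whose month looks most typical.

    hourly_temps (hypothetical layout): {(year, month): [hourly temps, ...]}
    Returns {month: year}.
    """
    typical = {}
    months = sorted({m for (_, m) in hourly_temps})
    for month in months:
        # Mean and spread of this calendar month in each candidate year.
        stats = {
            year: (mean(vals), pstdev(vals))
            for (year, m), vals in hourly_temps.items()
            if m == month
        }
        # Long-term reference: average of the per-year means and spreads.
        ref_mean = mean(s[0] for s in stats.values())
        ref_std = mean(s[1] for s in stats.values())

        def score(year):
            # Assumed metric: distance from the reference in mean and spread.
            mu, sigma = stats[year]
            return abs(mu - ref_mean) + abs(sigma - ref_std)

        # Ties go to the earliest year because candidates are sorted first.
        typical[month] = min(sorted(stats), key=score)
    return typical


# Tiny made-up example: three Januaries with four "hours" each.
sample = {
    (2015, 1): [-5.0, -3.0, -1.0, 1.0],
    (2016, 1): [-2.0, -1.0, 0.0, 1.0],
    (2017, 1): [2.0, 2.0, 2.0, 2.0],
}
print(pick_typical_months(sample))
```

Once you have the chosen year for each month, you would copy that year's hourly rows for the month into the output file, so the real hourly peaks survive. The same scoring applied to whole years instead of months gives you the simpler "most typical year" option.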