Hi @LelandCurtis, thanks for the question.
I think one way to look at it is that both are actually mostly modeled, so the distinction is not really ‘measured’ (ground truth) vs. ‘modeled’ (synthetic) but rather ‘sampled or empirically modeled’ vs. ‘best state estimation in terms of meteorological science’. For instance, if we take the solar/thermal radiation parameters for station-based data, the cloud coverage values used to calculate them have been somewhat repurposed from their original intent (e.g. reported cloud ceilings are not the same as total cloud coverage), whereas NWP models even take into account things like aerosols and various atmospheric absorption spectra. The former is mostly a regression model and the latter a physics-based one that also removes spatial and temporal sampling noise.
In the field of renewable energy modelling & forecasting (e.g. solar & wind), I think it would be hard to find anyone who uses weather station data as a source, and NREL also recommends using gridded data over station data (their NSRDB draws on MERRA-2 reanalysis data). The underlying NWP models are indeed getting better, but apps such as DarkSky don’t run a physics-based model themselves; instead they run ML-derived enhancements over other NWP models (e.g. GFS) using public radar datasets. Projects such as NASA POWER have also been created in recognition of the limitations of station-based data.
I do also recommend reanalysis data over station-based data in terms of data integrity and physical consistency, especially if any sort of calibration is being done, but this is not to say that existing TMY files are grossly incorrect - for most use cases, I think it’s fine either way. I’ve been talking with Dru and Linda about updating some of the TMY data with reanalysis-derived datasets, but I’m not too sure what other efforts there are.
Hope this helps!