How to read large EPlus csv result file efficiently?

Grasshope · February 2, 2016, 8:46am

I have a large EnergyPlus results file in CSV format, about 200 MB. The Read EP Result component takes more than 10 minutes to read it…

Is there a more efficient way to read a large EnergyPlus CSV result file?

Thanks!

AbrahamYezioro · February 2, 2016, 9:29am

I’m afraid you are going to drink a lot of coffee …

200 MB is a lot of building.

CSV is not the most efficient format, but it is what we have in HB. SQL may be more efficient for pulling data, but it depends on what you want to do and how.

-A.

chris · February 3, 2016, 1:21pm

Grasshope,

Having E+ output an SQL file, as Abraham suggests, can help you pull aggregate sums/averages out of a dataset without having to bring in all of the small pieces of data. However, if you want all of the data to come in, SQL is going to take the same amount of time to import to GH as the CSV does. So, in the end, there is no better way to reduce the import time than to request only the outputs you are interested in, at the timestep that you need (e.g. monthly instead of hourly).

This is one of the reasons why I have been a bit lazy about building out the SQL importing features: even though it might improve the import time a bit in a few cases, most of the time I am best off requesting only the outputs that I need or simulating only the portion of the energy model needed to answer the question I am asking.

What data and timestep are you trying to get in your case here? There may be a chance for SQL to help and this could be the excuse that I need to implement some SQL capabilities.

-Chris
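As a rough illustration of the point about aggregates, the sketch below lets SQLite compute monthly means so that only a handful of rows reach Python. It is not a Honeybee feature. The table and column names (ReportData, ReportDataDictionary, Time) are assumptions based on the SQLite schema written by recent EnergyPlus versions; older versions use different table names, so check your own file first. The sketch also does not filter out design days or warmup periods.

```python
import sqlite3

# Assumed schema: ReportData / ReportDataDictionary / Time.
# Verify these names against your own eplusout.sql before relying on them.
MONTHLY_MEAN_SQL = """
SELECT d.KeyValue, t.Month, AVG(r.Value) AS mean_value
FROM ReportData AS r
JOIN ReportDataDictionary AS d
  ON d.ReportDataDictionaryIndex = r.ReportDataDictionaryIndex
JOIN Time AS t
  ON t.TimeIndex = r.TimeIndex
WHERE d.Name = ? AND d.ReportingFrequency = ?
GROUP BY d.KeyValue, t.Month
ORDER BY d.KeyValue, t.Month
"""


def monthly_means(conn, variable_name, frequency="Hourly"):
    """Return {(key, month): mean}, with the averaging done inside SQLite."""
    rows = conn.execute(MONTHLY_MEAN_SQL, (variable_name, frequency)).fetchall()
    return {(key, month): mean for key, month, mean in rows}
```

Usage would be along the lines of `monthly_means(sqlite3.connect("eplusout.sql"), "Surface Outside Face Temperature")`, where the file name is whatever your simulation wrote.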

mostapha · February 3, 2016, 2:54pm

Grasshope,

As Chris mentioned, if you’re not looking for all of the results, there are ways to make the import faster, even when parsing the CSV file. Are you trying to import all of the data, or do you only need specific data?

One of the main issues with the SQL route is that generating the SQL file itself takes a very long time.

Mostapha
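One way to read "faster even by parsing the CSV" is to stream the file once and keep only the columns you need, rather than loading every column into memory. The sketch below is a plain-Python illustration, not the Read EP Result component. The function name is hypothetical, and the header layout it matches against (`KEY:Variable Name [Units](Frequency)`) is the usual EnergyPlus CSV convention, assumed here rather than taken from the thread.

```python
import csv


def read_matching_columns(lines, wanted):
    """Keep only columns whose header contains `wanted` (case-insensitive).

    `lines` is any iterable of text lines, for example an open file.
    """
    reader = csv.reader(lines)
    header = next(reader)
    keep = [i for i, name in enumerate(header) if wanted.lower() in name.lower()]
    names = [header[i].strip() for i in keep]
    columns = {name: [] for name in names}
    for row in reader:
        for i, name in zip(keep, names):
            # Skip blank cells, which appear when reporting frequencies are mixed.
            if i < len(row) and row[i].strip():
                columns[name].append(float(row[i]))
    return columns
```

A typical call would be `with open("eplusout.csv", newline="") as f: cols = read_matching_columns(f, "Surface Outside Face Temperature")`. Columns with identical headers would overwrite each other in this sketch.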

Grasshope · February 4, 2016, 9:32am

Dear Chris and Mostapha,

The IDF file is generated from a workflow similar to Chris’ outdoor microclimate map demo file on Hydra. So, only the output variables relevant to microclimate map analysis and total thermal energy are specified.

The huge size of the IDF file might be attributed to the total number of zones (77) and the hourly time step for simulation.

I need annual hourly results because I want to calculate the percentage of outdoor space with an annual average UTCI within the comfortable range of 9-26 degrees.

Unless I’m only interested in the UTCI of a particular hour, I’m not sure how I can simplify the workflow to reduce the size of the csv file and, consequently, the time needed to read it…

Appreciate your advice!

Thanks!
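The metric described above can be written down in a few lines. This is only a sketch of the arithmetic, assuming each test point represents an equal share of the outdoor area and that hourly UTCI values per point are already available; the function name and the list-of-lists input are hypothetical.

```python
def percent_comfortable(hourly_utci_by_point, low=9.0, high=26.0):
    """Percentage of points whose annual mean UTCI lies within [low, high].

    `hourly_utci_by_point` holds one list of hourly UTCI values per test point.
    Points are assumed to cover equal areas (an assumption of this sketch).
    """
    means = [sum(series) / len(series) for series in hourly_utci_by_point if series]
    if not means:
        return 0.0
    inside = sum(1 for m in means if low <= m <= high)
    return 100.0 * inside / len(means)
```

Because only the annual mean per point is needed, the hourly values can be averaged as they are read and then discarded, which keeps memory use low even for a large results file.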

mostapha · February 4, 2016, 9:57am

Hmm… Can this be done in several steps? Is it a parametric study, or do the geometries stay the same? I assume you run the energy simulation to get the surface temperatures, in which case you only need the exterior surface temperature. I think there are a number of outputs that can be removed, which might help to reduce the size of the file.

Grasshope · February 4, 2016, 10:13am

Thanks, Mostapha!

It’s a study using several different building forms.

Are zoneAirFlowVol and zoneAirHeatGain necessary for outdoor microclimate map analysis?

If they are not “must have” outputs for this type of analysis, can we just specify zoneComfortMetrics and surfaceTempAnalysis as the simulation outputs?

And if zone-related parameters are not necessary, the zoneComfortMetrics output can be removed too, which may reduce the size of the csv file significantly… ?

Grasshope · February 4, 2016, 10:29am

OK, it seems that three groups of output variables (zoneComfortMetrics, comfortMapVariables and surfaceTempAnalysis) need to be specified to use the microclimate map analysis component.

And, without outputting zoneEnergyUse, the csv file can be reduced significantly … from 50MB to 3.7MB!

chris · February 4, 2016, 2:52pm

Grasshope,

A few months ago, I made it so that, if you are running the UTCI comfort map with only outdoor surfaces, the only thing that you need from the EnergyPlus simulation is the outdoor surface temperature.

The image above and the file attached show you that you can really boil it down to just this one output that you need from the E+ simulation for your case. You can even connect up a panel with only that output to the Run Simulation component to get a much faster import of data from the CSV.

-Chris

FastOutdoorClimateMap.gh (560 KB)
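For reference, a single-output request in EnergyPlus IDF syntax is one `Output:Variable` line. The helper below is a hypothetical sketch that formats such a line. The exact text used in the attached file is not shown in this thread, so the variable name here is an assumption based on the standard EnergyPlus output `Surface Outside Face Temperature`.

```python
def output_variable(name, frequency="Hourly", key="*"):
    """Format one EnergyPlus Output:Variable request as IDF text.

    key="*" requests the variable for every object that reports it.
    """
    return "Output:Variable,{},{},{};".format(key, name, frequency)


# Assumed variable name; confirm it against your model's available outputs.
line = output_variable("Surface Outside Face Temperature")
print(line)
```

Text like this is what would go into the panel connected to the Run Simulation component, so that the CSV contains only that one series per surface.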

chris · February 4, 2016, 2:55pm

Also, I should really update that Hydra outdoor microclimate example now that this workflow is possible. Thanks for making me aware of it.

TheodorosGalanos · February 5, 2016, 6:26am

Hey Chris,

Thanks for sharing the file! That’s amazing. I will be using your workflow to conduct yearly outdoor thermal comfort studies once my CFDs progress, and a fast way to get started is just what the doctor ordered.

I do have a question though, and I apologize in advance if it is obvious; I’ve never run this, so I don’t know the intricacies. Do we care about the type/typology/size etc. of the zone if we are only focusing on outdoor temperatures? I have two adjacent buildings on a public realm I’m studying, but these buildings are one big box atm. Is it crucial to define the facade specifically (i.e. material distribution) and the use of the zones to get accurate results, or is one big ass zone enough for a good assessment?

Thanks in advance.

Kind regards,

Theodore.

Grasshope · February 5, 2016, 7:10am

Dear Chris,

Thank you very much for your advice! This should increase the efficiency of outdoor microclimate analysis significantly.

Your example file works on my computer. So, I tested it on my own file (attached here) by specifying only the output variable related to outdoor surface temperature. However, I got the following warning:

1. If you have connected a viewFactorMesh that includes regions on the indoors, you must connect up energy simulation data for zoneAirTemp, srfIndoorTemp, zoneAirFlowVol, zoneAirHeatGain, and zoneRelHumid.

I checked the workflow, and it seems to be the same as yours (except for the complicated geometry generation part). I would appreciate it if you could take a look and advise what I have missed here.

Thank you very much!

typology_v085_UTCI_simple_workflow_test.gh (755 KB)

pol_saante · May 29, 2024, 3:34pm

Hello all,

I’m reviving this topic because I have the same issue importing the sql file. I am running a multiple scenario analysis on a building with different constructions for each scenario. I have automated the workflow so that I can output all the idfs at once (usually ~20 scenarios) and run them in parallel with the run idf component, which has reduced the simulation time dramatically down to a few seconds.

However, my final step is to aggregate the hourly energy use into a csv file and the biggest bottleneck is in reading the sql files, which depending on the building’s complexity can take from 5 minutes up to 2 hours.

I would really like to find a solution for this. Any advice would be greatly appreciated.

Thanks,
Polina
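One possible direction for the multi-scenario case is to skip loading whole SQL files and query only the one hourly series needed from each, then write the results side by side. The sketch below is an assumption-laden illustration, not a Honeybee or Ladybug Tools feature. The table and column names are assumed from the SQLite schema of recent EnergyPlus versions, the function names are hypothetical, and values are summed across all keys (zones) at each timestep.

```python
import csv
import sqlite3

# Assumed schema: ReportData / ReportDataDictionary; verify against your files.
HOURLY_SQL = """
SELECT r.TimeIndex, SUM(r.Value)
FROM ReportData AS r
JOIN ReportDataDictionary AS d
  ON d.ReportDataDictionaryIndex = r.ReportDataDictionaryIndex
WHERE d.Name = ? AND d.ReportingFrequency = 'Hourly'
GROUP BY r.TimeIndex
ORDER BY r.TimeIndex
"""


def hourly_totals(sql_path, variable_name):
    """Return one value per timestep, summed over all keys, for one scenario."""
    conn = sqlite3.connect(sql_path)
    try:
        return [value for _, value in conn.execute(HOURLY_SQL, (variable_name,))]
    finally:
        conn.close()


def write_scenarios_csv(sql_paths, variable_name, out_file):
    """Write one column per scenario; rows stop at the shortest series."""
    columns = [hourly_totals(path, variable_name) for path in sql_paths]
    writer = csv.writer(out_file)
    writer.writerow(sql_paths)
    for row in zip(*columns):
        writer.writerow(row)
```

Whether this actually helps depends on where the time is going; if the bottleneck is the component materialising every series in Grasshopper, querying a single series outside of it may be considerably faster, but that is untested here.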