Benchmark-Implementation (Feature #164)
A working benchmark environment has been implemented.
A few questions and to-dos remain:
- General: Is the implementation ok?
- At the moment there are only dummy benchmark executables for the inverter and hmc; these still have to be filled with kernel/function calls.
- At the moment all results are written into a single file at the end. Would separate files be preferable?
- Kernel sizes are a problem: some kernels have dynamic sizes (e.g. polyakov-loop). How should these be handled?
- Similarly, the read/write load still has to be counted for many kernels. In addition, every load that has already been entered should be rechecked!
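
One possible way to handle the last two points, sketched here for discussion only: pass the read/write load in with every recorded call instead of fixing it once per kernel, so that kernels with dynamic sizes report the traffic they actually caused. The names `KernelStats`, `BenchmarkLog`, `record` and `bandwidth_gbs` are hypothetical and do not come from the existing code; the bandwidth is computed as (bytes read + bytes written) / time, with 1 GB taken as 1e9 bytes.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-kernel record; not taken from the existing sources.
struct KernelStats {
    std::size_t calls = 0;
    std::size_t bytes_read = 0;
    std::size_t bytes_written = 0;
    double seconds = 0.0;

    // Effective bandwidth in GB/s (1 GB = 1e9 bytes); 0 if nothing was timed.
    double bandwidth_gbs() const {
        if (seconds <= 0.0) return 0.0;
        return static_cast<double>(bytes_read + bytes_written) / seconds / 1e9;
    }
};

// Hypothetical collector that accumulates statistics per kernel name.
class BenchmarkLog {
public:
    // Record one kernel call. The load is given per call, so a kernel
    // whose size changes between calls is still accounted for correctly.
    void record(const std::string& kernel,
                std::size_t bytes_read,
                std::size_t bytes_written,
                double seconds) {
        KernelStats& s = stats_[kernel];
        s.calls += 1;
        s.bytes_read += bytes_read;
        s.bytes_written += bytes_written;
        s.seconds += seconds;
    }

    // Throws std::out_of_range if the kernel was never recorded.
    const KernelStats& get(const std::string& kernel) const {
        return stats_.at(kernel);
    }

private:
    std::map<std::string, KernelStats> stats_;
};
```

With this layout the question of one output file versus several stays open: the map can be dumped as a whole or iterated and written per kernel.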
Related issues: blocks CL2QCD - Benchmark #218: First Optimizations for dslash_eoprec (New, 02 Nov 2011)