Refactor CG-M and speed it up (Feature #732)


Added by Alessandro Sciarra over 2 years ago. Updated over 2 years ago.


Status: In Progress
Start date: 03 Jun 2015
Priority: Normal
Due date:
Assignee: Alessandro Sciarra
% Done: 30%
Category: -
Target version: -

Description

Some work on the multi-shifted inverter was done in Feature #562, but this part of the code is still far from well implemented. The CG-M should be refactored. It may make sense to implement a class that does what is currently done in the function physics::algorithms::solvers::cg_m.


Associated revisions

Revision 57af7d15
Added by Alessandro Sciarra over 2 years ago

Improved CG-M performance (avoided GPU-host comm. and do not check always residuum).
refs #732

Revision 461a43c4
Added by Alessandro Sciarra over 2 years ago

Removed forgotten debug output in CG-M solver.
refs #732

History

Updated by Alessandro Sciarra over 2 years ago

  • Subject changed from Refactor CG-M to Refactor CG-M and speed it up
  • Status changed from New to In Progress

Updated by Alessandro Sciarra over 2 years ago

After several tries, I found two places in the code that were definitely slowing it down.

In the CG-M, all equations are solved at the same time, but once an equation reaches the desired precision (residuum per equation) it is no longer considered. When all equations are below the precision, the overall residuum is checked. We know from the standard CG that it pays not to check convergence at every iteration, but only every few iterations. In the CG-M this should be applied not to the overall residuum but to the residuum per equation, since each such check is basically a call to the squarenorm kernel. This was not implemented, and it already gives some speed-up if used properly.
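The bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions, not the CL2QCD implementation: `run_with_periodic_checks`, `CheckStats` and `toy_residuum` are hypothetical names, the callable `residuum(k, iter)` stands in for a squarenorm kernel call on equation k, the actual CG-M update is left as a comment, and the final check of the overall residuum is omitted.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical bookkeeping result: iterations performed and number of
// per-equation residuum evaluations (each one models a squarenorm call).
struct CheckStats {
    int iterations = 0;
    int residuum_evaluations = 0;
};

// Toy model of a residuum that halves (k+1) times per iteration.
// It only exists to make the sketch runnable; it is not a real solver.
double toy_residuum(std::size_t k, int iter)
{
    return std::pow(0.5, static_cast<double>(iter) * static_cast<double>(k + 1));
}

// Runs the iteration loop and checks the residuum of each equation that has
// not yet converged only every `check_every` iterations.
CheckStats run_with_periodic_checks(std::size_t num_equations,
                                    const std::function<double(std::size_t, int)>& residuum,
                                    double precision, int check_every, int max_iterations)
{
    if (check_every < 1) check_every = 1;  // assumption: clamp invalid input
    std::vector<char> converged(num_equations, 0);
    std::size_t remaining = num_equations;
    CheckStats stats;
    for (int iter = 1; iter <= max_iterations && remaining > 0; ++iter) {
        stats.iterations = iter;
        // ... one CG-M update of all non-converged equations would go here ...
        if (iter % check_every != 0) continue;  // skip the expensive check
        for (std::size_t k = 0; k < num_equations; ++k) {
            if (converged[k]) continue;  // converged equations are ignored
            ++stats.residuum_evaluations;
            if (residuum(k, iter) < precision) {
                converged[k] = 1;
                --remaining;
            }
        }
    }
    return stats;
}
```

With two toy equations and a precision of 1e-6, checking at every iteration costs 30 residuum evaluations, while checking every 5 iterations costs 6 for the same 20 iterations. A larger interval can overshoot by a few iterations, so the interval is a trade-off between extra solver iterations and saved squarenorm calls.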

The second, even more important, point is that I forgot to change a sax call like

void sax(const Staggeredfield_eo* out, const hmc_float alpha, const Staggeredfield_eo& x);

to one like

void sax(const Staggeredfield_eo* out, const Vector<hmc_float>& alpha, const int index_alpha, const Staggeredfield_eo& x);

in the case where alpha is a Vector (stored on the device). This means that I was doing an unnecessary host-device communication.
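The difference between the two call styles can be illustrated with a toy model. Everything here is a hypothetical stand-in and not the CL2QCD API: `ToyDeviceVector` mimics a device-resident buffer and merely counts simulated device-to-host copies, `ToyField` replaces Staggeredfield_eo, and `toy_sax` replaces sax.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for a buffer living on the device (NOT the real Vector class).
class ToyDeviceVector {
public:
    explicit ToyDeviceVector(std::vector<double> values) : data_(std::move(values)) {}

    // Models a device-to-host copy of one element; counted as a transfer.
    double copy_to_host(std::size_t i) const
    {
        ++transfers_;
        return data_[i];
    }

    // Models a read from inside a kernel: the host is not involved.
    double device_read(std::size_t i) const { return data_[i]; }

    int transfers() const { return transfers_; }

private:
    std::vector<double> data_;
    mutable int transfers_ = 0;
};

using ToyField = std::vector<double>;  // stand-in for a field on the device

// out = alpha * x, with alpha given as a host scalar.
void toy_sax(ToyField* out, double alpha, const ToyField& x)
{
    out->resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) (*out)[i] = alpha * x[i];
}

// out = alpha[index_alpha] * x, with alpha read directly on the device.
void toy_sax(ToyField* out, const ToyDeviceVector& alpha, std::size_t index_alpha,
             const ToyField& x)
{
    const double a = alpha.device_read(index_alpha);
    out->resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) (*out)[i] = a * x[i];
}
```

In this model, calling the scalar overload with a coefficient that lives in a `ToyDeviceVector` forces a `copy_to_host` first, whereas the indexed overload leaves the transfer counter untouched. In a solver loop that performs such a call for every equation at every iteration, those copies add up.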

Still the code should be refactored!

  • Start date set to 03 Jun 2015
  • % Done changed from 0 to 30
