hardware_code_molecular_dynamics_GPU (Failed) (Defect #700)


Added by Matthias Bach about 3 years ago. Updated about 3 years ago.


Status: Done
Start date: 19 Sep 2014
Priority: Normal
Due date:
Assignee: Matthias Bach
% Done: 100%
Category: -
Target version: 2014.1
Estimated time: 8.00 hours

Description

Test on gpu-dev03 (AMD Radeon HD7970)


Associated revisions

Revision b754a415
Added by Matthias Bach about 3 years ago

Fix reading of Gaugefields from ILDG files

The internally allocated buffer into which the read gauge field was written was not
passed back to the caller. Not only was the memory lost, but the caller was reading
some random location in memory instead of the read gauge field.

refs #700: Had to fix this on the way…
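
A minimal sketch of the kind of fix described in b754a415, not the actual code from that revision. The failure mode in the commit message is typical of a reader that allocates a buffer internally and assigns it only to a local copy of the caller's pointer. Returning the buffer by value avoids both the leak and the stale read. `Element` and `read_field` are hypothetical names, and the constant fill value stands in for the real ILDG parsing.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one gauge field entry; not the project's real type.
using Element = double;

// Assumed shape of the fix: the reader owns the buffer while filling it and
// hands ownership to the caller through the return value. If the buffer were
// instead assigned to a pointer parameter taken by value, the caller would
// keep its old (possibly uninitialized) pointer and the allocation would leak.
std::vector<Element> read_field(std::size_t num_elements) {
    std::vector<Element> buffer(num_elements);
    for (std::size_t i = 0; i < num_elements; ++i) {
        buffer[i] = 1.0;  // placeholder for the value parsed from the file
    }
    return buffer;  // moved out to the caller, nothing is lost
}
```

The caller then works on exactly the data that was read, for example `auto field = read_field(4);`.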

Revision 882b53d7
Added by Matthias Bach about 3 years ago

Safer initialization of Gaugefield

Do not dereference an uninitialized PRNG pointer.

refs #700
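
A minimal sketch of the guard described in 882b53d7, under the assumption that the PRNG is only needed for a randomized (hot) start. It is not the code from that revision. `init_first_value`, `InitResult` and the use of `std::mt19937` are hypothetical. The point is that the pointer is checked before it is dereferenced and is never touched on the path that does not need it.

```cpp
#include <random>
#include <stdexcept>

// Hypothetical result type, only here so the behaviour can be inspected.
struct InitResult {
    bool randomized;
    double first_value;
};

// Assumption: callers pass nullptr when no PRNG has been set up yet.
InitResult init_first_value(std::mt19937* prng, bool hot_start) {
    if (!hot_start) {
        // Cold start: the PRNG is not needed, so it is never dereferenced.
        return {false, 0.0};
    }
    if (prng == nullptr) {
        // Fail loudly instead of reading through an invalid pointer.
        throw std::invalid_argument("hot start requested without a PRNG");
    }
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    return {true, dist(*prng)};
}
```

The check only helps if the pointer is set to `nullptr` wherever it is declared, since an uninitialized pointer cannot be distinguished from a valid one.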

Revision 78f5b720
Added by Matthias Bach about 3 years ago

Work around miscompilation by Catalyst 14.4 on Radeon HD 7970.

Split fermion_force_eo kernel to avoid miscompilation.

fixes #700
fixes #706
fixes #707
fixes #708

History

Updated by Matthias Bach about 3 years ago

  • Target version set to 2014.1

Updated by Matthias Bach about 3 years ago

It seems the fermion_force_eo kernel is broken for kappa != 0. This will have to be investigated further.

  • Estimated time set to 8.00

Updated by Matthias Bach about 3 years ago

CPU results are fine...

Updated by Matthias Bach about 3 years ago

  • Status changed from New to Code Review
  • % Done changed from 0 to 100

Updated by Matthias Bach about 3 years ago

Christopher, I am just assigning this to you for review so you are informed about the change in 78f5b720.

If you observe any performance degradation or other issues, we will have to look for a different workaround. Otherwise please close the ticket.

  • Assignee set to Christopher Pinke

Updated by Christopher Pinke about 3 years ago

Thanks, Matthias. It's quite unfortunate that the kernel has to be split like this, but of course the newer driver should be usable.
Regarding the performance, I did not run any benchmarks. Anyway, I guess that the new driver will improve performance on newer cards.
Do you expect any overhead from the increased number of kernels? It should be similar to the dslash.
If it turns out to be a problem I will create a ticket, but I do not think it will be an issue.

  • Status changed from Code Review to Feedback
  • Assignee changed from Christopher Pinke to Matthias Bach

Updated by Matthias Bach about 3 years ago

  • Status changed from Feedback to Done
