Version 1.0.0 Code and Documentation Release
We are happy to release version 1.0.0 of CALDGEMM, along with a technical report that describes its inner workings and how we integrated the library into a modified version of High Performance Linpack (HPL). You can download the tarball from the files section of the website or clone the git repository.
CALDGEMM lets you use AMD GPUs to achieve the highest DGEMM performance possible on current hardware, coming close to the theoretical peak. It uses GotoBLAS for the CPU side of the combined CPU/GPU computation.