Cyclops Tensor Framework
parallel arithmetic on multidimensional arrays
Performs recursive parallel matrix multiplication using the slice interface to extract blocks.
Functions
int test_subworld_gemm(int n, int m, int k, int div_, World &dw)
char *getCmdOption(char **begin, char **end, const std::string &option)
int main(int argc, char **argv)
char *getCmdOption(char **begin, char **end, const std::string &option)
Definition at line 102 of file subworld_gemm.cxx.
Referenced by main().
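The body of getCmdOption() is not reproduced on this page. Helpers with this signature are commonly written as a linear search over the argument array that returns the token following the matching flag. The following is a minimal sketch under that assumption; it is not necessarily the code in subworld_gemm.cxx.

```cpp
#include <algorithm>
#include <string>

// Assumed behaviour: return the argument that follows `option` in
// [begin, end), or a null pointer if the option is absent or is the
// last token (that is, it has no value after it).
char *getCmdOption(char **begin, char **end, const std::string &option) {
  // std::find compares each char* against the std::string by content.
  char **itr = std::find(begin, end, option);
  if (itr != end && ++itr != end) {
    return *itr;
  }
  return nullptr;
}
```

With such a helper, main() could read a matrix dimension by calling getCmdOption(argv, argv + argc, "-n") and converting the result when it is non-null. The flag name "-n" is illustrative only.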
int main(int argc, char **argv)
Definition at line 112 of file subworld_gemm.cxx.
References getCmdOption(), ctf.core::np(), ctf.core::rank(), and test_subworld_gemm().
int test_subworld_gemm(int n, int m, int k, int div_, World &dw)
Definition at line 13 of file subworld_gemm.cxx.
References CTF::Tensor< dtype >::add_from_subworld(), CTF::Tensor< dtype >::add_to_subworld(), CTF::World::comm, CTF::Tensor< dtype >::get_local_data(), CTF::Tensor< dtype >::norm2(), ctf.core::np(), NS, ctf.core::rank(), and CTF::Tensor< dtype >::write().
Referenced by main().