High-Performance Tensor Contractions for GPUs: Key Points
MJSon
2017. 6. 2. 17:21
The MAGMA libraries were designed for heterogeneous architectures. Current investigations cover new data layouts, functionalities, batched computations, and possibly generalizations to tensors, so that applications gain new functionality for dealing efficiently with multi-dimensional data.
We transform the tensor contractions into batched GEMM.
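To make the reduction to batched GEMM concrete, here is a minimal sketch in plain Python. The contraction shape C[t][i][j] = sum_k A[t][i][k] * B[k][j] and the function names `gemm`, `contract_direct`, and `contract_as_batched_gemm` are my own illustrative assumptions, not the contractions or the API from the paper. The point is only that the index `t` splits the contraction into independent small matrix products.

```python
from typing import List

Matrix = List[List[float]]


def gemm(a: Matrix, b: Matrix) -> Matrix:
    """Reference GEMM on row-major nested lists: returns the product a * b."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def contract_direct(a: List[Matrix], b: Matrix) -> List[Matrix]:
    """Direct index loops for C[t][i][j] = sum_k A[t][i][k] * B[k][j]."""
    batch, m, k, n = len(a), len(a[0]), len(b), len(b[0])
    c = [[[0.0] * n for _ in range(m)] for _ in range(batch)]
    for t in range(batch):
        for i in range(m):
            for j in range(n):
                for p in range(k):
                    c[t][i][j] += a[t][i][p] * b[p][j]
    return c


def contract_as_batched_gemm(a: List[Matrix], b: Matrix) -> List[Matrix]:
    """Same contraction, written as one independent small GEMM per slice t."""
    return [gemm(a_t, b) for a_t in a]
```

On a GPU the list comprehension in `contract_as_batched_gemm` would correspond to a single batched launch rather than a Python loop. The sketch only shows that both formulations compute the same result.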
The kernels are organized as follows: (1) Read A and B into shared memory; (2) Compute AB in registers; (3) Update C. A, B, and C are accessed through functions in the tensor’s structure definition, which allows us to work with matrices that are not stored in the standard matrix format.
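The three steps and the accessor idea can be sketched on the CPU as follows. This is a rough illustration, not the MAGMA kernel: `MatrixView`, its stride fields, and `tiled_gemm` are hypothetical names, the local lists only stand in for shared memory and registers, and the tile size is arbitrary. Because every element is read through `get` and written through `add`, the operands may use any strided layout.

```python
class MatrixView:
    """2-D view into a flat buffer (hypothetical helper).

    Element (i, j) lives at offset + i * row_stride + j * col_stride,
    so row-major, column-major, and tensor slices all fit the same interface.
    """

    def __init__(self, data, rows, cols, row_stride, col_stride, offset=0):
        self.data, self.rows, self.cols = data, rows, cols
        self.rs, self.cs, self.offset = row_stride, col_stride, offset

    def _index(self, i, j):
        return self.offset + i * self.rs + j * self.cs

    def get(self, i, j):
        return self.data[self._index(i, j)]

    def add(self, i, j, value):
        self.data[self._index(i, j)] += value


def tiled_gemm(a, b, c, tile=2):
    """Compute C += A * B one output tile at a time."""
    for i0 in range(0, a.rows, tile):
        for j0 in range(0, b.cols, tile):
            ti = min(tile, a.rows - i0)
            tj = min(tile, b.cols - j0)
            acc = [[0.0] * tj for _ in range(ti)]  # stands in for registers
            for k0 in range(0, a.cols, tile):
                tk = min(tile, a.cols - k0)
                # (1) Read tiles of A and B through the accessors
                #     (the local lists stand in for shared memory).
                a_tile = [[a.get(i0 + i, k0 + k) for k in range(tk)]
                          for i in range(ti)]
                b_tile = [[b.get(k0 + k, j0 + j) for j in range(tj)]
                          for k in range(tk)]
                # (2) Accumulate the partial product of the two tiles.
                for i in range(ti):
                    for j in range(tj):
                        for k in range(tk):
                            acc[i][j] += a_tile[i][k] * b_tile[k][j]
            # (3) Update C through its accessor.
            for i in range(ti):
                for j in range(tj):
                    c.add(i0 + i, j0 + j, acc[i][j])
```

For example, A can be stored column-major (`row_stride=1, col_stride=rows`) while B and C are row-major, and `tiled_gemm` needs no changes.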
5.3 Autotuning
The model-driven part comprises compiler code generation and optimizations (as in Section 4).
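One way to picture a model-driven stage is as a filter over generated kernel configurations, applied before anything is timed. The sketch below is my own assumption about how such a filter could look, not the paper's code generator. The parameter names, the candidate values, and the default limits (1024 threads per block, 48 KB of shared memory, 8-byte elements) are all hypothetical placeholders.

```python
from itertools import product


def candidate_configs(tile_sizes=(4, 8, 16, 32), thread_dims=(4, 8, 16)):
    """Enumerate (tile_m, tile_n, tile_k, threads_x, threads_y) candidates."""
    return list(product(tile_sizes, tile_sizes, tile_sizes,
                        thread_dims, thread_dims))


def fits_model(cfg, max_threads=1024, shared_bytes=48 * 1024, elem_bytes=8):
    """Simple resource model; every limit here is an assumed placeholder."""
    tile_m, tile_n, tile_k, tx, ty = cfg
    threads = tx * ty
    # Shared memory needed to hold one tile of A and one tile of B.
    shared = (tile_m * tile_k + tile_k * tile_n) * elem_bytes
    return (threads <= max_threads
            and shared <= shared_bytes
            and tile_m % tx == 0   # threads must divide the C tile evenly
            and tile_n % ty == 0)


def prune(configs, **limits):
    """Keep only the configurations that the resource model accepts."""
    return [cfg for cfg in configs if fits_model(cfg, **limits)]
```

Only the configurations returned by `prune` would need to be compiled and benchmarked, which is what keeps the search space manageable.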
The resulting HP tensor contractions package will be released as open source through the MAGMA library.