MJay

High-Performance Tensor Contractions for GPUs: Key Points


MJSon 2017. 6. 2. 17:21
The MAGMA libraries were designed for heterogeneous architectures. Current investigations cover new data layouts, functionalities, batched computations, and possible generalizations to tensors, in order to give applications new functionality for dealing efficiently with multi-dimensional data.


We transform the tensor contractions into batched GEMM operations.
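A minimal NumPy sketch of this mapping (sizes and variable names are illustrative, not from the paper): a contraction over a shared index, carried out independently for each batch entry, is exactly a batched GEMM.

```python
import numpy as np

# Hypothetical sizes: a batch of small, independent contractions.
batch, m, n, k = 4, 8, 8, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((batch, m, k))
B = rng.standard_normal((batch, k, n))

# The contraction C[b,i,j] = sum_k A[b,i,k] * B[b,k,j] ...
C_contraction = np.einsum('bik,bkj->bij', A, B)

# ... is a batched GEMM: one independent matrix multiply per batch entry.
C_batched = np.stack([A[b] @ B[b] for b in range(batch)])

assert np.allclose(C_contraction, C_batched)
```

On a GPU, the per-batch multiplies would be dispatched together in a single batched kernel rather than looped over on the host.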


The kernels are organized as follows: (1) read A and B into shared memory; (2) compute AB in registers; (3) update C. A, B, and C are read through functions in the tensor's structure definition, which allows the kernels to work with matrices that are not stored in the standard matrix format.
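The three-step structure can be sketched on the CPU as follows (all names are hypothetical). Tiles of A and B are staged into "shared memory" buffers, the product accumulates in a "register" tile, and C is updated last; accessor functions stand in for the reads through the tensor's structure definition, so a non-standard layout only changes the accessor, not the kernel.

```python
import numpy as np

def tiled_gemm(read_a, read_b, read_c, m, n, k, tile=4):
    """Compute C = C0 + A*B through layout-hiding accessor functions."""
    C = np.empty((m, n))
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            acc = np.zeros((tile, tile))  # (2) accumulate AB in "registers"
            for k0 in range(0, k, tile):
                # (1) read tiles of A and B into "shared memory" buffers
                sA = np.array([[read_a(i0 + i, k0 + p) for p in range(tile)]
                               for i in range(tile)])
                sB = np.array([[read_b(k0 + p, j0 + j) for j in range(tile)]
                               for p in range(tile)])
                acc += sA @ sB
            # (3) update C with the accumulated tile
            for i in range(tile):
                for j in range(tile):
                    C[i0 + i, j0 + j] = read_c(i0 + i, j0 + j) + acc[i, j]
    return C

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
B = rng.standard_normal((8, 8))
C0 = rng.standard_normal((8, 8))
C = tiled_gemm(lambda i, p: A[i, p], lambda p, j: B[p, j],
               lambda i, j: C0[i, j], 8, 8, 8)
assert np.allclose(C, C0 + A @ B)
```

The point of the accessors is that, e.g., a transposed or strided tensor slice can be multiplied by swapping in a different `read_a` without touching the kernel body.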

5.3 Autotuning 

The model-driven part comprises compiler code generation and optimizations (as in Section 4).
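The empirical side of autotuning can be sketched as follows (the tuning space and variant generator are hypothetical, not the paper's): enumerate candidate tile sizes for which code generation would emit kernel variants, time each on a representative problem, and keep the fastest configuration.

```python
import timeit
import numpy as np

def gemm_variant(A, B, tile):
    """One tuning candidate: a blocked GEMM with the given tile size."""
    m, k = A.shape
    n = B.shape[1]
    C = np.zeros((m, n))
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                C[i0:i0+tile, j0:j0+tile] += (A[i0:i0+tile, k0:k0+tile]
                                              @ B[k0:k0+tile, j0:j0+tile])
    return C

rng = np.random.default_rng(2)
A = rng.standard_normal((64, 64))
B = rng.standard_normal((64, 64))
candidates = [8, 16, 32, 64]  # hypothetical tuning space
times = {t: timeit.timeit(lambda: gemm_variant(A, B, t), number=3)
         for t in candidates}
best = min(times, key=times.get)
assert np.allclose(gemm_variant(A, B, best), A @ B)
```

In the real setting each candidate is a generated GPU kernel, and the winning configuration is recorded per problem size rather than chosen once.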

The resulting high-performance tensor contraction package will be released as open source through the MAGMA library.