MJay
High-Performance Tensor Contractions for GPUs: Key Excerpts
The MAGMA libraries were designed for heterogeneous architectures. Current investigations cover new data layouts, functionalities, batched computations, and possibly generalizations to tensors, in order to give applications new functionality for dealing efficiently with multi-dimensional data.
We transform the tensor contractions into batched GEMM operations.
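To make the batched GEMM idea concrete, here is a minimal sketch in plain Python. The contraction C[i][j][l] = sum_k A[i][k] * B[k][j][l] is an example chosen for illustration and is not taken from the paper. The function names (`gemm`, `contract_direct`, `contract_batched`) are hypothetical. The point is only that fixing the free index `l` turns the contraction into a set of independent small matrix products, which is the pattern a batched GEMM routine executes in one call.

```python
def gemm(A, B):
    """Plain matrix product of two nested-list matrices."""
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def contract_direct(A, B):
    """Reference loop nest: C[i][j][l] = sum_k A[i][k] * B[k][j][l]."""
    I, K = len(A), len(A[0])
    J, L = len(B[0]), len(B[0][0])
    return [[[sum(A[i][k] * B[k][j][l] for k in range(K)) for l in range(L)]
             for j in range(J)] for i in range(I)]


def contract_batched(A, B):
    """Same contraction written as L independent GEMMs, one per slice l."""
    I, K = len(A), len(B)
    J, L = len(B[0]), len(B[0][0])
    # Slice l of B is an ordinary K x J matrix.
    slices = [[[B[k][j][l] for j in range(J)] for k in range(K)]
              for l in range(L)]
    # The "batch": every product is independent of the others.
    products = [gemm(A, S) for S in slices]
    # Scatter the results back into the output tensor.
    return [[[products[l][i][j] for l in range(L)] for j in range(J)]
            for i in range(I)]


# Small illustrative inputs: A is 3 x 2, B is 2 x 3 x 2.
A = [[1, 2], [3, 4], [5, 6]]
B = [[[1, 0], [2, 1], [0, 3]],
     [[1, 1], [0, 2], [4, 0]]]

C_ref = contract_direct(A, B)
C_bat = contract_batched(A, B)
```

On a GPU the list comprehension over `slices` would be replaced by a single batched call, so that many small products share one kernel launch.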
The kernels are organized as follows: (1) Read A and B into shared memory; (2) Compute AB in registers; (3) Update C. Reading A, B, and C is through functions in the tensor’s structure definition, which allows us to work with matrices that are not stored in the standard matrix format.
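The three kernel steps can be mimicked on the CPU to show the control flow. The sketch below is not the MAGMA kernel and not CUDA. `StridedMatrix` and `tiled_gemm_update` are hypothetical names, the tile size is arbitrary, and local Python lists stand in for shared memory and registers. The accessor methods play the role of the functions in the tensor's structure definition, so the same loop works on matrices with non-standard storage.

```python
class StridedMatrix:
    """Hypothetical view of a matrix stored inside a flat buffer.

    Element (i, j) lives at offset + i*row_stride + j*col_stride, so the
    matrix does not have to be stored in a standard dense layout.
    """

    def __init__(self, data, rows, cols, row_stride, col_stride, offset=0):
        self.data, self.rows, self.cols = data, rows, cols
        self.rs, self.cs, self.offset = row_stride, col_stride, offset

    def get(self, i, j):
        return self.data[self.offset + i * self.rs + j * self.cs]

    def set(self, i, j, value):
        self.data[self.offset + i * self.rs + j * self.cs] = value


def tiled_gemm_update(A, B, C, tile=2):
    """Compute C += A * B one output tile at a time."""
    for i0 in range(0, C.rows, tile):
        for j0 in range(0, C.cols, tile):
            ih = min(tile, C.rows - i0)
            jw = min(tile, C.cols - j0)
            acc = [[0] * jw for _ in range(ih)]  # stand-in for registers
            for k0 in range(0, A.cols, tile):
                kd = min(tile, A.cols - k0)
                # (1) Read tiles of A and B into local buffers
                #     (stand-in for shared memory), via the accessors.
                sA = [[A.get(i0 + i, k0 + k) for k in range(kd)]
                      for i in range(ih)]
                sB = [[B.get(k0 + k, j0 + j) for j in range(jw)]
                      for k in range(kd)]
                # (2) Accumulate the tile product.
                for i in range(ih):
                    for j in range(jw):
                        for k in range(kd):
                            acc[i][j] += sA[i][k] * sB[k][j]
            # (3) Update C through its accessor.
            for i in range(ih):
                for j in range(jw):
                    C.set(i0 + i, j0 + j, C.get(i0 + i, j0 + j) + acc[i][j])


# A = [[1,2,3],[4,5,6],[7,8,9]] stored column-major; B and C row-major.
a_data = [1, 4, 7, 2, 5, 8, 3, 6, 9]
b_data = [1, 0, 1, 0, 1, 0, 2, 0, 0]
c_data = [0] * 9

A = StridedMatrix(a_data, 3, 3, row_stride=1, col_stride=3)
B = StridedMatrix(b_data, 3, 3, row_stride=3, col_stride=1)
C = StridedMatrix(c_data, 3, 3, row_stride=3, col_stride=1)
tiled_gemm_update(A, B, C)
```

Because every read and write goes through `get` and `set`, changing a layout only means changing the strides, not the loop body.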
5.3 Autotuning
The model-driven part comprises compiler code generation and optimizations (as in Section 4).
The resulting HP tensor contractions package will be released as open source through the MAGMA library.