Kernel and Algorithm Patterns for CUDA
- Reductions and Memory Patterns
- Reduction Patterns in CUDA
- Mapping Data into CUDA's Memories
- Input/Output Convolution
- Generic Algorithm Description
- What could each thread be assigned?
- Thread Assignment Trade-offs
- What memory space does the data use?
- Stencil Computation: Fluid Dynamics, Image Convolution
- Bonded Input/Output Convolutions