llvm-project.git

author	harsh <harsh@nod-labs.com>	2022-01-25 02:37:52 +0000
committer	harsh <harsh@nod-labs.com>	2022-01-25 03:24:14 +0000
commit	e01e4c9115ad49479d01b6b6de4e83ee454bab24 (patch)
tree	b48ab7a89dd1f9cbf9c94eb8a5ed2a930b6d53c2 /llvm/lib/Bitcode/Reader/BitcodeReader.cpp
parent	810f13f0ebde70e679a097a9f5dbe37fe58ffa27 (diff)

Fix bugs in GPUToNVVM lowering

The current lowering from GPU to NVVM does not correctly handle the following cases when lowering the gpu shuffle op.

1. When the active width is set to 32 (all lanes), the current approach computes (1 << 32) - 1, which results in poison values in the LLVM IR. We fix this by defining the active mask as (-1) >> (32 - width).

2. In the case of shuffle up, the computation of the third operand c has to be different from the other 3 modes due to the op definition in the ISA reference (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html). Specifically, the predicate value is computed as j >= maxLane for up and j <= maxLane for all other modes. We fix this by computing maskAndClamp as 32 - width for this mode.

TEST: We modify the existing test and add more checks for the up mode.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D118086
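The arithmetic behind both fixes can be sketched in plain C++. This is only an illustration of the integer math, not the lowering code from the patch: the helper names activeMask and maskAndClamp are hypothetical, and the width - 1 value used for the non-up modes is an assumption, since the message only states the value for the up mode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bit mask with the low `width` bits set.
// Assumes 1 <= width <= 32.
std::uint32_t activeMask(std::uint32_t width) {
  assert(width >= 1 && width <= 32);
  // (1u << width) - 1 would shift by 32 when width == 32, which is
  // out of range for a 32-bit value. Shifting all-ones right by
  // (32 - width) keeps the shift amount in [0, 31].
  return 0xFFFFFFFFu >> (32u - width);
}

// Hypothetical helper: clamp value for the shuffle's third operand.
// Assumes 1 <= width <= 32.
std::uint32_t maskAndClamp(std::uint32_t width, bool isUp) {
  assert(width >= 1 && width <= 32);
  // Up mode uses 32 - width, as stated in the commit message.
  // width - 1 for the other modes is an assumption of this sketch.
  return isUp ? 32u - width : width - 1u;
}
```

With width = 32 the mask is all ones and the up-mode clamp is 0; with width = 16 the mask is 0xFFFF and the up-mode clamp is 16.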

Diffstat (limited to 'llvm/lib/Bitcode/Reader/BitcodeReader.cpp')

0 files changed, 0 insertions, 0 deletions

