SuccessChanges

Summary

  1. AMDGPU: Split flat offsets that don't fit in DAG
  2. AMDGPU: Increase vcc liveness scan threshold
  3. [ConstantRange] Optimize nowrap region test, remove redundant tests; NFC
  4. [ConstantRange] makeGuaranteedNoWrapRegion(): `shl` support
  5. [InstCombine] Shift amount reassociation in shifty sign bit test
  6. [InstCombine] Add tests for uadd/sub.sat(a, b) == 0; NFC
  7. Fix buildbot error in SIRegisterInfo.cpp.
Commit 7cd57dcd5b716dd1dab446974abd4c51d01038a7 by Matthew.Arsenault
AMDGPU: Split flat offsets that don't fit in DAG
We handle it this way for some other address spaces.
Since r349196, SILoadStoreOptimizer has been trying to do this. This is
after SIFoldOperands runs, which can change the addressing patterns.
It's simpler to just split this earlier.
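As a rough illustration of what "splitting" an offset means, the sketch below breaks a constant byte offset into an immediate that fits an instruction field and a remainder that would be added to the base address. The names `SplitOffset` and `splitOffset`, the unsigned field, and the field width passed in are assumptions made for this example; the commit message does not state the real per-subtarget limits or how the backend code is structured.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: remainder goes into the base address
// computation, imm goes into the instruction's offset field.
struct SplitOffset {
  uint64_t remainder;
  uint32_t imm;
};

// Split `offset` so that `imm` fits in an unsigned field of `immBits` bits.
// The field width is a parameter here because the real limit is not given
// in the commit message (assumption for illustration only).
SplitOffset splitOffset(uint64_t offset, unsigned immBits) {
  assert(immBits > 0 && immBits < 32 && "illustrative range check");
  const uint64_t mask = (uint64_t{1} << immBits) - 1;
  return {offset & ~mask, static_cast<uint32_t>(offset & mask)};
}
```

With an assumed 12-bit field, `splitOffset(0x1234, 12)` yields a remainder of `0x1000` and an immediate of `0x234`; the two parts always add back up to the original offset.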
llvm-svn: 375366
The file was modified llvm/test/CodeGen/AMDGPU/cgp-addressing-modes.ll
The file was modified llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
The file was modified llvm/test/CodeGen/AMDGPU/global_atomics.ll
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.h
The file was modified llvm/test/CodeGen/AMDGPU/offset-split-global.ll
The file was modified llvm/test/CodeGen/AMDGPU/store-hi16.ll
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
The file was modified llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
The file was modified llvm/test/CodeGen/AMDGPU/flat-address-space.ll
The file was modified llvm/test/CodeGen/AMDGPU/global-saddr.ll
The file was modified llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
Commit e5be543a55986e353d40d79702eef5cff3934348 by Matthew.Arsenault
AMDGPU: Increase vcc liveness scan threshold
Avoids a test regression in a future patch. Also add debug printing on
this case, so I waste less time debugging folds in the future.
llvm-svn: 375367
The file was modified llvm/test/CodeGen/AMDGPU/fence-barrier.ll
The file was modified llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll
The file was modified llvm/test/CodeGen/AMDGPU/ds-negative-offset-addressing-mode-loop.ll
The file was modified llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
The file was modified llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
The file was modified llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
Commit 926dae33ba658d72e9f8d76d004cd943d6280250 by nikita.ppv
[ConstantRange] Optimize nowrap region test, remove redundant tests; NFC
Enumerate one less constant range in TestNoWrapRegionExhaustive, which
was unnecessary. This allows us to bump the bit count from 3 to 5 while
keeping reasonable timing.
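To give a feel for why the bit count matters for timing, the sketch below counts the distinct ranges an exhaustive enumeration has to visit at a given bit width. It assumes the usual half-open wrapping representation `[lo, hi)`, where every pair with `lo != hi` is a distinct range and the empty and full sets are two extra cases; the function name `countConstantRanges` is made up for this example, and how the actual test nests its loops is not shown here.

```cpp
#include <cstdint>

// Number of distinct half-open wrapping ranges [lo, hi) over values of
// `bits` bits: all pairs with lo != hi, plus the empty and the full set.
// (Illustrative helper, not part of the LLVM unit tests.)
uint64_t countConstantRanges(unsigned bits) {
  const uint64_t values = uint64_t{1} << bits;
  return values * (values - 1) + 2;
}
```

Under these assumptions there are 58 ranges at 3 bits and 994 at 5 bits, so each additional level of range enumeration multiplies the work by a much larger factor at the wider bit count.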
Drop four tests for multiply nowrap regions, as these cover subsets of
the exhaustive test. They do use a wider bitwidth, but I don't think
it's worthwhile to have them additionally now.
llvm-svn: 375369
The file was modified llvm/unittests/IR/ConstantRangeTest.cpp
Commit 4b6223263a3c1fb98bc69e8eb6722d48e4eb9f49 by lebedev.ri
[ConstantRange] makeGuaranteedNoWrapRegion(): `shl` support
Summary: If all the shift amounts are already poison-producing, then we
can add more poison-producing flags on top:
https://rise4fun.com/Alive/Ocwi
Otherwise, we should only consider the possible range of shift amounts that
don't result in poison.
For the unsigned range to not overflow, we must not shift out any set bits,
and the actual limit for `x` can be computed by backtransforming the
maximal value we could ever get out of the `shl` (`-1`) through
`lshr`. If `x` is any larger than that, then it will overflow.
Likewise for the signed range, but in the signed domain.
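The bound computation described above can be sketched on an 8-bit model. This is only an illustration under stated assumptions: the helper names are hypothetical and are not the `ConstantRange` API, a single fixed shift amount is used instead of a range of amounts, and the signed bounds are written as divisions by a power of two to stand in for an arithmetic right shift of the signed minimum and maximum.

```cpp
#include <cassert>
#include <cstdint>

// Largest unsigned x for which (x << shamt) shifts out no set bits:
// the all-ones value (-1) pushed back through a logical right shift.
uint8_t maxUnsignedNoWrap(unsigned shamt) {
  assert(shamt < 8 && "larger shift amounts already produce poison");
  return static_cast<uint8_t>(0xFFu >> shamt);
}

// Same idea in the signed domain: the signed minimum and maximum shifted
// right arithmetically (written as exact/truncating division for clarity).
int8_t minSignedNoWrap(unsigned shamt) {
  assert(shamt < 8);
  return static_cast<int8_t>(-128 / (1 << shamt));
}
int8_t maxSignedNoWrap(unsigned shamt) {
  assert(shamt < 8);
  return static_cast<int8_t>(127 / (1 << shamt));
}

// Brute-force check that the bounds are exact for every x and shift amount.
bool boundsAreExact() {
  for (unsigned s = 0; s < 8; ++s) {
    for (int x = 0; x <= 255; ++x) {
      const bool fits = (x << s) <= 255;  // computed in a wider type
      if (fits != (x <= maxUnsignedNoWrap(s))) return false;
    }
    for (int x = -128; x <= 127; ++x) {
      const int wide = x * (1 << s);  // exact product in a wider type
      const bool fits = wide >= -128 && wide <= 127;
      const bool inRange =
          x >= minSignedNoWrap(s) && x <= maxSignedNoWrap(s);
      if (fits != inRange) return false;
    }
  }
  return true;
}
```

For a shift amount of 1, this gives an unsigned limit of 127 and a signed range of -64 to 63; `boundsAreExact()` walks every 8-bit value to confirm that values inside the bounds never overflow and values outside always do.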
This is based on the general idea outlined by @nikic in
https://reviews.llvm.org/D68672#1714990
Reviewers: nikic, sanjoy
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits, nikic
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69217
llvm-svn: 375370
The file was modified llvm/unittests/IR/ConstantRangeTest.cpp
The file was modified llvm/lib/IR/ConstantRange.cpp
Commit 49483a3bc2253c9e252e5e37b709534e3b6e51cc by lebedev.ri
[InstCombine] Shift amount reassociation in shifty sign bit test
(PR43595)
Summary: This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
and we can simplify-add their shift amounts together,
then we likely can just perform them as a single shift.
But this is finicky, has one-use restrictions,
and shift opcodes must be identical.
But there is a super-pattern where both of these work together to
produce a sign bit test from two shifts + comparison. We do indeed already
handle this in most cases. But since we get that fold transitively, it
has one-use restrictions. And what's worse, in this case the
right-shifts aren't required to be identical, and we can't handle that
transitively:
If the total shift amount is bitwidth-1, only a sign bit will remain in
the output value. But if we look at this from the perspective of two
shifts, we can't fold - we can't possibly know what bit pattern we'd
produce via two shifts, it will be *some* kind of a mask produced from
the original sign bit, but we just can't tell its shape:
https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN
But it will *only* contain sign bit and zeros. So from the perspective
of sign bit test, we're good: https://rise4fun.com/Alive/FRz
https://rise4fun.com/Alive/qBU Superb!
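The claim can be checked exhaustively on an 8-bit model. The sketch below is an illustration only: `lshr8`, `ashr8`, and `twoShiftsTestSignBit` are hypothetical helpers, not InstCombine code. It tries every value, every way of splitting a total shift amount of 7 across two right shifts, and every mix of logical and arithmetic shifts, and confirms that the result is non-zero exactly when the sign bit of the input is set.

```cpp
#include <cassert>
#include <cstdint>

// Logical right shift on 8 bits.
uint8_t lshr8(uint8_t v, unsigned amt) {
  assert(amt < 8);
  return static_cast<uint8_t>(v >> amt);
}

// Arithmetic right shift on 8 bits: vacated high bits copy the sign bit.
uint8_t ashr8(uint8_t v, unsigned amt) {
  assert(amt < 8);
  uint8_t shifted = static_cast<uint8_t>(v >> amt);
  if (v & 0x80u) shifted |= static_cast<uint8_t>(0xFFu << (8 - amt));
  return shifted;
}

// For every x and every split a + b == 7, all four combinations of shift
// kinds must give a non-zero result exactly when x has its sign bit set.
bool twoShiftsTestSignBit() {
  for (int x = 0; x < 256; ++x) {
    const uint8_t v = static_cast<uint8_t>(x);
    const bool isNegative = (v & 0x80u) != 0;
    for (unsigned a = 0; a < 8; ++a) {
      const unsigned b = 7 - a;
      const uint8_t results[4] = {
          lshr8(lshr8(v, a), b), lshr8(ashr8(v, a), b),
          ashr8(lshr8(v, a), b), ashr8(ashr8(v, a), b)};
      for (uint8_t r : results)
        if ((r != 0) != isNegative) return false;
    }
  }
  return true;
}
```

The shape of the mask does vary with the shift kinds: for the input `0x80`, `lshr8(lshr8(x, 3), 4)` is `0x01` while `lshr8(ashr8(x, 3), 4)` is `0x0F`. Both are non-zero, which is all a comparison against zero needs.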
So the simplest solution is to extend
`reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
pseudo-analysis mode that will ignore extra uses, and will only check
whether a) those are two right shifts and b) they end up with a
bitwidth(x)-1 shift amount, and return either the original value that we
are sign-checking, or null.
This does not have any functionality change for the existing
`reassociateShiftAmtsOfTwoSameDirectionShifts()`.
All that being said, as discussed in the review, this yet again
increases usage of instsimplify in instcombine as utility. Some day that
may need to be reevaluated.
https://bugs.llvm.org/show_bug.cgi?id=43595
Reviewers: spatel, efriedma, vsk
Reviewed By: spatel
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68930
llvm-svn: 375371
The file was modified llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
The file was modified llvm/test/Transforms/InstCombine/sign-bit-test-via-right-shifting-all-other-bits.ll
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
The file was modified llvm/lib/Transforms/InstCombine/InstCombineInternal.h
Commit c08666abafb449f97c58eb8a730e56a085b0812f by nikita.ppv
[InstCombine] Add tests for uadd/sub.sat(a, b) == 0; NFC
llvm-svn: 375372
The file was modified llvm/test/Transforms/InstCombine/saturating-add-sub.ll
Commit 5fa36e42c43bc0816ad96597e20416a3cb8cd4dd by Zinovy Nis
Fix buildbot error in SIRegisterInfo.cpp.
llvm-svn: 375373
The file was modified llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp