Changes (build status: Success)

Summary

  1. [llvm-ar] Fix support for archives with members larger than 4GB. llvm-ar outputs a strange error message when handling archives with members larger than 4GB because it does not check the file size before passing the value as an unsigned 32-bit integer. This overflow caused malformed archives to be created (https://bugs.llvm.org/show_bug.cgi?id=38058). This change allows members above 4GB and reports an error when a member exceeds the format's size limit, a 10-digit decimal integer. Differential Revision: https://reviews.llvm.org/D65093
  2. [ARM][LowOverheadLoops] Fix branch target codegen. While lowering test.set.loop.iterations, it wasn't checked how the brcond was using the result, so the wls could branch to the loop preheader instead of not entering it. The same was true for loop.decrement.reg. So brcond and br_cc are now lowered manually when using the hwloop intrinsics. During this we now check whether the result has been negated and whether we're using SETEQ or SETNE and 0 or 1. We can then figure out which basic block the WLS and LE should be targeting. Differential Revision: https://reviews.llvm.org/D64616
  3. Fix MSVC warning about extending a uint32_t shift result to uint64_t. NFCI.
  4. [SLPVectorizer] Revert local change that accidentally got committed in rL366799. This wasn't part of D63281.
  5. Revert [RISCV] Re-enable rv32i-aliases-invalid.s test. This reverts r366797 (git commit 53f9fec8e8b58f5a904bbfb4a1d648cde65aa860)
  6. [NFC][InstCombine] Fixup commutative/negative tests with icmp preds in @llvm.umul.with.overflow tests
  7. [InstSimplify][NFC] Tests for skipping 'div-by-0' checks before inverted @llvm.umul.with.overflow. It would be already handled by the non-inverted case if we were hoisting the `not` in InstCombine, but we don't (granted, we don't sink it in this case either), so this is a separate case.
  8. [NFC][PhaseOrdering][SimplifyCFG] Add more runlines to umul.with.overflow tests. This way it will be more obvious that the problem is both in the cost threshold and in the hardcoded benefit check, plus it will show how instsimplify cleans this all up in the end.
  9. [TargetLowering] Add SimplifyMultipleUseDemandedBits. This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demand all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured. The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold, which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use. We do see a couple of regressions that need to be addressed: AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better). X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG. The code owners have confirmed it's OK for these cases to be fixed up in future patches. Differential Revision: https://reviews.llvm.org/D63281
Revision 366813 by gbreynoo:
[llvm-ar] Fix support for archives with members larger than 4GB

llvm-ar outputs a strange error message when handling archives with
members larger than 4GB because it does not check the file size before
passing the value as an unsigned 32-bit integer. This overflow caused
malformed archives to be created:

https://bugs.llvm.org/show_bug.cgi?id=38058

This change allows members above 4GB and reports an error when a member
exceeds the format's size limit, a 10-digit decimal integer.

Differential Revision: https://reviews.llvm.org/D65093
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/include/llvm/Object/Archive.h | llvm.src/include/llvm/Object/Archive.h
The file was modified | /llvm/trunk/lib/Object/Archive.cpp | llvm.src/lib/Object/Archive.cpp
The file was modified | /llvm/trunk/lib/Object/ArchiveWriter.cpp | llvm.src/lib/Object/ArchiveWriter.cpp
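The overflow described in r366813 can be illustrated with a small sketch. This is not the code from ArchiveWriter.cpp: the names `MaxMemberSize`, `truncatedSize` and `formatMemberSize` are hypothetical, and the only facts taken from the commit message are the 32-bit truncation and the 10-digit decimal size limit.

```cpp
#include <cstdint>
#include <string>

// Largest value that fits in a 10-digit decimal size field
// (hypothetical constant name).
constexpr std::uint64_t MaxMemberSize = 9999999999ULL;

// Shows the bug class: narrowing a 64-bit size to 32 bits silently
// drops the high bits.
constexpr std::uint32_t truncatedSize(std::uint64_t Size) {
  return static_cast<std::uint32_t>(Size);
}

// Hypothetical helper: writes the size as a 10-character field, padded
// with trailing spaces, and returns false if the size does not fit in
// 10 decimal digits.
inline bool formatMemberSize(std::uint64_t Size, std::string &Field) {
  if (Size > MaxMemberSize)
    return false;
  Field = std::to_string(Size);
  Field.resize(10, ' ');
  return true;
}
```

A 5,000,000,000-byte member fits in the 10-digit field, but its size is silently changed if it passes through a 32-bit integer on the way there; a 10,000,000,000-byte member cannot be represented at all and has to be rejected.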
Revision 366809 by sam_parker:
[ARM][LowOverheadLoops] Fix branch target codegen
   
While lowering test.set.loop.iterations, it wasn't checked how the
brcond was using the result and so the wls could branch to the loop
preheader instead of not entering it. The same was true for
loop.decrement.reg.
   
So brcond and br_cc are now lowered manually when using the hwloop
intrinsics. During this we now check whether the result has been
negated and whether we're using SETEQ or SETNE and 0 or 1. We can
then figure out which basic block the WLS and LE should be targeting.

Differential Revision: https://reviews.llvm.org/D64616
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/lib/Target/ARM/ARMISelDAGToDAG.cpp | llvm.src/lib/Target/ARM/ARMISelDAGToDAG.cpp
The file was modified | /llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp | llvm.src/lib/Target/ARM/ARMISelLowering.cpp
The file was modified | /llvm/trunk/lib/Target/ARM/ARMISelLowering.h | llvm.src/lib/Target/ARM/ARMISelLowering.h
The file was modified | /llvm/trunk/lib/Target/ARM/ARMInstrInfo.td | llvm.src/lib/Target/ARM/ARMInstrInfo.td
The file was added | /llvm/trunk/test/CodeGen/Thumb2/LowOverheadLoops/branch-targets.ll | llvm.src/test/CodeGen/Thumb2/LowOverheadLoops/branch-targets.ll
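The check described in r366809 (negation, SETEQ versus SETNE, and comparison against 0 or 1) amounts to working out the polarity of the branch. The following is a minimal sketch of that reasoning only, not the code in ARMISelLowering.cpp; `CondCode` and `branchTakenWhenResultTrue` are hypothetical names, and the intrinsic result is modelled as a plain boolean.

```cpp
// Stand-in for the two condition codes named in the commit message.
enum class CondCode { SETEQ, SETNE };

// Returns true if a branch on `(Negated ? !R : R) CC Imm` is taken
// exactly when the boolean result R is true, and false if it is taken
// exactly when R is false. Imm is assumed to be 0 or 1.
inline bool branchTakenWhenResultTrue(CondCode CC, unsigned Imm,
                                      bool Negated) {
  // "R == 1" and "R != 0" test that R is true;
  // "R == 0" and "R != 1" test that R is false.
  bool TestsTrue = (CC == CondCode::SETEQ) == (Imm == 1);
  return Negated ? !TestsTrue : TestsTrue;
}
```

Once the polarity is known, the lowering can decide which successor block corresponds to "result true" and which to "result false", and point the WLS or LE at the appropriate one.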
Revision 366808 by rksimon:
Fix MSVC warning about extending a uint32_t shift result to uint64_t. NFCI.
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | llvm.src/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
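The class of warning fixed in r366808 arises when a shift is evaluated in 32 bits and the result is then widened to 64 bits. The sketch below shows the two possible meanings with hypothetical function names; it does not reproduce the expression in AMDGPUISelLowering.cpp, and which form that commit settled on is not stated here.

```cpp
#include <cstdint>

// Shift in 32 bits, then widen: bits shifted past bit 31 are lost.
// Amount must be less than 32.
inline std::uint64_t shiftThenWiden(std::uint32_t Value, unsigned Amount) {
  return static_cast<std::uint64_t>(Value << Amount);
}

// Widen first, then shift in 64 bits: the high bits are kept.
// Amount must be less than 64.
inline std::uint64_t widenThenShift(std::uint32_t Value, unsigned Amount) {
  return static_cast<std::uint64_t>(Value) << Amount;
}
```

Writing the cast explicitly in either position makes the intended width visible to the compiler and to the reader.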
Revision 366807 by rksimon:
[SLPVectorizer] Revert local change that accidentally got committed in rL366799

This wasn't part of D63281
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp | llvm.src/lib/Transforms/Vectorize/SLPVectorizer.cpp
Revision 366806 by lenary:
Revert [RISCV] Re-enable rv32i-aliases-invalid.s test

This reverts r366797 (git commit 53f9fec8e8b58f5a904bbfb4a1d648cde65aa860)
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/test/MC/RISCV/rv32i-aliases-invalid.s | llvm.src/test/MC/RISCV/rv32i-aliases-invalid.s
Revision 366802 by lebedevri:
[NFC][InstCombine] Fixup commutative/negative tests with icmp preds in @llvm.umul.with.overflow tests
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/test/Transforms/InstCombine/unsigned-mul-lack-of-overflow-check-via-udiv-of-allones.ll | llvm.src/test/Transforms/InstCombine/unsigned-mul-lack-of-overflow-check-via-udiv-of-allones.ll
The file was modified | /llvm/trunk/test/Transforms/InstCombine/unsigned-mul-overflow-check-via-udiv-of-allones.ll | llvm.src/test/Transforms/InstCombine/unsigned-mul-overflow-check-via-udiv-of-allones.ll
Revision 366801 by lebedevri:
[InstSimplify][NFC] Tests for skipping 'div-by-0' checks before inverted @llvm.umul.with.overflow

It would be already handled by the non-inverted case if we were hoisting
the `not` in InstCombine, but we don't (granted, we don't sink it
in this case either), so this is a separate case.
Change Type | Path in Repository | Path in Workspace
The file was added | /llvm/trunk/test/Transforms/InstSimplify/div-by-0-guard-before-smul_ov-not.ll | llvm.src/test/Transforms/InstSimplify/div-by-0-guard-before-smul_ov-not.ll
The file was added | /llvm/trunk/test/Transforms/InstSimplify/div-by-0-guard-before-umul_ov-not.ll | llvm.src/test/Transforms/InstSimplify/div-by-0-guard-before-umul_ov-not.ll
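The reason a 'div-by-0' guard can be skipped before an inverted overflow check can be shown by brute force. The sketch below assumes the pattern under test has the shape "x is zero, or the multiply does not overflow"; that shape is inferred from the test names rather than quoted from the test files, and all function names are hypothetical. It uses 8-bit operands so every input pair can be checked.

```cpp
#include <cstdint>

// True if the unsigned 8-bit product X * Y does not fit in 8 bits.
inline bool umulOverflows(std::uint8_t X, std::uint8_t Y) {
  return static_cast<unsigned>(X) * static_cast<unsigned>(Y) > 0xFFu;
}

// Guarded, inverted form: "X is zero, or the multiply does not overflow".
inline bool guardedNoOverflow(std::uint8_t X, std::uint8_t Y) {
  return X == 0 || !umulOverflows(X, Y);
}

// Checks every 8-bit input pair: the guard never changes the answer,
// because X == 0 already implies that the multiply does not overflow.
inline bool guardIsRedundant() {
  for (unsigned X = 0; X <= 0xFFu; ++X)
    for (unsigned Y = 0; Y <= 0xFFu; ++Y) {
      std::uint8_t A = static_cast<std::uint8_t>(X);
      std::uint8_t B = static_cast<std::uint8_t>(Y);
      if (guardedNoOverflow(A, B) != !umulOverflows(A, B))
        return false;
    }
  return true;
}
```

The same argument applies at any bit width; 8 bits is used only to keep the exhaustive check small.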
Revision 366800 by lebedevri:
[NFC][PhaseOrdering][SimplifyCFG] Add more runlines to umul.with.overflow tests

This way it will be more obvious that the problem is both
in the cost threshold and in the hardcoded benefit check,
plus it will show how instsimplify cleans this all up in the end.
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll | llvm.src/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll
The file was modified | /llvm/trunk/test/Transforms/SimplifyCFG/unsigned-multiplication-will-overflow.ll | llvm.src/test/Transforms/SimplifyCFG/unsigned-multiplication-will-overflow.ll
Revision 366799 by rksimon:
[TargetLowering] Add SimplifyMultipleUseDemandedBits

This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demand all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured.

The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use.

We do see a couple of regressions that need to be addressed:

    AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better).

    X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG.

The code owners have confirmed it's OK for these cases to be fixed up in future patches.

Differential Revision: https://reviews.llvm.org/D63281
Change Type | Path in Repository | Path in Workspace
The file was modified | /llvm/trunk/include/llvm/CodeGen/TargetLowering.h | llvm.src/include/llvm/CodeGen/TargetLowering.h
The file was modified | /llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp | llvm.src/lib/CodeGen/SelectionDAG/TargetLowering.cpp
The file was modified | /llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp | llvm.src/lib/Transforms/Vectorize/SLPVectorizer.cpp
The file was modified | /llvm/trunk/test/CodeGen/AArch64/bitfield-insert.ll | llvm.src/test/CodeGen/AArch64/bitfield-insert.ll
The file was modified | /llvm/trunk/test/CodeGen/AMDGPU/idot4s.ll | llvm.src/test/CodeGen/AMDGPU/idot4s.ll
The file was modified | /llvm/trunk/test/CodeGen/AMDGPU/idot4u.ll | llvm.src/test/CodeGen/AMDGPU/idot4u.ll
The file was modified | /llvm/trunk/test/CodeGen/AMDGPU/idot8s.ll | llvm.src/test/CodeGen/AMDGPU/idot8s.ll
The file was modified | /llvm/trunk/test/CodeGen/AMDGPU/idot8u.ll | llvm.src/test/CodeGen/AMDGPU/idot8u.ll
The file was modified | /llvm/trunk/test/CodeGen/AMDGPU/sdiv.ll | llvm.src/test/CodeGen/AMDGPU/sdiv.ll
The file was modified | /llvm/trunk/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll | llvm.src/test/CodeGen/SystemZ/store_nonbytesized_vecs.ll
The file was modified | /llvm/trunk/test/CodeGen/X86/2012-08-07-CmpISelBug.ll | llvm.src/test/CodeGen/X86/2012-08-07-CmpISelBug.ll
The file was modified | /llvm/trunk/test/CodeGen/X86/vector-fshl-128.ll | llvm.src/test/CodeGen/X86/vector-fshl-128.ll
The file was modified | /llvm/trunk/test/CodeGen/X86/vector-reduce-mul-widen.ll | llvm.src/test/CodeGen/X86/vector-reduce-mul-widen.ll
The file was modified | /llvm/trunk/test/CodeGen/X86/vector-reduce-mul.ll | llvm.src/test/CodeGen/X86/vector-reduce-mul.ll
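The idea behind r366799, peeking through an and/or/xor that does not affect the bits a particular user demands, can be sketched on plain integers. This is a toy model under stated assumptions, not the TargetLowering implementation: it handles only a constant second operand, and `Op`, `applyOp` and `canBypassOp` are hypothetical names.

```cpp
#include <cstdint>

enum class Op { And, Or, Xor };

// Evaluates `X op C` on 32-bit values.
inline std::uint32_t applyOp(Op Opcode, std::uint32_t X, std::uint32_t C) {
  switch (Opcode) {
  case Op::And: return X & C;
  case Op::Or:  return X | C;
  case Op::Xor: return X ^ C;
  }
  return X;
}

// Returns true if `X op C` agrees with X on every bit in Demanded, for
// all X. A user that reads only those bits can then take X directly,
// even if the op has other users that still need it.
inline bool canBypassOp(Op Opcode, std::uint32_t C, std::uint32_t Demanded) {
  switch (Opcode) {
  case Op::And:
    return (Demanded & ~C) == 0; // every demanded bit passes through
  case Op::Or:
  case Op::Xor:
    return (Demanded & C) == 0;  // no demanded bit is modified
  }
  return false;
}
```

For example, a user that demands only the low 4 bits of `X & 0xFF` can read `X` instead, because the mask keeps all of those bits; the `and` itself stays in place for any other user that needs the full masked value.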