Changes

Summary

  1. [NFC][InstCombine] Add tests for smin reduction w/ i1 element type (PR51259)
  2. [InstCombine] `vector_reduce_smin(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
  3. [NFC][InstCombine] Add tests for smax reduction w/ i1 element type (PR51259)
  4. [InstCombine] `vector_reduce_smax(?ext(<n x i1>))` --> `?ext(vector_reduce_{and,or}(<n x i1>))` (PR51259)
  5. [AArch64][GlobalISel] Emit extloads for ZExt/SExt values in assignValueToAddress
  6. [NFC][InstCombine] Add tests for and reduction w/ i1 element type (PR51259)
  7. [NFC][InstCombine] Add tests for or reduction w/ i1 element type (PR51259)
  8. [InstCombine] `vector_reduce_{or,and}(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)
  9. [BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
  10. Improve UBSan documentation
  11. [mlir][sparse] use consistent type for COO object and sparse tensor storage
  12. [profile] Move assertIsZero to InstrProfilingUtil.c
  13. [clang] Add support for optional flag -fnew-infallible to restrict exception propagation
  14. [AArch64][SelectionDAG] Support passing/returning scalable vectors with unusual types.
  15. [GlobalOpt] Fix the assert for stored once non-pointer to global address
  16. [NFC][tsan] clang-format two files
Commit 4551a4184700cce21d3e63b03ccedefab6dd205f by lebedev.ri
[NFC][InstCombine] Add tests for smin reduction w/ i1 element type (PR51259)
The file was added llvm/test/Transforms/InstCombine/reduction-smin-sext-zext-i1.ll
Commit f47b7b6d10c77cce77c9456f788bcc77b3a19ebb by lebedev.ri
[InstCombine] `vector_reduce_smin(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)

Alive2 agrees:
https://alive2.llvm.org/ce/z/noXtZ8 (self)
https://alive2.llvm.org/ce/z/JNrN6C (zext)
https://alive2.llvm.org/ce/z/58snuN (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
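
The reasoning behind the fold can be checked by brute force. The sketch below is not the InstCombine code; it only models the arithmetic, assuming that an i1 `true` sign-extends to -1 and zero-extends to 1. The helper names (`sext_i1`, `zext_i1`, `check_smin_folds`) are made up for this illustration.

```python
from itertools import product


def sext_i1(b: bool) -> int:
    # An i1 'true' sign-extends to all-ones, which is -1 as a signed integer.
    return -1 if b else 0


def zext_i1(b: bool) -> int:
    # An i1 'true' zero-extends to 1.
    return 1 if b else 0


def check_smin_folds(n: int) -> bool:
    """Check both smin identities for every <n x i1> input vector."""
    for bits in product([False, True], repeat=n):
        # smin(sext(v)) is -1 as soon as any lane is true: sext(or(v)).
        if min(sext_i1(b) for b in bits) != sext_i1(any(bits)):
            return False
        # smin(zext(v)) is 1 only if every lane is true: zext(and(v)).
        if min(zext_i1(b) for b in bits) != zext_i1(all(bits)):
            return False
    return True


if __name__ == "__main__":
    for lanes in range(1, 6):
        print(lanes, check_smin_folds(lanes))
```

The Alive2 links above remain the authoritative proofs; this is only a quick sanity check of why `or` pairs with sext and `and` pairs with zext.
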
The file was modified llvm/test/Transforms/InstCombine/reduction-smin-sext-zext-i1.ll
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
Commit d7482a2bded334710816ea0fc9fbbb6ec09d673e by lebedev.ri
[NFC][InstCombine] Add tests for smax reduction w/ i1 element type (PR51259)
The file was added llvm/test/Transforms/InstCombine/reduction-smax-sext-zext-i1.ll
Commit 554fc9ad0a24f6689c61d080c9451edd2ddc90b1 by lebedev.ri
[InstCombine] `vector_reduce_smax(?ext(<n x i1>))` --> `?ext(vector_reduce_{and,or}(<n x i1>))` (PR51259)

Alive2 agrees:
https://alive2.llvm.org/ce/z/3oqir9 (self)
https://alive2.llvm.org/ce/z/6cuI5m (zext)
https://alive2.llvm.org/ce/z/4FL8rD (sext)

We already handle `vector_reduce_and(<n x i1>)`,
so let's just combine into the already-handled pattern
and let the existing fold do the rest.
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was modified llvm/test/Transforms/InstCombine/reduction-smax-sext-zext-i1.ll
Commit bd13c8e610cad6c60e2b35264bcff9a4e4934615 by Jessica Paquette
[AArch64][GlobalISel] Emit extloads for ZExt/SExt values in assignValueToAddress

When a value is expected to be extended, we should emit an extended load rather
than a normal G_LOAD.

Add checklines to arm64-abi.ll which show that we now emit the correct loads.

For ease of comparison: https://godbolt.org/z/8WvY6EfdE

Differential Revision: https://reviews.llvm.org/D107313
The file was modified llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/call-lowering-zeroext.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/call-translator-ios.ll
The file was modified llvm/test/CodeGen/AArch64/arm64-abi.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/call-lowering-signext.ll
Commit a22449336ed918ef5946d5f89c50df9404a2c062 by lebedev.ri
[NFC][InstCombine] Add tests for and reduction w/ i1 element type (PR51259)
The file was added llvm/test/Transforms/InstCombine/reduction-and-sext-zext-i1.ll
Commit cdb0dfdffaaf061ba1b4e5653e6179db152ed891 by lebedev.ri
[NFC][InstCombine] Add tests for or reduction w/ i1 element type (PR51259)
The file was added llvm/test/Transforms/InstCombine/reduction-or-sext-zext-i1.ll
Commit 4ba3326f17ddabc1f427508a927a987d812ac543 by lebedev.ri
[InstCombine] `vector_reduce_{or,and}(?ext(<n x i1>))` --> `?ext(vector_reduce_{or,and}(<n x i1>))` (PR51259)

This allows the expansion logic to actually trigger if the argument
was extended from i1 element type, like the rest of the reductions expect.

Alive2 agrees:
https://alive2.llvm.org/ce/z/wcfews (or zext)
https://alive2.llvm.org/ce/z/FCXNFx (or sext)
https://alive2.llvm.org/ce/z/f26zUY (and zext)
https://alive2.llvm.org/ce/z/jprViN (and sext)
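
As a rough illustration of why the extension can be hoisted out of an `or`/`and` reduction, the sketch below models the wide element as an 8-bit pattern (the width is an arbitrary assumption; any width behaves the same) and checks all four combinations exhaustively. The names `WIDTH`, `sext_bits`, `zext_bits` and `check_bitwise_folds` are hypothetical and do not come from the patch.

```python
from functools import reduce
from itertools import product
from operator import and_, or_

WIDTH = 8                    # assumed width of the extended element type
ALL_ONES = (1 << WIDTH) - 1  # bit pattern of sext(true)


def sext_bits(b: bool) -> int:
    # Sign-extending an i1 replicates the bit into every position.
    return ALL_ONES if b else 0


def zext_bits(b: bool) -> int:
    # Zero-extending an i1 keeps only the low bit.
    return 1 if b else 0


def check_bitwise_folds(n: int) -> bool:
    """Check reduce_{or,and}(ext(v)) == ext(reduce_{or,and}(v)) for all v."""
    for bits in product([False, True], repeat=n):
        for ext in (sext_bits, zext_bits):
            wide = [ext(b) for b in bits]
            if reduce(or_, wide) != ext(any(bits)):
                return False
            if reduce(and_, wide) != ext(all(bits)):
                return False
    return True


if __name__ == "__main__":
    print(all(check_bitwise_folds(n) for n in range(1, 6)))
```

Because both extensions produce a pattern that depends only on the single source bit, the bitwise reduction commutes with the extension lane by lane.
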
The file was modified llvm/test/Transforms/InstCombine/reduction-or-sext-zext-i1.ll
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was modified llvm/test/Transforms/InstCombine/reduction-and-sext-zext-i1.ll
Commit 6f6e9a867f2ace8c8b99eb8008e17dd63116bcde by lebedev.ri
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop

I'm not sure this is the best way to approach this,
but the situation is hard to detect unless we explicitly call it out when declining to advise unrolling.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107271
The file was modified llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
The file was modified llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
The file was modified llvm/lib/Analysis/TargetTransformInfo.cpp
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp
The file was modified llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
The file was modified llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.h
The file was modified llvm/include/llvm/Analysis/TargetTransformInfo.h
The file was modified llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
The file was modified llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
The file was modified llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.h
The file was modified llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
The file was modified llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
The file was modified llvm/include/llvm/CodeGen/BasicTTIImpl.h
The file was modified llvm/include/llvm/Transforms/Utils/UnrollLoop.h
The file was modified llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
The file was modified llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
The file was modified llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h
The file was modified llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
The file was added llvm/test/Transforms/LoopUnroll/X86/call-remark.ll
Commit 65e9d7efb090756e16bbb5ff929efbc795a8b0d4 by 31459023+hctim
Improve UBSan documentation

Add more checks, information on -fno-sanitize=..., and a reference to the 5/2021 UBSan Oracle blog.

Authored By: DianeMeirowitz
Reviewed By: hctim

Differential Revision: https://reviews.llvm.org/D106908
The file was modified clang/docs/UndefinedBehaviorSanitizer.rst
Commit 52c87e0437808ed7249aaf73f43eda77e3f91d4d by ajcbik
[mlir][sparse] use consistent type for COO object and sparse tensor storage

There was a slight mismatch between the double COO and actual numerical
type in the final sparse tensor storage (due to external formats always
using double). This minor revision removes that inconsistency by using a
properly typed COO and casting during the "add" method instead. This also
prepares alternative ways of initializing the COO object.
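
The idea of "a properly typed COO that casts during add" can be sketched roughly as follows. This is not the code in SparseUtils.cpp; the class `TypedCoo` and its fields are hypothetical, and the sketch assumes only floating-point element types, using the standard `array` module so that values are stored in the tensor's own element type rather than always as double.

```python
from array import array


class TypedCoo:
    """Hypothetical coordinate-format buffer with a fixed element type."""

    def __init__(self, typecode: str):
        # Only floating-point typecodes are handled in this sketch:
        # 'f' is single precision, 'd' is double precision.
        if typecode not in ("f", "d"):
            raise ValueError("unsupported typecode in this sketch")
        self.indices = []
        self.values = array(typecode)

    def add(self, index, value: float) -> None:
        # The external reader always hands us a double; the conversion to
        # the storage type happens here, when the element is appended.
        self.indices.append(tuple(index))
        self.values.append(value)


if __name__ == "__main__":
    coo = TypedCoo("f")
    coo.add((0, 1), 0.1)
    print(coo.indices, coo.values.itemsize)
```

With the cast placed in `add`, the buffer and the final storage agree on one element type, so no second conversion is needed when the tensor is built.
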

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D107310
The file was modified mlir/lib/ExecutionEngine/SparseUtils.cpp
Commit 3b0a9e7b392a916d961c5a8c09b570e4656267d5 by Vedant Kumar
[profile] Move assertIsZero to InstrProfilingUtil.c

... and rename it to 'warnIfNonZero' to better reflect what it actually
does.

The goal is to minimize the amount of logic that's conditionally
compiled under '#if __APPLE__'.
The file was modified compiler-rt/lib/profile/InstrProfilingUtil.c
The file was modified compiler-rt/lib/profile/InstrProfilingFile.c
The file was modified compiler-rt/lib/profile/InstrProfilingUtil.h
Commit b40a2a533a9dfb8dd5afb1f3b7d277da1e19f235 by modimo
[clang] Add support for optional flag -fnew-infallible to restrict exception propagation

The declaration for the global new function in C++ is generated in the compiler front-end. When examining exception propagation, we found that this is the largest root throw site propagator requiring unwind code to be generated for callers up the stack. Allowing this to be handled immediately with termination stops upward propagation and leads to significantly fewer landing pads being generated. This in turn leads to a performance and .text size win.

With `-fnew-infallible` this annotates the declaration with `throw()` and `__attribute__((returns_nonnull))`.  `throw()` allows the compiler to assume exceptions do not propagate out of new and eliminate it as a root throw site. Note that the definition of global new is user-replaceable so users should ensure that the one used follows these semantics.

Measuring internally, we're seeing a 0.5% CPU win in one of our large internal FB workloads. Measuring on clang self-build (cd0a1226b50081e86eb75a89d01e8782423971a0) we get:

thinlto/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 153494,
        "dwarfehprepare.NumNoUnwind": 26309,
thinlto_newinfallible/

        "dwarfehprepare.NumCleanupLandingPadsRemaining": 143660,
        "dwarfehprepare.NumNoUnwind": 28744,

a 1 - 143660/153494 = 6.4% reduction in landing pads and a 28744/26309 - 1 = 9.3% increase in the number of nounwind functions.
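
For anyone who wants to re-derive the percentages, the arithmetic can be reproduced from the four counters quoted above. The dictionary names and the `percent_change` helper are made up for this sketch; only the numbers come from the commit message.

```python
# Counters copied from the two stats dumps above.
thinlto = {
    "NumCleanupLandingPadsRemaining": 153494,
    "NumNoUnwind": 26309,
}
thinlto_newinfallible = {
    "NumCleanupLandingPadsRemaining": 143660,
    "NumNoUnwind": 28744,
}


def percent_change(before: int, after: int) -> float:
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for key in thinlto:
        change = percent_change(thinlto[key], thinlto_newinfallible[key])
        print(f"{key}: {change:+.1f}%")
```

A negative value is a reduction and a positive value is an increase, so the landing-pad counter comes out negative and the nounwind counter positive.
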

Testing:
ninja check-all
new test case to make sure these attributes are added correctly to global new.

Reviewed By: urnathan

Differential Revision: https://reviews.llvm.org/D105225
The file was modified clang/include/clang/Basic/LangOptions.def
The file was modified clang/lib/Sema/SemaExprCXX.cpp
The file was modified clang/docs/ClangCommandLineReference.rst
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was added clang/test/CodeGenCXX/new-infallible.cpp
Commit 1f62af63467e4834e1e386619b3eccab245489d4 by efriedma
[AArch64][SelectionDAG] Support passing/returning scalable vectors with unusual types.

This adds handling for two cases:

1. A scalable vector where the element type is promoted.
2. A scalable vector where the element count is odd (or more generally,
   not divisible by the element count of the part type).

(Some element types still don't work; for example, <vscale x 2 x i128>,
or <vscale x 2 x fp128>.)

Differential Revision: https://reviews.llvm.org/D105591
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-breakdown-scalable-vectortype.ll
The file was modified llvm/lib/CodeGen/TargetLoweringBase.cpp
Commit 7ce98cf56e3ea9c8dd8d55f6f61b1bed9de4c70a by scui
[GlobalOpt] Fix the assert for stored once non-pointer to global address

This fixes the assert @bjope reported, which was caused by the code change in https://reviews.llvm.org/D106589. The test case from @bjope is also included.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107302
The file was added llvm/test/Transforms/GlobalOpt/2021-08-02-CastStoreOnceP2I.ll
The file was modified llvm/lib/Transforms/IPO/GlobalOpt.cpp
Commit 9205143f07009cc4801b979d4d467d6c3c02450b by Vitaly Buka
[NFC][tsan] clang-format two files
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_interface_inl.h