Changes

Summary

  1. [RISCV] Use llvm::stable_sort instead of std::stable_sort. NFC
  2. [indvars] Canonicalize exit conditions to unsigned using range info
  3. Extend transform introduced in D111896 to multiple exits
  4. [SCEV] Avoid compile time explosion in ScalarEvolution::isImpliedCond
  5. Revert "Reland [clang] Pass -clear-ast-before-backend in Clang::ConstructJob()"
  6. [SCEV] Fix formatting error introduced by D112080
  7. [lldb] improve the help strings for gdb-remote and kdp-remote
  8. [cuda] Add address space predicate functions.
  9. [lldb/test] Update test/API/functionalities/load_lazy to macOS 12
  10. [driver] Explicitly specify `-fbuild-session-timestamp` in seconds.
  11. [fir] Add character utility functions in FIRBuilder
  12. [x86] add tests for psubus; NFC
  13. [clang][Driver] Make multiarch output file basenames reproducible
  14. [x86] add special-case lowering for usubsat for pre-SSE4
  15. [Driver][Gnu] Delete unneeded -Bstatic dispatch for arm/thumb
  16. [llvm-reduce] Add reduction passes to reduce operands to undef/1/0
  17. [WebAssembly] Emit clangast in custom section aligned by 4 bytes
  18. Implementation of `ReshapeNoopOptimization` canonicalizer.
  19. Add MLIR_INSTALL_AGGREGATE_OBJECTS and default it to ON.
  20. [NVPTX] Add a late SROA pass which allows optimizing away more allocas.
  21. BPF: set .BTF and .BTF.ext section alignment to 4
Commit dc8a5f9419f5cc35fc3c9e8698ba3ebb6a3f974f by craig.topper
[RISCV] Use llvm::stable_sort instead of std::stable_sort. NFC
The file was modified clang/utils/TableGen/RISCVVEmitter.cpp
Commit fca0218875f5117110d38b9cd7503bc2789693d3 by listmail
[indvars] Canonicalize exit conditions to unsigned using range info

This patch duplicates a bit of logic we apply to comparisons encountered during the IV users walk to conditions which feed exit conditions. Why? simplifyAndExtend has a very limited list of users it walks. In particular, in the examples it stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).

Note that this can be trivially extended to multiple exiting blocks. I'm leaving that to a future patch (solely to cut down on the number of versions of the same code in review at once.)

Differential Revision: https://reviews.llvm.org/D111896
The file was modified llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
The file was modified llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
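The arithmetic behind the exit-condition canonicalization in D111896 can be illustrated with a small exhaustive check: when range information shows that both operands of a signed comparison are non-negative, the signed and unsigned predicates give the same answer. This is only a sketch of that reasoning, not the IndVarSimplify code. The 16-bit comparison width, the zext from 8 bits, and all helper names are assumptions chosen for illustration.

```python
BITS = 16  # assumed width of the widened comparison (illustrative only)


def to_signed(value, bits=BITS):
    """Interpret an unsigned bit pattern as a two's-complement signed integer."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def icmp_slt(a, b):
    """Signed less-than on BITS-wide bit patterns."""
    return to_signed(a) < to_signed(b)


def icmp_ult(a, b):
    """Unsigned less-than on BITS-wide bit patterns."""
    return a < b


def zext8(x):
    """Model zero-extension from 8 bits: the result is always in [0, 255]."""
    return x & 0xFF


def signed_and_unsigned_agree(rhs):
    """Check every zero-extended 8-bit value against a fixed right-hand side."""
    return all(
        icmp_slt(zext8(x), rhs) == icmp_ult(zext8(x), rhs) for x in range(256)
    )
```

With a non-negative right-hand side such as 0x7FFF the two predicates agree for every input, so switching to the unsigned form is safe. With 0x8000 (negative when read as signed) they disagree, which is why the range check matters.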
Commit 0836a1059dcf8e4fbf408248bf5eed13dfd93f7b by listmail
Extend transform introduced in D111896 to multiple exits

This is trivial.  It was left out of the original review only because we had multiple copies of the same code in review at the same time, and keeping them in sync was easiest if the structure was kept in sync.
The file was modified llvm/test/Transforms/IndVarSimplify/finite-exit-comparisons.ll
The file was modified llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
Commit 08619006a0c0694477f143dc1552eab35701e50b by bjorn.a.pettersson
[SCEV] Avoid compile time explosion in ScalarEvolution::isImpliedCond

As seen in PR51869 the ScalarEvolution::isImpliedCond function might
end up spending lots of time when doing the isKnownPredicate checks.

Calling isKnownPredicate can, for example, result in isKnownViaInduction
being called, which might result in isLoopBackedgeGuardedByCond being
called, and then we might get one or more new calls to isImpliedCond.
Even if the scenario described here isn't an infinite loop, using
some randomly generated C programs as input indicates that those
isKnownPredicate checks quite often return true. On the other hand,
the third condition that needs to be fulfilled in order to "prove
implications via truncation", i.e. the isImpliedCondBalancedTypes
check, is rarely fulfilled.
I also made some similar experiments to look at how often we would
get the same result when using isKnownViaNonRecursiveReasoning instead
of isKnownPredicate. So far I haven't seen a single case when codegen
is negatively impacted by using isKnownViaNonRecursiveReasoning. On
the other hand, it seems like we get rid of the compile time explosion
seen in PR51869 that way. Hence this patch.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112080
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
The file was added llvm/test/Analysis/ScalarEvolution/pr51869-scalar-evolution-prove-implications-via-truncation.ll
Commit 57553ce43281a7c379e375161320cc09d8236839 by zequanwu
Revert "Reland [clang] Pass -clear-ast-before-backend in Clang::ConstructJob()"

This reverts commit 1fb24fe85a19ae71b00875ff6c96ef1831dcf7e3.

This causes a clang crash on Chromium. See repro at https://bugs.chromium.org/p/chromium/issues/detail?id=1261551#c1.
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was modified clang/lib/Interpreter/Interpreter.cpp
Commit 9c44a0996c8cf87607807751be2315020c582c66 by bjorn.a.pettersson
[SCEV] Fix formatting error introduced by D112080

Accidentally pushed D112080 without this clang-format cleanup.
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
Commit 8ac5a6641fa4d742fb4599b485c40700e773f01f by Lawrence D'Anna
[lldb] improve the help strings for gdb-remote and kdp-remote

The help string can be more helpful by explaining these are
aliases for 'process connect'

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D111965
The file was modified lldb/source/Interpreter/CommandInterpreter.cpp
Commit 6fe902daf931dedf6e958b43c043cb57bb612daf by michael.hliao
[cuda] Add address space predicate functions.

- Add the missing NVVM predicate builtins on address space checking
- Redefine them as pure functions so that they could be used in
  __builtin_assume.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D112053
The file was modified clang/lib/Headers/__clang_cuda_runtime_wrapper.h
The file was modified clang/include/clang/Basic/BuiltinsNVPTX.def
Commit 5e004b03f72a17f916b93792eb778dfa9e7a09cc by Vedant Kumar
[lldb/test] Update test/API/functionalities/load_lazy to macOS 12

In macOS 12, dyld switched to using chained fixups. As a result, all symbols
are bound at launch and there are no lazy pointers any more. Since we wish to
import/dlopen() a dylib with missing symbols, we need to use a weak import.
This applies to all macOS 12-aligned OS releases, e.g. iOS 15, etc.

rdar://81295101

Differential Revision: https://reviews.llvm.org/D112034
The file was modified lldb/test/API/functionalities/load_lazy/Makefile
Commit 91e19f66e51ac3fda2309f5e67b02fcccd4d58a0 by vsapsai
[driver] Explicitly specify `-fbuild-session-timestamp` in seconds.

Representation of the file's last modification time depends on the file
system and isn't guaranteed to be in seconds. Cast to seconds explicitly
and tighten the test case to check the magnitude of the calculated
value, so we can catch passing milliseconds or nanoseconds.

rdar://83915615

Differential Revision: https://reviews.llvm.org/D111205
The file was modified clang/test/Driver/modules.m
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
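The idea of converting a modification time to whole seconds explicitly, then checking the magnitude of the result, can be sketched as follows. This is not the driver code from D111205. It is a Python illustration whose function names and magnitude bounds are assumptions.

```python
import os
import tempfile


def build_session_timestamp_seconds(path):
    """Return the file's last modification time as whole seconds since the epoch.

    The nanosecond field is read and divided explicitly, so the unit of the
    result does not depend on how the platform represents file times.
    """
    return os.stat(path).st_mtime_ns // 1_000_000_000


def looks_like_seconds(timestamp):
    """Magnitude check (assumed bounds): present-day epoch seconds have 10 digits.

    The same instant in milliseconds has 13 digits and in nanoseconds 19,
    so both fall outside this window.
    """
    return 10**9 <= timestamp < 10**11


def timestamp_of_fresh_file():
    """Create a temporary file and return its timestamp in seconds and nanoseconds."""
    with tempfile.NamedTemporaryFile() as handle:
        seconds = build_session_timestamp_seconds(handle.name)
        nanoseconds = os.stat(handle.name).st_mtime_ns
    return seconds, nanoseconds
```

A test in this style catches a value accidentally passed in milliseconds or nanoseconds, because only the seconds value lands inside the expected window.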
Commit c983aeddcf5af992d2a807d3f4f8cdc27cbf63b1 by clementval
[fir] Add character utility functions in FIRBuilder

Extract part of D111337 in order to make it smaller
and easier to review. This patch adds some utility
functions to the FIRBuilder.

Add the following utility functions:
- getCharacterLengthType
- createStringLiteral
- locationToFilename
- characterWithDynamicLen
- sequenceWithNonConstantShape
- hasDynamicSize

These bring up the BoxValue implementation together with it.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: AlexisPerry

Differential Revision: https://reviews.llvm.org/D112074

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
The file was added flang/include/flang/Optimizer/Builder/BoxValue.h
The file was modified flang/include/flang/Optimizer/Dialect/FIRType.h
The file was modified flang/lib/Optimizer/Builder/CMakeLists.txt
The file was modified flang/lib/Optimizer/Builder/FIRBuilder.cpp
The file was added flang/include/flang/Optimizer/Support/Matcher.h
The file was modified flang/include/flang/Optimizer/Builder/FIRBuilder.h
The file was modified flang/lib/Optimizer/Dialect/FIRType.cpp
The file was added flang/lib/Optimizer/Builder/BoxValue.cpp
The file was modified flang/lib/Optimizer/Dialect/FIROps.cpp
The file was modified flang/unittests/Optimizer/Builder/FIRBuilderTest.cpp
Commit e2faf721b2b9597ed4a68f52fdffb14b6cb7db7c by spatel
[x86] add tests for psubus; NFC
The file was modified llvm/test/CodeGen/X86/psubus.ll
Commit 17386cb4dc89afad62623b9bc08516b99b9c6df7 by keithbsmiley
[clang][Driver] Make multiarch output file basenames reproducible

When building a multiarch MachO binary, previously the intermediate
output file names would contain random characters. On macOS this
filename, since it's used when linking, ended up being used as a
stable-ish identifier for the adhoc codesignature of the binary, leading
to non-reproducible binaries. This change uses the architecture, when
available, to create a stable, but unique, basename for the file.

Differential Revision: https://reviews.llvm.org/D111269
The file was modified clang/test/Driver/darwin-dsymutil.c
The file was modified clang/lib/Driver/Driver.cpp
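The difference between a random and an architecture-derived intermediate name can be sketched as below. The function name and the `stem-arch` format are hypothetical and only illustrate the principle. The actual naming scheme used by the driver is not shown in this log.

```python
import uuid


def intermediate_basename(stem, arch=None):
    """Return a basename for an intermediate output file (hypothetical scheme).

    With an architecture, the name is a pure function of its inputs, so repeated
    builds produce the same name while different slices still get distinct names.
    Without one, fall back to a random suffix, which is unique but not reproducible.
    """
    if arch:
        return f"{stem}-{arch}"
    return f"{stem}-{uuid.uuid4().hex[:6]}"
```

Because the architecture-derived name is stable across builds, anything that derives an identifier from the file name (such as the ad hoc code signature mentioned above) also becomes stable.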
Commit 92a0389b0425a9535a99a0ce13ba0eeda2bce7ad by spatel
[x86] add special-case lowering for usubsat for pre-SSE4

usubsat X, SMIN --> (X ^ SMIN) & (X s>> BW-1)

This would be a regression with D112085 where we combine to
usubsat more aggressively, so avoid that by matching the
special-case where we are subtracting SMIN (signmask):
https://alive2.llvm.org/ce/z/4_3gBD

Differential Revision: https://reviews.llvm.org/D112095
The file was modified llvm/test/CodeGen/X86/psubus.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
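The identity `usubsat X, SMIN --> (X ^ SMIN) & (X s>> BW-1)` can be checked exhaustively for a small bit width. The sketch below models 8-bit lanes in plain Python. The bit width and helper names are assumptions for illustration, and it verifies only the arithmetic, not the X86 lowering itself.

```python
BW = 8                 # assumed element bit width for the exhaustive check
MASK = (1 << BW) - 1   # all-ones value for a BW-bit lane
SMIN = 1 << (BW - 1)   # sign mask, e.g. 0x80 for 8 bits


def usubsat(x, y):
    """Unsigned saturating subtract: clamps at zero instead of wrapping."""
    return max(x - y, 0)


def ashr(x, amount):
    """Arithmetic shift right of a BW-bit pattern, returned as a BW-bit pattern."""
    signed = x - (1 << BW) if x & SMIN else x
    return (signed >> amount) & MASK


def special_case(x):
    """(X ^ SMIN) & (X s>> BW-1), the replacement form from the commit message."""
    return (x ^ SMIN) & ashr(x, BW - 1)


def identity_holds():
    """Compare both forms for every BW-bit input."""
    return all(usubsat(x, SMIN) == special_case(x) for x in range(1 << BW))
```

When the sign bit of X is set, the shift yields all ones and the xor clears the sign bit, which equals X - SMIN. Otherwise the shift yields zero, matching the saturated result.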
Commit 922bf57fc8fe41ebcbbe581a7c8e730fbebf572f by i
[Driver][Gnu] Delete unneeded -Bstatic dispatch for arm/thumb

Historically -static and -Bstatic are synonyms.
gold made the semantics of -static slightly stronger, but that does not matter.
The file was modified clang/lib/Driver/ToolChains/Gnu.cpp
Commit 9660563950aaed54020bfdf0be07e7096a9553e4 by aeubanks
[llvm-reduce] Add reduction passes to reduce operands to undef/1/0

Having non-undef constants in a final llvm-reduce output is nicer than
having undefs.

This splits the existing reduce-operands pass into three, one which does
the same as the current pass of reducing to undef, and two more to
reduce to the constant 1 and the constant 0. Do not reduce to undef if
the operand is a ConstantData, and do not reduce 0s to 1s.

Reducing GEP operands very frequently causes invalid IR (since types may
not match up if we index differently into a struct), so don't touch GEPs.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111765
The file was modified llvm/tools/llvm-reduce/DeltaManager.cpp
The file was modified llvm/tools/llvm-reduce/deltas/ReduceOperands.cpp
The file was modified llvm/test/tools/llvm-reduce/remove-operands.ll
The file was modified llvm/tools/llvm-reduce/deltas/ReduceOperands.h
The file was modified llvm/test/tools/llvm-reduce/remove-invoked-functions.ll
Commit 1813fde9cc0b56cee42d9b82e6f22fa00a59cdf9 by sbc
[WebAssembly] Emit clangast in custom section aligned by 4 bytes

Emit __clangast in a custom section instead of a named data segment
so that it can be found while iterating sections.
This could be avoided if all data segments (in the wasm sense) were
represented as their own sections (in the llvm sense).
This can be resolved by https://github.com/WebAssembly/tool-conventions/issues/138

The on-disk hashtable in clangast needs to be aligned to 4 bytes,
so add padding to the name length field in the custom section header.

The length of the clangast section name can be represented in 1 byte
of LEB128, and at most 3 bytes of padding are needed, so in theory the
section name length field won't become invalid.

Fixes https://bugs.llvm.org/show_bug.cgi?id=35928

Differential Revision: https://reviews.llvm.org/D74531
The file was added clang/test/PCH/pch-wasm.c
The file was added llvm/test/MC/WebAssembly/custom-section-alignment.ll
The file was modified clang/lib/CodeGen/ObjectFilePCHContainerOperations.cpp
The file was modified llvm/lib/MC/WasmObjectWriter.cpp
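Padding a LEB128 length field works because trailing continuation bytes that carry zero bits do not change the decoded value. The sketch below is a generic Python model of that trick, not the WasmObjectWriter code. The function names and the `payload_offset_if_unpadded` parameter are assumptions for illustration.

```python
def encode_uleb128(value, pad_to=0):
    """Encode value as ULEB128, emitting at least pad_to bytes.

    Extra bytes carry only a continuation bit and zero payload, so the
    decoded value is unchanged.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        more = value != 0 or len(out) + 1 < pad_to
        out.append(byte | (0x80 if more else 0))
        if not more:
            return bytes(out)


def decode_uleb128(data):
    """Decode a ULEB128 value; return (value, number of bytes consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, index + 1
    raise ValueError("truncated ULEB128")


def padded_name_length(name, payload_offset_if_unpadded, alignment=4):
    """Encode len(name) with enough padding to align the payload.

    Assumes len(name) < 128, so the unpadded field is one byte long, and that
    payload_offset_if_unpadded (hypothetical parameter) is where the payload
    would start with that one-byte field.
    """
    padding = -payload_offset_if_unpadded % alignment
    return encode_uleb128(len(name), pad_to=1 + padding)
```

For example, a length of 10 padded to four bytes encodes as `8a 80 80 00` and still decodes to 10. With 4-byte alignment the padding never exceeds 3 bytes, matching the bound stated in the commit message.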
Commit 9c62bb55f473a9d0db16b894708ed09f2346ae9d by rob.suderman
Implementation of `ReshapeNoopOptimization` canonicalizer.

This canonicalizer replaces reshapes of constant tensors with constants that contain the updated shape (skipping the reshape operation).

Differential Revision: https://reviews.llvm.org/D112038
The file was modified mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
The file was modified mlir/test/Dialect/Tosa/canonicalize.mlir
Commit a897590f11b6cb2bacf6cd317a5f96b1d39ed2f2 by stellaraccident
Add MLIR_INSTALL_AGGREGATE_OBJECTS and default it to ON.

* Package maintainers can opt to disable installation of these objects.
* Per discussion on https://reviews.llvm.org/D111504

Differential Revision: https://reviews.llvm.org/D112090
The file was modified mlir/CMakeLists.txt
The file was modified mlir/cmake/modules/AddMLIR.cmake
The file was modified mlir/cmake/modules/MLIRConfig.cmake.in
Commit b6b7fe60a444e03387b0e8be31bc1742ead36b25 by tra
[NVPTX] Add a late SROA pass which allows optimizing away more allocas.

Fixes performance regression https://bugs.llvm.org/show_bug.cgi?id=52037

Differential Revision: https://reviews.llvm.org/D111471
The file was added llvm/test/CodeGen/NVPTX/b52037.ll
The file was modified llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
Commit cd40b5a71290bab313cc431fb9a90ac3f9f3fa02 by yhs
BPF: set .BTF and .BTF.ext section alignment to 4

Currently, .BTF and .BTF.ext have a default alignment of 1.
For example,
  $ cat t.c
    int foo() { return 0; }
  $ clang -target bpf -O2 -c -g t.c
  $ llvm-readelf -S t.o
    ...
    Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    ...
    [ 7] .BTF              PROGBITS        0000000000000000 000167 00008b 00      0   0  1
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f2 000050 00      0   0  1

But to avoid misaligned data access, .BTF and .BTF.ext
actually require an alignment of 4. Misalignment is not an issue
for architectures like x64/arm64, as they can handle it well. But
some architectures like mips may incur a trap if .BTF/.BTF.ext
is not properly aligned.

This patch explicitly forces .BTF and .BTF.ext alignment to be 4.
For the above example, we will have
    [ 7] .BTF              PROGBITS        0000000000000000 000168 00008b 00      0   0  4
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f4 000050 00      0   0  4

Differential Revision: https://reviews.llvm.org/D112106
The file was modified llvm/lib/Target/BPF/BTFDebug.cpp
The file was added llvm/test/CodeGen/BPF/BTF/align.ll
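The change in section offsets shown in the llvm-readelf output above follows from rounding each section's start up to its alignment. The sketch below reproduces that arithmetic with the offsets and sizes from the example. The helper names are assumptions, and the layout model (sections placed back to back) is a simplification rather than the actual ELF writer logic.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def layout(start, sizes, alignment):
    """Place sections back to back from start, aligning each one's offset."""
    offsets = []
    cursor = start
    for size in sizes:
        cursor = align_up(cursor, alignment)
        offsets.append(cursor)
        cursor += size
    return offsets


# Sizes of .BTF and .BTF.ext from the example output above.
BTF_SIZES = [0x8B, 0x50]
```

Starting at 0x167, an alignment of 1 gives offsets 0x167 and 0x1f2, while an alignment of 4 gives 0x168 and 0x1f4, which matches the before and after listings.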