Success

Changes

Summary

  1. [X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.
  2. [SLP] Fix spill cost computation for insertelement tree node
  3. [VectorCombine] Add tests with assumes involving variable index.
  4. [Local] collectBitParts - reduce maximum recursion depth.
  5. [Local] collectBitParts - for bswap-only matches, limit shift amounts to whole bytes to reduce compile time.
  6. IR+AArch64: add a "swiftasync" argument attribute.
  7. [WebAssembly] Support Emscripten EH/SjLj in Wasm64
  8. [WebAssembly] Omit DBG_VALUE after terminator
  9. [LoopVectorizationLegality] NFC: Mark some interfaces as 'const'
  10. [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPS tests
  11. [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPS tests
  12. [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPS tests
  13. [X86] AMD Zen 3: same-reg SSE XMM ANDNPS is a 1-cycle(!) dep-breaking zero-idiom
  14. [X86] AMD Zen 3: same-reg AVX XMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
  15. [X86] AMD Zen 3: same-reg AVX YMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom
  16. [NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPD tests
  17. [NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPD tests
  18. [NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPD tests
  19. [X86] AMD Zen 3: same-reg SSE XMM ANDNPD is a 1-cycle(!) dep-breaking zero-idiom
  20. [X86] AMD Zen 3: same-reg AVX XMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
  21. [X86] AMD Zen 3: same-reg AVX YMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom
  22. [TableGen] Remove unneeded forward defs. NFC.
  23. [Transforms][Debugify] Fix "Missing line" false alarm on PHI nodes
  24. [clang][NFC] remove unused return value
  25. [SDAG] reduce code duplication for extend_vec_inreg combines; NFC
Commit 5ed56a821c0622869739a3ae752eea97a1ee1f48 by llvm-dev
[X86] Try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.
The file was modified: llvm/lib/Target/X86/X86FloatingPoint.cpp
The file was modified: llvm/lib/Target/X86/X86CallFrameOptimization.cpp
The file was modified: llvm/lib/Target/X86/X86OptimizeLEAs.cpp
The file was modified: llvm/lib/Target/X86/X86PadShortFunction.cpp
The file was modified: llvm/lib/Target/X86/X86SpeculativeLoadHardening.cpp
The file was modified: llvm/lib/Target/X86/X86WinAllocaExpander.cpp
The file was modified: llvm/lib/Target/X86/X86FrameLowering.cpp
The file was modified: llvm/lib/Target/X86/X86CmovConversion.cpp
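
To illustrate why passing by const reference avoids copies, here is a minimal sketch. It does not use LLVM's DebugLoc; `TrackedLoc`, `copyCount`, `emitByValue` and `emitByConstRef` are hypothetical stand-ins, and the copy constructor only counts copies rather than modelling the real tracking cost.

```cpp
// Hypothetical stand-in for a type whose copy constructor is expensive.
// This is NOT llvm::DebugLoc; it only counts how often it is copied.
inline int &copyCount() {
  static int Count = 0;
  return Count;
}

struct TrackedLoc {
  TrackedLoc() = default;
  TrackedLoc(const TrackedLoc &) { ++copyCount(); }
};

// Taking the parameter by value copies the argument on every call.
inline void emitByValue(TrackedLoc DL) { (void)DL; }

// Taking it by const reference binds to the caller's object: no copy.
inline void emitByConstRef(const TrackedLoc &DL) { (void)DL; }
```

Calling `emitByValue` with an existing object increments the counter once per call, while `emitByConstRef` leaves it unchanged.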
Commit 207cdd7ed9fc545a615d9bb244a7d9a2158e61ed by anton.a.afanasyev
[SLP] Fix spill cost computation for insertelement tree node

This is a bug-fix follow-up to D98714.
The file was modified: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
Commit 7ba0e99aec6e461f41ebad608893f8c280836165 by flo
[VectorCombine] Add tests with assumes involving variable index.

Add test cases with variable indices together with assumes guaranteeing
that the indices are valid.
The file was modified: llvm/test/Transforms/VectorCombine/load-insert-store.ll
The file was modified: llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll
Commit 78c8451cd7b1d789fdd81beac0d3f7172bdce31c by llvm-dev
[Local] collectBitParts - reduce maximum recursion depth.

As noticed on D90170, the recursion depth needed to match values up to an i128 bit width was too high.

@lebedev.ri mentioned that we can probably do better by limiting the number of collected Values instead of just depth, but I'll look at that later.
The file was modified: llvm/lib/Transforms/Utils/Local.cpp
Commit 079bbea2b20dbfd24e4df654bae1c4324dcde754 by llvm-dev
[Local] collectBitParts - for bswap-only matches, limit shift amounts to whole bytes to reduce compile time.
The file was modified: llvm/lib/Transforms/Utils/Local.cpp
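
The restriction described in the commit title can be sketched as a simple predicate. This is an illustrative assumption about the shape of the check, not the code in Local.cpp; the function name `isShiftAmountWorthMatching` and its parameters are hypothetical.

```cpp
// Hypothetical predicate: when only bswap patterns are being matched
// (no bit reversals), a shift by a non-multiple of 8 bits cannot move
// whole bytes, so the matcher can give up early.
constexpr bool isShiftAmountWorthMatching(unsigned ShiftAmtInBits,
                                          bool MatchBitReversals) {
  // Bit-reversal matching may need arbitrary bit offsets.
  if (MatchBitReversals)
    return true;
  // bswap-only: accept whole-byte shifts only.
  return ShiftAmtInBits % 8 == 0;
}
```

Rejecting such shifts early avoids recursing into operands that could never form a byte swap, which is where the compile-time saving would come from.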
Commit ea0eec69f16e0f1b00fec413986e4e44f6f627fa by Tim Northover
IR+AArch64: add a "swiftasync" argument attribute.

This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.
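
As a rough sketch of how a stack-walking tool might interpret these tags, the classification could look like the following. Only the bit positions and the three values come from the description above; the enum and function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical names; only the tag values in bits 63:60 are from the
// commit message.
enum class FrameRecordKind { Normal, SwiftAsyncExtended, Kernel, Reserved };

// Classify a saved FP value by the 4-bit tag in bits 63:60.
constexpr FrameRecordKind classifyFrameRecord(std::uint64_t SavedFP) {
  switch (SavedFP >> 60) {
  case 0x0:
    return FrameRecordKind::Normal; // non-extended record [FP, LR]
  case 0x1:
    return FrameRecordKind::SwiftAsyncExtended; // extended [X22, FP, LR]
  case 0xF:
    return FrameRecordKind::Kernel; // kernel space, non-extended record
  default:
    return FrameRecordKind::Reserved; // all other values are reserved
  }
}
```

A walker that sees `SwiftAsyncExtended` would know an extra slot holding the async context precedes the usual [FP, LR] pair; how it treats `Reserved` is left open here.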

When compiling for arm64e, this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
The file was modified: llvm/lib/CodeGen/PrologEpilogInserter.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
The file was modified: llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
The file was modified: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64InstrFormats.td
The file was modified: llvm/lib/IR/Verifier.cpp
The file was added: llvm/test/CodeGen/AArch64/swift-async-unwind.ll
The file was modified: llvm/lib/Target/AArch64/AArch64CallingConvention.td
The file was modified: llvm/lib/Bitcode/Reader/BitcodeReader.cpp
The file was modified: llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
The file was added: llvm/test/CodeGen/AArch64/swift-async.ll
The file was modified: llvm/lib/Target/AArch64/AArch64FastISel.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
The file was modified: llvm/test/Bitcode/attributes.ll
The file was modified: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
The file was modified: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
The file was modified: llvm/lib/AsmParser/LLLexer.cpp
The file was modified: llvm/include/llvm/CodeGen/TargetCallingConv.h
The file was modified: llvm/lib/Target/AArch64/AArch64InstrInfo.td
The file was modified: llvm/lib/IR/Attributes.cpp
The file was modified: llvm/include/llvm/Bitcode/LLVMBitCodes.h
The file was modified: llvm/docs/LangRef.rst
The file was modified: llvm/include/llvm/AsmParser/LLToken.h
The file was added: llvm/test/Verifier/swiftasync.ll
The file was modified: llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
The file was modified: llvm/include/llvm/CodeGen/TargetFrameLowering.h
The file was modified: llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
The file was modified: llvm/test/Bitcode/compatibility.ll
The file was modified: llvm/lib/AsmParser/LLParser.cpp
The file was modified: llvm/include/llvm/Target/TargetCallingConv.td
The file was modified: llvm/include/llvm/CodeGen/TargetLowering.h
The file was modified: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64FrameLowering.h
The file was modified: llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was modified: llvm/include/llvm/IR/Attributes.td
The file was added: llvm/test/CodeGen/AArch64/swift-async-reg.ll
The file was modified: llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp
The file was modified: llvm/lib/Transforms/Utils/CodeExtractor.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified: llvm/include/llvm/IR/Intrinsics.td
Commit 8e35a18e4ad416c48c8e48492676fb189ee2c720 by aheejin
[WebAssembly] Support Emscripten EH/SjLj in Wasm64

In wasm64, the signatures of some library functions and global variables
defined in Emscripten change:
- `emscripten_longjmp`: `(i32, i32) -> ()` to `(i64, i32) -> ()`
  This changes because the first argument is the address of a memory
  buffer. This in turn causes more changes below.
- `setThrew`: `(i32, i32) -> ()` to `(i64, i32) -> ()`
  `emscripten_longjmp` calls `setThrew` with the i64 buffer argument as
  the first parameter.
- `__THREW__` (global var): `i32` to `i64`
  `setThrew`'s first argument is set to this `__THREW__` variable, so it
  should change to i64 as well.
- `testSetjmp`: `(i32, i32*, i32) -> (i32)` to `(i64, i32*, i32) -> (i32)`
  In the code transformation done in this pass, the value of `__THREW__`
  is passed as the first parameter of `testSetjmp`.

This patch creates some helper functions to easily get types that differ
depending on whether the target is wasm32 or wasm64, and uses them to
change various function signatures and code transformations. It also
updates the tests with WASM32/WASM64 check lines.

(Untested) Emscripten side patch: https://github.com/emscripten-core/emscripten/pull/14108

Reviewed By: aardappel

Differential Revision: https://reviews.llvm.org/D101985
The file was modified: llvm/test/CodeGen/WebAssembly/lower-em-ehsjlj-options.ll
The file was modified: llvm/test/CodeGen/WebAssembly/lower-em-sjlj.ll
The file was modified: llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
The file was modified: llvm/test/CodeGen/WebAssembly/lower-em-exceptions.ll
Commit 71fbfb499aaaefbb2b24e5652df5525684766bfc by aheejin
[WebAssembly] Omit DBG_VALUE after terminator

When a stackified variable has an associated `DBG_VALUE` instruction,
the DebugFixup pass adds a `DBG_VALUE` instruction after the stackified
value's last use to clear the variable's debug range info. But when the
last-use instruction is a terminator, this can cause a verification
failure (when run with `-verify-machineinstrs`) because no instructions
are allowed after a terminator.

For example:
```
%myvar = ...
DBG_VALUE target-index(wasm-operand-stack), $noreg, !"myvar", ...
BR_IF 0, %myvar, ...
DBG_VALUE $noreg, $noreg, !"myvar", ...
```
In this test, `%myvar` is stackified, so the first `DBG_VALUE`
instruction's first operand has changed to `wasm-operand-stack` to
denote it. And an additional `DBG_VALUE` instruction is added after its
last use, `BR_IF`, to signal variable `myvar` is not in the operand
stack anymore. But because the `DBG_VALUE` instruction is added after
the `BR_IF`, a terminator, it fails MachineVerifier.

`DBG_VALUE` instructions are used in `DbgEntityHistoryCalculator` to
compute value ranges to emit DWARF info, and it turns out the
`DbgEntityHistoryCalculator` terminates ranges at the end of a BB, so we
don't need to emit `DBG_VALUE` after a terminator.
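
The resulting rule can be sketched with a toy model of a basic block. This is not the WebAssemblyDebugFixup code; `Inst` and `endDebugRange` are hypothetical, and instructions are represented as text only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: printable text plus a terminator flag (hypothetical).
struct Inst {
  std::string Text;
  bool IsTerminator;
};

// Insert a range-ending DBG_VALUE right after the last use of a
// stackified value, unless that last use is a terminator, in which case
// nothing is inserted. Returns true if an instruction was inserted.
inline bool endDebugRange(std::vector<Inst> &Block, std::size_t LastUseIdx,
                          const std::string &Var) {
  if (LastUseIdx >= Block.size() || Block[LastUseIdx].IsTerminator)
    return false;
  Block.insert(Block.begin() + static_cast<std::ptrdiff_t>(LastUseIdx) + 1,
               Inst{"DBG_VALUE $noreg, $noreg, !\"" + Var + "\"", false});
  return true;
}
```

Skipping the insert is safe under the reasoning above: the range is terminated at the end of the basic block anyway, so the extra `DBG_VALUE` adds nothing.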

Fixes https://bugs.llvm.org/show_bug.cgi?id=50175.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D102309
The file was modified: llvm/lib/Target/WebAssembly/WebAssemblyDebugFixup.cpp
The file was modified: llvm/test/CodeGen/WebAssembly/stackified-debug.ll
Commit f82966d19a8bb8531f71913737fcc751bb6ae3e2 by sander.desmalen
[LoopVectorizationLegality] NFC: Mark some interfaces as 'const'

This patch marks blockNeedsPredication, isConsecutivePtr, isMaskRequired
and getSymbolicStrides as 'const'.
The file was modified: llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
The file was modified: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
Commit a657808948f2aa886f7f3321bc29c5458091cf1b by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPS tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
Commit a57006d627d312ae0abb9d90a94f023576cf8886 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPS tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-xmm.s
Commit c79c7bb980054fa7c1ebe5aae0e90755fe9a1314 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPS tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-ymm.s
Commit f38dcbecb643e30931b56bbcf37254477eac3977 by lebedev.ri
[X86] AMD Zen 3: same-reg SSE XMM ANDNPS is a 1-cycle(!) dep-breaking zero-idiom

As with SSE XMM XORPS/XORPD, it is not zero-cycle even though it breaks the
dependency, as confirmed by exegesis measurements and the reference docs.
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
Commit fd4cbc822b6d4c8ce0e6e162e1c0648686e5a834 by lebedev.ri
[X86] AMD Zen 3: same-reg AVX XMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-xmm.s
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
Commit d8a595b81c114d2c8251790d2d2e55f6659db3bd by lebedev.ri
[X86] AMD Zen 3: same-reg AVX YMM VANDNPS is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-ymm.s
Commit 055fa84cd88fe56161d7a68d1ed0648c4bf2d35d by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg SSE XMM ANDNPD tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
Commit 0b7e52e7259ca82242c61b0df6336d03ad50b62d by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX XMM VANDNPD tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-xmm.s
Commit 3221e06e9b8559e8b26a9f4f0a6d1a39c29bc226 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add same-reg AVX YMM VANDNPD tests
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-ymm.s
Commit 38ceb46fb03d10e24b83d2261ad70efd555067a2 by lebedev.ri
[X86] AMD Zen 3: same-reg SSE XMM ANDNPD is a 1-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-sse-xmm.s
Commit 17f99a8a41c0291081894354b454c16742c29787 by lebedev.ri
[X86] AMD Zen 3: same-reg AVX XMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-xmm.s
Commit 4af4afe014a70f6b57a083b0d5565773dae6b094 by lebedev.ri
[X86] AMD Zen 3: same-reg AVX YMM VANDNPD is a zero-cycle(!) dep-breaking zero-idiom

As confirmed by exegesis measurements, and ref docs.
The file was modified: llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified: llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-avx-ymm.s
Commit 6ec66f681c3763916933057849587a44cfb8e6da by jay.foad
[TableGen] Remove unneeded forward defs. NFC.
The file was modified: llvm/include/llvm/Target/Target.td
The file was modified: llvm/include/llvm/Target/TargetSchedule.td
Commit 01c90bbd4fd12aa86db4a47577addb47e6e84289 by djordje.todorovic
[Transforms][Debugify] Fix "Missing line" false alarm on PHI nodes

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=49959

The "Missing line" false alarm was introduced in D75242.

Patch by Yilong Guo <yilong.guo@intel.com>

Differential Revision: https://reviews.llvm.org/D100446
The file was added: llvm/test/DebugInfo/debugify-ignore-phi.ll
The file was modified: llvm/lib/Transforms/Utils/Debugify.cpp
Commit 0566f979619cf49a62804a7e3530438f1319fa7c by nathan
[clang][NFC] remove unused return value

In working on p0388 (ary[N] -> ary[] conversion), I discovered neither
use of UnwrapSimilarArrayTypes used the return value. So let's nuke
it.

Differential Revision: https://reviews.llvm.org/D102480
The file was modified: clang/include/clang/AST/ASTContext.h
The file was modified: clang/lib/AST/ASTContext.cpp
Commit 9dfd7f9b6775fb6d5e51285ae211b6a77b747d98 by spatel
[SDAG] reduce code duplication for extend_vec_inreg combines; NFC

These are identical so far, and I was looking at adding a fold
for a pattern with scalar_to_vector which would also end up duplicated.
The file was modified: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp