Failed

Changes

Summary

  1. [DAGCombiner] allow store merging non-i8 truncated ops
  2. Reapply D70800: Fix AArch64 AAPCS frame record chain
  3. [test] Rewrite various tests to not use constprop
  4. [AArch64][SVE] Add lowering for llvm fceil
  5. [InstSimplify] Add additional umax tests (NFC)
  6. [InstSimplify] Fold min/max intrinsic based on icmp of operands
  7. [VectorCombine] adjust test for better coverage; NFC
  8. [libomptarget][amdgpu] Improve thread safety, remove dead code
  9. [mlir][vector] Add vector.bitcast operation
  10. [LangRef] Memset/memcpy/memmove can take undef/poison pointer if the size is 0
  11. [AArch64] Use CCAssignFnForReturn helper in more spots. NFC.
  12. [IR] Remove noundef from masked store/load/gather/scatter's pointer operands
  13. [X86] Default to -mtune=generic unless -march is passed to the driver. Add TuneCPU to the AST serialization
Commit 54a5dd485c4d04d142a58c9349ada0c897cbeae6 by spatel
[DAGCombiner] allow store merging non-i8 truncated ops

We have a gap in our store merging capabilities for shift+truncate
patterns as discussed in:
https://llvm.org/PR46662

I generalized the code/comments for this function in earlier commits,
so we only need to ease the type restriction and adjust the address/endian
checking to make this work.

AArch64 lets us switch endian to make sure that patterns are matched
either way.

Differential Revision: https://reviews.llvm.org/D86420
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
The file was modified llvm/test/CodeGen/X86/stores-merging.ll
The file was modified llvm/test/CodeGen/AArch64/merge-trunc-store.ll
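
To make the shift+truncate pattern concrete, here is a minimal C++ sketch of source code that can give rise to it. The names storeHalves and loadHalves are hypothetical and do not appear in the commit. Whether a given compiler actually merges the two stores depends on the target and optimization level.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical illustration: write a 32-bit value as two 16-bit stores.
// Each half comes from a (shift +) truncate of the same wide value, which
// is the shape of pattern the commit message describes for non-i8 pieces.
void storeHalves(std::uint32_t value, unsigned char *dst) {
  std::uint16_t lo = static_cast<std::uint16_t>(value);        // truncate
  std::uint16_t hi = static_cast<std::uint16_t>(value >> 16);  // shift, then truncate
  std::memcpy(dst, &lo, sizeof lo);      // store the low half at offset 0
  std::memcpy(dst + 2, &hi, sizeof hi);  // store the high half at offset 2
}

// Reads the two halves back and reassembles the value. The round trip
// holds regardless of host endianness because each half is read the same
// way it was written.
std::uint32_t loadHalves(const unsigned char *src) {
  std::uint16_t lo = 0;
  std::uint16_t hi = 0;
  std::memcpy(&lo, src, sizeof lo);
  std::memcpy(&hi, src + 2, sizeof hi);
  return static_cast<std::uint32_t>(lo) |
         (static_cast<std::uint32_t>(hi) << 16);
}
```

On a little-endian target the two 16-bit stores produce the same bytes as one 32-bit store of value. On a big-endian target the same layout does not in general match a plain 32-bit store, which is why the address/endian checking matters.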
Commit 9936455204fd6ab72715cc9d67385ddc93e072ed by resistor
Reapply D70800: Fix AArch64 AAPCS frame record chain

Original Commit Message:
After commit r368987 (rG643adb55769e) landed, the frame record (the FP and LR registers)
may be placed in the middle of a stack frame if a function has both callee-saved
general-purpose registers and floating-point registers. This breaks stack unwinders
that simply walk through the frame records (based on the guarantee from the AAPCS64
"The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.

Patch By: logan
The file was added llvm/test/CodeGen/AArch64/framelayout-frame-record.mir
The file was added llvm/test/CodeGen/AArch64/framelayout-fp-csr.ll
The file was modified llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
The file was modified llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
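
For readers unfamiliar with the kind of unwinder the message refers to, the following is a simplified C++ model of walking a chain of frame records. It is only a sketch: the FrameRecord struct and walkFrameRecords function are hypothetical, and a real unwinder reads these pairs from stack memory through the frame pointer register rather than from C++ objects.

```cpp
#include <cstdint>
#include <vector>

// Simplified model of a frame record: the saved frame pointer, which
// links to the caller's record, followed by the saved link register.
struct FrameRecord {
  const FrameRecord *callerFp;  // previous record, nullptr at the end of the chain
  std::uintptr_t lr;            // return address saved for this frame
};

// Follows the chain from the innermost record outward and collects the
// return addresses. This only works if every frame pointer really points
// at its frame record, which is the property the commit restores.
std::vector<std::uintptr_t> walkFrameRecords(const FrameRecord *fp) {
  std::vector<std::uintptr_t> returnAddresses;
  for (; fp != nullptr; fp = fp->callerFp)
    returnAddresses.push_back(fp->lr);
  return returnAddresses;
}
```

If the frame pointer pointed anywhere other than the start of the record, the loop above would read unrelated stack slots as the next link, so the walk would produce garbage or stop early.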
Commit d1e6103a791309764f2a281eb3a5da01b9946511 by aeubanks
[test] Rewrite various tests to not use constprop

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D86653
The file was modified llvm/test/Transforms/Reassociate/2002-05-15-SubReassociate.ll
The file was modified llvm/test/Transforms/Reassociate/fast-SubReassociate.ll
The file was modified llvm/test/Transforms/Reassociate/otherops.ll
The file was modified llvm/test/Transforms/Inline/externally_available.ll
Commit fd536eeed99effb190337d1e500ef8e2dbb74920 by dancgr
[AArch64][SVE] Add lowering for llvm fceil

Add the functionality to lower fceil for the passthru variant.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D84548
The file was modified llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
The file was modified llvm/test/CodeGen/AArch64/sve-fp.ll
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.h
The file was modified llvm/lib/Target/AArch64/SVEInstrFormats.td
Commit b73c5a0736fd6e42cf8ca7330cd4bb98aee6bcdc by nikita.ppv
[InstSimplify] Add additional umax tests (NFC)

A sample of some folds we get if we perform icmp simplification
on min/max intrinsics.
The file was modified llvm/test/Transforms/InstSimplify/maxmin_intrinsics.ll
Commit d7c119d89c5f6d0789cfd0a139c80e23912c0bb0 by nikita.ppv
[InstSimplify] Fold min/max intrinsic based on icmp of operands

This is a reboot of D84655, now performing the inner icmp
simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp()
fold based on this, by moving the logic into icmp of min/max instead,
making it more general. We can't drop the folds for constant operands,
because those also allow undef, which we exclude here.

The tests use assumes for exhaustive coverage, and have a few
more examples of misc folds we get based on icmp simplification.

Differential Revision: https://reviews.llvm.org/D85929
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified llvm/test/Transforms/InstSimplify/maxmin_intrinsics.ll
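
The idea of folding a min/max based on a comparison of its operands can be sketched outside of LLVM as follows. This is a toy model for umax only, under stated assumptions: the Folded enum and the functions foldUMax and foldIsSoundFor8Bit are hypothetical, and the optional argument stands in for the result of an icmp simplification query that may or may not succeed. It is not the code in InstructionSimplify.cpp.

```cpp
#include <algorithm>
#include <optional>

// Which operand a umax(first, second) folds to, if any.
enum class Folded { None, First, Second };

// Toy model of the fold. `firstUgeSecond` is the answer to the question
// "is first >= second (unsigned)?", or std::nullopt when the comparison
// could not be simplified to a constant.
Folded foldUMax(std::optional<bool> firstUgeSecond) {
  if (!firstUgeSecond)
    return Folded::None;  // nothing known, keep the umax
  return *firstUgeSecond ? Folded::First : Folded::Second;
}

// Exhaustively checks the model against std::max for all 8-bit pairs.
bool foldIsSoundFor8Bit() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 256; ++y) {
      Folded f = foldUMax(x >= y);
      unsigned got = (f == Folded::First) ? x : y;
      if (f == Folded::None || got != std::max(x, y))
        return false;
    }
  }
  return true;
}
```

The same shape applies to the other min/max flavors with the matching predicate, for example a signed comparison for smax.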
Commit 9cea682faaa097e15891b945e74e7a8fdb4d7069 by spatel
[VectorCombine] adjust test for better coverage; NFC

A >2x insert might crash if we do not generate the shuffle mask carefully.

D86160
The file was modified llvm/test/Transforms/VectorCombine/X86/load.ll
Commit 5d989fb37d7cfb4f7766a45d4efc82b5add3811f by jonchesterfield
[libomptarget][amdgpu] Improve thread safety, remove dead code
The file was modified openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/impl/atmi.cpp
The file was removed openmp/libomptarget/plugins/amdgpu/impl/atmi_kl.h
The file was modified openmp/libomptarget/plugins/amdgpu/impl/atmi.h
The file was modified openmp/libomptarget/plugins/amdgpu/impl/atmi_interop_hsa.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/impl/rt.h
The file was modified openmp/libomptarget/plugins/amdgpu/impl/machine.h
The file was modified openmp/libomptarget/plugins/amdgpu/impl/data.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/impl/atmi_runtime.h
The file was modified openmp/libomptarget/plugins/amdgpu/impl/utils.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/impl/system.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/impl/machine.cpp
Commit 5fbfe2ec4f8baf6a4729f9dc2e4fe16f269921eb by thomasraoux
[mlir][vector] Add vector.bitcast operation

Based on the RFC discussed here:
https://llvm.discourse.group/t/rfc-vector-standard-add-bitcast-operation/1628/

Adding a vector.bitcast operation that allows casting to a vector of different
element type. The most minor dimension bitwidth must stay unchanged.

Differential Revision: https://reviews.llvm.org/D86580
The file was modified mlir/test/Dialect/Vector/invalid.mlir
The file was modified mlir/test/Dialect/Vector/canonicalize.mlir
The file was modified mlir/include/mlir/Dialect/Vector/VectorOps.td
The file was modified mlir/lib/Dialect/Vector/VectorOps.cpp
The file was modified mlir/test/Dialect/Vector/ops.mlir
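
The constraint that the most minor dimension bitwidth must stay unchanged can be illustrated with a small standalone check. This is a sketch, not the MLIR verifier: the VecType struct and isBitcastCompatible function are hypothetical, and the requirement that rank and leading dimensions match is an assumption made for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vector type: a shape plus element bit width.
struct VecType {
  std::vector<std::int64_t> shape;  // for example {4, 8} for a 4x8 vector
  unsigned elemBits;                // for example 32 for f32 or i32
};

// Returns true when `dst` could be a bitcast of `src` under the rule from
// the commit message: the bit width of the most minor (last) dimension,
// i.e. last extent times element width, must be the same on both sides.
// Assumption: rank and all leading dimensions must also be identical.
bool isBitcastCompatible(const VecType &src, const VecType &dst) {
  if (src.shape.empty() || src.shape.size() != dst.shape.size())
    return false;
  const std::size_t last = src.shape.size() - 1;
  for (std::size_t i = 0; i < last; ++i)
    if (src.shape[i] != dst.shape[i])
      return false;
  return src.shape[last] * src.elemBits == dst.shape[last] * dst.elemBits;
}
```

Under this model a 4x8 vector of 32-bit elements can be cast to a 4x16 vector of 16-bit elements, since both have 256 bits in the last dimension, but not to a 4x8 vector of 16-bit elements.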
Commit 24dd04116db34e97271a520b6ab2397c67c627cb by aqjune
[LangRef] Memset/memcpy/memmove can take undef/poison pointer if the size is 0

According to the current LangRef, Memset/memcpy/memmove can take a
null/dangling pointer if the size is zero.
(Relevant thread: http://lists.llvm.org/pipermail/llvm-dev/2017-July/115665.html )
This patch expands it and allows the functions to take undef/poison pointers
too.

This required updating the align attribute, since the alignment of
undef/poison pointers was not specified.
This patch states that their alignment is 1.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86643
The file was modified llvm/docs/LangRef.rst
Commit 383f7c88589c5cf60fc09fd7d9b30ddd65642c34 by Ahmed Bougacha
[AArch64] Use CCAssignFnForReturn helper in more spots. NFC.

It was added for GISel, but SDAG could use it too!
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Commit 0c55889d809027136048a0d144209a2bc282e7fc by aqjune
[IR] Remove noundef from masked store/load/gather/scatter's pointer operands

As discussed in D86576, the noundef attribute is removed from the pointer operands
of masked store/load/gather/scatter.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86656
The file was modified mlir/test/Target/llvmir-intrinsics.mlir
The file was modified llvm/include/llvm/IR/Intrinsics.td
Commit 71f3169e1baeff262583b35ef88f8fb6df7be85e by craig.topper
[X86] Default to -mtune=generic unless -march is passed to the driver. Add TuneCPU to the AST serialization

This patch defaults to -mtune=generic unless -march is present. If -march is present, we'll use the empty string unless it's overridden by -mtune. The backend should use the target CPU if the tune CPU isn't present.

It also adds AST serialization support to fix some tests that emit AST and parse it back. These tests diff the IR against the output from not going through AST. So if we don't serialize the tune CPU we fail the diff.

Differential Revision: https://reviews.llvm.org/D86488
The file was modified clang/lib/Serialization/ASTWriter.cpp
The file was modified clang/lib/Frontend/FrontendActions.cpp
The file was modified clang/test/Driver/x86-mtune.c
The file was modified clang/lib/Frontend/CompilerInvocation.cpp
The file was modified clang/lib/Basic/Targets/X86.h
The file was modified clang/lib/Serialization/ASTReader.cpp
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was modified clang/test/Modules/module_file_info.m
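
The selection rule described in the X86 -mtune commit message can be summarized in a few lines. This is a sketch of the described behavior only, not the driver code: chooseTuneCPU and effectiveTuneCPU are hypothetical names, and treating an explicit -mtune as always taking precedence is an assumption drawn from the word "default" in the message.

```cpp
#include <optional>
#include <string>

// Driver side: decide which tune CPU string to pass along.
//   mtune    - value of -mtune if the user gave one
//   hasMarch - whether -march was passed to the driver
std::string chooseTuneCPU(const std::optional<std::string> &mtune,
                          bool hasMarch) {
  if (mtune)
    return *mtune;                    // assumption: explicit -mtune always wins
  return hasMarch ? "" : "generic";  // empty with -march, else generic
}

// Backend side: fall back to the target CPU when no tune CPU is present.
std::string effectiveTuneCPU(const std::string &targetCPU,
                             const std::string &tuneCPU) {
  return tuneCPU.empty() ? targetCPU : tuneCPU;
}
```

With no flags the result is "generic". With only -march the driver passes an empty tune CPU and the backend tunes for the target CPU. With -mtune the given value is used either way.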