  1. [ASTImporter] Properly delete decls from SavedImportPaths (details)
  2. AMDGPU: Fix interaction of tfe and d16 (details)
  3. AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec (details)
  4. AMDGPU/GlobalISel: Fix RegbankSelect for llvm.amdgcn.fmul.legacy (details)
  5. [MachineScheduler] Allow clustering mem ops with complex addresses (details)
  6. [AArch64][SVE] Add patterns for unpredicated load/store to (details)
  7. [ARM] MVE Gather Scatter cost model tests. NFC (details)
  8. [ARM] Basic gather scatter cost model (details)
  9. [VE] setcc isel patterns (details)
  10. [InstCombine] fneg(X + C) --> -C - X (details)
  11. Unconditionally enable lvalue function designators; NFC (details)
Commit 4481eefbe8425c63289186dd13319aaa7043e67f by Raphael Isemann
[ASTImporter] Properly delete decls from SavedImportPaths
Summary: We see a significant regression (~40% slower on large
codebases) in expression evaluation after a recent change. A sampling
profile shows the extra time is spent in SavedImportPathsTy::operator[]
when called from ASTImporter::Import. I believe this is because
ASTImporter::Import adds an element to the SavedImportPaths map for
each decl unconditionally.
To fix this, we call SavedImportPathsTy::erase on the declaration rather
than clearing its value vector. That way we do not accidentally
introduce new empty elements.  (With this patch the performance is
restored, and we do not see SavedImportPathsTy::operator[] in the
profile anymore.)
Reviewers: martong, teemperor, a.sidorin, shafik
Reviewed By: martong
Subscribers: rnkovacs, cfe-commits
Tags: #clang
Differential Revision:
The file was modified clang/lib/AST/ASTImporter.cpp (diff)
Commit 9c928649a085646c4c779bac095643b50b464d83 by arsenm2
AMDGPU: Fix interaction of tfe and d16
This was using the wrong result register, and dropping the result entirely
for v2f16. This would fail to select on the scalar case. I believe it
was also mishandling packed/unpacked subtargets.
The file was modified llvm/lib/Target/AMDGPU/SIISelLowering.cpp (diff)
The file was added llvm/test/CodeGen/AMDGPU/image-load-d16-tfe.ll
Commit a722cbf77cc638064592c508ea0c1be13775ee31 by arsenm2
AMDGPU/GlobalISel: Handle atomic_inc/atomic_dec
The intermediate instruction drops the extra volatile argument. We are
missing an atomic ordering on these.
The file was modified llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (diff)
The file was modified llvm/lib/Target/AMDGPU/ (diff)
The file was modified llvm/lib/Target/AMDGPU/ (diff)
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/
The file was modified llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (diff)
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.dec.ll
The file was removed llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.atomic.dec.mir
The file was modified llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (diff)
The file was removed llvm/test/CodeGen/AMDGPU/GlobalISel/
The file was modified llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h (diff)
Commit 70096ca111ee2848fb2e29a7cb3e4fb7e3ba9ef9 by arsenm2
AMDGPU/GlobalISel: Fix RegbankSelect for llvm.amdgcn.fmul.legacy
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.fmul.legacy.mir
The file was modified llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (diff)
Commit e0f0d0e55cc7d389ad0692fbc9678e7895978355 by jay.foad
[MachineScheduler] Allow clustering mem ops with complex addresses
The generic BaseMemOpClusterMutation calls into TargetInstrInfo to
analyze the address of each load/store instruction, and again to decide
whether two instructions should be clustered. Previously this had to
represent each address as a single base operand plus a constant byte
offset. This patch extends it to support any number of base operands.
The old target hook getMemOperandWithOffset is now a convenience
function for callers that are only prepared to handle a single base
operand. It calls the new, more general target hook
getMemOperandsWithOffset.
The only requirements for the base operands returned by
getMemOperandsWithOffset are:
- they can be sorted by MemOpInfo::Compare, such that clusterable ops
get sorted next to each other, and
- shouldClusterMemOps knows what they mean.
One simple follow-on is to enable clustering of AMDGPU FLAT instructions
with both vaddr and saddr (base register + offset register). I've left a
FIXME in the code for this case.
Differential Revision:
The file was modified llvm/include/llvm/CodeGen/TargetInstrInfo.h (diff)
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.h (diff)
The file was modified llvm/lib/CodeGen/TargetInstrInfo.cpp (diff)
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.h (diff)
The file was modified llvm/lib/Target/Lanai/LanaiInstrInfo.h (diff)
The file was modified llvm/lib/Target/Hexagon/HexagonInstrInfo.h (diff)
The file was modified llvm/lib/CodeGen/MachineScheduler.cpp (diff)
The file was modified llvm/lib/Target/Lanai/LanaiInstrInfo.cpp (diff)
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (diff)
The file was modified llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp (diff)
The file was modified llvm/lib/Target/X86/X86InstrInfo.cpp (diff)
The file was modified llvm/lib/Target/X86/X86InstrInfo.h (diff)
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (diff)
Commit 4cf16efe49766d454eda74927a547a0cf587f540 by sander.desmalen
[AArch64][SVE] Add patterns for unpredicated load/store to
This patch also fixes up a number of cases in DAGCombine and
SelectionDAGBuilder where the size of a scalable vector is used in a
fixed-width context (thus triggering an assertion failure).
Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally
Reviewed By: efriedma
Tags: #llvm
Differential Revision:
The file was modified llvm/include/llvm/Analysis/MemoryLocation.h (diff)
The file was modified llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (diff)
The file was modified llvm/lib/Analysis/Loads.cpp (diff)
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (diff)
The file was modified llvm/lib/Target/AArch64/ (diff)
The file was modified llvm/lib/Target/AArch64/ (diff)
The file was modified llvm/lib/CodeGen/CodeGenPrepare.cpp (diff)
The file was added llvm/test/CodeGen/AArch64/spillfill-sve.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (diff)
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (diff)
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (diff)
The file was modified llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h (diff)
Commit 0b83e14804c46aaf8ba40863bb6d1a3cf175b997 by
[ARM] MVE Gather Scatter cost model tests. NFC
The file was added llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll
Commit e9c198278e2193a8ba78686ef8acc49c587dd40e by
[ARM] Basic gather scatter cost model
This is a very basic MVE gather/scatter cost model, based roughly on the
code that we will currently produce. It does not handle truncating
scatters or extending gathers correctly yet, as it is difficult to tell
that they are going to be correctly extended/truncated from the limited
information in the cost function.
This can be improved as we extend support for these in the future.
Based on code originally written by David Sherwood.
Differential Revision:
The file was modified llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll (diff)
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp (diff)
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.h (diff)
Commit dc69265eea888e8c6255aebcdd6650420dd00cfb by simon.moll
[VE] setcc isel patterns
Summary: SETCC isel patterns and tests for i32/64 and fp32/64 comparison
Reviewers: arsenm, rengolin, craig.topper, k-ishizaka
Reviewed By: arsenm
Subscribers: merge_guards_bot, wdng, hiraditya, llvm-commits
Tags: #ve, #llvm
Differential Revision:
The file was added llvm/test/CodeGen/VE/setccf64.ll
The file was added llvm/test/CodeGen/VE/setcci64i.ll
The file was modified llvm/lib/Target/VE/ (diff)
The file was added llvm/test/CodeGen/VE/setcci32i.ll
The file was added llvm/test/CodeGen/VE/setccf32.ll
The file was added llvm/test/CodeGen/VE/setcci64.ll
The file was modified llvm/lib/Target/VE/VEISelLowering.cpp (diff)
The file was added llvm/test/CodeGen/VE/setccf32i.ll
The file was added llvm/test/CodeGen/VE/setcci32.ll
The file was added llvm/test/CodeGen/VE/setccf64i.ll
Commit 0ade2abdb01f4a16b1f08d1a78d664b9e9d5f3b5 by spatel
[InstCombine] fneg(X + C) --> -C - X
This is one of the potential folds uncovered by extending D72521.
We don't seem to do this in the backend either (unless I'm not seeing
some target-specific transform).
icc and gcc (appears to be target-specific) do this transform.
Differential Revision:
The file was modified llvm/test/Transforms/InstCombine/fneg.ll (diff)
The file was modified llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (diff)
Commit 968561bcdc34c7d74482fe3bb69a045abf08d2c1 by aaron
Unconditionally enable lvalue function designators; NFC
We previously had to guard against older MSVC and GCC versions which had
rvalue references but not support for marking functions with ref
qualifiers. However, having bumped our minimum required version to MSVC
2017 and GCC 5.1 means we can unconditionally enable this feature. Rather
than keeping the macro around, this replaces use of the macro with the
actual ref qualifier.
The file was modified llvm/include/llvm/Support/Compiler.h (diff)
The file was modified llvm/include/llvm/ADT/Optional.h (diff)
The file was modified llvm/include/llvm/ADT/PointerIntPair.h (diff)
The file was modified clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h (diff)
The file was modified llvm/unittests/ADT/OptionalTest.cpp (diff)