Success

Changes

Summary

  1. [mlir] fix shared-libs build
  2. [LoopVectorize] Don't use strict reductions when reordering is allowed
  3. NVPTXTargetLowering::LowerReturn - Pass DataLayout by reference. NFCI.
  4. ValueTrackingTest.cpp - Pass DataLayout by reference. NFCI.
  5. MemCpyOptimizer.cpp - hasUndefContentsMSSA - Pass DataLayout by reference. NFCI.
  6. [CostModel][X86] Improve AVX1/AVX2 truncation costs
  7. OptBisect.cpp - remove unused include. NFCI.
  8. [InstCombine] Add instcombine fold for extractelement + splat for scalable vectors
  9. [RISCV] Add a test case showing inefficient vector codegen
  10. [OpenCL] Add memory_scope_all_devices
  11. [CostModel] Return an invalid cost for memory ops with unsupported types
  12. [OpenMP][OMPD] Implementation of OMPD debugging library - libompd.
  13. [LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
  14. [clang] p1099 using enum part 1
  15. [VE][NFC] IRBuilder<> -> IRBuilderBase
Commit 7116468ca9d0ebec1d97d58d99879eaf7e7a3982 by zinenko
[mlir] fix shared-libs build
The file was modified mlir/lib/Analysis/CMakeLists.txt
Commit 14eeccfe9adb372c76d11d6ffa98fdb6e9808acc by kerry.mclaughlin
[LoopVectorize] Don't use strict reductions when reordering is allowed

If the `-enable-strict-reductions` flag is set to true, then currently we will
always choose to vectorize the loop with strict in-order reductions. This is
not necessary where we allow the reordering of FP operations, such as
when loop hints are passed via metadata.

This patch moves useOrderedReductions so that we can also check whether
loop hints allow reordering, in which case we should use the default
behaviour of vectorizing with unordered reductions.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D103814
The file was modified llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
The file was modified llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll
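To illustrate why the distinction between strict and unordered reductions matters, here is a minimal sketch in plain C++ (not LLVM code). The function names `sumInOrder` and `sumTwoLanes` are hypothetical, and the two-lane split only approximates what a vectorized loop might compute. It assumes IEEE single-precision arithmetic without fast-math flags.

```cpp
#include <array>
#include <cstddef>

// Strict (in-order) reduction: a single accumulator, with the elements
// added in source order.
float sumInOrder(const std::array<float, 4> &v) {
  float acc = 0.0f;
  for (float x : v)
    acc += x;
  return acc;
}

// Reordered reduction: two independent partial sums that are combined at
// the end, roughly what a 2-lane vectorized loop would compute.
float sumTwoLanes(const std::array<float, 4> &v) {
  float lane[2] = {0.0f, 0.0f};
  for (std::size_t i = 0; i < v.size(); ++i)
    lane[i % 2] += v[i];
  return lane[0] + lane[1];
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}` the in-order sum loses the first `1.0f` to rounding, while the two-lane sum cancels the large values first, so the two functions can return different results. Reordering is therefore only acceptable when the program explicitly permits it, for example through loop hints.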
Commit 27f3041c88ac4e392da7c1f071f8516947c7a1c7 by llvm-dev
NVPTXTargetLowering::LowerReturn - Pass DataLayout by reference. NFCI.
The file was modified llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
Commit 4ad59f9a5a9589e7d0608dafb99ee4f2db67cb95 by llvm-dev
ValueTrackingTest.cpp - Pass DataLayout by reference. NFCI.
The file was modified llvm/unittests/Analysis/ValueTrackingTest.cpp
Commit 596004a94748e427ff59956e74d8ed4eb0e109d4 by llvm-dev
MemCpyOptimizer.cpp - hasUndefContentsMSSA - Pass DataLayout by reference. NFCI.
The file was modified llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
Commit 49d3a367c0376a95b9518e90426cdd6d5508e64a by llvm-dev
[CostModel][X86] Improve AVX1/AVX2 truncation costs

Based on the worst-case numbers generated by D103695, we were overestimating the cost of a number of vector truncations:

AVX2: v2i32->v2i8, v2i64->v2i16 + v4i64->v4i32
AVX1: v2i32->v2i8, v4i64->v4i16 + v16i16->v16i8

Once we have a working set of conversion costs, the intention is to clean up the tables and use legalized types a lot more to reduce the number of entries we currently have.
The file was modified llvm/test/Analysis/CostModel/X86/cast.ll
The file was modified llvm/test/Analysis/CostModel/X86/arith-fix.ll
The file was modified llvm/test/Analysis/CostModel/X86/arith-overflow.ll
The file was modified llvm/test/Analysis/CostModel/X86/arith.ll
The file was modified llvm/test/Analysis/CostModel/X86/trunc.ll
The file was modified llvm/lib/Target/X86/X86TargetTransformInfo.cpp
The file was modified llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll
The file was modified llvm/test/Analysis/CostModel/X86/rem.ll
Commit f96b5e801d67dac4fb1b94566aa4be3a3a5756d5 by llvm-dev
OptBisect.cpp - remove unused include. NFCI.

StringRef.h is included in OptBisect.h and we have no uses of std::string.
The file was modified llvm/lib/IR/OptBisect.cpp
Commit 6fd1604d14335a31268bbc477de27e81310f39ef by caroline.concatto
[InstCombine] Add instcombine fold for extractelement + splat for scalable vectors

This patch allows scalable vectors to use the fold that already exists
for fixed vectors, but only when the lane index is lower than the minimum
number of elements of the vector.

Differential Revision: https://reviews.llvm.org/D102404
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified llvm/test/Transforms/InstCombine/vscale_extractelement-inseltpoison.ll
The file was modified llvm/test/Transforms/InstCombine/vscale_extractelement.ll
The file was modified llvm/test/Transforms/InstCombine/gep-vector-indices.ll
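The lane-index condition in the extractelement + splat commit above can be sketched as a small standalone model. This is not the LLVM implementation: `VecType`, `Splat`, and `foldExtractOfSplat` are hypothetical names, and the sketch assumes a scalable vector holds `minElements * vscale` lanes with `vscale >= 1` unknown at compile time.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of a vector type. For a scalable vector the real lane
// count is minElements * vscale, where vscale >= 1 is only known at run time.
struct VecType {
  std::uint64_t minElements;
  bool scalable;
};

// A splat vector: every lane holds the same scalar value.
struct Splat {
  VecType type;
  int scalar;
};

// Models folding "extractelement (splat x), idx" to x. Returns std::nullopt
// when the lane cannot be proven to exist at compile time.
std::optional<int> foldExtractOfSplat(const Splat &s, std::uint64_t idx) {
  // idx < minElements guarantees the lane exists for every vscale >= 1,
  // so the check is the same for fixed and scalable vectors.
  if (idx < s.type.minElements)
    return s.scalar;
  return std::nullopt;
}
```

With a scalable type whose minimum element count is 4, lane 3 folds to the splatted scalar, while lane 4 is left alone because it only exists when `vscale` is at least 2.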
Commit ccd1e087f3702d5ccdfcce24ac7f7d2877921165 by fraser
[RISCV] Add a test case showing inefficient vector codegen
The file was added llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitcast-large-vector.ll
Commit d54e7b731e662e3ec19c590172c9827e3e184829 by sven.vanhaastregt
[OpenCL] Add memory_scope_all_devices

Add the `memory_scope_all_devices` enum value, which is restricted to
OpenCL 3.0 or newer and the `__opencl_c_atomic_scope_all_devices`
feature.  Also guard `memory_scope_all_svm_devices` accordingly, which
is already available in OpenCL 2.0.

The `__opencl_c_atomic_scope_all_devices` feature is header-only, so
set its define to 1 in `opencl-c-base.h`.  This is done
unconditionally at the moment, as the mechanism for disabling
header-only options hasn't been decided yet.

This patch only adds a negative test for now.  Ideally adding a CL3.0
run line to atomic-ops.cl should suffice as a positive test, but we
cannot do that until (at least) generic address spaces and program
scope variables are supported in OpenCL 3.0 mode.

Differential Revision: https://reviews.llvm.org/D103241
The file was modified clang/test/SemaOpenCL/atomic-ops.cl
The file was modified clang/test/Headers/opencl-c-header.cl
The file was modified clang/lib/Headers/opencl-c-base.h
Commit 5db52751a594410d0166d606b305b01a03f0ca3f by kerry.mclaughlin
[CostModel] Return an invalid cost for memory ops with unsupported types

Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector
type cannot be legalized by widening/splitting. When this is the method of legalization
found, getTypeLegalizationCost will return an Invalid cost.

The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call
getTypeLegalizationCost and will now also return an Invalid cost for unsupported types.

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D102515
The file was modified llvm/test/Transforms/LoopVectorize/AArch64/scalable-vf-hint.ll
The file was modified llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
The file was added llvm/test/Transforms/VectorCombine/AArch64/extract-cmp-binop.ll
The file was modified llvm/lib/CodeGen/TargetLoweringBase.cpp
The file was modified llvm/test/Transforms/VectorCombine/X86/extract-cmp-binop.ll
The file was added llvm/test/Analysis/CostModel/AArch64/sve-illegal-types.ll
The file was modified llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp
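The way an Invalid cost propagates from type legalization to the memory-op cost queries in the commit above can be sketched as follows. This is a simplified model, not the LLVM API: `LegalizeKind`, `Cost`, `typeLegalizationCost`, and `memoryOpCost` are hypothetical stand-ins, an empty `std::optional` plays the role of an Invalid cost, and the numeric costs are made up for illustration.

```cpp
#include <optional>

// Hypothetical subset of legalization outcomes. The last one models a
// scalable vector type that cannot be legalized by widening or splitting.
enum class LegalizeKind { Legal, Split, Widen, ScalarizeScalableVector };

// Stand-in for a cost value: std::nullopt plays the role of "Invalid".
using Cost = std::optional<unsigned>;

// Models a type-legalization cost query. The numbers are illustrative only.
Cost typeLegalizationCost(LegalizeKind kind) {
  switch (kind) {
  case LegalizeKind::Legal:
    return 1u;
  case LegalizeKind::Split:
    return 2u;
  case LegalizeKind::Widen:
    return 1u;
  case LegalizeKind::ScalarizeScalableVector:
    return std::nullopt; // unsupported type: no valid cost exists
  }
  return std::nullopt;
}

// Models a memory-op cost query built on top of the legalization cost.
// An Invalid legalization cost makes the memory-op cost Invalid as well.
Cost memoryOpCost(LegalizeKind kind, unsigned perOpCost) {
  Cost legalization = typeLegalizationCost(kind);
  if (!legalization)
    return std::nullopt;
  return *legalization * perOpCost;
}
```

A caller such as a vectorizer can then treat an empty result as "do not pick this type" instead of comparing against an arbitrarily large number.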
Commit f61602b0d3fd3ff5b277dc44cf22cfb5356dee5c by Vignesh.Balasubrmanian
[OpenMP][OMPD] Implementation of OMPD debugging library - libompd.

This is the first of seven patches that implement OMPD, a debugging interface to support debugging of OpenMP programs.
It contains support code required in "openmp/runtime" for OMPD implementation.

Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100181
The file was modified openmp/runtime/src/kmp_gsupport.cpp
The file was modified openmp/runtime/src/kmp.h
The file was modified openmp/runtime/src/ompt-general.cpp
The file was modified openmp/runtime/src/kmp_settings.cpp
The file was added openmp/runtime/src/ompd-specific.h
The file was added openmp/runtime/src/ompd-specific.cpp
The file was modified openmp/runtime/CMakeLists.txt
The file was modified openmp/runtime/src/kmp_config.h.cmake
The file was modified openmp/runtime/src/kmp_runtime.cpp
The file was modified openmp/runtime/src/include/omp-tools.h.var
The file was modified openmp/runtime/src/kmp_settings.h
The file was modified openmp/runtime/src/kmp_wait_release.h
The file was modified openmp/runtime/src/ompt-specific.cpp
The file was modified openmp/runtime/src/CMakeLists.txt
The file was modified openmp/runtime/src/kmp_csupport.cpp
The file was modified openmp/runtime/src/kmp_tasking.cpp
Commit 09e92c607cc94f3d088da2d13592f4e7100ba84a by konndennsa
[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass

This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

This also fixes a crash under the legacy pass manager.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149
The file was modified llvm/lib/Passes/PassRegistry.def
The file was modified llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
The file was modified llvm/include/llvm/Transforms/Scalar/LoopUnrollAndJamPass.h
The file was modified llvm/test/Transforms/LoopUnrollAndJam/innerloop.ll
The file was modified llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
The file was modified llvm/lib/Passes/PassBuilder.cpp
Commit 012898b92cad00e230a960a08a3f418628bec060 by nathan
[clang] p1099 using enum part 1

This adds support for p1099's 'using SCOPED_ENUM::MEMBER;'
functionality, which brings an enumerator of a scoped enum into the
current scope. The novel feature here is that there need not be a class
hierarchical relationship between the current scope and the scope of
the SCOPED_ENUM; the closest existing equivalent is a typedef or alias
declaration. This means that Sema::CheckUsingDeclQualifier needs
adjustment. (a) One cannot call it until one knows the set of decls
that are being referenced: if exactly one is an enumerator, we are in
the new territory. Thus it needs calling later in some cases. (b) There
are two ways we hold the set of such decls. During parsing (or
instantiating a dependent scope) we have a lookup result, and during
instantiation we have a set of shadow decls. Thus the function takes
two optional arguments, at most one of which should be non-null.

Differential Revision: https://reviews.llvm.org/D100276
The file was modified clang/test/CXX/dcl.dcl/basic.namespace/namespace.udecl/p3.cpp
The file was modified clang/include/clang/Sema/Sema.h
The file was modified clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
The file was modified clang/test/SemaCXX/enum-scoped.cpp
The file was modified clang/lib/Sema/SemaDeclCXX.cpp
The file was added clang/test/CXX/dcl.dcl/basic.namespace/namespace.udecl/p7-cxx20.cpp
The file was modified clang/include/clang/Basic/DiagnosticSemaKinds.td
The file was modified clang/test/CXX/dcl.dcl/basic.namespace/namespace.udecl/p7.cpp
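For readers unfamiliar with the language feature in the p1099 commit above, here is a small sketch of a using-declaration that names an enumerator of a scoped enum. The names `Shade`, `Palette`, and `isDark` are hypothetical. The C++20 form is guarded by the `__cpp_using_enum` feature-test macro, and the fallback branch is only an approximation for compilers or language modes without the feature.

```cpp
// A scoped enum: its enumerators are normally reachable only as Shade::X.
enum class Shade { Light, Dark };

// Palette has no inheritance or nesting relationship with Shade, yet in
// C++20 it can still pull one of Shade's enumerators into its own scope.
struct Palette {
#if defined(__cpp_using_enum)
  using Shade::Dark; // C++20 (P1099): using-declaration naming an enumerator
#else
  // Approximation for pre-C++20 modes: a constant with the same name.
  static constexpr Shade Dark = Shade::Dark;
#endif
};

// Palette::Dark and Shade::Dark denote the same value.
bool isDark(Shade s) { return s == Palette::Dark; }
```

In the C++20 branch, `Palette::Dark` names the enumerator itself rather than a copy of its value, which is why the closest earlier analogy is a typedef or alias declaration.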
Commit 2b626aba448a5febb76cca87623c7a380b9e96d6 by simon.moll
[VE][NFC] IRBuilder<> -> IRBuilderBase

VE's TTI broke with the switch from IRBuilder<> to IRBuilderBase.
Follow that change so that VE compiles again.
The file was modified llvm/lib/Target/VE/VEISelLowering.cpp
The file was modified llvm/lib/Target/VE/VEISelLowering.h