Success

Changes

Summary

  1. [SLP]Improve gathering of scalar elements.
  2. [clang-cl] Parse /await:strict, new in MSVC 16.10
  3. [clang] p1099 using-enum feature macro & web page
  4. [X86][SSE] Regenerate slow-pmulld.ll test checks
  5. [X86][SLM] Adjust XMM non-PMULLD throughput costs to half rate.
  6. [OpenCL] Add OpenCL builtin test generator
  7. [x86] add tests for store merging miscompile (PR50623); NFC
  8. [TableGen] Fix ProfileFoldOpInit so that parameters are named consistently [NFC]
  9. [ARM] Fix Machine Outliner LDRD/STRD handling in Thumb mode.
  10. Sanitizers.h - remove MathExtras.h include dependency
  11. [SDAG] fix miscompile from merging stores of different sizes
  12. [X86] Check destination element type before forming VTRUNCS/VTRUNCUS in combineTruncateWithSat.
  13. [mlir][openacc][NFC] move index in processDataOperands
  14. [SROA] Avoid splitting loads/stores with irregular type
  15. Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type"
  16. [mlir][ArmSVE] Add basic load/store operations
  17. Do not generate calls to the 128-bit function __multi3() on 32-bit ARM
  18. [InstCombine] add tests for casts-around-ctlz; NFC
  19. [libcxx][ci] enables assertions for runtimes-build
  20. [mlir] fix a crash if the dialect is missing a data layout interface
  21. clang/darwin: use response files with ld64
  22. Fix typo in Toy tutorial Ch1
Commit a0086add2e52a82dd83114f458c10e2e4bdd15ac by a.bataev
[SLP]Improve gathering of scalar elements.

1. Better sorting of scalars to be gathered: try to insert
   constants, arguments, and instructions outside the loop first, and only
   then the instructions inside the loop. This improves hoisting of
   invariant insertelement instructions.
2. Better detection of shuffle candidates in gathering function.
3. The cost of insertelement for constants is 0.
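
The ordering described in item 1 can be pictured with a small sketch. This is not the SLPVectorizer code; the `(name, kind)` tuples, the kind labels, and `order_for_gather` are hypothetical names chosen for illustration. It only shows the idea of placing loop-invariant operands ahead of in-loop instructions while keeping the relative order within each group.

```python
from typing import List, Tuple

# Hypothetical model: each scalar is (name, kind), where kind is one of
# "constant", "argument", "outside_loop", or "inside_loop".
Scalar = Tuple[str, str]


def order_for_gather(scalars: List[Scalar]) -> List[Scalar]:
    """Put loop-invariant scalars first, in-loop instructions last.

    sorted() is stable, so the original relative order inside each
    group is preserved.
    """
    return sorted(scalars, key=lambda s: s[1] == "inside_loop")


scalars = [
    ("%x", "inside_loop"),
    ("42", "constant"),
    ("%y", "inside_loop"),
    ("%arg", "argument"),
    ("%inv", "outside_loop"),
]
ordered = order_for_gather(scalars)
```

With the invariant elements grouped at the front, the insertelement chain that builds them does not depend on anything computed in the loop, which is what makes hoisting possible.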

Part of D57059.

Differential Revision: https://reviews.llvm.org/D103458
The file was modified llvm/test/Transforms/SLPVectorizer/X86/jumbled-load-multiuse.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/phi_landingpad.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/PR39774.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/geps-non-pow-2.ll
The file was modified llvm/test/Transforms/SLPVectorizer/slp-max-phi-size.ll
The file was modified llvm/test/Transforms/SLPVectorizer/slp-umax-rdx-matcher-crash.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/commutativity.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/remark_extract_broadcast.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/crash_mandeltext.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/tiny-tree.ll
The file was modified llvm/test/Transforms/SLPVectorizer/AArch64/vectorize-free-extracts-inserts.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/crash_exceed_scheduling.ll
The file was modified llvm/test/Transforms/SLPVectorizer/SystemZ/pr34619.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/crash_lencod.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/partail.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/value-bug-inseltpoison.ll
The file was modified llvm/test/Transforms/SLPVectorizer/AArch64/insertelement.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/phi3.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/crash_smallpt.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/shrink_after_reorder.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/matched-shuffled-entries.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/pr35497.ll
The file was modified llvm/test/Transforms/SLPVectorizer/AArch64/trunc-insertion.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/reduction2.ll
The file was modified llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
The file was modified llvm/test/Transforms/SLPVectorizer/AArch64/insertelement-inseltpoison.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/jumbled-load-used-in-phi.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/reorder_repeated_ops.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/hoist.ll
The file was modified llvm/test/Transforms/SLPVectorizer/X86/value-bug.ll
Commit 64dbd649cf661cbca5e8670d220aec40d6892572 by hans
[clang-cl] Parse /await:strict, new in MSVC 16.10
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/test/Driver/cl-options.c
Commit c1cd743519af3978b944df88f57c6e523caa10dc by nathan
[clang] p1099 using-enum feature macro & web page

This completes the series implementing p1099, by adding the feature
macro and updating the web page.

Differential Revision: https://reviews.llvm.org/D102242
The file was modified clang/lib/Frontend/InitPreprocessor.cpp
The file was modified clang/www/cxx_status.html
The file was modified clang/test/Lexer/cxx-features.cpp
Commit 8ffeb5c47d94d8f8eafc4e986fe47578b716c1dc by llvm-dev
[X86][SSE] Regenerate slow-pmulld.ll test checks
The file was modified llvm/test/CodeGen/X86/slow-pmulld.ll
Commit 630820bafc6866ce1efa4f1e2c4b11f6250eae9c by llvm-dev
[X86][SLM] Adjust XMM non-PMULLD throughput costs to half rate.

Match what's reported in the costs table, Agner's tables, and the Intel AOM.
The file was modified llvm/test/CodeGen/X86/slow-pmulld.ll
The file was modified llvm/test/tools/llvm-mca/X86/SLM/resources-ssse3.s
The file was modified llvm/test/tools/llvm-mca/X86/SLM/resources-sse41.s
The file was modified llvm/lib/Target/X86/X86ScheduleSLM.td
The file was modified llvm/test/tools/llvm-mca/X86/SLM/resources-sse2.s
Commit 8866793b4e0abd31e4f57abf9ba832d691a3a3e1 by sven.vanhaastregt
[OpenCL] Add OpenCL builtin test generator

Add a new clang-tblgen flag `-gen-clang-opencl-builtin-tests` that
generates a .cl file containing calls to every builtin function
defined in the .td input.

This patch does not add any use of the new flag yet, so the only way
to obtain a generated test file is through a manual invocation of
clang-tblgen.  A test making use of this emitter will be added in a
followup commit.

Differential Revision: https://reviews.llvm.org/D97869
The file was modified clang/utils/TableGen/TableGenBackends.h
The file was modified clang/utils/TableGen/ClangOpenCLBuiltinEmitter.cpp
The file was modified clang/utils/TableGen/TableGen.cpp
Commit 2ef81cb297954cdbc2eca2f204a5ecba4ec1ccd8 by spatel
[x86] add tests for store merging miscompile (PR50623); NFC
The file was modified llvm/test/CodeGen/X86/stores-merging.ll
Commit ef8df920fbbc945dce6aeec717629ddde90a8ebe by Paul C. Anagnostopoulos
[TableGen] Fix ProfileFoldOpInit so that parameters are named consistently [NFC]

See https://bugs.llvm.org/show_bug.cgi?id=50595

Differential Revision: https://reviews.llvm.org/D103823
The file was modified llvm/lib/TableGen/Record.cpp
Commit 6c78dbd4ca1f2c25cdc276d646c7920afe856ca3 by yvan.roux
[ARM] Fix Machine Outliner LDRD/STRD handling in Thumb mode.

This is a fix for PR50481.

Immediate values for AddrModeT2_i8s4 are already scaled in the MCInst operand.
This patch changes the number of bits and scale factor to reflect that
state when checking stack offset status. AddrModeT2_i7s[2|4] have the same
property, but since MVE instructions are not outlined, these cases are
simply moved to the unhandled ones.

Differential Revision: https://reviews.llvm.org/D103167
The file was modified llvm/test/CodeGen/ARM/machine-outliner-stack-fixup-thumb.mir
The file was modified llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
Commit 206a66de5902b2b6dc0c62c4a25526d7e7f24186 by llvm-dev
Sanitizers.h - remove MathExtras.h include dependency

The MathExtras.h header is included purely for the countPopulation() method - by moving this into Sanitizers.cpp we can remove the use of this costly header.

We only ever use isPowerOf2() / countPopulation() inside asserts so this shouldn't have any performance effects on production code.
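
For readers unfamiliar with the two helpers, here is a rough Python model of what such assert-only checks compute. The function names mirror the ones mentioned above, but the bodies and the `check_single_kind` wrapper are illustrative assumptions, not the code in Sanitizers.cpp.

```python
def count_population(mask: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(mask).count("1")


def is_power_of_2(mask: int) -> bool:
    """True when exactly one bit is set."""
    return mask > 0 and (mask & (mask - 1)) == 0


def check_single_kind(mask: int) -> int:
    """Hypothetical assert-only validation: a single-kind mask has one bit."""
    assert is_power_of_2(mask), "expected exactly one bit set"
    assert count_population(mask) == 1
    return mask
```

Because the checks sit only inside assertions, a build with assertions disabled never evaluates them.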

Differential Revision: https://reviews.llvm.org/D103953
The file was modified clang/lib/Basic/Sanitizers.cpp
The file was modified clang/include/clang/Basic/Sanitizers.h
Commit dd763ac79196b3d3bc0370b9dbd35e0c083e52a4 by spatel
[SDAG] fix miscompile from merging stores of different sizes

As shown in:
https://llvm.org/PR50623
...and the similar tests here, we were not accounting for
store merging of different sizes that do not cover the
entire range of the wide value to be stored.

This is the easy fix: just make sure that all of the
original stores are the same size, so when we calculate
the wide width, it's a simple N * M check.

This still allows all of the motivating optimizations from:
D86420 / 54a5dd485c4d
D87112 / 7a06b166b1af

We could enhance this code to track individual bytes and
allow merging multiple sizes.
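
The "same size, N * M" condition can be sketched as follows. This is a simplified model under stated assumptions, not the DAGCombiner code: the function name and the list-of-widths representation are hypothetical, and offsets, endianness, and aliasing are ignored.

```python
from typing import List


def can_merge_truncated_stores(store_widths_bits: List[int],
                               wide_width_bits: int) -> bool:
    """Return True if N narrow stores may be merged into one wide store.

    All narrow stores must have the same width M, and together they must
    cover the whole wide value: N * M == wide width.
    """
    if not store_widths_bits:
        return False
    narrow = store_widths_bits[0]
    if any(width != narrow for width in store_widths_bits):
        return False  # mixed sizes are rejected outright
    return len(store_widths_bits) * narrow == wide_width_bits


four_bytes = can_merge_truncated_stores([8, 8, 8, 8], 32)   # full coverage
mixed_sizes = can_merge_truncated_stores([16, 8], 32)       # different sizes
partial = can_merge_truncated_stores([8, 8], 32)            # only 16 of 32 bits
```

Rejecting mixed sizes is more conservative than necessary, which is the trade-off the message acknowledges when it mentions tracking individual bytes as a possible enhancement.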
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
The file was modified llvm/test/CodeGen/X86/stores-merging.ll
Commit 765ef4bb2af604ea2bbd6c1bffaa6e1600804c9e by craig.topper
[X86] Check destination element type before forming VTRUNCS/VTRUNCUS in combineTruncateWithSat.

Fixes crash reported here https://reviews.llvm.org/D73607

Using a store to keep the trunc intact. Returning v16i24 would
cause the trunc to be optimized away in SelectionDAGBuilder.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103940
The file was modified llvm/test/CodeGen/X86/vector-trunc-ssat.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit cf8467057947e019f7fe45d00836dfb629715064 by clementval
[mlir][openacc][NFC] move index in processDataOperands

Move the index variable used to track variables inside of the specific
processDataOperands functions.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D103924
The file was modified mlir/lib/Target/LLVMIR/Dialect/OpenACC/OpenACCToLLVMIRTranslation.cpp
Commit d3faef6eefe51a8f231898a4eda9130c8ba01bb5 by thatlemon
[SROA] Avoid splitting loads/stores with irregular type

Upon encountering loads/stores on types whose size is not a multiple of 8 bits, the SROA pass would either trip an assertion or use logic that was not meant to work with such irregularly sized types.
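
As a rough illustration of what "irregular" means here, the sketch below compares a type's bit width with the number of bits its in-memory storage occupies. The helper names are hypothetical and the rounding rule is a simplified assumption, not the SROA implementation.

```python
def store_size_in_bits(type_size_in_bits: int) -> int:
    """Bits occupied in memory: the type size rounded up to whole bytes."""
    return 8 * ((type_size_in_bits + 7) // 8)


def is_irregular(type_size_in_bits: int) -> bool:
    """A type is irregular when its size is not a multiple of 8 bits."""
    return type_size_in_bits % 8 != 0


# An i17 occupies 24 bits of storage, so byte-based slicing logic cannot
# treat the value and its storage as the same size.
i17_irregular = is_irregular(17)
i17_storage = store_size_in_bits(17)
i32_irregular = is_irregular(32)
```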

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D99435
The file was modified llvm/lib/Transforms/Scalar/SROA.cpp
The file was modified llvm/test/Transforms/SROA/slice-width.ll
The file was added llvm/test/Transforms/SROA/irregular-type.ll
Commit 9ce2e5e7003d4c88eea8df27e830e5af4336aeed by Andrey.Churbanov
Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type"

This reverts commit a1f550e052543f75acac9089b760cbc61729131f.

Revert in order to fix backwards compatibility breakage
caused by type size change for task dependence flag.
The file was modified clang/test/OpenMP/target_update_depend_codegen.cpp
The file was removed openmp/runtime/test/tasking/omp51_task_dep_inoutset.c
The file was modified openmp/runtime/test/tasking/hidden_helper_task/gtid.cpp
The file was modified clang/test/OpenMP/target_exit_data_depend_codegen.cpp
The file was modified clang/test/OpenMP/task_codegen.cpp
The file was modified openmp/runtime/src/kmp_taskdeps.h
The file was modified clang/lib/CodeGen/CGOpenMPRuntime.cpp
The file was modified openmp/runtime/src/kmp.h
The file was modified clang/test/OpenMP/depobj_codegen.cpp
The file was modified clang/test/OpenMP/task_codegen.c
The file was modified openmp/runtime/test/tasking/bug_nested_proxy_task.c
The file was modified openmp/runtime/test/tasking/hidden_helper_task/depend.cpp
The file was modified clang/test/OpenMP/target_enter_data_depend_codegen.cpp
The file was modified openmp/runtime/src/kmp_taskdeps.cpp
The file was modified openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
The file was modified clang/test/OpenMP/task_if_codegen.cpp
The file was modified openmp/runtime/test/tasking/hidden_helper_task/common.h
Commit 96ca2d92b52bd97fcdce4c0ba2723399b005e0a9 by javier.setoain
[mlir][ArmSVE] Add basic load/store operations

ArmSVE-specific memory operations are needed to generate end-to-end
code for as long as MLIR core doesn't support scalable vectors. These
instructions will eventually be unnecessary; for now they're required
for more complex testing.

Differential Revision: https://reviews.llvm.org/D103535
The file was modified mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp
The file was modified mlir/test/Target/LLVMIR/arm-sve.mlir
The file was modified mlir/include/mlir/Dialect/ArmSVE/ArmSVE.td
The file was modified mlir/test/Dialect/ArmSVE/roundtrip.mlir
The file was added mlir/test/Dialect/ArmSVE/memcpy.mlir
Commit 64e9aa33020d68a98c30bf05362ffc1c1778890c by rengolin
Do not generate calls to the 128-bit function __multi3() on 32-bit ARM

The function __multi3() is undefined on 32-bit ARM, so a call to it
should never be emitted. Instead, plain instructions need to be
generated to perform 128-bit multiplications.
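
To make "plain instructions" concrete, the sketch below multiplies two 128-bit values using only 32-bit limbs and 32x32->64 partial products, which is the general shape of an expanded wide multiply. It is a schoolbook illustration under that assumption, not the sequence ARMISelLowering emits; all names are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK128 = (1 << 128) - 1


def to_limbs(value: int) -> list:
    """Split a 128-bit value into four 32-bit limbs, least significant first."""
    return [(value >> (32 * i)) & MASK32 for i in range(4)]


def mul128_from_limbs(a: int, b: int) -> int:
    """Low 128 bits of a * b, built from 32x32->64 partial products."""
    a_limbs, b_limbs = to_limbs(a), to_limbs(b)
    result = [0, 0, 0, 0]
    for i in range(4):
        carry = 0
        # Partial products landing at limb index 4 or above are dropped,
        # which truncates the result to 128 bits.
        for j in range(4 - i):
            total = a_limbs[i] * b_limbs[j] + result[i + j] + carry
            result[i + j] = total & MASK32
            carry = total >> 32
    return sum(limb << (32 * k) for k, limb in enumerate(result))


a = 0x0123456789ABCDEF_FEDCBA9876543210
b = 0x0F1E2D3C4B5A6978_8796A5B4C3D2E1F0
product = mul128_from_limbs(a, b)
```

Each inner step fits in 64 bits, so the whole computation needs no library call for a wider multiply.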

Differential Revision: https://reviews.llvm.org/D103906
The file was modified llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit 9eef6e39816a502ccabdd70702694993f8b63061 by spatel
[InstCombine] add tests for casts-around-ctlz; NFC

Baseline for D103788
The file was added llvm/test/Transforms/InstCombine/zext-ctlz-trunc-to-ctlz-add.ll
Commit cdb9d242debaf689395bf38d19ca90327cd3b9fa by cjdb
[libcxx][ci] enables assertions for runtimes-build

This will catch nasty Clang bugs like
https://bugs.llvm.org/show_bug.cgi?id=50592 before we merge stuff into
libc++ main.

Differential Revision: https://reviews.llvm.org/D103863
The file was modified libcxx/utils/ci/run-buildbot
Commit f6faa71eafbcd52d5154aadf888fce8b3af73c16 by zinenko
[mlir] fix a crash if the dialect is missing a data layout interface

The top-level verifier of data layout specifications delegates verification of
entries with identifier keys to the dialect of the identifier prefix. This flow
was missing a check whether the dialect actually implements the relevant
interface.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D103945
The file was modified mlir/lib/Interfaces/DataLayoutInterfaces.cpp
The file was modified mlir/test/Interfaces/DataLayoutInterfaces/types.mlir
Commit 1c7f3395b8ec52462220898495883ec570390367 by keithbsmiley
clang/darwin: use response files with ld64

This crasher was fixed with Xcode 13.0 beta 1 / ld64 705. This is an
updated revert of https://reviews.llvm.org/D92357

Differential Revision: https://reviews.llvm.org/D103934
The file was modified clang/lib/Driver/ToolChains/Darwin.cpp
Commit acc3ca3b7a08dc8d2690953af41a82652bb4f73b by joker.eph
Fix typo in Toy tutorial Ch1

This aligns the website with the actual test case in the repo.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D84193
The file was modified mlir/docs/Tutorials/Toy/Ch-1.md