Changes

Summary

  1. Avoid building the entire tree and testing LLVM itself on MLIR builders
Commit fbdda46df1702d87909e66856796ffaefb5c0b41 by joker.eph
Avoid building the entire tree and testing LLVM itself on MLIR builders

This reduces the number of targets to build and tests to run. The build gets faster (see below), but
the main motivation right now is to avoid being notified by an mlir-* bot when there is a failure
in an LLVM test (there are enough other bots to cover these).

Build time gain is hard to evaluate because it is highly dependent on the machine.
Below are the results on my Linux machine (with the same cmake config as the
win-mlir-buildbot, so no python tests):
```
$ time ninja check-mlir -j 16
894.323 [0/1/2803] Running the MLIR regression tests

Testing Time: 98.80s
  Unsupported      : 163
  Passed           : 927
  Expectedly Failed:   1

real 16m41.103s
user 188m17.260s
sys 23m22.081s
```

After that, a plain `ninja -j 16` takes an additional:

```
$ time ninja -j 16
441.094 [0/1/767] Linking CXX executable bin/SpeculativeJIT

real 7m21.771s
user 86m25.291s
sys 15m5.505s
```

So it is ~1/3 of the build time that we're saving here.
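
The estimate can be checked with a few lines of arithmetic. This is only a sketch: the two wall-clock values are copied from the `real` lines quoted in this commit message, and the variable names are illustrative.

```python
# Wall-clock ("real") times copied from the two `time` runs quoted in the message.
check_mlir_s = 16 * 60 + 41.103    # `ninja check-mlir -j 16`
rest_of_tree_s = 7 * 60 + 21.771   # follow-up `ninja -j 16`

# A full build is approximated as both steps run back to back.
full_build_s = check_mlir_s + rest_of_tree_s
saved_fraction = rest_of_tree_s / full_build_s

print(f"full build: {full_build_s:.1f}s, saved fraction: {saved_fraction:.3f}")
```

On these numbers the saved share comes out slightly below one third, and, as the message notes, it will vary with the machine.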

Differential Revision: https://reviews.llvm.org/D110187
The file was modified buildbot/osuosl/master/config/builders.py

Summary

  1. [mlir][sparse] cleanup ABI issues in C interface with memrefs
  2. [PowerPC] add testcase for chain commoning; nfc
  3. tsan: don't call dlsym during exit
  4. tsan: move errno spoiling reporting into a separate function (NFC)
  5. tsan: enable sse4.2 in tests
  6. [Polly] Add -polly-reschedule and -polly-postopts options.
  7. tsan: reset destination range in Java heap move
  8. tsan: uninline Enable/DisableIgnores
  9. tsan: prepare for trace mapping removal
  10. [lldb] Add --stack option to `target symbols add` command
  11. [flang] Change complex type define in runtime for clang-cl
  12. [InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC).
  13. [clang][ASTImporter] Generic attribute import handling (first step).
  14. [Utils] Replace llc with cat for tests
  15. tsan: account for mid app range in mem profile
  16. tsan: include MBlock/SyncObj stats into mem profile
  17. tsan: make mem profile data more consistent
  18. tsan: include internal allocator info in mem profile
  19. tsan: move mem profile initialization into separate function
  20. tsan: remove stale comment
  21. tsan: write uptime in mem profile
  22. [ARM] Add additional tests for VMOVL in tail predicated loops.
  23. [AMDGPU] Divergence-driven instruction selection for mul i32
  24. [AMDGPU] Convert mac/fmac to mad/fma when folding output modifiers
  25. [AArch64][SVE] Add missing load/store patterns for unpacked bfloat vectors.
  26. [VectorCombine] Switch to using a worklist.
  27. [LoopVectorize][X86] Add operands to make it more obvious what line the CHECK concerns
  28. [SelectionDAG] Make WidenVecRes_Convert work for scalable vectors.
  29. [hwasan] also omit safe mem[cpy|mov|set].
  30. Don't fold (select C, (gep Ptr, Idx), Ptr) if C is vector but Idx is scalar
  31. Unbreak module builds by making InstructionWorklist.h non-modular
  32. [ARM] Allow smaller VMOVL in tail predicated loops
  33. [lldb] [Windows] Fix continuing from breakpoints and singlestepping on ARM/AArch64
  34. [Matrix] Emit assumption that matrix indices are valid.
  35. Revert "[CodeGen] regenerate test checks; NFC"
  36. Revert "[InstCombine] fold cast of right-shift if high bits are not demanded"
  37. [Passes] Run vector-combine early with -fenable-matrix.
  38. [gn build] (manually) port f8b1cc365786
  39. [gn build] Port 7a320b279d07
  40. [SelectionDAG] Add PromoteIntOp_INSERT_SUBVECTOR.
  41. [lldb] JITLoaderGDB tests can use lli in ORC greedy mode
Commit 128a9e1cb480188fc68aaedbcaf92d8ee74a92c7 by ajcbik
[mlir][sparse] cleanup ABI issues in C interface with memrefs

This change adds automatic wrapper functions with emit_c_interface
to all methods in the sparse support library that deal with MEMREFs.
The wrappers will take care of passing MEMREFs by value internally
and by pointer externally, thereby avoiding ABI issues across platforms.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110219
The file was modified mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
The file was modified mlir/lib/ExecutionEngine/SparseUtils.cpp
Commit 957514eb9e712d9f0ea5c7afb2294c05da0c5e2d by czhengsz
[PowerPC] add testcase for chain commoning; nfc
The file was added llvm/test/CodeGen/PowerPC/common-chain.ll
Commit 20ee72d4ccb17c6f32641c690fa129475427ae45 by dvyukov
tsan: don't call dlsym during exit

dlsym calls into dynamic linker which calls malloc and other things.
It's problematic to do it during the actual exit, because
it can happen from a signal handler or from within the runtime
after we reported the first bug, etc.
See https://github.com/google/sanitizers/issues/1440 for an example
(captured in the added test).
Initialize the callbacks during startup instead.

Depends on D110159.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110166
The file was modified compiler-rt/lib/tsan/rtl/tsan_interface.h
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.h
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_posix.cpp
The file was added compiler-rt/test/tsan/signal_exit.cpp
Commit cf93f7677de3cde9a1166d23eafddd27536c8b25 by dvyukov
tsan: move errno spoiling reporting into a separate function (NFC)

CallUserSignalHandler function is quite large and complex.
Move errno spoiling reporting into a separate function.
No logical changes.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110159
The file was modified compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
Commit 41f8ef3e3183145a49366bb4cc639dd13664d0c6 by dvyukov
tsan: enable sse4.2 in tests

Pass -msse4.2 flag to the tests the same way we do for the runtime.
Layout of some structs in the runtime headers depends on the flag
(TSAN_VECTORIZE), so we need it to be consistent across the runtime
and tests.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110192
The file was modified compiler-rt/lib/tsan/tests/CMakeLists.txt
Commit ced20c6672970ee416147b0fc8f2fb6e733acbc5 by llvm-project
[Polly] Add -polly-reschedule and -polly-postopts options.

These command-line options allow turning off parts of the schedule tree optimization pipeline.
The file was modified polly/lib/Transform/ScheduleOptimizer.cpp
Commit db2f870fe3dcecc43c874ef571757d5aeac0569c by dvyukov
tsan: reset destination range in Java heap move

Switch Java heap move to the new scheme required for the new tsan runtime.
Instead of copying the shadow we reset the destination range.
The new v3 trace contains addresses of accesses, so we cannot simply copy the shadow.
This can lead to false negatives, but cannot lead to false positives.

Depends on D110159.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110190
The file was modified compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp
The file was modified compiler-rt/test/tsan/java_move_overlap_race.cpp
The file was modified compiler-rt/test/tsan/java_race_move.cpp
Commit 82e593cf900d10e9968fcecdcb51e922a553c2de by dvyukov
tsan: uninline Enable/DisableIgnores

ScopedInterceptor::Enable/DisableIgnores is only used for some special cases.
Uninline them from the common interceptor handling.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110157
The file was modified compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_interceptors.h
Commit 4986959eb2140a58f7bcce4b616483549a68e0a2 by dvyukov
tsan: prepare for trace mapping removal

Don't test for presence of the trace mapping,
it will be removed soon.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D110194
The file was modified compiler-rt/test/sanitizer_common/TestCases/Linux/decorate_proc_maps.cpp
Commit 47f79c6057764e0c83016269ae2359f8c5c8d135 by Jonas Devlieghere
[lldb] Add --stack option to `target symbols add` command

Currently you can ask the target symbols add command to locate the debug
symbols for the current frame. This patch adds an option to do that for
the whole call stack.

Differential revision: https://reviews.llvm.org/D110011
The file was added lldb/test/API/macosx/add-dsym/TestAddDsymDownload.py
The file was modified lldb/source/Commands/CommandObjectTarget.cpp
Commit abbb0f901ad85aaa06780deefbda9c0ee0c2c7a2 by diana.picus
[flang] Change complex type define in runtime for clang-cl

When compiling the runtime with a version of clang-cl newer than 12, we
define CMPLXF as __builtin_complex, which returns a float _Complex type.
This errors out in contexts where the result of CMPLXF is expected to be
a float_Complex_t. This is defined as _Fcomplex whenever _MSC_VER is
defined (and as float _Complex otherwise).

This patch defines float_Complex_t & friends as _Fcomplex only when
we're using "true" MSVC, and not just clang-pretending-to-be-MSVC. This
should only affect clang-cl >= 12.

Differential Revision: https://reviews.llvm.org/D110139
The file was modified flang/runtime/complex-reduction.c
The file was modified flang/runtime/complex-reduction.h
Commit e08a5dc86f1ff868a61e74bfea413889a3d5915f by flo
[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC).

InstCombine's worklist can be re-used by other passes like
VectorCombine. Move it to llvm/Transforms/Utils and rename it to
InstructionWorklist.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110181
The file was modified llvm/lib/Transforms/InstCombine/InstCombineInternal.h
The file was modified llvm/include/llvm/Transforms/InstCombine/InstCombine.h
The file was removed llvm/include/llvm/Transforms/InstCombine/InstCombineWorklist.h
The file was modified llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
The file was modified llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
The file was modified llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was modified llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
The file was added llvm/include/llvm/Transforms/Utils/InstructionWorklist.h
The file was modified llvm/include/llvm/Transforms/InstCombine/InstCombiner.h
Commit 7ce638538bcf323cd15ed5dfbc43013312b0e3e3 by 1.int32
[clang][ASTImporter] Generic attribute import handling (first step).

Import of Attr objects was incomplete in ASTImporter.
This change introduces support for a generic way of importing an attribute.
As a usage example, import of the attribute AssertCapability is
added to ASTImporter.
Updating the old attribute import code and adding new attributes or extending
the generic functions (if needed) is future work.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D109608
The file was modified clang/unittests/AST/ASTImporterTest.cpp
The file was modified clang/lib/AST/ASTImporter.cpp
Commit ecd5145c27e819f95036cc0be8f22ce174f19238 by sebastian.neubauer
[Utils] Replace llc with cat for tests

Make the update_llc_test_checks script test independent of llc behavior
by using cat with static files to simulate llc output.

This allows changing llc without breaking the script test case.

The update script is executed in a temporary directory, so the
llc-generated assembly files are copied there. %T is deprecated, but it
allows copying a file with a predictable filename.

Differential Revision: https://reviews.llvm.org/D110143
The file was modified llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_no_merge_comments.ll
The file was modified llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_no_merge_comments.ll.expected
The file was added llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_no_merge_comments-O3.s
The file was modified llvm/test/tools/UpdateTestChecks/update_llc_test_checks/amdgpu-no-merge-comments.test
The file was added llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/amdgpu_no_merge_comments-O0.s
The file was modified llvm/utils/UpdateTestChecks/common.py
Commit 608ffc98c3b781a3da9b7222d145cade96fda14c by dvyukov
tsan: account for mid app range in mem profile

We account for the low and high ranges, but forgot about the mid range.
Account for the mid range as well.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D110148
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
Commit eefef56ece7e27c8746cd207e8e2d96996ea5de1 by dvyukov
tsan: include MBlock/SyncObj stats into mem profile

Include info about MBlock/SyncObj memory consumption in the memory profile.

Depends on D110148.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110149
The file was modified compiler-rt/lib/tsan/rtl/tsan_dense_alloc.h
The file was modified compiler-rt/lib/tsan/rtl/tsan_sync.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_sync.h
Commit 58a157cd3b54942283a301019dceb65be2da85f7 by dvyukov
tsan: make mem profile data more consistent

We currently query number of threads before reading /proc/self/smaps.
But reading /proc/self/smaps can take lots of time for huge processes
and it retries several times with different buffer sizes.
Overall it can take tens of seconds. This can make number of threads
significantly inconsistent with the rest of the stats.
So query it after reading /proc/self/smaps.

Depends on D110149.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110150
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform.h
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_windows.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp
Commit b8aa9b0c37f4914cdd68aef5cf32fb411d2674c0 by dvyukov
tsan: include internal allocator info in mem profile

We allocate things from the internal allocator,
it's useful to know how much it consumes.

Depends on D110150.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110151
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
Commit e8101f2149dfcd6a915b975a1f83ac09a5cd04b9 by dvyukov
tsan: move mem profile initialization into separate function

BackgroundThread function is quite large,
move mem profile initialization into a separate function.

Depends on D110151.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110152
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.h
Commit ae6d57ca5a945063a25181e3875c1fcf3787a040 by dvyukov
tsan: remove stale comment

We do query it every 100ms now.
(GetRSS was fixed to not be dead slow IIRC)

Depends on D110152.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110153
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
Commit 0ee77d6db355215273eb78c4546321067c882ff3 by dvyukov
tsan: write uptime in mem profile

Write uptime in real time seconds for every mem profile record.
Uptime is useful to make more sense out of the profile,
compare random lines, etc.

Depends on D110153.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D110154
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_windows.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
The file was modified compiler-rt/lib/tsan/rtl/tsan_platform.h
Commit 636fc0ef86f69229486c36f0d3c7539fef860a5a by david.green
[ARM] Add additional tests for VMOVL in tail predicated loops.
The file was added llvm/test/CodeGen/Thumb2/mve-vmovlloop.ll
Commit 3828ea6181fd007438379de70fc7b9fc9c8dbb02 by jay.foad
[AMDGPU] Divergence-driven instruction selection for mul i32

Differential Revision: https://reviews.llvm.org/D109881
The file was modified llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
The file was modified llvm/test/CodeGen/AMDGPU/wwm-reserved-spill.ll
The file was modified llvm/lib/Target/AMDGPU/SOPInstructions.td
The file was modified llvm/lib/Target/AMDGPU/VOP3Instructions.td
The file was modified llvm/test/CodeGen/AMDGPU/wwm-reserved.ll
The file was modified llvm/test/CodeGen/AMDGPU/urem-seteq-illegal-types.ll
Commit 0205806d0fe5706c76ae1756e9180918dd495446 by jay.foad
[AMDGPU] Convert mac/fmac to mad/fma when folding output modifiers

Use of output modifiers forces VOP3 encoding for a VOP2 mac/fmac
instruction, so we might as well convert it to the more flexible VOP3-
only mad/fma form.

With this change, the only way we should emit VOP3-encoded mac/fmac is
if regalloc chooses registers that require the VOP3 encoding, e.g. sgprs
for both src0 and src1. In all other cases the mac/fmac should either be
converted to mad/fma or shrunk to VOP2 encoding.

Differential Revision: https://reviews.llvm.org/D110156
The file was modified llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
The file was modified llvm/test/CodeGen/AMDGPU/mad-mix.ll
The file was modified llvm/test/CodeGen/AMDGPU/mad-mix-lo.ll
Commit ab3607c0ed92a7e39952ce22e72e778d2679876a by sander.desmalen
[AArch64][SVE] Add missing load/store patterns for unpacked bfloat vectors.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D110063
The file was modified llvm/test/CodeGen/AArch64/sve-st1-addressing-mode-reg-reg.ll
The file was modified llvm/test/CodeGen/AArch64/sve-masked-ldst-nonext.ll
The file was modified llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
The file was modified llvm/test/CodeGen/AArch64/sve-ld1-addressing-mode-reg-reg.ll
Commit 300870a95c22fde840862cf57d82adba3e5bd633 by flo
[VectorCombine] Switch to using a worklist.

This patch updates VectorCombine to use a worklist to allow iterative
simplifications where a combine enables other combines.

Suggested in D100302.

The main use case at the moment is foldSingleElementStore and
scalarizeLoadExtract working together to improve scalarization.
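
The idea of a combine enabling further combines can be illustrated with a toy worklist loop. This is a sketch only and not LLVM's `InstructionWorklist`: the node format (`('const', v)` and `('add', lhs, rhs)`), the function name `run_worklist`, and the constant-folding "combine" are all assumptions made for illustration.

```python
from collections import deque


def run_worklist(nodes):
    """Fold ('add', a, b) into ('const', v) once both operands are constants.

    `nodes` maps a name to ('const', value) or ('add', lhs_name, rhs_name).
    Toy stand-in for a worklist-driven combine pass; all names are hypothetical.
    """
    # Record users so that a changed node can re-queue the nodes depending on it.
    users = {name: [] for name in nodes}
    for name, node in nodes.items():
        if node[0] == "add":
            for operand in node[1:]:
                users[operand].append(name)

    worklist = deque(nodes)   # visit every node once to begin with
    queued = set(nodes)       # avoid duplicate worklist entries
    while worklist:
        name = worklist.popleft()
        queued.discard(name)
        node = nodes[name]
        if node[0] != "add":
            continue
        lhs, rhs = nodes[node[1]], nodes[node[2]]
        if lhs[0] == "const" and rhs[0] == "const":
            nodes[name] = ("const", lhs[1] + rhs[1])
            # Only users of the changed node need another look.
            for user in users[name]:
                if user not in queued:
                    worklist.append(user)
                    queued.add(user)
    return nodes


example = {
    "c": ("add", "b", "one"),   # cannot fold until "b" has been folded
    "b": ("add", "one", "two"),
    "one": ("const", 1),
    "two": ("const", 2),
}
result = run_worklist(example)
```

Here `c` is visited before `b` and cannot be folded at first. Folding `b` puts `c` back on the worklist, so it is folded on the second visit without rescanning the other nodes.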

Note that we now also do not run SimplifyInstructionsInBlock on the
whole function if there have been changes. This means we fail to
remove/simplify instructions not related to any of the vector combines.
IMO this is fine, as simplifying the whole function seems more like a
workaround for not tracking the changed instructions.

Compile-time impact looks neutral:
NewPM-O3: +0.02%
NewPM-ReleaseThinLTO: -0.00%
NewPM-ReleaseLTO-g: -0.02%

http://llvm-compile-time-tracker.com/compare.php?from=52832cd917af00e2b9c6a9d1476ba79754dcabff&to=e66520a4637290550a945d528e3e59573485dd40&stat=instructions

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D110171
The file was modified llvm/lib/Transforms/Vectorize/VectorCombine.cpp
The file was modified llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll
The file was modified llvm/test/Transforms/VectorCombine/AArch64/load-extract-insert-store-scalarization.ll
The file was modified llvm/test/Transforms/VectorCombine/X86/extract-binop-inseltpoison.ll
The file was modified llvm/test/Transforms/VectorCombine/load-insert-store.ll
The file was modified llvm/test/Transforms/VectorCombine/X86/extract-binop.ll
Commit 41492d77ba65338b9eb2b7f401e47acf22e4ea19 by llvm-dev
[LoopVectorize][X86] Add operands to make it more obvious what line the CHECK concerns

As we're checking the cost debug analysis these should match the original IR line - so we shouldn't have any variable naming issues.

I'm investigating v4i32 mul -> PMADDWD cost handling (for PR47437), and these CHECK lines were proving tricky to keep track of.
The file was modified llvm/test/Transforms/LoopVectorize/X86/mul_slm_16bit.ll
Commit 4ca1fbe361860976646ad09da26757bf32563145 by sander.desmalen
[SelectionDAG] Make WidenVecRes_Convert work for scalable vectors.

Most of the code wasn't yet scalable safe, although most of the
code conceptually just works for scalable vectors. This change
makes the algorithm work on ElementCount, where appropriate,
and leaves the fixed-width only code to use `getFixedNumElements`.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D110058
The file was modified llvm/test/CodeGen/AArch64/sve-fcvt.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
Commit 36daf074d997a79f25a1de2a1b869170ea6c20cc by fmayer
[hwasan] also omit safe mem[cpy|mov|set].

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D109816
The file was modified llvm/lib/Analysis/StackSafetyAnalysis.cpp
The file was modified llvm/test/Analysis/StackSafetyAnalysis/local.ll
The file was modified llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
The file was modified llvm/test/Analysis/StackSafetyAnalysis/ipa.ll
The file was modified llvm/test/Analysis/StackSafetyAnalysis/ipa-alias.ll
The file was modified llvm/test/Instrumentation/HWAddressSanitizer/mem-intrinsics.ll
The file was modified llvm/test/Instrumentation/HWAddressSanitizer/stack-safety-analysis.ll
The file was modified llvm/include/llvm/Analysis/StackSafetyAnalysis.h
Commit d0746f2e9bbf08f52196ae12f25d0ef7edcbbe4c by yikong
Don't fold (select C, (gep Ptr, Idx), Ptr) if C is vector but Idx is scalar

The folding rule (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C,
Idx, 0)) creates a malformed SELECT IR if C is a vector while Idx is scalar.

  SELECT VecC, ScalarIdx, 0

We could splat Idx to a vector but it defeats the purpose of
optimisation. Don't apply the folding rule in this case.
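
The guard described above can be sketched as a small predicate. This is an illustrative model, not the InstCombine code: the `Ty` class and the function name `may_fold_select_of_gep` are hypothetical, and types are reduced to a bit width plus a lane count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ty:
    """Toy IR type: `lanes == 1` is a scalar, `lanes > 1` a fixed-width vector."""
    bits: int
    lanes: int = 1

    @property
    def is_vector(self) -> bool:
        return self.lanes > 1


def may_fold_select_of_gep(cond_ty: Ty, idx_ty: Ty) -> bool:
    """Return False for the case the commit excludes: vector C, scalar Idx.

    Folding would require `select VecC, ScalarIdx, 0`, which mixes a vector
    condition with scalar operands.
    """
    return not (cond_ty.is_vector and not idx_ty.is_vector)


scalar_case = may_fold_select_of_gep(Ty(1), Ty(64))
mixed_case = may_fold_select_of_gep(Ty(1, lanes=4), Ty(64))
vector_case = may_fold_select_of_gep(Ty(1, lanes=4), Ty(64, lanes=4))
```

Only `mixed_case` is rejected; the all-scalar and all-vector combinations remain foldable in this model.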

This fixes a regression from commit d561b6fbdbe6d1da05fd92003a4ac1e37bf4b8bc.
The file was modified llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
The file was modified llvm/test/Transforms/InstCombine/select-gep.ll
Commit a5e1c746b870d79142419a07a8aecc471eacfed1 by Raphael Isemann
Unbreak module builds by making InstructionWorklist.h non-modular

This regressed in D110181 and apparently the header intentionally requires
DEBUG_TYPE to be defined by the including file. Just exclude the header from
the module to unbreak the build.
The file was modified llvm/include/llvm/module.modulemap
Commit 02cd8a6b915a9dab32fdd91167f875ce5f67ebd4 by david.green
[ARM] Allow smaller VMOVL in tail predicated loops

This allows VMOVL in tail predicated loops so long as the vector
size the VMOVL is extending into is less than or equal to the size of
the VCTP in the tail predicated loop. These cases represent a
sign-extend-inreg (or zero-extend-inreg), which needn't block tail
predication as in https://godbolt.org/z/hdTsEbx8Y.

For this a vecsize has been added to the TSFlag bits of MVE
instructions, which stores the size of the elements that the MVE
instruction operates on. In the case of multiple sizes (such as an
MVE_VMOVLs8bh that extends from i8 to i16), the largest size is
chosen. The sizes are encoded as 00 = i8, 01 = i16, 10 = i32 and 11 =
i64, which often (but not always) comes from the instruction encoding
directly. A unit test was added, and although only a subset of the
vecsizes are currently used, the rest should be useful for other cases.
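
The 2-bit encoding and the size comparison described above can be modelled as follows. This is a sketch under assumptions: the function names are hypothetical, only the 2-bit value is modelled (its position within TSFlags is not stated here), and the VCTP is represented simply by its element width in bits.

```python
# Encoding taken from the description: 00 = i8, 01 = i16, 10 = i32, 11 = i64.
VECSIZE_ENCODING = {8: 0b00, 16: 0b01, 32: 0b10, 64: 0b11}
VECSIZE_DECODING = {field: bits for bits, field in VECSIZE_ENCODING.items()}


def encode_vecsize(*element_bits):
    """Encode an instruction's element size; with several sizes, use the largest."""
    return VECSIZE_ENCODING[max(element_bits)]


def decode_vecsize(field):
    """Map the 2-bit field back to an element width in bits."""
    return VECSIZE_DECODING[field & 0b11]


def vmovl_allows_tail_predication(vmovl_field, vctp_element_bits):
    """Hypothetical check: the size extended into must not exceed the VCTP size."""
    return decode_vecsize(vmovl_field) <= vctp_element_bits


# MVE_VMOVLs8bh extends i8 to i16, so the larger size (i16) is recorded.
vmovl_s8bh = encode_vecsize(8, 16)
allowed_with_vctp32 = vmovl_allows_tail_predication(vmovl_s8bh, 32)
blocked_with_vctp8 = vmovl_allows_tail_predication(vmovl_s8bh, 8)
```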

Differential Revision: https://reviews.llvm.org/D109706
The file was modified llvm/test/CodeGen/Thumb2/mve-vmovlloop.ll
The file was modified llvm/unittests/Target/ARM/MachineInstrTest.cpp
The file was modified llvm/lib/Target/ARM/ARMInstrMVE.td
The file was modified llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h
The file was modified llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
The file was modified llvm/lib/Target/ARM/ARMInstrFormats.td
Commit 9f34f75ff8f49b0efca6e20d916527a2c432d8b4 by martin
[lldb] [Windows] Fix continuing from breakpoints and singlestepping on ARM/AArch64

Based on suggestions by Eric Youngdale.

This fixes https://llvm.org/PR51673.

Differential Revision: https://reviews.llvm.org/D109777
The file was modified lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.cpp
The file was modified lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
The file was modified lldb/source/Plugins/Process/Windows/Common/TargetThreadWindows.cpp
The file was modified lldb/source/Plugins/Platform/Windows/PlatformWindows.h
The file was modified lldb/source/Plugins/Process/Windows/Common/NativeProcessWindows.h
The file was modified lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp
The file was modified lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp
Commit ea21d688dc0a420b9fc385562a46017fb39b13e5 by flo
[Matrix] Emit assumption that matrix indices are valid.

The matrix extension requires the indices for matrix subscript
expressions to be valid; it is UB otherwise.

extract/insertelement produce poison if the index is invalid, which
prevents the optimizer from scalarizing load/extract pairs, for
example. This causes very suboptimal code to be generated when using
matrix subscript expressions with variable indices for large matrices.

This patch updates IRGen to emit assumes for index expressions to
convey the information that the index must be valid.

This also adjusts the order in which operations are emitted slightly, so
indices & assumes are added before the load of the matrix value.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D102478
The file was modified llvm/include/llvm/IR/MatrixBuilder.h
The file was modified clang/test/CodeGenObjC/matrix-type-operators.m
The file was modified clang/test/CodeGen/matrix-type-operators.c
The file was modified clang/test/CodeGenCXX/matrix-type-operators.cpp
The file was modified clang/lib/CodeGen/CGExpr.cpp
The file was modified clang/lib/CodeGen/CGExprScalar.cpp
Commit 1ee851c5859fdb36eca57a46347a1e7b8e1ff236 by spatel
Revert "[CodeGen] regenerate test checks; NFC"

This reverts commit 52832cd917af00e2b9c6a9d1476ba79754dcabff.
The motivating commit 2f6b07316f5 caused several bots to hit
an infinite loop at stage 2, so that needs to be reverted too
while figuring out how to fix that.
The file was modified clang/test/CodeGen/aapcs-bitfield.c
Commit c6013f71a4555f6d9ef9c60e6bc4376ad63f1c47 by spatel
Revert "[InstCombine] fold cast of right-shift if high bits are not demanded"

This reverts commit 2f6b07316f560a1f6d225919019dff2e5d6346e5.

This caused several bots to hit an infinite loop at stage 2,
so it needs to be reverted while figuring out how to fix that.
The file was modified llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
The file was modified llvm/test/Transforms/InstCombine/trunc-demand.ll
Commit a7c6471a85380f5af644e50daf2951b41c82f1b2 by flo
[Passes] Run vector-combine early with -fenable-matrix.

IR with matrix intrinsics is likely to also contain large vector
operations, which can benefit from early simplifications.

This is the last step in a series of changes to improve code-gen for
code using matrix subscript operators with the C/C++ matrix extension in
Clang, like

    using matrix_t = double __attribute__((matrix_type(15, 15)));

    void foo(unsigned i, matrix_t &A, matrix_t &B) {
      for (unsigned j = 0; j < 4; ++j)
        for (unsigned k = 0; k < i; k++)
          B[k][j] -= A[k][j] * B[i][j];
    }

https://clang.godbolt.org/z/6dKxK1Ed7

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D102496
The file was modified llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
The file was modified llvm/test/Other/new-pm-defaults.ll
The file was modified llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll
The file was modified llvm/lib/Passes/PassBuilderPipelines.cpp
Commit c828b93fb367b67d5e6342fee179a93970ba71ec by thakis
[gn build] (manually) port f8b1cc365786
The file was modified llvm/utils/gn/secondary/libcxxabi/src/BUILD.gn
Commit f099ac838e6bce8b743a71c2fc46c1699eae8dc3 by llvmgnsyncbot
[gn build] Port 7a320b279d07
The file was modified llvm/utils/gn/secondary/libcxx/include/BUILD.gn
Commit d5681f1d688a45c000dd1e2c4f4d3678e0440b94 by sander.desmalen
[SelectionDAG] Add PromoteIntOp_INSERT_SUBVECTOR.

This is required to codegen something like:
  <vscale x 8 x i16> @llvm.experimental.vector.insert(<vscale x 8 x i16> %vec,
                                                      <vscale x 2 x i16> %subvec,
                                                      i64 %idx)
where the output vector is legal, but the input vector needs promoting.

It implements this by performing the whole operation on the promoted type,
and then truncating the result.
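
The promote, insert, truncate sequence can be sketched with plain lists. This is a simplified model, not the SelectionDAG code: the function name is hypothetical, vectors are fixed-length lists (scalable vectors are not modelled), zero-extension is used as one possible way to promote, and unbounded Python integers stand in for the wider element type.

```python
def insert_subvector_via_promotion(vec, subvec, idx, elt_bits=16):
    """Insert `subvec` into `vec` at `idx` by working on promoted elements.

    Hypothetical helper: elements are unsigned integers of `elt_bits` bits.
    """
    if idx < 0 or idx + len(subvec) > len(vec):
        raise ValueError("subvector does not fit at the requested index")
    mask = (1 << elt_bits) - 1

    # Step 1: promote both operands (Python ints act as the wider type).
    wide_vec = [x & mask for x in vec]
    wide_sub = [x & mask for x in subvec]

    # Step 2: perform the whole insert on the promoted values.
    wide_vec[idx:idx + len(wide_sub)] = wide_sub

    # Step 3: truncate every element back to the original width.
    return [x & mask for x in wide_vec]


inserted = insert_subvector_via_promotion(
    [1, 2, 3, 4, 5, 6, 7, 8], [0xFFFF, 0x1234], idx=2
)
```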

Reviewed By: david-arm, craig.topper

Differential Revision: https://reviews.llvm.org/D110059
The file was modified llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-insert-vector.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
Commit 9689c1b7bb77d65e8acc9a13e5e416803d38b02f by Stefan Gränitz
[lldb] JITLoaderGDB tests can use lli in ORC greedy mode

At first, lli only supported lazy mode for ORC. Greedy mode was added with e1579894d205 and is the default setting now. JITLoaderGDB tests don't rely on laziness, so we can switch them to greedy and remove some complexity.
The file was modified lldb/test/Shell/Breakpoint/jit-loader_rtdyld_elf.test
The file was modified lldb/test/Shell/Breakpoint/jit-loader_jitlink_elf.test