SuccessChanges

Summary

  1. [lldb] Fix crash in "help memory read" (details)
  2. [ARM][MachineOutliner] Add stack fixup feature (details)
  3. [lldb] Re-enable TestPlatformProcessConnect on macos (details)
  4. [LLDB] Add support to resize SVE registers at run-time (details)
  5. [LLDB] Test SVE dynamic resize with multiple threads (details)
  6. [LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands. (details)
Commit 9a7672ac4980bca8829814e1e49e1c201a5bf9b6 by david.spickett
[lldb] Fix crash in "help memory read"

When a command option does not have a short version
(e.g. -f for --file), we use an arbitrary value in the
short_option field to mark it as invalid.
(though this value is unqiue to be used later for other
things)

We check that this short option is valid to print using
llvm::isPrint. This implicitly casts our int to char,
meaning we check the last char of any short_option value.

Since the arbitrary value we chose for these options is
some shortened hex version of the name, this returned true
even for invalid values.

Since llvm::isPrint returns true we later call std::islower
and/or std::isupper on the short_option value. (the int)

Calling these functions with something that cannot be validly
converted to unsigned char is undefined. Somehow we got/get
away with this but for me compiling with g++-9 I got a crash
for "help memory read".

The other command that uses this is "target variable" but that
didn't crash for unknown reasons.

Checking that short_option can fit into an unsigned char before
we call llvm::isPrint means we will not attempt to call islower/upper
on these options since we have no reason to print them.

This also fixes bogus short options being shown for "memory read"
and target variable.

For "target variable", before:
       -e <filename> ( --file <filename> )
       -b <filename> ( --shlib <filename> )
After:
       --file <filename>
       --shlib <filename>

(note that the bogus short options are just the bottom byte of our
arbitrary short_option value)

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D94917
The file was modifiedlldb/test/API/commands/help/TestHelp.py
The file was modifiedlldb/include/lldb/Utility/OptionDefinition.h
Commit 244ad228f34363b508cd1096c99d8f1bbe999d85 by yvan.roux
[ARM][MachineOutliner] Add stack fixup feature

This patch handles cases where we have to save/restore the link register
into the stack and and load/store instruction which use the stack are
part of the outlined region. It checks that there will be no overflow
introduced by the new offset and fixup these instructions accordingly.

Differential Revision: https://reviews.llvm.org/D92934
The file was modifiedllvm/test/CodeGen/ARM/machine-outliner-default.mir
The file was modifiedllvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.generated.expected
The file was modifiedllvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll.nogenerated.expected
The file was modifiedllvm/test/CodeGen/ARM/machine-outliner-no-lr-save.mir
The file was modifiedllvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/arm_generated_funcs.ll
The file was addedllvm/test/CodeGen/ARM/machine-outliner-stack-fixup-arm.mir
The file was modifiedllvm/lib/Target/ARM/ARMBaseInstrInfo.h
The file was addedllvm/test/CodeGen/ARM/machine-outliner-stack-fixup-thumb.mir
The file was modifiedllvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
Commit 079e664661770a78e30c0d27a12d50047f1b1ea8 by pavel
[lldb] Re-enable TestPlatformProcessConnect on macos

The test couldn't find lldb-server as it's path was being overridden by
LLDB_DEBUGSERVER_PATH environment variable (pointing to debugserver).
This test should always use lldb-server, as it tests its platform
capabilities.

There's no need for the environment override, as lldb-server tests
should test the executable they just built, so I just remote the
override capability.
The file was modifiedlldb/packages/Python/lldbsuite/test/tools/lldb-server/lldbgdbserverutils.py
The file was modifiedlldb/test/API/tools/lldb-server/platform-process-connect/TestPlatformProcessConnect.py
Commit e448ad787e16119f8db8cc6999896e678a0356ac by omair.javaid
[LLDB] Add support to resize SVE registers at run-time

This patch builds on previously submitted SVE patches regarding expedited
register set and per thread register infos. (D82853 D82855 and D82857)

We need to resize SVE register based on value received in expedited list.
Also we need to resize SVE registers when we write vg register using
register write vg command. The resize will result in a updated offset
for all of fpr and sve register set. This offset will be configured
in native register context by RegisterInfoInterface and will also be
be updated on client side in GDBRemoteRegisterContext.

A follow up patch will provide a API test to verify this change.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D82863
The file was modifiedlldb/source/Plugins/Process/Utility/DynamicRegisterInfo.cpp
The file was modifiedlldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.h
The file was modifiedlldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
The file was modifiedlldb/source/Plugins/Process/gdb-remote/GDBRemoteRegisterContext.cpp
The file was modifiedlldb/source/Plugins/Process/Utility/DynamicRegisterInfo.h
The file was modifiedlldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
Commit 4d3081331ad854e0bff5032c818ec6414fb974c0 by omair.javaid
[LLDB] Test SVE dynamic resize with multiple threads

This patch adds a new test case which depends on AArch64 SVE support and
dynamic resize capability enabled. It created two seperate threads which
have different values of sve registers and SVE vector granule at various
points during execution.

We test that LLDB is doing the size and offset updates properly for all
of the threads including the main thread and when we VG is updated using
prctl call or by 'register write vg' command the appropriate changes are
also update in register infos.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D82866
The file was addedlldb/test/API/commands/register/register/aarch64_sve_registers/rw_access_dynamic_resize/Makefile
The file was addedlldb/test/API/commands/register/register/aarch64_sve_registers/rw_access_dynamic_resize/TestSVEThreadedDynamic.py
The file was addedlldb/test/API/commands/register/register/aarch64_sve_registers/rw_access_dynamic_resize/main.c
Commit 83daa49758a12d585fe2d9a64448e54d91bcfaff by flo
[LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands.

D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.

The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.

This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.

I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.

From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory, has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.

There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have be
vectorized earlier (by scalarzing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).

There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.

I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.

Reviewed By: sanwou01

Differential Revision: https://reviews.llvm.org/D94232
The file was modifiedllvm/lib/Transforms/Utils/LoopRotationUtils.cpp
The file was modifiedllvm/include/llvm/Analysis/CodeMetrics.h
The file was modifiedllvm/test/Transforms/LoopRotate/call-prepare-for-lto.ll
The file was modifiedllvm/lib/Analysis/CodeMetrics.cpp
The file was modifiedllvm/include/llvm/Transforms/Utils/LoopRotationUtils.h
The file was modifiedllvm/lib/Passes/PassBuilder.cpp
The file was modifiedllvm/include/llvm/Transforms/Scalar/LoopRotation.h
The file was modifiedllvm/lib/Transforms/Scalar/LoopRotation.cpp
The file was modifiedllvm/lib/Transforms/IPO/PassManagerBuilder.cpp
The file was modifiedllvm/include/llvm/Transforms/Scalar.h