Success

Changes

Summary

  1. [MSAN RT] Use __sanitizer::mem_is_zero in __msan_test_shadow (details)
  2. [WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsics (details)
  3. [SampleFDO] Stop letting findCalleeFunctionSamples return unrelated profiles (details)
  4. [MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner. (details)
  5. [XCOFF][AIX] Use TE storage mapping class when large code model is enabled (details)
Commit c0b5000bd848303320c03f80fbf84d71e74518c9 by guiand
[MSAN RT] Use __sanitizer::mem_is_zero in __msan_test_shadow

__sanitizer::mem_is_zero is optimized for exactly the use case we're
interested in: an all-zero buffer.

This reduces the overhead of calling __msan_test_shadow by 80% or
more. That is particularly useful when instrumenting code that relies
heavily on string processing functions, like grep. An invocation of grep
with the pattern '[aeiou]k[aeiou]' has its runtime reduced by
~75% with this patch.
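
To illustrate why an all-zero fast path helps, here is a minimal sketch of a word-at-a-time zero check followed by a byte scan that only runs when something is poisoned. The names buffer_is_zero and test_shadow are hypothetical stand-ins, not the sanitizer runtime's code; the return convention (offset of the first nonzero shadow byte, or -1) is an assumption about how __msan_test_shadow reports its result.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an optimized all-zero check.
// Not the actual __sanitizer::mem_is_zero implementation.
bool buffer_is_zero(const void *data, std::size_t size) {
  const unsigned char *p = static_cast<const unsigned char *>(data);
  const unsigned char *end = p + size;
  // Check leading bytes one at a time until p is 8-byte aligned.
  while (p != end &&
         reinterpret_cast<std::uintptr_t>(p) % sizeof(std::uint64_t) != 0) {
    if (*p++ != 0) return false;
  }
  // OR whole words together; the accumulator stays 0 only if all are 0.
  std::uint64_t acc = 0;
  while (static_cast<std::size_t>(end - p) >= sizeof(std::uint64_t)) {
    std::uint64_t word;
    std::memcpy(&word, p, sizeof(word));
    acc |= word;
    p += sizeof(word);
  }
  // Fold in any trailing bytes.
  while (p != end) acc |= *p++;
  return acc == 0;
}

// Assumed convention: return the offset of the first nonzero (poisoned)
// shadow byte, or -1 if the whole range is clean.
std::ptrdiff_t test_shadow(const unsigned char *shadow, std::size_t size) {
  if (buffer_is_zero(shadow, size)) return -1;  // common case, fast path
  for (std::size_t i = 0; i < size; ++i)
    if (shadow[i] != 0) return static_cast<std::ptrdiff_t>(i);
  return -1;
}
```

In the common case of fully initialized memory, the byte-by-byte scan never runs, which is where a speedup of this kind would come from.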

Differential Revision: https://reviews.llvm.org/D84961
The file was modified: compiler-rt/lib/msan/msan.cpp
Commit 514445e0353e82fa0bd59eeea437499500e232cd by tlively
[WebAssembly][ConstantFolding] Fold fp-to-int truncation intrinsics

Constant fold both the trapping and saturating versions of the
WebAssembly truncation intrinsics. The tests are adapted from the
WebAssembly spec tests for the corresponding instructions.
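
As a rough illustration of the semantics being folded, the sketch below evaluates a signed 32-bit truncation of a double at "compile time" in both flavors. The function names are hypothetical and this is not the code in ConstantFolding.cpp. It assumes the WebAssembly rules that the saturating form maps NaN to 0 and clamps out-of-range inputs, and it assumes a conservative folder simply declines to fold a trapping truncation that would trap.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Saturating truncation: NaN becomes 0, out-of-range values clamp.
std::int32_t fold_trunc_sat_s32(double x) {
  if (std::isnan(x)) return 0;
  if (x <= -2147483648.0) return std::numeric_limits<std::int32_t>::min();
  if (x >= 2147483648.0) return std::numeric_limits<std::int32_t>::max();
  return static_cast<std::int32_t>(x);  // in range, truncates toward zero
}

// Trapping truncation: returns false (no fold) when the instruction
// would trap at run time, so the trap is preserved in the program.
bool fold_trunc_s32(double x, std::int32_t &out) {
  if (std::isnan(x)) return false;
  double t = std::trunc(x);
  if (t < -2147483648.0 || t > 2147483647.0) return false;
  out = static_cast<std::int32_t>(t);
  return true;
}
```

The range checks come before the cast because converting an out-of-range double to an integer is undefined behavior in C++.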

Requested in PR46982.

Differential Revision: https://reviews.llvm.org/D85392
The file was added: llvm/test/Analysis/ConstantFolding/WebAssembly/trunc_saturate.ll
The file was modified: llvm/lib/Analysis/ConstantFolding.cpp
The file was added: llvm/test/Analysis/ConstantFolding/WebAssembly/trunc.ll
Commit 4cd8e9b169f4dc5dde19807585c86f6d6113d3ff by wmi
[SampleFDO] Stop letting findCalleeFunctionSamples return unrelated profiles
for invoke instructions.

We see the warning "No debug information found in function foo: Function
profile not used" in one case. The function foo is called by an invoke
instruction. It has no debug information because its definition is marked
attribute((nodebug)). It shouldn't have a profile instance in the sample
profile, but the compiler thinks it does. That turns out to be a compiler bug
in findCalleeFunctionSamples. The bug was exposed when
sample-profile-merge-inlinee was recently enabled.

Currently in findCalleeFunctionSamples, CalleeName is left unset, and so is
empty, for invoke instructions. For an empty CalleeName, findFunctionSamplesAt
treats the call as an indirect call and returns any inline instance profile at
the same location as the instruction. That leads to a wrong profile being
returned for function foo.

The patch sets CalleeName when the instruction is an invoke.
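
The failure mode can be modeled with a small lookup sketch. Everything here (Profile, InlineInstances, find_samples_at) is a hypothetical simplification, not the SampleProfile.cpp code; in particular, picking the instance with the most samples for an empty name is an assumption about how "any inline instance" is chosen.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified profile record.
struct Profile {
  std::uint64_t total_samples;
};

// Inline instance profiles recorded at one call-site location,
// keyed by callee name.
using InlineInstances = std::map<std::string, Profile>;

const Profile *find_samples_at(const InlineInstances &at_loc,
                               const std::string &callee_name) {
  if (callee_name.empty()) {
    // Treated as an indirect call: return some instance at this
    // location (assumed here: the one with the most samples).
    const Profile *best = nullptr;
    for (const auto &entry : at_loc)
      if (!best || entry.second.total_samples > best->total_samples)
        best = &entry.second;
    return best;
  }
  // Direct call: only an exact name match is acceptable.
  auto it = at_loc.find(callee_name);
  return it == at_loc.end() ? nullptr : &it->second;
}
```

With a location that only holds a profile for "bar", looking up "" returns bar's profile (the unrelated result described above), while looking up "foo" returns nullptr, which is the behavior one gets once the name is set for invokes.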

Differential Revision: https://reviews.llvm.org/D85664
The file was modified: llvm/lib/Transforms/IPO/SampleProfile.cpp
The file was added: llvm/test/Transforms/SampleProfile/nodebug-error.ll
Commit 7bc03f55539f7f081daea5363f2e4845b2e75f57 by puyan
[MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner.

In cases where MachineOutliner candidates either:

  * are noreturn
  * have calls with no available LR or free regs
  * don't use SP

we can end up hitting stack fixup code for the caller and the callee for
a FrameID of MachineOutlinerDefault. This triggers the assert:

  `assert(OF.FrameConstructionID != MachineOutlinerDefault &&
          "Can only fix up stack references once");`

in AArch64InstrInfo.cpp. This assert exists for now because a lot of the
fixup code is not tested to handle fixing up more than once and needs
some better checks and enhancements to avoid potentially generating
illegal code.

I've filed a Bugzilla report to track this until these cases are handled
by the AArch64 MachineOutliner: https://bugs.llvm.org/show_bug.cgi?id=46767

This diff detects cases that will cause these multiple stack fixups and
prunes the Candidates from `RepeatedSequenceLocs`.
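
The pruning step can be pictured as a plain erase-remove over the candidate list. This is only a sketch: the Candidate struct, its needs_second_stack_fixup flag, and the "at least two candidates" check are hypothetical simplifications, and the real detection logic in AArch64InstrInfo.cpp is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified outlining candidate.
struct Candidate {
  unsigned start_index;
  // Assumed to be precomputed: true when outlining this candidate would
  // require fixing up stack references a second time.
  bool needs_second_stack_fixup;
};

// Drop every candidate that would hit the double-fixup assert.
void prune_double_fixup_candidates(
    std::vector<Candidate> &repeated_sequence_locs) {
  repeated_sequence_locs.erase(
      std::remove_if(repeated_sequence_locs.begin(),
                     repeated_sequence_locs.end(),
                     [](const Candidate &c) {
                       return c.needs_second_stack_fixup;
                     }),
      repeated_sequence_locs.end());
}

// Assumption: outlining only pays off with at least two occurrences left.
bool still_worth_outlining(const std::vector<Candidate> &locs) {
  return locs.size() >= 2;
}
```

After pruning, a caller would typically re-check whether enough candidates remain before creating an outlined function.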

Differential Revision: https://reviews.llvm.org/D83923
The file was added: llvm/test/CodeGen/AArch64/machine-outliner-noreturn-no-stack.mir
The file was added: llvm/test/CodeGen/AArch64/machine-outliner-no-noreturn-no-stack.mir
The file was modified: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
The file was added: llvm/test/CodeGen/AArch64/machine-outliner-2fixup-blr-terminator.mir
Commit 20abff0481d598c850d2690083f90700fc8c9603 by jasonliu
[XCOFF][AIX] Use TE storage mapping class when large code model is enabled

Summary:
Use the TE SMC instead of the TC SMC in large code model mode,
so that large code model TOC entries are placed after all
the small code model TOC entries, which reduces the chance of TOC overflow.

Reviewed By: Xiangling_L

Differential Revision: https://reviews.llvm.org/D85455
The file was modified: llvm/test/CodeGen/PowerPC/aix-lower-block-address.ll
The file was modified: llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
The file was modified: llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp
The file was modified: llvm/include/llvm/Target/TargetLoweringObjectFile.h
The file was modified: llvm/lib/MC/MCSectionXCOFF.cpp
The file was modified: llvm/test/CodeGen/PowerPC/aix-lower-jump-table.ll
The file was modified: llvm/test/CodeGen/PowerPC/lower-globaladdr64-aix-asm.ll
The file was modified: llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
The file was modified: llvm/test/CodeGen/PowerPC/lower-globaladdr32-aix-asm.ll
The file was modified: llvm/test/CodeGen/PowerPC/aix-lower-constant-pool-index.ll
The file was modified: llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h