Success

Changes

Summary

  1. [XCOFF][AIX] Support relocation generation for large code model
  2. [ThinLTO][Legacy] Compute PreservedGUID based on IRName in Symtab
  3. [llvm] [DAG] Fix bug in llvm.get.active.lane.mask lowering
  4. Revert "Fix frame pointer layout on AArch64 Linux."
  5. [NFC][InstCombine] Add tests with PHI-of-{insert,extract}value with multiple uses
  6. [NFC][Value] Fixup comments, "N users" is NOT the same as "N uses".
  7. [Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks
  8. [clang] Exclude invalid destructors from lookups.
  9. [X86] Add assembler support for .d32 and .d8 mnemonic suffixes to control displacement size.
  10. [IR] Add NoUndef attribute to Intrinsics.td
  11. [SampleFDO] Enhance profile remapping support for searching inline instance
  12. AMDGPU: Don't assert on misaligned DS read2/write2 offsets
  13. [Hexagon] Implement llvm.masked.load and llvm.masked.store for HVX
  14. [SVE] Remove calls to VectorType::getNumElements from clang
  15. [Polly] Use llvm::function_ref. NFC.
  16. [Polly] Inline ShoulDelete lambda. NFC.
  17. [LTO] Don't apply LTOPostLink module flag during writeMergedModule
  18. [MC][SVE] Fix data operand for instruction alias of `st1d`.
  19. [gn build] Manually port ed07e1fe
  20. [InstSimplify] Simplify to vector constants when possible
  21. Add cmake test support for LLJITWithThinLTOSummaries to make sure
  22. [mlir] NFC: fix trivial typos in documents
  23. [OpenMP] Fix import library installation with MinGW
  24. [libomptarget][amdgpu] Update plugin CMake to work with latest rocr library
  25. [Support][Windows] Fix incorrect GetFinalPathNameByHandleW() return value check in realPathFromHandle()
  26. [llvm] [Thumb2] Test unusual length for active lane mask
Commit 413054400d949ddd15e9bfdcb587502ea0311fcf by jasonliu
[XCOFF][AIX] Support relocation generation for large code model

Summary:
Support TOCU and TOCL relocation type for object file generation.

Reviewed by: DiggerLin

Differential Revision: https://reviews.llvm.org/D84549
The file was modified llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp
The file was modified llvm/lib/MC/XCOFFObjectWriter.cpp
The file was added llvm/test/CodeGen/PowerPC/aix-xcoff-reloc-large.ll
Commit 34b289b6dbcf1cdb328ab0a13cdedf96701394af by Steven Wu
[ThinLTO][Legacy] Compute PreservedGUID based on IRName in Symtab

Instead of computing the GUID from an assumed symbol mangling rule
(IRName to symbol name), look up the IRName in the symtabs of all the
input files and, if a matching symbol entry exists, use the IRName it
provides for GUID computation.

rdar://65853754

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D84803
The file was added llvm/test/ThinLTO/X86/mangled_symbol.ll
The file was modified llvm/lib/LTO/ThinLTOCodeGenerator.cpp
The file was modified llvm/test/ThinLTO/X86/weak_resolution.ll
The file was modified llvm/test/ThinLTO/X86/internalize.ll
The file was modified llvm/test/ThinLTO/X86/weak_resolution_single.ll
Commit 72305a08ffcb2da10a33732adfaa8757ba70904f by ajcbik
[llvm] [DAG] Fix bug in llvm.get.active.lane.mask lowering

The lowering of this intrinsic only accepted proper machine vector
lengths. This change fixes that and adds tests.
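
For context, here is a minimal sketch of the intrinsic's documented semantics (lane i is active when base + i < n, compared as unsigned and without wrap-around), evaluated for a vector length that is not a power of two. The helper name `active_lane_mask` is hypothetical and only models the result; it is not LLVM API and says nothing about how the lowering is implemented.

```python
from typing import List


def active_lane_mask(base: int, n: int, lanes: int) -> List[bool]:
    """Model of llvm.get.active.lane.mask: lane i is set iff base + i < n.

    Python integers do not wrap, which matches the "no overflow" reading
    of the addition. All names here are illustrative assumptions.
    """
    if base < 0 or n < 0 or lanes <= 0:
        raise ValueError("base and n must be non-negative, lanes positive")
    return [base + i < n for i in range(lanes)]


# An "unusual" length: 7 lanes, starting at element 4 of a 9-element loop.
mask = active_lane_mask(base=4, n=9, lanes=7)
print(mask)
```

Nothing in the model depends on the lane count being a legal machine vector length.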

https://bugs.llvm.org/show_bug.cgi?id=47299

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D86585
The file was added llvm/test/CodeGen/X86/pr47299.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Commit 9061eb8245cc1ae25e8f7865062d1d8e44406994 by resistor
Revert "Fix frame pointer layout on AArch64 Linux."

This broke stage2 of clang-cmake-aarch64-full.

This reverts commit a0aed80b22d1b698b86e0c16109fdfd4d592756f.
The file was modified llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
The file was modified llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
Commit 8bfe46dce22266e596370eac86b1aae799300e7e by lebedev.ri
[NFC][InstCombine] Add tests with PHI-of-{insert,extract}value with multiple uses

It is fine if the operation has multiple uses, as long as they are all
in this very PHI node.
The file was modified llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll
The file was modified llvm/test/Transforms/InstCombine/phi-of-insertvalues.ll
Commit c07a430bd39cccb64712ddcba85254a5bb1cd89b by lebedev.ri
[NFC][Value] Fixup comments, "N users" is NOT the same as "N uses".

In those cases, it really means "N uses".
The file was modified llvm/include/llvm/IR/Value.h
Commit 95848ea101274b8bd774c63bad55f21a08080705 by lebedev.ri
[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks

As FIXME said, they really should be checking for a single user,
not use, so let's do that. It is not *that* unusual to have
the same value as incoming value in a PHI node, not unlike
how a PHI may have the same incoming basic block more than once.

There isn't a nice way to do that: Value::users() isn't uniquified,
and Value only tracks its uses, not Users, so the check is
potentially costly since it may indeed involve
traversing the entire use list of a value.
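
The difference between the two checks can be sketched with a toy model. The `Value` class and helper functions below are illustrative stand-ins, not LLVM's classes; they only assume what the message states: a value records one entry per use, and the same user may appear more than once.

```python
class Value:
    """Toy value that tracks its uses as a list of users (one entry per use)."""

    def __init__(self, name):
        self.name = name
        self.uses = []  # the same user may appear several times


def add_use(value, user):
    """Record that `user` uses `value` once more."""
    value.uses.append(user)


def has_one_use(value):
    """True only when there is exactly one use."""
    return len(value.uses) == 1


def has_one_user(value):
    """True when every use belongs to the same user.

    Without a uniquified user list this has to walk the whole use list,
    which is the potential cost the commit message mentions.
    """
    if not value.uses:
        return False
    first = value.uses[0]
    return all(user is first for user in value.uses)


# A PHI that receives the same incoming value from two predecessors.
op = Value("extractvalue")
phi = Value("phi")
add_use(op, phi)
add_use(op, phi)
print(has_one_use(op), has_one_user(op))
```

In this model the operation has two uses but a single user, which is the case the one-user check is meant to accept.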
The file was modified llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modified llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
The file was modified llvm/test/Transforms/PGOProfile/chr.ll
The file was modified llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
The file was modified llvm/include/llvm/IR/Value.h
The file was modified llvm/lib/IR/Value.cpp
The file was modified llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll
The file was modified llvm/test/Transforms/InstCombine/phi-of-insertvalues.ll
Commit eed0af6179ca4fe9e60121e0829ed8d3849b1ce5 by adamcz
[clang] Exclude invalid destructors from lookups.

This fixes a crash when declaring a destructor with a wrong name, then
writing the result to a PCH file and loading it again. The PCH storage uses
DeclarationNameKey as the key, and it is the same key for both the invalid
destructor and the implicit one that was created because the other one
was invalid. When querying for Foo::~Foo we end up getting
Foo::~Bar, which is then rejected, and we end up with nullptr in
CXXRecordDecl::getDestructor().

Fixes https://bugs.llvm.org/show_bug.cgi?id=47270

Differential Revision: https://reviews.llvm.org/D86624
The file was added clang/test/PCH/cxx-invalid-destructor.h
The file was modified clang/lib/AST/DeclBase.cpp
The file was added clang/test/PCH/cxx-invalid-destructor.cpp
Commit 09288bcbf5f124a2b0e24a6b1f2d27b66dba9adf by craig.topper
[X86] Add assembler support for .d32 and .d8 mnemonic suffixes to control displacement size.

This is an older syntax than the {disp32} and {disp8} pseudo
prefixes that were added a few weeks ago. We can reuse most of
the support for that to support .d32 and .d8 as well.
The file was modified llvm/test/MC/X86/x86-64.s
The file was modified llvm/docs/ReleaseNotes.rst
The file was modified llvm/test/MC/X86/x86-32.s
The file was modified llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
Commit 684b43c0cfb1092a65c237b39d0662bfe0a2c97a by aqjune
[IR] Add NoUndef attribute to Intrinsics.td

This patch adds NoUndef to Intrinsics.td.
The attribute is attached to llvm.assume's operand, because llvm.assume(undef)
is UB.
It is attached to pointer operands of several memory accessing intrinsics
as well.

This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check
unnecessary, so it is removed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86576
The file was modified llvm/utils/TableGen/CodeGenTarget.cpp
The file was modified llvm/utils/TableGen/CodeGenIntrinsics.h
The file was modified llvm/test/Transforms/EarlyCSE/invariant.start.ll
The file was modified mlir/test/Target/llvmir-intrinsics.mlir
The file was modified llvm/lib/Analysis/ValueTracking.cpp
The file was modified llvm/utils/TableGen/IntrinsicEmitter.cpp
The file was modified llvm/include/llvm/IR/Intrinsics.td
Commit c67ccf5fafc8c035f152ce30115bbdacf23530d5 by wmi
[SampleFDO] Enhance profile remapping support for searching inline instance
and indirect call promotion candidate.

Profile remapping is a feature to match a function in the module with its
profile in sample profile if the function name and the name in profile look
different but are equivalent using given remapping rules. This is a useful
feature to keep the performance stable by specifying some remapping rules
when sampleFDO targets are going through some large scale function signature
change.

However, currently profile remapping support is only valid for outline
function profile in SampleFDO. It cannot match a callee with an inline
instance profile if they have different but equivalent names. We found
that without the support for inline instance profile, remapping is less
effective for some large scale change.

To add that support, before any remapping lookup happens, all the names
in the profile are inserted into the remapper, and the Key for each name
is recorded in a map called NameMap in the remapper. During
name lookup, a Key is returned for the given name and is used
to extract an equivalent name in the profile from NameMap. So with the help
of the NameMap, we can translate any given name to an equivalent name in
the profile if one exists. Whenever we try to match a name in the module to
a name in the profile, we try the original name first,
and if it doesn't match, we retry with the equivalent name obtained from
the remapper. In this way, the patch enhances the
profile remapping support for searching inline instances and searching
indirect call promotion candidates.
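
The lookup order described above can be sketched as follows. This is a toy model under stated assumptions: `ToyRemapper`, its string-replacement rules, and `find_profile` are hypothetical names for illustration, and the real remapper works on mangled names with its own rule format rather than plain substring replacement.

```python
class ToyRemapper:
    """Maps names to a canonical key and remembers which profile name owns each key."""

    def __init__(self, rules):
        self.rules = rules      # list of (old, new) substrings treated as equivalent
        self.name_map = {}      # key -> name as it appears in the profile

    def key(self, name):
        """Canonicalize a name by applying every rule in order."""
        for old, new in self.rules:
            name = name.replace(old, new)
        return name

    def insert(self, profile_name):
        """Register a profile name before any lookup happens."""
        self.name_map.setdefault(self.key(profile_name), profile_name)

    def lookup(self, name):
        """Return the equivalent profile name, or None if there is none."""
        return self.name_map.get(self.key(name))


def find_profile(profiles, remapper, name):
    """Try the original name first, then the remapped equivalent."""
    if name in profiles:
        return profiles[name]
    equivalent = remapper.lookup(name)
    if equivalent is not None:
        return profiles.get(equivalent)
    return None


# The profile was collected before a signature change from 'long long' to 'long'.
profiles = {"foo(long long)": 120}
remapper = ToyRemapper([("long long", "long")])
for profile_name in profiles:
    remapper.insert(profile_name)

print(find_profile(profiles, remapper, "foo(long)"))
```

Because every profile name is registered up front, the same lookup path can serve any consumer that needs to match a module name against the profile.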

In a planned large scale change of int64 type (long long) to int64_t (long),
we found the performance of a Google-internal benchmark degraded by 2% if
nothing was done. With the existing profile remapping enabled, the
degradation dropped to 1.2%. With the profile remapping from the current
patch enabled, the degradation dropped further to 0.14%. (Note that the
experiment was done before the search for indirect call promotion candidates
was added. We hope that with remapping support for that search, the
degradation can drop to 0% in the end. It will be evaluated
post commit.)

Differential Revision: https://reviews.llvm.org/D86332
The file was modified llvm/include/llvm/ProfileData/SampleProfReader.h
The file was added llvm/test/Transforms/SampleProfile/Inputs/remap-2.prof
The file was added llvm/test/Transforms/SampleProfile/remap-2.ll
The file was modified llvm/lib/ProfileData/SampleProfReader.cpp
The file was modified llvm/unittests/ProfileData/SampleProfTest.cpp
The file was modified llvm/include/llvm/ProfileData/SampleProf.h
The file was modified llvm/lib/ProfileData/SampleProf.cpp
The file was modified llvm/lib/Transforms/IPO/SampleProfile.cpp
Commit f78687df9b790b4f4177a72cbd25b49d14c437b4 by Matthew.Arsenault
AMDGPU: Don't assert on misaligned DS read2/write2 offsets

This would assert with unaligned DS access enabled. The offset may not
be aligned. Theoretically the pattern predicate should check the
memory alignment, although it is possible to have the memory be
aligned but not the immediate offset.

In this case I would expect it to use ds_{read|write}_b64 with
unaligned access, but am not clear if there's a reason it doesn't.
The file was modified llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
The file was modified llvm/test/CodeGen/AMDGPU/ds_read2.ll
The file was modified llvm/test/CodeGen/AMDGPU/ds_write2.ll
Commit e15143d31bca3973db51714af6361f3e77a9e058 by kparzysz
[Hexagon] Implement llvm.masked.load and llvm.masked.store for HVX
The file was modified llvm/lib/Target/Hexagon/HexagonISelLowering.h
The file was modified llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
The file was added llvm/test/CodeGen/Hexagon/autohvx/masked-vmem-basic.ll
The file was modified llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
The file was modified llvm/test/CodeGen/Hexagon/hvx-bitcast-v64i1.ll
The file was modified llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h
The file was modified llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp
The file was modified llvm/lib/Target/Hexagon/HexagonPatternsHVX.td
The file was modified llvm/test/CodeGen/Hexagon/store-vector-pred.ll
Commit 19e883fc59887e98a49ec03557ad2b6bc5537e03 by ctetreau
[SVE] Remove calls to VectorType::getNumElements from clang

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D82582
The file was modified clang/lib/CodeGen/CGAtomic.cpp
The file was modified clang/lib/CodeGen/CGBuiltin.cpp
The file was modified clang/lib/CodeGen/CGExpr.cpp
The file was modified clang/lib/CodeGen/SwiftCallingConv.cpp
The file was modified clang/lib/CodeGen/CGExprScalar.cpp
Commit c971b53b22a5cd43b54bf4773fe3c59ea1b805fb by llvm-project
[Polly] Use llvm::function_ref. NFC.

As suggested by David Blaikie at
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/822584.html
The file was modified polly/lib/Analysis/ScopInfo.cpp
The file was modified polly/include/polly/ScopInfo.h
Commit 6538fff37245921a0983d94c08af7e6cc120b3a9 by llvm-project
[Polly] Inline ShoulDelete lambda. NFC.

As suggested by David Blaikie at
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/822584.html
The file was modified polly/lib/Transform/Simplify.cpp
The file was modified polly/lib/Analysis/ScopInfo.cpp
Commit 476ca330894bf42feeb6c13547d14c821f6b8e0a by Steven Wu
[LTO] Don't apply LTOPostLink module flag during writeMergedModule

`ld64`, which uses the legacy LTOCodeGenerator, relies on
writeMergedModule to perform `ld -r` (generating a linked object file).
If all the inputs to `ld -r` are fullLTO bitcode, `ld64` will link the
bitcode modules, internalize all the symbols, and write out another
fullLTO bitcode object file. This bitcode file doesn't contain all the
bitcode inputs, so it should not have the LTOPostLink module flag. The flag
also causes an error when this bitcode object file is linked with other LTO
object files.
Fix the issue by not applying the LTOPostLink flag in the writeMergedModule
function. The flag should only be added when all the bitcode is linked
and ready to be optimized.

rdar://problem/58462798

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D84789
The file was modified llvm/test/LTO/ARM/lto-linking-metadata.ll
The file was modified llvm/tools/llvm-lto/llvm-lto.cpp
The file was modified llvm/lib/LTO/LTOCodeGenerator.cpp
Commit 61dfa009579f10e75ef110a80c3e5b657ec5867a by francesco.petrogalli
[MC][SVE] Fix data operand for instruction alias of `st1d`.

The version of `st1d` that operates with vector plus immediate
addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for
rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was
generating `<Zn>.s` instead of `<Zn>.d`.

Differential Revision: https://reviews.llvm.org/D86633
The file was modified llvm/test/MC/AArch64/SVE/st1b.s
The file was modified llvm/test/MC/AArch64/SVE/st1h.s
The file was modified llvm/test/MC/AArch64/SVE/st1w.s
The file was modified llvm/test/MC/AArch64/SVE/st1d.s
The file was modified llvm/lib/Target/AArch64/SVEInstrFormats.td
Commit 1446c1801deaf7a38221b45662f2e17fa1d5e8f0 by aeubanks
[gn build] Manually port ed07e1fe
The file was modified llvm/utils/gn/secondary/llvm/include/llvm/Config/BUILD.gn
Commit 098d3f98276de90b6e1468031bd3858615240bb7 by aeubanks
[InstSimplify] Simplify to vector constants when possible

InstSimplify should do all transformations that ConstProp does, but
one thing that ConstProp does that InstSimplify wouldn't is inline
vector instructions that are constants, e.g. into a ret.

Previously vector instructions wouldn't be inlined in InstSimplify
because llvm::Simplify*Instruction() would return nullptr for specific
instructions, such as vector instructions that were actually constants,
if it couldn't simplify them.

This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and
SimplifyShuffleVectorInst to return a vector constant when possible.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D85946
The file was modified llvm/test/Transforms/InstSimplify/vscale.ll
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified llvm/test/Analysis/ConstantFolding/vscale-shufflevector.ll
Commit ea7b1c79f73d8def5d806ae79dea125d146ac864 by echristo
Add cmake test support for LLJITWithThinLTOSummaries to make sure
it's being built and called (and substituted).
The file was modified llvm/test/CMakeLists.txt
The file was modified llvm/test/lit.cfg.py
Commit 603a8a60ba444eb7fc77f0b31dd063a7583df2c4 by ishizaki
[mlir] NFC: fix trivial typos in documents

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D86563
The file was modified mlir/docs/SPIRVToLLVMDialectConversion.md
The file was modified mlir/docs/CAPI.md
The file was modified mlir/docs/Dialects/Linalg.md
The file was modified mlir/docs/Rationale/Rationale.md
The file was modified mlir/docs/OpDefinitions.md
Commit 1596ea80fdf3410f94ef9a2548701d26cc81c2f5 by Andrey.Churbanov
[OpenMP] Fix import library installation with MinGW

Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D86552
The file was modified openmp/runtime/src/CMakeLists.txt
Commit 28fbf422f248fc74681a53208aa2f543a67515ac by jonathanchesterfield
[libomptarget][amdgpu] Update plugin CMake to work with latest rocr library
The file was modified openmp/libomptarget/plugins/amdgpu/CMakeLists.txt
Commit ceffd6993c350b57f43cec3b6371b159fc4a3149 by platonov.aleksandr
[Support][Windows] Fix incorrect GetFinalPathNameByHandleW() return value check in realPathFromHandle()

`GetFinalPathNameByHandleW(,,N,)` returns:
- `< N` on success (this value does not include the size of the terminating null character)
- `>= N` if buffer is too small (this value includes the size of the terminating null character)

So, when `N == Buffer.capacity() - 1`, we need to resize buffer if return value is > `Buffer.capacity() - 2`.
Also, we can set `N` to `Buffer.capacity()`.

Thus, without this patch `realPathFromHandle()` returns unfilled buffer when length of the final path of the file is equal to `Buffer.capacity()` or `Buffer.capacity() - 1`.
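
A small model of the return-value contract above may help. The function `get_final_path_name` below is a hypothetical stand-in that only mimics the documented success and too-small results (it ignores the 0-on-error case), and `real_path` is an illustrative retry loop using the `>= N` check, not the code from `Path.inc`.

```python
from typing import Optional, Tuple


def get_final_path_name(final_path: str, n: int) -> Tuple[int, Optional[str]]:
    """Stand-in for GetFinalPathNameByHandleW(,,N,) return semantics.

    Success:   returns len(path), which is < N (terminating null not counted).
    Too small: returns len(path) + 1, which is >= N (terminating null counted).
    """
    if len(final_path) < n:
        return len(final_path), final_path
    return len(final_path) + 1, None


def real_path(final_path: str, capacity: int = 4) -> str:
    """Retry with a larger buffer whenever the result is >= the size passed in."""
    while True:
        count, text = get_final_path_name(final_path, capacity)
        if count < capacity:
            return text
        # The too-small result already includes room for the terminating null.
        capacity = count


# A path exactly as long as the initial capacity needs one resize.
print(real_path("abcd", capacity=4))
```

The boundary cases named in the message, a path whose length equals the capacity or the capacity minus one, are exactly where a `>` comparison against the wrong bound would return without filling the buffer.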

Reviewed By: andrewng, amccarth

Differential Revision: https://reviews.llvm.org/D86564
The file was modified llvm/lib/Support/Windows/Path.inc
Commit c6c292da910578bdec76616c606da2d79b730667 by ajcbik
[llvm] [Thumb2] Test unusual length for active lane mask

Thumb2 test for the fixed issue with unusual length.

https://bugs.llvm.org/show_bug.cgi?id=47299

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D86646
The file was modified llvm/test/CodeGen/Thumb2/active_lane_mask.ll