Commit
413054400d949ddd15e9bfdcb587502ea0311fcf
by jasonliu: [XCOFF][AIX] Support relocation generation for large code model
Summary: Support the TOCU and TOCL relocation types for object file generation.
Reviewed by: DiggerLin
Differential Revision: https://reviews.llvm.org/D84549
|
 | llvm/test/CodeGen/PowerPC/aix-xcoff-reloc-large.ll |
 | llvm/lib/Target/PowerPC/MCTargetDesc/PPCXCOFFObjectWriter.cpp |
 | llvm/lib/MC/XCOFFObjectWriter.cpp |
Commit
34b289b6dbcf1cdb328ab0a13cdedf96701394af
by Steven Wu: [ThinLTO][Legacy] Compute PreservedGUID based on IRName in Symtab
Instead of computing the GUID based on assumptions about the symbol mangling rule from IRName to symbol name, look up the IRName in the symtabs of all the input files; if a matching symbol entry is found, its IRName is used for the GUID computation.
rdar://65853754
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D84803
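A minimal sketch of the idea, assuming a simplified view of a symtab entry (the `SymtabEntry` struct and `preservedGUID` helper are illustrative, not the actual LTO API); the real change walks the symbol tables of all input files:

```cpp
#include <optional>
#include <string>
#include <vector>

#include "llvm/ADT/StringRef.h"
#include "llvm/IR/GlobalValue.h"

// Illustrative, simplified symtab entry: the linker-visible symbol name and
// the IR name it was produced from.
struct SymtabEntry {
  std::string SymbolName;
  std::string IRName;
};

// For a symbol the linker wants preserved, find a matching entry in the input
// symtabs and compute the GUID from its IR name instead of guessing how the
// IR name was mangled.
static std::optional<llvm::GlobalValue::GUID>
preservedGUID(llvm::StringRef PreservedName,
              const std::vector<SymtabEntry> &AllEntries) {
  for (const SymtabEntry &E : AllEntries)
    if (PreservedName == E.SymbolName)
      return llvm::GlobalValue::getGUID(E.IRName);
  return std::nullopt;
}
```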
|
 | llvm/test/ThinLTO/X86/mangled_symbol.ll |
 | llvm/lib/LTO/ThinLTOCodeGenerator.cpp |
 | llvm/test/ThinLTO/X86/internalize.ll |
 | llvm/test/ThinLTO/X86/weak_resolution_single.ll |
 | llvm/test/ThinLTO/X86/weak_resolution.ll |
Commit
72305a08ffcb2da10a33732adfaa8757ba70904f
by ajcbik: [llvm] [DAG] Fix bug in llvm.get.active.lane.mask lowering
Previously, the lowering of this intrinsic only accepted proper machine vector lengths. This change fixes that and adds unit tests.
https://bugs.llvm.org/show_bug.cgi?id=47299
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D86585
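As a reference for what is being lowered, here is a scalar model of the intrinsic's semantics (a sketch, not the SelectionDAG code): lane i of `llvm.get.active.lane.mask(%base, %n)` is active iff `base + i < n` (unsigned), and the fix handles vector lengths that are not legal machine vector lengths:

```cpp
#include <cstdint>
#include <vector>

// Scalar reference model of llvm.get.active.lane.mask for NumLanes lanes.
std::vector<bool> activeLaneMask(uint64_t Base, uint64_t TripCount,
                                 unsigned NumLanes) {
  std::vector<bool> Mask(NumLanes);
  for (unsigned I = 0; I != NumLanes; ++I)
    Mask[I] = Base + I < TripCount; // unsigned compare, as the LangRef defines
  return Mask;
}
```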
|
 | llvm/test/CodeGen/X86/pr47299.ll |
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp |
Commit
9061eb8245cc1ae25e8f7865062d1d8e44406994
by resistor: Revert "Fix frame pointer layout on AArch64 Linux."
This broke stage2 of clang-cmake-aarch64-full.
This reverts commit a0aed80b22d1b698b86e0c16109fdfd4d592756f.
|
 | llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h |
 | llvm/lib/Target/AArch64/AArch64FrameLowering.cpp |
Commit
8bfe46dce22266e596370eac86b1aae799300e7e
by lebedev.ri: [NFC][InstCombine] Add tests with PHI-of-{insert,extract}value with multiple uses
It is fine if the operation has multiple uses, as long as they are all in this very PHI node.
|
 | llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll |
 | llvm/test/Transforms/InstCombine/phi-of-insertvalues.ll |
Commit
c07a430bd39cccb64712ddcba85254a5bb1cd89b
by lebedev.ri: [NFC][Value] Fixup comments, "N users" is NOT the same as "N uses".
In those cases, it really means "N uses".
|
 | llvm/include/llvm/IR/Value.h |
Commit
95848ea101274b8bd774c63bad55f21a08080705
by lebedev.ri: [Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks
As the FIXME said, these really should be checking for a single user, not a single use, so let's do that. It is not *that* unusual to have the same value as an incoming value in a PHI node more than once, not unlike how a PHI may have the same incoming basic block more than once.
There isn't a nice way to do that: Value::users() isn't uniquified, and Value only tracks its uses, not its users, so the check is potentially costly, since it may involve traversing the entire use list of a value.
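A minimal sketch of what a one-user check involves (the helper below is hypothetical, not necessarily what this commit added): since Value::users() yields one entry per use, distinct users have to be collected into a set:

```cpp
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/Value.h"

// True if V has exactly N distinct users. A single user (e.g. a PHI that has
// V as the incoming value on several edges) contributes several uses but only
// one user, so duplicates collapse in the set.
static bool hasNDistinctUsers(const llvm::Value &V, unsigned N) {
  llvm::SmallPtrSet<const llvm::User *, 8> Users;
  for (const llvm::User *U : V.users())
    Users.insert(U);
  return Users.size() == N;
}
```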
|
 | llvm/lib/IR/Value.cpp |
 | llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp |
 | llvm/test/Transforms/PGOProfile/chr.ll |
 | llvm/include/llvm/IR/Value.h |
 | llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll |
 | llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll |
 | llvm/test/Transforms/InstCombine/phi-of-insertvalues.ll |
 | llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll |
Commit
eed0af6179ca4fe9e60121e0829ed8d3849b1ce5
by adamcz: [clang] Exclude invalid destructors from lookups.
This fixes a crash when a destructor is declared with the wrong name, the result is written to a PCH file, and the PCH is loaded again. The PCH storage uses DeclarationNameKey as its key, and it is the same key for both the invalid destructor and the implicit one that was created because the other was invalid. When querying for Foo::~Foo we end up getting Foo::~Bar, which is then rejected, and we end up with a nullptr result from CXXRecordDecl::getDestructor().
Fixes https://bugs.llvm.org/show_bug.cgi?id=47270
Differential Revision: https://reviews.llvm.org/D86624
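A hypothetical reduction of the kind of input involved (intentionally ill-formed; the real test lives in clang/test/PCH/cxx-invalid-destructor.h):

```cpp
// The destructor is declared with the wrong name, so it is marked invalid,
// and an implicit destructor with the same DeclarationNameKey is created
// alongside it.
struct Foo {
  ~Bar(); // error: expected the class name after '~' to name the enclosing class
};
// Before this fix, serializing such a class to a PCH and loading it again
// could make the lookup of Foo::~Foo return the invalid declaration, leaving
// CXXRecordDecl::getDestructor() with a null result and crashing the caller.
```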
|
 | clang/test/PCH/cxx-invalid-destructor.cpp |
 | clang/lib/AST/DeclBase.cpp |
 | clang/test/PCH/cxx-invalid-destructor.h |
Commit
09288bcbf5f124a2b0e24a6b1f2d27b66dba9adf
by craig.topper: [X86] Add assembler support for .d32 and .d8 mnemonic suffixes to control displacement size.
This is an older syntax than the {disp32} and {disp8} pseudo prefixes that were added a few weeks ago. We can reuse most of the support for that to support .d32 and .d8 as well.
|
 | llvm/test/MC/X86/x86-32.s |
 | llvm/test/MC/X86/x86-64.s |
 | llvm/docs/ReleaseNotes.rst |
 | llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp |
Commit
684b43c0cfb1092a65c237b39d0662bfe0a2c97a
by aqjune: [IR] Add NoUndef attribute to Intrinsics.td
This patch adds NoUndef to Intrinsics.td. The attribute is attached to llvm.assume's operand, because llvm.assume(undef) is UB. It is attached to pointer operands of several memory accessing intrinsics as well.
This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check unnecessary, so it is removed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86576
|
 | llvm/lib/Analysis/ValueTracking.cpp |
 | llvm/test/Transforms/EarlyCSE/invariant.start.ll |
 | llvm/utils/TableGen/CodeGenTarget.cpp |
 | llvm/include/llvm/IR/Intrinsics.td |
 | llvm/utils/TableGen/IntrinsicEmitter.cpp |
 | mlir/test/Target/llvmir-intrinsics.mlir |
 | llvm/utils/TableGen/CodeGenIntrinsics.h |
Commit
c67ccf5fafc8c035f152ce30115bbdacf23530d5
by wmi: [SampleFDO] Enhance profile remapping support for searching inline instance and indirect call promotion candidate.
Profile remapping is a feature that matches a function in the module with its profile in the sample profile when the function name and the name in the profile look different but are equivalent under the given remapping rules. It is useful for keeping performance stable by specifying remapping rules when SampleFDO targets go through a large-scale function signature change.
However, profile remapping currently only works for outline function profiles in SampleFDO. It cannot match a callee with an inline instance profile if they have different but equivalent names. We found that without support for inline instance profiles, remapping is less effective for some large-scale changes.
To add that support, before any remapping lookup happens, all the names in the profile are inserted into the remapper, and the Key-to-name mapping is recorded in a map called NameMap in the remapper. During name lookup, a Key is returned for the given name and used to extract an equivalent name in the profile from NameMap. With the help of the NameMap, we can translate any given name to an equivalent name in the profile if one exists. Whenever we try to match a name in the module to a name in the profile, we try the match with the original name first; if it doesn't match, we retry the match with the equivalent name obtained from the remapper. In this way, the patch enhances profile remapping support for searching inline instances and indirect call promotion candidates.
In a planned large-scale change of the int64 type (long long) to int64_t (long), we found the performance of a Google-internal benchmark degraded by 2% if nothing was done. With existing profile remapping enabled, the degradation dropped to 1.2%. With the profile remapping in the current patch enabled, the degradation further dropped to 0.14%. (Note that the experiment was done before support for searching indirect call promotion candidates was added; we hope that with it the degradation can drop to 0% in the end, which will be evaluated post-commit.)
Differential Revision: https://reviews.llvm.org/D86332
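A self-contained sketch of the NameMap lookup scheme described above (the `ProfileNameIndex` class and the trivial canonicalization rule are illustrative stand-ins, not the actual SampleProfileReader API):

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Stand-in for the remapper key: canonicalize a name under the remapping
// rules. Here the "rule" is deliberately trivial (drop anything after the
// first '.'), purely for illustration.
static std::string remapKey(const std::string &Name) {
  std::string::size_type Dot = Name.find('.');
  return Dot == std::string::npos ? Name : Name.substr(0, Dot);
}

class ProfileNameIndex {
  // Key -> name as it appears in the profile (the NameMap described above).
  std::unordered_map<std::string, std::string> NameMap;

public:
  // All profile names are inserted before any lookup happens.
  void addProfileName(const std::string &ProfileName) {
    NameMap.emplace(remapKey(ProfileName), ProfileName);
  }

  // Translate a name from the module into the equivalent profile name, if any.
  std::optional<std::string>
  equivalentName(const std::string &ModuleName) const {
    auto It = NameMap.find(remapKey(ModuleName));
    if (It == NameMap.end())
      return std::nullopt;
    return It->second;
  }
};
```

Matching then tries the original module name first and, on failure, retries with `equivalentName(ModuleName)`, which is what makes inline instance and indirect call promotion lookups remappable.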
|
 | llvm/include/llvm/ProfileData/SampleProfReader.h |
 | llvm/lib/Transforms/IPO/SampleProfile.cpp |
 | llvm/unittests/ProfileData/SampleProfTest.cpp |
 | llvm/include/llvm/ProfileData/SampleProf.h |
 | llvm/test/Transforms/SampleProfile/remap-2.ll |
 | llvm/lib/ProfileData/SampleProfReader.cpp |
 | llvm/test/Transforms/SampleProfile/Inputs/remap-2.prof |
 | llvm/lib/ProfileData/SampleProf.cpp |
Commit
f78687df9b790b4f4177a72cbd25b49d14c437b4
by Matthew.Arsenault: AMDGPU: Don't assert on misaligned DS read2/write2 offsets
This would assert with unaligned DS access enabled. The offset may not be aligned. Theoretically the pattern predicate should check the memory alignment, although it is possible to have the memory be aligned but not the immediate offset.
In this case I would expect it to use ds_{read|write}_b64 with unaligned access, but am not clear if there's a reason it doesn't.
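A hedged sketch of the encoding constraint behind the assert (a hypothetical helper, not the actual AMDGPUISelDAGToDAG code): ds_read2/ds_write2 carry two 8-bit offsets scaled by the element size, so a byte offset is only encodable when it is element-aligned and the scaled value fits in 8 bits:

```cpp
#include <cstdint>

// True if both byte offsets fit the ds_read2/ds_write2 encoding, whose two
// offset fields are 8 bits wide and scaled by EltSize bytes (4 for b32,
// 8 for b64).
static bool canEncodeDS2Offsets(uint64_t Offset0, uint64_t Offset1,
                                uint64_t EltSize) {
  auto Encodable = [EltSize](uint64_t Off) {
    return Off % EltSize == 0 && Off / EltSize <= 255;
  };
  return Encodable(Offset0) && Encodable(Offset1);
}
```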
|
 | llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp |
 | llvm/test/CodeGen/AMDGPU/ds_write2.ll |
 | llvm/test/CodeGen/AMDGPU/ds_read2.ll |
Commit
e15143d31bca3973db51714af6361f3e77a9e058
by kparzysz: [Hexagon] Implement llvm.masked.load and llvm.masked.store for HVX
|
 | llvm/test/CodeGen/Hexagon/store-vector-pred.ll |
 | llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h |
 | llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp |
 | llvm/lib/Target/Hexagon/HexagonPatternsHVX.td |
 | llvm/lib/Target/Hexagon/HexagonISelLowering.h |
 | llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp |
 | llvm/test/CodeGen/Hexagon/autohvx/masked-vmem-basic.ll |
 | llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp |
 | llvm/test/CodeGen/Hexagon/hvx-bitcast-v64i1.ll |
Commit
19e883fc59887e98a49ec03557ad2b6bc5537e03
by ctetreau: [SVE] Remove calls to VectorType::getNumElements from clang
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D82582
|
 | clang/lib/CodeGen/CGAtomic.cpp |
 | clang/lib/CodeGen/CGBuiltin.cpp |
 | clang/lib/CodeGen/CGExprScalar.cpp |
 | clang/lib/CodeGen/SwiftCallingConv.cpp |
 | clang/lib/CodeGen/CGExpr.cpp |
Commit
c971b53b22a5cd43b54bf4773fe3c59ea1b805fb
by llvm-project: [Polly] Use llvm::function_ref. NFC.
As suggested by David Blaikie at https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/822584.html
|
 | polly/lib/Analysis/ScopInfo.cpp |
 | polly/include/polly/ScopInfo.h |
Commit
6538fff37245921a0983d94c08af7e6cc120b3a9
by llvm-project: [Polly] Inline ShouldDelete lambda. NFC.
As suggested by David Blaikie at https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200824/822584.html
|
 | polly/lib/Analysis/ScopInfo.cpp |
 | polly/lib/Transform/Simplify.cpp |
Commit
476ca330894bf42feeb6c13547d14c821f6b8e0a
by Steven Wu: [LTO] Don't apply LTOPostLink module flag during writeMergedModule
`ld64`, which uses the legacy LTOCodeGenerator, relies on writeMergedModule to perform `ld -r` (generating a linked object file). If all the inputs to `ld -r` are fullLTO bitcode, `ld64` links the bitcode modules, internalizes all the symbols, and writes out another fullLTO bitcode object file. This bitcode file doesn't include all the bitcode inputs, so it should not carry the LTOPostLink module flag; the flag also causes an error when this bitcode object file is later linked with other LTO object files. Fix the issue by not applying the LTOPostLink flag in writeMergedModule. The flag should only be added when all the bitcode has been linked and is ready to be optimized.
rdar://problem/58462798
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D84789
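A minimal sketch of the intent using only the public Module flag API (a hypothetical helper, not the exact LTOCodeGenerator change):

```cpp
#include "llvm/IR/Module.h"

// Attach the flag only when the merged module is about to be optimized, not
// in writeMergedModule(), so a merged-but-unoptimized bitcode file produced
// for `ld -r` never carries the LTOPostLink marker.
static void markLTOPostLink(llvm::Module &M) {
  if (!M.getModuleFlag("LTOPostLink"))
    M.addModuleFlag(llvm::Module::Error, "LTOPostLink", 1);
}
```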
|
 | llvm/test/LTO/ARM/lto-linking-metadata.ll |
 | llvm/tools/llvm-lto/llvm-lto.cpp |
 | llvm/lib/LTO/LTOCodeGenerator.cpp |
Commit
61dfa009579f10e75ef110a80c3e5b657ec5867a
by francesco.petrogalli: [MC][SVE] Fix data operand for instruction alias of `st1d`.
The version of `st1d` that operates with the vector-plus-immediate addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was generating `<Zn>.s` instead of `<Zn>.d`.
Differential Revision: https://reviews.llvm.org/D86633
|
 | llvm/test/MC/AArch64/SVE/st1w.s |
 | llvm/lib/Target/AArch64/SVEInstrFormats.td |
 | llvm/test/MC/AArch64/SVE/st1h.s |
 | llvm/test/MC/AArch64/SVE/st1b.s |
 | llvm/test/MC/AArch64/SVE/st1d.s |
Commit
1446c1801deaf7a38221b45662f2e17fa1d5e8f0
by aeubanks: [gn build] Manually port ed07e1fe
|
 | llvm/utils/gn/secondary/llvm/include/llvm/Config/BUILD.gn |
Commit
098d3f98276de90b6e1468031bd3858615240bb7
by aeubanks: [InstSimplify] Simplify to vector constants when possible
InstSimplify should do all transformations that ConstProp does, but one thing that ConstProp does and InstSimplify didn't is to inline vector instructions that are constants, e.g. into a ret.
Previously, vector instructions wouldn't be inlined by InstSimplify because llvm::Simplify*Instruction() would return nullptr for instructions it couldn't simplify, such as vector instructions that were actually constants.
This changes SimplifyInsertElementInst, SimplifyExtractElementInst, and SimplifyShuffleVectorInst to return a vector constant when possible.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85946
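A hedged sketch of the kind of fold this enables, using the constant-expression API available at the time (the actual change is inside llvm::SimplifyInsertElementInst and the related extract/shuffle simplifications):

```cpp
#include "llvm/IR/Constants.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"

// If every operand of an insertelement is already a constant, hand the whole
// operation to the constant folder and return the resulting vector constant
// instead of giving up with nullptr.
static llvm::Value *simplifyInsertElementSketch(llvm::Value *Vec,
                                                llvm::Value *Elt,
                                                llvm::Value *Idx) {
  auto *CVec = llvm::dyn_cast<llvm::Constant>(Vec);
  auto *CElt = llvm::dyn_cast<llvm::Constant>(Elt);
  auto *CIdx = llvm::dyn_cast<llvm::Constant>(Idx);
  if (CVec && CElt && CIdx)
    return llvm::ConstantExpr::getInsertElement(CVec, CElt, CIdx);
  return nullptr; // not simplified
}
```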
|
 | llvm/lib/Analysis/InstructionSimplify.cpp |
 | llvm/test/Transforms/InstSimplify/vscale.ll |
 | llvm/test/Analysis/ConstantFolding/vscale-shufflevector.ll |
Commit
ea7b1c79f73d8def5d806ae79dea125d146ac864
by echristo: Add cmake test support for LLJITWithThinLTOSummaries to make sure it's being built and called (and substituted).
|
 | llvm/test/lit.cfg.py |
 | llvm/test/CMakeLists.txt |
Commit
603a8a60ba444eb7fc77f0b31dd063a7583df2c4
by ishizaki: [mlir] NFC: fix trivial typos in documents
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D86563
|
 | mlir/docs/CAPI.md |
 | mlir/docs/Rationale/Rationale.md |
 | mlir/docs/SPIRVToLLVMDialectConversion.md |
 | mlir/docs/Dialects/Linalg.md |
 | mlir/docs/OpDefinitions.md |
Commit
1596ea80fdf3410f94ef9a2548701d26cc81c2f5
by Andrey.Churbanov: [OpenMP] Fix import library installation with MinGW
Patch by mati865@gmail.com
Differential Revision: https://reviews.llvm.org/D86552
|
 | openmp/runtime/src/CMakeLists.txt |
Commit
28fbf422f248fc74681a53208aa2f543a67515ac
by jonathanchesterfield: [libomptarget][amdgpu] Update plugin CMake to work with latest rocr library
|
 | openmp/libomptarget/plugins/amdgpu/CMakeLists.txt |
Commit
ceffd6993c350b57f43cec3b6371b159fc4a3149
by platonov.aleksandr: [Support][Windows] Fix incorrect GetFinalPathNameByHandleW() return value check in realPathFromHandle()
`GetFinalPathNameByHandleW(,,N,)` returns:
 - `< N` on success (this value does not include the size of the terminating null character)
 - `>= N` if the buffer is too small (this value includes the size of the terminating null character)
So, when `N == Buffer.capacity() - 1`, we need to resize the buffer if the return value is > `Buffer.capacity() - 2`. Also, we can simply set `N` to `Buffer.capacity()`.
Thus, without this patch `realPathFromHandle()` returns an unfilled buffer when the length of the final path of the file is equal to `Buffer.capacity()` or `Buffer.capacity() - 1`.
Reviewed By: andrewng, amccarth
Differential Revision: https://reviews.llvm.org/D86564
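A hedged standalone sketch of a retry loop that respects this contract (not the actual Path.inc code):

```cpp
#include <string>
#include <vector>
#include <windows.h>

// Resolve the final path for a handle, growing the buffer when the API
// reports it was too small.
static std::wstring finalPathFromHandle(HANDLE H) {
  std::vector<wchar_t> Buffer(260);
  for (;;) {
    DWORD Len = GetFinalPathNameByHandleW(H, Buffer.data(),
                                          static_cast<DWORD>(Buffer.size()),
                                          FILE_NAME_NORMALIZED);
    if (Len == 0)
      return std::wstring();   // the call failed
    if (Len < Buffer.size())   // success: Len excludes the terminating null
      return std::wstring(Buffer.data(), Len);
    Buffer.resize(Len);        // too small: Len includes the terminating null
  }
}
```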
|
 | llvm/lib/Support/Windows/Path.inc |
Commit
c6c292da910578bdec76616c606da2d79b730667
by ajcbik: [llvm] [Thumb2] Test unusual length for active lane mask
Adds a Thumb2 test for the previously fixed issue with an unusual vector length.
https://bugs.llvm.org/show_bug.cgi?id=47299
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D86646
|
 | llvm/test/CodeGen/Thumb2/active_lane_mask.ll |
Commit
54a5dd485c4d04d142a58c9349ada0c897cbeae6
by spatel: [DAGCombiner] allow store merging non-i8 truncated ops
We have a gap in our store merging capabilities for shift+truncate patterns as discussed in: https://llvm.org/PR46662
I generalized the code/comments for this function in earlier commits, so we only need to ease the type restriction and adjust the address/endian checking to make this work.
AArch64 lets us switch endianness to make sure that the patterns are matched either way.
Differential Revision: https://reviews.llvm.org/D86420
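For illustration, the kind of source pattern that now merges (a sketch, not taken from the tests): adjacent stores of truncated shifts of one wider value:

```cpp
#include <cstdint>

// Stores the low and high 16-bit halves of a 32-bit value to adjacent
// locations; after this change the backend can merge them into a single
// 32-bit store (subject to the address and endianness checks).
void storeHalves(uint16_t *P, uint32_t V) {
  P[0] = static_cast<uint16_t>(V);       // trunc
  P[1] = static_cast<uint16_t>(V >> 16); // shift + trunc
}
```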
|
 | llvm/test/CodeGen/AArch64/merge-trunc-store.ll |
 | llvm/test/CodeGen/X86/stores-merging.ll |
 | llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |
Commit
9936455204fd6ab72715cc9d67385ddc93e072ed
by resistor: Reapply D70800: Fix AArch64 AAPCS frame record chain
Original Commit Message: After commit r368987 (rG643adb55769e) landed, the frame record (FP and LR registers) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating-point registers. This breaks stack unwinders that simply walk through the frame records (relying on the guarantee from the AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.
Patch By: logan
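A conceptual sketch of the walk that the AAPCS64 frame-record guarantee enables (not LLVM code), which is why the record's position in the frame matters:

```cpp
#include <cstdint>

// Layout guaranteed by the AAPCS64 "The Frame Pointer" section: the frame
// pointer points at a pair of {caller's saved FP, saved return address}.
struct FrameRecord {
  const FrameRecord *PrevFP;
  std::uintptr_t LR;
};

// Collect up to MaxFrames return addresses by simply chasing the FP chain.
inline int walkFrameRecords(const FrameRecord *FP, std::uintptr_t *Out,
                            int MaxFrames) {
  int N = 0;
  while (FP && N < MaxFrames) {
    Out[N++] = FP->LR;
    FP = FP->PrevFP;
  }
  return N;
}
```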
|
 | llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h |
 | llvm/test/CodeGen/AArch64/framelayout-frame-record.mir |
 | llvm/test/CodeGen/AArch64/framelayout-fp-csr.ll |
 | llvm/lib/Target/AArch64/AArch64FrameLowering.cpp |