[MLIR][LLVM] Move the LLVM inliner interface into a separate file.
A fully fledged LLVM inliner will require a lot of logic. Since
`LLVMDialect.cpp` is large enough as it is, preemptively outline the
inlining logic into a separate `.cpp` file. This will also allow us to
add a `DEBUG_TYPE` for debugging the inliner.
The name `LLVMInlining` was chosen over `LLVMInlinerInterface` to keep
the option open for exposing inlining functionality even when not
invoked through the `DialectInlinerInterface`.
Depends on D146616
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D146628
[ADT] add ConcurrentHashtable class.
ConcurrentHashTable is a resizable concurrent hashtable.
The table can grow by at most a factor of 2^32. Insertion is the only operation that may be performed concurrently.
Concurrent hashtable is necessary for the D96035 patch.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D132455
[X86] LowerVectorAllZero - add 512-bit support with AVX512 vptestnmd+kortestw patterns (REAPPLIED)
Another step toward #53419 - this is also another step towards expanding MatchVectorAllZeroTest to match any pair of vectors and merge EmitAVX512Test into it.
Silence unused variable warning in NDEBUG builds
I usually would fold this into the assert, but the comment there
suggests side effects. NFC.
ModuleMap.cpp:938:9: error: unused variable 'MainFile' [-Werror,-Wunused-variable]
auto *MainFile = SourceMgr.getFileEntryForID(SourceMgr.getMainFileID());
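A minimal sketch of one common way to silence this kind of warning, not necessarily the exact change made in this commit. The names `lookupWithSideEffects` and `checkEntry` are hypothetical stand-ins. The call stays outside the `assert` so its side effects survive NDEBUG builds, and a `(void)` cast marks the variable as used.

```cpp
#include <cassert>

// Hypothetical stand-in for a lookup that also has a side effect.
inline int *lookupWithSideEffects(int &counter, int &storage) {
  ++counter; // side effect that must still happen in NDEBUG builds
  return &storage;
}

inline void checkEntry(int &counter, int &storage) {
  // Keep the call outside the assert so it runs in every build mode.
  int *entry = lookupWithSideEffects(counter, storage);
  assert(entry && "expected a valid entry");
  // Mark the variable as used so -Wunused-variable stays quiet when
  // assert() expands to nothing under NDEBUG.
  (void)entry;
}
```

Folding the call into the `assert` would drop the side effect whenever NDEBUG is defined, which is why the variable is kept.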
Revert "[ADT] add ConcurrentHashtable class."
This reverts commit 8482b238062ed7263facea9490f67119e00a037a.
[AIX][CodeGen] Storage Locations for Constant Pointers
This patch adds an `llc` option `-mroptr` to specify storage locations for constant pointers on AIX.
When the `-mroptr` option is specified, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. Otherwise, by default, pointers, virtual function tables, and virtual type tables are placed in read/write storage.
https://reviews.llvm.org/D144190 enables the `-mroptr` option for `clang`.
Reviewed By: hubert.reinterpretcast, stephenpeckham, myhsu, MaskRay, serge-sans-paille
Differential Revision: https://reviews.llvm.org/D144189
[lldb][AArch64] Fix run-qemu.sh when only MTE is enabled.
SVE and MTE both require a CPU with that feature before
you can use the other options, but we only added the "max"
cpu when SVE was enabled too.
[OpenMP] Add notifyDataUnmapped back in disassociatePtr
Fix regression introduced by https://reviews.llvm.org/D123446
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D146689
Enable constexpr class members that are device-mapped to not be optimized out.
This patch fixes an issue whereby a constexpr class member which is
mapped to the device is being optimized out thus leading to a runtime
[AggressiveInstCombine] Pre-Commit test for D144445 (NFC)
Differential Revision: https://reviews.llvm.org/D145355
[AggressiveInstCombine] folding load for constant global patterned arrays and structs by alignment
Differential Revision: https://reviews.llvm.org/D144445
Reviewed By: nikic
fix: wrong arrow
[MSAN] Support load and stores of scalable vector types
This adds support for scalable vector types - at least far enough to get basic load and store cases working. It turns out that load/store without origin tracking already worked; I apparently got that working with one of the pre patches to use TypeSize utilities and didn't notice. The code changes here are required to enable origin tracking.
For origin tracking, a 4-byte value - the origin - is broadcast into a shadow region whose size exactly matches the type being accessed. This origin is only written if the shadow value is non-zero. The details of how the shadow is computed from the original value being stored aren't relevant for this patch.
The code changes involve two related primitives.
First, we need to be able to perform that broadcast into a scalable-sized memory region. This requires a loop and an appropriate bound. The fixed-size case optimizes with larger stores and alignment; I did not bother with that for the scalable case for now. We can optimize this codepath later if desired.
Second, we need a way to test if the shadow is zero. The mechanism for this in the code is to convert the shadow value into a scalar, and then zero check that. There's an assumption that this scalar is zero exactly when all elements of the shadow value are zero. As a result, we use an OR reduction on the scalable vector. This is analogous to how e.g. an array is handled. I landed a bunch of cleanup changes to remove other direct uses of the scalar conversion to convince myself there were no other undocumented invariants.
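The two primitives can be modelled in plain C++ as follows. This is only an illustrative sketch, not the instrumentation the pass emits: the function names are hypothetical, a `std::vector` whose length is known only at run time stands in for a scalable vector, and the region is modelled as an array of 4-byte slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Collapse the shadow into one scalar with an OR reduction. The result is
// zero exactly when every shadow element is zero.
inline std::uint64_t reduceShadowOr(const std::vector<std::uint64_t> &shadow) {
  std::uint64_t acc = 0;
  for (std::uint64_t lane : shadow)
    acc |= lane;
  return acc;
}

// Broadcast the 4-byte origin into a region whose slot count is only known
// at run time, using a simple loop (no wide-store optimization).
inline void broadcastOrigin(std::uint32_t *region, std::size_t slots,
                            std::uint32_t origin) {
  assert((region != nullptr || slots == 0) && "region must be valid");
  for (std::size_t i = 0; i < slots; ++i)
    region[i] = origin;
}

// Write the origin only if some shadow element is non-zero.
inline bool storeOriginIfPoisoned(const std::vector<std::uint64_t> &shadow,
                                  std::uint32_t origin,
                                  std::vector<std::uint32_t> &region) {
  if (reduceShadowOr(shadow) == 0)
    return false;
  broadcastOrigin(region.data(), region.size(), origin);
  return true;
}
```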
Differential Revision: https://reviews.llvm.org/D146157
[Clang] Fix evaluation of parameters of lambda call operator attributes
Fix a regression introduced by D124351.
Attributes of lambda call operator were evaluated in the
context of the closure object type rather than its operator,
causing an assertion failure.
This was because we temporarily switch to the class lambda to
produce the mangling of the lambda, but we stayed in that
context too long.
Reviewed By: eandrews, aaron.ballman
Differential Revision: https://reviews.llvm.org/D146535
[AArch64] Add Missing Custom Target Operands
I noticed, when examining the generated Asm Matcher table, that some of
these custom immediate operands are missing, and so we are not parsing
some hint aliases into the correct MCInst.
Where this becomes apparent is when you parse e.g. `hint #7` into an
MCInst - without these cases, it becomes the MCInst `(HINT 17)`, which
will always be printed as `hint #17`. With these cases, it becomes the
MCInst `XPACLRI`, which will be printed as `xpaclri` with pauth, or
`hint #17` without, matching how `xpaclri` is parsed.
We only handle some specific hint aliases in this manner, usually where
these hints have specific effects that need to be modelled for accurate
code-generation. Otherwise, we just use the normal `InstAlias` system
to have the aliases parsed into a `(HINT N)` MCInst.
Differential Revision: https://reviews.llvm.org/D146630
[HWASAN] Disable unexpected_format_specifier_test because HWASAN doesn't provide a printf interceptor
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D146647
[HWASAN] Instrument scalable load/store without crashing
We can simply push them down the existing call slowpath with some minor changes to how we compute the size argument.
[OpenMP][OMPIRBuilder] Make OffloadEntriesInfoManager a member of OpenMPIRBuilder
This patch adds the OffloadEntriesInfoManager to the OpenMPIRBuilder, and
allows the OffloadEntriesInfoManager to access the Configuration in the
OpenMPIRBuilder. With the shared Config there is no risk for inconsistencies,
and there is no longer the need for clang to have a separate
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D146549
by Felipe de Azevedo Piovezan
[lldb] Explicitly set libcxx paths when USE_SYSTEM_STDLIB is provided
For tests marked as "USE_SYSTEM_STDLIB", the expectation is that the
system's standard library should be used. However, the implementation of
this flag is such that we simply don't pass _any_ libcxx-related flags
to Clang; in turn, Clang will use its defaults.
For a Clang/Libcxx pair compiled together, Clang defaults to:
1. The headers of the sibling libcxx.
2. The libraries of the system.
This mismatch is actually a bug in the driver; once fixed, however, (2)
would point to the sibling libcxx as well, which is _not_ what test
authors intended with the USE_SYSTEM_STDLIB flag.
As such, this patch explicitly sets a path to the system's libraries.
This change is made only on Apple platforms so that we can first verify
that it works there.
Differential Revision: https://reviews.llvm.org/D146714
[BoundsChecking] Don't crash on scalable vector sizes
[MergeFuncs] Add tests for D144682 (NFC)
I forgot to git add this test when committing the change.
[X86] LowerVectorAllZero - lower to CMP(MOVMSK(NOT(X)),0) instead of CMP(MOVMSK(X),65535)
In most cases the NOT will still be scalarized, but it allows us to perform the CMP(X,0) combines inside combineCMP()
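The equivalence of the two forms can be checked with a portable scalar model. This is a sketch under stated assumptions: `Vec16`, `movmsk`, and the other helpers are hypothetical emulations of a 16-lane byte vector, not the SelectionDAG code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<std::uint8_t, 16>;

// Emulates MOVMSK on 16 byte lanes: bit i of the result is the top bit of
// lane i.
inline std::uint32_t movmsk(const Vec16 &v) {
  std::uint32_t mask = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    mask |= static_cast<std::uint32_t>(v[i] >> 7) << i;
  return mask;
}

// Lane-wise bitwise NOT.
inline Vec16 bitwiseNot(const Vec16 &v) {
  Vec16 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = static_cast<std::uint8_t>(~v[i]);
  return r;
}

// Old form: CMP(MOVMSK(X), 65535).
inline bool allSignBitsSetOld(const Vec16 &x) { return movmsk(x) == 65535u; }

// New form: CMP(MOVMSK(NOT(X)), 0). Comparing against zero is what exposes
// the CMP(X,0) combines mentioned above.
inline bool allSignBitsSetNew(const Vec16 &x) {
  bool result = movmsk(bitwiseNot(x)) == 0u;
  assert(result == allSignBitsSetOld(x) && "both forms must agree");
  return result;
}
```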
[MemProf] Use stable_sort to avoid non-determinism
Switch from std::sort to std::stable_sort when sorting callsites to
avoid non-determinism when the comparisons are equal. This showed up in
internal testing of fe27495be2040007c7b20844a9371b06156ab405.
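A small sketch of why the switch matters. The `Callsite` struct and `sortCallsites` are hypothetical and are not the MemProf data structures. With `std::sort`, elements that compare equal may end up in any relative order; `std::stable_sort` keeps their input order, so the result is deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record: several callsites may share the same weight.
struct Callsite {
  unsigned Weight;
  std::string Name;
};

// Sort by descending weight. Ties keep their original relative order
// because std::stable_sort is used instead of std::sort.
inline void sortCallsites(std::vector<Callsite> &sites) {
  auto byWeightDesc = [](const Callsite &a, const Callsite &b) {
    return a.Weight > b.Weight;
  };
  std::stable_sort(sites.begin(), sites.end(), byWeightDesc);
  assert(std::is_sorted(sites.begin(), sites.end(), byWeightDesc));
}
```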
[clangd] Add provider info on symbol hover.
Differential Revision: https://reviews.llvm.org/D144976
[libc] Implement memory fences on NVPTX
Memory fences are not handled by the NVPTX backend. We need to replace
them with a memory barrier intrinsic function. This doesn't include the
ordering, but should perform the necessary functionality, albeit slower.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D146725
[NFC][AArch64] Sort Hints in armv8.3a-signed-pointer.s test
[libc] Fix inline assembly for nvptx quick_exit
The `exit` function in NVPTX has no intrinsic, but the assembly requires
a semicolon in the ptx, otherwise it will fail.
[MergeFunc] Don't assume constant metadata operands
We should not call mdconst::extract, unless we know that the
metadata in question is ConstantAsMetadata.
For now we consider all other metadata as equal. The noalias test
shows that this is not correct, but at least it doesn't crash.
[flang] Lowering fir.dispatch in the polymorphic op pass
Differential revision: https://reviews.llvm.org/D146594
[ArgPromotion] Remove dead code produced by removing dead arguments
ArgPromotion currently produces phantom / dead loads. A good example of this is store-into-inself.ll. First, ArgPromo finds the promotable argument %p in @l. Then it inserts a load of %p in the caller, passes the loaded value instead, and transforms the function body accordingly. PromoteMem2Reg is able to optimize out the entire function body, resulting in an unused argument. A subsequent ArgPromotion pass removes the dead argument, leaving a dead load in the caller. These dead loads may reduce the effectiveness of other transformations (e.g. SimplifyCFG, MergedLoadStoreMotion).
This patch removes loads and geps that are made dead in the caller after removal of dead args.
Differential Revision: https://reviews.llvm.org/D146327
[libc] enable printf using system FILE
The printf and fprintf implementations use our internal implementation
to improve performance when it's available, but this patch enables using
the public FILE API for overlay mode.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D146001
[libc] Fix some math conversion warnings
Differential Revision: https://reviews.llvm.org/D146738
[docs] Document -fomit-frame-pointer
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D146603