Commit
e4dee7e7309a060bd8dd3c9df0a708157fc935d4
by kareem.ergawy[MLIR][SPIRV] Properly (de-)serialize BranchConditionalOp.
Implements proper (de-)serialization logic for BranchConditionalOp when such ops have true/false target operands.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D101602
|
 | mlir/test/Target/SPIRV/phi.mlir |
 | mlir/lib/Target/SPIRV/Serialization/Serializer.cpp |
 | mlir/lib/Target/SPIRV/Deserialization/Deserializer.cpp |
 | mlir/lib/Target/SPIRV/Deserialization/Deserializer.h |
Commit
1ccebb18ef9f4110e555209261d73dbec393e364
by Amara Emerson[GlobalISel] Micro-optimize the conditional branch optimization.
Convert a check into an assert and pass an MI instead of recomputing in the apply function.
|
 | llvm/include/llvm/Target/GlobalISel/Combine.td |
 | llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp |
 | llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h |
Commit
9deb7eeaf76c3285b72ce75d30fcade63b96e2dc
by czhengsz[Debug-Info][NFC] add a wrapper for Die.addValue
Add a new wrapper function, addAttribute(), for Die.addValue(), so that attribute handling can be controlled through a single interface.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101125
|
 | llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp |
 | llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h |
 | llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp |
Commit
911a541620bcc78e637589b8623d94b8f3cdafba
by guopeilin1[LazyValueInfo] Insert an Overdefined placeholder to prevent infinite recursion
getValueFromCondition() uses a Visited set to record intermediate values. However, it computes each value in postorder and only updates the Visited set afterwards. It therefore recurses infinitely when the IR contains a use that is not dominated by its def, as in this example:
%tmp3 = or i1 undef, %tmp4
%tmp4 = or i1 undef, %tmp3
To prevent this, we can insert an Overdefined placeholder into the set before computing the actual value.
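The placeholder idea can be sketched outside LLVM as follows. This is a minimal illustration, not the LazyValueInfo code: the function name `value_from_condition`, the `defs` dictionary, and the node kinds ("const", "unknown", "or") are all hypothetical stand-ins chosen for the example.

```python
OVERDEFINED = "overdefined"


def value_from_condition(name, defs, visited=None):
    """Evaluate `name` in `defs`, a hypothetical map from a value name to
    ("const", bool), ("unknown",) or ("or", lhs_name, rhs_name)."""
    if visited is None:
        visited = {}
    if name in visited:
        return visited[name]
    # Insert the placeholder *before* recursing, so that a cyclic use
    # finds an entry in `visited` instead of recursing forever.
    visited[name] = OVERDEFINED
    node = defs[name]
    if node[0] == "const":
        result = node[1]
    elif node[0] == "unknown":
        result = OVERDEFINED
    else:
        _, lhs, rhs = node
        a = value_from_condition(lhs, defs, visited)
        b = value_from_condition(rhs, defs, visited)
        if a is True or b is True:
            result = True
        elif a is False and b is False:
            result = False
        else:
            result = OVERDEFINED
    # Replace the placeholder with the computed value.
    visited[name] = result
    return result
```

With the cyclic pair from the example above modelled as two "or" nodes that refer to each other, the call terminates and reports the value as overdefined rather than overflowing the stack.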
Reviewed by: nikic
Differential Revision: https://reviews.llvm.org/D101273
|
 | llvm/lib/Analysis/LazyValueInfo.cpp |
 | llvm/test/Transforms/JumpThreading/insert-placeholder-to-prevent-infinite-recursion.ll |
Commit
dafbfb1d1d8e01beac3704aea4e8df45260a6310
by martin[libcxx] Fix a case of -Wundef warnings. NFC.
Differential Revision: https://reviews.llvm.org/D101978
|
 | libcxx/src/locale.cpp |
Commit
d2b2ad32b76989b68e7b525e7484e25b0f0cc4e6
by james.henderson[lit][test] Attempt fix when paths include symlink
Example of failure: https://lab.llvm.org/staging/#/builders/126/builds/345/steps/5/logs/FAIL__lit___use-tool-search-env_py
|
 | llvm/utils/lit/tests/Inputs/use-tool-search-env/lit.cfg |
Commit
cf06c8eee3a5ac6172e77abe5d7547554e6a6620
by caroline.concatto[LoopVectorize][SVE] Remove assert for scalable vector in InnerLoopVectorizer::fixReduction
The function fixReduction used to assert/crash for scalable vectors when a vector reduction could be done with a smaller vector. This patch removes the assertion, as it is safe to use scalable vectors for vector reduce and truncate.
Differential Revision: https://reviews.llvm.org/D101260
|
 | llvm/test/Transforms/LoopVectorize/scalable-reduction-inloop.ll |
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
778487a221496e92795afab147c3a030c74ad356
by diana.picus[flang] Add tests for MIN for character arrays. NFC
We used to test only scalar character types. This commit adds tests for arrays with a few simple shapes.
Differential Revision: https://reviews.llvm.org/D101983
|
 | flang/unittests/RuntimeGTest/CharacterTest.cpp |
Commit
2ea36e94927ccbc1f8e915a4e5c932531e69f02d
by diana.picus[flang] Remove redundant reallocation
The MaxMinHelper used to implement MIN and MAX for character types would reallocate the accumulator whenever the number of characters in it was different from that in the other input. This is unnecessary if the accumulator is already larger than the other input. This patch fixes the issue and adds a unit test to make sure we don't reallocate if we don't need to.
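The change in reallocation policy can be illustrated with a small sketch. It is not the flang runtime code: the `Accumulator` class, its `reallocations` counter, and `accumulate_min` are hypothetical names, and blank padding of the shorter operand is assumed for the comparison.

```python
class Accumulator:
    """Hypothetical stand-in for the character accumulator."""

    def __init__(self, initial):
        self.buffer = bytearray(initial)
        self.reallocations = 0

    def ensure_capacity(self, needed):
        # Reallocate only when the buffer is too small, not whenever
        # the two lengths merely differ.
        if len(self.buffer) < needed:
            grown = bytearray(b" " * needed)
            grown[: len(self.buffer)] = self.buffer
            self.buffer = grown
            self.reallocations += 1


def accumulate_min(acc, other):
    """Keep the smaller of the accumulator and `other` (blank-padded)."""
    acc.ensure_capacity(len(other))
    padded = other.ljust(len(acc.buffer))  # bytes.ljust pads with spaces
    if padded < bytes(acc.buffer):
        acc.buffer[:] = padded
```

A shorter input reuses the existing buffer; only an input longer than the accumulator triggers a reallocation.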
Differential Revision: https://reviews.llvm.org/D101984
|
 | flang/runtime/character.cpp |
 | flang/unittests/RuntimeGTest/CharacterTest.cpp |
Commit
98e5ede60499f255c2cd48b85dcda14af5b99c7d
by sebastian.neubauer[AMDGPU] Serialize MFInfo::ScavengeFI
Serialize ScavengeFI from SIMachineFunctionInfo into yaml.
ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything.
Differential Revision: https://reviews.llvm.org/D101367
|
 | llvm/test/CodeGen/MIR/AMDGPU/invalid-frame-index-invalid-fixed-stack.mir |
 | llvm/test/CodeGen/MIR/AMDGPU/invalid-frame-index.mir |
 | llvm/lib/CodeGen/CMakeLists.txt |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |
 | llvm/lib/CodeGen/MIRYamlMapping.cpp |
 | llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll |
 | llvm/test/CodeGen/MIR/AMDGPU/invalid-frame-index-no-stack.mir |
 | llvm/include/llvm/CodeGen/MIRYamlMapping.h |
 | llvm/test/CodeGen/MIR/AMDGPU/invalid-frame-index2.mir |
 | llvm/test/CodeGen/MIR/AMDGPU/invalid-frame-index-invalid-stack.mir |
 | llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir |
 | llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h |
 | llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp |
Commit
8894a4b5d70a2fee8c35e2e66597fec24bc15770
by llvmgnsyncbot[gn build] Port 98e5ede60499
|
 | llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn |
Commit
f87638338464e7ff9396e92e04e3f5702d479d39
by thatlemon[AsmParser][ARM] Make .thumb_func imply .thumb
The GNU as documentation states that a `.thumb_func` directive implies `.thumb`, so teach the asm parser to switch mode whenever it encounters one. The labeled form, which is exclusive to Apple's toolchain, does not switch mode at all.
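The rule can be sketched as a toy mode tracker. This is an illustration only, not ARMAsmParser: `track_modes` is a hypothetical helper that recognizes just `.arm`, `.thumb` and `.thumb_func`, and treats every other line as an instruction.

```python
def track_modes(lines, initial_mode="arm"):
    """Return (instruction, mode) pairs for a list of assembly lines."""
    mode = initial_mode
    result = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        head = tokens[0]
        if head == ".arm":
            mode = "arm"
        elif head == ".thumb":
            mode = "thumb"
        elif head == ".thumb_func":
            # The unlabeled form implies .thumb; the labeled form
            # (".thumb_func sym") leaves the current mode unchanged.
            if len(tokens) == 1:
                mode = "thumb"
        else:
            result.append((line, mode))
    return result
```

For example, `track_modes([".arm", ".thumb_func", "nop"])` records the `nop` in Thumb mode, while the labeled variant leaves it in ARM mode.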
Reviewed By: nickdesaulniers, peter.smith
Differential Revision: https://reviews.llvm.org/D101975
|
 | llvm/test/MC/ARM/thumb_func-implies-thumb.s |
 | llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp |
 | lld/test/ELF/arm-ldrlit-err.s |
Commit
eb1b26ec1d1ac60b2207354fcd003cad40e12b76
by gchatelet[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions
Differential Revision: https://reviews.llvm.org/D102056
|
 | llvm/include/llvm/CodeGen/TargetFrameLowering.h |
 | llvm/include/llvm/IR/InstrTypes.h |
Commit
e805b7c2d63c1f8b74f228718a55536f54ddd1c0
by gchatelet[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen
Differential Revision: https://reviews.llvm.org/D102058
|
 | llvm/include/llvm/CodeGen/MachineMemOperand.h |
 | llvm/include/llvm/CodeGen/SelectionDAGNodes.h |
 | llvm/include/llvm/CodeGen/MachineFrameInfo.h |
 | llvm/lib/CodeGen/MachineOperand.cpp |
Commit
f0762fc42f0f4ecf849bef42eed2bb4c0785ea67
by gbreynoo[llvm-dwarfdump] Help option output should be consistent with the command guide
The dwarfdump command guide shows the short options used as aliases, but these do not appear in the help text unless --show-hidden is used. Some other tools follow this pattern; others, such as llvm-objdump, show aliases with --help. This change makes the help output consistent with the command guide, which includes updating alias descriptions in the help output to use "--".
As part of this change I updated cmdline.test, including some options that were missing testing.
Differential Revision: https://reviews.llvm.org/D101646
|
 | llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp |
 | llvm/test/tools/llvm-dwarfdump/cmdline.test |
Commit
0791f968fee259e5c34523167bd58179b8b081c2
by stephen.tozer[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST
This patch modifies updateDbgUsersToReg to properly handle DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices (i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and updating the register for all matching operands.
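The difference between a hard-coded operand index and updating every matching operand can be sketched as follows. The data model is hypothetical: each debug instruction is a dictionary with a `debug_operands` list of register names, which is not how MachineInstr represents them.

```python
def update_dbg_users_to_reg(users, old_reg, new_reg):
    """Rewrite every debug operand equal to `old_reg`, not just operand 0."""
    for instr in users:
        instr["debug_operands"] = [
            new_reg if op == old_reg else op for op in instr["debug_operands"]
        ]
```

A single-operand DBG_VALUE behaves as before, while a DBG_VALUE_LIST that refers to the same register in several positions has all of them updated.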
Differential Revision: https://reviews.llvm.org/D101523
|
 | llvm/lib/CodeGen/MachineCopyPropagation.cpp |
 | llvm/include/llvm/CodeGen/MachineRegisterInfo.h |
 | llvm/test/DebugInfo/ARM/machine-cp-updates-dbg-reg.mir |
Commit
227678089cf6d8b15d51e58abfefd4f346e9c7f0
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s |
Commit
7059b28d5d276cab89815b762d10431329a7da2a
by lebedev.ri[X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle
I've verified this with llvm-exegesis. This is not limited to zero registers.
Refs: AMD SOG 19h, 2.9.4 Zero Cycle Move: The processor is able to execute certain register to register mov operations with zero cycle delay.
Agner, 22.13 Instructions with no latency: Register-to-register move instructions are resolved at the register rename stage without using any execution units. These instructions have zero latency. It is possible to do six such register renamings per clock cycle, and it is even possible to rename the same register multiple times in one clock cycle.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s |
Commit
bda9ca3e44c1b67d1c4ed145bb7071c340fe8961
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves
In Zen 3, MMX moves are *not* eliminated; I've verified this with llvm-exegesis.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s |
Commit
442de0c1adf36bfddb5fb66b442bba8999fa733b
by david.stuttardAMDGPU: Correct const_index_stride for wave 32 for PAL ABI
Since there is a single scratch resource descriptor for all shaders, if there is a wave32 and a wave64 shader (for instance for VsFs pairs) then the const_index_stride will be incorrect for wave32 shaders.
Differential Revision: https://reviews.llvm.org/D101830
Change-Id: Id8de5566b0d1a07a814e2e7db016df9d20bf6d2c
|
 | llvm/lib/Target/AMDGPU/SIFrameLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll |
Commit
f372ff17f74f99f5e1c021a9c919b33c4caf38d9
by olemarius.strohm[NFC] (test commit) Changed example invocation of C++ for OpenCL
|
 | clang/docs/OpenCLSupport.rst |
Commit
8e42024f79997827cefe00d31cd3bc55d1551fec
by llvm-dev[X86] Ensure we pass DebugLoc by const reference where possible. NFCI.
Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
|
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
Commit
2a3f60b5f5304f61cab3654a6afb67b79ca7df86
by llvm-dev[SLP] Regenerate tests to reduce diff in D98714. NFCI.
|
 | llvm/test/Transforms/SLPVectorizer/X86/pr44067.ll |
 | llvm/test/Transforms/SLPVectorizer/vectorizable-functions-inseltpoison.ll |
 | llvm/test/Transforms/SLPVectorizer/vectorizable-functions.ll |
Commit
793b4b26039e461dc3142a3f667ba7c97b0ed920
by david.stuttardRevert "AMDGPU: Correct const_index_stride for wave 32 for PAL ABI"
This reverts commit 442de0c1adf36bfddb5fb66b442bba8999fa733b.
|
 | llvm/lib/Target/AMDGPU/SIFrameLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll |
Commit
280aa3415e408cacc520274fdb948ec9fc63865a
by llvm-dev[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts
Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.
Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.
I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).
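As a worked illustration of the idea, the sketch below expands a double-width left shift over two 32-bit halves using a funnel shift plus a select on the high bit of the amount. It is an assumption-laden model in Python, not the TargetLowering code: `fshl`, `shl_parts`, and the 32-bit part width are chosen for the example.

```python
BITS = 32
MASK = (1 << BITS) - 1


def fshl(hi, lo, amount):
    """Funnel shift left: the top BITS bits of (hi:lo) << (amount mod BITS)."""
    amount %= BITS
    if amount == 0:
        return hi
    return ((hi << amount) | (lo >> (BITS - amount))) & MASK


def shl_parts(lo, hi, amount):
    """Shift the 2*BITS-wide value hi:lo left by `amount` (0 <= amount < 2*BITS).

    Returns (new_lo, new_hi).
    """
    small = amount % BITS
    lo_shifted = (lo << small) & MASK
    hi_funnel = fshl(hi, lo, small)
    # If the amount is at least BITS, the low half moves entirely into
    # the high half and the low half becomes zero.
    if amount & BITS:
        return 0, lo_shifted
    return lo_shifted, hi_funnel
```

The same pattern, mirrored with a funnel shift right, covers the logical and arithmetic right-shift variants; only the fill value for the vacated half differs.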
NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.
Differential Revision: https://reviews.llvm.org/D101987
|
 | llvm/lib/Target/AMDGPU/R600ISelLowering.h |
 | llvm/test/CodeGen/AMDGPU/fp_to_sint.ll |
 | llvm/test/CodeGen/AMDGPU/srl.ll |
 | llvm/test/CodeGen/AMDGPU/fp_to_uint.ll |
 | llvm/include/llvm/CodeGen/TargetLowering.h |
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/shl.ll |
 | llvm/lib/Target/AMDGPU/R600ISelLowering.cpp |
 | llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/sra.ll |
 | llvm/test/CodeGen/AArch64/arm64-long-shift.ll |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.h |
Commit
ce0c1f3ced9bccb29c34b87de82c5cdffcbcd457
by stephen.tozer[DebugInfo] Fix crash when emitting an invalidated SDDbgValue
This patch fixes a crash in the compiler that occurs when certain invalidated SDDbgValues are emitted. The cause of this was that we would attempt to check the liveness of the debug value's operands, which triggers an assert if any of those operands are invalid. This patch changes this check such that it only occurs if the SDDbgValue is valid; if not, the check is irrelevant anyway, so can be safely ignored.
Differential Revision: https://reviews.llvm.org/D101540
|
 | llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp |
 | llvm/test/DebugInfo/Generic/invalidated-dbg-value-is-undef.ll |
Commit
d9f2960c932c9803e662098e33d899efa3c67f44
by joachim[NFC] Correctly assert the indents for printEnumValHelpStr.
Only verify that there's no negative indent. Noted by @chapuni in https://reviews.llvm.org/D93494.
Reviewed By: chapuni
Differential Revision: https://reviews.llvm.org/D102021
|
 | llvm/lib/Support/CommandLine.cpp |
Commit
76f1de10f43ec4d1eb6146c45ccd6f93df5aa3e1
by anastasia.stulova[OpenCL] Fix optional image types.
This change allows the use of identifiers for image types from `cl_khr_gl_msaa_sharing` freely in the kernel code if the extension is not supported since they are not in the list of the reserved identifiers.
This change also removes the need for a pragma for the types in the extensions, since the spec does not require the pragma to be used.
Differential Revision: https://reviews.llvm.org/D100983
|
 | clang/include/clang/Basic/OpenCLImageTypes.def |
 | clang/lib/Parse/ParseDecl.cpp |
 | clang/lib/Sema/Sema.cpp |
 | clang/lib/Sema/SemaType.cpp |
 | clang/test/SemaOpenCL/invalid-image.cl |
 | clang/test/SemaOpenCL/access-qualifier.cl |
Commit
dfe3ffaa4a47ea93cc289b4496c093fbaf73adbc
by malhar.jajoo[ARM] Transforming memset to Tail predicated Loop
This patch converts the llvm.memset intrinsic into tail-predicated (TP) hardware loops for targets that support the Arm M-profile Vector Extension (MVE).
The llvm.memset is converted to a TP loop for both constant and non-constant input sizes.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D100435
|
 | llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll |
 | llvm/test/CodeGen/Thumb2/mve-phireg.ll |
 | llvm/test/CodeGen/Thumb2/mve-tp-loop.ll |
 | llvm/lib/Target/ARM/ARMSubtarget.h |
 | llvm/test/CodeGen/Thumb2/mve-tp-loop.mir |
 | llvm/lib/Target/ARM/ARMInstrMVE.td |
 | llvm/lib/Target/ARM/ARMISelLowering.cpp |
 | llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll |
 | llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp |
 | llvm/lib/Target/ARM/ARMISelLowering.h |
Commit
14818a86d044909d8eeb1f39f689e2785a09823b
by stephen.tozerFix: [DebugInfo] Fix crash when emitting an invalidated SDDbgValue
This patch is a fix for revision ce0c1f3c, which caused test failures on bots without x86 as a registered target. This patch moves the test added in the prior patch to the x86 folder, so that it only runs on bots with the correct target available.
|
 | llvm/test/DebugInfo/Generic/invalidated-dbg-value-is-undef.ll |
 | llvm/test/DebugInfo/X86/invalidated-dbg-value-is-undef.ll |
Commit
606d4e806192013ff7da33351f671d08b4524438
by david.stuttardAMDGPU: Correct const_index_stride for wave 32 for PAL ABI
Retrying after revert and fix (removed implicit def flag from operand). Now passes with expensive_checks enabled.
Since there is a single scratch resource descriptor for all shaders, if there is a wave32 and a wave64 shader (for instance for VsFs pairs) then the const_index_stride will be incorrect for wave32 shaders.
Differential Revision: https://reviews.llvm.org/D101830
Change-Id: Ie3b8b2921237968caca91527dd0c97b1b0cc0360
|
 | llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll |
 | llvm/lib/Target/AMDGPU/SIFrameLowering.cpp |
Commit
13c0316239dc31a34262f2270d0952aa152a9a76
by sebastian.neubauer[AMDGPU] Restrict immediate scratch offsets
gfx9 does not work with negative offsets; gfx10 works with aligned negative offsets, but not with unaligned negative offsets.
This is slightly more conservative than needed: gfx9 does support negative offsets when a VGPR address is used, and gfx10 supports negative, unaligned offsets when an SGPR address is used, but this patch does not make use of that.
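The conservative rule can be written down as a small predicate. This is a sketch under stated assumptions, not the SIInstrInfo check: the function name is hypothetical, the 4-byte `alignment` default is an assumption (the message does not state the required alignment), and the encodable offset range is ignored entirely.

```python
def is_legal_scratch_offset(offset, generation, alignment=4):
    """Conservative legality check for an immediate scratch offset.

    `alignment` is an assumed value; the commit message does not give one.
    """
    if offset >= 0:
        return True
    if generation == "gfx9":
        # No negative offsets at all.
        return False
    if generation == "gfx10":
        # Negative offsets only when aligned.
        return offset % alignment == 0
    raise ValueError("generation not covered by this sketch: " + generation)
```

The VGPR-address and SGPR-address exceptions mentioned above are deliberately left out, matching the conservative behaviour described.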
Differential Revision: https://reviews.llvm.org/D101292
|
 | llvm/lib/Target/AMDGPU/AMDGPU.td |
 | llvm/lib/Target/AMDGPU/GCNSubtarget.h |
 | llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll |
 | llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp |
 | llvm/lib/Target/AMDGPU/SIInstrInfo.cpp |
 | llvm/test/CodeGen/AMDGPU/flat-scratch.ll |
Commit
6248d1119040d5031b248633005998b94b8024d4
by benny.kraRetire TargetRegisterInfo::getSpillAlignment
getSpillAlign does the same thing.
|
 | llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp |
 | llvm/lib/Target/Hexagon/HexagonISelLowering.cpp |
 | llvm/lib/CodeGen/PrologEpilogInserter.cpp |
 | llvm/include/llvm/CodeGen/TargetRegisterInfo.h |
Commit
dd21c6b843b25d2d65daab561fe47b4157c32952
by llvm-dev[DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI.
Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
|
 | llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h |
Commit
c9d4b4173b56c5a56d32d07be660f872b9746f87
by llvm-dev[CodeGen] Ensure UserValue::getDebugLoc() and UserLabel::getDebugLoc() consistently return a const reference NFCI.
Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
|
 | llvm/lib/CodeGen/LiveDebugVariables.cpp |
Commit
7bc1dd1191aba77da83f04415ee646cc3381729e
by stephen.tozerReapply "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"
Reapply b623df3c, which was reverted while reverting a different patch with a breaking change. There are no underlying issues with this patch, so no changes have been made to the original patch.
This reverts commit b11e4c990771541e440861f017afea7b4ba162f4.
|
 | llvm/test/DebugInfo/X86/live-debug-vars-loc-limit.ll |
 | llvm/lib/CodeGen/LiveDebugVariables.cpp |
Commit
8935c8449b7b17049990d29443ed29dde315f281
by arthur.j.odwyer[libc++] [test] Test that list::swap/move/move-assign does not invalidate iterators.
And remove the dedicated debug-iterator test; we want to test this in all modes. We have a CI step for testing the whole test suite with `--debug_level=1` now.
Part of https://reviews.llvm.org/D102003
|
 | libcxx/test/libcxx/containers/sequences/list/list.cons/db_move.pass.cpp |
 | libcxx/test/std/containers/sequences/list/list.cons/move.pass.cpp |
 | libcxx/test/std/containers/sequences/list/list.cons/assign_move.pass.cpp |
 | libcxx/test/std/containers/sequences/list/list.special/swap.pass.cpp |
Commit
a1f75bf091a20132dc44828a2a9a68d559f922f3
by arthur.j.odwyer[libc++] [test] Simplify arithmetic in list.special/swap.pass.cpp. NFCI.
Part of https://reviews.llvm.org/D102003
|
 | libcxx/test/std/containers/sequences/list/list.special/swap.pass.cpp |
Commit
f42355e17c3f3d1d099d028a388796a64724ffdb
by arthur.j.odwyer[libc++] [test] Test that unordered_*::swap/move/assign does not invalidate iterators.
And remove the dedicated debug-iterator tests; we want to test this in all modes. We have a CI step for testing the whole test suite with `--debug_level=1` now.
Part of https://reviews.llvm.org/D102003
|
 | libcxx/test/std/containers/unord/unord.set/unord.set.cnstr/move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multimap/unord.multimap.swap/swap_non_member.pass.cpp |
 | libcxx/test/std/containers/unord/unord.set/unord.set.cnstr/assign_move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multimap/unord.multimap.cnstr/move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.map/unord.map.cnstr/move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.map/unord.map.swap/swap_non_member.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multimap/unord.multimap.cnstr/assign_move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.map/unord.map.cnstr/assign_move.pass.cpp |
 | libcxx/test/libcxx/containers/unord/unord.multimap/db_move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.set/unord.set.swap/swap_non_member.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multiset/unord.multiset.swap/swap_non_member.pass.cpp |
 | libcxx/test/libcxx/containers/unord/unord.map/db_move.pass.cpp |
 | libcxx/test/libcxx/containers/unord/unord.set/db_move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multiset/unord.multiset.cnstr/assign_move.pass.cpp |
 | libcxx/test/libcxx/containers/unord/unord.multiset/db_move.pass.cpp |
 | libcxx/test/std/containers/unord/unord.multiset/unord.multiset.cnstr/move.pass.cpp |
Commit
e6d688ec96706c1bbcb27419333828ec61752fab
by lebedev.ri[NFC][X86][MCA] Increase iteration count in reg move elimination tests
So the IPC actually stabilizes at 6.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s |
Commit
c3cd8ed0097b07e5454255ffe5899ded21ca0bff
by lebedev.ri[NFC][X86] AMD Zen 3: move sched classes for renameable moves together
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
d8c6202576771f0e1478b3abdd246600caf7d704
by lebedev.ri[X86] AMD Zen 3: throughput for renameable GPR moves is 6
They are resolved at the register rename stage without using any execution units.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s |
Commit
cbabe4f4d62a6bcee206e0673de559805a092420
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: Add tests for renameable SSE XMM moves
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s |
Commit
bcbfc22ff9b2f16d77489b0ce34e8d96e4f9ae5b
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX XMM moves
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s |
Commit
0d961fbd525cb7df3e981d6469b81cbf8f5e5883
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX YMM moves
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s |
Commit
9db4203883f57f34e7e88fd6deb761ef8a9f7d5a
by lebedev.ri[X86] AMD Zen 3: SSE XMM moves are zero-cycle
I've verified this with llvm-exegesis. This is not limited to zero registers.
Refs: AMD SOG 19h, 2.9.4 Zero Cycle Move: The processor is able to execute certain register to register mov operations with zero cycle delay.
Agner, 22.13 Instructions with no latency: Register-to-register move instructions are resolved at the register rename stage without using any execution units. These instructions have zero latency. It is possible to do six such register renamings per clock cycle, and it is even possible to rename the same register multiple times in one clock cycle.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s |
Commit
ee020b930d1299acf42b759dd15a44d2020ef963
by lebedev.ri[X86] AMD Zen 3: AVX XMM moves are zero-cycle
I've verified this with llvm-exegesis. This is not limited to zero registers.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s |
Commit
715c0d0bd412141e0404d5bfcad4dddac3bfc0d0
by lebedev.ri[X86] AMD Zen 3: AVX YMM moves are zero-cycle
I've verified this with llvm-exegesis. This is not limited to zero registers.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s |
Commit
758c173309edbd6ac3958eb08dc01b6524badff8
by lebedev.ri[X86] AMD Zen 3: throughput for renameable XMM/YMM moves is 6
They are resolved at the register rename stage without using any execution units.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/resources-avx1.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/resources-sse2.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/resources-sse1.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s |
Commit
34de155f7e335e9e69276356565dcc31ed7d8535
by lebedev.ri[NFC][X86][MCA] AMD Zen3 Decrease iteration count in reg-move-elimination tests
Drop it just enough so it still produces the right IPC.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s |
Commit
25bbff632d018d178272a61c0732203d53d3a2e3
by saghir[PowerPC] Provide MMA builtins for compatibility
Vector pair intrinsics and builtins were renamed in https://reviews.llvm.org/D91974 to replace the _mma_ prefix by _vsx_. However, some projects used the _mma_ version, so this patch adds these intrinsics to provide compatibility.
Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=50159
Reviewed By: nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D100482
|
 | clang/test/CodeGen/builtins-ppc-pair-mma.c |
 | clang/include/clang/Basic/BuiltinsPPC.def |
 | clang/lib/CodeGen/CGBuiltin.cpp |
 | clang/lib/Sema/SemaChecking.cpp |
Commit
faab8c140ab2480d978ccc3ea11cbc3b279736b6
by tpopp[mlir] Rename BufferAliasAnalysis to BufferViewFlowAnalysis
This is to make the difference between this and an AliasAnalysis clearer.
For example, given a sequence of subviews that create values A -> B -> C -> D:
BufferViewFlowAnalysis::resolve(B) => {B, C, D}
AliasAnalysis::resolve(B) => {A, B, C, D}
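The distinction in the example can be reproduced with a toy graph model. The sketch below is not the MLIR implementation: `view_flow_resolve` and `alias_resolve` are hypothetical helpers, the first following only the direction in which views are derived and the second following edges in both directions.

```python
from collections import deque


def _reachable(start, neighbours):
    """Breadth-first search over a dict of adjacency lists."""
    seen = {start}
    queue = deque([start])
    while queue:
        value = queue.popleft()
        for nxt in neighbours.get(value, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def view_flow_resolve(edges, value):
    """Values derived from `value`, following source -> view edges only."""
    forward = {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
    return _reachable(value, forward)


def alias_resolve(edges, value):
    """Values connected to `value` in either direction."""
    both = {}
    for src, dst in edges:
        both.setdefault(src, []).append(dst)
        both.setdefault(dst, []).append(src)
    return _reachable(value, both)
```

With `edges = [("A", "B"), ("B", "C"), ("C", "D")]`, the forward-only query from B yields {B, C, D}, while the bidirectional query also includes A.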
Differential Revision: https://reviews.llvm.org/D100838
|
 | mlir/lib/Analysis/BufferAliasAnalysis.cpp |
 | mlir/include/mlir/Transforms/Bufferize.h |
 | mlir/lib/Analysis/CMakeLists.txt |
 | mlir/lib/Analysis/BufferViewFlowAnalysis.cpp |
 | mlir/include/mlir/Analysis/BufferAliasAnalysis.h |
 | mlir/lib/Transforms/BufferOptimizations.cpp |
 | mlir/include/mlir/Analysis/BufferViewFlowAnalysis.h |
 | mlir/include/mlir/Transforms/BufferUtils.h |
 | mlir/lib/Transforms/BufferDeallocation.cpp |
Commit
f31531a30b124042d8523b7d50053ade82659c5b
by gysit[mlir][linalg] Remove redundant indexOp builder.
Remove the builder signature taking a signed dimension identifier.
Reviewed By: ergawy
Differential Revision: https://reviews.llvm.org/D102055
|
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td |
 | mlir/lib/Dialect/Linalg/Transforms/Interchange.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |
Commit
a15f8589f4e81973b096a5ccc7b5b687c3284ebe
by huberjn[libomptarget] Add support for target memory allocators to cuda RTL
Summary: The allocator interface added in D97883 allows the RTL to allocate shared and host-pinned memory from the cuda plugin. This patch adds support for these to the runtime.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D102000
|
 | openmp/libomptarget/test/api/omp_host_pinned_memory.c |
 | openmp/libomptarget/plugins/cuda/src/rtl.cpp |
 | openmp/libomptarget/test/api/omp_device_managed_memory.c |
 | openmp/libomptarget/plugins/common/MemoryManager/MemoryManager.h |
Commit
0a6f11aabdd3f116b603694a0d4f9abbba62ade4
by spatel[AArch64] add test for missed vectorization; NFC
This is a reduction of the example in: https://llvm.org/PR50256
|
 | llvm/test/Transforms/SLPVectorizer/AArch64/widen.ll |
Commit
bc302bfbef84bd778a9e5e0a1b5851c6a55c1d9c
by jotremBasicAA: Recognize inttoptr as isEscapeSource
Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object.
Reviewed By: nikic, nlopes, aqjune
Differential Revision: https://reviews.llvm.org/D101541
|
 | llvm/lib/Analysis/BasicAliasAnalysis.cpp |
 | llvm/test/Analysis/BasicAA/noalias-inttoptr.ll |
Commit
565ee6afc707d5744d0ec90936f0c0564c1acf69
by thomasraoux[mlir][spirv] add support lowering of extract_slice to scalar type
Differential Revision: https://reviews.llvm.org/D102041
|
 | mlir/lib/Conversion/VectorToSPIRV/VectorToSPIRV.cpp |
 | mlir/test/Conversion/VectorToSPIRV/simple.mlir |
Commit
a970e69d6b62d60c4c222e2a4be0a73999c97651
by thomasraoux[mlir][vector] add pattern to cast away leading unit dim for elementwise op
Differential Revision: https://reviews.llvm.org/D102034
|
 | mlir/lib/Dialect/Vector/VectorTransforms.cpp |
 | mlir/test/Dialect/Vector/vector-transforms.mlir |
Commit
70cbc6dbef7048d3b1aa89a676d96c6ba075b41b
by mascasa[libFuzzer] Fix stack overflow detection
Address sanitizer can detect stack exhaustion via its SEGV handler, which is executed on a separate stack using the sigaltstack mechanism. When libFuzzer is used with address sanitizer, it installs its own signal handlers which defer to those put in place by the sanitizer before performing additional actions. In the particular case of a stack overflow, the current setup fails because libFuzzer doesn't preserve the flag for executing the signal handler on a separate stack: when we run out of stack space, the operating system can't run the SEGV handler, so address sanitizer never reports the issue. See the included test for an example.
This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag when installing its signal handlers; the dedicated signal-handler stack set up by the sanitizer runtime appears to be large enough to support the additional frames from the fuzzer.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D101824
|
 | compiler-rt/lib/fuzzer/FuzzerUtilPosix.cpp |
 | compiler-rt/test/fuzzer/StackOverflowTest.cpp |
 | compiler-rt/test/fuzzer/stack-overflow-with-asan.test |
Commit
a8e30e63aca0e9c61f956e61303ae3694cf00f2c
by lebedev.ri[NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-x87.s |
Commit
2819009b5aa9725aebba63e8722e31943a7fb36f
by lebedev.ri[X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cycles (PR50261)
Sometimes disassembler picks _REV variants of instructions over the plain ones, which in this case exposed an issue that the _REV variants aren't being modelled as optimizable moves.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s |
Commit
f744723f7538934e0beb5d8a2267afeb86345986
by llvm-dev[X86] combineXor - limit fold to non-opaque constants (PR50254)
Ensure we don't try to fold when one might be an opaque constant; the constant fold will fail and then the reverse fold will happen in DAGCombine.
|
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
 | llvm/test/CodeGen/X86/pr50254.ll |
Commit
1006ac3963eaf39153d6637b631662e87ebf3b4d
by whitneyt[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect
This patch allows more conditional branches to be considered loop guards, so more loop nests can be considered perfect.
Reviewed By: bmahjour, sidbav
Differential Revision: https://reviews.llvm.org/D94717
|
 | llvm/lib/Analysis/LoopInfo.cpp |
 | llvm/lib/Analysis/LoopNestAnalysis.cpp |
 | llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll |
 | llvm/unittests/Analysis/LoopInfoTest.cpp |
 | llvm/test/Analysis/LoopNestAnalysis/perfectnest.ll |
 | llvm/include/llvm/Analysis/LoopNestAnalysis.h |
Commit
f09414499c4717b66baa9581c641e8a636e5dcc1
by mascasa[libFuzzer] Fix stack-overflow-with-asan.test.
Fix function return type and remove check for SUMMARY, since it doesn't seem to be output in Windows.
|
 | compiler-rt/test/fuzzer/stack-overflow-with-asan.test |
 | compiler-rt/test/fuzzer/StackOverflowTest.cpp |
Commit
6a2850f3fc24cc53da6543ee98bd837007c65725
by i[AArch64][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local
Similar to X86 D73230 & 46788a21f9152be3950e57dc526454655682bdd4
With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode, for default visibility external linkage non-ifunc-non-COMDAT definitions.
For such dso_local definitions, variable access/taking the address of a function/calling a function will go through a local alias to avoid GOT/PLT.
Note: the 'S' inline assembly constraint refers to an absolute symbolic address or a label reference (D46745).
Differential Revision: https://reviews.llvm.org/D101872
|
 | llvm/test/CodeGen/AArch64/basic-pic.ll |
 | llvm/lib/Target/AArch64/AArch64MCInstLower.cpp |
 | llvm/test/CodeGen/AArch64/elf-globals-static.ll |
 | llvm/test/CodeGen/AArch64/elf-preemption.ll |
 | llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp |
 | llvm/test/CodeGen/AArch64/semantic-interposition-asm.ll |
Commit
5b1610a25054b308d02be8882dd34bed3dc29ef4
by lebedev.ri[X86] AMD Zen 3: MOVSX32rr32 is a zero-cycle move
It measures as such, and the reference docs agree.
I can't easily add an MCA test because there is no mnemonic for it; it can only be disassembled or created as an MCInst.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
b8701dc1749e228b886e53bdb32eeebba00e30da
by lebedev.ri[X86] AMD Zen 3: mark XMM/YMM (but not MMX!) reg moves as eliminatible in RegisterFile
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
d319005a3746a7661c8c9a3302266b6ff7cf61be
by Saleem Abdulrasoollit: revert 134b103fc0f3a995d76398bf4b029d72bebe8162
Revert the 32-process cap on Windows. When testing with Swift, we found that testing with the higher load reduced the total time. This should hopefully not matter much in practice. If the original problem with Python remains at a high subprocess count, we can easily revert this change.
|
 | llvm/utils/lit/lit/util.py |
Commit
8002c5d65fdc979fc2f4fa33509f6c32caca3dce
by Louis Dionne[libc++][ci] Run longer CI jobs first
Jobs that test with a more recent standard version run more tests, so they take longer. We'll decrease the average latency by running them first instead of last.
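One way to see why the ordering matters is a small scheduling sketch. It assumes a fixed pool of identical CI agents that pick up jobs in the listed order; the job durations and the `makespan` helper are hypothetical and are not taken from the pipeline file.

```python
import heapq


def makespan(durations, workers):
    """Return the time at which the last job finishes.

    Jobs are started in the given order; each one goes to the agent
    that becomes free first (a simplifying assumption about the CI).
    """
    free_at = [0] * workers
    heapq.heapify(free_at)
    finish = 0
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest available agent
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical durations: one long job and four short ones, two agents.
long_last = makespan([1, 1, 1, 1, 10], workers=2)
long_first = makespan([10, 1, 1, 1, 1], workers=2)
```

In this toy setup the whole run finishes earlier when the long job starts first, because the short jobs fit alongside it instead of delaying its start.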
|
 | libcxx/utils/ci/buildkite-pipeline.yml |
Commit
d8aba75a768033c326613d85e8789703cb4565d2
by iInternalize some cl::opt global variables or move them under namespace llvm
|
 | llvm/lib/Analysis/AliasAnalysis.cpp |
 | llvm/lib/Analysis/BlockFrequencyInfoImpl.cpp |
 | llvm/lib/Passes/PassBuilder.cpp |
 | llvm/lib/Transforms/Utils/SizeOpts.cpp |
 | llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp |
 | llvm/lib/CodeGen/MachineBlockFrequencyInfo.cpp |
 | llvm/lib/Transforms/IPO/BlockExtractor.cpp |
 | llvm/lib/LTO/SummaryBasedOptimizations.cpp |
 | llvm/lib/Analysis/BlockFrequencyInfo.cpp |
 | llvm/lib/Transforms/IPO/PassManagerBuilder.cpp |
 | llvm/include/llvm/Transforms/Utils/SizeOpts.h |
 | llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp |
 | llvm/unittests/Analysis/AssumeBundleQueriesTest.cpp |
 | llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h |
 | llvm/lib/Transforms/IPO/SyntheticCountsPropagation.cpp |
 | llvm/lib/MC/MCAsmInfoXCOFF.cpp |
 | llvm/tools/opt/NewPMDriver.cpp |
 | llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp |
 | llvm/lib/Analysis/CallGraphSCCPass.cpp |
 | llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp |
 | polly/lib/Analysis/ScopDetectionDiagnostic.cpp |
 | llvm/lib/MC/MCAsmInfo.cpp |
 | llvm/lib/CodeGen/MachineBlockPlacement.cpp |
 | llvm/tools/opt/opt.cpp |
 | llvm/lib/CodeGen/MachineBranchProbabilityInfo.cpp |
Commit
50cf0a1d1ae48bd0397b41a400e01c62975b6706
by kparzyszAllow empty value list in propagateMetadata(Inst, ArrayOf...)
This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.
|
 | llvm/lib/Analysis/VectorUtils.cpp |
Commit
724604901a104d8ba9e48ca0330e164a66c1c7ac
by i[unittest] Fix -Wunused-variable after D94717
|
 | llvm/unittests/Analysis/LoopInfoTest.cpp |
Commit
1e9c39a3f982fe2f50cd19c74be8b64dfba4baad
by tlively[WebAssembly] Use functions instead of macros for const SIMD intrinsics
To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness.
The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants.
This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors.
Differential Revision: https://reviews.llvm.org/D102018
|
 | clang/test/Headers/wasm.c |
 | clang/lib/Headers/wasm_simd128.h |
Commit
6c99e631201aaea0a75708749cbaf2ba08a493f9
by flo[SCEV] Be more careful when traversing phis in isImpliedViaMerge.
I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of
Consider the case in exit_cond_depends_on_inner_loop.
At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).
The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the *previous* iteration.
Hence we incorrectly determine that the *previous* value was <= -1, which may not be true.
I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works for irreducible control flow).
So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if it will catch all cases, and I would appreciate a thorough second look in that regard.
Alternatively, we could also unconditionally bail out in this case, instead of checking the incoming values.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101829
|
 | llvm/test/Transforms/IndVarSimplify/eliminate-exit.ll |
 | llvm/test/Transforms/IRCE/decrementing-loop.ll |
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
7ca26c5fa2df253878cab22e1e2f0d6f1b481218
by aeubanksRevert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
This reverts commit 0791f968fee259e5c34523167bd58179b8b081c2.
Causing crashes: https://crbug.com/1206764
|
 | llvm/lib/CodeGen/MachineCopyPropagation.cpp |
 | llvm/test/DebugInfo/ARM/machine-cp-updates-dbg-reg.mir |
 | llvm/include/llvm/CodeGen/MachineRegisterInfo.h |
Commit
21db1e3b01402678994a291930eadf82187750c4
by gysit[mlir][docs] remove stale statement about index type in vectors
b614ada0e8 ("[mlir] add support for index type in vectors.") removed this limitation.
Differential Revision: https://reviews.llvm.org/D102081
|
 | mlir/include/mlir/IR/BuiltinTypes.td |
Commit
a3f22d020b2709b2b4897ae3450c33834e646329
by pifon[mlir] Add a pattern to bufferize linalg.tensor_reshape.
Differential Revision: https://reviews.llvm.org/D102089
|
 | mlir/test/Dialect/Linalg/bufferize.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp |
Commit
3444996b4c45f6efdd731100e8ca6c6105407045
by pifon[mlir] Add a pattern to bufferize std.index_cast.
Differential Revision: https://reviews.llvm.org/D102088
|
 | mlir/test/Dialect/Standard/bufferize.mlir |
 | mlir/lib/Dialect/StandardOps/Transforms/Bufferize.cpp |
Commit
f2f88f3e7a110b2d4d9da446e45f0dba040e62b2
by vyacheslav.p.zakharinAn attempt to abandon omptarget out-of-tree builds.
I want to start using LLVM component libraries in libomptarget to stop duplicating implementations already available in LLVM (e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM in all libomptarget builds, one has to provide a fallback implementation for each LLVM feature used.
This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget.
I understand that I may need to revert this, if this affects downstream projects in a bad way.
Differential Revision: https://reviews.llvm.org/D101509
|
 | openmp/CMakeLists.txt |
 | openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake |
 | openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt |
 | openmp/README.rst |
 | openmp/libomptarget/src/CMakeLists.txt |
Commit
c04c66d705b4f6e95a6325ef6d6c647ebc622165
by kai.wang[RISCV] Consider scalar types for required extensions.
We have vector operations on double vector and float scalar. For example, vfwadd.wf is such an instruction.
vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);
We should specify F and D extensions for it.
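A rough sketch of the idea, not the RISCVVEmitter.cpp implementation: collect the required extensions from the scalar operand types as well as the vector element types. The type names, the `TYPE_TO_EXTENSION` table, and the `required_extensions` helper are assumptions made for illustration.

```python
# Hypothetical mapping from a floating-point type to the extension it needs.
TYPE_TO_EXTENSION = {"float": "F", "double": "D"}


def required_extensions(vector_elem_types, scalar_types):
    """Union of the extensions needed by vector element and scalar types."""
    extensions = set()
    for type_name in list(vector_elem_types) + list(scalar_types):
        extension = TYPE_TO_EXTENSION.get(type_name)
        if extension is not None:
            extensions.add(extension)
    return sorted(extensions)


# vfwadd_wf: double vector operands plus a float scalar operand.
vfwadd_wf_exts = required_extensions(["double"], ["float"])
```

Looking only at the vector element types would yield just D for this intrinsic; including the scalar operand adds F.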
Differential Revision: https://reviews.llvm.org/D102051
|
 | clang/utils/TableGen/RISCVVEmitter.cpp |
Commit
6b00b34b8a05896f79b18a1963811299b83d5b21
by phosek[BareMetal] Ensure that sysroot always comes after library paths
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559 where the sysroot path would be ignored, but after that change linker would now pick up the library from the sysroot which resulted in unexpected behavior.
The sysroot path should always come after any user provided library paths, followed by compiler runtime paths. We want for libraries in user provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets.
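The effect of the ordering can be illustrated with a minimal model of a linker that searches directories in order and takes the first match. The directory contents, the library name, and the `resolve_library` helper are hypothetical; the exact sysroot subdirectory that the driver adds is also an assumption.

```python
def resolve_library(name, search_dirs, contents):
    """Return the first path where `name` is found, searching in order."""
    for directory in search_dirs:
        if name in contents.get(directory, ()):
            return directory + "/" + name
    return None


# Both locations provide a library with the same name.
contents = {
    "path/to/lib": {"libfoo.a"},
    "path/to/sysroot/lib": {"libfoo.a"},
}

# User-provided -L path first, sysroot last: the specialized copy wins.
user_first = resolve_library(
    "libfoo.a", ["path/to/lib", "path/to/sysroot/lib"], contents)

# Sysroot first: the sysroot copy shadows the specialized one.
sysroot_first = resolve_library(
    "libfoo.a", ["path/to/sysroot/lib", "path/to/lib"], contents)
```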
Differential Revision: https://reviews.llvm.org/D102049
|
 | clang/lib/Driver/ToolChains/BareMetal.cpp |
 | clang/test/Driver/baremetal-sysroot.cpp |
Commit
01c78a0b0764e5c254c745a21c35f7950b6c8816
by pklausler[flang] Implement NORM2 in the runtime
Implement the reduction transformational intrinsic function NORM2 in the runtime, using infrastructure already in place for MAXVAL & al.
Differential Revision: https://reviews.llvm.org/D102024
|
 | flang/runtime/reduction.cpp |
 | flang/unittests/RuntimeGTest/Reduction.cpp |
 | flang/runtime/extrema.cpp |
 | flang/runtime/reduction.h |
Commit
01c26d4e048cf9812e7675cb704c2a4461b68e4c
by flo[LV] Rename Region to TargetRegion, similar to SinkRegion (NFC).
Adjust the name to make it clearer this is the region containing the target recipe, similar to SinkRegion below.
Suggested post-commit for ccebf7a1096a.
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
337d7652823f59f4613552cebdf81292bf8f393d
by flo[LV] Assert if trying to sink replicate region into another region (NFC)
Currently sinking a replicate region into another replicate region is not supported. Add an assert, to make the problem more obvious, should it occur.
Discussed post-commit for ccebf7a1096a.
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
c4adc49a1c988e6ea8a340b6245525ef5599812c
by rnk[SEH] Fix regression with SEH in noexcept functions
Commit 5baea0560160a693b19022c5d0ba637b6b46b2d8 set the CurCodeDecl because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField. But this was not the right thing to do, because CodeGenFunction::FinishFunction passes it to EmitEndEHSpec, which corrupts the EHStack.
Revert the part of the commit that changes the CurCodeDecl, and instead adjust the assert to check for a null CurCodeDecl.
Differential Revision: https://reviews.llvm.org/D102027
|
 | clang/lib/CodeGen/CGException.cpp |
 | clang/lib/CodeGen/CGExpr.cpp |
 | clang/test/CodeGenCXX/exceptions-seh.cpp |
Commit
3822ac909ead8f41ebc81e382bb01908bf04f407
by andrea.dibiagio[MCA][RegisterFile] Fix register class check for move elimination (PR50265)
The register file should always check if the destination register is from a register class that allows move elimination.
Before this change, the check on the register class was only performed in a few very specific cases. However, it should have always been performed. This patch fixes the issue.
Note that none of the upstream scheduling models is currently affected by this bug, so there is no test for it. The issue was found by Roman while working on the znver3 model. I was able to reproduce the issue locally by tweaking the btver2 model. I then verified that this patch fixes the issue.
|
 | llvm/lib/MCA/HardwareUnits/RegisterFile.cpp |
Commit
75b9997760c69968863740ded6c89d4faf29ca7f
by flo[LV] Remove reference of PHI from comment, they are not recorded (NFC).
The comment incorrectly states that the PHI is recorded. That's not accurate, only the recipe for the incoming value is recorded.
Suggested post-commit for 4ba8720f8844.
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
f97ada27aaf64207a2ffad937ce3ccf009e81bd8
by phosekRevert "[BareMetal] Ensure that sysroot always comes after library paths"
This reverts commit 6b00b34b8a05896f79b18a1963811299b83d5b21.
|
 | clang/test/Driver/baremetal-sysroot.cpp |
 | clang/lib/Driver/ToolChains/BareMetal.cpp |
Commit
d0453a8933a14c9441b2d89e6f934bd1bc243200
by thomasraoux[mlir][vector] Extend pattern to trim lead unit dimension to Splat Op
Differential Revision: https://reviews.llvm.org/D102091
|
 | mlir/lib/Dialect/Vector/VectorTransforms.cpp |
 | mlir/test/Dialect/Vector/vector-transforms.mlir |
Commit
b90b66bcbe3ec909d386d3d546cd116099619641
by thomasraoux[mlir] Missed clang-format
|
 | mlir/lib/Dialect/Vector/VectorTransforms.cpp |
Commit
d5a70db1938c06380bdab033b7d47a7437914f4c
by thakis[lld/mac] Write every weak symbol only once in the output
Before this, if an inline function was defined in several input files, lld would write each copy of the inline function to the output. With this patch, it only writes one copy.
Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster:
    N        Min        Max     Median        Avg      Stddev
x  10  3.9957051  4.3496981  4.1411121   4.156837  0.10092097
+  10   3.908154   4.169318  3.9712729  3.9846753 0.075773012
Difference at 95.0% confidence
    -0.172162 +/- 0.083847
    -4.14165% +/- 2.01709%
    (Student's t, pooled s = 0.0892373)
Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.)
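The mechanism described above can be sketched in a few lines. This is a simplified model, not lld's code: it only handles weak definitions, keeps whichever copy was seen first (an assumed policy), and ignores the caveat about sections that are referenced from other symbols. The class and function names are hypothetical, apart from the flag, which mirrors "canOmitFromOutput".

```python
from dataclasses import dataclass


@dataclass
class InputSection:
    name: str
    can_omit_from_output: bool = False  # mirrors the "canOmitFromOutput" flag


def add_weak_definition(symtab, symbol_name, section):
    """Record a weak definition and flag the section of the losing copy."""
    if symbol_name in symtab:
        # Assumed policy: the definition already in the table is kept.
        section.can_omit_from_output = True
    else:
        symtab[symbol_name] = section


def sections_to_write(sections):
    """The writer skips every section flagged as omittable."""
    return [s for s in sections if not s.can_omit_from_output]


symtab = {}
first = InputSection("a.o:inline_fn")
second = InputSection("b.o:inline_fn")
add_weak_definition(symtab, "inline_fn", first)
add_weak_definition(symtab, "inline_fn", second)
written = sections_to_write([first, second])
```

Keeping only a boolean on the section matches the choice described in the commit message; a pointer to the surviving copy would carry more information but is not needed to skip the duplicate.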
Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions still not stripped. This is wasteful, but harmless.
- However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size)
- This nopes out on InputSections that are referenced from more than one symbol (e.g. from .alt_entry) for now
Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just found it a bit more explicit)
- exports
Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info).
This patch is useful in itself, but it's likely also a useful foundation for dead_strip.
I used to have a "canonicalRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now.
Differential Revision: https://reviews.llvm.org/D102076
|
 | lld/MachO/InputFiles.cpp |
 | lld/MachO/InputSection.cpp |
 | lld/MachO/InputSection.h |
 | lld/MachO/UnwindInfoSection.cpp |
 | lld/MachO/SymbolTable.cpp |
 | lld/test/MachO/weak-definition-gc.s |
 | lld/MachO/Symbols.h |
 | lld/MachO/Writer.cpp |
 | lld/MachO/MapFile.cpp |
Commit
167906c10932f5eda97b480ee084b17746c362e7
by phosek[BareMetal] Ensure that sysroot always comes after library paths
This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559 where the sysroot path would be ignored, but after that change linker would now pick up the library from the sysroot which resulted in unexpected behavior.
The sysroot path should always come after any user provided library paths, followed by compiler runtime paths. We want for libraries in user provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets.
Differential Revision: https://reviews.llvm.org/D102049
|
 | clang/lib/Driver/ToolChains/BareMetal.cpp |
 | clang/test/Driver/baremetal.cpp |
Commit
c6ddf669dcf379360e97a557e13617435d3c78cc
by Adrian PrantlFix the module-enabled build by removing a redundant type definition.
|
 | llvm/lib/Transforms/IPO/PassManagerBuilder.cpp |
Commit
1312852040b3190a6cb7d7c1f61fe95a5e930d8d
by Jessica Paquette[AArch64][GlobalISel] Legalize narrow type G_CTPOPs
Using `clampScalar` here because we ought to mark s128 as custom eventually.
(Right now, it will just fall back.)
With this legalization, we get the same code as SDAG: https://godbolt.org/z/TneoPKrKG
Differential Revision: https://reviews.llvm.org/D100908
|
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-ctpop.mir |
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir |
Commit
6f7131002b6a821fc9b245ec5179910f171e3358
by aeubanks[NewPM] Move analysis invalidation/clearing logging to instrumentation
We're trying to move DebugLogging into instrumentation, rather than being part of PassManagers/AnalysisManagers.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D102093
|
 | clang/lib/CodeGen/BackendUtil.cpp |
 | llvm/lib/LTO/LTOBackend.cpp |
 | llvm/unittests/IR/PassBuilderCallbacksTest.cpp |
 | llvm/include/llvm/CodeGen/MachinePassManager.h |
 | llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp |
 | llvm/unittests/Analysis/CGSCCPassManagerTest.cpp |
 | llvm/unittests/CodeGen/PassManagerTest.cpp |
 | llvm/include/llvm/IR/PassInstrumentation.h |
 | llvm/unittests/IR/PassManagerTest.cpp |
 | llvm/include/llvm/IR/PassManagerImpl.h |
 | llvm/tools/opt/NewPMDriver.cpp |
 | llvm/test/Transforms/LoopUnroll/unroll-loop-invalidation.ll |
 | llvm/lib/Passes/StandardInstrumentations.cpp |
 | llvm/include/llvm/IR/PassManager.h |
Commit
0ad494838b8576de14144776490faa710fa2a099
by steveireNFC: Move TypeList implementation up the file
This will make it possible for more code to use it.
|
 | clang/include/clang/ASTMatchers/ASTMatchersInternal.h |
Commit
1f65f42dd37ab6a950d3ec110e3efca0ace1b615
by steveireMake `hasTypeLoc` matcher support more node types.
Differential Revision: https://reviews.llvm.org/D101572
|
 | clang/include/clang/ASTMatchers/ASTMatchers.h |
 | clang/unittests/ASTMatchers/ASTMatchersTraversalTest.cpp |
 | clang/docs/LibASTMatchersReference.html |
 | clang/include/clang/ASTMatchers/ASTMatchersInternal.h |
Commit
808bc11d9e1aa01edaf7ec4e56be3aee5ed42a83
by Amara Emerson[GlobalISel] Don't form zero/sign extending loads for atomics.
For importing patterns, we only support matching G_LOAD, not G_ZEXTLOAD or G_SEXTLOAD.
Differential Revision: https://reviews.llvm.org/D101932
|
 | llvm/test/CodeGen/AArch64/GlobalISel/prelegalizercombiner-extending-loads.mir |
 | llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp |
Commit
5b158093e2469dec16a070019c6432d26bf7be9b
by Amara Emerson[AArch64][GlobalISel] Create a new minimal combiner pass just for -O0.
We never bothered to have a separate set of combines for -O0 in the prelegalizer before. This results in some minor performance hits for a mode where performance isn't a concern (although not regressing code size significantly is still preferable).
This also removes the CSE option since we don't need it for -O0.
Through experiments, I've arrived at a set of combines that gets the most code size improvement at -O0, while reducing the amount of time spent in the combiner by around 35% give or take.
Differential Revision: https://reviews.llvm.org/D102038
|
 | llvm/lib/Target/AArch64/GISel/AArch64PreLegalizerCombiner.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll |
 | llvm/lib/Target/AArch64/AArch64TargetMachine.cpp |
 | llvm/lib/Target/AArch64/AArch64.h |
 | llvm/lib/Target/AArch64/GISel/AArch64O0PreLegalizerCombiner.cpp |
 | llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.h |
 | llvm/include/llvm/Target/GlobalISel/Combine.td |
 | llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp |
 | llvm/test/CodeGen/AArch64/O0-pipeline.ll |
 | llvm/lib/Target/AArch64/AArch64Combine.td |
 | llvm/lib/Target/AArch64/CMakeLists.txt |
 | llvm/test/CodeGen/AArch64/combine-loads.ll |
Commit
6aaf06f92988c6e2b91f90ab7ed3a6d21981480a
by thomasraoux[mlir][vector] Fix warning
The previous change caused another warning in some build configurations: "default label in switch which covers all enumeration values"
|
 | mlir/lib/Dialect/Vector/VectorTransforms.cpp |
Commit
d82bc9e81d0ec0254f9c069a392f0a826ce896ed
by aeubanks[gn build] Manually port 5b158093e
|
 | llvm/utils/gn/secondary/llvm/lib/Target/AArch64/BUILD.gn |
Commit
ddff81f6925655ac82fc70ebeb6edc511a62b3b5
by aeubanksRevert "lit: revert 134b103fc0f3a995d76398bf4b029d72bebe8162"
This reverts commit d319005a3746a7661c8c9a3302266b6ff7cf61be.
Causing messages like:
File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 74
|
 | llvm/utils/lit/lit/util.py |
Commit
5c84195b8ccb0c1352cd040a01b7b56374dd7ba6
by riddleriver[mlir] Add hover support to mlir-lsp-server
This provides information when the user hovers over a part of the source .mlir file. This revision adds the following hover behavior:
* Operation:
  - Shows the generic form.
* Operation Result:
  - Shows the parent operation name, result number(s), and type(s).
* Block:
  - Shows the parent operation name, block number, predecessors, and successors.
* Block Argument:
  - Shows the parent operation name, parent block, argument number, and type.
Differential Revision: https://reviews.llvm.org/D101113
|
 | mlir/lib/Parser/AsmParserState.cpp |
 | mlir/lib/Tools/mlir-lsp-server/lsp/Protocol.h |
 | mlir/lib/Tools/mlir-lsp-server/MLIRServer.cpp |
 | mlir/test/mlir-lsp-server/hover.test |
 | mlir/lib/Tools/mlir-lsp-server/MLIRServer.h |
 | mlir/docs/Tools/MLIRLSP.md |
 | mlir/lib/Tools/mlir-lsp-server/LSPServer.cpp |
 | mlir/include/mlir/Parser/AsmParserState.h |
 | mlir/lib/Tools/mlir-lsp-server/lsp/Protocol.cpp |
 | mlir/test/mlir-lsp-server/initialize-params.test |
Commit
44d14d5de6f1f74246bf66dea5538ddc304e6445
by aeubanks[lit] Bump up the Windows process cap from 32 to 60
At 61 or over, I see messages like
File "...\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 64
60 seems to work for me.
If this causes issues for anybody else, feel free to revert.
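A minimal sketch of such a cap, assuming the worker count is clamped only on Windows. The helper name and its `system` parameter are hypothetical and are not copied from lit/util.py; the value 60 comes from the message above.

```python
import platform

# Value from the commit message: 61 or more workers hit the handle limit.
WINDOWS_MAX_WORKERS = 60


def capped_worker_count(requested, system=None):
    """Clamp the number of worker processes on Windows only."""
    if system is None:
        system = platform.system()
    if system == "Windows":
        return min(requested, WINDOWS_MAX_WORKERS)
    return requested
```

For example, `capped_worker_count(74, "Windows")` returns 60, while the same request on another platform is left unchanged.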
|
 | llvm/utils/lit/lit/util.py |
Commit
53b946aa636a31e9243b8c5bf1703a1f6eae798e
by riddleriver[mlir] Refactor the representation of function-like argument/result attributes.
The current design uses a unique entry for each argument/result attribute, with the name of the entry being something like "arg0". This provides for a somewhat sparse design, but ends up being much more expensive (from a runtime perspective) in practice. The design requires building a string every time we look up the dictionary for a specific arg/result, and also requires N attribute lookups when collecting all of the arg/result attribute dictionaries.
This revision restructures the design to instead have an ArrayAttr that contains all of the attribute dictionaries for arguments and another for results. This design reduces the number of attribute name lookups to 1, and allows for O(1) lookup for individual element dictionaries. The major downside is that we can end up with larger memory usage, as the ArrayAttr contains an entry for each element even if that element has no attributes. If the memory usage becomes too problematic, we can experiment with a more sparse structure that still provides a lot of the wins in this revision.
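The difference between the two layouts can be modeled with plain dictionaries and lists. This is only an illustration of the lookup cost, not MLIR code; the `"arg_attrs"` key and the helper names are assumptions.

```python
def get_arg_attrs_old(op_attrs, index):
    """Old layout: one entry per argument, keyed by a name built on demand."""
    return op_attrs.get("arg" + str(index), {})  # builds a string per lookup


def get_arg_attrs_new(op_attrs, index):
    """New layout: a single named entry holding one dictionary per argument."""
    all_arg_attrs = op_attrs.get("arg_attrs")    # the only name lookup
    if all_arg_attrs is None:
        return {}
    return all_arg_attrs[index]                  # O(1) index into the array


# Three arguments, only the second one has attributes.
old_layout = {"arg1": {"noalias": True}}
new_layout = {"arg_attrs": [{}, {"noalias": True}, {}]}
```

The new layout stores an empty dictionary for arguments without attributes, which is the memory cost mentioned above.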
This dropped the compilation time of a somewhat large TensorFlow model from ~650 seconds to ~400 seconds.
Differential Revision: https://reviews.llvm.org/D102035
|
 | mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp |
 | mlir/lib/IR/FunctionSupport.cpp |
 | mlir/test/Dialect/LLVMIR/func.mlir |
 | mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/Detensorize.cpp |
 | mlir/lib/Dialect/SPIRV/Transforms/SPIRVConversion.cpp |
 | mlir/include/mlir/IR/FunctionSupport.h |
 | mlir/include/mlir/IR/BuiltinAttributes.td |
 | mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp |
 | mlir/lib/IR/BuiltinDialect.cpp |
 | mlir/lib/Transforms/Utils/DialectConversion.cpp |
 | mlir/test/IR/invalid-func-op.mlir |
 | mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp |
 | mlir/include/mlir/IR/FunctionImplementation.h |
 | mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRV.cpp |
 | mlir/lib/Dialect/GPU/IR/GPUDialect.cpp |
 | mlir/include/mlir/Dialect/GPU/GPUOps.td |
 | mlir/lib/IR/FunctionImplementation.cpp |
 | mlir/test/IR/test-func-set-type.mlir |
Commit
223852d76fccc85cc5a844feec94781e8c5320ff
by VenkataRamanaiah.Nalamothu[DebugInfo] UnwindTable::create() should not add empty rows to CFI unwind table
UnwindTable::parseRows() may return successfully when the CFIProgram has no CFI instructions or only DW_CFA_nop instructions, in which case the UnwindRow return argument is empty. The callers do not currently check for this case, which leads to incorrect entries in the unwind table dumps, i.e.
CFA=unspecified
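The check can be illustrated with a toy model of a CFI program. The opcode names, the row layout, and `build_unwind_rows` are hypothetical simplifications, not the DWARFDebugFrame.cpp code; the point is only that the final row is appended when it carries a CFA or register rule.

```python
def build_unwind_rows(cfi_program):
    """Turn a list of (opcode, *args) tuples into unwind rows."""
    rows = []
    row = {"address": 0, "cfa": None, "regs": {}}
    for opcode, *args in cfi_program:
        if opcode == "nop":
            continue                      # carries no unwind information
        if opcode == "def_cfa":
            row["cfa"] = (args[0], args[1])
        elif opcode == "offset":
            row["regs"][args[0]] = args[1]
        elif opcode == "advance_loc":
            # Emit the current row, then continue at the new address.
            rows.append({"address": row["address"], "cfa": row["cfa"],
                         "regs": dict(row["regs"])})
            row["address"] += args[0]
        else:
            raise ValueError("unknown opcode: " + opcode)
    # Only keep the trailing row if it actually describes something.
    if row["cfa"] is not None or row["regs"]:
        rows.append(row)
    return rows


only_nops = build_unwind_rows([("nop",), ("nop",)])
with_cfa = build_unwind_rows([("def_cfa", "rsp", 8)])
```

Without the final check, a program made up solely of no-ops would still produce one row with an unspecified CFA, which is the spurious dump line shown above.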
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D101892
|
 | llvm/unittests/DebugInfo/DWARF/DWARFDebugFrameTest.cpp |
 | llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp |
Commit
34a8a437bf20e0a340c19ed1fdb9cca584d43da1
by aeubanks[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose
Printing pass manager invocations is fairly verbose and not super useful.
This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now.
This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
|
 | llvm/include/llvm/IR/PassManager.h |
 | llvm/lib/Passes/PassRegistry.def |
 | llvm/test/Other/new-pass-manager-cgscc-fct-proxy.ll |
 | llvm/lib/LTO/LTOBackend.cpp |
 | llvm/unittests/Analysis/CGSCCPassManagerTest.cpp |
 | llvm/include/llvm/Transforms/IPO/Inliner.h |
 | llvm/lib/Analysis/CGSCCPassManager.cpp |
 | llvm/test/tools/gold/X86/new-pm.ll |
 | polly/lib/Support/RegisterPasses.cpp |
 | llvm/unittests/IR/PassManagerTest.cpp |
 | llvm/test/Other/new-pm-lto-defaults.ll |
 | lld/test/ELF/lto/new-pass-manager.ll |
 | llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll |
 | llvm/test/Transforms/SROA/dead-inst.ll |
 | llvm/tools/llvm-opt-fuzzer/llvm-opt-fuzzer.cpp |
 | lld/test/wasm/lto/new-pass-manager.ll |
 | llvm/lib/Transforms/IPO/Inliner.cpp |
 | llvm/test/Transforms/Inline/cgscc-incremental-invalidate.ll |
 | llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll |
 | llvm/include/llvm/CodeGen/MachinePassManager.h |
 | llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp |
 | llvm/test/Transforms/LoopUnroll/unroll-loop-invalidation.ll |
 | llvm/test/Transforms/LoopRotate/pr35210.ll |
 | clang/lib/CodeGen/BackendUtil.cpp |
 | llvm/test/Transforms/Inline/clear-analyses.ll |
 | clang/test/CodeGen/thinlto-distributed-newpm.ll |
 | llvm/unittests/IR/PassBuilderCallbacksTest.cpp |
 | llvm/include/llvm/Target/TargetMachine.h |
 | llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll |
 | llvm/include/llvm/Passes/PassBuilder.h |
 | llvm/test/Other/pass-pipeline-parsing.ll |
 | llvm/tools/opt/NewPMDriver.cpp |
 | llvm/lib/Target/Hexagon/HexagonTargetMachine.cpp |
 | llvm/lib/Passes/StandardInstrumentations.cpp |
 | clang/test/CodeGenCoroutines/coro-newpm-pipeline.cpp |
 | llvm/test/Other/new-pm-defaults.ll |
 | llvm/test/Other/new-pm-thinlto-defaults.ll |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h |
 | llvm/test/Transforms/SCCP/ipsccp-preserve-analysis.ll |
 | llvm/lib/Target/NVPTX/NVPTXTargetMachine.h |
 | llvm/include/llvm/Transforms/Scalar/LoopPassManager.h |
 | llvm/lib/Target/Hexagon/HexagonTargetMachine.h |
 | llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll |
 | llvm/lib/Target/BPF/BPFTargetMachine.cpp |
 | llvm/lib/Target/BPF/BPFTargetMachine.h |
 | llvm/test/Other/new-pm-pgo-preinline.ll |
 | llvm/test/Other/loop-pm-invalidation.ll |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |
 | llvm/lib/Passes/PassBuilder.cpp |
 | llvm/lib/CodeGen/MachinePassManager.cpp |
 | llvm/lib/Transforms/Scalar/LoopPassManager.cpp |
 | llvm/test/Other/new-pass-manager.ll |
 | clang/test/CodeGen/lto-newpm-pipeline.c |
 | llvm/test/Other/new-pm-O0-defaults.ll |
 | llvm/test/Transforms/SCCP/preserve-analysis.ll |
 | llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp |
 | lld/test/COFF/lto-new-pass-manager.ll |
Commit
631da3b15203b672e237031f964a53f3bb194f9b
by michael.hliaoReplace a remaining CRLF with LF. NFC.
|
 | llvm/include/llvm/IR/Metadata.def |
Commit
77e2e5e07d01fe0b83c39d0c527c0d3d2e659146
by xiang1.zhang[X86] Support AMX fast register allocation
|
 | llvm/test/CodeGen/X86/AMX/amx-fast-tile-config.mir |
 | llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn |
 | llvm/include/llvm/CodeGen/TargetPassConfig.h |
 | llvm/lib/Target/X86/X86ExpandPseudo.cpp |
 | llvm/lib/Target/X86/X86PreAMXConfig.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0.ll |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics.ll |
 | llvm/lib/Target/X86/X86.h |
 | llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp |
 | llvm/lib/Target/X86/X86LowerAMXType.cpp |
 | llvm/lib/CodeGen/TargetPassConfig.cpp |
 | llvm/include/llvm/CodeGen/Passes.h |
 | llvm/tools/opt/opt.cpp |
 | llvm/lib/Target/X86/X86InstrAMX.td |
 | llvm/include/llvm/IR/IntrinsicsX86.td |
 | llvm/lib/Target/X86/X86FastTileConfig.cpp |
 | llvm/lib/Target/X86/X86TargetMachine.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-configO0toO0.ll |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-precfg.ll |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-lower.ll |
 | llvm/lib/Target/X86/CMakeLists.txt |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics-no-amx-bitcast.ll |
 | llvm/test/CodeGen/X86/O0-pipeline.ll |
 | clang/include/clang/Basic/BuiltinsX86_64.def |
Commit
bebafe01a74619f19bd7830ff02a8647f1af54d6
by xiang1.zhang
Revert "[X86] Support AMX fast register allocation"
This reverts commit 77e2e5e07d01fe0b83c39d0c527c0d3d2e659146.
|
 | llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp |
 | llvm/lib/Target/X86/X86InstrAMX.td |
 | llvm/include/llvm/CodeGen/TargetPassConfig.h |
 | llvm/test/CodeGen/X86/AMX/amx-configO0toO0.ll |
 | llvm/lib/Target/X86/X86LowerAMXType.cpp |
 | llvm/include/llvm/IR/IntrinsicsX86.td |
 | llvm/lib/Target/X86/X86PreAMXConfig.cpp |
 | llvm/include/llvm/CodeGen/Passes.h |
 | llvm/lib/Target/X86/X86ExpandPseudo.cpp |
 | llvm/tools/opt/opt.cpp |
 | llvm/lib/Target/X86/X86FastTileConfig.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-fast-tile-config.mir |
 | llvm/lib/Target/X86/CMakeLists.txt |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0.ll |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics.ll |
 | llvm/test/CodeGen/X86/O0-pipeline.ll |
 | clang/include/clang/Basic/BuiltinsX86_64.def |
 | llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn |
 | llvm/lib/CodeGen/TargetPassConfig.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-precfg.ll |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics-no-amx-bitcast.ll |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-lower.ll |
 | llvm/lib/Target/X86/X86.h |
 | llvm/lib/Target/X86/X86TargetMachine.cpp |
Commit
72bd0116e3a1a70fb52fc47c056349b290ce2204
by aeubanks
Fix build after 34a8a437b
|
 | llvm/include/llvm/CodeGen/CodeGenPassBuilder.h |
Commit
d4bdeca5765ac2e81e217a5fa873d1ffbf0e95b0
by xiang1.zhang
[X86] Support AMX fast register allocation
Differential Revision: https://reviews.llvm.org/D100026
|
 | llvm/lib/Target/X86/X86FastTileConfig.cpp |
 | llvm/tools/opt/opt.cpp |
 | llvm/lib/Target/X86/X86InstrAMX.td |
 | llvm/lib/Target/X86/X86PreAMXConfig.cpp |
 | llvm/include/llvm/CodeGen/Passes.h |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics-no-amx-bitcast.ll |
 | llvm/lib/Target/X86/X86TargetMachine.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-fast-tile-config.mir |
 | llvm/lib/Target/X86/X86ExpandPseudo.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-precfg.ll |
 | clang/include/clang/Basic/BuiltinsX86_64.def |
 | llvm/lib/CodeGen/TargetPassConfig.cpp |
 | llvm/lib/Target/X86/CMakeLists.txt |
 | llvm/lib/Target/X86/X86LowerAMXType.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-configO0toO0.ll |
 | llvm/test/CodeGen/X86/O0-pipeline.ll |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0-lower.ll |
 | llvm/test/CodeGen/X86/AMX/amx-configO2toO0.ll |
 | llvm/include/llvm/IR/IntrinsicsX86.td |
 | llvm/lib/Target/X86/X86.h |
 | llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp |
 | llvm/test/CodeGen/X86/AMX/amx-low-intrinsics.ll |
 | llvm/include/llvm/CodeGen/TargetPassConfig.h |
 | llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn |
Commit
e2a77644817fed9eb5a91d9e6cf4bc0c175f70e1
by ivan.butygin
[mlir] Debug print pattern before and after matchAndRewrite call
Motivation: we have passes with a lot of rewrites, and when one of them segfaults or asserts, it is very hard to find out exactly which pattern failed without debug info.
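As a rough illustration of why logging on both sides of the call helps, here is a minimal Python sketch. It is not the MLIR implementation; `apply_patterns`, the `(name, rewrite)` pairs, and the message wording are all hypothetical. The point is only that a message emitted before the rewrite survives a crash inside it.

```python
def apply_patterns(op, patterns, log=print):
    """Apply the first matching pattern to `op`, logging around every attempt.

    `patterns` is a list of (name, rewrite) pairs; `rewrite(op)` returns True
    on success. All names here are hypothetical, for illustration only.
    """
    for name, rewrite in patterns:
        # Emitted before the call: if `rewrite` crashes, this is the last line logged.
        log(f'Trying to match "{name}"')
        succeeded = rewrite(op)
        # Emitted after the call: records whether the pattern applied.
        log(f'"{name}" result {succeeded}')
        if succeeded:
            return name
    return None
```

If a rewrite raises or aborts, the last logged line names the pattern that was running at the time.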
Differential Revision: https://reviews.llvm.org/D101443
|
 | mlir/lib/Rewrite/PatternApplicator.cpp |
 | mlir/include/mlir/IR/PatternMatch.h |
Commit
2db4979c0fe05eac82ca770516057b7b9c4433e3
by qiucofan
[VectorCombine] Simplify to scalar store if only one element updated
This patch simplifies the load-insertelt-store pattern into getelementptr-store.
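The equivalence behind this simplification can be sketched in Python, modelling memory as a flat list. This is only a toy model under the assumption that the index is in bounds; the function names are hypothetical and nothing here reflects the actual VectorCombine code.

```python
def store_via_vector(memory, base, width, index, value):
    """load the whole vector, insert one element, store the whole vector back."""
    assert 0 <= index < width
    vec = memory[base:base + width]    # load <width x T>
    vec[index] = value                 # insertelement
    memory[base:base + width] = vec    # store <width x T>


def store_via_scalar(memory, base, width, index, value):
    """Compute the element address (getelementptr) and store one scalar."""
    assert 0 <= index < width
    memory[base + index] = value       # single scalar store
```

Both functions leave `memory` in the same state, but the second touches one element instead of the whole vector.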
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D98240
|
 | llvm/test/Other/opt-LTO-pipeline.ll |
 | llvm/test/CodeGen/AMDGPU/opt-pipeline.ll |
 | llvm/test/Transforms/InstCombine/load-insert-store.ll |
 | llvm/test/Transforms/VectorCombine/load-insert-store.ll |
 | llvm/lib/Transforms/Vectorize/VectorCombine.cpp |
Commit
c42007e266a38374f705564225dc35c5aacba7f4
by Louis Dionne
[libc++] Use Xcode's CMake if it's present
This resolves issues when the CMake in use on the host is too old to configure libc++ properly, but Xcode has a sufficiently recent version. It is technically possible for the reverse issue to happen, where the Xcode version would be too old and the user-installed version would be better. However, in the context of our build bots, we use AppleClang on Apple platforms, and the CMake shipped with Xcode should work with the AppleClang shipped alongside that Xcode.
Differential Revision: https://reviews.llvm.org/D102083
|
 | libcxx/utils/ci/run-buildbot |
Commit
b1c38207e9ca6aba883a8000239163520ee6ed83
by lebedev.ri
[X86] Improve costmodel for scalar byte swaps
Currently we model i16 bswap as very high cost (`10`), which doesn't seem right, with all the others being at `1`.
Regardless of `MOVBE`, i16 reg-reg bswap is lowered into (an extending move plus) rot-by-8: https://godbolt.org/z/8jrq7fMTj I think it should at worst have a throughput of `1`.
Since i32/i64 already have cost of `1`, `MOVBE` doesn't improve their costs any further.
However, `MOVBE` must have at least a single memory operand, with the other being a register. This means that if we have a bswap of a load, iff the load has a single use, we'll fold the bswap into the load.
Likewise, if we have a store of a bswap, iff the bswap has a single use, we'll fold the bswap into the store.
So I think we should treat such a bswap as free, unless of course we know that it performs badly on the particular CPU.
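The reasoning above can be sketched in Python. The first two helpers check that a 16-bit byte swap is the same operation as a rotate by 8. The `bswap_cost` function is a hypothetical summary of the proposed costs, with invented parameter names; it is not the X86TargetTransformInfo code.

```python
def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)


def rotl16(x, n):
    """Rotate a 16-bit value left by n bits."""
    n %= 16
    return ((x << n) | (x >> (16 - n))) & 0xFFFF


def bswap_cost(bits, has_fast_movbe=False, folds_into_load_or_store=False):
    """Hypothetical cost summary: free when MOVBE absorbs the swap, else 1."""
    assert bits in (16, 32, 64)
    # A bswap of a single-use load, or a single-use bswap feeding a store,
    # can be folded into one MOVBE on CPUs where MOVBE is not slow.
    if has_fast_movbe and folds_into_load_or_store:
        return 0
    return 1
```

For example, `bswap16(0x1234)` and `rotl16(0x1234, 8)` both give `0x3412`.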
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D101924
|
 | llvm/lib/Target/X86/X86Subtarget.h |
 | llvm/test/Analysis/CostModel/X86/bswap-store.ll |
 | llvm/test/Analysis/CostModel/X86/load-bswap.ll |
 | llvm/test/Analysis/CostModel/X86/bswap.ll |
 | llvm/lib/Target/X86/X86TargetTransformInfo.cpp |
 | llvm/lib/Target/X86/X86.td |
Commit
1acd9a1a29ac30044ecefb6613485d5d168f66ca
by lebedev.ri
Revert "[LICM] Hoist loads with invariant.group metadata"
This appears to miscompile google benchmark's GetCacheSizesFromKVFS() when compiling with -fstrict-vtable-pointers. Runnable reproducer: https://godbolt.org/z/f9ovKqTzb The "f.fail()" call crashes with a BUS error: it is compiled into a testb, and the address it is testing is nonsensical.
This reverts commit 4c89bcadf6cae8320a1925eb9cbeb8c8c1f5f58b.
|
 | llvm/test/Transforms/LICM/invariant.group.ll |
 | llvm/lib/Transforms/Scalar/LICM.cpp |
Commit
73df48158bf5460b5e3497ccec0df4c62c570fad
by uday
[MLIR][NFC] Remove unused MLIRContext declaration
Remove unused MLIRContext declaration. NFC.
Differential Revision: https://reviews.llvm.org/D102103
|
 | mlir/lib/Support/MlirOptMain.cpp |
Commit
9610a2d753dbba385e8c2c005e2497e3add99472
by uday
[MLIR] Add memref dialect dependency for affine fusion pass
For `AffineLoopFusion` pass, add `memref` dialect as a dependent dialect. Since the fusion pass can create `memref::AllocOp`s, the dialect must be registered in its dependent dialects.
The missing dependency was not discovered until now because this op creation happens only when the input already has `memref::AllocOp`s in it, and all dialects in the input are automatically added to the context.
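Why the missing dependency stayed hidden can be sketched with a toy model in Python. The `Context` class and `run_fusion` function are hypothetical stand-ins, not MLIR APIs. The model assumes only what the message states: dialects present in the input are loaded automatically, and an op can be created only if its dialect is loaded.

```python
class Context:
    """Toy stand-in for a context that only builds ops of loaded dialects."""

    def __init__(self):
        self.loaded = set()

    def load_dialect(self, dialect):
        self.loaded.add(dialect)

    def create_op(self, op_name):
        dialect = op_name.split(".", 1)[0]
        if dialect not in self.loaded:
            raise RuntimeError(f"dialect '{dialect}' is not loaded")
        return op_name


def run_fusion(input_ops, dependent_dialects):
    ctx = Context()
    # Dialects that appear in the input are loaded automatically.
    for op_name in input_ops:
        ctx.load_dialect(op_name.split(".", 1)[0])
    # Dialects the pass declares as dependencies are loaded before it runs.
    for dialect in dependent_dialects:
        ctx.load_dialect(dialect)
    # The pass creates an allocation op.
    return ctx.create_op("memref.alloc")
```

With `memref.alloc` already in the input, `run_fusion` succeeds even with an empty dependency list, which is the situation that masked the problem. Declaring `memref` as a dependency makes the creation valid regardless of the input.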
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D102104
|
 | mlir/include/mlir/Transforms/Passes.td |
Commit
74d096e5587969e0d1458ea2175515f6f02e7df3
by Louis Dionne
[libc++] Move handling of the target triple to the DSL
This fixes a long-standing issue where the triple is not always set consistently in all configurations. This change also moves the back-deployment Lit features to using the proper target triple instead of using something ad hoc.
This will be necessary for using from-scratch Lit configuration files in both normal testing and back-deployment testing.
Differential Revision: https://reviews.llvm.org/D102012
|
 | libcxx/test/std/input.output/filesystems/lit.local.cfg |
 | libcxxabi/test/lit.site.cfg.in |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/getline_pointer_size.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.assign/T.pass.cpp |
 | libcxxabi/test/cxa_vec_new_overflow_PR41395.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/long_double.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/any.cast/any_cast_reference.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/types_char16_t_char8_t.pass.cpp |
 | libcxx/test/std/utilities/memory/temporary.buffer/overaligned.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.cons/copy.pass.cpp |
 | libcxxabi/test/catch_multi_level_pointer.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.observe/value_const.pass.cpp |
 | libcxx/test/std/utilities/time/time.clock/time.clock.file/now.pass.cpp |
 | libcxx/test/std/numerics/rand/rand.device/eval.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/lock.pass.cpp |
 | libcxx/test/libcxx/language.support/support.dynamic/libcpp_deallocate.sh.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_unshift.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex_adopt_lock.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.create_directory/create_directory_with_attributes.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/lock_shared.pass.cpp |
 | libcxxabi/test/catch_member_pointer_nullptr.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/bool.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_until_deadlock_bug.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.mod/emplace_index_init_list_args.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt.byname/ctor_char16_t_char8_t.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.assign/move.pass.cpp |
 | libcxxabi/test/test_exception_address_alignment.pass.cpp |
 | libcxxabi/test/forced_unwind1.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.bad_optional_access/derive.pass.cpp |
 | libcxxabi/test/dynamic_cast.pass.cpp |
 | libcxx/test/libcxx/thread/thread.threads/thread.thread.this/sleep_for.signals.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/move_assign.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/in_place_index_init_list_args.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.assign/move.pass.cpp |
 | libcxx/test/libcxx/language.support/cxa_deleted_virtual.pass.cpp |
 | libcxx/test/std/strings/basic.string/string.capacity/over_max_size.pass.cpp |
 | libcxx/test/std/input.output/filesystems/class.directory_entry/directory_entry.obs/hard_link_count.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/in_place_index_args.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/ofstream.members/open_path.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.observe/value_const_rvalue.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.create_directory/create_directory.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.copy_file/copy_file.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/peek.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.cons/value.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/any.cast/const_correctness.fail.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_always_noconv.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/long.pass.cpp |
 | libcxx/test/std/thread/futures/futures.async/async_race.38682.pass.cpp |
 | libcxxabi/test/catch_function_01.pass.cpp |
 | libcxx/test/std/input.output/filesystems/class.directory_entry/directory_entry.mods/refresh.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.array/new_align_val_t_nothrow.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/locale_string_cat.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/default.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/double.pass.cpp |
 | libcxxabi/test/catch_member_data_pointer_01.pass.cpp |
 | libcxxabi/test/uncaught_exceptions.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/move_ctor.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_for.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get_streambuf_chart.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/seekg_off.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_out.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt.byname/ctor_char32_t_char8_t.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.obs/op_bool.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_until.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_shared.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/unsigned_long.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.file_size/file_size.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.members/combine.pass.cpp |
 | libcxx/docs/TestingLibcxx.rst |
 | libcxx/test/libcxx/thread/semaphore.availability.verify.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/in_place_type_init_list_args.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/unsigned_int.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/try_lock.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.mod/emplace_index_args.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/read.pass.cpp |
 | libcxx/test/std/utilities/format/format.error/format.error.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.get/get_type.pass.cpp |
 | libcxx/test/std/strings/basic.string/string.capacity/reserve_size.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/output.streams/ostream.formatted/ostream.inserters.arithmetic/minmax_showbase.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/ifstream.members/open_path.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.specalg/make_optional.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/const_T.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.array/sized_delete_array_fsizeddeallocation.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get_pointer_size_chart.pass.cpp |
 | libcxxabi/test/native/arm-linux-eabi/lit.local.cfg |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/unsigned_short.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_always_noconv.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/move.pass.cpp |
 | libcxx/test/std/diagnostics/syserr/syserr.errcat/syserr.errcat.objects/generic_category.pass.cpp |
 | libcxx/test/std/thread/thread.barrier/arrive.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/T.pass.cpp |
 | libcxxabi/test/exception_object_alignment.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/unsigned_long_long.pass.cpp |
 | libcxxabi/test/incomplete_type.sh.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/getline_pointer_size_chart.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/locale_facetptr.pass.cpp |
 | libcxx/test/std/numerics/rand/rand.device/ctor.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.numeric/locale.num.get/facet.num.get.members/get_long.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/copy.fail.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get_chart.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/in_place_type_args.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.mod/emplace_type_init_list_args.pass.cpp |
 | libcxx/test/std/thread/thread.semaphore/release.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/pointer.pass.cpp |
 | libcxx/test/std/thread/thread.semaphore/timed.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/U.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/copy.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/delete_align_val_t_replace.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/float.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_in.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/new_align_val_t_nothrow.pass.cpp |
 | libcxx/test/std/thread/futures/futures.future_error/what.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/default.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.array/new_align_val_t.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/any.cast/not_copy_constructible.fail.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/rvalue_T.pass.cpp |
 | libcxx/test/libcxx/language.support/support.dynamic/new_faligned_allocation.pass.cpp |
 | libcxx/test/std/utilities/charconv/charconv.from.chars/integral.roundtrip.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex_try_to_lock.pass.cpp |
 | libcxx/test/std/input.output/iostreams.base/ios.base/ios.types/ios_Init/ios_Init.multiple.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/swap.pass.cpp |
 | libcxx/test/configs/legacy.cfg.in |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/new_align_val_t_nothrow_replace.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_shared_for.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.mod/emplace_type_args.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex_duration.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/try_lock_shared.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.last_write_time/last_write_time.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/long_long.pass.cpp |
 | libcxx/test/std/utilities/format/format.formatter/format.parse.ctx/check_arg_id.pass.cpp |
 | libcxx/test/std/input.output/filesystems/class.directory_entry/directory_entry.obs/file_size.pass.cpp |
 | libcxx/utils/libcxx/test/params.py |
 | libcxx/utils/libcxx/test/features.py |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream_extractors/streambuf.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/assign.compile.fail.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_length.pass.cpp |
 | libcxx/test/libcxx/thread/thread.condition/PR30202_notify_from_pthread_created_thread.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/short.pass.cpp |
 | libcxx/test/std/input.output/filesystems/class.directory_entry/directory_entry.obs/last_write_time.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_max_length.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/lock.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.observe/value_rvalue.pass.cpp |
 | libcxx/test/libcxx/language.support/support.dynamic/aligned_alloc_availability.verify.cpp |
 | libcxx/test/std/language.support/support.exception/uncaught/uncaught_exceptions.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/copy.compile.fail.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/default.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_max_length.pass.cpp |
 | libcxx/test/std/thread/thread.semaphore/binary.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.obs/owns_lock.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/move.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/assign.pass.cpp |
 | libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.wait/atomic_wait.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/lock_shared.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/any.cast/any_cast_pointer.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/locale_char_pointer_cat.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex.pass.cpp |
 | libcxx/test/std/utilities/format/format.formatter/format.parse.ctx/next_arg_id.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/make_any.pass.cpp |
 | libcxxabi/test/test_aux_runtime_op_array_new.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/ctor_char32_t_char8_t.pass.cpp |
 | libcxx/utils/libcxx/test/config.py |
 | libcxx/test/std/localization/locales/locale/locale.statics/classic.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/locale_locale_cat.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/output.streams/ostream.formatted/ostream.inserters.arithmetic/minus1.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_in.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/ofstream.cons/path.pass.cpp |
 | libcxx/test/libcxx/utilities/charconv/charconv.to.chars/availability.fail.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/types_char32_t_char8_t.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.visit/visit_return_type.pass.cpp |
 | libcxx/test/std/thread/thread.semaphore/acquire.pass.cpp |
 | libcxxabi/test/catch_ptr_02.pass.cpp |
 | libcxx/test/std/thread/thread.barrier/arrive_and_wait.pass.cpp |
 | libcxx/test/libcxx/thread/atomic.availability.verify.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.locking/lock.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex_time_point.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.shared_mutex.requirements/thread.shared_mutex.class/assign.fail.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.modifiers/swap.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.get/get_index.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.modifiers/emplace.pass.cpp |
 | libcxx/test/std/thread/thread.latch/count_down.pass.cpp |
 | libunwind/test/lit.site.cfg.in |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_unshift.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get_pointer_size.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_out.pass.cpp |
 | libcxx/test/std/utilities/charconv/charconv.to.chars/integral.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/default.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.formatted/istream.formatted.arithmetic/int.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/string.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.cons/in_place_type.pass.cpp |
 | libcxx/test/libcxx/thread/latch.availability.verify.cpp |
 | libcxx/test/std/thread/thread.latch/try_wait.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.visit/robust_against_adl.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/copy.pass.cpp |
 | libcxx/test/std/diagnostics/syserr/syserr.errcat/syserr.errcat.objects/system_category.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/fstream.members/open_path.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/sized_delete_fsizeddeallocation.pass.cpp |
 | libcxx/test/std/utilities/any/any.nonmembers/any.cast/any_cast_request_invalid_value_category.fail.cpp |
 | libcxx/test/std/localization/locales/locale/locale.cons/char_pointer.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.cons/move.pass.cpp |
 | libcxxabi/test/test_demangle.pass.cpp |
 | libcxx/test/std/thread/thread.latch/arrive_and_wait.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.array/delete_align_val_t_replace.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char32_t_char8_t_encoding.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.assign/copy.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/get_streambuf.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.assign/copy.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/filebuf.members/open_path.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.single/new_align_val_t.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/ctor_char16_t_char8_t.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_encoding.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/mutex_defer_lock.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/ifstream.cons/path.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/char16_t_char8_t_length.pass.cpp |
 | libcxx/test/std/thread/thread.barrier/arrive_and_drop.pass.cpp |
 | libcxx/test/libcxx/memory/aligned_allocation_macro.compile.pass.cpp |
 | libcxx/test/std/thread/thread.barrier/completion.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.mutex.requirements/thread.sharedtimedmutex.requirements/thread.sharedtimedmutex.class/try_lock_shared_until.pass.cpp |
 | libcxxabi/test/libcxxabi/test/config.py |
 | libcxx/test/std/utilities/variant/variant.bad_variant_access/bad_variant_access.pass.cpp |
 | libcxx/test/libcxx/thread/barrier.availability.verify.cpp |
 | libcxx/test/std/utilities/optional/optional.bad_optional_access/default.pass.cpp |
 | libcxx/test/std/input.output/iostream.format/input.streams/istream.unformatted/ignore.pass.cpp |
 | libcxx/test/std/localization/locales/locale/locale.statics/global.pass.cpp |
 | libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.obs/mutex.pass.cpp |
 | libcxx/test/libcxx/thread/thread.threads/thread.thread.this/sleep_for.pass.cpp |
 | libcxxabi/test/forced_unwind2.pass.cpp |
 | libcxx/test/std/localization/locale.categories/category.ctype/locale.codecvt/locale.codecvt.members/utf_sanity_check.pass.cpp |
 | libcxx/utils/libcxx/test/dsl.py |
 | libcxx/test/std/thread/thread.semaphore/try_acquire.pass.cpp |
 | libcxx/test/std/utilities/optional/optional.object/optional.object.observe/value.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.modifiers/reset.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.swap/swap.pass.cpp |
 | libcxx/test/std/utilities/variant/variant.variant/variant.ctor/default.pass.cpp |
 | libcxx/test/std/language.support/support.dynamic/new.delete/new.delete.array/new_align_val_t_nothrow_replace.pass.cpp |
 | libcxx/test/std/utilities/any/any.class/any.assign/value.pass.cpp |
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.create_directories/create_directories.pass.cpp |
 | libcxxabi/test/catch_pointer_nullptr.pass.cpp |
 | libcxx/test/libcxx/selftest/dsl/dsl.sh.py |
 | libcxx/test/std/utilities/variant/variant.visit/visit.pass.cpp |
 | libcxx/test/std/input.output/file.streams/fstreams/fstream.cons/path.pass.cpp |
Commit
4524d8b7552c1d2dcb941c708620fb1f1b998cd1
by llvm-dev
[X86] combineHorizOpWithShuffle - generalize HOP(SHUFFLE(X),SHUFFLE(Y)) -> SHUFFLE(HOP(X,Y)) fold.
For 128-bit types, generalize the fold to recognise duplicate operands in either shuffle.
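The identity behind the HOP(SHUFFLE(X),SHUFFLE(Y)) -> SHUFFLE(HOP(X,Y)) fold can be checked with a small Python model. This is a sketch only: it uses a 4-element horizontal add and shuffles that move whole adjacent pairs, and the helper names are hypothetical. It does not reproduce the matching logic in X86ISelLowering.cpp.

```python
def hadd(a, b):
    """4-element horizontal add: sums of adjacent pairs of a, then of b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def shuffle(v, mask):
    """Single-input shuffle: result[i] = v[mask[i]]."""
    return [v[i] for i in mask]


def expand_pairs(pair_mask):
    """Turn a mask over 2-element pairs into a mask over elements."""
    return [2 * p + k for p in pair_mask for k in (0, 1)]


def folded_mask(pair_mask_x, pair_mask_y):
    """Mask to apply to hadd(x, y) in place of shuffling x and y first."""
    # Pair p of x becomes lane p of the result; pair p of y becomes lane 2 + p.
    return list(pair_mask_x) + [2 + p for p in pair_mask_y]
```

For `x = [1, 2, 3, 4]` and `y = [10, 20, 30, 40]` with both pair masks set to `[1, 0]`, `hadd(shuffle(x, [2, 3, 0, 1]), shuffle(y, [2, 3, 0, 1]))` and `shuffle(hadd(x, y), [1, 0, 3, 2])` both give `[7, 3, 70, 30]`.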
|
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
 | llvm/test/CodeGen/X86/horizontal-sum.ll |
Commit
ab5ee342b92b4661cfec3cdd647c9a5c18e346dd
by llvm-dev
[GlobalISel] Ensure MachineIRBuilder::getDebugLoc() returns a const reference. NFCI.
Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
|
 | llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h |
Commit
2bf34c0a93ff212c97cdc86dff9b615faacfe386
by flo
[VPlan] Add test for sink scalars and merging using VPlan.
Add a couple of tests with scalars that can be sunk to their predicated users.
This pre-commits tests for D100258.
|
 | llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll |
 | llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge-vf1.ll |
Commit
20544746402a94fece90dd2500cdaf62320310cc
by Louis Dionne[libc++] NFC: Refactor Lit annotations
Annotations for c++03 mode are useless, since we only run these tests in C++11 and C++14.
|
 | libcxx/test/std/thread/futures/futures.task/futures.task.nonmembers/uses_allocator.pass.cpp |
 | libcxx/test/std/thread/futures/futures.task/futures.task.members/ctor_func_alloc.pass.cpp |
 | libcxx/test/std/thread/futures/futures.task/futures.task.nonmembers/uses_allocator.compile.pass.cpp |
Commit
7b6dd265ce8320fee8741ec86edae5d40d7b7b86
by thakis[lld/mac] Copy some of the commit message of d5a70db193 into a comment
|
 | lld/MachO/UnwindInfoSection.cpp |
Commit
9ceea66602d9c11690004020a6e7a2b0f5291bd5
by andrea.dibiagio[MCA][RegisterFile] Refactor the move elimination logic to address PR50258.
This patch lifts the restriction on the number of read/write registers for a move elimination candidate. With this patch, move elimination candidates with exactly two reads and two writes are treated like register swap operations for the purpose of move elimination.
This patch currently doesn't affect any upstream model. However, it should help unblock the progress on PR50258.
|
 | llvm/include/llvm/MCA/HardwareUnits/RegisterFile.h |
 | llvm/lib/MCA/HardwareUnits/RegisterFile.cpp |
 | llvm/lib/MCA/Stages/DispatchStage.cpp |
Commit
5be8502271ac5644a7873fa801ecf537f8087d7f
by gkm[lld-macho] Explicitly undefine literal exported symbols
Symbols explicitly exported via command-line options `--exported_symbol SYM` and `--exported_symbols_list FILE` must be defined. Before this fix, lazy symbols defined in archives would be left to languish. We now force them to be included in the linked output.
Differential Revision: https://reviews.llvm.org/D102100
|
 | lld/MachO/Driver.cpp |
 | lld/test/MachO/export-options.s |
Commit
de1843e51a76c5628dc93c0507a4fb8e7be52482
by andrea.dibiagio[llvm-mca][View] Update the Register File statistics.
Correctly track the number of moves eliminated in the Register File statistics.
|
 | llvm/tools/llvm-mca/Views/RegisterFileStatistics.cpp |
Commit
561026936bd244f31a1b8dbe953680b20994ec2f
by kparzysz[Hexagon] Propagate metadata in Hexagon Vector Combine
|
 | llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp |
 | llvm/test/CodeGen/Hexagon/autohvx/vector-align-tbaa.ll |
Commit
492173d42b32cb91d5d0d72d5ed84fcab80d059a
by i[test] Fix tools/gold/X86/new-pm.ll after D101797
|
 | llvm/test/tools/gold/X86/new-pm.ll |
Commit
d5494931f2acd6a5b3ca349ed54813226b0c9040
by lebedev.ri[NFCI][X86] Mark a few lately-added system instructions as such for Scheduling purposes
|
 | llvm/lib/Target/X86/X86InstrSystem.td |
 | llvm/lib/Target/X86/X86InstrInfo.td |
Commit
f8589292084b41fc70da93fb1e23bb576bd1f8f3
by lebedev.ri[NFCI][X86] Mark Znver3 scheduling model as complete
To the best of my knowledge, all instructions are modelled, and have reasonable values to them; flipping the switch doesn't cause any diff for MCA tests, so either we're good, or we have test coverage gaps.
I'm not really sure why no other X86 sched model is marked as complete.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
4aec8f4ce0f564aa68c23b9e29c2e3a945eec947
by lebedev.ri[NFC][LoopIdiom] Add some tests for 'lshr until zero' ('count active bits') "on steroids" idiom
|
 | llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero-debuginfo.ll |
 | llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero-cost.ll |
 | llvm/test/Transforms/LoopIdiom/X86/logical-right-shift-until-zero.ll |
Commit
4b8962940322fe732126ec583013ecb5b6a1112e
by gkm[lld-macho][NFC] Purge stale test-output trees prior to split-file
Enforce standard practice
Differential Revision: https://reviews.llvm.org/D102112
|
 | lld/test/MachO/dylib-stub.yaml |
 | lld/test/MachO/invalid/bad-got-to-dylib-tlv-reference.s |
 | lld/test/MachO/nonweak-definition-override.s |
 | lld/test/MachO/common-symbol-coalescing.s |
 | lld/test/MachO/indirect-symtab.s |
 | lld/test/MachO/U-dynamic-lookup.s |
 | lld/test/MachO/adhoc-codesign.s |
 | lld/test/MachO/invalid/abs-duplicate.s |
 | lld/test/MachO/weak-import.s |
 | lld/test/MachO/entry-symbol.s |
 | lld/test/MachO/flat-namespace.s |
 | lld/test/MachO/weak-binding.s |
 | lld/test/MachO/invalid/range-check.s |
 | lld/test/MachO/tlv-dylib.s |
 | lld/test/MachO/private-extern.s |
 | lld/test/MachO/thin-archive.s |
 | lld/test/MachO/invalid/undefined-symbol.s |
 | lld/test/MachO/lc-linker-option.ll |
 | lld/test/MachO/t.s |
 | lld/test/MachO/weak-header-flags.s |
 | lld/test/MachO/u.s |
 | lld/test/MachO/dependency-info.s |
 | lld/test/MachO/why-load.s |
Commit
6ae15756a5a67b3791e91e2258d17b4dfa0ce87f
by koraq[libc++][doc] Update the Format library status.
- Move LWG-3218 to the chrono section.
- Mark several parts as 'In progress'.
|
 | libcxx/docs/FormatIssuePaperStatus.csv |
 | libcxx/docs/FormatProposalStatus.csv |
Commit
7549399d0e0a961054f9d47fecbe8807b22ac25f
by nikita.ppv[SROA] Regenerate test checks (NFC)
|
 | llvm/test/Transforms/SROA/scalable-vectors.ll |
 | llvm/test/Transforms/SROA/slice-width.ll |
 | llvm/test/Transforms/SROA/pointer-offset-size.ll |
 | llvm/test/Transforms/SROA/basictest.ll |
Commit
ad5f3f525828e1e499e94a5065bada5b5df936cb
by thatlemon[SelectionDAG] Regenerate test checks (NFC)
|
 | llvm/test/CodeGen/X86/arg-copy-elide.ll |
Commit
a21df76db6c41dc87d027e9eb34695d7763f809e
by lebedev.ri[X86] AMD Zen 3: XCHG is a zero-cycle instruction
As measured by exegesis and confirmed by reference docs.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/resources-x86_64.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s |
Commit
675daef58b5e7e9fddbb5878bb88399b1c634949
by lebedev.ri[NFC][X86] Znver3: drop obsolete fixme
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
2a08d7409bf920b0b81708cf2bce0c1179262006
by nikita.ppv[SCEV] Add additional loop guard and/or tests (NFC)
Add tests for and/and, and/or, or/or, or/and combinations.
|
 | llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll |
Commit
d26ca78c18ed21713d7d0a44fb75f1989575ab9d
by nikita.ppv[SCEV] Handle and/or in applyLoopGuards()
applyLoopGuards() already combines conditions from multiple nested guards. However, it cannot use multiple conditions on the same guard, combined using and/or. Add support for this by recursing into either `and` or `or`, depending on the direction of the branch.
Differential Revision: https://reviews.llvm.org/D101692
|
 | llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll |
 | llvm/lib/Analysis/ScalarEvolution.cpp |
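The recursion described in this commit can be pictured with a small Python sketch. This is a toy model under stated assumptions, not the ScalarEvolution code: conditions are nested tuples such as `("and", "a", "b")`, and `collect_guard_facts` is a hypothetical helper name. On the true edge of a branch every operand of an `and` is known to hold, and on the false edge every operand of an `or` is known to be false, so only the matching connective is split.

```python
def collect_guard_facts(cond, taken_when_true):
    """Split a guard condition into leaf facts known on the taken edge.

    cond is either a leaf (any non-tuple value) or a tuple
    ("and" | "or", operand, operand, ...). This representation is an
    assumption made for illustration only.
    """
    # On the true edge we may look through `and`; on the false edge, `or`.
    split_on = "and" if taken_when_true else "or"
    if isinstance(cond, tuple) and cond[0] == split_on:
        facts = []
        for operand in cond[1:]:
            facts.extend(collect_guard_facts(operand, taken_when_true))
        return facts
    # Anything else is kept as one opaque fact with the edge's truth value.
    return [(cond, taken_when_true)]
```

For example, `collect_guard_facts(("or", "a", "b"), True)` returns the whole `or` as a single fact, because knowing that `a or b` is true says nothing definite about `a` or `b` individually.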
Commit
76786037c68163c48d7d829bb654de6c8298bbb0
by david.green[ARM] Fix postinc of vst1xN
These nodes are not handled correctly by CombineBaseUpdate. For the moment, similar to 5f1cad4d296a20025f0b, mark them as unsupported.
|
 | llvm/test/CodeGen/ARM/arm-vst1.ll |
 | llvm/lib/Target/ARM/ARMISelLowering.cpp |
Commit
ab794852ed41d75039aeb122e4268fa32ef1a68f
by lebedev.ri[NFC][X86][MCA] AMD Zen3: add GPR zero-idiom dependency breaking tests
|
 | llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-gpr.s |
Commit
eed8552787d8e2e7c4fd257a8b5ddd78682a55fa
by lebedev.ri[X86] AMD Zen 3: same-register XOR/SUB are GPR dependency breaking zero-idioms
As measured by exegesis and confirmed in reference docs.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/zero-idioms-gpr.s |
Commit
8d0e2d2b0f0f555549255ee812f7ff5297b79420
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for SBB dependency breaking
|
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
Commit
11b0568dce5a72d45780d07398650693537bfa67
by lebedev.ri[X86] AMD Zen 3: same-reg SBB is a dependency-breaking instruction
As confirmed by exegesis measurements, and ref docs. It does actually execute.
While there, bump latency for MULX32rr, that seems to match measurements.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/resources-bmi2.s |
Commit
9a31efa2f51b8150935a9b90eac6ab4eaa613841
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for CMP dependency breaking
|
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
Commit
be23d5e81439e701c67c767b06fe4c7afcde6af9
by lebedev.ri[X86] AMD Zen 3: same-reg CMP is a zero-cycle dependency-breaking instruction
As measured by exegesis, and confirmed by ref docs.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
78e949159d105b7947dbae973080ea343e8f9eda
by dblaikie[Demangle][Rust] Print special namespaces
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D101821
|
 | llvm/include/llvm/Demangle/RustDemangle.h |
 | llvm/test/Demangle/rust.test |
 | llvm/lib/Demangle/RustDemangle.cpp |
Commit
0f8854f7f5d3e98f32eef7cfd09d5b8915a7d301
by jezng[lld-macho] Don't reference entry symbol for non-executables
This would cause us to pull in symbols (and code) that should be unused.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D102137
|
 | lld/MachO/Config.h |
 | lld/test/MachO/bundle-loader.s |
 | lld/MachO/Driver.cpp |
 | lld/MachO/Writer.cpp |
 | lld/test/MachO/entry-symbol.s |
Commit
7f673fcaa9a2cb4600f244792f8521ff19aa806c
by thakis[lld/mac] Fix alignment on subsections
On a section with alignment of 16, subsections aligned to 16-byte boundaries should keep their 16-byte alignment.
Fixes PR50274. (The same bug could have happened with -order_file previously.)
Differential Revision: https://reviews.llvm.org/D102139
|
 | lld/test/MachO/weak-definition-gc.s |
 | lld/MachO/InputFiles.cpp |
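The alignment rule in this commit message can be illustrated with a little arithmetic. The sketch below assumes that a subsection's alignment is the smaller of the parent section's alignment and the largest power of two dividing the subsection's offset; `subsection_alignment` is a hypothetical name, and the actual lld code may compute this differently.

```python
def subsection_alignment(section_align, offset):
    """Alignment a subsection can claim at `offset` within its section.

    Assumes section_align is a power of two and offset is non-negative.
    """
    if offset == 0:
        # The first subsection starts where the section does.
        return section_align
    # offset & -offset isolates the lowest set bit, i.e. the largest
    # power of two that divides offset.
    offset_align = offset & -offset
    return min(section_align, offset_align)
```

With a section alignment of 16, offsets 0, 16 and 32 all keep the 16-byte alignment, while an offset of 8 yields 8 and an offset of 20 yields 4.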
Commit
75f74f267350c71058a05904f9ee9cc72a061b6f
by jezng[lld-macho] Add llvm-otool as a test dependency
This unbreaks my local build, which is configured to build only parts of LLVM.
|
 | lld/test/CMakeLists.txt |
Commit
34b5482b334f2a3960ef079667adb5b3df20aa7d
by chiahungduanSupport NativeCodeCall binding in rewrite pattern.
We are now able to bind the result of a native function while rewriting a pattern. In a matching pattern, if we want to get some values back, we can do that by passing a parameter as a return-value placeholder. In addition, this adds the semantics of '$_self' in NativeCodeCall while matching: it will be the operation that defines a certain operand.
Differential Revision: https://reviews.llvm.org/D100746
|
 | mlir/include/mlir/IR/OpBase.td |
 | mlir/test/mlir-tblgen/pattern.mlir |
 | mlir/tools/mlir-tblgen/RewriterGen.cpp |
 | mlir/docs/DeclarativeRewrites.md |
 | mlir/test/lib/Dialect/Test/TestOps.td |
 | mlir/lib/TableGen/Pattern.cpp |
 | mlir/test/mlir-tblgen/rewriter-errors.td |
 | mlir/test/lib/Dialect/Test/TestPatterns.cpp |
Commit
446ed6394bd36631508056ffac08bd740d401a00
by zakk.chen[RISCV][NFC] Don't need to create a new STI in RISCVAsmPrinter.
RISCVAsmPrinter already has MCSubtargetInfo.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D101889
|
 | llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp |
Commit
9ffd4924e8e1a056760256c4cd5ffde38ccdf010
by Yuanfang Chen[NFC][Coroutines] Fix two tests by removing hardcoded SSA value.
|
 | clang/test/CodeGenCoroutines/coro-dest-slot.cpp |
 | clang/test/CodeGenCoroutines/coro-params.cpp |
Commit
220f6e5271f2e6b39bf4e083c03cd3f91bb43685
by tejohnson[SimplifyCFG] Ignore ephemeral values when counting insts for threading
Ignore ephemeral values (only feeding llvm.assume intrinsics) when computing the instruction count to decide if a block is small enough for threading. This is similar to the handling of these values in the InlineCost computation. These instructions will eventually be removed and shouldn't count against code size (similar to the existing ignoring of phis).
Without this change, when enabling -fwhole-program-vtables, which causes type test / assume sequences to be inserted by clang, we can get different threading decisions. In particular, when building with instrumentation FDO it can affect the optimization decisions before FDO matching, leading to some mismatches.
Differential Revision: https://reviews.llvm.org/D101494
|
 | llvm/lib/Transforms/Utils/SimplifyCFG.cpp |
 | llvm/test/Transforms/SimplifyCFG/unprofitable-pr.ll |
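The counting rule described above can be sketched on a toy block. Everything here is an assumption for illustration: instructions are dictionary entries with an `op` and a list of `users`, an instruction is treated as ephemeral if it is an `assume` or if all of its users are ephemeral, and `find_ephemeral` and `threading_size` are hypothetical names rather than LLVM functions.

```python
def find_ephemeral(insts):
    """Return names of instructions that only feed assume-like ops."""
    ephemeral = {name for name, inst in insts.items() if inst["op"] == "assume"}
    changed = True
    while changed:
        changed = False
        for name, inst in insts.items():
            if name in ephemeral or not inst["users"]:
                continue
            # A value is ephemeral once every one of its users is.
            if all(user in ephemeral for user in inst["users"]):
                ephemeral.add(name)
                changed = True
    return ephemeral


def threading_size(insts):
    """Count instructions, skipping phis and ephemeral values."""
    ephemeral = find_ephemeral(insts)
    return sum(
        1
        for name, inst in insts.items()
        if inst["op"] != "phi" and name not in ephemeral
    )


# Toy block: a type test feeding an assume, plus ordinary code.
example_block = {
    "p": {"op": "phi", "users": ["x"]},
    "t": {"op": "type.test", "users": ["a"]},
    "a": {"op": "assume", "users": []},
    "x": {"op": "add", "users": ["br"]},
    "br": {"op": "br", "users": []},
}
```

In `example_block` only `x` and `br` count toward the size, so inserting the type test / assume pair does not change the threading decision in this model.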
Commit
5344c88dcb2845f6a12cd0992deab1448b4d1419
by Lang Hames[ORC] Generalize materialization dispatch to task dispatch.
Generalizing this API allows work to be distributed more evenly. In particular, query callbacks can now be dispatched (rather than running immediately on the thread that satisfied the query). This avoids the pathological case where an operation on one thread satisfies many queries simultaneously, causing large amounts of work to be run on that thread while other threads potentially sit idle.
|
 | llvm/lib/ExecutionEngine/Orc/Core.cpp |
 | llvm/lib/ExecutionEngine/Orc/LLJIT.cpp |
 | llvm/unittests/ExecutionEngine/Orc/CoreAPIsTest.cpp |
 | llvm/include/llvm/ExecutionEngine/Orc/Core.h |
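The difference between running callbacks immediately and dispatching them as tasks can be sketched as follows. This is a minimal Python model, not the ORC API: `run_inline`, `QueueDispatcher` and `satisfy_queries` are hypothetical names, and a real dispatcher would typically hand tasks to worker threads rather than to a simple queue.

```python
from collections import deque


def run_inline(task):
    """Dispatcher that runs each task on the calling thread right away."""
    task()


class QueueDispatcher:
    """Dispatcher that defers tasks so something else can run them later."""

    def __init__(self):
        self.pending = deque()

    def __call__(self, task):
        self.pending.append(task)

    def run_all(self):
        # Stand-in for worker threads draining the queue.
        while self.pending:
            self.pending.popleft()()


def satisfy_queries(callbacks, result, dispatch):
    """Wrap each query callback in a task and hand it to the dispatcher."""
    for callback in callbacks:
        # Bind the callback now so each task calls its own callback.
        dispatch(lambda cb=callback: cb(result))
```

With `run_inline`, the caller of `satisfy_queries` pays for every callback itself. With a `QueueDispatcher`, the call returns after enqueuing, and the work happens only when `run_all` is invoked.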
Commit
7f9a89f9a2cc55dbfc315aa11416fe3609918199
by Lang Hames[ORC] Use the new dispatchTask API to run query callbacks.
Dispatching query callbacks, rather than running them on the current thread, will allow them to be distributed across multiple threads.
|
 | llvm/include/llvm/ExecutionEngine/Orc/Core.h |
 | llvm/unittests/ExecutionEngine/Orc/CoreAPIsTest.cpp |
 | llvm/lib/ExecutionEngine/Orc/Core.cpp |
Commit
b3aeb138924577f0c7f1a665723edb6b4858fb6f
by JunMa[AArch64][SVE] Remove index_vector node.
Since index_vector is lowered into step_vector in D100816, we can just remove index_vector and use step_vector for codegen directly.
Differential Revision: https://reviews.llvm.org/D101593
|
 | llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td |
 | llvm/lib/Target/AArch64/SVEInstrFormats.td |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
Commit
9ba661f91276dd8cc728f9b2e82905b78c0119b4
by akuegel[mlir] Fix compile error.
Inside a templated function, other class members need to be called with this->. Otherwise we get: explicit qualification required to use member 'setDebugName' from dependent base class.
|
 | mlir/include/mlir/IR/PatternMatch.h |
Commit
6db0cedd238590023398bb20dad94773b56c4c74
by fraser[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion
This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type.
We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102063
|
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-select-fp.ll |
 | llvm/test/CodeGen/RISCV/rvv/select-fp.ll |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-select-int.ll |
 | llvm/test/CodeGen/RISCV/rvv/select-int.ll |
 | llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp |
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp |
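One common way to expand a vector select with a scalar condition is to splat the condition into an all-ones or all-zeros mask and combine the operands with AND/OR. The sketch below assumes that shape of expansion and models vectors as Python lists of unsigned lanes; `expand_select` and its `bits` parameter are hypothetical and do not mirror the SelectionDAG code.

```python
def expand_select(cond, a, b, bits=32):
    """Lane-wise select(cond, a, b) using only AND, OR and XOR.

    a and b are equal-length lists of unsigned integers below 2**bits.
    """
    all_ones = (1 << bits) - 1
    # Splat the scalar condition across every lane as a bit mask.
    mask = [all_ones if cond else 0] * len(a)
    inverted = [lane ^ all_ones for lane in mask]
    return [(x & m) | (y & n) for x, y, m, n in zip(a, b, mask, inverted)]
```

Because the mask is built by repeating a single value, the same recipe works whether the lane count is fixed or only known at run time, which is roughly why a splat-style node suits scalable vectors.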
Commit
c711aa0f6f9d9400fbe619c7f0d6d4aa723b3a64
by Pushpinder.Singh[amdgpu-arch] Guard hsa.h with __has_include
This patch is supposed to fix the issue of hsa.h not being found. The issue was reported in D99949.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102067
|
 | clang/tools/amdgpu-arch/AMDGPUArch.cpp |
Commit
9586937ef513b5b9b134322c6c81dcdd03ca784a
by Pushpinder.Singh[AMDGPU][OpenMP] Disable tests when amdgpu-arch fails
This patch prevents runtime tests running on systems without amdgpu.
Reviewed By: protze.joachim, tianshilei1992
Differential Revision: https://reviews.llvm.org/D102054
|
 | openmp/libomptarget/plugins/amdgpu/CMakeLists.txt |
Commit
ed4f4edea20caa13a468ab279a0df20b9ba667fb
by gchatelet[libc] Allow target architecture customization
This patch provides a way to specify the default target cpu optimizations to use when compiling llvm-libc. This ensures we don't rely on current compiler's default and allows compiling and cross compiling for a particular target.
Differential Revision: https://reviews.llvm.org/D101991
|
 | libc/cmake/modules/LLVMLibCLibraryRules.cmake |
 | libc/cmake/modules/LLVMLibCObjectRules.cmake |
 | libc/cmake/modules/LLVMLibCTestRules.cmake |
 | libc/CMakeLists.txt |
Commit
7f78e409d0280c62209e1a7dc8c6d1409acc9184
by Pushpinder.Singh[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S
Previously clang would print a binary blob into the bundled file for amdgcn. With this patch, it will instead print textual IR as expected.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D102065
|
 | clang/lib/Driver/ToolChains/Clang.cpp |
 | clang/test/Driver/amdgpu-openmp-toolchain.c |
Commit
72d013dd73f4b59eb421d7dbbfd0b2bccbb6fc7b
by zinenko[mlir] OpenMP-to-LLVM: properly set outer alloca insertion point
Previously, the OpenMP to LLVM IR conversion was setting the alloca insertion point to the same position as the main computation when converting OpenMP `parallel` operations. This is problematic if, for example, the `parallel` operation is placed inside a loop: it would keep allocating on the stack on each iteration, leading to stack overflow.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D101307
|
 | mlir/lib/Target/LLVMIR/ModuleTranslation.cpp |
 | mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp |
 | mlir/test/Target/LLVMIR/openmp-llvm.mlir |
 | mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h |
Commit
d13ce17bb4008b2907e6e85882a9295dce9f6b0a
by petar.avramovicAMDGPU/GlobalISel: Add regbankselect test for vgpr(dest) sgpr(address) load
Pre-commit for D101992.
|
 | llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-uniform-load-noclobber.mir |
Commit
f6985a197ef9065a64d0bb819bfd90d1862da45b
by petar.avramovicAMDGPU/GlobalISel: Use destination register bank in applyMappingLoad
Large loads on targets that do not useFlatForGlobal have to be split in regbankselect. This did not happen when the destination had the vgpr bank and the address had the sgpr bank. Instead of checking whether the address bank is sgpr, check the bank of the destination.
Differential Revision: https://reviews.llvm.org/D101992
|
 | llvm/test/CodeGen/AMDGPU/GlobalISel/load-constant.96.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/load-unaligned.ll |
 | llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-uniform-load-noclobber.mir |
Commit
541f107871bc9c020925a6e5342542a47c902d12
by gchatelet[libc] Simplifies multi implementations and benchmarks
This is a follow up on D101524 which:
- simplifies cpu features detection and usage,
- flattens target dependent optimizations so it's obvious which implementations are generated,
- provides an implementation targeting the host (march/mtune=native) for the mem* functions,
- makes sure all implementations are unittested (provided the host can run them),
- makes sure all implementations are benchmarkable (provided the host can run them).
Differential Revision: https://reviews.llvm.org/D101895
|
 | libc/test/src/string/CMakeLists.txt |
 | libc/src/string/x86_64/CMakeLists.txt |
 | libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake |
 | libc/src/string/aarch64/CMakeLists.txt |
 | libc/src/string/CMakeLists.txt |
Commit
a81e45b8bcb8eb274ad73357e10e2cdf8a314a8c
by frgossen[MLIR][Shape] Concretize broadcast result type if possible
As a canonicalization, infer the resulting shape rank if possible.
Differential Revision: https://reviews.llvm.org/D102068
|
 | mlir/include/mlir/Dialect/Shape/IR/Shape.h |
 | mlir/lib/Dialect/Shape/IR/Shape.cpp |
 | mlir/test/Dialect/Shape/canonicalize.mlir |
Commit
831cf15ca6892e2044447f8dc516d76b8a827f1e
by david.spickett[compiler-rt] Handle None value when polling addr2line pipe
According to: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll
poll can return None if the process hasn't terminated.
I'm not quite sure how addr2line could end up closing the pipe without terminating but we did see this happen on one of our bots:
```
<...>scripts/asan_symbolize.py", line 211, in symbolize
    logging.debug("addr2line exited early (broken pipe), returncode=%d" % self.pipe.poll())
TypeError: %d format: a number is required, not NoneType
```
Handle None by printing a message that we couldn't get the return code.
Reviewed By: delcypher
Differential Revision: https://reviews.llvm.org/D101891
|
 | compiler-rt/lib/asan/scripts/asan_symbolize.py |
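The failure mode above is easy to reproduce: `Popen.poll()` returns `None` while the child is still running, and `"%d" % None` raises `TypeError`. A minimal sketch of the guarded formatting follows; the helper name `describe_early_exit` and the exact message text are assumptions, not the wording used in `asan_symbolize.py`.

```python
def describe_early_exit(pipe):
    """Build a log message for a pipe that broke, tolerating poll() == None.

    `pipe` is any object with a poll() method, such as subprocess.Popen.
    """
    returncode = pipe.poll()
    if returncode is None:
        # The process has not terminated, so there is no code to format.
        return "addr2line exited early (broken pipe), could not get returncode"
    return "addr2line exited early (broken pipe), returncode=%d" % returncode
```

Calling `poll()` once and storing the result also avoids the value changing between the check and the formatting.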
Commit
fc253e69f9b988e8b2d4c940946146696b2acf5a
by julian.grossFixed bug in buffer deallocation pass using unranked memref types.
In the buffer deallocation pass, unranked memref types are not properly supported. After investigating this issue, it turns out that the Clone and Dealloc operations do not support unranked memref types in the current implementation. This patch adds the missing feature and enables the transformation of any memref type.
This patch solves this bug: https://bugs.llvm.org/show_bug.cgi?id=48385
Differential Revision: https://reviews.llvm.org/D101760
|
 | mlir/test/Dialect/MemRef/ops.mlir |
 | mlir/test/Transforms/buffer-deallocation.mlir |
 | mlir/test/Conversion/StandardToSPIRV/alloc.mlir |
 | mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp |
 | mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td |
 | mlir/test/lib/Dialect/Test/TestOps.td |
Commit
7280f4b279a6ecaea813b7dc63ee68eb95880115
by andrzej.warzynski[OpenMP][MLIR]Add support for guided, auto and runtime scheduling
When using parallel loop construct, the OpenMP specification allows for guided, auto and runtime as scheduling variants (as well as static and dynamic which are already supported).
This adds the translation from MLIR to LLVM-IR for these scheduling variants.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101435
|
 | llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp |
 | mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp |
 | llvm/include/llvm/Frontend/OpenMP/OMPConstants.h |
 | mlir/test/Target/LLVMIR/openmp-llvm.mlir |
 | llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |
 | llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h |
Commit
761f3d16753ecea625173729dd8c53df022cd4ab
by kadircet[clang][PreProcessor] Cutoff parsing after hitting completion point
This fixes a crash caused by Lexers being invalidated at code completion points in https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPLexerChange.cpp#L520.
Differential Revision: https://reviews.llvm.org/D102069
|
 | clang/test/CodeCompletion/crash-if-directive.cpp |
 | clang/lib/Lex/PPDirectives.cpp |
Commit
ea64200b6197d8db97a11db15a5906fc1eb5ef4a
by llvm-devHexagonVectorCombine.cpp - don't negate a bool value. NFCI.
Silences MSVC warning.
|
 | llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp |
Commit
407a33889de69c54bf4c0945f94a8417cf08e250
by sander.desmalen[AArch64][SVE] Fix isel failure for FP-extending loads
DAGCombiner tries to combine a (fpext (load)) to (fround (extload)) but SVE has no FP-extending loads. By marking these as expand, the combine no longer happens.
This also fixes a similar issue for fptrunc, where the source type is not a legal type.
Reviewed By: bsmith, kmclaughlin
Differential Revision: https://reviews.llvm.org/D102053
|
 | llvm/test/CodeGen/AArch64/sve-fptrunc-store.ll |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/test/CodeGen/AArch64/sve-fpext-load.ll |
Commit
f3139b20a0bf56b3a5e20542363a799619b98ec9
by momchil.velikov[GlobalISel] Fix wrong invocation of `getParamStackAlign` (NFC)
The function template `CallLowering::setArgFlags` is invoked both for arguments and return values. In the latter case, it calls `getParamStackAlign` with argument index `~0u`. Nothing wrong happens now, as the argument is safely incremented back to 0 inside `getParamStackAlign` (the type is `unsigned`), but in principle it's fragile and may become incorrect.
Differential Revision: https://reviews.llvm.org/D102004
|
 | llvm/lib/CodeGen/GlobalISel/CallLowering.cpp |
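The wraparound this message relies on can be checked with a line of modular arithmetic. The sketch emulates an `unsigned` in Python by masking, assuming it is 32 bits wide; `UNSIGNED_MASK`, `RETURN_INDEX` and `add_unsigned` are illustrative names only.

```python
UNSIGNED_MASK = 0xFFFFFFFF  # assumption: unsigned is 32 bits wide

# The value that ~0u evaluates to under that assumption.
RETURN_INDEX = ~0 & UNSIGNED_MASK


def add_unsigned(a, b):
    """Add two values with unsigned wraparound (arithmetic modulo 2**32)."""
    return (a + b) & UNSIGNED_MASK
```

Here `add_unsigned(RETURN_INDEX, 1)` is 0, which is why incrementing the index happens to land on a valid value today. The result depends entirely on that increment being present, which is the fragility the commit points out.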
Commit
f8f953c2a6b3ee6bd50f1bc1bc81880c0d40eb6c
by bradley.smith[AArch64][SVE] Better utilisation of unpredicated forms of arithmetic intrinsics
When using predicated arithmetic intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction. Additionally, this allows for better use of immediate forms.
This also includes a new complex isel pattern which allows matching an all active predicate when the types are different but the predicate is a superset of the type being used. For example, to allow a b8 ptrue for a b32 predicate operand.
This only includes instructions where the unpredicated/predicated forms are mismatched between variants, meaning that the removal of the predicate is done during instruction selection in order to prevent spurious re-introductions of ptrue instructions.
Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D101062
|
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/test/CodeGen/AArch64/sve-intrinsics-unpred-form.ll |
 | llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp |
 | llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.h |
 | llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith-imm.ll |
 | llvm/lib/Target/AArch64/SVEInstrFormats.td |
Commit
65c89cd1a62ad1f2e9b879edc10a806f13c21892
by bradley.smith[AArch64][SVE] Better utilisation of unpredicated forms of remaining intrinsics
When using predicated intrinsics, if the predicate used is all lanes active, use an unpredicated form of the instruction. Additionally, this allows for better use of immediate forms.
This only includes instructions where the unpredicated/predicated forms matched in such a way that instruction selection would not introduce extra ptrue instructions. This allows us to convert the intrinsics directly to architecture independent ISD nodes.
Depends on D101062
Differential Revision: https://reviews.llvm.org/D101828
|
 | llvm/test/CodeGen/AArch64/sve2-intrinsics-int-arith-imm.ll |
 | llvm/test/CodeGen/AArch64/sve-intrinsics-unpred-form.ll |
 | llvm/test/CodeGen/AArch64/sve-intrinsics-logical-imm.ll |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td |
 | llvm/lib/Target/AArch64/SVEInstrFormats.td |
 | llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith-imm.ll |
Commit
08de6e3adaf6e991f5a40357f4634e5b70ec3fde
by thakisclang: Fix tests after 7f78e409d028 if clang is not called clang-13
We might release a new version at some point after all. In fact, use the same pattern the other CHECK lines in this test use, for consistency.
|
 | clang/test/Driver/amdgpu-openmp-toolchain.c |
Commit
9ad9f0c731702dddd401b5ec0c4bca8394dd43e8
by djtodoro[NFC][llvm-dwarfdump] Code clean up for inlined var loc stats
This is preparation for https://reviews.llvm.org/D101025. D101025 will start calculating var locstats for concrete fns that refer to an abstract origin as well.
|
 | llvm/tools/llvm-dwarfdump/Statistics.cpp |
Commit
f088af37e6b570dd070ae4e6fc14e22d21cda3be
by kadircet[clangd] Fix data type of WorkDoneProgressReport::percentage
According to the specification, this should be an unsigned integer.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D101616
|
 | clang-tools-extra/clangd/Protocol.h |
 | clang-tools-extra/clangd/ClangdLSPServer.cpp |
Commit
3212a08a8c811441ca68009118758998750ce905
by fraser[Constant] Allow ConstantAggregateZero a scalable element count
A ConstantAggregateZero may be created from a scalable vector type. However, it still assumed a fixed number of elements when queried for them. This patch changes ConstantAggregateZero to correctly report its element count.
This change fixes a couple of issues. Firstly, it fixes a crash in Constant::getUniqueValue when called on a scalable-vector zeroinitializer constant.
Secondly, it fixes a latent bug in GlobalISel's IRTranslator in which translating a scalable-vector zeroinitializer would hit the assertion in ConstantAggregateZero::getNumElements when casting to a FixedVectorType, rather than reporting an error more gracefully. This is currently hypothetical as the IRTranslator has deeper issues preventing the use of scalable vector types.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102082
|
 | llvm/test/Transforms/InstCombine/scalable-select.ll |
 | llvm/lib/IR/Constants.cpp |
 | llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp |
 | llvm/include/llvm/IR/Constants.h |
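The distinction between a fixed and a scalable element count can be modelled with a small value type. This is a toy Python analogue written for illustration, not LLVM's class: the field names and `fixed_value` method are assumptions. A scalable count only knows a minimum that is multiplied by a run-time factor, so asking it for a fixed number has to fail rather than silently return the minimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementCount:
    known_min: int   # lanes, or minimum lanes when scalable
    scalable: bool   # True if the real count is known_min * runtime factor

    def fixed_value(self):
        """Return the exact lane count; only meaningful for fixed vectors."""
        if self.scalable:
            raise ValueError("element count is scalable; no fixed value")
        return self.known_min
```

Code that returns the whole `ElementCount` instead of calling `fixed_value()` keeps working for both kinds of vector, which is the shape of change the message describes.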
Commit
9243a584d3863d8baae59f6ab0d3c46c7d7b857c
by llvm-devX86LoadValueInjectionLoadHardening.cpp - use const-reference in for-range loops to avoid unnecessary copies. NFCI.
|
 | llvm/lib/Target/X86/X86LoadValueInjectionLoadHardening.cpp |
Commit
605f90475fc663433250f45f39dcb24cdb8c9e9b
by llvm-devX86FlagsCopyLowering.cpp - try to pass DebugLoc by const-ref to avoid costly TrackingMDNodeRef copies. NFCI.
|
 | llvm/lib/Target/X86/X86FlagsCopyLowering.cpp |
Commit
fefd03a89129ab13d2b3aa04ad2d6d52f8ba794d
by qixingxue[TableGen] Remove redundant `Error:` in msg (NFC)
Since calling `PrintFatalError` automatically adds an `error: ` prefix to the printed message, there is no need for an extra `ERROR:` prefix in the argument passed.
Differential Revision: https://reviews.llvm.org/D102151
Reviewed By: Paul-C-Anagnostopoulos
|
 | llvm/utils/TableGen/CodeGenTarget.cpp |
 | llvm/utils/TableGen/ExegesisEmitter.cpp |
Commit
230953d5771f6f3ce6bf86b8bb6ae4d5eb75a218
by a.bataev[OPENMP]Fix PR48851: the locals are not globalized in SPMD mode.
Follow the more general patch for now, do not try to SPMDize the kernel if the variable is used and local.
Differential Revision: https://reviews.llvm.org/D101911
|
 | clang/lib/CodeGen/CGOpenMPRuntime.cpp |
 | clang/test/OpenMP/nvptx_SPMD_codegen.cpp |
Commit
635164b95a8e5d8c0c40804c7ffa3981a82b917a
by bradley.smith[AArch64][SVE] Improve SVE codegen for fixed length BITCAST
Expanding a fixed length operation involves wrapping the operation in an insert/extract subvector pair. As such, when this is done to a bitcast, we end up with an extract_subvector of a bitcast. DAGCombine tries to convert this into a bitcast of an extract_subvector, which restores the initial fixed length bitcast, causing an infinite loop of legalization.
As part of this patch, we must make sure the above DAGCombine does not trigger after legalization if the created bitcast would not be legal.
Differential Revision: https://reviews.llvm.org/D101990
|
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.h |
 | llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |
 | llvm/test/CodeGen/AArch64/sve-fixed-length-bitcast.ll |
Commit
4677d795b2042e783952fdcdaefaf2ca6bfb72a6
by jasonliu[libc++][AIX] Define _LIBCPP_ELAST
The aim is to define _LIBCPP_ELAST for AIX since strerror/strerror_r can't handle out-of-range errno values.
Differential Revision: https://reviews.llvm.org/D100986
|
 | libcxx/src/include/config_elast.h |
Commit
30463bc3f1839e8a238be4c137e2356f3cca2771
by a.bataev[SLP]Do not count perfect diamond matches for gathers several times.
Need to remove the old code for avoiding double counting of the gather nodes with perfect diamond matches within the tree after we started detecting perfect/shuffled matching in the previous patch D100495. We may skip the cost for such nodes completely.
Differential Revision: https://reviews.llvm.org/D102023
|
 | llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp |
 | llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll |
Commit
0c41f77857fccf2a20d48ca96ce61d3c4d1634e6
by zarko[PowerPC] Enable safe for 32bit vins* P10 instructions
Correctly emit `vins` instructions that are safe in 32bit mode.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D101383
|
 | llvm/lib/Target/PowerPC/PPCISelLowering.cpp |
 | llvm/lib/Target/PowerPC/PPCInstrPrefix.td |
 | llvm/test/CodeGen/PowerPC/aix-vec_insert_elt.ll |
Commit
6da348569cd20632d8ee2213fbab59850e133eb0
by jonathanchesterfield[libomptarget] Add support for target allocators to dynamic cuda RTL
Follow on to D102000 which introduced new calls into libcuda. This patch adds the corresponding entry points to dynamic_cuda, fixing the build for systems that do not have the cuda toolkit installed.
Function types and enum from https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MEM.html
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D102169
|
 | openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h |
 | openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp |
Commit
822be4bec894134fa63ed4d2d289f353f3cfcc19
by spatelRevert "[PassManager] add helper function to hold set of vector passes"
This reverts commit fefcb1f878c2dad435af604955661ca02a5302de. It was supposed to be NFC, but as noted in the post-commit comments in D102002, that was not true: SimplifyCFG uses different parameters and there's a difference in an extension point / callback.
|
 | llvm/include/llvm/Passes/PassBuilder.h |
 | llvm/lib/Passes/PassBuilder.cpp |
 | llvm/include/llvm/Transforms/IPO/PassManagerBuilder.h |
 | llvm/lib/Transforms/IPO/PassManagerBuilder.cpp |
Commit
5c7b43aa8298a389b906d72c792941a0ce57782e
by momchil.velikov[clang][AArch32] Correctly align HA arguments when passed on the stack
Analogously to https://reviews.llvm.org/D98794 this patch uses the `alignstack` attribute to fix incorrect passing of homogeneous aggregate (HA) arguments on AArch32. The EABI/AAPCS was recently updated to clarify how VFP co-processor candidates are aligned: https://github.com/ARM-software/abi-aa/commit/4488e34998514dc7af5507236f279f6881eede62
Differential Revision: https://reviews.llvm.org/D100853
|
 | llvm/test/CodeGen/ARM/ha-alignstack-call.ll |
 | clang/test/CodeGen/arm-ha-alignstack.c |
 | llvm/test/CodeGen/ARM/ha-alignstack.ll |
 | clang/lib/CodeGen/TargetInfo.cpp |
 | llvm/lib/Target/ARM/ARMCallingConv.cpp |
Commit
91a919e8994a2c47b3feaf906f83122776ae2cae
by sguelton[NFC] Synchronize reserved identifier code between macro and variables / symbols
Differential Revision: https://reviews.llvm.org/D102164
|
 | clang/lib/Lex/PPDirectives.cpp |
Commit
b0ef2070bc7da2b458fb15b9413d9e90abc71759
by harald[X86] Fix position-independent TType encoding
The logic for x86_64 position-independent TType encodings was backwards, using 8 bytes where 4 were wanted and 4 where 8 were wanted. For regular x86_64, this was mostly harmless: exception tables are allowed to use 8-byte encodings even when they are not needed. For the large code model and for X32, however, the generated exception tables were wrong. For the large code model, we cannot assume that the address will fit in 4 bytes. For X32, we cannot use 64-bit relocations.
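The width selection described above can be sketched as follows. This is an illustration under stated assumptions, not the code from TargetLoweringObjectFileImpl.cpp: the enum `CodeModelSketch` and the function `picTTypeEncoding` are hypothetical, and treating the medium code model like the small one is an assumption. The `DW_EH_PE_*` values are the standard DWARF exception-handling pointer encodings.

```cpp
#include <cstdint>

// Standard DWARF EH pointer-encoding values.
constexpr std::uint8_t DW_EH_PE_sdata4 = 0x0B;   // signed 4-byte value
constexpr std::uint8_t DW_EH_PE_sdata8 = 0x0C;   // signed 8-byte value
constexpr std::uint8_t DW_EH_PE_pcrel = 0x10;    // relative to the current location
constexpr std::uint8_t DW_EH_PE_indirect = 0x80; // value is the address of the real value

// Hypothetical stand-in for the code model; not LLVM's CodeModel enum.
enum class CodeModelSketch { Small, Medium, Large };

// Pick 8 bytes only when the address may not fit in 4 (large code model),
// and 4 bytes otherwise. X32 uses the small code model here, so it ends up
// with the 4-byte form and never needs a 64-bit relocation.
constexpr std::uint8_t picTTypeEncoding(CodeModelSketch CM) {
  const std::uint8_t Width =
      (CM == CodeModelSketch::Large) ? DW_EH_PE_sdata8 : DW_EH_PE_sdata4;
  return static_cast<std::uint8_t>(DW_EH_PE_indirect | DW_EH_PE_pcrel | Width);
}
```

Swapping the two arms of the conditional reproduces the "backwards" behaviour the message describes: harmless for the small model, wrong for the large one.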
Fixes PR50148.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102132
|
 | llvm/test/CodeGen/X86/gcc_except_table_bb_sections.ll |
 | llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp |
Commit
cfef7c918b8297ffb1d882d6b31fe68f876607db
by koraq[libc++][NFC] Remove _VSTD:: when not needed.
Reviewed By: #libc, Quuxplusone
Differential Revision: https://reviews.llvm.org/D102133
|
 | libcxx/include/algorithm |
 | libcxx/include/iomanip |
 | libcxx/include/__memory/shared_ptr.h |
 | libcxx/include/set |
 | libcxx/include/functional |
 | libcxx/include/compare |
 | libcxx/include/concepts |
 | libcxx/include/format |
Commit
7a0231ae59e7221fa10ff99a69f634107ac5a517
by i[llvm-objdump][MachO] Print a newline before lazy bind/bind/weak/exports trie
This adds a separator between two pieces of information.
Reviewed By: #lld-macho, alexshap
Differential Revision: https://reviews.llvm.org/D102114
|
 | lld/test/MachO/tlv.s |
 | llvm/test/tools/llvm-objdump/MachO/bind.test |
 | lld/test/MachO/local-got.s |
 | lld/test/MachO/x86-64-stubs.s |
 | llvm/test/tools/llvm-objdump/MachO/rebase.test |
 | llvm/tools/llvm-objdump/MachODump.cpp |
 | llvm/test/tools/llvm-objdump/MachO/exports-trie.test |
 | llvm/test/tools/llvm-objdump/MachO/lazy-bind.test |
Commit
b483c0afb39e6d9f02e5fcc3b3cee0c99b4f977a
by llvm-dev[X86][SSE] Merge equal X32/X64 check prefixes. NFCI.
|
 | llvm/test/CodeGen/X86/horizontal-shuffle.ll |
Commit
1d802e16650785f6c37a5805d8787abdd611507e
by llvm-dev[X86][SSE] Add tests for missing shuffle(pack(x,y),pack(z,w)) -> permute(pack()) folds.
|
 | llvm/test/CodeGen/X86/horizontal-shuffle.ll |
Commit
2aa5f9b45a493eac5cb2951875af485b7e1b105c
by gbreynoo[llvm-symbolizer] Update Command Guide
The option --use-symbol-table is now a no-op and does not appear in the help text; however, it still appears in the command guide. This change removes it from the command guide and updates the description of --output-style.
Differential Revision: https://reviews.llvm.org/D102078
|
 | llvm/docs/CommandGuide/llvm-symbolizer.rst |
Commit
c74176ee31fad6ccaa1d8771be2cc2b7e9fa988a
by gbreynoo[llvm-nm] Help option output should be consistent with the command guide
The nm command guide shows the short options used as aliases, but these are not found in the help text unless --show-hidden is used. Other tools show aliases with --help. This change fixes the help output to be consistent with the command guide.
Differential Revision: https://reviews.llvm.org/D102072
|
 | llvm/tools/llvm-nm/llvm-nm.cpp |
Commit
08d18af26105fb77665199365c5f3124e0278826
by Lang Hames[ORC] Update SpeculativeJIT example for dispatchTask changes in 5344c88dcb2.
|
 | llvm/examples/SpeculativeJIT/SpeculativeJIT.cpp |
Commit
68a20c7f36d1d51cc46c0bd17384c16bc7818fa2
by i[clang] Support -fpic -fno-semantic-interposition for AArch64
-fno-semantic-interposition (only effective with -fpic) can optimize default visibility external linkage (non-ifunc-non-COMDAT) variable access and function calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```
-fpic (var and fun are dso_preemptable)
```
test:                                   // @test
  adrp    x8, :got:var
  ldr     x8, [x8, :got_lo12:var]
  ldr     w0, [x8]
  // fun is preemptible by default in ld -shared mode. ld will create a PLT.
  b       fun
```
vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:                                   // @test
.Ltest$local:
  adrp    x8, .Lvar$local
  ldr     w0, [x8, :lo12:.Lvar$local]
  // The assembler either resolves .Lfun$local at assembly time, or produces a
  // relocation referencing a non-preemptible section symbol (which can avoid PLT).
  b       .Lfun$local
```
Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural optimizations (including inlining) are available but local aliases are not used. -fpic -fsemantic-interposition can disable interprocedural optimizations.
Depends on D101872
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D101873
|
 | clang/test/Driver/fsemantic-interposition.c |
 | clang/lib/Driver/ToolChains/Clang.cpp |
Commit
2961f86317f8560c948af8f38581e1ea5a9cfebf
by dblaikie[Demangle][Rust] Parse basic types
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102142
|
 | llvm/include/llvm/Demangle/RustDemangle.h |
 | llvm/lib/Demangle/RustDemangle.cpp |
 | llvm/test/Demangle/rust.test |
Commit
80b9510806cf11c57f2dd87191d3989fc45defa8
by craig.topper[RISCV] Correct VL for fixed length masked scatter.
We were incorrectly calling getVectorNumElements on a scalable vector type. This shouldn't be allowed. This gives a warning on EVT, but not MVT.
|
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-scatter.ll |
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp |
Commit
22f834210adb79ef266968c55f1e562c0047d18f
by llvm-dev[X86][SSE] Add examples of failures to remove a permute(pack(pack(),pack())) shuffle by reordering the packed operands.
|
 | llvm/test/CodeGen/X86/horizontal-shuffle-4.ll |
Commit
bcfa7baec8bbf45b98bcde60305efa23df7399e6
by stellaraccident[mlir][CAPI] Add CAPI bindings for the sparse_tensor dialect.
* Adds dialect registration, hand coded 'encoding' attribute and test.
* An MLIR CAPI tablegen backend for attributes does not exist, and this is a relatively complicated case. I opted to hand code it in a canonical way for now, which will provide a reasonable blueprint for building out the tablegen version in the future.
* Also added a (local) CMake function for declaring new CAPI tests, since it was getting repetitive/buggy.
Differential Revision: https://reviews.llvm.org/D102141
|
 | mlir/lib/CAPI/Dialect/SparseTensor.cpp |
 | mlir/test/CAPI/sparse_tensor.c |
 | mlir/lib/CAPI/Dialect/CMakeLists.txt |
 | mlir/include/mlir-c/Dialect/SparseTensor.h |
 | mlir/test/CMakeLists.txt |
 | mlir/test/CAPI/CMakeLists.txt |
Commit
f44c6f20f5e9976357b4851c4432d96b4e4d3521
by davelee.com[cmake] Enable -Wmisleading-indentation
Enable `-Wmisleading-indentation` to balance with the LLVM style of optional parentheses.
Differential Revision: https://reviews.llvm.org/D102092
|
 | llvm/cmake/modules/HandleLLVMOptions.cmake |
Commit
bda8b8488442215e0557a53016a8d9c0a36b90c5
by sbc[lld][WebAssembly] Disallow exporting of TLS symbols
Cross module TLS is currently not supported by our ABI. This change makes explicitly exporting a TLS symbol into an error and prevents implicit exporting (via --export-all).
See https://github.com/emscripten-core/emscripten/issues/14120
Differential Revision: https://reviews.llvm.org/D102044
|
 | lld/wasm/OutputSegment.h |
 | lld/wasm/Symbols.h |
 | lld/wasm/Relocations.cpp |
 | lld/test/wasm/tls-export.s |
 | lld/wasm/InputFiles.cpp |
 | lld/wasm/Writer.cpp |
 | lld/wasm/InputChunks.h |
 | lld/wasm/Symbols.cpp |
Commit
f13893f66a228400bf9bdf14be425e3dc6da0034
by stellaraccident[mlir][Python] Upstream the PybindAdaptors.h helpers and use it to implement sparse_tensor.encoding.
* The PybindAdaptors.h file has been evolving across different sub-projects (npcomp, circt) and has been successfully used for out of tree python API interop/extensions and defining custom types.
* Since sparse_tensor.encoding is the first in-tree custom attribute we are supporting, it seemed like the right time to upstream this header and use it to define the attribute in a way that we can support for both in-tree and out-of-tree use (prior, I had not wanted to upstream dead code which was not used in-tree).
* Adapted the circt version of `mlir_type_subclass`, also providing an `mlir_attribute_subclass`. As we get a bit of mileage on this, I would like to transition the builtin types/attributes to this mechanism and delete the old in-tree only `PyConcreteType` and `PyConcreteAttribute` template helpers (which cannot work reliably out of tree as they depend on internals).
* Added support for defaulting the MlirContext if none is passed so that we can support the same idioms as in-tree versions.
There is quite a bit going on here and I can split it up if needed, but would prefer to keep the first use and the header together so sending out in one patch.
Differential Revision: https://reviews.llvm.org/D102144
|
 | mlir/test/python/dialects/sparse_tensor/dialect.py |
 | mlir/lib/Bindings/Python/DialectSparseTensor.cpp |
 | mlir/include/mlir/Bindings/Python/PybindAdaptors.h |
 | mlir/lib/Bindings/Python/DialectLinalg.h |
 | mlir/lib/Bindings/Python/CMakeLists.txt |
 | mlir/lib/Bindings/Python/Dialects.h |
 | mlir/lib/Bindings/Python/MainModule.cpp |
 | mlir/lib/Bindings/Python/DialectLinalg.cpp |
Commit
7086025d6567562d31fadbaccf08b4fd72ec2100
by andrew.kaylor[Dependence Analysis] Enable delinearization of fixed sized arrays
Patch by Artem Radzikhovskyy!
Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension.
Changes Summary:
- Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests
- profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice
- loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks
- SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it.
- SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag
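The validity condition described in the message (each index non-negative and smaller than the extent of its dimension) can be sketched on plain integer intervals. This is only an illustration: the real analysis reasons about SCEV expressions, and `IndexRange` and `canDelinearizeFixedSize` are hypothetical names. Which dimensions the pass actually checks is not modeled; the sketch checks every subscript it is given.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive bounds on the values a subscript may take. Hypothetical type
// used only for this sketch.
struct IndexRange {
  std::int64_t Min;
  std::int64_t Max;
};

// Returns true when every subscript is provably within [0, extent) for its
// dimension, which is the condition under which splitting a linear access
// into per-dimension subscripts is safe.
inline bool canDelinearizeFixedSize(const std::vector<IndexRange> &Subscripts,
                                    const std::vector<std::int64_t> &Extents) {
  if (Subscripts.size() != Extents.size())
    return false;
  for (std::size_t I = 0; I < Subscripts.size(); ++I) {
    if (Subscripts[I].Min < 0)
      return false; // index may be negative
    if (Subscripts[I].Max >= Extents[I])
      return false; // index may run past the dimension
  }
  return true;
}
```

For an array declared as `A[8][16]`, subscripts bounded by `[0, 7]` and `[0, 15]` pass, while an inner subscript that can reach 16 does not, since `A[i][16]` aliases `A[i + 1][0]`.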
Differential Revision: https://reviews.llvm.org/D101486
|
 | llvm/test/Analysis/DependenceAnalysis/Separability.ll |
 | llvm/test/Analysis/DependenceAnalysis/SimpleSIVNoValidityCheck.ll |
 | llvm/test/Analysis/DependenceAnalysis/Invariant.ll |
 | llvm/test/Transforms/LoopInterchange/profitability.ll |
 | llvm/test/Transforms/LoopInterchange/loop-interchange-optimization-remarks.ll |
 | llvm/test/Analysis/DependenceAnalysis/PreliminaryNoValidityCheckFixedSize.ll |
 | llvm/test/Analysis/DependenceAnalysis/SimpleSIVNoValidityCheckFixedSize.ll |
 | llvm/lib/Analysis/DependenceAnalysis.cpp |
 | llvm/test/Analysis/DependenceAnalysis/Coupled.ll |
Commit
1f44fee521c84f5917d323d15301ab27c358178e
by i[lld-macho] Improve an external weak def test
The rebase table entry is untested.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D102150
|
 | lld/test/MachO/weak-binding.s |
Commit
e32374ed5cb27494c67817f12d1e1cec05486f40
by llvm-dev[X86][SSE] canonicalizeShuffleMaskWithHorizOp - add TODO for better 256/512-bit shuffle+hop folding support. NFC.
|
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
Commit
a9196db905aaa35599ecd70e9f33d276bff355bb
by llvm-dev[X86][AVX] Add example of failure to remove a 256-bit permute(hadd(hadd(),hadd())) shuffle by reordering the packed operands.
|
 | llvm/test/CodeGen/X86/horizontal-shuffle-4.ll |
Commit
ecff974b66a5b365c7834fe4cd8309b36859437c
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for sub-32-bit CMP dep breaking
|
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
Commit
08cf2776acff6f2dc9998ef15e0bea7a8aeca0c3
by lebedev.ri[X86] AMD Zen 3: sub-32-bit CMP also break dependencies
They measure as having the same effect as 32-bit CMP.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/dependency-breaking-gpr.s |
Commit
f38633d1bbf5842f37ad722a2f0edfdfd80733a2
by stellaraccident[mlir][Python] Re-export cext sparse_tensor module to the public namespace.
* This was left out of the previous commit accidentally.
Differential Revision: https://reviews.llvm.org/D102183
|
 | mlir/test/python/dialects/sparse_tensor/dialect.py |
 | mlir/python/mlir/dialects/sparse_tensor.py |
 | mlir/python/mlir/_cext_loader.py |
Commit
88d8f10baf30b0df18eb542c426afc29b69f1313
by spatel[PassManager] add helper function to hold set of vector passes (2nd try)
This is better no-functional-change-intended than the 1st attempt. As noted in D102002, there were at least 2 diffs that went unchecked in pass manager regressions tests: different pass parameters (SimplifyCFG) and an extension point/callback. Those should be lifted from the original code blocks correctly now.
|
 | llvm/include/llvm/Passes/PassBuilder.h |
 | llvm/include/llvm/Transforms/IPO/PassManagerBuilder.h |
 | llvm/lib/Passes/PassBuilder.cpp |
 | llvm/lib/Transforms/IPO/PassManagerBuilder.cpp |
Commit
dc75499998352ffbbe0d1da196631ddb73ad47f3
by Amara Emerson[GlobalISel][IRTranslator] Fix bit-test lowering dropping phi edges.
For contiguous ranges we drop the last bit-test case but in doing so we skip adding the new MBB PHI edges to the list of replacement PHI edges, and as a result we incorrectly omit them in the G_PHI in finishPendingPhis().
Was found when bootstrapping clang with -O3 and GlobalISel enabled on Apple Silicon.
|
 | llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-switch-bittest.ll |
Commit
18f3a14e1328c813fa5dbacc9bb931d22f0669cd
by craig.topper[RISCV] Validate the SEW and LMUL operands to __builtin_rvv_vsetvli(max)
These are required to be constants, this patch makes sure they are in the accepted range of values.
These are usually created by wrappers in the riscv_vector.h header which should always be correct. This patch protects against a user using the builtin directly.
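A range check of this kind might look like the sketch below. It assumes the operands use the vtype field encodings of the RISC-V vector specification (SEW encoded as 0 to 3 for 8 to 64 bits, LMUL encoded as 0 to 3 and 5 to 7 with 4 reserved); the function names are hypothetical and this is not the Sema code from the patch.

```cpp
// Assumed encoding: 0 -> e8, 1 -> e16, 2 -> e32, 3 -> e64.
constexpr bool isValidSEWOperand(long long V) { return V >= 0 && V <= 3; }

// Assumed encoding: 0..3 -> m1, m2, m4, m8; 5..7 -> mf8, mf4, mf2.
// The value 4 is treated as reserved and rejected.
constexpr bool isValidLMULOperand(long long V) {
  return (V >= 0 && V <= 3) || (V >= 5 && V <= 7);
}

// A direct call to the builtin would be accepted only if both constant
// operands pass their checks.
constexpr bool isValidVsetvliOperands(long long SEW, long long LMUL) {
  return isValidSEWOperand(SEW) && isValidLMULOperand(LMUL);
}
```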
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D102086
|
 | clang/test/CodeGen/RISCV/rvv_errors.c |
 | clang/include/clang/Basic/DiagnosticSemaKinds.td |
 | clang/lib/Sema/SemaChecking.cpp |
 | clang/include/clang/Sema/Sema.h |
Commit
8936608e6f4dbd2a80acde660849cd87ef5c9d26
by 31459023+hctim[scudo] [GWP-ASan] Add GWP-ASan variant of scudo benchmarks.
GWP-ASan is the "production" variant as compiled by compiler-rt, and it's useful to be able to benchmark changes in GWP-ASan or Scudo's GWP-ASan hooks across versions. GWP-ASan is sampled, and sampled allocations are much slower, but given the number of allocations that happen under test here, we actually get a reasonable representation of GWP-ASan's negligible performance impact between runs.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D101865
|
 | compiler-rt/lib/scudo/standalone/benchmarks/malloc_benchmark.cpp |
 | compiler-rt/lib/scudo/standalone/benchmarks/CMakeLists.txt |
 | compiler-rt/lib/scudo/standalone/combined.h |
Commit
0c64cef8943546fce28936bb51cce2e22f88c698
by sivachandra[libc] Revert "Simplifies multi implementations and benchmarks".
This reverts commit 541f107871bc9c020925a6e5342542a47c902d12 as the bots are failing with unknown architecture "x86-64-v*". Will let the original author decide on the right course of action to correct the problem and reland.
|
 | libc/src/string/x86_64/CMakeLists.txt |
 | libc/src/string/CMakeLists.txt |
 | libc/test/src/string/CMakeLists.txt |
 | libc/src/string/aarch64/CMakeLists.txt |
 | libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake |
Commit
7e71823f1deb54a1465bc4040f4e3158357f71df
by antiagainst[mlir][linalg] Restrict distribution to parallel dims
According to the API contract, LinalgLoopDistributionOptions expects to work on parallel iterators. When getting processor information, only loop ranges for parallel dimensions should be fed in. But right now after generating scf.for loop nests, we feed in *all* loops, including the ones materialized for reduction iterators. This can cause unexpected distribution of reduction dimensions. This commit fixes it.
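The idea of the fix can be sketched as a filter over the generated loops. The types and the function below are hypothetical stand-ins, not the MLIR data structures in Utils.cpp; the sketch only shows that ranges belonging to reduction iterators are dropped before processor information is requested.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the iterator type and the loop range.
enum class IteratorKind { Parallel, Reduction };

struct LoopRange {
  long Lower;
  long Upper;
  long Step;
};

// Keep only the ranges whose iterator is parallel; these are the only ones
// that should be handed to the distribution callback.
inline std::vector<LoopRange>
selectParallelRanges(const std::vector<LoopRange> &Ranges,
                     const std::vector<IteratorKind> &Kinds) {
  std::vector<LoopRange> Parallel;
  for (std::size_t I = 0; I < Ranges.size() && I < Kinds.size(); ++I)
    if (Kinds[I] == IteratorKind::Parallel)
      Parallel.push_back(Ranges[I]);
  return Parallel;
}
```

For a matmul-like nest with iterator kinds (parallel, parallel, reduction), only the first two ranges are forwarded, so the reduction loop is never distributed.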
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D102079
|
 | mlir/test/lib/Transforms/TestLinalgTransforms.cpp |
 | mlir/lib/Dialect/Linalg/Utils/Utils.cpp |
 | mlir/test/Dialect/Linalg/tile-and-distribute.mlir |
Commit
16748bd2fb1fe10d7d097961f1988327338f3f9f
by aeubanks[TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes.
[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D101806
|
 | llvm/test/CodeGen/ARM/returned-ext.ll |
 | llvm/test/CodeGen/X86/mismatched-byval.ll |
 | llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll |
 | llvm/test/CodeGen/SystemZ/args-02.ll |
 | llvm/test/CodeGen/X86/movtopush.ll |
 | llvm/test/CodeGen/ARM/ipra-r0-returned.ll |
 | llvm/test/CodeGen/ARM/this-return.ll |
 | llvm/test/CodeGen/AArch64/tailcall-explicit-sret.ll |
 | llvm/test/CodeGen/X86/tailcall-msvc-conventions.ll |
 | llvm/test/CodeGen/AArch64/arm64-this-return.ll |
 | llvm/test/CodeGen/SPARC/64abi.ll |
 | llvm/test/CodeGen/SystemZ/args-03.ll |
 | llvm/test/CodeGen/X86/fast-cc-pass-in-regs.ll |
 | llvm/test/CodeGen/X86/fast-cc-merge-stack-adj.ll |
 | llvm/test/CodeGen/AArch64/bitfield-extract.ll |
 | llvm/docs/ReleaseNotes.rst |
 | llvm/test/CodeGen/AMDGPU/call-argument-types.ll |
 | llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs-packed.ll |
 | llvm/test/CodeGen/X86/pop-stack-cleanup.ll |
 | llvm/test/CodeGen/X86/preallocated.ll |
 | llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll |
 | llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/tail-call-amdgpu-gfx.ll |
Commit
6215f49b8f2fa479535ec27a0f029081ac394100
by stefanp[PowerPC] Spilling to registers does not require frame index scavenging
If spills are to registers instead of to the stack then a copy will be used and frame index scavenging is not required.
This patch adds debug info to frame index scavenging and makes sure that spilling to registers does not cause frame index scavenging.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D101360
|
 | llvm/test/CodeGen/PowerPC/frame_index_scavenging.mir |
 | llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp |
Commit
3d5e5066f1af50ea622d136e9543aedae178c8e5
by jezng[lld-macho][nfc] Clean up tests
* Remove unnecessary `rm -rf %t`s
* Have lc-linker-option.ll use the right comment marker
|
 | lld/test/MachO/flat-namespace.s |
 | lld/test/MachO/lc-linker-option.ll |
 | lld/test/MachO/sub-library.s |
 | lld/test/MachO/u.s |
 | lld/test/MachO/t.s |
 | lld/test/MachO/U-dynamic-lookup.s |
 | lld/test/MachO/dependency-info.s |
 | lld/test/MachO/invalid/undefined-symbol.s |
 | lld/test/MachO/adhoc-codesign.s |
 | lld/test/MachO/why-load.s |
Commit
2516b0b5261d5f1fe7cfe357550a826f88fc68b7
by jezng[lld-macho] Treat undefined symbols uniformly
In particular, we should apply the `-undefined` behavior to all such symbols, including those that are specified via the command line (i.e. `-e`, `-u`, and `-exported_symbol`). ld64 supports this too.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D102143
|
 | lld/test/MachO/entry-symbol.s |
 | lld/MachO/Driver.cpp |
 | lld/test/MachO/u.s |
 | lld/MachO/SymbolTable.h |
 | lld/test/MachO/export-options.s |
 | lld/MachO/Writer.cpp |
 | lld/MachO/SymbolTable.cpp |
Commit
b1c3c2e4fc219c23e311fe56d8687fdf6bba3c89
by jezng[lld-macho] Fix order file arch filtering
We had a hardcoded check and a stale TODO, written back when we only had support for one architecture.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D102154
|
 | lld/MachO/Target.h |
 | lld/MachO/Driver.cpp |
 | lld/MachO/InputFiles.cpp |
 | lld/test/MachO/order-file.s |
Commit
96a23911f6d72cc1ef0788b34caa553f1ce99c5d
by ajcbik[mlir][sparse] complete migration to sparse tensor type
A very elaborate, but also very fun revision because all puzzle pieces are finally "falling into place".
1. replaces linalg annotations + flags with proper sparse tensor types
2. adds rigorous verification on sparse tensor type and sparse primitives
3. removes glue and clutter on opaque pointers in favor of sparse tensor types
4. migrates all tests to use sparse tensor types
NOTE: next CL will remove *all* obsoleted sparse code in Linalg
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102095
|
 | mlir/test/Dialect/SparseTensor/roundtrip.mlir |
 | mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp |
 | mlir/test/Dialect/SparseTensor/sparse_parallel.mlir |
 | mlir/test/Dialect/SparseTensor/invalid.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_3d.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_invalid.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_lower.mlir |
 | mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir |
 | mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensor.h |
 | mlir/test/Dialect/SparseTensor/conversion.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_storage.mlir |
 | mlir/test/Dialect/SparseTensor/roundtrip_encoding.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_nd.mlir |
 | mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sum.mlir |
 | mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorPasses.cpp |
 | mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td |
 | mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp |
 | mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp |
 | mlir/lib/ExecutionEngine/SparseUtils.cpp |
 | mlir/test/Dialect/SparseTensor/sparse_vector.mlir |
 | mlir/test/Dialect/SparseTensor/sparse_1d.mlir |
 | mlir/test/Dialect/SparseTensor/invalid_encoding.mlir |
 | mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir |
 | mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.h |
 | mlir/test/Dialect/SparseTensor/sparse_2d.mlir |
Commit
e78b64df98878d1da56275e0c272ed58364da3ad
by 31459023+hctim[Scudo] Use GWP-ASan's aligned allocations and fixup postalloc hooks.
This patch does a few cleanup things:
1. The non-standalone scudo has a problem where GWP-ASan allocations may not meet alignment requirements where Scudo was requested to have alignment >= 16. Use the new GWP-ASan API to fix this.
2. The standalone variant loses some debugging information inside of GWP-ASan because we ask GWP-ASan to allocate an aligned size in the frontend. This means reports end up with 'UaF on a 16-byte allocation' for a 1-byte allocation with 16-byte alignment. Also use the new API to fix this.
3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add stats tracking for GWP-ASan allocations.
4. Add a small test that checks the alignment of the frontend allocator, so that it can be used under GWP-ASan torture mode.
5. Add GWP-ASan torture mode as a testing configuration to catch these regressions.
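The size mix-up in item 2 comes down to simple arithmetic, sketched here. The struct and function names are hypothetical and do not correspond to the GWP-ASan or Scudo APIs; the sketch only contrasts folding the alignment into the size with passing the two values separately.

```cpp
#include <cstddef>

// Round Size up to the next multiple of Alignment (Alignment > 0).
constexpr std::size_t roundUpTo(std::size_t Size, std::size_t Alignment) {
  return (Size + Alignment - 1) / Alignment * Alignment;
}

// Hypothetical description of what the user asked for.
struct AllocationRequest {
  std::size_t Size;
  std::size_t Alignment;
};

// Size the sampled allocator sees if the frontend pre-aligns the size.
constexpr std::size_t sizeSeenWhenPrealigned(AllocationRequest R) {
  return roundUpTo(R.Size, R.Alignment);
}

// Size the sampled allocator sees if size and alignment are passed separately.
constexpr std::size_t sizeSeenWhenPassedSeparately(AllocationRequest R) {
  return R.Size;
}
```

A 1-byte request with 16-byte alignment is recorded as 16 bytes in the first case and as 1 byte in the second, which is the difference that shows up in the error report.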
Depends on D94830, D95889.
Reviewed By: cryptoad
Differential Revision: https://reviews.llvm.org/D95884
|
 | compiler-rt/lib/gwp_asan/common.h |
 | compiler-rt/test/scudo/standalone/CMakeLists.txt |
 | compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp |
 | compiler-rt/lib/scudo/standalone/combined.h |
 | compiler-rt/lib/scudo/scudo_allocator.cpp |
 | compiler-rt/lib/scudo/standalone/stats.h |
 | compiler-rt/lib/scudo/standalone/tests/wrappers_cpp_test.cpp |
 | compiler-rt/test/scudo/standalone/unit/gwp_asan/lit.site.cfg.py.in |
Commit
aa9b02ac75350a6c7c949dd24d5c6a931be26ff9
by nikita.ppv[Inliner] Fix noalias metadata handling for instructions simplified during cloning (PR50270)
Instead of using VMap, which may include instructions from the caller as a result of simplification, iterate over the (FirstNewBlock, Caller->end()) range, which will only include new instructions.
Fixes https://bugs.llvm.org/show_bug.cgi?id=50270.
Differential Revision: https://reviews.llvm.org/D102110
|
 | llvm/lib/Transforms/Utils/InlineFunction.cpp |
 | llvm/test/Transforms/Inline/pr50270.ll |
Commit
9507bace6c122898ac1e7c01bbdcf3c448214c81
by Lang Hames[ORC] Use a unique_function rather than std::function for dispatchTask.
|
 | llvm/include/llvm/ExecutionEngine/Orc/Core.h |
Commit
85af8a8c1b574faa0d5d57d189ae051debdfada8
by aeubanks[NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it.
Differential Revision: https://reviews.llvm.org/D101713
|
 | llvm/lib/CodeGen/SelectionDAG/FastISel.cpp |
 | llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp |
 | llvm/include/llvm/CodeGen/TargetLowering.h |
 | llvm/include/llvm/IR/InstrTypes.h |
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp |
Commit
5000a1b4b9edeb9e994f2a5b36da8d48599bea49
by sbc[lld][WebAssembly] Initial support merging string data
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker, merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections).
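Tail merging, as in the "lo" / "hello" example above, can be illustrated with a deliberately naive sketch. It is quadratic and is not the algorithm lld uses; `MergedStrings` and `tailMerge` are hypothetical names, and the inputs are assumed to contain no embedded NUL bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct MergedStrings {
  std::string Data;                 // NUL-terminated strings, back to back
  std::vector<std::size_t> Offsets; // Offsets[i] locates Inputs[i] in Data
};

// Place longer strings first; a later string that is a suffix of an already
// placed one reuses the tail of that string (and its terminating NUL).
inline MergedStrings tailMerge(const std::vector<std::string> &Inputs) {
  std::vector<std::size_t> Order(Inputs.size());
  for (std::size_t I = 0; I < Order.size(); ++I)
    Order[I] = I;
  std::stable_sort(Order.begin(), Order.end(),
                   [&](std::size_t L, std::size_t R) {
                     return Inputs[L].size() > Inputs[R].size();
                   });

  MergedStrings Out;
  Out.Offsets.resize(Inputs.size());
  std::vector<std::size_t> Placed; // inputs that own bytes in Out.Data
  for (std::size_t Idx : Order) {
    const std::string &S = Inputs[Idx];
    bool Reused = false;
    for (std::size_t Host : Placed) {
      const std::string &H = Inputs[Host];
      if (H.compare(H.size() - S.size(), S.size(), S) == 0) {
        Out.Offsets[Idx] = Out.Offsets[Host] + (H.size() - S.size());
        Reused = true;
        break;
      }
    }
    if (!Reused) {
      Out.Offsets[Idx] = Out.Data.size();
      Out.Data += S;
      Out.Data.push_back('\0');
      Placed.push_back(Idx);
    }
  }
  return Out;
}
```

Because every placed string is at least as long as the one being examined, the suffix comparison never reads out of range. Merging "lo", "hello" and "world" stores only "hello" and "world"; "lo" points three bytes into "hello".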
Differential Revision: https://reviews.llvm.org/D97657
|
 | lld/wasm/CMakeLists.txt |
 | lld/test/wasm/merge-string.s |
 | lld/wasm/Symbols.cpp |
 | llvm/lib/ObjectYAML/WasmYAML.cpp |
 | lld/wasm/Writer.cpp |
 | llvm/include/llvm/BinaryFormat/Wasm.h |
 | llvm/lib/MC/WasmObjectWriter.cpp |
 | lld/wasm/SyntheticSections.cpp |
 | llvm/test/MC/WebAssembly/section-flags-changed.s |
 | llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp |
 | llvm/lib/Object/WasmObjectFile.cpp |
 | lld/wasm/InputChunks.h |
 | llvm/lib/MC/MCParser/WasmAsmParser.cpp |
 | lld/wasm/OutputSegment.h |
 | lld/wasm/Driver.cpp |
 | llvm/lib/MC/MCObjectFileInfo.cpp |
 | llvm/tools/obj2yaml/wasm2yaml.cpp |
 | llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp |
 | llvm/include/llvm/MC/MCSectionWasm.h |
 | lld/wasm/InputChunks.cpp |
 | llvm/lib/MC/MCContext.cpp |
 | llvm/include/llvm/MC/MCContext.h |
 | lld/wasm/InputFiles.cpp |
 | lld/wasm/OutputSegment.cpp |
 | llvm/lib/MC/MCSectionWasm.cpp |
Commit
93a9a8a8d90f5b9bb6965ebb1104082692d41833
by flo[VecLib] Add support for vector fns from Darwin's libsystem.
This patch adds support for Darwin's libsystem math vector functions to TLI. Darwin's libsystem provides a range of vector functions for libm functions.
This initial patch only adds the 2 x double and 4 x float versions, which are available on both X86 and ARM64. On X86, wider vector versions are supported as well.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D101856
|
 | llvm/include/llvm/Analysis/TargetLibraryInfo.h |
 | llvm/test/Transforms/LoopVectorize/AArch64/veclib-calls-libsystem-darwin.ll |
 | llvm/include/llvm/Analysis/VecFuncs.def |
 | llvm/lib/Analysis/TargetLibraryInfo.cpp |
 | llvm/test/CodeGen/Generic/replace-intrinsics-with-veclib-darwin-libsystem-m.ll |
Commit
463ea28e96c78f484d9ea44912d9bc70ff084c86
by nikita.ppv[InstCombine] Fold comparison of integers by parts
Let's say you represent (i32, i32) as an i64 from which the parts are extracted with lshr/trunc. Then, if you compare two tuples by parts you get something like A[0] == B[0] && A[1] == B[1], just that the part extraction happens by lshr/trunc and not a narrow load or similar.
The fold implemented here reduces such equality comparisons by converting them into a comparison on a larger part of the integer (which might be the whole integer). It handles both the "and of eq" and the conjugated "or of ne" case.
I'm being conservative with one-use for now, though this could be relaxed if profitable (the base pattern converts 11 instructions into 5 instructions, but there's quite a few variations on how it can play out).
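The equivalence the fold relies on can be checked at the source level. The sketch below is not the InstCombine implementation; the helper names are made up for illustration, and it covers only the case where the two 32-bit parts make up the whole 64-bit integer.

```cpp
#include <cstdint>

// Part extraction as described in the message: truncation for the low part,
// a logical shift right followed by truncation for the high part.
inline std::uint32_t lowPart(std::uint64_t V) {
  return static_cast<std::uint32_t>(V);
}
inline std::uint32_t highPart(std::uint64_t V) {
  return static_cast<std::uint32_t>(V >> 32);
}

// "and of eq": compare two packed (i32, i32) tuples part by part.
inline bool equalByParts(std::uint64_t A, std::uint64_t B) {
  return lowPart(A) == lowPart(B) && highPart(A) == highPart(B);
}

// "or of ne": the negated form of the same comparison.
inline bool differByParts(std::uint64_t A, std::uint64_t B) {
  return lowPart(A) != lowPart(B) || highPart(A) != highPart(B);
}

// What the by-parts comparisons reduce to when the parts cover the integer.
inline bool equalWhole(std::uint64_t A, std::uint64_t B) { return A == B; }
```

For any pair of inputs, `equalByParts(A, B)` agrees with `equalWhole(A, B)` and `differByParts(A, B)` agrees with its negation, which is what lets the optimizer replace several shifts, truncations and compares with a single wider compare.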
Differential Revision: https://reviews.llvm.org/D101232
|
 | llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp |
 | llvm/test/Transforms/InstCombine/eq-of-parts.ll |
Commit
a2c8aebd8f8f81ba0af1c50580036faf73e8e2dc
by stellaraccident[mlir][Python] Finish adding RankedTensorType support for encoding.
Differential Revision: https://reviews.llvm.org/D102184
|
 | mlir/include/mlir-c/BuiltinTypes.h |
 | mlir/test/CAPI/ir.c |
 | mlir/test/python/dialects/sparse_tensor/dialect.py |
 | mlir/test/python/ir/builtin_types.py |
 | mlir/lib/Bindings/Python/IRTypes.cpp |
 | mlir/lib/CAPI/IR/BuiltinTypes.cpp |
Commit
295087644a468c47a1dbfaca2b5ea552204ab35f
by stellaraccident[mlir] Fix windows build bot break due to use of `alloca` in a test.
Differential Revision: https://reviews.llvm.org/D102189
|
 | mlir/test/CAPI/sparse_tensor.c |
Commit
edfa44b732984541105917934b1d9838fbf368ae
by aeubanks[test] Put aix-xcoff-huge-relocs.ll under expensive checks
It is an order of magnitude slower than the second slowest test according to obj/llvm/test/.lit_test_times.txt.
The two slowest are:
2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll
2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102190
|
 | llvm/test/CodeGen/PowerPC/aix-xcoff-huge-relocs.ll |
Commit
4ff2fe1df0cea28e3ef2963116385c86bf3b5055
by cjdb[libcxx] removes `weak_equality` and `strong_equality` from <compare>
`weak_equality` and `strong_equality` were removed before being standardised, and need to be removed.
Also adjusts `common_comparison_category` since its test needed adjusting due to the equality deletions.
Differential Revision: https://reviews.llvm.org/D100283
|
 | libcxx/test/std/language.support/cmp/cmp.strongord/strongord.pass.cpp |
 | libcxx/test/std/language.support/cmp/cmp.weakord/weakord.pass.cpp |
 | libcxx/test/std/language.support/cmp/cmp.strongeq/cmp.strongeq.pass.cpp |
 | libcxx/include/compare |
 | libcxx/test/std/language.support/cmp/cmp.common/common_comparison_category.pass.cpp |
 | libcxx/test/std/language.support/cmp/cmp.weakeq/cmp.weakeq.pass.cpp |
 | libcxx/test/std/language.support/cmp/cmp.partialord/partialord.pass.cpp |
Commit
ba225ce961b4ec5e4d64b393b042bbaae5e9b41b
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-mmx.s |
Commit
b24edfff4fb16549b3e5ec434ca79dd86fdb4e43
by lebedev.ri[X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom
They are, however, not zero-cycle, and do actually execute.
As measured by exegesis, and confirmed by ref docs.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-mmx.s |
Commit
0e538f937a02eb5a1a999319ef023932be64e130
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-sse-xmm.s |
Commit
0f3bcb97efa8ac6c3277390c3fa2085ee72b074e
by lebedev.ri[X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs. Much like with MMX PCMP, it does actually have to execute, though.
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-sse-xmm.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
f59db6c4f84590aeeaf7753b8957a58cad12867b
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX XMM VPCMP
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-avx-xmm.s |
Commit
29532453370044a4c2ddeea130a3db1648b42aa9
by lebedev.ri[X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs. Again, it's not zero-cycle.
|
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-avx-xmm.s |
Commit
5864e7b86b919651e63ede7ba77ddca48385ea4d
by lebedev.ri[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX YMM VPCMP
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-avx-ymm.s |
Commit
6a64c462eb82f5f37e4ce512f4c25c474ddfcc4c
by lebedev.ri[X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs. Still not zero-cycle :)
|
 | llvm/test/tools/llvm-mca/X86/Znver3/one-idioms-avx-ymm.s |
 | llvm/lib/Target/X86/X86ScheduleZnver3.td |
Commit
43f4331edfb595979f6854351d24f9a9219595fa
by Artem Dergachev[clang-tidy] Aliasing: Add support for captures.
The utility function clang::tidy::utils::hasPtrOrReferenceInFunc() scans the function for pointer/reference aliases to a given variable. It currently scans for operator & over that variable and for declarations of references to that variable.
This patch makes it also scan for C++ lambda captures by reference and for Objective-C block captures.
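The kind of code this affects can be sketched as follows. This is a hypothetical example, not one of the checked-in tests: `Done` never has its address taken and no reference to it is declared, yet the lambda captures it by reference and changes it, so the loop does terminate.

```cpp
#include <cassert>

// Hypothetical example: the loop condition variable is only modified through
// a by-reference lambda capture.
inline int countUntilDone() {
  bool Done = false;
  int Iterations = 0;
  // The capture [&Done] creates an alias to Done without using operator &
  // on it and without declaring a named reference.
  auto Finish = [&Done] { Done = true; };
  while (!Done) {
    ++Iterations;
    if (Iterations == 3)
      Finish(); // Writes to Done through the capture.
  }
  return Iterations;
}
```

An alias scan that ignores captures would see no way for `Done` to change inside the loop, which is presumably why the infinite-loop and redundant-branch-condition tests are updated alongside the utility.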
Differential Revision: https://reviews.llvm.org/D96215
|
 | clang-tools-extra/test/clang-tidy/checkers/bugprone-redundant-branch-condition.cpp |
 | clang-tools-extra/test/clang-tidy/checkers/bugprone-infinite-loop.cpp |
 | clang-tools-extra/clang-tidy/utils/Aliasing.cpp |
Commit
9b292e0edcd4e889dbcf4bbaad6c1cc80fffcfd1
by Artem Dergachev[clang-tidy] Aliasing: Add more support for captures.
D96215 takes care of the situation where the variable is captured into a nearby lambda. This patch takes care of the situation where the current function is the lambda and the variable is one of its captures from an enclosing scope.
The analogous problem for ^{blocks} is already handled automagically by D96215.
Differential Revision: https://reviews.llvm.org/D101787
|
 | clang-tools-extra/test/clang-tidy/checkers/bugprone-infinite-loop.cpp |
 | clang-tools-extra/clang-tidy/utils/Aliasing.cpp |
 | clang-tools-extra/test/clang-tidy/checkers/bugprone-redundant-branch-condition.cpp |
Commit
91ca3269a1b544db1303b496101fd9d6fe953277
by Artem Dergachev[clang-tidy] Aliasing: Add support for aggregates with references.
When a variable is used in an initializer of an aggregate for its reference-type field this counts as aliasing.
Differential Revision: https://reviews.llvm.org/D101791
|
 | clang-tools-extra/clang-tidy/utils/Aliasing.cpp |
 | clang-tools-extra/test/clang-tidy/checkers/bugprone-infinite-loop.cpp |
Commit
8a74cc139d1fbafcf1dd0482490633924a46599a
by spatel[InstCombine] add tests for extract-subvector of insert; NFC
|
 | llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll |
Commit
5577e866912e86147206ffc586e4f080c59ae4bf
by spatel[InstCombine] fold extract subvector of bitcast insertelt
This is visible in the original example from: https://llvm.org/PR50055 (but this change doesn't solve the bug)
https://alive2.llvm.org/ce/z/vM_Yq-
|
 | llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp |
 | llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll |
Commit
6dc2a6a8c9a0e4f8b46a0ba05430b77229789b8e
by dblaikieRemove some unnecessary explicit defaulted copy ctors to cleanup -Wdeprecated-copy
These types also wanted to be/were copy assignable, and using the implicit copy ctor is deprecated in the presence of an explicit copy ctor.
Removing the explicit copy ctor provides the desired behavior - both ctor and assignment operator are available implicitly.
Also while I was nearby there were some missing std::moves on shared pointer parameters.
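A minimal sketch of the situation, using hypothetical types rather than the LLDB classes touched here: a user-declared copy constructor, even a defaulted one, makes reliance on the implicitly generated copy assignment operator deprecated, while a type with neither declared gets both implicitly.

```cpp
#include <cassert>
#include <type_traits>

// Before: the copy constructor is user-declared (explicitly defaulted).
// Copy assignment is still generated implicitly, but using it is deprecated,
// which is what -Wdeprecated-copy reports.
struct WithExplicitCopyCtor {
  int Value = 0;
  WithExplicitCopyCtor() = default;
  WithExplicitCopyCtor(const WithExplicitCopyCtor &) = default;
};

// After: no copy members are declared, so both the copy constructor and the
// copy assignment operator are implicit and neither is deprecated.
struct WithImplicitCopies {
  int Value = 0;
};

static_assert(std::is_copy_constructible<WithImplicitCopies>::value,
              "copy construction is available implicitly");
static_assert(std::is_copy_assignable<WithImplicitCopies>::value,
              "copy assignment is available implicitly");

// Copy-assign the "after" type; this compiles without the warning.
inline int copyAssignedValue() {
  WithImplicitCopies A;
  A.Value = 7;
  WithImplicitCopies B;
  B = A;
  return B.Value;
}
```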
|
 | lldb/include/lldb/Symbol/UnwindPlan.h |
 | lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp |
 | lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp |
 | lldb/include/lldb/Utility/Timeout.h |
 | lldb/include/lldb/DataFormatters/DumpValueObjectOptions.h |
Commit
174606877df46f3e8ce0c60a4c744687d3ee3271
by dblaikieClangd Matchers.h: Fix -Wdeprecated-copy by making the defaulted copy ctor and deleted copy assignment operators explicit
|
 | clang-tools-extra/clangd/unittests/Matchers.h |
Commit
8b9c15c2819bc4736e2c8315c6e0e71e8b7483bf
by kparzysz[Hexagon] Handle loads and stores of scalar predicate vectors
Handle v2i1, v4i1, and v8i1.
|
 | llvm/test/CodeGen/Hexagon/isel-memory-vNi1.ll |
 | llvm/lib/Target/Hexagon/HexagonISelLowering.cpp |
Commit
a0fed635fe1701470062495a6ffee1c608f3f1bc
by carrotPre-commit test case for D101970
This is a test case for D101970, which shows the optimization opportunity for
lea (reg1, reg2), reg3
sub reg3, reg4
to
sub reg1, reg4
sub reg2, reg4
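The rewrite rests on a simple arithmetic identity, sketched below in C++. The sketch assumes AT&T operand order (the destination is the last operand, so `sub reg3, reg4` computes reg4 - reg3) and models registers as wrapping 64-bit unsigned values; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// lea (reg1, reg2), reg3 ; sub reg3, reg4  computes reg4 - (reg1 + reg2).
constexpr uint64_t subViaLea(uint64_t Reg1, uint64_t Reg2, uint64_t Reg4) {
  return Reg4 - (Reg1 + Reg2);
}

// sub reg1, reg4 ; sub reg2, reg4  computes (reg4 - reg1) - reg2.
constexpr uint64_t subViaTwoSubs(uint64_t Reg1, uint64_t Reg2, uint64_t Reg4) {
  return (Reg4 - Reg1) - Reg2;
}

// Both sequences leave the same value in reg4, including when the
// subtraction wraps around.
inline void checkLeaSubIdentity() {
  assert(subViaLea(3, 4, 10) == subViaTwoSubs(3, 4, 10));
  assert(subViaLea(7, 9, 5) == subViaTwoSubs(7, 9, 5));
}
```

The second form needs no temporary register for the sum, though it only covers the resulting value, not flags or scheduling effects.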
Differential Revision: https://reviews.llvm.org/D102010
|
 | llvm/test/CodeGen/X86/lea-opt2.ll |
Commit
6d8b070d96197df6b5bf9fc2c53a78171ba64c6c
by Jessica Paquette[AArch64][GlobalISel] Enable memcpy family combines on minsize functions
The combines in `tryCombineMemCpyFamily` have heuristics (e.g. `TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling these combines on minsize functions shouldn't be harmful.
With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet. There are no code size regressions.
Differential Revision: https://reviews.llvm.org/D102198
|
 | llvm/lib/Target/AArch64/GISel/AArch64PreLegalizerCombiner.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/inline-memset.mir |
Commit
061e071d8c9b98526f35cad55a918a4f1615afd4
by thakisRevert "[lld][WebAssembly] Initial support merging string data"
This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49. Breaks tests, see https://reviews.llvm.org/D97657#2749151
Easily repros locally with `ninja check-llvm-mc-webassembly`.
|
 | llvm/include/llvm/MC/MCContext.h |
 | llvm/lib/Object/WasmObjectFile.cpp |
 | llvm/test/MC/WebAssembly/section-flags-changed.s |
 | llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp |
 | lld/wasm/Driver.cpp |
 | lld/test/wasm/merge-string.s |
 | lld/wasm/Symbols.cpp |
 | llvm/include/llvm/BinaryFormat/Wasm.h |
 | llvm/lib/MC/WasmObjectWriter.cpp |
 | lld/wasm/InputChunks.cpp |
 | llvm/lib/ObjectYAML/WasmYAML.cpp |
 | llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp |
 | llvm/lib/MC/MCParser/WasmAsmParser.cpp |
 | llvm/tools/obj2yaml/wasm2yaml.cpp |
 | lld/wasm/OutputSegment.h |
 | llvm/lib/MC/MCSectionWasm.cpp |
 | lld/wasm/InputChunks.h |
 | lld/wasm/InputFiles.cpp |
 | llvm/lib/MC/MCObjectFileInfo.cpp |
 | lld/wasm/CMakeLists.txt |
 | lld/wasm/OutputSegment.cpp |
 | lld/wasm/SyntheticSections.cpp |
 | lld/wasm/Writer.cpp |
 | llvm/include/llvm/MC/MCSectionWasm.h |
 | llvm/lib/MC/MCContext.cpp |
Commit
79be9c59c6acd79fe4ac3a65eee569b3b65fc20f
by Jessica Paquette[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps
This is roughly equivalent to the floating point portion of `AArch64TargetLowering::LowerVSETCC`. Main part that's missing is the v4s16 bit.
This also adds helpers equivalent to `EmitVectorComparison`, and `changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of the selector into AArch64GlobalISelUtils for the sake of code reuse.
This is done in post-legalizer lowering with pseudos to simplify selection. The imported patterns end up handling selection for us this way.
Differential Revision: https://reviews.llvm.org/D101782
|
 | llvm/test/CodeGen/AArch64/neon-compare-instructions.ll |
 | llvm/lib/Target/AArch64/AArch64InstrGISel.td |
 | llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp |
 | llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp |
 | llvm/lib/Target/AArch64/AArch64Combine.td |
 | llvm/test/CodeGen/AArch64/GlobalISel/lower-neon-vector-fcmp.mir |
 | llvm/test/CodeGen/AArch64/GlobalISel/select-neon-vector-fcmp.mir |
 | llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.h |
 | llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp |
Commit
7b52aeadfa38c8a1fc0e97066f50900f1efafd42
by benny.kra[mlir][Tensor] Add folding for tensor.from_elements
This trivially folds into a constant when all operands are constant.
Differential Revision: https://reviews.llvm.org/D102199
|
 | mlir/lib/Dialect/Tensor/IR/TensorOps.cpp |
 | mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td |
 | mlir/test/Dialect/Linalg/detensorize_trivial.mlir |
 | mlir/test/Dialect/Tensor/canonicalize.mlir |
Commit
3b8d2be527259b303d6c3428df16fb3fd02af2bc
by sbcReland: "[lld][WebAssembly] Initial support merging string data"
This change was originally landed in: 5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in: 061e071d8c9b98526f35cad55a918a4f1615afd4
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in the object format which works in a similar fashion to SHF_STRINGS in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828, although crucially it does not currently support debug sections because they are not represented by data segments (they are custom sections).
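Tail merging of single-byte strings can be illustrated with a small C++ sketch. This is not the lld implementation: `mergeTails` is a hypothetical helper that uses a simple quadratic scan, whereas a production string table would typically sort strings by their reversed contents instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Lay out NUL-terminated strings so that a string which is the tail of an
// already placed string shares its bytes. Returns the merged data and fills
// Offsets with the start offset of every distinct input string.
inline std::string mergeTails(const std::vector<std::string> &Strings,
                              std::map<std::string, size_t> &Offsets) {
  std::vector<std::string> Sorted(Strings);
  // Longest first, so any possible host is placed before its suffixes.
  std::sort(Sorted.begin(), Sorted.end(),
            [](const std::string &A, const std::string &B) {
              return A.size() != B.size() ? A.size() > B.size() : A < B;
            });

  std::string Data;
  std::vector<std::pair<std::string, size_t>> Placed; // (string, offset)
  for (const std::string &S : Sorted) {
    if (Offsets.count(S))
      continue; // Exact duplicate, already assigned.
    bool Merged = false;
    for (const auto &Host : Placed) {
      const std::string &H = Host.first;
      // S can reuse H's bytes when S is a suffix of H; the shared NUL
      // terminator ends both strings.
      if (H.size() >= S.size() &&
          H.compare(H.size() - S.size(), S.size(), S) == 0) {
        Offsets[S] = Host.second + (H.size() - S.size());
        Merged = true;
        break;
      }
    }
    if (!Merged) {
      Offsets[S] = Data.size();
      Placed.emplace_back(S, Data.size());
      Data += S;
      Data.push_back('\0');
    }
  }
  return Data;
}
```

With the inputs "lo", "hello" and "world", "lo" is not stored separately; its offset points three bytes into "hello".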
Differential Revision: https://reviews.llvm.org/D97657
|
 | lld/wasm/InputFiles.cpp |
 | llvm/lib/MC/MCParser/WasmAsmParser.cpp |
 | lld/wasm/OutputSegment.h |
 | lld/wasm/Symbols.cpp |
 | lld/test/wasm/merge-string.s |
 | lld/wasm/SyntheticSections.cpp |
 | llvm/include/llvm/MC/MCContext.h |
 | llvm/lib/ObjectYAML/WasmYAML.cpp |
 | llvm/test/MC/WebAssembly/section-flags-changed.s |
 | llvm/lib/MC/WasmObjectWriter.cpp |
 | llvm/lib/Object/WasmObjectFile.cpp |
 | lld/wasm/InputChunks.h |
 | llvm/lib/MC/MCSectionWasm.cpp |
 | lld/wasm/Writer.cpp |
 | lld/wasm/Driver.cpp |
 | lld/wasm/CMakeLists.txt |
 | llvm/lib/MC/MCContext.cpp |
 | lld/wasm/InputChunks.cpp |
 | llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp |
 | llvm/tools/obj2yaml/wasm2yaml.cpp |
 | llvm/include/llvm/BinaryFormat/Wasm.h |
 | llvm/test/MC/WebAssembly/unnamed-data.ll |
 | llvm/include/llvm/MC/MCSectionWasm.h |
 | llvm/lib/MC/MCObjectFileInfo.cpp |
 | lld/wasm/OutputSegment.cpp |
 | llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp |
Commit
0077dce361ae54ddb5d7da02665795cfb7aab125
by llvmgnsyncbot[gn build] Port 3b8d2be52725
|
 | llvm/utils/gn/secondary/lld/wasm/BUILD.gn |
Commit
22d295f6953c07129837703c811fdda83775e75e
by Stanislav.Mekhanoshin[AMDGPU] Constant fold Intrinsic::amdgcn_perm
Differential Revision: https://reviews.llvm.org/D102203
|
 | llvm/lib/Analysis/ConstantFolding.cpp |
 | llvm/test/Transforms/InstSimplify/ConstProp/AMDGPU/perm.ll |
Commit
bf812ea484b71ec41d6811646d89876499956235
by ajcbik[mlir][linalg] remove the -now- obsolete sparse support in linalg
All glue and clutter in the linalg ops has been replaced by proper sparse tensor type encoding. This code is no longer needed. Thanks to ntv@ for giving us a temporary home in linalg.
So long, and thanks for all the fish.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102098
|
 | mlir/include/mlir/Dialect/Utils/StructuredOpsUtils.h |
 | mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td |
 | mlir/python/mlir/dialects/linalg/opdsl/lang/emitter.py |
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td |
Commit
e5d483f28a3af0972fc9b0df6073e4c14bb39359
by zoecarver[libcxx][ranges] Add ranges::empty CPO.
Depends on D101079. Refs D101189.
Differential Revision: https://reviews.llvm.org/D101193
|
 | libcxx/test/std/ranges/range.access/range.prim/empty.incomplete.verify.cpp |
 | libcxx/include/CMakeLists.txt |
 | libcxx/include/ranges |
 | libcxx/include/__ranges/empty.h |
 | libcxx/test/std/ranges/range.access/range.prim/empty.pass.cpp |
Commit
6d263b6f1c97fe6c45c75443e7daf6cd0c1c4222
by Lang Hames[ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test
Add unit test infrastructure for the ORC runtime, plus a cut-down extensible_rtti system and extensible_rtti unit test.
Removes the placeholder.cpp source file.
Differential Revision: https://reviews.llvm.org/D102080
|
 | compiler-rt/lib/orc/extensible_rtti.cpp |
 | compiler-rt/lib/orc/unittests/CMakeLists.txt |
 | compiler-rt/lib/orc/placeholder.cpp |
 | compiler-rt/lib/orc/CMakeLists.txt |
 | compiler-rt/cmake/config-ix.cmake |
 | compiler-rt/lib/orc/unittests/extensible_rtti_test.cpp |
 | compiler-rt/lib/orc/unittests/orc_unit_test_main.cpp |
 | compiler-rt/lib/orc/extensible_rtti.h |
Commit
842b1624460b2904ec5439d8b0d8b50ae5d35a7a
by llvmgnsyncbot[gn build] Port e5d483f28a3a
|
 | llvm/utils/gn/secondary/libcxx/include/BUILD.gn |
Commit
c057779d389c5c1740a8051aa2929f7bc0f8ee00
by Vitaly Buka[NFC][LSAN] Fix flaky multithreaded test
|
 | compiler-rt/test/lsan/TestCases/many_threads_detach.cpp |
Commit
1e11616a071d07d0f3cdae1140b5c8685eb564a2
by rkauffmannEnable export of FIR includes into the install tree https://reviews.llvm.org/D102040
|
 | flang/CMakeLists.txt |
Commit
d8ec2b183e9243366e3a0cd1116dbe879856b333
by kai.wang[RISCV] Fix the calculation of the offset of Zvlsseg spilling.
For Zvlsseg spilling, we need to convert the pseudo instructions into multiple vector load/store instructions with appropriate offsets. For example, for PseudoVSPILL3_M2, we need to convert it to
VS2R %v2, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v4, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v6, %base
We need to keep the size of the offset in the pseudo spilling instructions. In this case, it is (vlenb x 2).
In the original implementation, we divided the size of the frame object by the number of vectors in the zvlsseg type. The size of a frame object is not necessarily exactly the same as the size of the spilled data; it may be larger. So, this patch changes the offset to (VLENB x LMUL). The calculation is more direct and easier to understand.
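The (VLENB x LMUL) stride can be worked through with a short C++ sketch. It is an illustration only, not the RISCVRegisterInfo code: `segmentSpillOffsets` and its parameter names are hypothetical, and integer LMUL is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Byte offsets, relative to the original base, at which each register group
// of a segment spill is stored. NF is the number of register groups (3 for
// PseudoVSPILL3_M2), LMUL the registers per group (2 for M2), and VLenB the
// vector register length in bytes.
inline std::vector<uint64_t> segmentSpillOffsets(unsigned NF, unsigned LMUL,
                                                 uint64_t VLenB) {
  const uint64_t Step = VLenB * LMUL; // The (vlenb x LMUL) increment.
  std::vector<uint64_t> Offsets;
  for (unsigned I = 0; I < NF; ++I)
    Offsets.push_back(I * Step);
  return Offsets;
}

// PseudoVSPILL3_M2 with a 128-bit vector register (vlenb = 16): the three
// VS2R stores land at byte offsets 0, 32 and 64.
inline void checkSpill3M2() {
  const std::vector<uint64_t> Offsets = segmentSpillOffsets(3, 2, 16);
  assert(Offsets.size() == 3);
  assert(Offsets[0] == 0 && Offsets[1] == 32 && Offsets[2] == 64);
}
```

The stride depends only on VLENB and LMUL, so a frame object that is larger than the spilled data no longer skews the offsets.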
Differential Revision: https://reviews.llvm.org/D101869
|
 | llvm/test/CodeGen/RISCV/rvv/zvlsseg-spill.mir |
 | llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp |
Commit
ad558a4ff7cd61081cfeaabff1dbc8c0a9afa92b
by carl.ritson[AMDGPU] Pre-commit tests for D102211
|
 | llvm/test/CodeGen/AMDGPU/hard-clauses.mir |
Commit
2b09a89daf956795d82076d983c3d78b96e1af4b
by clattner[OpAsmParser] Refactor parseOptionalInteger to support wide integers, NFC.
OpAsmParser (and DialectAsmParser) supports a pair of parseInteger/parseOptionalInteger methods, which allow parsing a bare integer into a C type of your choice (e.g. int8_t) using templates. It was implemented in terms of a virtual method call that is hard coded to int64_t because "that should be big enough".
Change the virtual method hook to return an APInt instead. This allows asmparsers for custom ops to parse large integers if they want to, without changing any of the clients of the fixed size C API.
Differential Revision: https://reviews.llvm.org/D102120
|
 | mlir/lib/Parser/Parser.cpp |
 | mlir/lib/Parser/Parser.h |
 | mlir/lib/Parser/DialectSymbolParser.cpp |
 | mlir/include/mlir/IR/OpImplementation.h |
 | mlir/include/mlir/IR/DialectImplementation.h |
Commit
70c23e232e50b190c9e0d8e4b5b6a8ddfc19b80c
by ikudrin[LLD] Improve reporting unresolved symbols in shared libraries
Currently, when reporting unresolved symbols in shared libraries, an undefined symbol that is first seen in a regular object file shadows the reference to the same symbol in a shared object. As a result, the error for the unresolved symbol in the shared library is not reported. If the referencing sections in regular object files are then discarded because of '--gc-sections', no reports about such symbols are generated, and the linker finishes successfully, producing an output image that fails at run time.
The patch fixes the issue by keeping symbols, which should be checked, for each shared library separately.
Differential Revision: https://reviews.llvm.org/D101996
|
 | lld/ELF/InputFiles.h |
 | lld/test/ELF/allow-shlib-undefined.s |
 | lld/ELF/InputFiles.cpp |
 | lld/ELF/Writer.cpp |
Commit
d69bccf1ed30d16e043d4bb71b4ebd6100efa75b
by gysit[mlir][linalg] Remove IndexedGenericOp support from Tiling...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102176
|
 | mlir/test/Dialect/Linalg/tile-indexed-generic.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp |
 | mlir/test/Dialect/Linalg/tile-tensors.mlir |
Commit
daf3cb3b8a5868d9089a69025c556b564615b844
by kadircet[clangd][index-server] Limit results in response
This is to prevent server from being DOS'd by possible malicious parties issuing requests that can yield huge responses.
One possible drawback is for the rename workflow, which really requests all occurrences but currently has an internal limit of 50 files. We are putting the limit at 10000 elements per response, so for rename to regress one would need 10k refs to a symbol in fewer than 50 files. This seems unlikely; if there are complaints, we can fix it by giving up on the response based on the number of files covered instead.
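Capping a response can be sketched as below. This is a hedged illustration, not the Server.cpp code: `LimitedReply` and `limitResults` are hypothetical names, and it assumes that truncation is reported to the caller through a HasMore-style flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reply type: the items that fit under the limit, plus a flag
// telling the caller that more results existed.
struct LimitedReply {
  std::vector<int> Items;
  bool HasMore = false;
};

// Copy at most Limit items into the reply and flag truncation when an
// additional item would have been sent.
inline LimitedReply limitResults(const std::vector<int> &All, size_t Limit) {
  LimitedReply Reply;
  for (int Item : All) {
    if (Reply.Items.size() == Limit) {
      Reply.HasMore = true;
      break;
    }
    Reply.Items.push_back(Item);
  }
  return Reply;
}

// Five results with a limit of three: three are kept and HasMore is set.
inline void checkLimit() {
  const LimitedReply Reply = limitResults({1, 2, 3, 4, 5}, 3);
  assert(Reply.Items.size() == 3);
  assert(Reply.HasMore);
}
```

A result set that fits exactly within the limit is returned whole with the flag left false.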
Differential Revision: https://reviews.llvm.org/D101914
|
 | clang-tools-extra/clangd/index/remote/server/Server.cpp |
 | clang-tools-extra/clangd/test/remote-index/result-limiting.test |
Commit
888307ee625b50b060317e2100fb16e0be2626b7
by kadircet[clangd][remote-client] Set HasMore to true for failure
Previously, the client set HasMore to true iff the stream said so. Hence, if the stream was broken for whatever reason (e.g. hitting the deadline for a huge response), HasMore would be false, which is semantically incorrect (e.g. it will throw rename off).
Differential Revision: https://reviews.llvm.org/D101915
|
 | clang-tools-extra/clangd/index/remote/Client.cpp |
Commit
20506fb1f361e41012506c6c252fd690541fc708
by cjdb[libcxx] removes operator!= and globally guards against no spaceship operator
* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't really make sense to have `operator<=>`-less friendly sections.
Depends on D100283.
Differential Revision: https://reviews.llvm.org/D100342
|
 | libcxx/test/std/language.support/cmp/cmp.categories.pre/zero_type.verify.cpp |
 | libcxx/test/std/language.support/cmp/cmp.weakord/weakord.pass.cpp |
 | libcxx/include/compare |
 | libcxx/test/std/language.support/cmp/cmp.partialord/partialord.pass.cpp |
 | libcxx/test/std/language.support/cmp/cmp.strongord/strongord.pass.cpp |
Commit
9eb0969a767bdc8ed5b28dbcc51b46c2ee088256
by cjdb[libcxx] makes comparison operators for `std::*_ordering` types hidden friends
The standard leaves it up to the implementation to decide whether or not these operators are hidden friends. There are several (well-documented) reasons to prefer hidden friends, as well as an argument for improved readability.
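For readers unfamiliar with the idiom, a hidden friend is an operator defined inside the class body as a `friend`, so that it is found only through argument-dependent lookup. The sketch below uses a hypothetical `demo::Level` type and is not the libc++ `<compare>` code.

```cpp
#include <cassert>

namespace demo {

// Hypothetical ordering-like type with hidden friend comparison operators.
class Level {
  int Rank;

public:
  constexpr explicit Level(int R) : Rank(R) {}

  // Hidden friends: defined in the class body, so they are not visible to
  // ordinary namespace-scope lookup and are found only via ADL when at
  // least one argument is a Level.
  friend constexpr bool operator==(Level A, Level B) {
    return A.Rank == B.Rank;
  }
  friend constexpr bool operator<(Level A, Level B) {
    return A.Rank < B.Rank;
  }
};

} // namespace demo

// The operators are still usable wherever a Level is an operand.
inline void checkHiddenFriends() {
  assert(demo::Level(1) == demo::Level(1));
  assert(demo::Level(1) < demo::Level(2));
}
```

Because such operators do not take part in overload resolution for unrelated types, they tend to produce shorter candidate lists in diagnostics.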
Depends on D100342.
Differential Revision: https://reviews.llvm.org/D101707
|
 | libcxx/include/compare |
Commit
578d09c1b195d859ca7e62840ff6bb83421a77b5
by cjdb[libcxx] deprecates/removes `std::raw_storage_iterator`
C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.
Implements part of:
* P0174R2 'Deprecating Vestigial Library Parts in C++17'
* P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'
Differential Revision: https://reviews.llvm.org/D101730
|
 | libcxx/include/__memory/raw_storage_iterator.h |
 | libcxx/test/std/utilities/memory/storage.iterator/deprecated.verify.cpp |
 | libcxx/test/std/utilities/memory/storage.iterator/raw_storage_iterator.pass.cpp |
 | libcxx/test/std/utilities/memory/storage.iterator/raw_storage_iterator.base.pass.cpp |
Commit
6676e09b22c3478686d48cb835a98df62fcfbb7e
by gysit[mlir][linalg] Remove IndexedGenericOp support from Fusion...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102174
|
 | mlir/test/Dialect/Linalg/fusion-indexed-generic.mlir |
 | mlir/test/Dialect/Linalg/fusion-indexed.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/Fusion.cpp |