Commit
aaf62958f1ae3c17ed1f4551bac37c2e202ffd5e
by i
[CMake] Delete obsoleted COMPILER_RT_TEST_TARGET_TRIPLE
The last user has been removed from llvm-zorg for Android.
|
 | compiler-rt/cmake/Modules/CompilerRTUtils.cmake |
Commit
4a36e96c3fc2a9128097bfc4f907ccebc5dc66af
by Matthew.Arsenault
RegAllocGreedy: Account for reserved registers in num regs heuristic
This simple heuristic uses the estimated live range length combined with the number of registers in the class to switch which heuristic to use. This was taking the raw number of registers in the class, even though not all of them may be available. AMDGPU heavily relies on dynamically reserved numbers of registers based on user attributes to satisfy occupancy constraints, so the raw number is highly misleading.
There are still a few problems here. In the original testcase that made me notice this, the live range size is incorrect after the scheduler rearranges instructions, since the instructions don't have the original InstrDist offsets. Additionally, I think it would be more appropriate to use the number of disjointly allocatable registers in the class. For the AMDGPU register tuples, there are a large number of registers in each tuple class, but only a small fraction can actually be allocated at the same time since they all overlap with each other. It seems we do not have a query that corresponds to the number of independently allocatable registers. Relatedly, I'm still debugging some allocation failures where overlapping tuples seem to not be handled correctly.
The test changes are mostly noise. There are a handful of x86 tests that look like regressions with an additional spill, and a handful that now avoid a spill. The worst looking regression is likely test/Thumb2/mve-vld4.ll which introduces a few additional spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll shows a massive improvement by completely eliminating a large number of spills inside a loop.
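To illustrate the idea in the message above, here is a minimal, self-contained sketch of counting only the non-reserved registers of a class before comparing against a live range size. It does not use LLVM's actual data structures: `RegClass`, `numAllocatableRegs`, `exceedsRegBudget`, and the `Factor` parameter are hypothetical names invented for this example, and the real comparison in RegAllocGreedy.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a register class: a list of physical register ids.
struct RegClass {
  std::vector<unsigned> Regs;
};

// Count the registers of the class that are not reserved.
// Reserved[R] is true when register R is unavailable to the allocator,
// e.g. because a target reserved it dynamically for this function.
std::size_t numAllocatableRegs(const RegClass &RC,
                               const std::vector<bool> &Reserved) {
  std::size_t N = 0;
  for (unsigned R : RC.Regs) {
    assert(R < Reserved.size() && "register id out of range");
    if (!Reserved[R])
      ++N;
  }
  return N;
}

// Hypothetical heuristic switch: a live range counts as "large" when its
// estimated size exceeds Factor times the number of registers considered.
// The threshold form is an assumption made for illustration only.
bool exceedsRegBudget(std::size_t LiveRangeSize, std::size_t NumRegs,
                      std::size_t Factor) {
  return LiveRangeSize > Factor * NumRegs;
}
```

With a class of four registers of which two are reserved, a live range of size 5 and a factor of 2 stays under the budget when the raw count is used (5 > 8 is false) but exceeds it when the allocatable count is used (5 > 4 is true), so the two counts can select different heuristics.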
|
 | llvm/test/CodeGen/RISCV/rv32zbp.ll |
 | llvm/test/CodeGen/X86/load-combine.ll |
 | llvm/test/CodeGen/X86/vec_umulo.ll |
 | llvm/test/CodeGen/X86/gather-addresses.ll |
 | llvm/test/CodeGen/X86/2007-10-12-SpillerUnfold1.ll |
 | llvm/test/CodeGen/X86/bswap.ll |
 | llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir |
 | llvm/test/CodeGen/Hexagon/reg-scavengebug-2.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll |
 | llvm/test/CodeGen/PowerPC/urem-vector-lkk.ll |
 | llvm/test/CodeGen/X86/vec-strict-cmp-128.ll |
 | llvm/test/CodeGen/X86/bool-vector.ll |
 | llvm/test/CodeGen/X86/i128-mul.ll |
 | llvm/test/CodeGen/X86/vector-tzcnt-128.ll |
 | llvm/test/CodeGen/X86/vec-strict-inttofp-512.ll |
 | llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll |
 | llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll |
 | llvm/test/CodeGen/X86/fptosi-sat-scalar.ll |
 | llvm/test/CodeGen/X86/funnel-shift.ll |
 | llvm/test/CodeGen/X86/setcc-wide-types.ll |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bswap.ll |
 | llvm/test/CodeGen/X86/smul_fix.ll |
 | llvm/test/CodeGen/AMDGPU/load-constant-i16.ll |
 | llvm/test/CodeGen/X86/hoist-and-by-const-from-shl-in-eqcmp-zero.ll |
 | llvm/test/CodeGen/X86/funnel-shift-rot.ll |
 | llvm/test/CodeGen/X86/vec_shift4.ll |
 | llvm/test/CodeGen/X86/i256-add.ll |
 | llvm/test/CodeGen/X86/vector-fshr-128.ll |
 | llvm/test/CodeGen/X86/sdiv_fix_sat.ll |
 | llvm/test/CodeGen/AMDGPU/load-global-i16.ll |
 | llvm/test/CodeGen/X86/64-bit-shift-by-32-minus-y.ll |
 | llvm/test/CodeGen/X86/illegal-bitfield-loadstore.ll |
 | llvm/test/CodeGen/AMDGPU/sdiv64.ll |
 | llvm/test/CodeGen/X86/umin.ll |
 | llvm/test/CodeGen/AMDGPU/frem.ll |
 | llvm/test/CodeGen/X86/umax.ll |
 | llvm/test/CodeGen/X86/mul-constant-result.ll |
 | llvm/test/CodeGen/X86/fp128-cast.ll |
 | llvm/test/CodeGen/X86/popcnt.ll |
 | llvm/test/CodeGen/X86/mul-i512.ll |
 | llvm/test/CodeGen/X86/vector-idiv-v2i32.ll |
 | llvm/test/CodeGen/X86/horizontal-reduce-smax.ll |
 | llvm/test/CodeGen/X86/vector-fshr-rot-128.ll |
 | llvm/test/CodeGen/X86/avx512bwvl-intrinsics-upgrade.ll |
 | llvm/test/CodeGen/Thumb2/mve-fptosi-sat-vector.ll |
 | llvm/test/CodeGen/X86/select.ll |
 | llvm/test/CodeGen/AMDGPU/half.ll |
 | llvm/test/CodeGen/X86/horizontal-reduce-umin.ll |
 | llvm/test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll |
 | llvm/test/CodeGen/X86/abs.ll |
 | llvm/test/CodeGen/X86/clear-highbits.ll |
 | llvm/test/CodeGen/X86/usub_sat.ll |
 | llvm/test/CodeGen/X86/vector-sext.ll |
 | llvm/test/CodeGen/X86/vector-lzcnt-128.ll |
 | llvm/test/CodeGen/X86/vec-strict-cmp-sub128.ll |
 | llvm/test/CodeGen/X86/vector-shift-shl-256.ll |
 | llvm/test/CodeGen/X86/umul_fix_sat.ll |
 | llvm/test/CodeGen/X86/vector-gep.ll |
 | llvm/test/CodeGen/X86/memcmp-more-load-pairs-x32.ll |
 | llvm/test/CodeGen/X86/mul-constant-i64.ll |
 | llvm/test/CodeGen/RISCV/stack-store-check.ll |
 | llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll |
 | llvm/test/CodeGen/X86/neg-abs.ll |
 | llvm/test/CodeGen/X86/pr32284.ll |
 | llvm/test/CodeGen/X86/subvector-broadcast.ll |
 | llvm/test/CodeGen/X86/umulo-64-legalisation-lowering.ll |
 | llvm/test/CodeGen/X86/vector-rotate-128.ll |
 | llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll |
 | llvm/test/CodeGen/X86/pr32610.ll |
 | llvm/test/CodeGen/X86/xmulo.ll |
 | llvm/test/CodeGen/Mips/cconv/vector.ll |
 | llvm/test/CodeGen/X86/peephole-na-phys-copy-folding.ll |
 | llvm/test/CodeGen/X86/bitreverse.ll |
 | llvm/test/CodeGen/X86/nosse-vector.ll |
 | llvm/test/CodeGen/X86/sshl_sat_vec.ll |
 | llvm/test/CodeGen/X86/sadd_sat.ll |
 | llvm/test/CodeGen/X86/build-vector-128.ll |
 | llvm/test/CodeGen/X86/horizontal-reduce-umax.ll |
 | llvm/test/CodeGen/X86/smin.ll |
 | llvm/test/CodeGen/X86/widen_cast-4.ll |
 | llvm/test/CodeGen/AMDGPU/srl.ll |
 | llvm/test/CodeGen/X86/overflow.ll |
 | llvm/test/CodeGen/X86/scheduler-backtracking.ll |
 | llvm/test/CodeGen/X86/mul-i256.ll |
 | llvm/test/CodeGen/X86/div-rem-pair-recomposition-unsigned.ll |
 | llvm/test/CodeGen/X86/mul128.ll |
 | llvm/test/CodeGen/X86/udiv_fix_sat.ll |
 | llvm/test/CodeGen/X86/i64-to-float.ll |
 | llvm/test/CodeGen/X86/fshr.ll |
 | llvm/test/CodeGen/X86/combine-sbb.ll |
 | llvm/test/CodeGen/X86/stack-align-memcpy.ll |
 | llvm/test/CodeGen/Thumb2/mve-simple-arith.ll |
 | llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll |
 | llvm/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll |
 | llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll |
 | llvm/test/CodeGen/X86/vector-fshl-128.ll |
 | llvm/test/CodeGen/ARM/fptosi-sat-scalar.ll |
 | llvm/test/CodeGen/X86/mul-i1024.ll |
 | llvm/lib/CodeGen/RegAllocGreedy.cpp |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitreverse.ll |
 | llvm/test/CodeGen/ARM/srem-seteq-illegal-types.ll |
 | llvm/test/CodeGen/Thumb2/srem-seteq-illegal-types.ll |
 | llvm/test/CodeGen/X86/vec-strict-fptoint-256.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll |
 | llvm/test/CodeGen/X86/uadd_sat.ll |
 | llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll |
 | llvm/test/CodeGen/AMDGPU/shl.ll |
 | llvm/test/CodeGen/X86/sdiv_fix.ll |
 | llvm/test/CodeGen/X86/ushl_sat.ll |
 | llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll |
 | llvm/test/CodeGen/X86/vector-fshl-rot-128.ll |
 | llvm/test/CodeGen/X86/umul_fix.ll |
 | llvm/test/CodeGen/X86/masked_gather_scatter.ll |
 | llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed.ll |
 | llvm/test/CodeGen/X86/umul-with-overflow.ll |
 | llvm/test/CodeGen/X86/shrink_vmul.ll |
 | llvm/test/CodeGen/X86/2008-04-16-ReMatBug.ll |
 | llvm/test/CodeGen/X86/legalize-shl-vec.ll |
 | llvm/test/CodeGen/X86/pr32329.ll |
 | llvm/test/CodeGen/X86/pr34080-2.ll |
 | llvm/test/CodeGen/Thumb2/mve-fptoui-sat-vector.ll |
 | llvm/test/CodeGen/X86/avx512-calling-conv.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll |
 | llvm/test/CodeGen/X86/pr31088.ll |
 | llvm/test/CodeGen/X86/vector-shift-lshr-256.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll |
 | llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll |
 | llvm/test/CodeGen/RISCV/rv64zbp.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll |
 | llvm/test/CodeGen/X86/i128-sdiv.ll |
 | llvm/test/CodeGen/X86/smul_fix_sat.ll |
 | llvm/test/CodeGen/X86/nontemporal.ll |
 | llvm/test/CodeGen/X86/hoist-and-by-const-from-lshr-in-eqcmp-zero.ll |
 | llvm/test/CodeGen/AMDGPU/greedy-global-heuristic.mir |
 | llvm/test/CodeGen/X86/mmx-arith.ll |
 | llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll |
 | llvm/test/CodeGen/X86/vector-trunc-ssat.ll |
 | llvm/test/CodeGen/X86/horizontal-reduce-smin.ll |
 | llvm/test/CodeGen/X86/sshl_sat.ll |
 | llvm/test/CodeGen/X86/known-signbits-vector.ll |
 | llvm/test/CodeGen/X86/avx512-select.ll |
 | llvm/test/CodeGen/X86/smax.ll |
 | llvm/test/CodeGen/X86/vshift-6.ll |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz.ll |
 | llvm/test/CodeGen/X86/ushl_sat_vec.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll |
 | llvm/test/CodeGen/X86/pr46527.ll |
 | llvm/test/CodeGen/X86/statepoint-vreg-unlimited-tied-opnds.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll |
 | llvm/test/CodeGen/PowerPC/srem-vector-lkk.ll |
Commit
962acf0a27fbdb945b7b790cc57ba4ae4729879f
by tlively
[lld][WebAssembly] Use llvm-objdump to test __wasm_init_memory
Rather than depending on the hex dump from obj2yaml. Now the test shows the expected function body in a human readable format.
Differential Revision: https://reviews.llvm.org/D109730
|
 | lld/test/wasm/data-segments.ll |
Commit
299b5d420df15fafc9936bc24995f6cd6ad325be
by hoy
[CSSPGO] Enable pseudo probe instrumentation in O0 mode.
Pseudo probe instrumentation was missing from O0 build. It is needed in cases where some source files are built in O0 while the others are built in optimize mode.
Reviewed By: wenlei, wlei, wmi
Differential Revision: https://reviews.llvm.org/D109531
|
 | clang/test/CodeGen/pseudo-probe-emit.c |
 | llvm/lib/Passes/PassBuilder.cpp |
Commit
54d755a034362814bd7a0b90f172cbba39729cf4
by Matthew.Arsenault
DAG: Fix incorrect folding of fmul -1 to fneg
The fmul is a canonicalizing operation, and fneg is not so this would break denormals that need flushing and also would not quiet signaling nans. Fold to fsub instead, which is also canonicalizing.
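The difference described above can be modeled at the bit level. The sketch below is not the DAGCombiner code; it is a hypothetical model of IEEE-754 binary32 values as raw bits, assuming the common encoding where the most significant mantissa bit set marks a quiet NaN. The helper names (`fnegBits`, `canonicalizeBits`, `canonicalizingNegate`) are invented for this example, and the sign of a NaN produced by a real multiply is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Field masks for IEEE-754 binary32.
constexpr std::uint32_t SignBit = 0x80000000u;
constexpr std::uint32_t ExpMask = 0x7F800000u;
constexpr std::uint32_t MantMask = 0x007FFFFFu;
constexpr std::uint32_t QuietBit = 0x00400000u; // assumed quiet-NaN marker

bool isSignalingNaN(std::uint32_t Bits) {
  return (Bits & ExpMask) == ExpMask && (Bits & MantMask) != 0 &&
         (Bits & QuietBit) == 0;
}

bool isDenormal(std::uint32_t Bits) {
  return (Bits & ExpMask) == 0 && (Bits & MantMask) != 0;
}

// fneg modeled as a pure sign-bit flip: it never inspects the value.
std::uint32_t fnegBits(std::uint32_t Bits) { return Bits ^ SignBit; }

// Simplified model of canonicalization: quiet signaling NaNs and,
// if FlushDenormals is set, flush denormals to a signed zero.
std::uint32_t canonicalizeBits(std::uint32_t Bits, bool FlushDenormals) {
  if (isSignalingNaN(Bits))
    return Bits | QuietBit;
  if (FlushDenormals && isDenormal(Bits))
    return Bits & SignBit;
  return Bits;
}

// Model of a canonicalizing negation (what fmul by -1 or an fsub provides).
std::uint32_t canonicalizingNegate(std::uint32_t Bits, bool FlushDenormals) {
  std::uint32_t Result = canonicalizeBits(fnegBits(Bits), FlushDenormals);
  assert(!isSignalingNaN(Result) && "canonical result must not be an sNaN");
  return Result;
}
```

In this model a signaling NaN such as 0x7F800001 is still signaling after `fnegBits`, while `canonicalizingNegate` sets the quiet bit; likewise a denormal survives `fnegBits` but is flushed when `FlushDenormals` is true.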
|
 | llvm/test/CodeGen/AMDGPU/fneg-combines.ll |
 | llvm/test/CodeGen/AArch64/arm64-fmadd.ll |
 | llvm/test/CodeGen/ARM/fnegs.ll |
 | llvm/test/CodeGen/AArch64/fp16_intrinsic_scalar_3op.ll |
 | llvm/test/CodeGen/PowerPC/combine-fneg.ll |
 | llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |
 | llvm/test/CodeGen/Hexagon/opt-fneg.ll |
Commit
d4e03bccd4567b455ce0bf03a797b5189b4dcba8
by listmail
regen an autogened test which is stale
|
 | llvm/test/Transforms/LoopIdiom/basic.ll |
Commit
626586fc253c6f032aedb325dba6b1ff3f11875e
by thakis
Re-Revert "clang-tidy: introduce readability-containter-data-pointer check"
This reverts commit 49992c04148e5327bef9bd2dff53a0d46004b4b4. The test is still failing on Windows, see comments on https://reviews.llvm.org/D108893
|
 | clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp |
 | clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp |
 | clang-tools-extra/test/clang-tidy/checkers/readability-container-data-pointer.cpp |
 | clang-tools-extra/clang-tidy/readability/CMakeLists.txt |
 | clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.h |
 | clang-tools-extra/docs/ReleaseNotes.rst |
 | clang-tools-extra/docs/clang-tidy/checks/readability-data-pointer.rst |
Commit
10b069d1a09f3ef145ce468e790243015a7c84ec
by llvmgnsyncbot
[gn build] Port 626586fc253c
|
 | llvm/utils/gn/secondary/clang-tools-extra/clang-tidy/readability/BUILD.gn |
Commit
500d4c45ba7f31907a64dead8ddb292649e6ce75
by joker.eph
[MLIR] Use memref.copy ops in BufferResultsToOutParams pass.
Both copy/alloc ops are using memref dialect after this change.
Reviewed By: silvas, mehdi_amini
Differential Revision: https://reviews.llvm.org/D109480
|
 | mlir/lib/Transforms/BufferResultsToOutParams.cpp |
 | mlir/lib/Transforms/PassDetail.h |
 | mlir/include/mlir/Transforms/Passes.td |
 | mlir/lib/Transforms/CMakeLists.txt |
 | mlir/test/Transforms/buffer-results-to-out-params.mlir |
Commit
a32300a68f6c94b7b275e3560ed31e9174cec5ad
by joker.eph
Make the --mlir-disable-threading command line option overrides the C++ API usage
This seems in-line with the intent and how we build tools around it. Update the description for the flag accordingly. Also use an injected thread pool in MLIROptMain, now we will create threads up-front and reuse them across split buffers.
Differential Revision: https://reviews.llvm.org/D109802
|
 | mlir/include/mlir/IR/MLIRContext.h |
 | mlir/lib/Support/MlirOptMain.cpp |
 | mlir/lib/IR/MLIRContext.cpp |
Commit
0dc461441eed3b49b36bec889ddf1449b502d17a
by joker.eph
Revert "[flang] Make 'this_image()' an intrinsic function"
This reverts commit 81f8ad1769665a569a235b749e0e9e69ce7dc65e. This seems to break the shared libs build (linaro-flang-aarch64-sharedlibs bot) with:
undefined reference to `Fortran::semantics::IsCoarray(Fortran::semantics::Symbol const&)'
(from tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/tools.cpp.o)
When linking lib/libFortranEvaluate.so.14git
|
 | flang/include/flang/Evaluate/tools.h |
 | flang/test/Semantics/this_image.f90 |
 | flang/docs/Intrinsics.md |
 | flang/lib/Evaluate/intrinsics.cpp |
 | flang/lib/Evaluate/tools.cpp |
 | flang/test/Semantics/call10.f90 |