Commit
9b1e00738c5ddba681e17e5cb7c260d9afc4c3a7
by preames[BasicAA] Remove unneeded special case for malloc/calloc
This code pre-exists the generic handling for inaccessiblememonly. If we remove it and update one test with inaccessiblememonly, nothing else changes. Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.
|
 | llvm/lib/Analysis/BasicAliasAnalysis.cpp |
 | llvm/test/Transforms/GVN/nonescaping-malloc.ll |
Commit
862b5a52335fef9e29013b00506e49342ac473f1
by groverkss[MLIR][Presburger] Attach values only to non-local identifiers in FAVC
This patch changes `FlatAffineValueConstraints` to only allow attaching values to non-local identifiers.
The reasoning for this change is:
1. Information attached to local identifiers can be lost since local identifiers can be removed for output size optimizations.
2. There are no current use cases for attaching values to local identifiers.
3. Attaching a value to a local identifier does not make sense since a local identifier represents existential quantification.
This patch also adds some additional asserts to the affected functions.
Reviewed By: arjunp, bondhugula
Differential Revision: https://reviews.llvm.org/D125613
|
 | mlir/include/mlir/Dialect/Affine/Analysis/AffineStructures.h |
 | mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp |
Commit
e00cbbec06c08dc616a0d52a20f678b8fbd4e304
by groverkss[MLIR][Presburger] Cleanup getMaybeValues in FACV
This patch cleans up the multiple getMaybeValue functions so that they take an IdKind instead of existing as separate special-case functions.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D125617
|
 | mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp |
 | mlir/include/mlir/Dialect/Affine/Analysis/AffineStructures.h |
 | mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp |
Commit
573a5b58001d6dd86d404832b7b1c45a1b4f4c55
by marek.kurdej+llvm.orgRevert "[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline."
This reverts commit 50cd52d9357224cce66a9e00c9a0417c658a5655.
It provoked regressions in C++ and ObjectiveC as described in https://reviews.llvm.org/D123676#3515949.
Reproducers:
```
MACRO_BEGIN
#if A
int f();
#else
int f();
#endif
```
```
NS_SWIFT_NAME(A)
@interface B : C
@property(readonly) D value;
@end
```
|
 | clang/unittests/Format/FormatTest.cpp |
 | clang/lib/Format/UnwrappedLineParser.cpp |
 | clang/lib/Format/FormatTokenLexer.cpp |
Commit
d81064949f41b95a5a2122889f07c9ffcc875834
by samolisov[ArgPromotion] Add unused-argument.ll test (NFC)
If a pointer argument is unused within the callee, it should be removed from the function's signature, while all used pointer arguments should be promoted as expected. The ArgumentPromotion pass doesn't touch unused non-pointer arguments at all.
|
 | llvm/test/Transforms/ArgumentPromotion/unused-argument.ll |
Commit
92f1028ceb30dc8e7eda3f06a8c7aa8e8082ff65
by martin[llvm-readobj] Fix printing of Windows ARM unwind opcodes, add tests
The existing code was essentially untested; in some cases it used variable types too narrow to hold all the bits, and in others the bit manipulation operations were incorrect.
For the "ldr lr, [sp], #x" opcode, there's nothing in the documentation that says it cannot be used in a prologue. (In practice, it would probably seldom be used there, but technically there's nothing stopping it from being used.) The documentation only specifies the operation to replay for unwinding it, but the corresponding mirror instruction to be printed for a prologue is "str lr, [sp, #-x]!".
Also improve printing of register masks, by aggregating registers into ranges where possible, and make the printing of the terminating branches clearer, as "bx <reg>" and "b.w <target>".
Differential Revision: https://reviews.llvm.org/D125643
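The register-mask aggregation described in this commit can be illustrated with a short Python sketch. This is not the llvm-readobj implementation: the function name `format_register_mask`, the `r` prefix, and the brace-and-comma output format are all assumptions made for illustration.

```python
def format_register_mask(mask: int, prefix: str = "r") -> str:
    """Render the set bits of `mask` as register names, merging runs into ranges.

    Hypothetical helper; the output format is an assumption, not llvm-readobj's.
    """
    parts = []
    bit = 0
    while mask >> bit:
        if not (mask >> bit) & 1:
            bit += 1
            continue
        start = bit
        # Extend the run while the next higher bit is also set.
        while (mask >> (bit + 1)) & 1:
            bit += 1
        if start == bit:
            parts.append(f"{prefix}{start}")
        else:
            parts.append(f"{prefix}{start}-{prefix}{bit}")
        bit += 1
    return "{" + ", ".join(parts) + "}"
```

In this sketch a mask with bits 4 through 7 set is rendered as one range rather than four separate names.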
|
 | llvm/tools/llvm-readobj/ARMWinEHPrinter.cpp |
 | llvm/test/tools/llvm-readobj/COFF/arm-unwind-opcodes.s |
Commit
e213e5a999dbaa3c1aa97a6f81b77a3358b00b2a
by riddleriver[mlir:PDLL] Drop space as a completion commit character
This causes annoyances when attempting to use space as a trigger character (to start a different completion).
|
 | mlir/lib/Tools/mlir-pdll-lsp-server/LSPServer.cpp |
Commit
6d4471efb0b94c066e5e93c99278397691869dbc
by riddleriver[mlir:PDLL] Improve the location ranges of several expressions during parsing
This allows the range to encompass more of the source associated with the full expression, which makes diagnostics easier to see, tooling easier to build, and so on.
|
 | mlir/lib/Tools/PDLL/Parser/Parser.cpp |
Commit
17e2e7b7885c0afe688bcd4d6b198aab6ea8f58a
by riddleriver[mlir:PDLL] Don't append / for directory code completion
This allows for properly using / as a trigger character, i.e. more easily allows chaining include directory completions.
|
 | mlir/lib/Tools/mlir-pdll-lsp-server/PDLLServer.cpp |
Commit
ebad5fb309570765e8f121c441dcd90b5aa0536a
by riddleriver[mlir][Canonicalize] Fix command-line options
The canonicalize command-line options currently have no effect, as the pass is reading the pass options in its constructor, before they're actually initialized. This results in the default values of the options always being used.
The change here moves the initialization of the `GreedyRewriteConfig` out of the constructor, so that it runs after the pass options have been parsed.
Fixes #55466
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125621
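The ordering problem described in this commit (options read in the constructor, before they are parsed) can be shown with a small Python sketch. It is only an analogy for the pattern: `RewriteConfig`, `BuggyPass`, `FixedPass`, `parse_options`, and `initialize` are hypothetical names and do not correspond to the MLIR API.

```python
from dataclasses import dataclass


@dataclass
class RewriteConfig:
    max_iterations: int


class BuggyPass:
    """Reads its option in the constructor, before parsing has happened."""

    def __init__(self):
        self.max_iterations_option = 10  # default value of the option
        # Bug: this snapshot is taken too early and always sees the default.
        self.config = RewriteConfig(self.max_iterations_option)

    def parse_options(self, max_iterations):
        self.max_iterations_option = max_iterations


class FixedPass:
    """Builds its config in a separate step that runs after option parsing."""

    def __init__(self):
        self.max_iterations_option = 10
        self.config = None

    def parse_options(self, max_iterations):
        self.max_iterations_option = max_iterations

    def initialize(self):
        # Runs after parse_options, so the parsed value is visible here.
        self.config = RewriteConfig(self.max_iterations_option)
```

With `BuggyPass`, calling `parse_options(3)` leaves `config.max_iterations` at the default of 10; with `FixedPass`, calling `parse_options(3)` and then `initialize()` yields 3.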
|
 | mlir/test/Transforms/test-canonicalize.mlir |
 | mlir/lib/Transforms/Canonicalizer.cpp |
Commit
c4c01e4e4e388a1e3cefc9e3982ac15fb94d4f40
by npopov[llvm-nm] Always use opaque pointers (PR55506)
Always enable opaque pointers in llvm-nm, because the tool doesn't actually care, and this allows us to read both typed pointer and opaque pointer bitcode files in one archive. Previously this depended on the order inside the archive (it would work with an opaque pointer bitcode file first, but fail with a typed pointer bitcode file first).
Fixes https://github.com/llvm/llvm-project/issues/55506.
Differential Revision: https://reviews.llvm.org/D125751
|
 | llvm/test/tools/llvm-nm/opaque-pointers.ll |
 | llvm/tools/llvm-nm/llvm-nm.cpp |
Commit
323514de58aba8c073faa37335345338ae57173c
by npopov[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first iteration, such that we never branch on the latch exit condition. As such, we need to freeze the condition of the new branch that is introduced before the loop, as it now executes unconditionally.
Differential Revision: https://reviews.llvm.org/D125754
|
 | llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll |
 | llvm/test/Transforms/LoopUnroll/runtime-loop-multiexit-dom-verify.ll |
 | llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp |
 | llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll |
 | llvm/test/Transforms/LoopUnroll/runtime-multiexit-heuristic.ll |
Commit
e9a1c82d695472820c93af40cbf3d9fde2a149c6
by npopov[SCEVExpander] Expand umin_seq using freeze
%x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y). This patch changes the expansion to umin(%x, freeze %y) instead (https://alive2.llvm.org/ce/z/wujUhp).
The motivation for this change is the set of test cases affected by D124910, where the freeze expansion ultimately produces better optimization results. This is largely because `(%x umin_seq %y) == %x` is a common expansion pattern, which reliably optimizes in the freeze representation, but only sometimes with the zero comparison (in particular, if %x == 0 can fold to something else, we generally won't be able to recover reasonable code from this).
Differential Revision: https://reviews.llvm.org/D125372
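The two expansions of `%x umin_seq %y` can be compared in a toy Python model of poison. This is a sketch under simplifying assumptions, not LLVM code: `POISON` is a sentinel object, values are non-negative Python integers, and `freeze` replaces poison with a caller-chosen value (the `arbitrary` parameter is hypothetical) rather than a truly arbitrary one.

```python
POISON = object()  # sentinel standing in for an LLVM poison value


def umin(a, b):
    # Poison in either operand poisons the result.
    return POISON if a is POISON or b is POISON else min(a, b)


def icmp_eq(a, b):
    return POISON if a is POISON or b is POISON else a == b


def select(cond, if_true, if_false):
    # Only the condition and the chosen arm matter.
    if cond is POISON:
        return POISON
    return if_true if cond else if_false


def freeze(value, arbitrary=0):
    # Poison becomes some fixed, non-poison value.
    return arbitrary if value is POISON else value


def expand_with_select(x, y):
    """Old form: %x == 0 ? 0 : umin(%x, %y)."""
    return select(icmp_eq(x, 0), 0, umin(x, y))


def expand_with_freeze(x, y, arbitrary=0):
    """New form: umin(%x, freeze %y)."""
    return umin(x, freeze(y, arbitrary))
```

In this model both forms return 0 when `x` is 0 and `y` is poison, and both agree on ordinary values. When `x` is non-zero and `y` is poison, the select form yields poison while the freeze form yields some concrete value no larger than `x`.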
|
 | llvm/test/Transforms/IndVarSimplify/exit-count-select.ll |
 | llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp |
 | llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h |
Commit
7814b559bd5e1dbb3c016b393068698bc5781cc5
by riddleriver[GreedyPatternRewriter] Avoid reversing constant order
The previous fix from af371f9f98da only applied when using a bottom-up traversal. The change here applies the constant preprocessing logic to the top-down case as well. This resolves the issue with the canonicalizer pass still reordering constants, since it uses a top-down traversal by default.
Fixes #51892
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125623
|
 | mlir/lib/Transforms/Utils/GreedyPatternRewriteDriver.cpp |
 | mlir/test/lib/Dialect/Test/TestPatterns.cpp |
 | mlir/test/Dialect/Arithmetic/canonicalize.mlir |
 | mlir/test/Transforms/test-operation-folder.mlir |
 | flang/test/Lower/Intrinsics/achar.f90 |
 | mlir/test/Dialect/SCF/canonicalize.mlir |
Commit
d9d15af7873fe16d7a0dde4def30f40fa9901777
by qiucofan[PowerPC] Treat llvm.fmuladd intrinsic as using CTR
This fixes bug 55463, similar to D78668. This is a temporary fix since we will switch to post-isel CTR loop determination in the future.
Reviewed By: dim, shchenz
Differential Revision: https://reviews.llvm.org/D125746
|
 | llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp |
 | llvm/test/CodeGen/PowerPC/pr55463.ll |
Commit
6bcafce103a4d759fd9acdc472bb2c8d0b2c859c
by diana.picus[flang][Runtime] Use proper prototypes in Fortran_main. NFCI
This is compiled as C code, so it's a good idea to be explicit about the prototype. Clang complains about this when -Wstrict-prototypes is used.
Differential Revision: https://reviews.llvm.org/D125672
|
 | flang/include/flang/Runtime/main.h |
 | flang/runtime/FortranMain/Fortran_main.c |
Commit
00999fb6e14231de14db292510c854e1bf3baded
by yeting.kuo[SelectionDAGBuilder] Pass fast math flags to most of VP SDNodes.
The patch does not pass math flags to float VPCmpIntrinsics because LLParser could not identify float VPCmpIntrinsics as FPMathOperators.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D125600
|
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp |
 | llvm/test/CodeGen/RISCV/pass-fast-math-flags-sdnode.ll |
Commit
1c0b03f6e706978f2e87408f7fd5e4c846d6c9a8
by diana.picus[flang][driver] Support parsing response files
Add support for reading response files in the flang driver. Response files contain command line arguments and are used whenever a command becomes longer than the shell/environment limit. Response files are recognized via the special "@path/to/response/file.rsp" syntax, which distinguishes them from other file inputs.
This patch hardcodes GNU tokenization, since we don't have a CL mode for the driver. In the future we might want to add a --rsp-quoting command line option, like clang has, to accommodate Windows platforms.
Differential Revision: https://reviews.llvm.org/D124846
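The `@file` expansion described in this commit can be approximated in a few lines of Python. This is a rough sketch and not the flang driver's code: `expand_response_files` is a hypothetical helper, `shlex.split` only approximates GNU-style tokenization, nested response files are not expanded, and leaving an unreadable `@path` argument untouched is an assumption.

```python
import shlex
from pathlib import Path


def expand_response_files(args):
    """Replace each '@path' argument with the tokens read from that file."""
    expanded = []
    for arg in args:
        path = Path(arg[1:]) if arg.startswith("@") else None
        if path is not None and path.is_file():
            # POSIX shell-like splitting: whitespace separates tokens,
            # quotes group them.
            expanded.extend(shlex.split(path.read_text()))
        else:
            # Assumption: anything that is not a readable response file
            # is passed through unchanged.
            expanded.append(arg)
    return expanded
```

For example, if `args.rsp` contains `-O2 -c`, then `expand_response_files(["flang", "@args.rsp", "a.f90"])` would produce the same list as passing `-O2 -c` directly on the command line.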
|
 | flang/test/Driver/response-file.f90 |
 | flang/tools/flang-driver/driver.cpp |
Commit
7e65ffaa8bb65adc0324ccbea1fef56cab6eafe1
by thomasp[test, x86] Fix spurious x86-target-features.c failure
x86-target-features.c can spuriously fail when checking for absence of the string "lvi" in the compiler output due to the temporary path used for the output file. For example: "-o" "/tmp/lit-tmp-981j7lvi/x86-target-features-670b86.o" will make the test fail. This commit checks specifically for lvi as a target feature, in a similar way to the positive CHECK directive just above.
Test Plan: fails when using -mlvi-hardening and passes otherwise
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D125084
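The false positive described in this commit can be reproduced with a plain substring search. The Python sketch below is only an illustration: the two helper functions are hypothetical, and the exact pattern used in the real CHECK directive is not shown in this log, so the `"-target-feature" "+lvi` pattern here is an assumption.

```python
def mentions_lvi_anywhere(cmdline: str) -> bool:
    # Naive check: matches "lvi" anywhere, including inside temporary paths.
    return "lvi" in cmdline


def enables_lvi_feature(cmdline: str) -> bool:
    # Stricter check: only matches "lvi" when it appears as a target feature.
    return '"-target-feature" "+lvi' in cmdline


# Command line without LVI hardening, but with an unlucky temporary directory.
unlucky = '"-cc1" "-o" "/tmp/lit-tmp-981j7lvi/x86-target-features-670b86.o"'

# Illustrative command line with an LVI-related target feature enabled.
hardened = '"-cc1" "-target-feature" "+lvi-cfi" "-o" "/tmp/out.o"'
```

Here the naive check reports a match for `unlucky` even though no LVI feature is enabled, while the stricter check matches only `hardened`.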
|
 | clang/test/Driver/x86-target-features.c |
Commit
fcfb86483b29df124c0b4a61ff65b0c6800f64b7
by flo[LV] set Header earlier, use variable instead of repeated access (NFC).
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
25ac078a961de91522e5b5afaa6d4ffdd0dd05c4
by gabor.marton[clang][ASTImporter] Add isNewDecl
Summary: Add a new function with which we can query if a Decl had been newly created during the import process. This feature is a must if we want to have a different static analysis strategy for such newly created declarations.
This is a dependent patch that is needed for the new CTU implementation described at https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728
Differential Revision: https://reviews.llvm.org/D123685
|
 | clang/include/clang/AST/ASTImporterSharedState.h |
 | clang/unittests/AST/ASTImporterTest.cpp |
 | clang/lib/AST/ASTImporter.cpp |
Commit
56b9b97c1ef594f218eb06d2e62daa85cc238500
by gabor.marton[clang][analyzer][ctu] Make CTU a two phase analysis
This new CTU implementation is the natural extension of the normal single TU analysis. The approach consists of two analysis phases. During the first phase, we do a normal single TU analysis. During this phase, if we find a foreign function (one that could be inlined from another TU), we do not inline it immediately; instead we mark it to be analysed later. When the first phase is finished, we start the second phase, the CTU phase. In this phase, we continue the analysis from the point (exploded node) that was enqueued during the first phase. We gradually extend the exploded graph of the single TU analysis with the new nodes created by inlining the foreign function.
We count the number of analysis steps of the first phase and we limit the second (ctu) phase with this number.
This new implementation makes it convenient for users to run the single-TU and the CTU analysis in one go; they don't need to run the two analyses separately. Thus, we name this new implementation "onego" CTU.
Discussion: https://discourse.llvm.org/t/rfc-much-faster-cross-translation-unit-ctu-analysis-implementation/61728
Differential Revision: https://reviews.llvm.org/D123773
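The two-phase scheme and its step budget can be sketched as a pair of worklist loops. This Python sketch is heavily simplified and is not the analyzer's implementation: `two_phase_analysis`, `successors`, and `is_foreign_call` are hypothetical names, nodes are plain labels rather than exploded-graph nodes, and a deferred node simply pauses its path during the first phase.

```python
from collections import deque


def two_phase_analysis(entry_nodes, successors, is_foreign_call):
    """Return (phase1_nodes, phase2_nodes) in visiting order."""
    worklist = deque(entry_nodes)
    deferred = deque()
    phase1 = []

    # Phase 1: single-TU analysis; foreign calls are queued, not explored.
    while worklist:
        node = worklist.popleft()
        if is_foreign_call(node):
            deferred.append(node)
            continue
        phase1.append(node)
        worklist.extend(successors(node))

    # Phase 2: continue from the queued nodes, limited by the number of
    # steps that phase 1 took.
    budget = len(phase1)
    phase2 = []
    while deferred and budget > 0:
        node = deferred.popleft()
        phase2.append(node)
        budget -= 1
        deferred.extend(successors(node))

    return phase1, phase2
```

With a graph `a -> b, f`, `f -> g`, `g -> h` where `f` is the foreign call, phase one visits `a` and `b`, and phase two visits `f` and `g` before its budget of two steps runs out.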
|
 | clang/test/Analysis/ctu-onego-existingdef.cpp |
 | clang/test/Analysis/Inputs/ctu-onego-toplevel-other.cpp.externalDefMap.ast-dump.txt |
 | clang/test/Analysis/ctu-onego-indirect.cpp |
 | clang/test/Analysis/Inputs/ctu-onego-existingdef-other.cpp.externalDefMap.ast-dump.txt |
 | clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h |
 | clang/test/Analysis/Inputs/ctu-onego-small-other.cpp |
 | clang/test/Analysis/Inputs/ctu-onego-toplevel-other.cpp |
 | clang/lib/CrossTU/CrossTranslationUnit.cpp |
 | clang/test/Analysis/ctu-on-demand-parsing.cpp |
 | clang/test/Analysis/ctu-implicit.c |
 | clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def |
 | clang/test/Analysis/analyzer-config.c |
 | clang/test/Analysis/Inputs/ctu-onego-indirect-other.cpp.externalDefMap.ast-dump.txt |
 | clang/test/Analysis/Inputs/ctu-onego-existingdef-other.cpp |
 | clang/lib/StaticAnalyzer/Core/CallEvent.cpp |
 | clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h |
 | clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp |
 | clang/include/clang/CrossTU/CrossTranslationUnit.h |
 | clang/test/Analysis/ctu-main.c |
 | clang/test/Analysis/ctu-onego-small.cpp |
 | clang/test/Analysis/ctu-main.cpp |
 | clang/lib/StaticAnalyzer/Core/ExprEngine.cpp |
 | clang/test/Analysis/Inputs/ctu-onego-indirect-other.cpp |
 | clang/test/Analysis/ctu-onego-toplevel.cpp |
 | clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h |
 | clang/docs/ReleaseNotes.rst |
 | clang/lib/StaticAnalyzer/Core/CoreEngine.cpp |
 | clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp |
 | clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp |
 | clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h |
 | clang/test/Analysis/Inputs/ctu-onego-small-other.cpp.externalDefMap.ast-dump.txt |
 | clang/test/Analysis/ctu-on-demand-parsing.c |
Commit
d4cdf013c76419140c6bdfb460088b5f09d6472f
by npopov[JumpThreading] Use common code to skip freeze (NFC)
There are multiple places that want to look through freeze, so store condition without freeze in a separate variable.
|
 | llvm/lib/Transforms/Scalar/JumpThreading.cpp |
Commit
7d8ec4dc4461102bafed8063977a66e40562bbb3
by david.spickett[lldb] const a couple of getters on MemoryRegionInfo
GetDirtyPageList was being assigned to const & in most places anyway. If you wanted to change the list you'd make a new one and call SetDirtyPageList.
GetPageSize is just an int so no issues being const.
Differential Revision: https://reviews.llvm.org/D125786
|
 | lldb/source/API/SBMemoryRegionInfo.cpp |
 | lldb/include/lldb/Target/MemoryRegionInfo.h |
Commit
dd12c3433ee9b4ef15c633bd325ab5a0c9c5e03b
by jay.foad[AMDGPU] Shrink F16 MAD/FMA to MADAK/MADMK/FMAAK/FMAMK on GFX10
Differential Revision: https://reviews.llvm.org/D125803
|
 | llvm/test/CodeGen/AMDGPU/gfx10-shrink-mad-fma.mir |
 | llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp |
Commit
aa568e082b4c0aa1cfbc8d1937544af8adbde552
by riddleriver[mlir:GreedyDriver] Return WalkResult::skip after deleting a known constant
This avoids use-after-free when trying to access the regions after visiting the operation.
|
 | mlir/lib/Transforms/Utils/GreedyPatternRewriteDriver.cpp |
Commit
3eb2281bc067688dc701cf94e267395680892cf0
by jay.foad[AMDGPU] Aggressively fold immediates in SIFoldOperands
Previously SIFoldOperands::foldInstOperand would only fold a non-inlinable immediate into a single user, so as not to increase code size by adding the same 32-bit literal operand to many instructions.
This patch removes that restriction, so that a non-inlinable immediate will be folded into any number of users. The rationale is:
- It reduces the number of registers used for holding constant values, which might increase occupancy. (On the other hand, many of these registers are SGPRs which no longer affect occupancy on GFX10+.)
- It reduces ALU stalls between the instruction that loads a constant into a register, and the instruction that uses it.
- The above benefits are expected to outweigh any increase in code size.
Differential Revision: https://reviews.llvm.org/D114643
|
 | llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll |
 | llvm/test/CodeGen/AMDGPU/idot8s.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sbfe.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/roundeven.ll |
 | llvm/test/CodeGen/AMDGPU/add.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/sdiv64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll |
 | llvm/test/CodeGen/AMDGPU/shrink-add-sub-constant.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.buffer.store.format.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll |
 | llvm/test/CodeGen/AMDGPU/shift-i128.ll |
 | llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll |
 | llvm/test/CodeGen/AMDGPU/mul_uint24-amdgcn.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log10.f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/fneg-fabs.f64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-add-mul.ll |
 | llvm/test/CodeGen/AMDGPU/flat-scratch.ll |
 | llvm/test/CodeGen/AMDGPU/zero_extend.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.128.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.gather4.a16.dim.ll |
 | llvm/test/CodeGen/AMDGPU/v_pack.ll |
 | llvm/test/CodeGen/AMDGPU/ctlz.ll |
 | llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fshl.ll |
 | llvm/test/CodeGen/AMDGPU/urem64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/subo.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log.f16.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.buffer.store.format.d16.ll |
 | llvm/lib/Target/AMDGPU/SIFoldOperands.cpp |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.96.ll |
 | llvm/test/CodeGen/AMDGPU/fneg.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll |
 | llvm/test/CodeGen/AMDGPU/idot4u.ll |
 | llvm/test/CodeGen/AMDGPU/madak.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.buffer.store.format.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/addo.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll |
 | llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll |
 | llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll |
 | llvm/test/CodeGen/AMDGPU/fneg-fabs.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fmed3.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-ext-neg-mul.ll |
 | llvm/test/CodeGen/AMDGPU/shl.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.sample.g16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot4.ll |
 | llvm/test/CodeGen/AMDGPU/xor.ll |
 | llvm/test/CodeGen/AMDGPU/cttz.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/andn2.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fadd.f16.ll |
 | llvm/test/CodeGen/AMDGPU/idot2.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll |
 | llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll |
 | llvm/test/CodeGen/AMDGPU/fneg-fabs.f16.ll |
 | llvm/test/CodeGen/AMDGPU/srem64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/xnor.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll |
 | llvm/test/CodeGen/AMDGPU/fabs.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.tbuffer.store.d16.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.tbuffer.store.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/trunc.ll |
 | llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll |
 | llvm/test/CodeGen/AMDGPU/combine-reg-or-const.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i32.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll |
 | llvm/test/CodeGen/AMDGPU/setcc-opt.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/bswap.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fmul.f16.ll |
 | llvm/test/CodeGen/AMDGPU/usubsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-mul.ll |
 | llvm/test/CodeGen/AMDGPU/scratch-buffer.ll |
 | llvm/test/CodeGen/AMDGPU/fabs.f16.ll |
 | llvm/test/CodeGen/AMDGPU/uaddsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.scale.ll |
 | llvm/test/CodeGen/AMDGPU/bypass-div.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.3d.a16.ll |
 | llvm/test/CodeGen/AMDGPU/idot8u.ll |
 | llvm/test/CodeGen/AMDGPU/idiv-licm.ll |
 | llvm/test/CodeGen/AMDGPU/salu-to-valu.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i8.ll |
 | llvm/test/CodeGen/AMDGPU/udiv64.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.tbuffer.store.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-neg-mul.ll |
 | llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll |
 | llvm/test/CodeGen/AMDGPU/fexp.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/orn2.ll |
 | llvm/test/CodeGen/AMDGPU/fmed3.ll |
 | llvm/test/CodeGen/AMDGPU/immv216.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll |
 | llvm/test/CodeGen/AMDGPU/fold-immediate-operand-shrink-with-carry.mir |
 | llvm/test/CodeGen/AMDGPU/udiv.ll |
 | llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll |
 | llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/fshr.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i16.ll |
 | llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll |
 | llvm/test/CodeGen/AMDGPU/or.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fshr.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.1d.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll |
 | llvm/test/CodeGen/AMDGPU/s_addk_i32.ll |
 | llvm/test/CodeGen/AMDGPU/load-constant-i16.ll |
 | llvm/test/CodeGen/AMDGPU/mul.ll |
 | llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll |
 | llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll |
 | llvm/test/CodeGen/AMDGPU/max.ll |
 | llvm/test/CodeGen/AMDGPU/fabs.f64.ll |
 | llvm/test/CodeGen/AMDGPU/frem.ll |
 | llvm/test/CodeGen/AMDGPU/load-global-i16.ll |
 | llvm/test/CodeGen/AMDGPU/and.ll |
 | llvm/test/CodeGen/AMDGPU/packed-fp32.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.2darraymsaa.a16.ll |
 | llvm/test/CodeGen/AMDGPU/udivrem24.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i32.ll |
 | llvm/test/CodeGen/AMDGPU/sub.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.a16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot4.ll |
Commit
18c70a7bd932d1259cd28b82f946bec5dc77bfc2
by npopov[JumpThreading] Simplify getPredicateAt() based folding
It's sufficient to just fold the icmp to true/false here, and then let constant terminator folding take care of the rest.
It should be noted that while replaceFoldableUses() may not replace all uses of the icmp, at least the use in the terminator we're working on is always replaceable, so terminator constant folding should be reliably enabled as a subsequent step.
|
 | llvm/lib/Transforms/Scalar/JumpThreading.cpp |
Commit
6d36cfed3b5db3e2d73c3ff1cc669464ef502e3d
by frgossen[MLIR] Make `parseDimensionListRanked` configurable wrt parsing a trailing `x`
Differential Revision: https://reviews.llvm.org/D125797
|
 | mlir/lib/Parser/Parser.h |
 | mlir/include/mlir/IR/OpImplementation.h |
 | mlir/lib/Parser/AsmParserImpl.h |
 | mlir/lib/Parser/TypeParser.cpp |
Commit
242961f23b4abaca999611fd364e93a8d2186371
by Tim Northover[llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected to new loop
The modified function was incorrectly (not unnecessarily) ignoring grandchild loops, and this change fixes the bug. In particular, this fixes the handling of the loop { inner, body }. The TODO in the same function is talking about the b1 self loop, which may be "unnecessarily" lost, but that is a different issue.
|
 | llvm/lib/Transforms/Utils/FixIrreducible.cpp |
Commit
e1d47d86d84588d7e49dbb5172403d17c44467f7
by npopov[IR] Report whether replaceUsesOfWith() changed something (NFC)
With change reporting in transformation passes in mind.
|
 | llvm/include/llvm/IR/User.h |
 | llvm/lib/IR/User.cpp |
 | llvm/unittests/IR/UserTest.cpp |
Commit
bdf25477f6f24621ceba2fb0a8ff1e7c0e181144
by npopov[JumpThreading] Add additional freeze tests (NFC)
These are for the getPredicateAt() codepath.
|
 | llvm/test/Transforms/JumpThreading/freeze.ll |
Commit
e2926501d886ce8c1cd08db1d3a02c314c2f412d
by jay.foad[AMDGPU] Aggressively fold immediates in SIShrinkInstructions
Fold immediates regardless of how many uses they have. This is expected to increase overall code size, but decrease register usage.
Differential Revision: https://reviews.llvm.org/D114644
|
 | llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp |
 | llvm/test/CodeGen/AMDGPU/uaddsat.ll |
 | llvm/test/CodeGen/AMDGPU/frem.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i32.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/orn2.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fmed3.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll |
 | llvm/test/CodeGen/AMDGPU/bswap.ll |
 | llvm/test/CodeGen/AMDGPU/fmax_legacy.f16.ll |
 | llvm/test/CodeGen/AMDGPU/sdiv64.ll |
 | llvm/test/CodeGen/AMDGPU/fold-imm-f16-f32.mir |
 | llvm/test/CodeGen/AMDGPU/udiv.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i8.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll |
 | llvm/test/CodeGen/AMDGPU/usubsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i8.ll |
 | llvm/test/CodeGen/AMDGPU/packed-fp32.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-neg-mul.ll |
 | llvm/test/CodeGen/AMDGPU/idot4u.ll |
 | llvm/test/CodeGen/AMDGPU/lshr.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/mul.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/subo.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i16.ll |
 | llvm/test/CodeGen/AMDGPU/add.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/trunc.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/s_movk_i32.ll |
 | llvm/test/CodeGen/AMDGPU/urem64.ll |
 | llvm/test/CodeGen/AMDGPU/fexp.ll |
 | llvm/test/CodeGen/AMDGPU/shl.ll |
 | llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll |
 | llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll |
 | llvm/test/CodeGen/AMDGPU/idot8u.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot4.ll |
 | llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll |
 | llvm/test/CodeGen/AMDGPU/mul_uint24-amdgcn.ll |
 | llvm/test/CodeGen/AMDGPU/srem-seteq-illegal-types.ll |
 | llvm/test/CodeGen/AMDGPU/saddsat.ll |
 | llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/andn2.ll |
 | llvm/test/CodeGen/AMDGPU/fp_to_uint.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.load.1d.d16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll |
 | llvm/test/CodeGen/AMDGPU/fmin_legacy.f16.ll |
 | llvm/test/CodeGen/AMDGPU/fneg-combines.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i32.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fshr.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log.f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll |
 | llvm/test/CodeGen/AMDGPU/amdgpu-mul24-knownbits.ll |
 | llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll |
 | llvm/test/CodeGen/AMDGPU/udivrem.ll |
 | llvm/test/CodeGen/AMDGPU/mul.i16.ll |
 | llvm/test/CodeGen/AMDGPU/idot2.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log10.ll |
 | llvm/test/CodeGen/AMDGPU/fpow.ll |
 | llvm/test/CodeGen/AMDGPU/idot8s.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/addo.ll |
 | llvm/test/CodeGen/AMDGPU/sdiv.ll |
 | llvm/test/CodeGen/AMDGPU/madmk.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log.ll |
 | llvm/test/CodeGen/AMDGPU/sub.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/ctpop16.ll |
 | llvm/test/CodeGen/AMDGPU/fshr.ll |
 | llvm/test/CodeGen/AMDGPU/flat-scratch.ll |
 | llvm/test/CodeGen/AMDGPU/urem-seteq-illegal-types.ll |
 | llvm/test/CodeGen/AMDGPU/and.ll |
 | llvm/test/CodeGen/AMDGPU/load-global-i16.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.sin.f16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot4.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i32.ll |
 | llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll |
 | llvm/test/CodeGen/AMDGPU/operand-folding.ll |
 | llvm/test/CodeGen/AMDGPU/udiv64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/hip.extern.shared.array.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.log10.f16.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.cos.f16.ll |
 | llvm/test/CodeGen/AMDGPU/srem64.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/roundeven.ll |
 | llvm/test/CodeGen/AMDGPU/ssubsat.ll |
 | llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/sra.ll |
 | llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll |
 | llvm/test/CodeGen/AMDGPU/idot4s.ll |
 | llvm/test/CodeGen/AMDGPU/shl.v2i16.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fshl.ll |
 | llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll |
Commit
c9e7049754ac94a952de8ac6962c36f844e43b3c
by npopov[JumpThreading] Look through freeze in getPredicateAt() fold
This code is valid for any icmp, so we can safely look through a freeze when trying to find one.
A caveat here is that replaceFoldableUses() may not end up replacing any uses in this case. It might make sense to use the freeze as the context instruction (rather than the terminator) if there is a freeze, to ensure that it always gets folded. This would require some changes to how replaceFoldableUses() works though, as it currently assumes that the value is valid at the end of the block.
|
 | llvm/test/Transforms/JumpThreading/select-unfold-freeze.ll |
 | llvm/lib/Transforms/Scalar/JumpThreading.cpp |
 | llvm/test/Transforms/JumpThreading/freeze.ll |
Commit
140ad30b24fa3154808311f2ea4e52167dda378a
by ivan.kosarev[AMDGPU][MC][GFX10] Add missing s_scratch_load tests.
Completes https://reviews.llvm.org/D125117
Reviewed By: dp, arsenm
Differential Revision: https://reviews.llvm.org/D125753
|
 | llvm/test/MC/AMDGPU/gfx10_asm_smem.s |
 | llvm/test/MC/Disassembler/AMDGPU/gfx10_dasm_all.txt |