Commit
a05b694b1e1d742d7702c1774abfaf98f502f04b
by ikudrin
[ELF][NFC] Do not pass region name to expandMemoryRegion()
The name can easily be obtained on-site.
Differential Revision: https://reviews.llvm.org/D114228
|
 | lld/ELF/LinkerScript.cpp |
Commit
6a3958247aeeacdbf40833151220b089f066c82f
by dvyukov
tsan: add another fork test
Add a fork test that models what happens on Mac where fork calls malloc/free inside of our atfork callbacks.
Reviewed By: vitalybuka, yln
Differential Revision: https://reviews.llvm.org/D114250
|
 | compiler-rt/lib/tsan/rtl/tsan_rtl.cpp |
 | compiler-rt/test/tsan/Linux/fork_deadlock.cpp |
Commit
2ac339ef5f0feca2abe2b8a1720839c58184166c
by yedeng.yd
[C++20] [Coroutines] Warn for deprecated form 'for co_await'
The form 'for co_await' is part of the Coroutines TS rather than C++20, so if we detect the use of 'for co_await' in C++20, we should at least emit a warning.
|
 | clang/include/clang/Basic/DiagnosticParseKinds.td |
 | clang/include/clang/Basic/DiagnosticGroups.td |
 | clang/lib/Parse/ParseStmt.cpp |
 | clang/test/SemaCXX/co_await-range-for.cpp |
Commit
83484f8472ad7f8ab91b4e944a6f092e8f4d16a8
by mail
Fix nits in clang-tidy's documentation (NFC)
Add commas, articles, and conjunctions where missing.
|
 | clang-tools-extra/docs/clang-tidy/index.rst |
Commit
760d4d03d5d3fc0e0d6e4222f670e5fd068645f2
by david.green
[AArch64] Sink splat shuffles to lane index intrinsics
This teaches AArch64TargetLowering::shouldSinkOperands to sink splat shuffles to certain NEON intrinsics, so that they can make use of the available lane variants of the instructions.
Differential Revision: https://reviews.llvm.org/D112994
|
 | llvm/test/Transforms/CodeGenPrepare/AArch64/sink-free-instructions.ll |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/test/CodeGen/AArch64/sinksplat.ll |
Commit
b5f20372a82f72f03d47181b87fb55f62772324f
by kbobyrev
[clangd] IncludeCleaner: Mark possible expr resolutions as used
Fixes: https://github.com/clangd/clangd/issues/934
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D114287
|
 | clang-tools-extra/clangd/unittests/IncludeCleanerTests.cpp |
 | clang-tools-extra/clangd/IncludeCleaner.cpp |
Commit
a82942dd07ea652081f8f293b73801323a4dbbe9
by mail
Add missing clang-tidy args in index.rst (NFC)
The RST docs have gone out of sync with the command-line args that the clang-tidy program actually supports.
|
 | clang-tools-extra/docs/clang-tidy/index.rst |
Commit
84bf5e328664db2e744c4651c52d2460b1733d09
by klimek
Fix various problems found by fuzzing.
1. IndexTokenSource::getNextToken cannot return nullptr, but some code was still written assuming it can; make getNextToken more resilient against incorrect input and fix its call sites.
2. Change various asserts that can be triggered by user-provided input into conditionals in the code.
|
 | clang/lib/Format/UnwrappedLineParser.cpp |
 | clang/lib/Format/TokenAnnotator.cpp |
 | clang/lib/Format/WhitespaceManager.cpp |
 | clang/lib/Format/ContinuationIndenter.cpp |
Commit
2f1c037bbdc4a949e83466d6b315002d71c67731
by gchatelet
[libc] Remove unused variable
|
 | libc/src/__support/str_to_float.h |
Commit
a7027bb7997184fd1e6d2ba370ebd4f109a6e737
by diegocaballero
[LV] Pre-commit test for D111846
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D112054
|
 | llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll |
Commit
d92aabc33666e83612c93e7c9c5c454510ba9b07
by arjunpitchanathan
[MLIR][NFC] Simplex: remove repeated words in comment
|
 | mlir/include/mlir/Analysis/Presburger/Simplex.h |
Commit
4d21b64464ac548ec8442bc0d2a7e984ba78bd88
by sjoerd.meijer
[BPI] Look-up tables for non-loop branches. NFC.
This adds and uses look-up tables for non-loop branch probabilities, which have the probabilities for the different condition codes encoded directly into the tables. Compared to having this logic inlined in different functions, as used to be the case, I think this is more compact and thus also easier to check and cross-reference. This also adds a test for pointer heuristics that was missing.
Differential Revision: https://reviews.llvm.org/D114009
|
 | llvm/test/Analysis/BranchProbabilityInfo/pointer_heuristics.ll |
 | llvm/lib/Analysis/BranchProbabilityInfo.cpp |
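The table-driven approach described in the BPI commit above can be sketched roughly as follows. This is only an illustration, not the code from BranchProbabilityInfo.cpp: the `Pred` enum, the `Prob` struct, `kPointerTable`, `lookupTakenProb`, and the weights are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a comparison predicate (not LLVM's CmpInst type).
enum class Pred { EQ, NE, SLT };

// Probability expressed as numerator / denominator.
struct Prob {
  unsigned num;
  unsigned den;
};

// One table per heuristic: predicate -> probability that the branch is taken.
// The weights below are illustrative only.
const std::map<Pred, Prob> kPointerTable = {
    {Pred::EQ, {12, 32}},  // "p == q" assumed unlikely
    {Pred::NE, {20, 32}},  // "p != q" assumed likely
};

// Returns true and fills `out` when the table has an entry for `pred`;
// returns false when the heuristic does not apply to this predicate.
bool lookupTakenProb(const std::map<Pred, Prob>& table, Pred pred, Prob& out) {
  auto it = table.find(pred);
  if (it == table.end()) return false;
  assert(it->second.den != 0 && "denominator must be non-zero");
  out = it->second;
  return true;
}
```

With this shape, adding or auditing a heuristic means reading one table rather than tracing branches through several functions.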
Commit
a9e236bed835c58be381dadb973a1db0681e4795
by nicolas.vasilache
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:
```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```
The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.
This results in roughly 20% fewer cycles as reported by llvm-mca:
After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```
After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```
An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).
Reviewed By: ftynse, dcaballe
Differential Revision: https://reviews.llvm.org/D114335
|
 | mlir/lib/Dialect/X86Vector/Transforms/AVXTranspose.cpp |
 | mlir/test/lib/Dialect/Vector/CMakeLists.txt |
 | mlir/include/mlir/Dialect/X86Vector/Transforms.h |
 | mlir/test/Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir |
 | utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel |
 | mlir/test/Dialect/Vector/vector-transpose-lowering.mlir |
 | mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp |
Commit
0ccc44cec067abbc702d5d3afb44e0395c55820d
by gysit
[mlir][linalg] Fix tile and fuse for outermost reduction.
Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114012
|
 | mlir/test/Dialect/Linalg/tile-and-fuse-on-tensors.mlir |
 | mlir/include/mlir/Dialect/Linalg/Utils/Utils.h |
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |
Commit
789c88e80e878ed866a2d8cfe29c7fd36082274c
by nicolas.vasilache
[mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
|
 | mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp |
 | mlir/include/mlir/IR/BuiltinTypes.h |
 | mlir/lib/Dialect/Vector/VectorTransforms.cpp |
Commit
4348cd42c385e71b63e5da7e492172cff6a79d7b
by diegocaballero
[LV] Drop integer poison-generating flags from instructions that need predication
This patch fixes PR52111. The problem is that LV propagates poison-generating flags (`nuw`/`nsw`, `exact` and `inbounds`) in instructions that contribute to the address computation of widened loads/stores that are guarded by a condition. When the code is vectorized and the control flow within the loop is linearized, these flags may generate a poison value that is effectively used as the base address of the widened load/store. The fix drops all the integer poison-generating flags from instructions that contribute to the address computation of a widened load/store whose original instruction was in a basic block that needed predication and is not predicated after vectorization.
Reviewed By: fhahn, spatel, nlopes
Differential Revision: https://reviews.llvm.org/D111846
|
 | llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll |
 | llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll |
 | llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll |
 | llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll |
 | llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll |
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
 | llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll |
 | llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll |
 | llvm/test/Transforms/LoopVectorize/AArch64/sve-masked-loadstore.ll |
 | llvm/test/Transforms/LoopVectorize/X86/load-deref-pred.ll |
 | llvm/lib/Transforms/Vectorize/VPlan.h |
 | llvm/test/Transforms/LoopVectorize/X86/x86-pr39099.ll |
 | llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll |
 | llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll |
 | llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll |
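The kind of loop affected by the LV fix above can be sketched in C++ as follows. This is a scalar model under stated assumptions, not the vectorizer's output: `shiftScalar` and `shiftLinearized` are hypothetical names, and unsigned wrap-around in C++ stands in for the case where an IR `sub nuw` would yield poison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar loop: "i - 1" is only evaluated when i != 0, so the subtraction
// never wraps and a no-unsigned-wrap flag on it would be valid.
std::vector<float> shiftScalar(const std::vector<float>& in) {
  std::vector<float> out(in.size(), 0.0f);
  for (std::size_t i = 0; i < in.size(); ++i)
    if (i != 0) out[i] = in[i - 1];
  return out;
}

// Linearized model: the index is computed for every lane, and only the
// load itself is masked. `wrappedLanes` counts the lanes whose index
// computation wrapped around.
std::vector<float> shiftLinearized(const std::vector<float>& in,
                                   std::size_t* wrappedLanes) {
  assert(wrappedLanes != nullptr);
  std::vector<float> out(in.size(), 0.0f);
  for (std::size_t i = 0; i < in.size(); ++i) {
    const bool mask = (i != 0);
    const std::size_t idx = i - 1;  // wraps for i == 0 (well-defined for size_t)
    if (idx > i) ++*wrappedLanes;   // idx > i only happens when the subtraction wrapped
    out[i] = mask ? in[idx] : 0.0f; // masked-off lanes never touch memory
  }
  return out;
}
```

Both functions return the same result, but the linearized one evaluates the index for the masked-off lane too. In IR, keeping `nuw` on that subtraction would make the index poison for that lane, which is why the flags have to be dropped once the computation is no longer guarded.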
Commit
e3d386ea27336edc04ae4fd324ab4337b9f3cf16
by gysit
[mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.
Depends On D114012
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114067
|
 | mlir/test/lib/Dialect/Linalg/TestLinalgCodegenStrategy.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/LinalgStrategyPasses.cpp |
 | mlir/include/mlir/Dialect/Linalg/Transforms/CodegenStrategy.h |
 | mlir/include/mlir/Dialect/Linalg/Utils/Utils.h |
 | mlir/test/Dialect/Linalg/tile-and-fuse-sequence-on-tensors.mlir |
 | mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h |
 | mlir/test/Dialect/Linalg/tile-and-fuse-on-tensors.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |
 | mlir/include/mlir/Dialect/Linalg/Passes.td |
 | mlir/include/mlir/Dialect/Linalg/Passes.h |
Commit
050cc1cd6e6882eadba6e5ea7b588ca0b8aa1b12
by nicolas.vasilache
[mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine.
This is required to allow Python to work with lowerings that use inline_asm.
Differential Revision: https://reviews.llvm.org/D114338
|
 | utils/bazel/llvm-project-overlay/mlir/BUILD.bazel |
 | mlir/lib/CAPI/ExecutionEngine/ExecutionEngine.cpp |
 | mlir/lib/ExecutionEngine/CMakeLists.txt |
Commit
8d09dd61c381b9c037da0c172b7b4592d9503d2c
by lebedev.ri
[X86][TTI] Costmodel for AVX512DQ's VPMOVM2[DQ] / VPMOV[DQ]2M instructions
Much like the VPMOVM2[BW] / VPMOV[BW]2M from AVX512BW, these either sign-extend the mask register into a vector, or pack the mask from a vector register.
Apparently, we didn't even have MCA tests for these, added in rG2f364f6f0d3a2420ca78cbd80abb186657180e05, so I'm just guessing that their perf characteristics are optimal.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114314
|
 | llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll |
 | llvm/test/Analysis/CostModel/X86/trunc.ll |
 | llvm/test/Analysis/CostModel/X86/extend.ll |
 | llvm/lib/Target/X86/X86TargetTransformInfo.cpp |
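As a rough scalar model of the two directions described in the AVX512DQ cost-model commit above (mask to vector, and vector to mask), assuming eight 32-bit lanes: `maskToVector` and `vectorToMask` are hypothetical helper names, not intrinsics.

```cpp
#include <array>
#include <cstdint>

// Mask -> vector: each 32-bit lane becomes all-ones when its mask bit is
// set and zero otherwise, i.e. the 1-bit value is sign-extended.
std::array<std::int32_t, 8> maskToVector(std::uint8_t mask) {
  std::array<std::int32_t, 8> lanes{};
  for (int i = 0; i < 8; ++i)
    lanes[i] = ((mask >> i) & 1) ? -1 : 0;
  return lanes;
}

// Vector -> mask: bit i of the result is the sign bit of lane i.
std::uint8_t vectorToMask(const std::array<std::int32_t, 8>& lanes) {
  unsigned mask = 0;
  for (int i = 0; i < 8; ++i)
    if (lanes[i] < 0) mask |= 1u << i;
  return static_cast<std::uint8_t>(mask);
}
```

A mask survives a round trip through `maskToVector` and `vectorToMask`, whereas going from a vector to a mask keeps only one bit per lane.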
Commit
704d92607d26e696daba596b72cb70effe79a872
by lebedev.ri
[X86][TTI] Finish costmodel for AVX512BW's VPMOVM2[BW] / VPMOV[BW]2M instructions
Apparently my methodology was suboptimal: not only did I miss all the +VL tuples, I also missed some plain tuples. I believe this adds everything missing. Indeed, these manual costmodels are just not okay long-term.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114334
|
 | llvm/test/Analysis/CostModel/X86/shuffle-replication-i1.ll |
 | llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll |
 | llvm/lib/Target/X86/X86TargetTransformInfo.cpp |
 | llvm/test/Analysis/CostModel/X86/trunc.ll |
 | llvm/test/Analysis/CostModel/X86/extend.ll |
Commit
56db1c072c92be36fb1d76aa30487ad62dc58ea8
by simon.moll
[DA][NFC] Update publication - add remarks
Update the reference publication for the SyncDependenceAnalysis and Divergence Analysis. Fix phrasing, formatting. Add comments on reducible loop limitation.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D114146
|
 | llvm/lib/Analysis/SyncDependenceAnalysis.cpp |
 | llvm/lib/Analysis/DivergenceAnalysis.cpp |
Commit
955c72c35caf68fe4e2f026da67c6fdcd31d01ad
by bradley.smith
[AArch64][ARM] Add missing SVE/SVE2 features from Cortex-A710
Differential Revision: https://reviews.llvm.org/D114169
|
 | llvm/include/llvm/Support/AArch64TargetParser.def |
 | llvm/unittests/Support/TargetParserTest.cpp |
Commit
f7751a3a4218229c59adced4964831f7a57d256d
by gysit
[mlir][linalg] Remove tile and fuse test pass (NFC).
Remove the tile and fuse test pass that has been replaced by codegen strategy.
Depends On D114067
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114068
|
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |
 | mlir/include/mlir/Dialect/Linalg/Passes.td |
 | mlir/include/mlir/Dialect/Linalg/Passes.h |
Commit
106f3074996c69ab732c6371d5ad6b25fcfd4fa5
by tpopp
Rename MlirExecutionEngine lookup to lookupPacked
The purpose of the change is to make clear whether the user is retrieving the original function or the wrapper function, in line with the invoke commands. This new functionality is useful for users that have already defined their own packed interface, so they do not want the extra layer of indirection, or for users wanting to look at the resulting primary function rather than the wrapper function.
All locations except the Python bindings now have a `lookupPacked` method that matches the original `lookup` functionality. `lookup` still exists, but with new semantics.
- `lookup` returns the function with a given name. If `bool f(int,int)` is compiled, `lookup` will return a reference to `bool(*f)(int,int)`.
- `lookupPacked` returns the packed wrapper of the function with the given name. If `bool f(int,int)` is compiled, `lookupPacked` will return `void(*mlir_f)(void**)`.
Differential Revision: https://reviews.llvm.org/D114352
|
 | mlir/lib/CAPI/ExecutionEngine/ExecutionEngine.cpp |
 | mlir/lib/Bindings/Python/ExecutionEngineModule.cpp |
 | mlir/lib/ExecutionEngine/ExecutionEngine.cpp |
 | mlir/lib/ExecutionEngine/JitRunner.cpp |
 | mlir/include/mlir/ExecutionEngine/ExecutionEngine.h |
 | mlir/include/mlir-c/ExecutionEngine.h |
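The difference between the two function shapes named in the lookup/lookupPacked commit above can be sketched in plain C++. This is a minimal illustration, not the ExecutionEngine API: the argument layout (one pointer per argument, followed by a pointer to the result slot) is an assumption made for the example, and `callPacked` is a hypothetical helper.

```cpp
#include <cassert>

// The "original" function, as a caller would get it with a direct signature.
bool f(int a, int b) { return a < b; }

// A packed wrapper with the `void(*)(void**)` shape. Assumed layout for this
// sketch: args[0] and args[1] point to the arguments, args[2] to the result.
void mlir_f(void** args) {
  const int a = *static_cast<int*>(args[0]);
  const int b = *static_cast<int*>(args[1]);
  *static_cast<bool*>(args[2]) = f(a, b);
}

// Hypothetical helper that packs two ints and a result slot, then calls
// through the wrapper.
bool callPacked(void (*packed)(void**), int a, int b) {
  assert(packed != nullptr);
  bool result = false;
  void* args[] = {&a, &b, &result};
  packed(args);
  return result;
}
```

Calling `f(1, 2)` directly and `callPacked(mlir_f, 1, 2)` give the same answer; the packed form just pays for one extra level of indirection in exchange for a uniform signature.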
Commit
32c43241e716280d3443d684416826b1e7e5781b
by gysit
[mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors.
Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies follow-up transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit.
Depends On D114067
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114085
|
 | mlir/test/Dialect/Linalg/tile-and-fuse-on-tensors.mlir |
 | mlir/test/Dialect/Linalg/fusion-tensor-pattern.mlir |
 | mlir/lib/Dialect/Linalg/Utils/Utils.cpp |
Commit
247a1a55eb6a58199006565d594c6f6c6b58b736
by gysit
[mlir][linalg] Use getAsOpFoldResult in padding (NFC).
After padding, we introduce an ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to clean up the types after padding.
Depends On D114085
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114153
|
 | mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp |