|
 | mlir/test/Dialect/Linalg/comprehensive-module-bufferize-invalid.mlir (diff) |
 | mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp (diff) |
|
 | llvm/lib/Target/Hexagon/HexagonGenInsert.cpp (diff) |
 | llvm/lib/Target/Sparc/LeonPasses.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCBranchSelector.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (diff) |
 | llvm/lib/CodeGen/BranchRelaxation.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCCTRLoops.cpp (diff) |
 | llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (diff) |
 | llvm/lib/Target/NVPTX/NVPTXReplaceImageHandles.cpp (diff) |
 | llvm/lib/Target/Sparc/SparcFrameLowering.cpp (diff) |
 | llvm/lib/Target/XCore/XCoreFrameToArgsOffsetElim.cpp (diff) |
 | llvm/lib/Target/Sparc/DelaySlotFiller.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (diff) |
 | llvm/lib/CodeGen/MachineFunction.cpp (diff) |
Commit
e5a8c8c883f1f3f91f40c883dd4f613aca0f7105
by joker.eph[mlir] Refactoring a few Parser APIs
Refactored two new parser APIs parseGenericOperationAfterOperands and parseCustomOperationName out of parseGenericOperation and parseCustomOperation.
Motivation: Sometimes an op can be printed in a special way if certain criteria are met. While parsing, we need to handle all the versions. `parseGenericOperationAfterOperands` is handy in situations where we have already parsed the operands and decide to fall back to default parsing.
`parseCustomOperationName` is useful when we need to know details (dialect, operation name, etc.) about a parsed token meant to be an MLIR operation.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D113719
|
 | mlir/test/lib/Dialect/Test/TestDialect.cpp (diff) |
 | mlir/test/lib/Dialect/Test/TestOps.td (diff) |
 | mlir/lib/Parser/Parser.cpp (diff) |
 | mlir/test/IR/pretty_printed_region_op.mlir |
 | mlir/include/mlir/IR/OpImplementation.h (diff) |
Commit
b2729fda60dbda595e7b5974279d8f860bce75ab
by nicolas.vasilache[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:
```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```
The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.
This results in roughly 20% fewer cycles as reported by llvm-mca:
After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```
After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```
An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).
Differential Revision: https://reviews.llvm.org/D114393
|
 | mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp (diff) |
 | mlir/test/Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir |
 | mlir/test/Dialect/Vector/vector-transpose-lowering.mlir (diff) |
 | utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (diff) |
 | mlir/include/mlir/Dialect/X86Vector/Transforms.h (diff) |
 | mlir/test/lib/Dialect/Vector/CMakeLists.txt (diff) |
 | mlir/lib/Dialect/X86Vector/Transforms/AVXTranspose.cpp (diff) |
Commit
06d0d449d8555ae5f1ac33e8d4bb4ae40eb080d3
by martin[COFF] [ARM64] Create symbols with regular intervals for relocations against temporary symbols
For relocations against temporary symbols (that don't persist in the object file), we normally adjust them to reference the start of the section.
For adrp relocations, the immediate offset from the referenced symbol is stored in the opcode as the 21 bit signed immediate; this means that the relocation target must be within +/- 1 MB of the referenced symbol.
Create label symbols at regular intervals (1 MB intervals). For relocations against temporary symbols, pick the closest preceding offset symbol and make the relocation against that instead of against the start of the section.
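The selection of an offset symbol described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from WinCOFFObjectWriter.cpp: the names `kOffsetSymbolInterval`, `RelocTarget` and `pickOffsetSymbol` are hypothetical, and the sketch assumes one label symbol at every multiple of 1 MB within the section, including offset 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: spacing between the synthetic label symbols (1 MB).
constexpr uint64_t kOffsetSymbolInterval = UINT64_C(1) << 20;

// Hypothetical result type: which label symbol to relocate against and
// how far past that symbol the real target lies.
struct RelocTarget {
  uint64_t symbolOffset; // section offset of the chosen label symbol
  uint64_t addend;       // distance from that symbol to the target
};

// Picks the closest label symbol at or before `targetOffset`, so that the
// remaining addend always stays below 1 MB and fits the signed 21 bit
// immediate.
RelocTarget pickOffsetSymbol(uint64_t targetOffset) {
  uint64_t index = targetOffset / kOffsetSymbolInterval;
  RelocTarget result = {index * kOffsetSymbolInterval,
                        targetOffset % kOffsetSymbolInterval};
  assert(result.addend < kOffsetSymbolInterval);
  return result;
}
```

For a target at section offset 0x250010, this picks the label at 0x200000 and leaves an addend of 0x50010, whereas relocating against the section start would need an addend of 0x250010, which is out of range.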
This should fix the root issue behind https://bugs.llvm.org/show_bug.cgi?id=52378.
Differential Revision: https://reviews.llvm.org/D114340
|
 | llvm/lib/MC/WinCOFFObjectWriter.cpp (diff) |
 | llvm/test/MC/AArch64/coff-relocations-offset.s |
Commit
7c15da67614eca9272553ecfe8c1a0f6f68c134b
by martin[LLD] [COFF] Interpret the immediate in ARM64 adr/adrp relocations as signed 21 bit
This matches how MS link.exe interprets this relocation.
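As a rough illustration of what interpreting the immediate as signed 21 bit means, the following sketch sign-extends a raw 21 bit field. The function name `signExtend21` is hypothetical and this is not the code from lld/COFF/Chunks.cpp; it only shows the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Interprets the low 21 bits of `imm` as a two's complement value.
// Bit 20 is the sign bit, so the result lies in [-1048576, 1048575],
// i.e. +/- 1 MB.
int64_t signExtend21(uint32_t imm) {
  assert(imm < (UINT32_C(1) << 21) && "immediate must fit in 21 bits");
  const int64_t signBit = INT64_C(1) << 20;
  // Flipping the sign bit and then subtracting it maps values with bit 20
  // set to the negative range and leaves the others unchanged.
  return (static_cast<int64_t>(imm) ^ signBit) - signBit;
}
```

Under an unsigned interpretation the raw field 0x1FFFFF would mean an offset of 2097151; under the signed interpretation it means -1.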
Differential Revision: https://reviews.llvm.org/D114347
|
 | lld/COFF/Chunks.cpp (diff) |
 | lld/test/COFF/arm64-relocs-imports.test (diff) |
Commit
4e5488afb27a64d12a76b770cc86bab8074e9c57
by martin[AArch64] [COFF] Move jump tables back to the readonly section
This essentially reverts f5884d255e78305d41c28c6e001a460ff83981d8 (D57277).
That commit was made as a workaround since LLVM back then didn't support cross-section relative relocations (IMAGE_REL_ARM64_REL32) in COFF for ARM64. Support for this was implemented later, in d5c5cf5ce8d921fc8c5e1b608c298a1ffa688d37 (D99572) and 382c505d9cfca8adaec47aea2da7bbcbc00fc05c (D102217).
The commit that moved jump tables to the function section noted that it would be ideal to utilize IMAGE_REL_ARM64_REL32.
Differential Revision: https://reviews.llvm.org/D113576
|
 | llvm/test/CodeGen/AArch64/win64-jumptable.ll (diff) |
 | llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (diff) |
Commit
d703b922961e0d02a5effdd4bfbb23ad50a3cc9f
by martin[LLD] [COFF] Omit section symbols and IMAGE_SYM_CLASS_LABEL from the PE symbol table
The section symbols aren't of much practical use when looking at a linked image. This shrinks one observed mingw style unstripped binary by 14%.
IMAGE_SYM_CLASS_LABEL is in spirit the same as a temporary assembler label that isn't emitted on the object file level at all.
Differential Revision: https://reviews.llvm.org/D113866
|
 | lld/test/COFF/strtab-size.s (diff) |
 | lld/COFF/Writer.cpp (diff) |
 | lld/test/COFF/symtab.test (diff) |
|
 | llvm/test/CodeGen/Thumb2/mve-masked-store-mmo.ll |
|
 | llvm/test/CodeGen/X86/vmaskmov-offset.ll (diff) |
Commit
59f4b3d3081535b61609f12ea5f638905616fcbc
by qiucofan[PowerPC] Implement more fusion types for Power10
This implements the rest of the Power10 instruction fusion pairs, according to the user manual, including 'wide immediate', 'load compare', 'zero move' and 'SHA3 assist'.
Only 'SHA3 assist' is enabled by default.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D112912
|
 | llvm/lib/Target/PowerPC/PPCMacroFusion.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCMacroFusion.def (diff) |
 | llvm/lib/Target/PowerPC/PPCSubtarget.h (diff) |
 | llvm/test/CodeGen/PowerPC/macro-fusion.mir (diff) |
 | llvm/lib/Target/PowerPC/PPC.td (diff) |
 | llvm/lib/Target/PowerPC/PPCSubtarget.cpp (diff) |
Commit
32b6c17b29079e7d2ac61cdc90b10983ee97d78d
by david.green[SDAG] Use UnknownSize for masked load/store MMO size
A masked load or store accesses a potentially unknown number of bytes at a memory location; that number is not generally known at compile time. These operations do not necessarily load/store the entire vector width, and treating them as such can lead to incorrect aliasing information (for example, if the underlying object is smaller than the size of the vector).
This makes sure that the MMO is given an unknown size to represent this, which is less accurate than "may load/store from up to 16 bytes", but less incorrect than "will load/store from 16 bytes".
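The aliasing hazard can be illustrated with a toy model. This is a simplified sketch and not LLVM's alias analysis: `kUnknownSize` and `mayAccessObject` are hypothetical names, and the only rule modelled is the assumption that an access of a known size cannot touch an object smaller than that size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel standing in for "size not known at compile time".
constexpr uint64_t kUnknownSize = ~UINT64_C(0);

// Toy rule: an access with a known size larger than the object is assumed
// not to touch that object; an access of unknown size may touch it.
bool mayAccessObject(uint64_t accessSize, uint64_t objectSize) {
  if (accessSize == kUnknownSize)
    return true; // conservative: cannot rule the object out
  return accessSize <= objectSize;
}
```

In this model a masked 16-byte vector store into an 8-byte object with only the low lanes enabled is wrongly reported as not touching the object when the access size is recorded as 16, while recording it as `kUnknownSize` gives the conservative and correct answer.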
Differential Revision: https://reviews.llvm.org/D113888
|
 | llvm/include/llvm/CodeGen/MachineFunction.h (diff) |
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (diff) |
 | llvm/test/CodeGen/Thumb2/mve-masked-store-mmo.ll (diff) |
 | llvm/test/CodeGen/X86/masked_compressstore.ll (diff) |
 | llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp (diff) |
 | llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (diff) |
 | llvm/test/CodeGen/X86/vmaskmov-offset.ll (diff) |
|
 | mlir/include/mlir/Dialect/Bufferization/IR/AllocationOpInterface.td |
 | mlir/include/mlir/Dialect/Bufferization/IR/AllocationOpInterface.h |
 | mlir/lib/Dialect/Bufferization/IR/AllocationOpInterface.cpp |
 | mlir/lib/Dialect/CMakeLists.txt (diff) |
 | mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (diff) |
 | mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (diff) |
 | mlir/include/mlir/Interfaces/SideEffectInterfaces.td (diff) |
 | mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt |
 | mlir/include/mlir/Dialect/Bufferization/CMakeLists.txt |
 | mlir/include/mlir/Dialect/CMakeLists.txt (diff) |
 | mlir/lib/Transforms/BufferDeallocation.cpp (diff) |
 | utils/bazel/llvm-project-overlay/mlir/BUILD.bazel (diff) |
 | mlir/lib/Dialect/Bufferization/IR/CMakeLists.txt |
 | mlir/lib/Transforms/CMakeLists.txt (diff) |
 | mlir/lib/Dialect/Bufferization/CMakeLists.txt |
Commit
a5fff58781f30ff3fd7a3f56948552cf7b8842bb
by flo[ThreadPool] Do not return shared futures.
The only user of the futures returned from ThreadPool is llvm-reduce after D113857.
There should be no cases where multiple threads wait on the same future, so there should be no need to return std::shared_future<>. Instead return plain std::future<>.
If users need to share a future between multiple threads, they can share the futures themselves.
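One way for a caller to share a result when the API hands back a plain `std::future<>` is to convert it with `std::future::share()`. The sketch below uses only the standard library; `submitTask` is a hypothetical stand-in for a pool's submission function, not the ThreadPool API, and it uses a deferred task so that the example does not depend on any worker threads.

```cpp
#include <cassert>
#include <future>

// Hypothetical stand-in for an API that returns a plain std::future.
// The task is deferred: it runs when the result is first requested.
std::future<int> submitTask(int x) {
  return std::async(std::launch::deferred, [x] { return x * 2; });
}

// A caller that needs several consumers converts the future explicitly.
// share() moves the shared state into a copyable std::shared_future.
int sumFromTwoReaders(std::future<int> result) {
  std::shared_future<int> shared = result.share();
  assert(!result.valid()); // the original future no longer owns the state
  std::shared_future<int> readerA = shared; // copies refer to the same state
  std::shared_future<int> readerB = shared;
  return readerA.get() + readerB.get(); // get() may be called on each copy
}
```

Calling `sumFromTwoReaders(submitTask(21))` evaluates the task once and returns 84. In a threaded setting, each thread would hold its own copy of the `std::shared_future`.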
Reviewed By: Meinersbur, mehdi_amini
Differential Revision: https://reviews.llvm.org/D114363
|
 | mlir/include/mlir/IR/Threading.h (diff) |
 | llvm/include/llvm/Support/ThreadPool.h (diff) |
Commit
47e2644c89b3be6faa0f5cc4c70ef96ec295da9a
by ybrevnov[DSE][NFC] Introduce "doesn't overwrite" return code for isOverwrite
Add an OR_None code to indicate that there is no overwrite. This has no effect on current uses but will be used in one of the next patches building support for PHI translation.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D105098
|
 | llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (diff) |