Changes

Summary

  1. clang-ve-ninja: Add hpce-ve-main, hpce-ve-staging workers (details)
Commit 06987827915a936cb61dda7fc352c336f7f78ac6 by simon.moll
clang-ve-ninja: Add hpce-ve-main, hpce-ve-staging workers
The file was modified buildbot/osuosl/master/config/builders.py (diff)
The file was modified buildbot/osuosl/master/config/workers.py (diff)

Summary

  1. [mlir][linalg][bufferize] Limited support for scf.execute_region (details)
  2. [llvm] Use range-based for loops (NFC) (details)
  3. [mlir] Refactoring a few Parser APIs (details)
  4. [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm) (details)
  5. [COFF] [ARM64] Create symbols with regular intervals for relocations against temporary symbols (details)
  6. [LLD] [COFF] Interpret the immediate in ARM64 adr/adrp relocations as signed 21 bit (details)
  7. [AArch64] [COFF] Move jump tables back to the readonly section (details)
  8. [LLD] [COFF] Omit section symbols and IMAGE_SYM_CLASS_LABEL from the PE symbol table (details)
  9. [ARM] Add an test for showing the incorrect aliasing info around masked loads/stores. NFC (details)
  10. [X86] Regenerate X86/vmaskmov-offset.ll check lines as per new mir format. NFC (details)
  11. [PowerPC] Implement more fusion types for Power10 (details)
  12. [SDAG] Use UnknownSize for masked load/store MMO size (details)
  13. Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td."" (details)
  14. [ThreadPool] Do not return shared futures. (details)
  15. [DSE][NFC] Introduce "doesn't overwrite" return code for isOverwrite (details)
Commit fb99686bfd82061d07877228e4737f98fa4e83d4 by springerm
[mlir][linalg][bufferize] Limited support for scf.execute_region

Add support for analysis only.

Differential Revision: https://reviews.llvm.org/D114055
The file was modified mlir/test/Dialect/Linalg/comprehensive-module-bufferize-invalid.mlir (diff)
The file was modified mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp (diff)
Commit d5b73a70a0611fc6c082e20acb6ce056980c8323 by kazu
[llvm] Use range-based for loops (NFC)
The file was modified llvm/lib/Target/Hexagon/HexagonGenInsert.cpp (diff)
The file was modified llvm/lib/Target/Sparc/LeonPasses.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCBranchSelector.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (diff)
The file was modified llvm/lib/CodeGen/BranchRelaxation.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCCTRLoops.cpp (diff)
The file was modified llvm/lib/Target/Hexagon/HexagonHardwareLoops.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (diff)
The file was modified llvm/lib/Target/NVPTX/NVPTXReplaceImageHandles.cpp (diff)
The file was modified llvm/lib/Target/Sparc/SparcFrameLowering.cpp (diff)
The file was modified llvm/lib/Target/XCore/XCoreFrameToArgsOffsetElim.cpp (diff)
The file was modified llvm/lib/Target/Sparc/DelaySlotFiller.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (diff)
The file was modified llvm/lib/CodeGen/MachineFunction.cpp (diff)
Commit e5a8c8c883f1f3f91f40c883dd4f613aca0f7105 by joker.eph
[mlir] Refactoring a few Parser APIs

Refactored two new parser APIs parseGenericOperationAfterOperands and
parseCustomOperationName out of parseGenericOperation and parseCustomOperation.

Motivation: Sometimes an op can be printed in a special way if certain criteria
are met. While parsing, we need to handle all the versions.
`parseGenericOperationAfterOperands` is handy in situations where we have already
parsed the operands and decide to fall back to default parsing.

`parseCustomOperationName` is useful when we need to know details (dialect,
operation name etc.) about a parsed token meant to be an mlir operation.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113719
The file was modified mlir/test/lib/Dialect/Test/TestDialect.cpp (diff)
The file was modified mlir/test/lib/Dialect/Test/TestOps.td (diff)
The file was modified mlir/lib/Parser/Parser.cpp (diff)
The file was added mlir/test/IR/pretty_printed_region_op.mlir
The file was modified mlir/include/mlir/IR/OpImplementation.h (diff)
Commit b2729fda60dbda595e7b5974279d8f860bce75ab by nicolas.vasilache
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```
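
As a quick sanity check on the figures in the two llvm-mca reports above, the following sketch recomputes the relative cycle reduction and the IPC from the reported totals. The helper and constant names are hypothetical and only serve this illustration; the numbers are copied from the reports. With these totals the reduction from the intrinsic version to the inline_asm version comes out at about 16.6%.

```cpp
#include <cstdint>

// Totals copied from the two llvm-mca reports (100 iterations each).
constexpr std::uint64_t kIntrinsicCycles = 2415;
constexpr std::uint64_t kIntrinsicInstructions = 5900;
constexpr std::uint64_t kInlineAsmCycles = 2015;
constexpr std::uint64_t kInlineAsmInstructions = 6300;

// Relative reduction in cycles, in percent, going from `before` to `after`.
// Assumes before >= after and before != 0.
constexpr double cycleReductionPercent(std::uint64_t before,
                                       std::uint64_t after) {
  return 100.0 * static_cast<double>(before - after) /
         static_cast<double>(before);
}

// Instructions per cycle, as llvm-mca reports it in the "IPC" row.
constexpr double ipc(std::uint64_t instructions, std::uint64_t cycles) {
  return static_cast<double>(instructions) / static_cast<double>(cycles);
}
```

Calling `cycleReductionPercent(kIntrinsicCycles, kInlineAsmCycles)` gives the reduction, and `ipc(...)` reproduces the 2.44 and 3.13 values shown in the reports after rounding to two decimals.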

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Differential Revision: https://reviews.llvm.org/D114393
The file was modified mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp (diff)
The file was added mlir/test/Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir
The file was modified mlir/test/Dialect/Vector/vector-transpose-lowering.mlir (diff)
The file was modified utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (diff)
The file was modified mlir/include/mlir/Dialect/X86Vector/Transforms.h (diff)
The file was modified mlir/test/lib/Dialect/Vector/CMakeLists.txt (diff)
The file was modified mlir/lib/Dialect/X86Vector/Transforms/AVXTranspose.cpp (diff)
Commit 06d0d449d8555ae5f1ac33e8d4bb4ae40eb080d3 by martin
[COFF] [ARM64] Create symbols with regular intervals for relocations against temporary symbols

For relocations against temporary symbols (that don't persist in
the object file), we normally adjust them to reference the start of
the section.

For adrp relocations, the immediate offset from the referenced
symbol is stored in the opcode as a 21-bit signed immediate; this
means that the target address must be within +/- 1 MB of the
referenced symbol.

Create label symbols at regular intervals (1 MB intervals). For
relocations against temporary symbols, pick the preceding added
offset symbol and make the relocation against that instead of
against the start of the section.
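
The selection of the preceding offset symbol can be sketched as follows. This is only an illustration of the arithmetic, not the code in WinCOFFObjectWriter.cpp; the names `kOffsetLabelInterval`, `SplitOffset` and `splitAgainstPrecedingLabel` are hypothetical, and the sketch assumes the labels sit at exact multiples of 1 MB from the start of the section.

```cpp
#include <cstdint>

// Assumed spacing between the added label symbols: 1 MB, as described above.
constexpr std::uint64_t kOffsetLabelInterval = std::uint64_t{1} << 20;

struct SplitOffset {
  std::uint64_t labelOffset; // section offset of the preceding label symbol
  std::uint64_t addend;      // remaining distance from that label to the target
};

// Split a section-relative offset into "preceding label" + "small addend".
// The addend is always below 1 MB, so it fits a signed 21-bit immediate.
constexpr SplitOffset splitAgainstPrecedingLabel(std::uint64_t sectionOffset) {
  const std::uint64_t label =
      sectionOffset / kOffsetLabelInterval * kOffsetLabelInterval;
  return {label, sectionOffset - label};
}
```

For example, a temporary symbol at section offset 0x250000 would be expressed relative to the label at 0x200000 with an addend of 0x50000, instead of relative to the section start with an addend that no longer fits the immediate.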

This should fix the root issue behind
https://bugs.llvm.org/show_bug.cgi?id=52378.

Differential Revision: https://reviews.llvm.org/D114340
The file was modified llvm/lib/MC/WinCOFFObjectWriter.cpp (diff)
The file was added llvm/test/MC/AArch64/coff-relocations-offset.s
Commit 7c15da67614eca9272553ecfe8c1a0f6f68c134b by martin
[LLD] [COFF] Interpret the immediate in ARM64 adr/adrp relocations as signed 21 bit

This matches how MS link.exe interprets this relocation.
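
Interpreting a 21-bit field as signed amounts to a sign extension from bit 20. The following is a minimal sketch of that step under the assumption that the immediate has already been extracted into the low 21 bits of an integer; `signExtend21` is a hypothetical helper for illustration and not the function used in lld/COFF/Chunks.cpp.

```cpp
#include <cstdint>

// Treat the low 21 bits of `imm` as a two's-complement value.
// Bit 20 is the sign bit, so the representable range is [-2^20, 2^20 - 1].
constexpr std::int64_t signExtend21(std::uint32_t imm) {
  const std::uint32_t low21 = imm & 0x1FFFFF;
  return (low21 & 0x100000)
             ? static_cast<std::int64_t>(low21) - 0x200000
             : static_cast<std::int64_t>(low21);
}
```

Under an unsigned reading, the field value 0x1FFFFF would mean an offset of 2097151; under the signed reading it means -1.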

Differential Revision: https://reviews.llvm.org/D114347
The file was modified lld/COFF/Chunks.cpp (diff)
The file was modified lld/test/COFF/arm64-relocs-imports.test (diff)
Commit 4e5488afb27a64d12a76b770cc86bab8074e9c57 by martin
[AArch64] [COFF] Move jump tables back to the readonly section

This essentially reverts f5884d255e78305d41c28c6e001a460ff83981d8
(D57277).

That commit was made as a workaround since LLVM back then didn't
support cross-section relative relocations (IMAGE_REL_ARM64_REL32)
in COFF for ARM64. Support for this was implemented later,
in d5c5cf5ce8d921fc8c5e1b608c298a1ffa688d37 (D99572) and
382c505d9cfca8adaec47aea2da7bbcbc00fc05c (D102217).

The commit that moved jump tables to the function section noted
that it would be ideal to utilize IMAGE_REL_ARM64_REL32.

Differential Revision: https://reviews.llvm.org/D113576
The file was modified llvm/test/CodeGen/AArch64/win64-jumptable.ll (diff)
The file was modified llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (diff)
Commit d703b922961e0d02a5effdd4bfbb23ad50a3cc9f by martin
[LLD] [COFF] Omit section symbols and IMAGE_SYM_CLASS_LABEL from the PE symbol table

The section symbols aren't of much practical use when looking at
a linked image. This shrinks one observed mingw style unstripped
binary by 14%.

IMAGE_SYM_CLASS_LABEL is in spirit the same as a temporary assembler
label that isn't emitted on the object file level at all.

Differential Revision: https://reviews.llvm.org/D113866
The file was modified lld/test/COFF/strtab-size.s (diff)
The file was modified lld/COFF/Writer.cpp (diff)
The file was modified lld/test/COFF/symtab.test (diff)
Commit dc79d73605305f9dfaa7eb777b6ed317363bdb04 by david.green
[ARM] Add an test for showing the incorrect aliasing info around masked loads/stores. NFC
The file was added llvm/test/CodeGen/Thumb2/mve-masked-store-mmo.ll
Commit 8ea3e70fb02e59ddfd6a050344c7d177b11104f7 by david.green
[X86] Regenerate X86/vmaskmov-offset.ll check lines as per new mir format. NFC
The file was modified llvm/test/CodeGen/X86/vmaskmov-offset.ll (diff)
Commit 59f4b3d3081535b61609f12ea5f638905616fcbc by qiucofan
[PowerPC] Implement more fusion types for Power10

This implements the rest of the Power10 instruction fusion pairs, according
to the user manual, including 'wide immediate', 'load compare', 'zero move'
and 'SHA3 assist'.

Only 'SHA3 assist' is enabled by default.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D112912
The file was modified llvm/lib/Target/PowerPC/PPCMacroFusion.cpp (diff)
The file was modified llvm/lib/Target/PowerPC/PPCMacroFusion.def (diff)
The file was modified llvm/lib/Target/PowerPC/PPCSubtarget.h (diff)
The file was modified llvm/test/CodeGen/PowerPC/macro-fusion.mir (diff)
The file was modified llvm/lib/Target/PowerPC/PPC.td (diff)
The file was modified llvm/lib/Target/PowerPC/PPCSubtarget.cpp (diff)
Commit 32b6c17b29079e7d2ac61cdc90b10983ee97d78d by david.green
[SDAG] Use UnknownSize for masked load/store MMO size

A masked load or store accesses a potentially unknown number of bytes
at a memory location; that number is not generally known at compile time.
They do not necessarily load/store the entire vector width, and treating
them as such can lead to incorrect aliasing information (for example, if
the underlying object is smaller than the size of the vector).

This makes sure that the MMO is given an unknown size to represent this,
which is less accurate than "may load/store from up to 16 bytes", but
less incorrect than "will load/store from 16 bytes".
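
To illustrate why the full vector width overstates the access, here is a toy model (not LLVM's MMO or alias analysis code) that computes how far a masked access actually reaches for a given mask. The function name `accessedExtent` and the lane/element-size parameters are assumptions made for this sketch.

```cpp
#include <array>
#include <cstddef>

// Number of bytes from the base address up to the end of the last active
// lane. Returns 0 when no lane is active.
template <std::size_t N>
constexpr std::size_t accessedExtent(const std::array<bool, N> &mask,
                                     std::size_t elemSize) {
  std::size_t extent = 0;
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      extent = (i + 1) * elemSize; // the last active lane decides the extent
  return extent;
}
```

With four 4-byte lanes and only the first two lanes active, the access reaches 8 bytes rather than the full 16. If the underlying object is only 8 bytes large, a fixed 16-byte size would describe an access that cannot happen; since the mask is usually a runtime value, the extent is not known at compile time.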

Differential Revision: https://reviews.llvm.org/D113888
The file was modified llvm/include/llvm/CodeGen/MachineFunction.h (diff)
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (diff)
The file was modified llvm/test/CodeGen/Thumb2/mve-masked-store-mmo.ll (diff)
The file was modified llvm/test/CodeGen/X86/masked_compressstore.ll (diff)
The file was modified llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp (diff)
The file was modified llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (diff)
The file was modified llvm/test/CodeGen/X86/vmaskmov-offset.ll (diff)
Commit c7cc70c8f87789ba04a8610162de4ad135d99e16 by pifon
Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.""

This reverts and fixes commit de18b7dee6a81e5e790c8e8060065b1ef72d13ed.
The file was added mlir/include/mlir/Dialect/Bufferization/IR/AllocationOpInterface.td
The file was added mlir/include/mlir/Dialect/Bufferization/IR/AllocationOpInterface.h
The file was added mlir/lib/Dialect/Bufferization/IR/AllocationOpInterface.cpp
The file was modified mlir/lib/Dialect/CMakeLists.txt (diff)
The file was modified mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (diff)
The file was modified mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (diff)
The file was modified mlir/include/mlir/Interfaces/SideEffectInterfaces.td (diff)
The file was added mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt
The file was added mlir/include/mlir/Dialect/Bufferization/CMakeLists.txt
The file was modified mlir/include/mlir/Dialect/CMakeLists.txt (diff)
The file was modified mlir/lib/Transforms/BufferDeallocation.cpp (diff)
The file was modified utils/bazel/llvm-project-overlay/mlir/BUILD.bazel (diff)
The file was added mlir/lib/Dialect/Bufferization/IR/CMakeLists.txt
The file was modified mlir/lib/Transforms/CMakeLists.txt (diff)
The file was added mlir/lib/Dialect/Bufferization/CMakeLists.txt
Commit a5fff58781f30ff3fd7a3f56948552cf7b8842bb by flo
[ThreadPool] Do not return shared futures.

The only user of the futures returned from ThreadPool is llvm-reduce after
D113857.

There should be no cases where multiple threads wait on the same future,
so there should be no need to return std::shared_future<>. Instead return
plain std::future<>.

If users need to share a future between multiple threads, they can share
the futures themselves.
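
A caller that does need several waiters can convert the plain future on its side with `std::future::share()`. The sketch below uses only the C++ standard library; `submitTask` is a hypothetical stand-in for a pool call that returns `std::future<>`, not the llvm::ThreadPool API.

```cpp
#include <future>
#include <thread>

// Hypothetical stand-in for a pool's submit call returning a plain future.
inline std::future<int> submitTask() {
  return std::async(std::launch::async, [] { return 42; });
}

// Caller-side sharing: share() turns the std::future into a
// std::shared_future, which may be copied and waited on by several threads.
inline int sumFromTwoWaiters() {
  std::shared_future<int> shared = submitTask().share();
  int a = 0;
  int b = 0;
  std::thread t1([shared, &a] { a = shared.get(); });
  std::thread t2([shared, &b] { b = shared.get(); });
  t1.join();
  t2.join();
  return a + b; // both waiters observed the same result
}
```

Each thread holds its own copy of the `std::shared_future`, so both calls to `get()` are valid, whereas calling `get()` twice on a single `std::future` is not.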

Reviewed By: Meinersbur, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114363
The file was modified mlir/include/mlir/IR/Threading.h (diff)
The file was modified llvm/include/llvm/Support/ThreadPool.h (diff)
Commit 47e2644c89b3be6faa0f5cc4c70ef96ec295da9a by ybrevnov
[DSE][NFC] Introduce "doesn't overwrite" return code for isOverwrite

Add OR_None code to indicate that there is no overwrite. This has no effect on current uses but will be used in one of the next patches building support for PHI translation.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D105098
The file was modified llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (diff)