1. Add flang to clang-cmake-aarch64-full
Commit 4720197cd41704fb7294de7452656c77e4183c8d by diana.picus
Add flang to clang-cmake-aarch64-full

This requires some minor changes in, to make it possible
to enable flang along all the other projects. Enabling flang will also
pull in MLIR.

Differential Revision:
The file was modified zorg/buildbot/builders/
The file was modified buildbot/osuosl/master/config/


  1. [X86][SSE] SimplifyDemandedVectorEltsForTargetNode - add general shuffle combining support
  2. [mlir][VectorOps] Fail fast when a strided memref is passed to vector_transfer
  3. [X86] Remove superfluous trailing semicolons, fixing warnings. NFC.
  4. [DebugInfo] Remove Dwarf5AccelTableWriter::Header::UnitLength. NFC.
  5. [DebugInfo] Emit a 1-byte value as a terminator of entries list in the name index.
  6. [AArch64][SVE] Preserve full vector regs over EH edge.
  7. [AMDGPU] Fix offset for REL32_HI relocs
  8. [SVE] Don't reorder subvector/binop sequences when the resulting binop is not legal.
Commit 21d02dc595797677a533c6b0508c0c9235bc4f13 by llvm-dev
[X86][SSE] SimplifyDemandedVectorEltsForTargetNode - add general shuffle combining support

This patch uses partial DemandedElts masks to further simplify target shuffle chains and finally starts making target shuffle combining part of SimplifyDemandedBits/SimplifyDemandedVectorElts.

We already manage this for Depth == 0 cases, where combineX86ShuffleChain would early-out if the shuffle combined to the same op, but the patch generalizes this by manipulating the depth handling of combineX86ShufflesRecursively - calling with a new Depth = 0 and reducing the maximum shuffle combine depth accordingly.

Differential Revision:
The file was modified llvm/test/CodeGen/X86/avx-trunc.ll
The file was modified llvm/test/CodeGen/X86/urem-seteq-vec-nonzero.ll
The file was modified llvm/test/CodeGen/X86/vshift-4.ll
The file was modified llvm/test/CodeGen/X86/promote-cmp.ll
The file was modified llvm/test/CodeGen/X86/load-partial.ll
The file was modified llvm/test/CodeGen/X86/pr29112.ll
The file was modified llvm/test/CodeGen/X86/shuffle-of-insert.ll
The file was modified llvm/test/CodeGen/X86/avg.ll
The file was modified llvm/test/CodeGen/X86/vector-trunc-math.ll
The file was modified llvm/test/CodeGen/X86/vector-zext.ll
The file was modified llvm/test/CodeGen/X86/test-shrink-bug.ll
The file was modified llvm/test/CodeGen/X86/vector-pack-256.ll
The file was modified llvm/test/CodeGen/X86/buildvec-extract.ll
The file was modified llvm/test/CodeGen/X86/pmul.ll
The file was modified llvm/test/CodeGen/X86/vec_insert-5.ll
The file was modified llvm/test/CodeGen/X86/pmulh.ll
The file was modified llvm/test/CodeGen/X86/combine-fcopysign.ll
The file was modified llvm/test/CodeGen/X86/masked_expandload.ll
The file was modified llvm/test/CodeGen/X86/trunc-subvector.ll
The file was modified llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
The file was modified llvm/test/CodeGen/X86/haddsub-undef.ll
The file was modified llvm/test/CodeGen/X86/vector-trunc.ll
The file was modified llvm/test/CodeGen/X86/shuffle-strided-with-offset-128.ll
The file was modified llvm/test/CodeGen/X86/vec_int_to_fp.ll
The file was modified llvm/test/CodeGen/X86/psubus.ll
The file was modified llvm/test/CodeGen/X86/masked_load.ll
The file was modified llvm/test/CodeGen/X86/shuffle-vs-trunc-512.ll
The file was modified llvm/test/CodeGen/X86/vector-idiv-udiv-256.ll
The file was modified llvm/test/CodeGen/X86/oddsubvector.ll
The file was modified llvm/test/CodeGen/X86/urem-seteq-vec-nonsplat.ll
The file was modified llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
The file was modified llvm/test/CodeGen/X86/bitcast-setcc-128.ll
The file was modified llvm/test/CodeGen/X86/vselect.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
The file was modified llvm/test/CodeGen/X86/known-signbits-vector.ll
The file was modified llvm/test/CodeGen/X86/masked_store_trunc.ll
The file was modified llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-128-v16.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-variable-128.ll
The file was modified llvm/test/CodeGen/X86/vec_set-6.ll
The file was modified llvm/test/CodeGen/X86/insert-into-constant-vector.ll
The file was modified llvm/test/CodeGen/X86/vector-reduce-mul.ll
The file was modified llvm/test/CodeGen/X86/shrink_vmul.ll
The file was modified llvm/test/CodeGen/X86/vec_insert-2.ll
The file was modified llvm/test/CodeGen/X86/udiv_fix_sat.ll
The file was modified llvm/test/CodeGen/X86/buildvec-insertvec.ll
The file was modified llvm/test/CodeGen/X86/shuffle-vs-trunc-256.ll
The file was modified llvm/test/CodeGen/X86/vec_insert-3.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-combining.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-128-v8.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-128-v4.ll
The file was modified llvm/test/CodeGen/X86/bitcast-and-setcc-128.ll
The file was modified llvm/test/CodeGen/X86/combine-shl.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-256-v16.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
The file was modified llvm/test/CodeGen/X86/oddshuffles.ll
The file was modified llvm/test/CodeGen/X86/insertelement-shuffle.ll
The file was modified llvm/test/CodeGen/X86/srem-seteq-vec-nonsplat.ll
The file was modified llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
The file was modified llvm/test/CodeGen/X86/udiv_fix.ll
Commit 2bf491c7294c020d1754cddbf3a55e8e21c14bdc by benny.kra
[mlir][VectorOps] Fail fast when a strided memref is passed to vector_transfer

Otherwise we'll silently miscompile things.

Differential Revision:
The file was modified mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
Commit 4820af2bfc71f2a5bb7a335a7fdcc4ec395877e4 by martin
[X86] Remove superfluous trailing semicolons, fixing warnings. NFC.
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit 71eed4808fbc5e5baa016210f727683610139014 by ikudrin
[DebugInfo] Remove Dwarf5AccelTableWriter::Header::UnitLength. NFC.

The member is not in use; the unit length for the table is emitted as
a difference between two labels. Moreover, the type of the member might
be misleading, because for DWARF64 the field should be 64 bit long.
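To illustrate the DWARF64 remark, here is a minimal sketch of how a unit length ("initial length") field is laid out in the two DWARF formats. It is not the AccelTable.cpp code; the function name `encode_unit_length` is hypothetical, and little-endian byte order is assumed.

```python
import struct


def encode_unit_length(length: int, dwarf64: bool = False) -> bytes:
    """Encode a DWARF initial length field (little-endian assumed).

    DWARF32 uses a single 4-byte length. DWARF64 uses the 4-byte escape
    value 0xffffffff followed by an 8-byte length.
    """
    if dwarf64:
        return struct.pack("<IQ", 0xFFFFFFFF, length)
    if length >= 0xFFFFFFF0:
        # Values from 0xfffffff0 upwards are reserved in the 32-bit format.
        raise ValueError("length does not fit the DWARF32 format")
    return struct.pack("<I", length)


dwarf32_field = encode_unit_length(100)
dwarf64_field = encode_unit_length(100, dwarf64=True)
```

A fixed 32-bit member can describe only the first layout, which is why its type could mislead a reader once DWARF64 is in play.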

Differential Revision:
The file was modified llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
Commit 3445ec9ba718035b27c0140dc1e892be843236f5 by ikudrin
[DebugInfo] Emit a 1-byte value as a terminator of entries list in the name index.

As stated in section, DWARFv5, p. 142,
| The last entry for each name is followed by a zero byte that
| terminates the list. There may be gaps between the lists.

The patch changes emitting a 4-byte zero value to a 1-byte one, which
effectively removes the gap between entry lists, and thus saves
approximately 3 bytes per name; the calculation is not exact because
the total size of the table is aligned to 4.
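The size effect described above can be sketched with a rough model. This is an illustration only, not the emitter itself: the function names and the per-name entry list sizes are made up, and the model counts nothing but the entry lists and their terminators.

```python
def align_up(n: int, alignment: int = 4) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def name_lists_size(entry_list_sizes, terminator_size: int) -> int:
    """Bytes used by all entry lists, each followed by a zero terminator,
    with the total padded to a multiple of 4."""
    raw = sum(size + terminator_size for size in entry_list_sizes)
    return align_up(raw)


# Hypothetical entry list sizes in bytes, one list per name.
sizes = [5, 5, 9, 5, 13]
before = name_lists_size(sizes, terminator_size=4)
after = name_lists_size(sizes, terminator_size=1)
print(before, after, before - after)
```

With these made-up sizes the total drops from 60 to 44 bytes: 16 bytes for 5 names rather than exactly 15, because of the final alignment.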

Differential Revision:
The file was modified llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
The file was added llvm/test/DebugInfo/X86/debug-names-end-of-list.ll
Commit f13beac51be02cae21bce465206a920ecdca7566 by sander.desmalen
[AArch64][SVE] Preserve full vector regs over EH edge.

Unwinders may only preserve the lower 64 bits of Neon and SVE registers,
as only the registers in the base ABI are guaranteed to be preserved
over the exception edge. The caller will need to preserve additional
registers for when the call throws an exception and the unwinder has
tried to recover state.

For example:

    svint32_t bar(svint32_t);
    svint32_t foo(svint32_t x, bool *err) {
      try { bar(x); } catch (...) { *err = true; }
      return x;
    }

`z0` needs to be spilled before the call to `bar(x)` and reloaded before
returning from foo, as the exception handler may have clobbered z0.

Reviewed By: efriedma

Differential Revision:
The file was modified llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
The file was added llvm/test/CodeGen/AArch64/unwind-preserved-from-mir.mir
The file was modified llvm/lib/CodeGen/MIRParser/MIRParser.cpp
The file was modified llvm/include/llvm/CodeGen/TargetRegisterInfo.h
The file was modified llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
The file was modified llvm/lib/CodeGen/LiveIntervals.cpp
The file was modified llvm/lib/Target/AArch64/AArch64RegisterInfo.h
The file was added llvm/test/CodeGen/AArch64/unwind-preserved.ll
Commit 4bdab2e86aba371dbad100dd3515ab9f05833719 by jay.foad
[AMDGPU] Fix offset for REL32_HI relocs

The addend in a REL32 reloc needs to be adjusted to account for the
offset from the PC value returned by the s_getpc instruction to the
point where the reloc is applied. This was being done correctly for
(GOTPC)REL32_LO but not for (GOTPC)REL32_HI. This will only make a
difference if the target symbol happens to get loaded almost exactly
a multiple of 4G away from the relocated instructions.
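A minimal sketch of why the wrong addend only matters near a multiple of 4G, assuming the usual PC-relative relocation formula S + A - P (symbol, addend, place). The byte offsets of the two relocated fields from the s_getpc result (4 and 12 here) are illustrative assumptions, as are all the function names; this is not the SIISelLowering.cpp code.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def rel32_lo(sym: int, addend: int, place: int) -> int:
    """Low 32 bits of S + A - P."""
    return (sym + addend - place) & MASK32


def rel32_hi(sym: int, addend: int, place: int) -> int:
    """High 32 bits of S + A - P, with 64-bit wraparound."""
    return ((sym + addend - place) & MASK64) >> 32


def materialize(sym: int, pc: int, lo_off: int, hi_off: int, hi_addend: int) -> int:
    """Add the two relocated halves to PC, as the instruction pair would."""
    lo = rel32_lo(sym, lo_off, pc + lo_off)     # addend cancels the field offset
    hi = rel32_hi(sym, hi_addend, pc + hi_off)  # correct only if hi_addend == hi_off
    return (pc + ((hi << 32) | lo)) & MASK64


PC = 0x1000                      # hypothetical value returned by s_getpc
LO_OFF, HI_OFF = 4, 12           # assumed offsets of the relocated fields
sym_near = PC + 0x2000           # ordinary nearby symbol
sym_far = PC + (1 << 32) + 4     # just past a multiple of 4G from PC

ok_near = materialize(sym_near, PC, LO_OFF, HI_OFF, hi_addend=LO_OFF) == sym_near
ok_far = materialize(sym_far, PC, LO_OFF, HI_OFF, hi_addend=LO_OFF) == sym_far
```

Reusing the LO addend for the HI relocation makes the 64-bit difference too small by `HI_OFF - LO_OFF` bytes, so the high half changes only when the true difference sits within that many bytes above a multiple of 4G, as with `sym_far` here.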

Differential Revision:
The file was modified llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
The file was modified llvm/test/CodeGen/AMDGPU/global-constant.ll
The file was modified llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll
The file was modified llvm/test/CodeGen/AMDGPU/cc-update.ll
The file was modified llvm/test/CodeGen/AMDGPU/call-argument-types.ll
The file was modified llvm/test/CodeGen/AMDGPU/propagate-attributes-bitcast-function.ll
The file was modified llvm/test/CodeGen/AMDGPU/indirect-call.ll
The file was modified llvm/test/CodeGen/AMDGPU/vgpr-tuple-allocation.ll
The file was modified llvm/test/CodeGen/AMDGPU/reassoc-scalar.ll
The file was modified llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll
The file was modified llvm/test/CodeGen/AMDGPU/propagate-attributes-single-set.ll
The file was modified llvm/test/CodeGen/AMDGPU/rel32.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/localizer.ll
The file was modified llvm/test/CodeGen/AMDGPU/call-constexpr.ll
The file was modified llvm/test/CodeGen/AMDGPU/propagate-attributes-clone.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/divergent-control-flow.ll
The file was modified llvm/test/CodeGen/AMDGPU/call-preserved-registers.ll
The file was modified llvm/lib/Target/AMDGPU/SIISelLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
The file was modified llvm/test/CodeGen/AMDGPU/cross-block-use-is-not-abi-copy.ll
The file was modified llvm/test/CodeGen/AMDGPU/mem-builtins.ll
The file was modified llvm/test/CodeGen/AMDGPU/captured-frame-index.ll
The file was modified llvm/test/CodeGen/AMDGPU/sibling-call.ll
The file was modified llvm/test/CodeGen/AMDGPU/function-call-relocs.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
The file was modified llvm/test/CodeGen/AMDGPU/global-variable-relocs.ll
The file was modified llvm/test/CodeGen/AMDGPU/call-waitcnt.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/global-value.ll
The file was modified llvm/test/CodeGen/MIR/AMDGPU/target-flags.mir
The file was modified llvm/test/CodeGen/AMDGPU/asm-printer-check-vcc.mir
Commit f72121254da48bf668c35918b53c96cf8c568342 by paul.walker
[SVE] Don't reorder subvector/binop sequences when the resulting binop is not legal.

When lowering fixed length vector operations for SVE the subvector
operations are used extensively to marshall data between scalable
and fixed-length vectors. This means that sequences like:

  extract_subvec(binop(insert_subvec(a), insert_subvec(b)))

are very common. DAGCombine only checks if the resulting binop is
legal or can be custom lowered when undoing such sequences. When
it's custom lowering that is introducing them the result is an
infinite legalise->combine->legalise loop.

This patch extends the isOperationLegalOr... functions to include
a "LegalOnly" parameter to restrict the check to legal operations
only. Although isOperationLegal could be used it's common for
the affected code paths to be visited pre and post legalisation,
so the extra parameter keeps the code tidy.
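The loop and the effect of the extra parameter can be sketched with a toy model. Everything here is hypothetical (the `Action` enum, the function names, the two-state "fixed"/"wrapped" representation); it only mirrors the shape of the check described above, not the TargetLowering.h or DAGCombiner.cpp code.

```python
from enum import Enum


class Action(Enum):
    LEGAL = "legal"
    CUSTOM = "custom"
    EXPAND = "expand"


def is_operation_legal_or_custom(action: Action, legal_only: bool = False) -> bool:
    """Toy legality query: with legal_only set, custom lowering does not count."""
    if legal_only:
        return action is Action.LEGAL
    return action in (Action.LEGAL, Action.CUSTOM)


def run_pipeline(binop_action: Action, legal_only: bool, max_rounds: int = 10):
    """Alternate legalise and combine steps.

    Returns the number of rounds needed to reach a fixed point,
    or None if none is reached within max_rounds.
    """
    form = "fixed"  # binop on fixed-length vectors
    for i in range(1, max_rounds + 1):
        # Legalise: custom lowering wraps the binop as
        # extract_subvec(binop(insert_subvec(a), insert_subvec(b))).
        if form == "fixed" and binop_action is Action.CUSTOM:
            form = "wrapped"
        # Combine: undo the wrapping if the narrow binop passes the check.
        if form == "wrapped" and is_operation_legal_or_custom(binop_action, legal_only):
            form = "fixed"
            continue
        return i
    return None


loops_forever = run_pipeline(Action.CUSTOM, legal_only=False) is None
settles = run_pipeline(Action.CUSTOM, legal_only=True)
```

In this model a custom-lowered binop never settles while the combine accepts "legal or custom", and settles after one round once the check is restricted to legal operations.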

Differential Revision:
The file was modified llvm/include/llvm/CodeGen/TargetLowering.h
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-fixed-length-subvector.ll

