Success

Changes

Summary

  1. [llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should.
  2. [mlir][PDL] Add a PDL Interpreter Dialect
  3. [Scheduling] Implement a new way to cluster loads/stores
  4. [DWARFYAML] Make the unit_length and header_length fields optional.
  5. [AMDGPU][GlobalISel] Eliminate barrier if workgroup size is not greater than wavefront size
  6. GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD
  7. AMDGPU/GlobalISel: Tolerate negated control flow intrinsic outputs
  8. Add clang-cl "vctoolsdir" option to specify the location of the msvc toolchain
  9. AMDGPU: Use Subtarget reference in SIInstrInfo
  10. [Support] Allow printing the stack trace only for a given depth
  11. [LegalizeTypes] Add ROTL/ROTR to ScalarizeVectorResult.
  12. [libc] Add implementations for sqrt, sqrtf, and sqrtl.
  13. [OpenMP] Fix build on macOS sdk 10.12 and newer
  14. [AMDGPU] Make more use of Subtarget reference in SIInstrInfo
  15. [lldb][NFC] Simplify string literal in GDBRemoteCommunicationClient
  16. Fix failing tests after VCTOOLSDIR change
  17. Bump -len_control value in fuzzer-custommutator.test (PR47286)
  18. [clangd] Enable recovery-ast-type by default.
  19. [libc++] Always run Ninja through xcrun in the macOS CI scripts
Commit 92c527e5a2b49fb1213ceda97738d4caf414666a by grimar
[llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should.

Currently, `dyn_cast<XCOFFObjectFile>` always performs the cast and returns a
non-null pointer, even when we pass an `ELF`, `Wasm`, `Mach-O` or `COFF` object
instead of `XCOFF`.

This happens because the `XCOFFObjectFile` class does not implement `classof`.
I've fixed it and added a unit test.
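
To illustrate why a missing `classof` makes the cast succeed unconditionally, here is a minimal, self-contained sketch. It does not use the LLVM headers: `Binary`, `ObjectFile`, `ELFObjectFile`, `XCOFFWithoutClassof` and `dynCastSketch` are simplified, hypothetical stand-ins, and the assumption is that a subclass without its own `classof` falls back to the permissive check inherited from its base class.

```cpp
#include <cassert>

// Hypothetical stand-in for the common base class; the real one has more kinds.
class Binary {
public:
  enum Kind { ID_ELF, ID_XCOFF };
  explicit Binary(Kind K) : TypeID(K) {}
  virtual ~Binary() = default;
  bool isXCOFF() const { return TypeID == ID_XCOFF; }

private:
  Kind TypeID;
};

class ObjectFile : public Binary {
public:
  using Binary::Binary;
  // Every kind in this sketch is an object file, so the base check accepts all.
  static bool classof(const Binary *) { return true; }
};

class ELFObjectFile : public ObjectFile {
public:
  ELFObjectFile() : ObjectFile(ID_ELF) {}
};

// No classof here: name lookup finds ObjectFile::classof, which accepts anything.
class XCOFFWithoutClassof : public ObjectFile {
public:
  XCOFFWithoutClassof() : ObjectFile(ID_XCOFF) {}
};

// With classof: the check is specific to the XCOFF kind.
class XCOFFObjectFile : public ObjectFile {
public:
  XCOFFObjectFile() : ObjectFile(ID_XCOFF) {}
  static bool classof(const Binary *B) { return B->isXCOFF(); }
};

// Simplified model of dyn_cast: ask the target type, then downcast or give up.
template <typename To, typename From> To *dynCastSketch(From *V) {
  assert(V && "dynCastSketch expects a non-null pointer");
  return To::classof(V) ? static_cast<To *>(V) : nullptr;
}
```

In this model, `dynCastSketch<XCOFFObjectFile>` returns null for an `ELFObjectFile`, while `XCOFFWithoutClassof::classof` accepts the same object because it resolves to the inherited base-class check.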

Differential revision: https://reviews.llvm.org/D86542
The file was modified llvm/unittests/Object/XCOFFObjectFileTest.cpp
The file was modified llvm/include/llvm/Object/XCOFFObjectFile.h
Commit d289a97f91443177b605926668512479c2cee37b by riddleriver
[mlir][PDL] Add a PDL Interpreter Dialect

The PDL Interpreter dialect provides a lower-level abstraction than the PDL dialect and is targeted at low-level optimization and interpreter code generation. The dialect's operations encapsulate low-level pattern match and rewrite "primitives", such as navigating the IR (Operation::getOperand), creating new operations (OpBuilder::create), etc. Many of the operations within this dialect also fuse branching control flow with some form of predicate comparison operation. This type of fusion reduces the amount of work that an interpreter must do when executing.

An example of this representation is shown below:

```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite %root {
    pdl.replace %root with (%inputOperand)
  }
}

// May be represented in the interpreter dialect as follows:
module {
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
  ^bb1:
    pdl_interp.return
  ^bb2:
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
  ^bb3:
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
  ^bb4:
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
  ^bb5:
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
  ^bb6:
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
      pdl_interp.return
    }
  }
}
```

Differential Revision: https://reviews.llvm.org/D84579
The file was modified mlir/lib/Parser/Parser.h
The file was added mlir/include/mlir/Dialect/PDLInterp/IR/PDLInterp.h
The file was added mlir/lib/Dialect/PDLInterp/CMakeLists.txt
The file was modified mlir/include/mlir/IR/Attributes.h
The file was modified mlir/test/Dialect/PDL/ops.mlir
The file was modified mlir/lib/Dialect/PDL/IR/PDL.cpp
The file was added mlir/test/Dialect/PDLInterp/ops.mlir
The file was modified mlir/lib/Parser/Parser.cpp
The file was added mlir/include/mlir/Dialect/PDLInterp/IR/PDLInterpOps.td
The file was modified mlir/tools/mlir-tblgen/OpFormatGen.cpp
The file was modified mlir/lib/Dialect/CMakeLists.txt
The file was modified mlir/include/mlir/IR/Builders.h
The file was modified mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
The file was modified mlir/test/Dialect/PDL/invalid.mlir
The file was modified mlir/include/mlir/Dialect/PDL/IR/PDLBase.td
The file was modified mlir/lib/Parser/AttributeParser.cpp
The file was modified mlir/include/mlir/Dialect/CMakeLists.txt
The file was added mlir/include/mlir/Dialect/PDLInterp/IR/CMakeLists.txt
The file was modified mlir/include/mlir/IR/OpImplementation.h
The file was added mlir/lib/Dialect/PDLInterp/IR/CMakeLists.txt
The file was modified mlir/lib/IR/Builders.cpp
The file was modified mlir/include/mlir/InitAllDialects.h
The file was added mlir/lib/Dialect/PDLInterp/IR/PDLInterp.cpp
The file was modified mlir/include/mlir/Dialect/PDL/IR/PDLOps.td
The file was added mlir/include/mlir/Dialect/PDLInterp/CMakeLists.txt
Commit ebf3b188c6edcce7e90ddcacbe7c51c90d95b0ac by qshanz
[Scheduling] Implement a new way to cluster loads/stores

Before calling the target hook to determine whether two loads/stores are
clusterable, we put them into different groups to avoid forming a fake cluster
between dependent instructions. For now, we put loads/stores into the same
group if they have the same predecessor. We assume that, if two loads/stores
have the same predecessor, they likely do not depend on each other.

However, one SUnit might have several predecessors and, for now, we just
pick the first predecessor that has a non-data/non-artificial dependency,
which is too arbitrary. And we have been struggling to fix it.

So I am proposing a better implementation:
1. Collect all the loads/stores that have memory info first to reduce the complexity.
2. Sort these loads/stores so that we can stop the search as early as possible.
3. For each load/store, search in sorted order for the first instruction it has
   no dependency with, and check whether the two can be clustered.
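
The three steps above can be sketched roughly as follows. This is a simplified, self-contained model and not the code from the patch: `MemOp`, `clusterMemOps`, the sort key (base, then offset) and the two callbacks are hypothetical names and choices made for illustration. `Depends` stands in for the scheduler's dependency query and `CanCluster` for the target hook.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical record for a load/store; not an LLVM type.
struct MemOp {
  int Id;           // instruction identifier
  bool HasMemInfo;  // false if base/offset could not be determined
  int Base;         // identifier of the base register or object
  int64_t Offset;   // byte offset from the base
};

// True if either instruction depends on the other.
using DependsFn = std::function<bool(int, int)>;
// Stand-in for the target hook that decides whether two ops may cluster.
using ClusterFn = std::function<bool(const MemOp &, const MemOp &)>;

std::vector<std::pair<int, int>> clusterMemOps(const std::vector<MemOp> &Ops,
                                               const DependsFn &Depends,
                                               const ClusterFn &CanCluster) {
  assert(Depends && CanCluster && "both callbacks must be set");

  // Step 1: keep only the loads/stores that have memory info.
  std::vector<MemOp> Candidates;
  for (const MemOp &Op : Ops)
    if (Op.HasMemInfo)
      Candidates.push_back(Op);

  // Step 2: sort so that likely partners end up next to each other.
  std::sort(Candidates.begin(), Candidates.end(),
            [](const MemOp &A, const MemOp &B) {
              return std::tie(A.Base, A.Offset, A.Id) <
                     std::tie(B.Base, B.Offset, B.Id);
            });

  // Step 3: for each op, find the first later op it has no dependency with,
  // then ask the hook whether the pair can be clustered.
  std::vector<std::pair<int, int>> Pairs;
  for (std::size_t I = 0; I + 1 < Candidates.size(); ++I) {
    std::size_t J = I + 1;
    while (J < Candidates.size() &&
           Depends(Candidates[I].Id, Candidates[J].Id))
      ++J;
    if (J == Candidates.size())
      continue;
    if (CanCluster(Candidates[I], Candidates[J]))
      Pairs.emplace_back(Candidates[I].Id, Candidates[J].Id);
  }
  return Pairs;
}
```

Because the candidates are sorted, each search starts at the nearest neighbour and only moves further when a dependency rules that neighbour out, which is what allows the search to stop early.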

Reviewed By: Jay Foad

Differential Revision: https://reviews.llvm.org/D85517
The file was modified llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
The file was modified llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
The file was modified llvm/test/CodeGen/AMDGPU/stack-realign.ll
The file was modified llvm/lib/CodeGen/MachineScheduler.cpp
The file was modified llvm/test/CodeGen/AMDGPU/max.i16.ll
The file was modified llvm/test/CodeGen/AArch64/aarch64-stp-cluster.ll
Commit 8daa3264a3329ad34a0b210afdd8699f27d66db2 by Xing
[DWARFYAML] Make the unit_length and header_length fields optional.

This patch makes the unit_length and header_length fields of line tables
optional. yaml2obj is able to infer them for us.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86590
The file was modified llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
The file was modified llvm/include/llvm/ObjectYAML/DWARFYAML.h
The file was modified llvm/test/tools/yaml2obj/ELF/DWARF/debug-line.yaml
The file was modified llvm/lib/ObjectYAML/DWARFYAML.cpp
The file was modified llvm/tools/obj2yaml/dwarf2yaml.cpp
The file was modified llvm/lib/ObjectYAML/DWARFEmitter.cpp
Commit 831457c6d59edb0e381917b35ca6099f9b86c6e8 by jay.foad
[AMDGPU][GlobalISel] Eliminate barrier if workgroup size is not greater than wavefront size

If the workgroup size is known to be no greater than the wavefront size,
the s_barrier instruction is not needed, since all threads are guaranteed
to come to the same point at the same time.
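
The condition described here amounts to a single comparison. A minimal sketch, assuming the maximum workgroup size and the wavefront size are both known at selection time (the function name and parameters are hypothetical, not the selector's actual interface):

```cpp
#include <cassert>

// Returns true if an s_barrier is still required. When the whole workgroup
// fits into one wavefront, the barrier can be dropped.
bool needsBarrier(unsigned MaxWorkGroupSize, unsigned WavefrontSize) {
  assert(WavefrontSize > 0 && "wavefront size must be positive");
  return MaxWorkGroupSize > WavefrontSize;
}
```

For example, with a wavefront size of 64, a workgroup of 64 or fewer threads needs no barrier under this rule, while a workgroup of 256 threads does.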

This is the same optimization that was implemented for SelectionDAG in
D31731.

Differential Revision: https://reviews.llvm.org/D86609
The file was modified llvm/test/CodeGen/AMDGPU/barrier-elimination.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
The file was modified llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
Commit eb074088c983ce6f255e0e83babdd32a4f2dd457 by Matthew.Arsenault
GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD

This produces less work for addressing mode matching. I think this is
safe since I don't think machine IR is supposed to give the same
aliasing properties as getelementptr in the IR.
The file was modified llvm/include/llvm/Target/GlobalISel/Combine.td
The file was modified llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
The file was modified llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/combine-add-to-ptradd.mir
Commit 21ccedc24fc49b43e84095b4773f8aa86c366dac by Matthew.Arsenault
AMDGPU/GlobalISel: Tolerate negated control flow intrinsic outputs

If the condition output is negated, swap the branch targets. This is
similar to what SelectionDAG does when SelectionDAGBuilder decides to
invert the condition and swap the branches.
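
The equivalence this relies on can be shown with a small model. The sketch below is not the legalizer code: `CondBranch`, `foldNegation` and `takenTarget` are hypothetical names, and block labels are plain strings. It only demonstrates that branching on a negated condition is the same as branching on the original condition with the two targets swapped.

```cpp
#include <string>
#include <utility>

// Hypothetical model of a two-way conditional branch; not an LLVM type.
struct CondBranch {
  bool Negated;             // true if the branch tests the negated output
  std::string TrueTarget;   // successor taken when the tested value is true
  std::string FalseTarget;  // successor taken when the tested value is false
};

// Remove the negation by swapping the successors.
CondBranch foldNegation(CondBranch Br) {
  if (Br.Negated) {
    std::swap(Br.TrueTarget, Br.FalseTarget);
    Br.Negated = false;
  }
  return Br;
}

// Successor reached for a given raw (un-negated) condition value.
const std::string &takenTarget(const CondBranch &Br, bool RawCond) {
  bool Effective = Br.Negated ? !RawCond : RawCond;
  return Effective ? Br.TrueTarget : Br.FalseTarget;
}
```

For either value of the raw condition, `takenTarget` gives the same successor before and after `foldNegation`, which is why the swap preserves behaviour.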

This is leaving behind a dead constant def for some reason.
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-amdgcn.if-invalid.mir
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-brcond.mir
The file was modified llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
Commit 08704714421086e775a297339fc60963cc07eec9 by hans
Add clang-cl "vctoolsdir" option to specify the location of the msvc toolchain

Add an option to directly specify where the msvc toolchain lives for
clang-cl and avoid unwanted file and registry probes.

Differential revision: https://reviews.llvm.org/D85998
The file was modified clang/lib/Driver/ToolChains/MSVC.cpp
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/test/Driver/cl-options.c
Commit ff34116cf022ca010d60b972dae55016cd5f7478 by Matthew.Arsenault
AMDGPU: Use Subtarget reference in SIInstrInfo
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Commit a7da7e421c54e0053628f18f3750d4a8588cd627 by alexandre.ganea
[Support] Allow printing the stack trace only for a given depth

Differential Revision: https://reviews.llvm.org/D85458
The file was modified llvm/include/llvm/Support/Signals.h
The file was modified llvm/unittests/Support/CrashRecoveryTest.cpp
The file was modified llvm/lib/Support/Unix/Signals.inc
The file was modified llvm/lib/Support/Windows/Signals.inc
Commit 75d159f924868ec93e3008b04b637412b64de29e by jay.foad
[LegalizeTypes] Add ROTL/ROTR to ScalarizeVectorResult.

We can scalarize these just like any other binary operation.
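
As a rough illustration of what scalarizing a rotate means, the sketch below models a one-element vector rotate as the scalar rotate of its only element. The names `rotl32`, `rotr32`, `V1`, `rotlV1` and `rotrV1` are hypothetical, a 32-bit element type is assumed, and the rotate amount is assumed to be taken modulo the bit width.

```cpp
#include <array>
#include <cstdint>

// Scalar rotate left; masking keeps both shift amounts in the range [0, 31].
uint32_t rotl32(uint32_t X, uint32_t Amt) {
  Amt &= 31;
  return (X << Amt) | (X >> ((32 - Amt) & 31));
}

// Scalar rotate right, mirroring rotl32.
uint32_t rotr32(uint32_t X, uint32_t Amt) {
  Amt &= 31;
  return (X >> Amt) | (X << ((32 - Amt) & 31));
}

// Model of a <1 x i32> vector.
using V1 = std::array<uint32_t, 1>;

// "Scalarized" vector rotates: operate on the single element of each operand,
// the same way any other two-operand operation would be handled.
V1 rotlV1(V1 X, V1 Amt) { return V1{{rotl32(X[0], Amt[0])}}; }
V1 rotrV1(V1 X, V1 Amt) { return V1{{rotr32(X[0], Amt[0])}}; }
```

Both operands are unwrapped to their single element, the scalar operation is applied, and the result is wrapped again, which is the pattern shared with other binary operations.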

Fixes https://bugs.llvm.org/show_bug.cgi?id=47303 caused by D77152.

Differential Revision: https://reviews.llvm.org/D86601
The file was modified llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
Commit 5078825aa982905088502f14b5387fc5c96017fe by lntue
[libc] Add implementations for sqrt, sqrtf, and sqrtl.

Differential Revision: https://reviews.llvm.org/D84726
The file was added libc/test/src/math/sqrtf_test.cpp
The file was added libc/utils/FPUtil/SqrtLongDoubleX86.h
The file was modified libc/config/linux/aarch64/entrypoints.txt
The file was modified libc/config/linux/x86_64/entrypoints.txt
The file was modified libc/test/src/math/CMakeLists.txt
The file was added libc/src/math/sqrt.cpp
The file was added libc/src/math/sqrtl.h
The file was added libc/test/src/math/sqrt_test.cpp
The file was added libc/test/src/math/sqrtl_test.cpp
The file was modified libc/src/math/CMakeLists.txt
The file was modified libc/spec/stdc.td
The file was added libc/src/math/sqrt.h
The file was added libc/utils/FPUtil/Sqrt.h
The file was added libc/src/math/sqrtf.cpp
The file was modified libc/config/linux/api.td
The file was added libc/src/math/sqrtl.cpp
The file was added libc/src/math/sqrtf.h
Commit 09af378f49dca98bc931ba0ff2c1cde307fe7c2c by Andrey.Churbanov
[OpenMP] Fix build on macOS sdk 10.12 and newer

Patch by nihui (Ni Hui)

Differential Revision: https://reviews.llvm.org/D76755
The file was modified openmp/runtime/src/kmp_wrapper_getpid.h
Commit a75e67b3b4885efdb6a0b0b2939cccb5a9e67b72 by jay.foad
[AMDGPU] Make more use of Subtarget reference in SIInstrInfo
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Commit 4a15f51a4f7726e12c327fa30e76d90a2b90430b by Raphael Isemann
[lldb][NFC] Simplify string literal in GDBRemoteCommunicationClient
The file was modified lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp
Commit 7a34dca0f3918ab1c0397e56dd64a3c04164c8b6 by hans
Fix failing tests after VCTOOLSDIR change

Switch from hardcoded x64 arch to a regex in the target triple

Differential revision: https://reviews.llvm.org/D86622
The file was modified clang/test/Driver/cl-options.c
Commit 8421503300c6145480710761983f089ccbe0bb56 by hans
Bump -len_control value in fuzzer-custommutator.test (PR47286)

to make the test more stable, as suggested by mmoroz.
The file was modified compiler-rt/test/fuzzer/fuzzer-custommutator.test
Commit 667867e0df26e45ed2c86e192fee69dd484167c7 by hokein.wu
[clangd] Enable recovery-ast-type by default.

Differential Revision: https://reviews.llvm.org/D86602
The file was modified clang-tools-extra/clangd/tool/ClangdMain.cpp
The file was modified clang-tools-extra/clangd/ClangdServer.h
Commit 3d120b6f7be816d188bd05271fff17f0030db9b2 by Louis Dionne
[libc++] Always run Ninja through xcrun in the macOS CI scripts

Ninja isn't installed by default on OSX, so run it through xcrun to find
the one in the developer tools if needed.
The file was modified libcxx/utils/ci/macos-trunk.sh