Success

Changes

Summary

  1. [lldb] More memory allocation test fixes
  2. [MBP] Add whole chain to BlockFilterSet instead of individual BB
  3. [llvm] Set the default for -bbsections-cold-text-prefix to .text.split.
  4. [clang] Add -fc++-abi= flag for specifying which C++ ABI to use
  5. [mlir] Linalg refactor for using "bufferize" terminology.
  6. [mlir] Rename ShapeTypeConversion to ShapeBufferize
  7. [mlir] Refactor code out of BufferPlacement.cpp
  8. [mlir][bufferize] Rename BufferAssignment* to Bufferize*
  9. [mlir] Fix some style comments from D89268
  10. [llvm] Update default cutoff threshold for machine function splitter.
  11. clang/Basic: Replace ContentCache::getBuffer with Optional semantics
  12. Make header self-contained. NFC.
  13. [mlir][Linalg] Rethink fusion of linalg ops with reshape ops.
Commit a1ab2b773b6d78ec71edfebd2474c403cbe77977 by pavel
[lldb] More memory allocation test fixes

XFAIL nodefaultlib.cpp on darwin - the test does not pass there

XFAIL TestGdbRemoteMemoryAllocation on windows - memory is allocated
with incorrect permissions
The file was modified lldb/test/Shell/Expr/nodefaultlib.cpp
The file was modified lldb/test/API/tools/lldb-server/memory-allocation/TestGdbRemoteMemoryAllocation.py
Commit adfb5415010fbbc009a4a6298cfda7a6ed4fa6d4 by carrot
[MBP] Add whole chain to BlockFilterSet instead of individual BB

Currently we add an individual BB to BlockFilterSet if its frequency satisfies

LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is the edge frequency from outside the loop to the loop header.
LoopToColdBlockRatio is a command line parameter.

This doesn't make sense, since we always lay out whole chains, not individual
BBs.

It may also cause a tricky problem. Sometimes the LoopFreq of an inner loop is
smaller than the LoopFreq of the outer loop, so a BB can be in the
BlockFilterSet of the inner loop but not in the BlockFilterSet of the outer
loop, like .cold in the test case. It is therefore added to the chain of the
inner loop. When working on the outer loop, .cold is not added to
BlockFilterSet, so the edge to its successor .problem is not counted in
UnscheduledPredecessors of the .problem chain. But the other blocks in the
inner loop are added to BlockFilterSet, so the whole inner loop chain can be
laid out, and markChainSuccessors is called to decrease
UnscheduledPredecessors of the following chains. markChainSuccessors calls
markBlockSuccessors for every BB, even one that is not in BlockFilterSet, like
.cold, so the .problem chain's UnscheduledPredecessors is decreased even though
this edge was not counted in fillWorkLists. As a result, the .problem chain's
UnscheduledPredecessors becomes 0 while it still has an unscheduled
predecessor, .pred. This causes problems in the various successor BB selection
algorithms that follow.

Differential Revision: https://reviews.llvm.org/D89088
The file was modified llvm/lib/CodeGen/MachineBlockPlacement.cpp
The file was added llvm/test/CodeGen/X86/block_set.ll
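
The difference between per-block and per-chain filtering described in the [MBP] commit message can be illustrated with a minimal sketch. This is not the code from MachineBlockPlacement.cpp: `DemoBlock`, `passesFilter`, `filterPerBlock`, and `filterPerChain` are hypothetical names, frequencies are plain integers, and the per-chain rule (a block that passes the test pulls in every block of its chain) is an assumption based on the commit title.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block: a name, a block
// frequency, and the id of the chain the block currently belongs to.
struct DemoBlock {
  std::string name;
  std::uint64_t freq;
  int chain;
};

// The condition quoted in the commit message, using integer division.
// Treating a zero frequency as "does not pass" is an assumption of this sketch.
inline bool passesFilter(std::uint64_t loopFreq, std::uint64_t freq,
                         std::uint64_t ratio) {
  return freq != 0 && loopFreq / freq <= ratio;
}

// Old behaviour: each block is tested and inserted on its own.
inline std::set<std::string>
filterPerBlock(const std::vector<DemoBlock> &blocks, std::uint64_t loopFreq,
               std::uint64_t ratio) {
  std::set<std::string> result;
  for (const DemoBlock &b : blocks)
    if (passesFilter(loopFreq, b.freq, ratio))
      result.insert(b.name);
  return result;
}

// New behaviour (assumed): a passing block pulls in its whole chain, so a
// cold block that shares a chain with a hot block is no longer left out.
inline std::set<std::string>
filterPerChain(const std::vector<DemoBlock> &blocks, std::uint64_t loopFreq,
               std::uint64_t ratio) {
  std::set<std::string> result;
  for (const DemoBlock &b : blocks) {
    if (!passesFilter(loopFreq, b.freq, ratio))
      continue;
    for (const DemoBlock &other : blocks)
      if (other.chain == b.chain)
        result.insert(other.name);
  }
  return result;
}
```

With blocks `hot` (frequency 100, chain 0), `cold` (frequency 1, chain 0), and `other` (frequency 100, chain 1), a loop frequency of 50, and a ratio of 5, the per-block filter leaves `cold` out while the per-chain filter keeps it with the rest of chain 0.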
Commit 77638a5343d5b4c1a87ec2b7fb3671ccb108a059 by snehasishk
[llvm] Set the default for -bbsections-cold-text-prefix to .text.split.

After using this for a while, we find that it is generally useful to
have it set to .text.split. by default, removing the need for an
additional -mllvm option.

Differential Revision: https://reviews.llvm.org/D88997
The file was modified llvm/test/CodeGen/X86/basic-block-sections-cold.ll
The file was modified llvm/lib/CodeGen/BasicBlockSections.cpp
The file was modified llvm/test/CodeGen/X86/basic-block-sections-clusters.ll
The file was modified llvm/test/CodeGen/X86/basic-block-sections-clusters-branches.ll
The file was modified llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
The file was modified llvm/test/CodeGen/X86/machine-function-splitter.ll
Commit 683b308c07bf827255fe1403056413f790e03729 by leonardchan
[clang] Add -fc++-abi= flag for specifying which C++ ABI to use

This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.

The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.

In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
  CodeGenOpt because there are instances outside of codegen where Clang
  needs to know what the ABI is (particularly through
  ASTContext::createCXXABI), and we should be able to override the
  target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
  through this flag.
  - Create a .def file for these ABIs to make it easier to check flag
    values.
  - Add an error for diagnosing bad ABI flag values.

Differential Revision: https://reviews.llvm.org/D85802
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/lib/CodeGen/CodeGenModule.cpp
The file was modified clang/lib/Frontend/CompilerInvocation.cpp
The file was modified clang/include/clang/Basic/LangOptions.h
The file was added clang/include/clang/Basic/TargetCXXABI.def
The file was added clang/test/Frontend/invalid-cxx-abi.cpp
The file was modified clang/include/clang/Basic/TargetCXXABI.h
The file was modified clang/lib/AST/ASTContext.cpp
The file was modified clang/lib/CodeGen/ItaniumCXXABI.cpp
The file was modified clang/include/clang/Basic/DiagnosticDriverKinds.td
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was modified clang/include/clang/AST/ASTContext.h
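
The `-fc++-abi=` commit mentions a .def file that makes flag values easier to check. A common way to use such a file is the X-macro pattern, sketched below under stated assumptions: the list is inlined as a macro rather than textually included, the entries are illustrative and not copied from TargetCXXABI.def, and `DemoCXXABI` and `parseDemoCXXABI` are hypothetical names.

```cpp
#include <optional>
#include <string_view>

// Illustrative list only; a real .def file would be #included instead of
// being spelled out as a macro, and its entries may differ.
#define DEMO_CXX_ABI_LIST(X)                                                   \
  X(GenericItanium, "itanium")                                                 \
  X(Fuchsia, "fuchsia")                                                        \
  X(Microsoft, "microsoft")

// First expansion: one enumerator per list entry.
enum class DemoCXXABI {
#define DEMO_AS_ENUM(Name, Str) Name,
  DEMO_CXX_ABI_LIST(DEMO_AS_ENUM)
#undef DEMO_AS_ENUM
};

// Second expansion: one string comparison per list entry. An unknown
// spelling yields std::nullopt, which a driver could turn into a diagnostic.
inline std::optional<DemoCXXABI> parseDemoCXXABI(std::string_view value) {
#define DEMO_AS_CASE(Name, Str)                                                \
  if (value == Str)                                                            \
    return DemoCXXABI::Name;
  DEMO_CXX_ABI_LIST(DEMO_AS_CASE)
#undef DEMO_AS_CASE
  return std::nullopt;
}
```

Because the enum and the parser are generated from the same list, adding an ABI means touching one place, and a value such as `bogus` is rejected without a hand-maintained table.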
Commit 9ca97cde8508b92856d22e2164c8b6fb6756696e by silvasean
[mlir] Linalg refactor for using "bufferize" terminology.

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89261
The file was removed mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp
The file was modified mlir/include/mlir/Dialect/Linalg/Passes.td
The file was modified mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir
The file was modified mlir/lib/Dialect/Linalg/Transforms/CMakeLists.txt
The file was modified mlir/include/mlir/Dialect/Linalg/Passes.h
The file was modified mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
The file was added mlir/test/Dialect/Linalg/bufferize.mlir
The file was removed mlir/test/Dialect/Linalg/tensors-to-buffers.mlir
The file was modified mlir/integration_test/Dialect/Linalg/CPU/test-tensor-e2e.mlir
The file was added mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
Commit 6b30fb7653948fec80ca0cea19d8691495c96c28 by silvasean
[mlir] Rename ShapeTypeConversion to ShapeBufferize

Once we have tensor_to_memref ops suitable for type materializations,
this pass can be split into a generic type conversion pattern.

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89258
The file was modified mlir/include/mlir/Dialect/Shape/Transforms/Passes.td
The file was modified mlir/include/mlir/Dialect/Shape/Transforms/Passes.h
The file was added mlir/test/Dialect/Shape/bufferize.mlir
The file was added mlir/lib/Dialect/Shape/Transforms/Bufferize.cpp
The file was removed mlir/lib/Dialect/Shape/Transforms/ShapeTypeConversion.cpp
The file was removed mlir/test/Dialect/Shape/shape-type-conversion.mlir
The file was modified mlir/lib/Dialect/Shape/Transforms/CMakeLists.txt
Commit 1cca0f323efab386300f19902faa6337dccae1c1 by silvasean
[mlir] Refactor code out of BufferPlacement.cpp

Now BufferPlacement.cpp doesn't depend on Bufferize.h.

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89268
The file was modified mlir/lib/Transforms/CMakeLists.txt
The file was modified mlir/lib/Transforms/BufferPlacement.cpp
The file was added mlir/lib/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Transforms/Bufferize.h
Commit 9a14cb53cb4cd92b2c261a040a8750973b991b9f by silvasean
[mlir][bufferize] Rename BufferAssignment* to Bufferize*

Part of the refactor discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17

Differential Revision: https://reviews.llvm.org/D89271
The file was modified mlir/test/lib/Transforms/TestBufferPlacement.cpp
The file was modified mlir/lib/Dialect/Shape/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Dialect/Shape/Transforms/Passes.h
The file was modified mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
The file was modified mlir/test/Transforms/buffer-placement-preparation.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
The file was modified mlir/lib/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Transforms/Bufferize.h
Commit dd378739d731c81ded5209d70e2313c24811434d by silvasean
[mlir] Fix some style comments from D89268

That change was a pure move, so split out the stylistic changes into
this patch.

Differential Revision: https://reviews.llvm.org/D89272
The file was modified mlir/lib/Transforms/Bufferize.cpp
Commit 24bf6ff4e08f88df0b6c01ef87aa384276636901 by snehasishk
[llvm] Update default cutoff threshold for machine function splitter.

Based on internal testing at Google we found that setting the profile
summary cutoff threshold to 999950 yields the best results in terms of
itlb and icache metrics (as observed on Intel CPUs).

*default* = Split out code if no profile count available for block
*size-%*  = The fraction of bytes split out of .text and .text.hot
*itlb*    = Misses per kilo instructions (MPKI) for itlb
*icache*  = Misses per kilo instructions (MPKI) for L1 icache

Search1

| cutoff  | size-%  | itlb      | icache  |
|---------|---------|-----------|---------|
| default | 42.5861 | 0.0822151 | 2.46363 |
|  999999 | 44.9350 | 0.0767194 | 2.44416 |
|  999950 | 50.0660 |  0.075744 |  2.4091 |
|  999500 | 56.9158 |  0.082564 |  2.4188 |
|  995000 | 63.8625 | 0.0814927 | 2.42832 |
|  990000 | 71.7314 |  0.106906 | 2.57785 |

Search2

| cutoff  | size-% | itlb     | icache  |
|---------|--------|----------|---------|
| default | 2.8845 | 0.626712 | 4.73245 |
|  999999 | 3.3291 | 0.602309 | 4.70045 |
|  999950 | 3.8577 | 0.587842 | 4.71632 |
|  999500 | 4.4170 |  0.63577 | 4.68351 |
|  995000 | 5.1020 | 0.657969 | 4.82272 |
|  990000 | 5.7153 | 0.719122 | 5.39496 |

Differential Revision: https://reviews.llvm.org/D89085
The file was modified llvm/lib/CodeGen/MachineFunctionSplitter.cpp
The file was modified llvm/test/CodeGen/X86/machine-function-splitter.ll
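
To read the Search1 and Search2 tables in the machine function splitter commit, it can help to express each row relative to the `default` row. The helpers below are a small sketch for that arithmetic: `relativeChange` and `cutoffAsFraction` are hypothetical names, and the parts-per-million reading of the cutoff value is an assumption of this sketch.

```cpp
// Relative change of a metric against its baseline row.
// A negative result means fewer misses than the baseline.
inline double relativeChange(double baseline, double value) {
  return (value - baseline) / baseline;
}

// Interpret a profile summary cutoff as a fraction, assuming the cutoff is
// expressed in parts per million (so 999950 would mean 99.995%).
inline double cutoffAsFraction(unsigned cutoff) {
  return cutoff / 1000000.0;
}
```

For the 999950 row, `relativeChange(0.0822151, 0.075744)` gives roughly 8% fewer itlb misses than `default` in Search1, and `relativeChange(0.626712, 0.587842)` gives roughly 6% fewer in Search2.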
Commit d758f79e5d381bd4f5122193a9538d89c907c812 by Duncan P. N. Exon Smith
clang/Basic: Replace ContentCache::getBuffer with Optional semantics

Remove `ContentCache::getBuffer`, which always returned a
dereferenceable `MemoryBuffer*` and had a `bool *Invalid` out parameter,
and replace it with:

- `ContentCache::getBufferOrNone`, which returns
  `Optional<MemoryBufferRef>`. This is the new API that consumers should
  use. Later it could be renamed to `getBuffer`, but for now it
  intentionally uses a different name to root out any unexpected callers.
- `ContentCache::getBufferPointer`, which returns `MemoryBuffer*` with
  "optional" semantics. This is `private` to avoid growing callers and
  `SourceManager` has temporarily been made a `friend` to access it.
  Later patches will update the transitive callers to not need a raw
  pointer, and eventually this will be deleted.

No functionality change intended here.

Differential Revision: https://reviews.llvm.org/D89348
The file was modified clang/lib/AST/ASTImporter.cpp
The file was modified clang/lib/Basic/SourceManager.cpp
The file was modified clang/lib/Serialization/ASTWriter.cpp
The file was modified clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp
The file was modified clang/include/clang/Basic/SourceManager.h
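
The API change in the ContentCache commit, from an always-dereferenceable return value plus an out parameter to an optional return value, can be sketched in isolation. This is not Clang's code: `DemoContentCache` and `getBufferLegacy` are hypothetical, and `std::optional<std::string_view>` stands in for `Optional<MemoryBufferRef>`.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical cache that may or may not hold file contents.
class DemoContentCache {
public:
  explicit DemoContentCache(std::optional<std::string> contents)
      : contents_(std::move(contents)) {}

  // Old style: always returns something dereferenceable and reports failure
  // through an out parameter that callers can forget to check.
  const std::string &getBufferLegacy(bool *invalid = nullptr) const {
    static const std::string placeholder = "<invalid>";
    if (invalid)
      *invalid = !contents_.has_value();
    return contents_ ? *contents_ : placeholder;
  }

  // New style: absence is part of the return type, so the caller has to
  // handle the empty case before touching the data.
  std::optional<std::string_view> getBufferOrNone() const {
    if (!contents_)
      return std::nullopt;
    return std::string_view(*contents_);
  }

private:
  std::optional<std::string> contents_;
};
```

A caller of `getBufferOrNone` writes `if (auto buffer = cache.getBufferOrNone()) { ... }`, whereas a caller of the legacy accessor silently receives placeholder contents unless it passes and inspects the flag.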
Commit 633f9fcb820bf01d59cdcdd8038889eec61cf2f2 by benny.kra
Make header self-contained. NFC.
The file was modified clang/include/clang/Basic/TargetCXXABI.h
Commit de2568aab819f4ae97a9d92ea68ef1a8ab56ae8c by ravishankarm
[mlir][Linalg] Rethink fusion of linalg ops with reshape ops.

The current fusion on tensors fuses reshape ops with generic ops by
linearizing the indexing maps of the fused tensor in the generic
op. This has some limitations:
- It only works for static shapes.
- The resulting indexing map has a linearization that could
  potentially prevent fusion later on (e.g. tile + fuse).

Instead, try to fuse the reshape consumer (producer) with the generic op
producer (consumer) by expanding the dimensionality of the generic op
when the reshape is expanding (folding). This approach conflicts with
the linearization approach. The expansion method is used instead of
the linearization method.

This patch also includes further refactoring that changes the fusion on
tensors to be a collection of patterns.

Differential Revision: https://reviews.llvm.org/D89002
The file was modified mlir/include/mlir/Dialect/Linalg/Passes.h
The file was added mlir/test/Dialect/Linalg/reshape_linearization_fusion.mlir
The file was modified mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
The file was added mlir/test/Dialect/Linalg/reshape_fusion.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
The file was modified mlir/include/mlir/Dialect/Linalg/Passes.td
The file was modified mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
The file was modified mlir/test/Dialect/Linalg/fusion-tensor.mlir
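
The two fusion strategies contrasted in the reshape fusion commit can be imitated on plain vectors. The sketch below is only an analogy, not MLIR code: the "generic op" is a doubling of every element, the reshape expands a flat buffer into a rows x cols matrix, and all function names are hypothetical. It assumes that the expansion approach lets the op iterate over the expanded shape and index its operand directly.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Linearized index of (i, j) in a row-major rows x cols view. The formula
// needs the column count, which is why linearization wants static shapes.
inline std::size_t linearize(std::size_t i, std::size_t j, std::size_t cols) {
  return i * cols + j;
}

// Unfused reference: 1-D elementwise op, then an expanding reshape.
inline Matrix opThenReshape(const std::vector<int> &in, std::size_t rows,
                            std::size_t cols) {
  std::vector<int> tmp(in.size());
  for (std::size_t k = 0; k < in.size(); ++k)
    tmp[k] = 2 * in[k];
  Matrix out(rows, std::vector<int>(cols));
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      out[i][j] = tmp[linearize(i, j, cols)];
  return out;
}

// Linearization-style fusion: a 2-D loop nest whose operand access goes
// through the linearized index expression.
inline Matrix fusedByLinearization(const std::vector<int> &in,
                                   std::size_t rows, std::size_t cols) {
  Matrix out(rows, std::vector<int>(cols));
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      out[i][j] = 2 * in[linearize(i, j, cols)];
  return out;
}

// Expansion-style fusion (assumed): the op itself becomes 2-D and reads an
// operand that already has the expanded shape, using plain (i, j) indexing.
inline Matrix fusedByExpansion(const Matrix &in2d) {
  Matrix out = in2d;
  for (std::vector<int> &row : out)
    for (int &value : row)
      value *= 2;
  return out;
}
```

All three functions produce the same matrix for matching inputs. The difference is in the access pattern: the expanded form has no `i * cols + j` expression, which is the kind of indexing the commit message says can get in the way of later tiling and fusion.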