Commit
a1ab2b773b6d78ec71edfebd2474c403cbe77977
by pavel
[lldb] More memory allocation test fixes
XFAIL nodefaultlib.cpp on darwin - the test does not pass there
XFAIL TestGdbRemoteMemoryAllocation on windows - memory is allocated with incorrect permissions
|
 | lldb/test/API/tools/lldb-server/memory-allocation/TestGdbRemoteMemoryAllocation.py |
 | lldb/test/Shell/Expr/nodefaultlib.cpp |
Commit
adfb5415010fbbc009a4a6298cfda7a6ed4fa6d4
by carrot
[MBP] Add whole chain to BlockFilterSet instead of individual BB
Currently we add an individual BB to the BlockFilterSet if its frequency satisfies
LoopFreq / Freq <= LoopToColdBlockRatio
where LoopFreq is the edge frequency from outside the loop to the loop header, and LoopToColdBlockRatio is a command-line parameter.
This doesn't make sense, since we always lay out whole chains, not individual BBs.
It can also cause a tricky problem. Sometimes the LoopFreq of an inner loop is smaller than the LoopFreq of the outer loop, so a BB can be in the BlockFilterSet of the inner loop but not in the BlockFilterSet of the outer loop, like .cold in the test case, and it is therefore added to the chain of the inner loop. When working on the outer loop, .cold is not in the BlockFilterSet, so the edge to its successor .problem is not counted in the UnscheduledPredecessors of the .problem chain. But the other blocks of the inner loop are in the BlockFilterSet, so the whole inner-loop chain can be laid out, and markChainSuccessors is called to decrease the UnscheduledPredecessors of the following chains. markChainSuccessors calls markBlockSuccessors for every BB, even one that is not in the BlockFilterSet, like .cold, so the .problem chain's UnscheduledPredecessors is decremented. Because that edge was never counted in fillWorkLists, the .problem chain's UnscheduledPredecessors drops to 0 while it still has an unscheduled predecessor, .pred, and this confuses the various successor-selection algorithms that follow.
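A minimal C++ sketch of the idea behind the fix, assuming hypothetical Block/Chain types (the actual change lives in MachineBlockPlacement.cpp and uses its own BlockChain/BlockFilterSet types):

```cpp
#include <set>
#include <vector>

struct Block;
struct Chain {
  std::vector<Block *> Blocks; // blocks already laid out together
};
struct Block {
  Chain *OwningChain = nullptr;
};

using BlockFilterSet = std::set<Block *>;

// Old behavior (problematic): only the single qualifying block is inserted,
// so other members of its chain may be missing from an outer loop's set.
void addBlock(BlockFilterSet &Filter, Block *BB) { Filter.insert(BB); }

// New behavior: once any block of a chain qualifies, insert the whole chain.
void addWholeChain(BlockFilterSet &Filter, Block *BB) {
  for (Block *Member : BB->OwningChain->Blocks)
    Filter.insert(Member);
}
```

With the whole chain in the filter set, the edges counted in fillWorkLists match the ones later decremented by markChainSuccessors, so UnscheduledPredecessors can no longer reach zero while a predecessor is still unscheduled.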
Differential Revision: https://reviews.llvm.org/D89088
|
 | llvm/test/CodeGen/X86/block_set.ll |
 | llvm/lib/CodeGen/MachineBlockPlacement.cpp |
Commit
77638a5343d5b4c1a87ec2b7fb3671ccb108a059
by snehasishk
[llvm] Set the default for -bbsections-cold-text-prefix to .text.split.
After using this for a while, we have found that setting it to .text.split. by default is generally useful, removing the need for an additional -mllvm option.
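For reference, a sketch of how such a default is typically expressed with LLVM's `cl::opt` machinery; the variable name and description string here are assumptions, not the actual code in BasicBlockSections.cpp:

```cpp
#include "llvm/Support/CommandLine.h"
#include <string>

using namespace llvm;

// Hypothetical declaration: the ".text.split." prefix (a section-name prefix,
// hence the trailing dot) becomes the default, so passing
// -mllvm -bbsections-cold-text-prefix=... is only needed to override it.
static cl::opt<std::string>
    ColdTextSectionPrefix("bbsections-cold-text-prefix",
                          cl::desc("Section prefix to use for split-out cold "
                                   "basic block clusters"),
                          cl::init(".text.split."), cl::Hidden);
```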
Differential Revision: https://reviews.llvm.org/D88997
|
 | llvm/test/CodeGen/X86/basic-block-sections-cold.ll |
 | llvm/test/CodeGen/X86/basic-block-sections-clusters-branches.ll |
 | llvm/lib/CodeGen/BasicBlockSections.cpp |
 | llvm/test/CodeGen/X86/basic-block-sections-clusters.ll |
 | llvm/test/CodeGen/X86/machine-function-splitter.ll |
 | llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp |
Commit
683b308c07bf827255fe1403056413f790e03729
by leonardchan
[clang] Add -fc++-abi= flag for specifying which C++ ABI to use
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag.
- Create a .def file for these ABIs to make it easier to check flag values (the pattern is illustrated in the sketch below).
- Add an error for diagnosing bad ABI flag values.
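A self-contained sketch of the .def/X-macro pattern the last two bullets describe; the ABI names and spellings below are placeholders, not the actual contents of clang's TargetCXXABI.def:

```cpp
#include <string>

// Stand-in for a TargetCXXABI-style .def file: each entry names an ABI kind
// and a -fc++-abi= spelling. In Clang the list would live in its own .def file
// pulled in with #include; it is inlined here to keep the sketch compilable.
#define FOR_EACH_CXXABI(X)                                                     \
  X(GenericItanium, "itanium")                                                 \
  X(GenericARM, "arm")                                                         \
  X(Microsoft, "microsoft")

enum class CXXABIKind {
#define ABI_ENUM(Name, Spelling) Name,
  FOR_EACH_CXXABI(ABI_ENUM)
#undef ABI_ENUM
  Unknown
};

// Check a -fc++-abi= value against the listed spellings; an unrecognized
// value is where the driver would emit its "invalid ABI" diagnostic.
inline bool isValidCXXABISpelling(const std::string &Value) {
#define ABI_CHECK(Name, Spelling)                                              \
  if (Value == Spelling)                                                       \
    return true;
  FOR_EACH_CXXABI(ABI_CHECK)
#undef ABI_CHECK
  return false;
}
```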
Differential Revision: https://reviews.llvm.org/D85802
|
 | clang/include/clang/Basic/TargetCXXABI.h |
 | clang/include/clang/Driver/Options.td |
 | clang/lib/AST/ASTContext.cpp |
 | clang/test/Frontend/invalid-cxx-abi.cpp |
 | clang/include/clang/AST/ASTContext.h |
 | clang/include/clang/Basic/DiagnosticDriverKinds.td |
 | clang/include/clang/Basic/TargetCXXABI.def |
 | clang/include/clang/Basic/LangOptions.h |
 | clang/lib/CodeGen/CodeGenModule.cpp |
 | clang/lib/CodeGen/ItaniumCXXABI.cpp |
 | clang/lib/Driver/ToolChains/Clang.cpp |
 | clang/lib/Frontend/CompilerInvocation.cpp |
Commit
9ca97cde8508b92856d22e2164c8b6fb6756696e
by silvasean
[mlir] Linalg refactor for using "bufferize" terminology.
Part of the refactor discussed in: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89261
|
 | mlir/integration_test/Dialect/Linalg/CPU/test-tensor-matmul.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/CMakeLists.txt |
 | mlir/include/mlir/Dialect/Linalg/Passes.td |
 | mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h |
 | mlir/test/Dialect/Linalg/bufferize.mlir |
 | mlir/include/mlir/Dialect/Linalg/Passes.h |
 | mlir/test/Dialect/Linalg/tensors-to-buffers.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp |
 | mlir/integration_test/Dialect/Linalg/CPU/test-tensor-e2e.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/TensorsToBuffers.cpp |
Commit
6b30fb7653948fec80ca0cea19d8691495c96c28
by silvasean
[mlir] Rename ShapeTypeConversion to ShapeBufferize
Once we have tensor_to_memref ops suitable for type materializations, this pass can be split into a generic type conversion pattern.
Part of the refactor discussed in: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89258
|
 | mlir/lib/Dialect/Shape/Transforms/CMakeLists.txt |
 | mlir/include/mlir/Dialect/Shape/Transforms/Passes.h |
 | mlir/test/Dialect/Shape/shape-type-conversion.mlir |
 | mlir/include/mlir/Dialect/Shape/Transforms/Passes.td |
 | mlir/lib/Dialect/Shape/Transforms/ShapeTypeConversion.cpp |
 | mlir/lib/Dialect/Shape/Transforms/Bufferize.cpp |
 | mlir/test/Dialect/Shape/bufferize.mlir |
Commit
1cca0f323efab386300f19902faa6337dccae1c1
by silvasean
[mlir] Refactor code out of BufferPlacement.cpp
Now BufferPlacement.cpp doesn't depend on Bufferize.h.
Part of the refactor discussed in: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89268
|
 | mlir/include/mlir/Transforms/Bufferize.h |
 | mlir/lib/Transforms/Bufferize.cpp |
 | mlir/lib/Transforms/BufferPlacement.cpp |
 | mlir/lib/Transforms/CMakeLists.txt |
Commit
9a14cb53cb4cd92b2c261a040a8750973b991b9f
by silvasean
[mlir][bufferize] Rename BufferAssignment* to Bufferize*
Part of the refactor discussed in: https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89271
|
 | mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp |
 | mlir/lib/Dialect/Shape/Transforms/Bufferize.cpp |
 | mlir/test/lib/Transforms/TestBufferPlacement.cpp |
 | mlir/include/mlir/Transforms/Bufferize.h |
 | mlir/test/Transforms/buffer-placement-preparation.mlir |
 | mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h |
 | mlir/include/mlir/Dialect/Shape/Transforms/Passes.h |
 | mlir/lib/Transforms/Bufferize.cpp |
Commit
dd378739d731c81ded5209d70e2313c24811434d
by silvasean
[mlir] Fix some style comments from D89268
That change was a pure move, so split out the stylistic changes into this patch.
Differential Revision: https://reviews.llvm.org/D89272
|
 | mlir/lib/Transforms/Bufferize.cpp |
Commit
24bf6ff4e08f88df0b6c01ef87aa384276636901
by snehasishk
[llvm] Update default cutoff threshold for machine function splitter.
Based on internal testing at Google we found that setting the profile summary cutoff threshold to 999950 yields the best results in terms of itlb and icache metrics (as observed on Intel CPUs).
*default* = Split out code if no profile count available for block
*size-%* = The fraction of bytes split out of .text and .text.hot
*itlb* = Misses per kilo instructions (MPKI) for itlb
*icache* = Misses per kilo instructions (MPKI) for L1 icache
Search1
| cutoff  | size-%  | itlb      | icache  |
|---------|---------|-----------|---------|
| default | 42.5861 | 0.0822151 | 2.46363 |
| 999999  | 44.9350 | 0.0767194 | 2.44416 |
| 999950  | 50.0660 | 0.075744  | 2.4091  |
| 999500  | 56.9158 | 0.082564  | 2.4188  |
| 995000  | 63.8625 | 0.0814927 | 2.42832 |
| 990000  | 71.7314 | 0.106906  | 2.57785 |
Search2
| cutoff  | size-% | itlb     | icache  |
|---------|--------|----------|---------|
| default | 2.8845 | 0.626712 | 4.73245 |
| 999999  | 3.3291 | 0.602309 | 4.70045 |
| 999950  | 3.8577 | 0.587842 | 4.71632 |
| 999500  | 4.4170 | 0.63577  | 4.68351 |
| 995000  | 5.1020 | 0.657969 | 4.82272 |
| 990000  | 5.7153 | 0.719122 | 5.39496 |
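For context, the cutoff is expressed in parts-per-million of total profile weight, so 999950 keeps the hottest blocks covering 99.995% of execution counts in .text/.text.hot and makes the remainder split candidates. A small self-contained sketch of that mapping from cutoff to count threshold (the underlying idea only, not LLVM's ProfileSummaryInfo API):

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Given per-block execution counts and a cutoff in parts-per-million
// (e.g. 999950), return the smallest count C such that blocks with count >= C
// together cover at least cutoff/1e6 of the total weight; blocks below that
// threshold are the cold/split candidates.
uint64_t countThresholdForCutoff(std::vector<uint64_t> Counts,
                                 uint64_t CutoffPPM) {
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  uint64_t Total = 0;
  for (uint64_t C : Counts)
    Total += C;
  const double Target = static_cast<double>(Total) * CutoffPPM / 1e6;
  double Covered = 0;
  for (uint64_t C : Counts) {
    Covered += C;
    if (Covered >= Target)
      return C; // everything at or above this count is considered hot
  }
  return 0; // no counts at all
}
```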
Differential Revision: https://reviews.llvm.org/D89085
|
 | llvm/lib/CodeGen/MachineFunctionSplitter.cpp |
 | llvm/test/CodeGen/X86/machine-function-splitter.ll |
Commit
d758f79e5d381bd4f5122193a9538d89c907c812
by Duncan P. N. Exon Smith
clang/Basic: Replace ContentCache::getBuffer with Optional semantics
Remove `ContentCache::getBuffer`, which always returned a dereferenceable `MemoryBuffer*` and had a `bool *Invalid` out parameter, and replace it with:
- `ContentCache::getBufferOrNone`, which returns `Optional<MemoryBufferRef>`. This is the new API that consumers should use. Later it could be renamed to `getBuffer`, but it intentionally uses a different name to root out any unexpected callers (the shape of the change is sketched below).
- `ContentCache::getBufferPointer`, which returns `MemoryBuffer*` with "optional" semantics. This is `private` to avoid growing callers, and `SourceManager` has temporarily been made a `friend` to access it. Later patches will update the transitive callers to not need a raw pointer, and eventually this will be deleted.
No functionality change intended here.
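A generic C++ sketch of the API shape described above, with `std::optional` and a stand-in buffer type rather than `llvm::Optional`/`MemoryBufferRef` (not the actual SourceManager code):

```cpp
#include <optional>
#include <string>

// Stand-in for llvm::MemoryBufferRef in this sketch.
struct BufferRef {
  std::string Data;
};

struct ContentCacheSketch {
  std::optional<BufferRef> Buffer;

  // Old shape: hand back a raw pointer and report failure through an out
  // parameter that callers can easily forget to check.
  BufferRef *getBuffer(bool *Invalid) {
    if (Invalid)
      *Invalid = !Buffer.has_value();
    return Buffer ? &*Buffer : nullptr;
  }

  // New shape: an optional return value makes "no buffer" explicit in the
  // type, so callers must test before using it.
  std::optional<BufferRef> getBufferOrNone() const { return Buffer; }
};
```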
Differential Revision: https://reviews.llvm.org/D89348
|
 | clang/include/clang/Basic/SourceManager.h |
 | clang/lib/Serialization/ASTWriter.cpp |
 | clang/lib/Basic/SourceManager.cpp |
 | clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp |
 | clang/lib/AST/ASTImporter.cpp |
Commit
633f9fcb820bf01d59cdcdd8038889eec61cf2f2
by benny.kra
Make header self-contained. NFC.
|
 | clang/include/clang/Basic/TargetCXXABI.h |
Commit
de2568aab819f4ae97a9d92ea68ef1a8ab56ae8c
by ravishankarm
[mlir][Linalg] Rethink fusion of linalg ops with reshape ops.
The current fusion on tensors fuses reshape ops with generic ops by linearizing the indexing maps of the fused tensor in the generic op. This has some limitations:
- It only works for static shapes.
- The resulting indexing map has a linearization that could potentially prevent fusion later on (for example, tile + fuse).
Instead, try to fuse the reshape consumer (producer) with the generic op producer (consumer) by expanding the dimensionality of the generic op when the reshape is expanding (folding). Since this approach conflicts with the linearization approach, the expansion method is used instead of the linearization method.
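A loose illustration of the difference in plain C++ (the real implementation manipulates MLIR affine indexing maps, not scalar index arithmetic): linearization folds a static size into the index computation, while expansion keeps the indices separate and enlarges the op's iteration space instead.

```cpp
#include <array>
#include <cstdint>

// Linearization (previous approach): the two indices of a collapsed dimension
// pair are folded into a single expression d1 * C + d2, which bakes the static
// size C into the indexing and only works for static shapes.
constexpr int64_t linearizedIndex(int64_t d1, int64_t d2, int64_t C) {
  return d1 * C + d2;
}

// Expansion (this patch's approach, conceptually): keep the indices as a plain
// multi-dimensional projection and grow the op's iteration space to the
// expanded shape, so no static sizes are folded into the map.
constexpr std::array<int64_t, 2> expandedIndex(int64_t d1, int64_t d2) {
  return {d1, d2};
}
```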
Further refactoring that changes the fusion on tensors to be a collection of patterns.
Differential Revision: https://reviews.llvm.org/D89002
|
 | mlir/include/mlir/Dialect/Linalg/Passes.td |
 | mlir/test/Dialect/Linalg/reshape_linearization_fusion.mlir |
 | mlir/test/Dialect/Linalg/reshape_fusion.mlir |
 | mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp |
 | mlir/include/mlir/Dialect/Linalg/Passes.h |
 | mlir/include/mlir/Dialect/Linalg/Utils/Utils.h |
 | mlir/test/Dialect/Linalg/fusion-tensor.mlir |
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td |
 | mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp |