  1. [compiler-rt][profile] Make corrupted-profile.c more robust (details)
  2. [ThinLTO] Don't emit original GUID for locals to distributed indexes (details)
  3. [ThinLTO] Fix bot failures (details)
  4. [mlir][linalg] Merge all tiling passes into a single one. (details)
  5. [mlir][linalg] Support tile+peel with TiledLoopOp (details)
  6. Add missing storageType to AttrDef to ODS (details)
  7. [Polly] Implement user-directed loop distribution/fission. (details)
  8. [Driver] Default Generic_GCC x86 to -fasynchronous-unwind-tables (details)
  9. [AMDGPU] Legalize initialized LDS variables (details)
  10. DebugInfo: Implement the -gsimple-template-names functionality (details)
  11. [ORC] Introduce EPCGenericDylibManager / SimpleExecutorDylibManager. (details)
  12. [gn build] Port a2c1cf09dfaa (details)
  13. [TableGen] Allow targets to entirely ignore Psets for registers (details)
Commit c579c658cd42034449d4fa19f28b43f2082c0991 by leonardchan
[compiler-rt][profile] Make corrupted-profile.c more robust

This test specifically checks that profiles are not mergeable if there's a
change in the CounterPtr in the profile header. The test manually changes
CounterPtr by explicitly calling memset on some offset into the profile file.
This test would fail if binary IDs were emitted because the offset calculation
does not take into account the binary ID sizes.

This patch updates the test to use types provided in profile/
to make it more resistant to profile layout changes.
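The idea of deriving the offset from types rather than hard-coding it can be sketched as follows. This is only an illustration: `Header`, `DataRecord`, their fields, and both helper functions are hypothetical stand-ins, not the actual profile layout or the code in the test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout: a fixed header, then variable-size binary IDs,
// then the first data record. The real profile format has more fields.
struct Header {
  uint64_t Magic;
  uint64_t BinaryIdsSize; // number of bytes of binary IDs after the header
};

struct DataRecord {
  uint64_t NameRef;
  uint64_t CounterPtr;
};

// Compute the offset of CounterPtr from the types and from the size
// recorded in the header, instead of from a hard-coded constant.
std::size_t counterPtrOffset(const unsigned char *Buf) {
  Header H;
  std::memcpy(&H, Buf, sizeof(H));
  return sizeof(Header) + static_cast<std::size_t>(H.BinaryIdsSize) +
         offsetof(DataRecord, CounterPtr);
}

// Overwrite CounterPtr in a serialized profile held in memory.
void corruptCounterPtr(std::vector<unsigned char> &Buf) {
  const std::size_t Off = counterPtrOffset(Buf.data());
  assert(Buf.size() >= Off + sizeof(uint64_t));
  std::memset(Buf.data() + Off, 0xAB, sizeof(uint64_t));
}
```

With this shape, emitting binary IDs changes `BinaryIdsSize` and the computed offset moves with it, so the corruption still lands on the intended field.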

Differential Revision:
The file was modified compiler-rt/test/profile/Linux/corrupted-profile.c
Commit 2c1defeee40cf643ea6f0fa5e01164c9a4c48c30 by tejohnson
[ThinLTO] Don't emit original GUID for locals to distributed indexes

In ThinLTO for locals we normally compute the GUID from the name after
prepending the source path to get a unique global id. SamplePGO indirect
call profiles contain the target GUID without this uniquification,
however (unless compiling with -funique-internal-linkage-names).
Therefore, the index contains the original GUID of the local symbols
(without module path prepended to uniquify), in order to correctly
handle the call edges added for these indirect call profile targets
with SamplePGO.
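The difference between the two GUIDs can be sketched as below. This is a rough illustration under stated assumptions: `std::hash` stands in for the hash the real implementation uses, the `:` separator is assumed, and all function names here are hypothetical.

```cpp
#include <cstdint>
#include <functional>
#include <string>

using GUID = uint64_t;

// Stand-in hash function (assumption: any stable string hash serves
// for the illustration).
GUID hashName(const std::string &S) {
  return static_cast<GUID>(std::hash<std::string>{}(S));
}

// Identifier that gets hashed: locals have the source path prepended
// so that same-named locals in different modules stay distinct.
std::string globalIdentifier(const std::string &Name, bool IsLocal,
                             const std::string &SourcePath) {
  return IsLocal ? SourcePath + ":" + Name : Name;
}

// GUID stored in the index for a symbol.
GUID uniquifiedGUID(const std::string &Name, bool IsLocal,
                    const std::string &SourcePath) {
  return hashName(globalIdentifier(Name, IsLocal, SourcePath));
}

// "Original" GUID: hash of the bare name, which is what a SamplePGO
// indirect call profile would carry for a local.
GUID originalGUID(const std::string &Name) { return hashName(Name); }
```

For a local `foo` in `a.c`, the index key is derived from `a.c:foo` while the profile refers to the hash of plain `foo`, which is why the index keeps the original GUID around for matching.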

We were emitting these to the combined index when writing it out as
bitcode, which is unnecessary and causes overhead when writing out the
indexes for distributed backends. The original GUID is only used in the
thin link, so suppress it when writing the distributed indexes. This
reduced the thin link time for a large distributed build by about 7%, and
the aggregate size of the serialized indexes by over 2%.

Continue to print it when writing out the full index, since that is just
used for debugging and testing.

Update a distributed thinlto index test to contain a local and ensure
that we don't get a COMBINED_ORIGINAL_NAME record.

Differential Revision:
The file was modified llvm/test/ThinLTO/X86/distributed_indexes.ll
The file was modified llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
Commit 7da4ee2df088d39c7ca6531d80172af7d973bb67 by tejohnson
[ThinLTO] Fix bot failures

Fix bot failures after 2c1defeee40cf643ea6f0fa5e01164c9a4c48c30. The new
GUID I added isn't matching because it is a local with the source path
prepended. There isn't much use in matching the GUIDs exactly anyway,
so remove those from the patterns.
The file was modified llvm/test/ThinLTO/X86/distributed_indexes.ll
Commit 8dc16ba8d2b429261dd95e88496b2a866dc18ae5 by springerm
[mlir][linalg] Merge all tiling passes into a single one.

Passes such as `linalg-tile-to-tiled-loop` are merged into `linalg-tile`.

Differential Revision:
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir
The file was modified mlir/test/Dialect/Linalg/tile.mlir
The file was modified mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp
The file was modified mlir/test/Dialect/Linalg/tile-simple-conv.mlir
The file was modified mlir/test/Dialect/Linalg/tile-parallel.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/
The file was modified mlir/test/Dialect/Linalg/tile-indexed.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/Passes.h
The file was modified mlir/test/Dialect/Linalg/tile-conv-padding.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir
The file was modified mlir/test/Dialect/Linalg/tile-pad-tensor-op.mlir
The file was modified mlir/test/Dialect/Linalg/tile-parallel-reduce.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir
The file was modified mlir/test/Dialect/Linalg/tile-tensors.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir
The file was modified mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
The file was modified mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
The file was modified mlir/test/Dialect/Linalg/tile-conv.mlir
Commit 2190f8a8b1e01b7bc7429eb490f3001a23f27df1 by springerm
[mlir][linalg] Support tile+peel with TiledLoopOp

Only scf.for was supported until now.
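As a reminder of what tile+peel means at the loop level, here is a minimal sketch in plain C++ rather than MLIR. It is not the output of the pass; `tiledPeeledSum` and its parameters are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum the elements tile by tile. The main loop nest only visits full
// tiles, so its inner trip count is always exactly Tile. The peeled
// loop handles the leftover partial tile, if there is one.
long tiledPeeledSum(const std::vector<int> &V, std::size_t Tile) {
  assert(Tile > 0);
  const std::size_t N = V.size();
  const std::size_t Split = N - N % Tile; // end of the last full tile

  long Sum = 0;
  // Main loop: full tiles only, no bounds check needed inside.
  for (std::size_t I = 0; I < Split; I += Tile)
    for (std::size_t J = 0; J < Tile; ++J)
      Sum += V[I + J];

  // Peeled remainder: fewer than Tile iterations.
  for (std::size_t I = Split; I < N; ++I)
    Sum += V[I];
  return Sum;
}
```

Peeling trades a little code duplication for a main loop with a fixed tile size, which tends to be easier to vectorize or lower further.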

Differential Revision:
The file was modified mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
The file was modified mlir/test/Dialect/Linalg/tile-and-peel-tensors.mlir
The file was modified mlir/test/lib/Dialect/Linalg/TestLinalgTransforms.cpp
Commit 83f3c615dde3fce5c0560c19316b08c1e6aa8c27 by joker.eph
Add missing storageType to AttrDef to ODS

This is only noticeable when using an attribute across dialects, I think.
Previously the namespace would be omitted, but it wouldn't matter as
long as the generated code stays within a single namespace.

Differential Revision:
The file was modified mlir/include/mlir/IR/
The file was modified mlir/test/mlir-tblgen/
Commit e470f9268a448fedea25289ec343f82ff52ccc36 by llvm-project
[Polly] Implement user-directed loop distribution/fission.

This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.

It interprets the same metadata as the LoopDistribute pass.
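Why distribution needs a correctness check can be seen in a small sketch, written in plain C++ rather than in terms of Polly's schedule trees. All names below are illustrative only. In the first pair the two statements are independent, so splitting the loop is safe. In the second pair the statements depend on each other across iterations, so the distributed version computes different values.

```cpp
#include <cstddef>
#include <vector>

struct Arrays {
  std::vector<int> A, B;
};

// Independent statements: S1 writes A, S2 writes B, both only read C.
Arrays independentFused(const std::vector<int> &C) {
  Arrays R{std::vector<int>(C.size()), std::vector<int>(C.size())};
  for (std::size_t I = 0; I < C.size(); ++I) {
    R.A[I] = C[I] + 1; // S1
    R.B[I] = C[I] * 2; // S2
  }
  return R;
}

// Distributed form of the loop above: one loop per statement.
Arrays independentDistributed(const std::vector<int> &C) {
  Arrays R{std::vector<int>(C.size()), std::vector<int>(C.size())};
  for (std::size_t I = 0; I < C.size(); ++I)
    R.A[I] = C[I] + 1; // S1
  for (std::size_t I = 0; I < C.size(); ++I)
    R.B[I] = C[I] * 2; // S2
  return R;
}

// Cyclic dependence: S1 reads B[I-1] written by S2 in the previous
// iteration, and S2 reads A[I] written by S1 in the same iteration.
Arrays cyclicFused(std::size_t N) {
  Arrays R{std::vector<int>(N, 0), std::vector<int>(N, 0)};
  for (std::size_t I = 1; I < N; ++I) {
    R.A[I] = R.B[I - 1] + 1; // S1
    R.B[I] = R.A[I] * 2;     // S2
  }
  return R;
}

// Distributing the cyclic loop is illegal: S1 now reads B before S2
// has produced any of its values.
Arrays cyclicDistributed(std::size_t N) {
  Arrays R{std::vector<int>(N, 0), std::vector<int>(N, 0)};
  for (std::size_t I = 1; I < N; ++I)
    R.A[I] = R.B[I - 1] + 1; // S1
  for (std::size_t I = 1; I < N; ++I)
    R.B[I] = R.A[I] * 2; // S2
  return R;
}
```

For `N = 4`, the fused cyclic loop ends with `A[3] == 7`, while the distributed one ends with `A[3] == 1`, which is the kind of change a legality check has to reject.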

Re-apply after revert in c7bcd72a38bcf99e03e4651ed5204d1a1f2bf695 with
fix: Take isBand out of #ifndef NDEBUG since it is now used.
The file was added polly/test/ScheduleOptimizer/ManualOptimization/distribute_illegal_pragmaloc.ll
The file was modified polly/lib/Transform/ManualOptimizer.cpp
The file was added polly/test/ScheduleOptimizer/ManualOptimization/distribute_illegal_looploc.ll
The file was modified polly/lib/Transform/ScheduleOptimizer.cpp
The file was modified polly/include/polly/ScheduleTreeTransform.h
The file was modified polly/lib/Analysis/DependenceInfo.cpp
The file was added polly/test/ScheduleOptimizer/ManualOptimization/distribute_heuristic.ll
The file was modified polly/lib/Transform/ScheduleTreeTransform.cpp
The file was modified polly/include/polly/DependenceInfo.h
The file was modified polly/include/polly/ManualOptimizer.h
Commit afab3c488f0c86af87e262cc7454e04de18e3e6a by i
[Driver] Default Generic_GCC x86 to -fasynchronous-unwind-tables

to match GCC and Clang's own x86-64.
The file was modified clang/test/Driver/clang-translation.c
The file was modified clang/lib/Driver/ToolChains/Gnu.cpp
Commit 7a62a5b56d670c4e152159740cd7fc4030a9470f by Christudasan.Devadasan
[AMDGPU] Legalize initialized LDS variables

We don't allow an initializer for LDS variables
and there is an early abort during instruction
selection. This patch legalizes them by ignoring
the init values. During assembly emission, proper
error reporting already exists for such instances.

Reviewed By: arsenm

Differential Revision:
The file was modified llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/lds-zero-initializer.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
Commit 25ac0d3c73d68c017546eb622ba7632c6b581bfb by dblaikie
DebugInfo: Implement the -gsimple-template-names functionality

This excludes certain names that can't be rebuilt from the available

* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do
  this, but I haven't written the extra code to add the attributes
  required for this
* Lambdas - ambiguous with any other unnamed class
* Unnamed classes/enums - would need column info for the type in
  addition to file/line number
* noexcept function types - not encoded in DWARF
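The notion of "rebuilding" a name can be sketched as follows, assuming the consumer has the simplified base name plus a printable form of each template argument. `rebuildTemplateName` is a hypothetical helper for illustration, not code from this commit or from any DWARF consumer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reassemble "base<arg, arg, ...>" from a simplified name and the
// already-printed template arguments. The cases excluded above are
// the ones where no unambiguous argument string can be produced.
std::string rebuildTemplateName(const std::string &Base,
                                const std::vector<std::string> &Args) {
  std::string Name = Base + "<";
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (I != 0)
      Name += ", ";
    Name += Args[I];
  }
  Name += ">";
  return Name;
}
```

For example, a base name `vec` with arguments `int` and `3` yields `vec<int, 3>`.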
The file was added clang/test/CodeGenCXX/debug-info-simple-template-names.cpp
The file was modified clang/lib/CodeGen/CGDebugInfo.cpp
Commit a2c1cf09dfaaa6d2161fee00f8317005bf955d64 by Lang Hames
[ORC] Introduce EPCGenericDylibManager / SimpleExecutorDylibManager.

EPCGenericDylibManager provides an interface for loading dylibs and looking up
symbols in the executor, implemented using EPC-calls to functions in the

SimpleExecutorDylibManager is an executor-side service that provides the
functions used by EPCGenericDylibManager.

SimpleRemoteEPC is updated to use an EPCGenericDylibManager instance to
implement the ExecutorProcessControl loadDylib and lookup methods. In a future
commit these methods will be removed, and clients updated to use
EPCGenericDylibManagers directly.
The file was added llvm/include/llvm/ExecutionEngine/Orc/EPCGenericDylibManager.h
The file was modified llvm/lib/ExecutionEngine/Orc/TargetProcess/CMakeLists.txt
The file was modified llvm/lib/ExecutionEngine/Orc/TargetProcess/SimpleRemoteEPCServer.cpp
The file was added llvm/lib/ExecutionEngine/Orc/TargetProcess/SimpleExecutorDylibManager.cpp
The file was modified llvm/lib/ExecutionEngine/Orc/SimpleRemoteEPC.cpp
The file was modified llvm/lib/ExecutionEngine/Orc/Shared/OrcRTBridge.cpp
The file was added llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/SimpleExecutorDylibManager.h
The file was added llvm/lib/ExecutionEngine/Orc/EPCGenericDylibManager.cpp
The file was modified llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/SimpleRemoteEPCServer.h
The file was modified llvm/lib/ExecutionEngine/Orc/CMakeLists.txt
The file was modified llvm/include/llvm/ExecutionEngine/Orc/SimpleRemoteEPC.h
The file was modified llvm/include/llvm/ExecutionEngine/Orc/Shared/OrcRTBridge.h
Commit 58d9ed2c935d6665da388cd72273360349792281 by llvmgnsyncbot
[gn build] Port a2c1cf09dfaa
The file was modified llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/TargetProcess/
The file was modified llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/
Commit 40ddde5d1fa7e5eadb76f6c3cc37dae2f80a8ca2 by Christudasan.Devadasan
[TableGen] Allow targets to entirely ignore Psets for registers

Tablegen currently expects targets to have at least one
pressure set for every broader register category. AMDGPU's
VGPR or AGPR, for instance, seemed to work correctly without
any pset, though we have forced one for each type to avoid
the assertion in computeRegUnitSets. However, psets cannot
be entirely empty. At least one set is mandatory for every
target. This patch bypasses the assertion for the classes
when GeneratePressureSet is zero while ensuring the
RegUnitSets are not empty.

Reviewed By: arsenm, rampitec

Differential Revision:
The file was added llvm/test/TableGen/
The file was modified llvm/utils/TableGen/CodeGenRegisters.cpp
The file was added llvm/test/TableGen/