Success

Changes

Summary

  1. [CodeGen] Format code comment to 80 columns. NFC.
  2. [MLIR][affine-loop-fusion] Handle defining ops between the source and dest loops
  3. [mlir] Check 'iter_args' in 'isLoopParallel' utility
  4. [SampleFDO][NFC] Refactor: make SampleProfileLoaderBaseImpl a template class
  5. [AMDGPU] require s-memtime-inst for __builtin_amdgcn_s_memtime
Commit b368fc735d5a485ebf8ed455e078dafbccf27659 by fraser
[CodeGen] Format code comment to 80 columns. NFC.
The file was modified: llvm/include/llvm/CodeGen/ISDOpcodes.h
Commit 203d5eeec55b1f0e0dd2aa28f5c5ebe292802e62 by diego.caballero
[MLIR][affine-loop-fusion] Handle defining ops between the source and dest loops

This patch handles defining ops between the source and dest loop nests, and prevents loop nests with `iter_args` from being fused.

If any SSA value used in the dest loop nest is defined by an op that depends on the source loop nest, we cannot fuse the loop nests.

If there is an `affine.for` with `iter_args`, prevent it from being fused.

Reviewed By: dcaballe, bondhugula

Differential Revision: https://reviews.llvm.org/D97030
The file was modified: mlir/test/Transforms/loop-fusion.mlir
The file was modified: mlir/lib/Transforms/LoopFusion.cpp
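
The two fusion restrictions described in this commit message can be illustrated with a small standalone sketch. This is a toy model, not the code in LoopFusion.cpp: `ToyOp`, `ToyLoopNest` and `canFuseToy` are hypothetical names, and values are identified by plain strings rather than MLIR SSA values.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy model only; none of these types exist in MLIR.
struct ToyOp {
  std::string result;                 // value this op defines
  std::vector<std::string> operands;  // values or memrefs it reads
};

struct ToyLoopNest {
  std::vector<std::string> writes;  // values or memrefs produced by the nest
  std::vector<std::string> uses;    // values defined outside, used inside
  bool hasIterArgs = false;         // models `affine.for ... iter_args(...)`
};

// Returns true if this toy check allows fusing `src` into `dst`, given the
// ops that sit between the two nests, listed in program order.
bool canFuseToy(const ToyLoopNest &src, const ToyLoopNest &dst,
                const std::vector<ToyOp> &opsBetween) {
  // Rule 1: a nest with iter_args is never fused.
  if (src.hasIterArgs || dst.hasIterArgs)
    return false;

  // Collect intermediate definitions that depend, directly or transitively,
  // on something the source nest writes. One forward pass is enough because
  // opsBetween is in program order.
  std::unordered_set<std::string> srcWrites(src.writes.begin(),
                                            src.writes.end());
  std::unordered_set<std::string> dependentDefs;
  for (const ToyOp &op : opsBetween) {
    for (const std::string &operand : op.operands) {
      if (srcWrites.count(operand) || dependentDefs.count(operand)) {
        dependentDefs.insert(op.result);
        break;
      }
    }
  }

  // Rule 2: the dest nest must not use any such dependent definition.
  for (const std::string &use : dst.uses)
    if (dependentDefs.count(use))
      return false;
  return true;
}
```

In this model, a value defined between the nests that reads a memref written by the source nest blocks fusion as soon as the dest nest uses it, because moving the source nest into the dest nest would place the producer after that definition.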
Commit ebca222b65cb847f7bf4ee3da1dd7e2df35d0338 by diego.caballero
[mlir] Check 'iter_args' in 'isLoopParallel' utility

Fix 'isLoopParallel' utility so that 'iter_args' is taken into account
and loops with loop-carried dependences are not classified as parallel.

Reviewed By: tungld, vinayaka-polymage

Differential Revision: https://reviews.llvm.org/D97347
The file was modified: mlir/test/Dialect/Affine/parallelize.mlir
The file was modified: mlir/lib/Analysis/Utils.cpp
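
The effect of the 'isLoopParallel' fix can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in Utils.cpp: `ToyLoop` and `isToyLoopParallel` are hypothetical, and the memory dependence result is assumed to be computed elsewhere.

```cpp
#include <cassert>

// Toy stand-in for an `affine.for`; not an MLIR data structure.
struct ToyLoop {
  int numIterArgs = 0;               // values carried between iterations
  bool hasMemoryDependence = false;  // assumed output of a separate analysis
};

// A loop that carries values through iter_args has a loop-carried dependence
// by construction, so it is rejected before memory accesses are considered.
bool isToyLoopParallel(const ToyLoop &loop) {
  if (loop.numIterArgs > 0)
    return false;
  return !loop.hasMemoryDependence;
}
```

A sum reduction that threads its accumulator through `iter_args` is the typical case: it may touch no memory at all inside the loop, yet each iteration needs the previous iteration's result.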
Commit 6103b6ad69fed0fe300f518b5115837cf6b74148 by xur
[SampleFDO][NFC] Refactor: make SampleProfileLoaderBaseImpl a template class

This patch makes SampleProfileLoaderBaseImpl a template class so it
can be used in CodeGen transformation.

Notable changes:
* use one template parameter and use IRTraits to get the other
   types and type-specific functions.
* remove the temporary "inline" keywords added in the previous
   refactor patch.
* change the template function findEquivalencesFor to a regular
   function. This function has a single caller with type of
   PostDominatorTree. It's simpler to use the type directly
   because MachinePostDominatorTree is not a derived type of
   template DominatorTreeBase.

Differential Revision: https://reviews.llvm.org/D96981
The file was modified: llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h
The file was modified: llvm/lib/Transforms/IPO/SampleProfile.cpp
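
The "one template parameter plus traits" pattern mentioned in this commit message can be sketched as follows. This is a minimal illustration, assuming the single parameter is the block type; every `Toy*` name is hypothetical and the members shown are not the actual contents of IRTraits or SampleProfileLoaderBaseImpl.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for IR-level and machine-level types.
struct ToyIRBlock { std::string name; };
struct ToyIRFunction { ToyIRBlock entry; };
struct ToyMachineBlock { int number; };
struct ToyMachineFunction { ToyMachineBlock entry; };

// Primary traits template, specialized once per supported block type.
template <typename BlockT> struct ToyIRTraits;

template <> struct ToyIRTraits<ToyIRBlock> {
  using FunctionT = ToyIRFunction;
  static const ToyIRBlock &getEntry(const FunctionT &f) { return f.entry; }
  static std::string describe(const ToyIRBlock &b) { return "ir:" + b.name; }
};

template <> struct ToyIRTraits<ToyMachineBlock> {
  using FunctionT = ToyMachineFunction;
  static const ToyMachineBlock &getEntry(const FunctionT &f) {
    return f.entry;
  }
  static std::string describe(const ToyMachineBlock &b) {
    return "mbb:" + std::to_string(b.number);
  }
};

// The loader takes a single template parameter and derives every other type
// and type-specific function from the traits class.
template <typename BlockT> class ToyLoaderBase {
public:
  using Traits = ToyIRTraits<BlockT>;
  using FunctionT = typename Traits::FunctionT;

  std::string describeEntry(const FunctionT &f) const {
    return Traits::describe(Traits::getEntry(f));
  }
};
```

With this layout, supporting another representation means adding a traits specialization rather than widening the loader's template parameter list.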
Commit 502b3bfc6a713e5b6640faf48e72de08d7cb0aba by Stanislav.Mekhanoshin
[AMDGPU] require s-memtime-inst for __builtin_amdgcn_s_memtime

Differential Revision: https://reviews.llvm.org/D97420
The file was modified: clang/test/CodeGenOpenCL/builtins-amdgcn.cl
The file was modified: clang/include/clang/Basic/BuiltinsAMDGPU.def
The file was modified: clang/test/CodeGenOpenCL/builtins-amdgcn-ci.cl
The file was modified: clang/test/CodeGenOpenCL/builtins-amdgcn-gfx9.cl
The file was modified: clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
The file was added: clang/test/SemaOpenCL/builtins-amdgcn-error-gfx1030.cl
The file was modified: clang/test/CodeGenOpenCL/builtins-amdgcn-gfx10.cl