Changes

Summary

  1. [libomptarget][amdgpu] Destruct HSA queues
  2. [DSE] Make DSEState non-copyable (NFC)
  3. [DSE] Don't check getUnderlyingObject() return value (NFC)
  4. [X86][Costmodel] Load/store i16 VF=2 interleaving costs
  5. [RISCV] Remove redundant declaration RISCVMnemonicSpellCheck (NFC)
Commit 8cf93a35d4b873b5e50c152d00adfc3701c679ea by jonathanchesterfield
[libomptarget][amdgpu] Destruct HSA queues

Store queues in unique_ptr so they are destroyed when the global DeviceInfo is destroyed. Currently they leak, which raises an assert in debug builds of HSA.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D109511
The file was modified openmp/libomptarget/plugins/amdgpu/dynamic_hsa/hsa.h
The file was modified openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
The file was modified openmp/libomptarget/plugins/amdgpu/dynamic_hsa/hsa.cpp
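
The ownership pattern described in this commit (queues held in unique_ptr and released when their owner is destroyed) can be sketched roughly as follows. This is only an illustration, not the code from rtl.cpp: `FakeQueue`, `fakeQueueCreate`, `fakeQueueDestroy`, `QueueDeleter` and `DeviceInfoSketch` are hypothetical names standing in for the HSA queue type, its create/destroy calls, and the global DeviceInfo object.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a queue handle from a C API (not the real HSA type).
struct FakeQueue { int id; };

// Counts destroyed queues so the cleanup can be observed.
inline int &destroyedQueues() { static int n = 0; return n; }

// Hypothetical stand-ins for the C-style create/destroy calls.
inline FakeQueue *fakeQueueCreate(int id) { return new FakeQueue{id}; }
inline void fakeQueueDestroy(FakeQueue *q) { ++destroyedQueues(); delete q; }

// Deleter that routes unique_ptr cleanup through the destroy call.
struct QueueDeleter {
  void operator()(FakeQueue *q) const { fakeQueueDestroy(q); }
};
using QueuePtr = std::unique_ptr<FakeQueue, QueueDeleter>;

// Owner analogous to a global device-info object: destroying it
// destroys every queue it holds, so nothing leaks.
struct DeviceInfoSketch {
  std::vector<QueuePtr> queues;
  void addQueue(int id) { queues.emplace_back(fakeQueueCreate(id)); }
};
```

With raw pointers in the vector, the destroy call would have to be issued by hand; with the deleter attached to the pointer type, it runs whenever the owning object goes out of scope.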
Commit f3c74b72f45ec3e6ca2402468cb070d7e485e3d4 by nikita.ppv
[DSE] Make DSEState non-copyable (NFC)

As it contains a self-reference, the default copy/move ctors
would not be safe.

Move the DSEState::get() method into the ctor to make sure no move
occurs here even without NRVO.

This is a speculative fix for test failures on
llvm-clang-x86_64-expensive-checks-win.
The file was modified llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
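
To illustrate why a self-referencing object should not be copied or moved, here is a minimal sketch. `StateSketch` and `BatchHelper` are hypothetical names and do not reflect the actual layout of DSEState; the sketch only assumes that some member holds a reference back to the enclosing object.

```cpp
#include <cassert>
#include <type_traits>

struct StateSketch;

// Hypothetical helper that keeps a reference to the object that owns it.
struct BatchHelper {
  const StateSketch &owner;
};

struct StateSketch {
  int value;
  BatchHelper helper; // refers back to *this

  // All setup happens in the constructor, so the object is built in place
  // and never needs to be returned by value from a factory function.
  explicit StateSketch(int v) : value(v), helper{*this} {}

  // A default copy would leave helper.owner pointing at the source object.
  // Declaring the copy operations as deleted also suppresses the implicit
  // move operations.
  StateSketch(const StateSketch &) = delete;
  StateSketch &operator=(const StateSketch &) = delete;
};
```

If copying were allowed, the copy's `helper.owner` would still refer to the original, which may already have been destroyed; deleting the operations turns that mistake into a compile error.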
Commit 14a49f5840a15791e7452200832a51bd11620df6 by nikita.ppv
[DSE] Don't check getUnderlyingObject() return value (NFC)

getUnderlyingObject() never returns null. It will simply return
something that is not the "root" underlying object.

Also drop a stale comment.
The file was modified llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
Commit d9413f46b308df5afd7fc106df2af809757bb0c9 by lebedev.ri
[X86][Costmodel] Load/store i16 VF=2 interleaving costs

The only sched models for CPUs that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/M8vEKs5jY - for intels `Block RThroughput: =2.0`;
                                  for ryzens, `Block RThroughput: <=1.0`
So pick cost of `2`.

For store we have:
https://godbolt.org/z/Kx1nKz7je - for intels `Block RThroughput: =1.0`;
                                  for ryzens, `Block RThroughput: <=0.5`
So pick cost of `1`.

I'm directly using the shuffling asm that llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D103144
The file was modified llvm/lib/Target/X86/X86TargetTransformInfo.cpp
The file was modified llvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-2.ll
The file was modified llvm/test/Analysis/CostModel/X86/interleaved-store-i16-stride-2.ll
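
One way to read the cost selection in this commit is that the chosen cost is the worst reciprocal throughput observed across the listed sched models, rounded up to an integer. That reading is an assumption, and `pickCost` is a hypothetical helper, not a function in X86TargetTransformInfo.cpp. The inputs in the usage note use the upper bounds quoted for the Ryzen models.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: take the largest reciprocal throughput among the
// measured CPUs and round it up to a whole-number cost.
inline int pickCost(const std::vector<double> &rthroughputs) {
  double worst = 0.0;
  for (double r : rthroughputs)
    worst = std::max(worst, r);
  return static_cast<int>(std::ceil(worst));
}
```

Under this reading, `pickCost({2.0, 1.0})` matches the load cost of 2 and `pickCost({1.0, 0.5})` matches the store cost of 1.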
Commit c4ae4a745dbdb2ac3d8a5c77d2cd12b5e5349154 by kazu
[RISCV] Remove redundant declaration RISCVMnemonicSpellCheck (NFC)

Note that RISCVMnemonicSpellCheck is defined in
RISCVGenAsmMatcher.inc, which RISCVAsmParser.cpp includes.

Identified with readability-redundant-declaration.
The file was modified llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp