Failed

Changes

Summary

  1. Reapply [InstCombine] Fold multiuse shr eq zero
  2. [mlir][linalg][nfc] Fix signed/unsigned comparison warning in header
  3. [HIP] support ThinLTO
  4. [JITLink] Move some Block bitfields into Addressable to improve packing.
  5. [ORC] Add more synchronization to TestLookupWithUnthreadedMaterialization.
  6. [CostModel][X86] Pull out X86/X64 scalar int arithmetic costs from SSE tables. NFCI.
  7. [IR] Optimize no-op removal from AttributeSet (NFC)
  8. [IR] Optimize no-op removal from AttributeList (NFC)
Commit 9a9421a461166482465e786a46f8cced63cd2e9f by nikita.ppv
Reapply [InstCombine] Fold multiuse shr eq zero

This was reverted due to performance regressions in ARM benchmarks,
which have since been addressed by D101196 (SCEV analysis improvement)
and D101778 (CGP reverse transform).

-----

The single-use case is handled implicitly by converting the icmp
into a mask check first. When comparing with zero in particular,
we don't need the one-use restriction, as we only produce a single
icmp.

https://alive2.llvm.org/ce/z/MSixcm
https://alive2.llvm.org/ce/z/GwpG0M
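
The identity behind this fold can be spot-checked outside LLVM. The following is a minimal Python sketch, not the InstCombine implementation: it assumes 8-bit integers and uses hypothetical helper names (`lshr`, `ashr`, `fold_holds`). It exhaustively confirms that `(x >> c) == 0` agrees with the single unsigned comparison `x u< (1 << c)` for both logical and arithmetic shifts.

```python
BITS = 8                      # assumed width, chosen so the check is exhaustive
MASK = (1 << BITS) - 1


def lshr(x, c):
    """Logical shift right of an unsigned BITS-wide value."""
    return (x & MASK) >> c


def ashr(x, c):
    """Arithmetic shift right: reinterpret x as signed, shift, wrap back."""
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> c) & MASK


def fold_holds(shift):
    """True if (x shift c) == 0  <=>  x u< (1 << c) for every x and c < BITS."""
    return all(
        (shift(x, c) == 0) == (x < (1 << c))
        for x in range(1 << BITS)
        for c in range(BITS)
    )


print(fold_holds(lshr), fold_holds(ashr))
```

Because the replacement is one comparison on the original value, the shift can stay in place for its other users, which is why the one-use restriction is not needed when comparing with zero.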
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
The file was modified llvm/test/Transforms/PhaseOrdering/X86/ctlz-loop.ll
The file was modified llvm/test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
The file was modified llvm/test/Transforms/InstCombine/icmp-shr.ll
Commit 0dd36f81b9f894497773caed509603eb0f090cae by ivan.butygin
[mlir][linalg][nfc] Fix signed/unsigned comparison warning in header

Differential Revision: https://reviews.llvm.org/D102968
The file was modified mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td
Commit bf6124580dfba86b73d828851f03fb9eea1269bd by Yaxun.Liu
[HIP] support ThinLTO

Add options -f[no-]offload-lto and -foffload-lto=[thin,full] for controlling
LTO for offload compilation. Allow LTO for the AMDGPU target.

The AMDGPU target does not support codegen of object files containing
calls to external functions, so the LLVM module passed to the
AMDGPU backend needs to contain definitions of all the callees.
An LLVM option is added to allow the function importer to import
functions with the noinline attribute.

The HIP toolchain passes the proper LLVM options to lld to make sure
the function importer imports definitions of all the callees.

Reviewed by: Teresa Johnson, Artem Belevich

Differential Revision: https://reviews.llvm.org/D99683
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was modified clang/include/clang/Driver/Driver.h
The file was modified llvm/test/Transforms/FunctionImport/Inputs/funcimport.ll
The file was modified llvm/test/Transforms/FunctionImport/adjustable_threshold.ll
The file was added llvm/test/Transforms/FunctionImport/noinline.ll
The file was modified llvm/test/Transforms/FunctionImport/funcimport.ll
The file was modified llvm/lib/Transforms/IPO/FunctionImport.cpp
The file was modified clang/lib/Driver/ToolChains/HIP.cpp
The file was added llvm/test/Transforms/FunctionImport/Inputs/noinline.ll
The file was modified clang/test/Driver/hip-options.hip
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/lib/Driver/Driver.cpp
Commit 2b45895df46e3e87b9588bd207f417d2d2fe7482 by Lang Hames
[JITLink] Move some Block bitfields into Addressable to improve packing.

Moving these bitfields from Block to Addressable allows them to be packed with
the bitfields at the end of Addressable, reducing the size of Block by eight
bytes.
The file was modified llvm/include/llvm/ExecutionEngine/JITLink/JITLink.h
Commit 1a1d6e6f98738be249b20994bcfed48dccac59e3 by Lang Hames
[ORC] Add more synchronization to TestLookupWithUnthreadedMaterialization.

Don't run tasks until their corresponding thread has been added to the running
threads vector. This is an extension to fda4300da82, which doesn't seem to have
been enough to fix the synchronization issues on its own.
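
The ordering described above (register the thread first, run the task second) can be illustrated with a small Python sketch. This is not the code in CoreAPIsTest.cpp; `run_tasks`, `running`, and `observed` are hypothetical names, and a single lock is assumed to guard the list of running threads.

```python
import threading


def run_tasks(tasks):
    """Run each task on its own thread; a task never starts before its
    thread has been added to the `running` list."""
    running = []               # stand-in for the "running threads" vector
    lock = threading.Lock()    # guards `running` and `observed`
    observed = []              # per task: was its thread registered at start?

    def worker(task):
        # Blocks until the spawner releases the lock, which only happens
        # after this thread has been appended to `running`.
        with lock:
            observed.append(threading.current_thread() in running)
        task()

    for task in tasks:
        with lock:
            t = threading.Thread(target=worker, args=(task,))
            t.start()
            running.append(t)  # registration completes before the lock is released

    for t in list(running):
        t.join()
    return observed
```

Holding the lock across both `start()` and `append()` is the design choice that matters here: the new thread may begin executing immediately, but it cannot get past its first lock acquisition until the registration is visible.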
The file was modified llvm/unittests/ExecutionEngine/Orc/CoreAPIsTest.cpp
Commit 6f9ac11e3960bf5953b3af4b0c4e2682ea802081 by llvm-dev
[CostModel][X86] Pull out X86/X64 scalar int arithmetic costs from SSE tables. NFCI.

These aren't dependent on any SSE level (and don't tend to get quicker either).
The file was modified llvm/lib/Target/X86/X86TargetTransformInfo.cpp
Commit fd46ed3f397d6cf41bc6c5a04ab2089f585afe44 by nikita.ppv
[IR] Optimize no-op removal from AttributeSet (NFC)

When removing an AttrBuilder from an AttributeSet, first check
whether there is any overlap. If nothing is being removed, we can
directly return the original set.
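
The overlap check can be sketched in a few lines of Python. This is only an illustration of the idea, not LLVM's API: a `frozenset` stands in for the immutable AttributeSet, an iterable of names stands in for the AttrBuilder, and `remove_attributes` is a hypothetical function name.

```python
def remove_attributes(attr_set, to_remove):
    """Return attr_set without the attributes named in to_remove.

    attr_set is a frozenset standing in for an immutable attribute set;
    to_remove is any iterable standing in for the builder of attributes to drop.
    """
    to_remove = frozenset(to_remove)
    # Fast path: no overlap, so hand back the original object unchanged.
    if attr_set.isdisjoint(to_remove):
        return attr_set
    # Slow path: build a new set only when something is actually removed.
    return attr_set - to_remove
```

Returning the same object on the fast path avoids constructing a new set, and callers can detect a no-op with an identity comparison.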
The file was modified llvm/lib/IR/Attributes.cpp
Commit 05738ffcb87b76c6f166f965ba9b2db3257a4338 by nikita.ppv
[IR] Optimize no-op removal from AttributeList (NFC)

When removing an AttrBuilder from an index of an AttributeList,
directly return the original list if no attributes were actually
removed.
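
For the indexed case, a comparable Python sketch follows. It is an assumption-laden illustration rather than LLVM's implementation: a tuple of `frozenset`s stands in for the AttributeList, and `remove_attributes_at_index` is a hypothetical name.

```python
def remove_attributes_at_index(attr_list, index, to_remove):
    """Return attr_list with to_remove dropped from the set at `index`.

    attr_list is a tuple of frozensets, one attribute set per index.
    """
    old = attr_list[index]
    new = old - frozenset(to_remove)
    # Nothing was actually removed: return the original list object as-is.
    if new == old:
        return attr_list
    # Otherwise rebuild the list with only the affected slot replaced.
    return attr_list[:index] + (new,) + attr_list[index + 1:]
```

Unlike the set-level sketch, the check here happens after the per-index removal, so the list is rebuilt only when the slot's contents really changed.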
The file was modified llvm/lib/IR/Attributes.cpp