Changes

Summary

  1. [ELF] Move GOT/PLT relocation code closer. NFC
  2. [clang-tidy] Warn on functional C-style casts
  3. [ARM] create new pseudo t2LDRLIT_ga_pcrel for stack guards
  4. [X86][LoopVectorize] "Fix" `X86TTIImpl::getAddressComputationCost()`
  5. [llvm-profgen] Compute and show profile density
  6. [PR52549][clang-cl] Predefine _MSVC_EXECUTION_CHARACTER_SET
  7. [RISCV] Decode vtype with reserved fields to raw immediate
  8. [ELF] Move ObjFile<ELFT>::{getLocalSymbols,getGlobalSymbols} to non-template ELFFileBase. NFC
  9. [mlir][OpDSL] Fix OpDSL tests after https://reviews.llvm.org/D114680.
  10. [mlir] Move bufferization-related passes to `bufferization` dialect.
  11. [clangd] Make std symbol generation script python3 friendly
  12. [mlir] Decompose Bufferization Clone operation into Memref Alloc and Copy.
  13. [clang][ARM] PACBTI-M assembly support
Commit 5047e3a3ba92402b60c200201484b422cad8bea6 by i
[ELF] Move GOT/PLT relocation code closer. NFC
The file was modified lld/ELF/Relocations.cpp
Commit 5bbe50148f3b515c170be22209395b72890f5b8c by carlosgalvezp
[clang-tidy] Warn on functional C-style casts

The google-readability-casting check is meant to be on par
with cpplint's readability/casting check, according to the
documentation. However it currently does not diagnose
functional casts, like:

float x = 1.5F;
int y = int(x);

This is detected by cpplint, however, and the guidelines
are clear that such a cast is only allowed when the type
is a class type (constructor call):

> You may use cast formats like `T(x)` only when `T` is a class type.

Therefore, update the clang-tidy check to check this
case.
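
The distinction drawn above can be sketched as follows. The `Meters` type and the function names are hypothetical and do not come from the patch; the comments restate what the commit message says should be diagnosed and are not captured tool output.

```cpp
#include <cassert>

// Hypothetical class type, used only for illustration.
struct Meters {
  explicit Meters(double v) : value(v) {}
  double value;
};

// Functional cast on a non-class type: per the commit message, this is the
// form the updated check is meant to diagnose.
inline int truncateFunctional(float x) { return int(x); }

// The named cast performs the same conversion explicitly.
inline int truncateNamed(float x) { return static_cast<int>(x); }

// `T(x)` where `T` is a class type is a constructor call and stays allowed.
inline Meters wrap(double x) { return Meters(x); }

// Both spellings truncate toward zero, so they agree on the result.
inline void checkCastsAgree(float x) {
  assert(truncateFunctional(x) == truncateNamed(x));
}
```

Only the spelling differs between the first two functions; the generated conversion is the same, which is why the guideline treats it as a readability issue.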

Differential Revision: https://reviews.llvm.org/D114427
The file was modified clang-tools-extra/clang-tidy/google/AvoidCStyleCastsCheck.cpp
The file was modified clang-tools-extra/test/clang-tidy/checkers/google-readability-casting.cpp
The file was modified clang-tools-extra/docs/ReleaseNotes.rst
Commit 89453ed6f2059b5cec576fc41914def713fe38f7 by ardb
[ARM] create new pseudo t2LDRLIT_ga_pcrel for stack guards

We can't use the existing pseudo ARM::tLDRLIT_ga_pcrel for loading the
stack guard for PIC code that references the GOT, since arm-pseudo may
expand this to the narrow tLDRpci rather than the wider t2LDRpci.

Create a new pseudo, t2LDRLIT_ga_pcrel, and expand it to t2LDRpci.

Fixes: https://bugs.chromium.org/p/chromium/issues/detail?id=1270361

Reviewed By: ardb

Differential Revision: https://reviews.llvm.org/D114762
The file was added llvm/test/CodeGen/ARM/expand-pseudos.ll
The file was modified llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
The file was modified llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
The file was modified llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
The file was modified llvm/lib/Target/ARM/ARMInstrThumb2.td
Commit 8cd782487fe68082e57d24a576b77f529d77f96c by lebedev.ri
[X86][LoopVectorize] "Fix" `X86TTIImpl::getAddressComputationCost()`

We ask `TTI.getAddressComputationCost()` about the cost of computing a vector address,
and then multiply it by the vector width. This doesn't make any sense:
it implies that we'd do a vector GEP and then scalarize the vector of pointers,
but there is no such thing in the vectorized IR; we perform scalar GEPs.

This is *especially* bad on X86, and was effectively prohibiting any scalarized
vectorization of gathers/scatters, because `X86TTIImpl::getAddressComputationCost()`
says that the cost of a vector address computation is `10`, as compared to `1` for a scalar one.
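
The arithmetic behind this complaint can be sketched with a simplified model. This is not the LoopVectorize code: the `AddressCosts` struct and both function names are hypothetical, and only the `10` and `1` figures come from the message above.

```cpp
#include <cassert>

// Hypothetical, simplified model of the address-computation part of the
// cost of a scalarized gather/scatter.
struct AddressCosts {
  unsigned scalar; // cost of one scalar address computation
  unsigned vector; // cost reported for a vector address computation
};

// Behaviour criticized above: the *vector* address cost is multiplied by
// the vector width (VF).
inline unsigned oldScalarizedAddressCost(const AddressCosts &c, unsigned vf) {
  assert(vf > 0 && "vector width must be positive");
  return vf * c.vector;
}

// Behaviour the message argues for: VF scalar GEPs, each at the scalar cost.
inline unsigned newScalarizedAddressCost(const AddressCosts &c, unsigned vf) {
  assert(vf > 0 && "vector width must be positive");
  return vf * c.scalar;
}
```

With the X86 figures quoted above and an illustrative width of 8, the first formula charges 80 for address computation and the second charges 8, which shows why the old estimate made scalarized gathers and scatters look unprofitable.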

The computed costs are similar to the ones with D111222+D111220,
but we end up without masked memory intrinsics that we'd then have to
expand later on, without much luck. (D111363)

Differential Revision: https://reviews.llvm.org/D111460
The file was modified llvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/masked-interleaved-store-i16.ll
The file was modified llvm/test/Analysis/CostModel/X86/scatter-i32-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
The file was modified llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
The file was modified llvm/test/Analysis/CostModel/X86/scatter-i64-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/masked-scatter-i64-with-i8-index.ll
The file was modified llvm/lib/Target/X86/X86TargetTransformInfo.cpp
The file was modified llvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-5.ll
The file was modified llvm/test/Analysis/CostModel/X86/scatter-i16-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/masked-scatter-i32-with-i8-index.ll
The file was modified llvm/test/Analysis/CostModel/X86/scatter-i8-with-i8-index.ll
The file was modified llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll
The file was modified llvm/test/Analysis/CostModel/X86/masked-interleaved-load-i16.ll
Commit c2e08aba1afd5a69dbe74b03ce6f463d45102222 by wlei
[llvm-profgen] Compute and show profile density

AutoFDO performance is sensitive to profile density, i.e., the number of samples in the profile relative to the program size, because profiles with insufficient samples can be inaccurate due to statistical noise and thus hurt AutoFDO performance. A previous investigation showed that AutoFDO performed better on MySQL with an increased number of samples. Therefore, we implement a profile-density computation feature to give hints about profile density to users and the compiler.

We define the density of a profile Prof as follows:

- For each function A in the profile, density(A) = total_samples(A) / sizeof(A).
- density(Prof) = min(density(A)) for all functions A that are warm (defined below).

A function is considered warm if its total-samples is within the top N percent of the profile. For the implementation, we reuse `ProfileSummaryBuilder::getHotCountThreshold(..)` as the threshold, which can be set by percent (`--profile-summary-cutoff-hot`) or by value (`--profile-summary-hot-count`).

We also introduce `--hot-function-density-threshold` to set the hot function density threshold. If the profile density is below it, we print a suggestion, since this implies that more samples should be collected.
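
The definitions above can be sketched as a small standalone computation. This is not the llvm-profgen implementation: `FunctionProfile` and all function names are hypothetical, treating a function as warm when its samples are greater than or equal to the threshold is an assumption, and returning 0.0 when no function is warm is a choice made only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical per-function record; not the llvm-profgen data structure.
struct FunctionProfile {
  std::uint64_t totalSamples;
  std::uint64_t sizeInBytes; // assumed non-zero
};

// density(A) = total_samples(A) / sizeof(A)
inline double functionDensity(const FunctionProfile &f) {
  assert(f.sizeInBytes != 0 && "function size must be non-zero");
  return static_cast<double>(f.totalSamples) /
         static_cast<double>(f.sizeInBytes);
}

// density(Prof) = min(density(A)) over all warm functions A.
// Assumption: a function is warm when totalSamples >= hotCountThreshold.
inline double profileDensity(const std::vector<FunctionProfile> &funcs,
                             std::uint64_t hotCountThreshold) {
  double density = std::numeric_limits<double>::infinity();
  bool anyWarm = false;
  for (const FunctionProfile &f : funcs) {
    if (f.totalSamples < hotCountThreshold)
      continue; // cold function: ignored by the definition
    anyWarm = true;
    density = std::min(density, functionDensity(f));
  }
  return anyWarm ? density : 0.0;
}

// True when the density is below the configured threshold, i.e. when more
// samples should be suggested.
inline bool shouldCollectMoreSamples(double density, double threshold) {
  return density < threshold;
}
```

Taking the minimum rather than the mean means a single sparsely sampled warm function is enough to lower the reported density.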

This also applies to CS profiles, with all profiles merged into the base.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D113781
The file was added llvm/test/tools/llvm-profgen/profile-density.test
The file was added llvm/test/tools/llvm-profgen/Inputs/profile-density.raw.prof
The file was modified llvm/tools/llvm-profgen/ProfileGenerator.h
The file was added llvm/test/tools/llvm-profgen/Inputs/profile-density-cs.raw.prof
The file was modified llvm/tools/llvm-profgen/ProfileGenerator.cpp
The file was modified llvm/tools/llvm-profgen/ProfiledBinary.h
Commit 7ba70d32736aef0c640b9d0e7b9081fc208c81c2 by markus.boeck02
[PR52549][clang-cl] Predefine _MSVC_EXECUTION_CHARACTER_SET

Since VS 2022 17.1, MSVC predefines _MSVC_EXECUTION_CHARACTER_SET to inform users of the execution character set defined at compile time. The macro expands to a Windows Code Page Identifier; these identifiers are documented here: https://docs.microsoft.com/en-us/windows/win32/intl/code-page-identifiers

As clang currently only supports UTF-8, the macro is defined as 65001. If clang-cl were to support a different execution character set in the future, we'd have to change the value.
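
User code might consume the macro along these lines. This is a minimal sketch: the helper names are hypothetical, and returning 0 when the macro is absent is a convention chosen here, not something the compiler defines.

```cpp
#include <cassert>

// Windows code page identifier for UTF-8, as stated in the commit message.
constexpr int kUtf8CodePage = 65001;

// Returns the execution character set code page reported by the compiler,
// or 0 when the macro is not predefined (older MSVC or other compilers).
constexpr int executionCodePage() {
#if defined(_MSVC_EXECUTION_CHARACTER_SET)
  return _MSVC_EXECUTION_CHARACTER_SET;
#else
  return 0;
#endif
}

constexpr bool isUtf8CodePage(int codePage) {
  return codePage == kUtf8CodePage;
}

// Example use: code that relies on UTF-8 narrow literals can check the
// reported code page whenever one is available.
inline void requireUtf8IfReported() {
  assert(executionCodePage() == 0 || isUtf8CodePage(executionCodePage()));
}
```

Because the macro is only predefined by some compilers, the fallback branch keeps the helper usable elsewhere.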

Fixes https://bugs.llvm.org/show_bug.cgi?id=52549

Differential Revision: https://reviews.llvm.org/D114576
The file was modified clang/lib/Basic/Targets/OSTargets.cpp
The file was modified clang/test/Preprocessor/init.c
Commit 29d4230d6b528ebf14dcd5dc610ee0d937a23d51 by powerman1st
[RISCV] Decode vtype with reserved fields to raw immediate

This patch fixes a crash when running "llvm-objdump -D --mattr=+experimental-v"
against an object file that happens to contain a word that decodes to
VSETVLI or VSETIVLI with the reserved value vlmul[2:0]=4. All vtype values with
reserved fields (vlmul[2:0]=4, vsew[2:0]=0b1xx, non-zero bits 8/9/10) are
now printed as a raw immediate.
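
The reserved-field conditions listed above can be sketched as a predicate on the vtype immediate. This is not the `RISCVInstPrinter` code: the function name is hypothetical, and the field positions (vlmul in bits [2:0], vsew in bits [5:3]) as well as the 11-bit immediate width are assumptions based on the RVV 1.0 vtype layout.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the vtype immediate uses any of the reserved encodings
// named in the commit message, in which case the printer would fall back to
// showing the raw immediate.
// Assumed layout: vlmul = bits [2:0], vsew = bits [5:3], and bits 8-10 must
// be zero.
inline bool hasReservedVTypeFields(std::uint32_t vtypeImm) {
  assert(vtypeImm < (1u << 11) && "sketch assumes an 11-bit immediate");
  const std::uint32_t vlmul = vtypeImm & 0x7;
  const std::uint32_t vsew = (vtypeImm >> 3) & 0x7;
  const std::uint32_t upper = (vtypeImm >> 8) & 0x7; // bits 8, 9 and 10
  return vlmul == 4 || (vsew & 0x4) != 0 || upper != 0;
}
```

For instance, an immediate of 4 has vlmul[2:0]=4 and is reported as reserved, while an immediate of 0 is not.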

Reviewed By: jhenderson, jrtc27, craig.topper

Differential Revision: https://reviews.llvm.org/D114581
The file was modified llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
The file was added llvm/test/MC/RISCV/rvv/vsetvl-invalid.s
Commit 5188f55d32a9cd95c3cb668ab2d762ca4e0c8d6b by i
[ELF] Move ObjFile<ELFT>::{getLocalSymbols,getGlobalSymbols} to non-template ELFFileBase. NFC
The file was modified lld/ELF/InputFiles.h
The file was modified lld/ELF/InputFiles.cpp
Commit 0d0371f58ff0e4289bdff9ef70f7f6fb0277c3d0 by gysit
[mlir][OpDSL] Fix OpDSL tests after https://reviews.llvm.org/D114680.

Update the shapes of the convolution / pooling tests that were flagged after enabling verification during printing (https://reviews.llvm.org/D114680). Also split the emit_structured_generic.py file, which previously contained all tests, into multiple separate files to simplify debugging.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D114731
The file was added mlir/test/python/dialects/linalg/opdsl/emit_pooling.py
The file was added mlir/test/python/dialects/linalg/opdsl/emit_misc.py
The file was removed mlir/test/python/dialects/linalg/opdsl/emit_structured_generic.py
The file was added mlir/test/python/dialects/linalg/opdsl/emit_convolution.py
The file was added mlir/test/python/dialects/linalg/opdsl/emit_matmul.py
Commit f89bb3c012b46a00eb31bb7a705a85993eb763e3 by pifon
[mlir] Move bufferization-related passes to `bufferization` dialect.

[RFC](https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712)

Differential Revision: https://reviews.llvm.org/D114698
The file was modified mlir/include/mlir/InitAllPasses.h
The file was modified mlir/lib/Dialect/StandardOps/Transforms/Bufferize.cpp
The file was added mlir/lib/Dialect/Bufferization/Transforms/BufferDeallocation.cpp
The file was modified mlir/lib/Transforms/PassDetail.h
The file was added mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Dialect/Bufferization/CMakeLists.txt
The file was removed mlir/include/mlir/Transforms/Bufferize.h
The file was removed mlir/lib/Transforms/BufferDeallocation.cpp
The file was modified mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
The file was modified mlir/include/mlir/Dialect/Tensor/Transforms/Passes.h
The file was removed mlir/lib/Transforms/Bufferize.cpp
The file was modified mlir/lib/Dialect/Tensor/Transforms/Bufferize.cpp
The file was added mlir/include/mlir/Dialect/Bufferization/Transforms/CMakeLists.txt
The file was modified mlir/include/mlir/Dialect/StandardOps/Transforms/Passes.h
The file was modified mlir/lib/Dialect/Arithmetic/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Dialect/Arithmetic/Transforms/Passes.h
The file was modified mlir/lib/Dialect/StandardOps/Transforms/CMakeLists.txt
The file was modified utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
The file was modified mlir/lib/Dialect/Bufferization/CMakeLists.txt
The file was added mlir/lib/Dialect/Bufferization/Transforms/CMakeLists.txt
The file was modified mlir/lib/Dialect/SCF/Transforms/Bufferize.cpp
The file was modified mlir/lib/Dialect/Shape/Transforms/Bufferize.cpp
The file was added mlir/test/Dialect/Bufferization/Transforms/finalizing-bufferize.mlir
The file was added mlir/lib/Dialect/Bufferization/Transforms/PassDetail.h
The file was added mlir/test/Dialect/Bufferization/Transforms/buffer-deallocation.mlir
The file was modified mlir/lib/Dialect/Math/Transforms/PolynomialApproximation.cpp
The file was added mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.h
The file was modified mlir/include/mlir/Transforms/Passes.td
The file was removed mlir/test/Transforms/buffer-deallocation.mlir
The file was added mlir/include/mlir/Dialect/Bufferization/Transforms/Bufferize.h
The file was modified mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
The file was modified mlir/include/mlir/Transforms/Passes.h
The file was modified mlir/lib/Dialect/Linalg/Transforms/Promotion.cpp
The file was modified mlir/lib/Dialect/Tensor/Transforms/CMakeLists.txt
The file was modified mlir/lib/Dialect/Shape/Transforms/CMakeLists.txt
The file was modified mlir/lib/Dialect/StandardOps/Transforms/TensorConstantBufferize.cpp
The file was modified mlir/lib/Dialect/Arithmetic/Transforms/CMakeLists.txt
The file was modified mlir/lib/Dialect/StandardOps/Transforms/ExpandOps.cpp
The file was removed mlir/test/Transforms/finalizing-bufferize.mlir
The file was modified mlir/lib/Dialect/StandardOps/Transforms/FuncBufferize.cpp
The file was added mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
The file was modified mlir/lib/Dialect/Arithmetic/Transforms/ExpandOps.cpp
The file was modified mlir/lib/Transforms/CMakeLists.txt
The file was modified mlir/lib/Dialect/SCF/Transforms/CMakeLists.txt
Commit 3356d8837e46a92446e4b9b0cbd6967e5f4e44ba by kadircet
[clangd] Make std symbol generation script python3 friendly

Differential Revision: https://reviews.llvm.org/D114723
The file was modified clang-tools-extra/clangd/include-mapping/gen_std.py
Commit ae1ea0bead75f4c7a4c965dfa40b5f3b78b60364 by julian.gross
[mlir] Decompose Bufferization Clone operation into Memref Alloc and Copy.

This patch introduces a new conversion that converts bufferization.clone operations
into a memref.alloc and a memref.copy operation. This transformation is needed to
transform all remaining clones that "survive" all previous transformations, before
a given program is lowered further (e.g. to LLVM). Otherwise, these operations
can no longer be handled and lead to compile errors.
See: https://llvm.discourse.group/t/bufferization-error-related-to-memref-clone/4665

Differential Revision: https://reviews.llvm.org/D114233
The file was added mlir/lib/Conversion/BufferizationToMemRef/CMakeLists.txt
The file was modified mlir/lib/Conversion/CMakeLists.txt
The file was modified mlir/include/mlir/Conversion/Passes.h
The file was modified mlir/include/mlir/Conversion/Passes.td
The file was added mlir/test/Conversion/BufferizationToMemRef/bufferization-to-memref.mlir
The file was added mlir/lib/Conversion/BufferizationToMemRef/BufferizationToMemRef.cpp
The file was added mlir/include/mlir/Conversion/BufferizationToMemRef/BufferizationToMemRef.h
Commit 5cff77c23f43130887b566dd0fe237e1c482e23b by zeno
[clang][ARM] PACBTI-M assembly support

Introduce assembly support for Armv8.1-M PACBTI extension. This is an optional
extension in v8.1-M.

There are 10 new system registers and 5 new instructions, all predicated on the
feature.

The attribute for llvm-mc is called "pacbti". For armclang, an architecture
extension also called "pacbti" was created.

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Victor Campos
- Ties Stuij

Reviewed By: labrinea

Differential Revision: https://reviews.llvm.org/D112420
The file was modified llvm/lib/Target/ARM/ARMSystemRegister.td
The file was modified llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
The file was modified llvm/lib/Target/ARM/ARMSubtarget.h
The file was modified llvm/include/llvm/Support/ARMTargetParser.def
The file was added llvm/test/MC/Disassembler/ARM/armv8.1m-pacbti.txt
The file was added llvm/test/MC/ARM/armv8.1m-pacbti.s
The file was modified llvm/lib/Target/ARM/ARMRegisterInfo.td
The file was modified llvm/include/llvm/Support/ARMTargetParser.h
The file was modified llvm/lib/Target/ARM/ARMPredicates.td
The file was modified llvm/lib/Target/ARM/ARMInstrThumb2.td
The file was modified llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp
The file was added llvm/test/MC/ARM/armv8.1m-pacbti-error.s
The file was added llvm/test/MC/ARM/implicit-it-generation-v8.s
The file was modified llvm/lib/Target/ARM/ARM.td
The file was modified clang/test/Driver/armv8.1m.main.c
The file was modified llvm/test/CodeGen/Thumb/high-reg-clobber.mir