Changes from Git (git http://labmaster3.local/git/llvm-project.git)


  1. [clang-tidy] Warn on functional C-style casts (details)
  2. [ARM] create new pseudo t2LDRLIT_ga_pcrel for stack guards (details)
  3. [X86][LoopVectorize] "Fix" `X86TTIImpl::getAddressComputationCost()` (details)
  4. [llvm-profgen] Compute and show profile density (details)
  5. [PR52549][clang-cl] Predefine _MSVC_EXECUTION_CHARACTER_SET (details)
Commit 5bbe50148f3b515c170be22209395b72890f5b8c by carlosgalvezp
[clang-tidy] Warn on functional C-style casts

The google-readability-casting check is meant to be on par
with cpplint's readability/casting check, according to the
documentation. However it currently does not diagnose
functional casts, like:

float x = 1.5F;
int y = int(x);

This is detected by cpplint, however, and the guidelines
are clear that such a cast is only allowed when the type
is a class type (constructor call):

> You may use cast formats like `T(x)` only when `T` is a class type.

Therefore, update the clang-tidy check to check this

Differential Revision:
The file was modifiedclang-tools-extra/clang-tidy/google/AvoidCStyleCastsCheck.cpp
The file was modifiedclang-tools-extra/test/clang-tidy/checkers/google-readability-casting.cpp
The file was modifiedclang-tools-extra/docs/ReleaseNotes.rst
Commit 89453ed6f2059b5cec576fc41914def713fe38f7 by ardb
[ARM] create new pseudo t2LDRLIT_ga_pcrel for stack guards

We can't use the existing pseudo ARM::tLDRLIT_ga_pcrel for loading the
stack guard for PIC code that references the GOT, since arm-pseudo may
expand this to the narrow tLDRpci rather than the wider t2LDRpci.

Create a new pseudo, t2LDRLIT_ga_pcrel, and expand it to t2LDRpci.


Reviewed By: ardb

Differential Revision:
The file was modifiedllvm/lib/Target/ARM/Thumb2InstrInfo.cpp
The file was modifiedllvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
The file was addedllvm/test/CodeGen/ARM/expand-pseudos.ll
The file was modifiedllvm/lib/Target/ARM/
The file was modifiedllvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
Commit 8cd782487fe68082e57d24a576b77f529d77f96c by lebedev.ri
[X86][LoopVectorize] "Fix" `X86TTIImpl::getAddressComputationCost()`

We ask `TTI.getAddressComputationCost()` about the cost of computing vector address,
and then multiply it by the vector width. This doesn't make any sense,
it implies that we'd do a vector GEP and then scalarize the vector of pointers,
but there is no such thing in the vectorized IR, we perform scalar GEP's.

This is *especially* bad on X86, and was effectively prohibiting any scalarized
vectorization of gathers/scatters, because `X86TTIImpl::getAddressComputationCost()`
says that cost of vector address computation is `10` as compared to `1` for scalar.

The computed costs are similar to the ones with D111222+D111220,
but we end up without masked memory intrinsics that we'd then have to
expand later on, without much luck. (D111363)

Differential Revision:
The file was modifiedllvm/test/Analysis/CostModel/X86/scatter-i64-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/gather-i8-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/masked-scatter-i32-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/masked-interleaved-store-i16.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/gather-i64-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/interleaved-load-i16-stride-5.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/masked-interleaved-load-i16.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/gather-i32-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/scatter-i32-with-i8-index.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/masked-scatter-i64-with-i8-index.ll
The file was modifiedllvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/gather-i16-with-i8-index.ll
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp
The file was modifiedllvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
The file was modifiedllvm/test/Analysis/CostModel/X86/scatter-i8-with-i8-index.ll
The file was modifiedllvm/lib/Target/X86/X86TargetTransformInfo.cpp
The file was modifiedllvm/test/Analysis/CostModel/X86/scatter-i16-with-i8-index.ll
Commit c2e08aba1afd5a69dbe74b03ce6f463d45102222 by wlei
[llvm-profgen] Compute and show profile density

AutoFDO performance is sensitive to profile density, i.e., the amount of samples in the profile relative to the program size, because profiles with insufficient samples could be inaccurate due to statistical noise and thus hurt AutoFDO performance. A previous investigation showed that AutoFDO performed better on MySQL with increased amount of samples. Therefore, we implement a profile-density computation feature to give hints about profile density to users and the compiler.

We define the density of a profile Prof as follows:

- For each function A in the profile, density(A) = total_samples(A) / sizeof(A).
- density(Prof) = min(density(A)) for all functions A that are warm (defined below).

A function is considered warm if its total-samples is within top N percent of the profile. For implementation, we reuse the `ProfileSummaryBuilder::getHotCountThreshold(..)` as threshold which can be set by percent(`--profile-summary-cutoff-hot`) or by value(`--profile-summary-hot-count`).

We also introduce `--hot-function-density-threshold` to set hot function density threshold and will give suggestion if profile density is below it which implies we should increase samples.

This also applies for CS profile with all profiles merged into base.

Reviewed By: hoy, wenlei

Differential Revision:
The file was addedllvm/test/tools/llvm-profgen/Inputs/
The file was addedllvm/test/tools/llvm-profgen/Inputs/
The file was modifiedllvm/tools/llvm-profgen/ProfileGenerator.cpp
The file was modifiedllvm/tools/llvm-profgen/ProfileGenerator.h
The file was modifiedllvm/tools/llvm-profgen/ProfiledBinary.h
The file was addedllvm/test/tools/llvm-profgen/profile-density.test
Commit 7ba70d32736aef0c640b9d0e7b9081fc208c81c2 by markus.boeck02
[PR52549][clang-cl] Predefine _MSVC_EXECUTION_CHARACTER_SET

Since VS 2022 17.1 MSVC predefines _MSVC_EXECUTION_CHARACTER_SET to inform the users of the execution character set defined at compile time. The value the macro expands to is a Windows Code Page Identifier which are documented here:

As clang currently only supports UTF-8 it is defined as 65001. If clang-cl were to support a different execution character set in the future we'd have to change the value.


Differential Revision:
The file was modifiedclang/test/Preprocessor/init.c
The file was modifiedclang/lib/Basic/Targets/OSTargets.cpp