Changes from Git (git http://labmaster3.local/git/llvm-project.git)


  1. [SCEV] Simplify backedge count clearing (NFC)
  2. [Verifier] Slightly refactor code to reduce duplication, NFC.
  3. Fix type printing of array template args
  4. [X86] AMD Zen 3 Scheduler Model
  5. Microoptimize dominance a bit - NFC.
  6. [RISCV] Add missing frontend tests for vcompress intrinsics.
  7. [lldb] [Process/FreeBSD] Fix arm64 build after RegisterInfoPOSIX_arm64 changes
  8. [CVP] Add tests for mask not equal zero guard (NFC)
  9. [LVI] Handle mask not equal zero conditions
  10. [X32][CET] Fix size and alignment of section
  11. [Cuda] Internalize a struct and a global variable
  12. [HIP] Fix device lib selection
  13. [InstCombine] Precommit tests for D101423 (NFC)
  14. [InstCombine] Fold overflow bit of [u|s]mul.with.overflow in a poison-safe way
  15. [gn build] Port 1977c53b2ae4
Commit cc58e8918b70d5698ec06c0b6e4c6e4c27971870 by nikita.ppv
[SCEV] Simplify backedge count clearing (NFC)

This seems to be a leftover from when the BackedgeTakenInfo
stored multiple exit counts with manual memory management. At
some point this was switched to a simple vector, and there should
be no need to micro-manage the clearing anymore. We can simply
drop the loop from the map and let the destructor do its job.
The file was modified llvm/include/llvm/Analysis/ScalarEvolution.h
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
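The idea in the [SCEV] message above, that erasing a map entry is enough once the exit counts live in a plain vector, can be sketched as follows. This is a minimal illustration and not the ScalarEvolution code: `ExitCount`, `BackedgeInfo`, `LoopId` and `forgetLoop` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the real analysis types.
struct ExitCount {
  int ExitingBlockId;
  long Count;
};

struct BackedgeInfo {
  // The vector owns its elements, so no manual cleanup is required.
  std::vector<ExitCount> Exits;
};

using LoopId = int;

// Drops the cached info for loop L. erase() destroys the mapped BackedgeInfo,
// whose implicit destructor releases the vector storage.
// Returns the number of entries removed (0 or 1).
std::size_t forgetLoop(std::unordered_map<LoopId, BackedgeInfo> &Cache,
                       LoopId L) {
  std::size_t Removed = Cache.erase(L);
  assert(Cache.count(L) == 0 && "entry should be gone after erase");
  return Removed;
}
```

Calling `forgetLoop(Cache, L)` twice for the same loop removes the entry the first time and is a no-op the second time; there is no separate "clear" step to keep in sync with the container.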
Commit be8ad4e98e1fd26057471a98d6404bfdb00235cd by clattner
[Verifier] Slightly refactor code to reduce duplication, NFC.
The file was modified mlir/lib/IR/Verifier.cpp
Commit 8518742104ab075296722ef6151f65aee7a0646d by v.g.vassilev
Fix type printing of array template args

The code example:
constexpr const char kEta[] = "Eta";
template <const char*, typename T> class Column {};
using quick = Column<kEta,double>;

void lookup() {
  quick c1;
  c1.ls();
}

emits error: no member named 'ls' in 'Column<&kEta, double>'. The patch fixes
the printed type name by not printing the ampersand for array types.

Differential Revision:
The file was modified clang/lib/AST/TemplateBase.cpp
The file was modified clang/test/SemaTemplate/temp_arg_nontype_cxx11.cpp
The file was modified clang/test/CodeGenCXX/debug-info-codeview-display-name.cpp
Commit 2b93c9c16c586c26d20a5166c6ffbd71bc85b2e6 by lebedev.ri
[X86] AMD Zen 3 Scheduler Model

Introduce a basic schedule model for AMD Zen 3 CPUs, a.k.a. `znver3`.

This is fully built from scratch, from llvm-mca measurements
and documented reference materials.
Nothing was copied from `znver2`/`znver1`.

I believe this is in a reasonable state of completion for inclusion,
probably better than D52779 `bdver2` was :)

* uops are pretty spot-on (at least what llvm-mca can measure)
* latency is also pretty spot-on (at least what llvm-mca can measure)
* throughput is within reason

I haven't run many benchmarks with this,
however RawSpeed benchmarks say this is beneficial:

I'll call out the obvious problems there:
* I didn't really bother with X87 instructions
* I didn't really bother with obviously-microcoded/system instructions
* There is a large discrepancy in throughput for `mr` and `rm` instructions.
  I'm not really sure whether it's a modelling defect that needs to be fixed
  or a defect of the measurements.
* Pipe distributions are probably bad :)
  I can't do much here until AMD allows that to be fixed
  by documenting the appropriate counters and updating libpfm

That being said, as @RKSimon notes:
>>! In D94395#2647381, @RKSimon wrote:
> I'll mention again that all the znver* models appear to be very inaccurate wrt SIMD/FPU instructions <...>
so how much worse could this possibly be?!

Things that aren't there:
* Various tunings: zero idioms, etc. Those are follow-ups.

Differential Revision:
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-7.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-clzero.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse42.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-6.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-3.s
The file was modified llvm/test/tools/llvm-mca/X86/register-file-statistics.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-x86_32.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-fsgsbase.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-fma.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse1.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-avx1.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sha.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-cmpxchg.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-mmx.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-popcnt.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse4a.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-f16c.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-lea.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-ssse3.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-rdrand.s
The file was modified llvm/lib/Target/X86/
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse2.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse41.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-avx2.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-cmov.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-rdseed.s
The file was modified llvm/test/CodeGen/X86/slow-unaligned-mem.ll
The file was modified llvm/test/CodeGen/X86/x86-64-double-shifts-var.ll
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-2.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-prefetchw.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-aes.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-clflushopt.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-lzcnt.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-bmi1.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-pclmul.s
The file was modified llvm/test/tools/llvm-mca/X86/scheduler-queue-usage.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-x87.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-sse3.s
The file was added llvm/lib/Target/X86/
The file was modified llvm/test/tools/llvm-mca/X86/in-order-cpu.s
The file was modified llvm/test/tools/llvm-mca/X86/read-after-ld-1.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-adx.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-4.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-movbe.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-x86_64.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/partial-reg-update-5.s
The file was modified llvm/test/tools/llvm-mca/X86/cpus.s
The file was modified llvm/lib/Target/X86/
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-mwaitx.s
The file was added llvm/test/tools/llvm-mca/X86/Znver3/resources-bmi2.s
Commit a4c8952e6d4c67cca8387e6951e41c1bd4d5960e by clattner
Microoptimize dominance a bit - NFC.

Don't get RegionKindInterface if we won't use it. Noticed by inspection.
The file was modified mlir/lib/IR/Dominance.cpp
Commit f36e6e16a86eceaab39b0ac38517feb04775e0d4 by craig.topper
[RISCV] Add missing frontend tests for vcompress intrinsics.
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics/vcompress.c
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vcompress.c
Commit db457e64794ccbc248236c10333db9e13c082a78 by mgorny
[lldb] [Process/FreeBSD] Fix arm64 build after RegisterInfoPOSIX_arm64 changes

Commit 88a5b35d63f927db69ec953ff487a7ba2504a610 changed the API
of RegisterInfoPOSIX_arm64 and effectively broke the FreeBSD plugin.
Update it to work with the new API.

Differential Revision:
The file was modified lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm64.cpp
Commit 7aafd104bfb8b2f676d9cb56dc176118260e5114 by nikita.ppv
[CVP] Add tests for mask not equal zero guard (NFC)
The file was modified llvm/test/Transforms/CorrelatedValuePropagation/icmp.ll
Commit db9d00c5e7b02b5fe77cb8f3c7c40405c5f1222a by nikita.ppv
[LVI] Handle mask not equal zero conditions

If V & Mask != 0, we know that at least one of the bits in Mask
must be set, so the value must be >= the lowest bit in Mask.
The file was modified llvm/lib/Analysis/LazyValueInfo.cpp
The file was modified llvm/test/Transforms/CorrelatedValuePropagation/icmp.ll
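The reasoning in the [LVI] message above can be checked exhaustively on small values. The sketch below is an illustration only and does not use LLVM's ConstantRange or LazyValueInfo APIs; the helper names `lowerBoundFromMaskGuard` and `maskGuardImpliesLowerBound` are hypothetical, and the comparison is assumed to be unsigned.

```cpp
#include <cassert>
#include <cstdint>

// Lower bound on V implied by the guard (V & Mask) != 0: the lowest set bit
// of Mask. The guard can never hold for Mask == 0, so that case is rejected.
std::uint32_t lowerBoundFromMaskGuard(std::uint32_t Mask) {
  assert(Mask != 0 && "guard (V & Mask) != 0 can never hold for Mask == 0");
  // Two's-complement trick: Mask & -Mask isolates the lowest set bit.
  return Mask & (~Mask + 1u);
}

// Checks every 8-bit V against every non-zero 8-bit Mask: whenever the guard
// holds, V must be >= the lowest set bit of Mask (unsigned comparison).
bool maskGuardImpliesLowerBound() {
  for (std::uint32_t Mask = 1; Mask < 256; ++Mask)
    for (std::uint32_t V = 0; V < 256; ++V)
      if ((V & Mask) != 0 && V < lowerBoundFromMaskGuard(Mask))
        return false;
  return true;
}
```

For example, with `Mask = 12` (binary 1100) the bound is 4: any value that shares a bit with the mask has bit 2 or bit 3 set and is therefore at least 4.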
Commit f30500632b299a8f8f8a53f06efb1038eb7fa48d by harald
[X32][CET] Fix size and alignment of section

X32 uses 32-bit ELF object files with 32-bit alignment, so the section needs to be emitted as it is for X86.

Reviewed By: MaskRay

Differential Revision:
The file was modified llvm/test/CodeGen/X86/note-cet-property.ll
The file was modified llvm/lib/Target/X86/X86AsmPrinter.cpp
Commit 1fcf9247de05bfd960a35c691b9cc47b6a94cd2b by i
[Cuda] Internalize a struct and a global variable
The file was modified clang/lib/Basic/Cuda.cpp
Commit c58a6a6fb4110ee1ffd0e45ad98872e55855b310 by Yaxun.Liu
[HIP] Fix device lib selection

Choose optimized device lib bitcode by fp options
for performance.

Reviewed by: Artem Belevich, Fangrui Song

Differential Revision:
The file was modified clang/include/clang/Driver/
The file was modified clang/lib/CodeGen/CGExprScalar.cpp
The file was added clang/test/CodeGenCUDA/
The file was modified clang/include/clang/Basic/CodeGenOptions.def
The file was modified clang/test/Driver/hip-device-libs.hip
The file was modified clang/lib/Driver/ToolChains/HIP.cpp
Commit 603ae6082bcb9e7b8ae9f288c007488f144d9d22 by aqjune
[InstCombine] Precommit tests for D101423 (NFC)
The file was added llvm/test/Transforms/InstCombine/div-by-0-guard-before-smul_ov-not.ll
The file was added llvm/test/Transforms/InstCombine/div-by-0-guard-before-smul_ov.ll
The file was added llvm/test/Transforms/InstCombine/div-by-0-guard-before-umul_ov.ll
The file was added llvm/test/Transforms/InstCombine/div-by-0-guard-before-umul_ov-not.ll
Commit 1977c53b2ae425541a0ef329ca10cc8d5cacd0cd by aqjune
[InstCombine] Fold overflow bit of [u|s]mul.with.overflow in a poison-safe way

As discussed in D101191, this patch adds a poison-safe folding of overflow bit check:
  %Op0 = icmp ne i4 %X, 0
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = select i1 %Op0, i1 %Op1, i1 false
=>
  %Y.fr = freeze %Y
  %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr)
  %Op1 = extractvalue { i4, i1 } %Agg, 1
  %ret = %Op1

Note that there are cases where inserting freeze is not necessary, e.g. when %Y is `noundef`.
In this case LLVM already does the right thing, because `%ret` is folded into `and`,
which triggers the pre-existing optimization in InstSimplify:

Differential Revision:
The file was added llvm/include/llvm/Analysis/OverflowInstAnalysis.h
The file was modified llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
The file was added llvm/lib/Analysis/OverflowInstAnalysis.cpp
The file was modified llvm/test/Transforms/InstCombine/div-by-0-guard-before-smul_ov.ll
The file was modified llvm/test/Transforms/InstCombine/div-by-0-guard-before-smul_ov-not.ll
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified llvm/test/Transforms/InstCombine/div-by-0-guard-before-umul_ov-not.ll
The file was modified llvm/lib/Analysis/CMakeLists.txt
The file was modified llvm/test/Transforms/InstCombine/div-by-0-guard-before-umul_ov.ll
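For fully defined inputs, the fold in the [InstCombine] message above rests on a simple fact: a multiplication by zero never overflows, so the `%X != 0` guard does not change the overflow bit. The sketch below checks this exhaustively for 4-bit operands. It is an illustration only: the function names are hypothetical, and plain C++ integers cannot model poison or undef, which is the part the `freeze` in the commit addresses (in the original select form a poison `%Y` is ignored when `%X` is zero, while the folded form would propagate it).

```cpp
#include <cassert>

// Overflow bit of a 4-bit unsigned multiply: set when the exact product
// does not fit in [0, 15].
bool umulOverflowsI4(int X, int Y) {
  assert(X >= 0 && X <= 15 && Y >= 0 && Y <= 15 && "operands must be u4");
  return X * Y > 15;
}

// Overflow bit of a 4-bit signed multiply: set when the exact product
// does not fit in [-8, 7].
bool smulOverflowsI4(int X, int Y) {
  assert(X >= -8 && X <= 7 && Y >= -8 && Y <= 7 && "operands must be s4");
  int Product = X * Y;
  return Product < -8 || Product > 7;
}

// Checks, for all defined 4-bit inputs, that
//   select(X != 0, Overflow(X, Y), false) == Overflow(X, Y)
// holds for both the unsigned and the signed variant.
bool guardIsRedundantForAllI4() {
  for (int X = 0; X <= 15; ++X)
    for (int Y = 0; Y <= 15; ++Y)
      if (((X != 0) && umulOverflowsI4(X, Y)) != umulOverflowsI4(X, Y))
        return false;
  for (int X = -8; X <= 7; ++X)
    for (int Y = -8; Y <= 7; ++Y)
      if (((X != 0) && smulOverflowsI4(X, Y)) != smulOverflowsI4(X, Y))
        return false;
  return true;
}
```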
Commit ff7f27fe67dbfb7b2c0788c0603fde2b245d1b40 by llvmgnsyncbot
[gn build] Port 1977c53b2ae4
The file was modified llvm/utils/gn/secondary/llvm/lib/Analysis/