Success

Changes

Summary

  1. Revert "[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies." (details)
  2. [SCEV] If Start>=RHS, simplify (Start smin RHS) = RHS for trip counts. (details)
  3. [MSAN] Instrument freeze instruction by clearing shadow (details)
  4. [Utils] Add noundef attribute to vim/emacs/vscode syntax scripts (details)
  5. [llvm] Add a parser from JSON to TensorSpec (details)
  6. [mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies. (details)
  7. [mlir][DialectConversion] Add support for mergeBlocks in ConversionPatternRewriter. (details)
  8. [mlir][DialectConversion] Remove usage of std::distance to track position. (details)
  9. [X86] Use h-register for final XOR of __builtin_parity on 64-bit targets. (details)
  10. [PGO] Change a `NumVSites == 0` workaround to assert (details)
  11. [FPEnv] IRBuilder fails to add strictfp attribute (details)
  12. [NewPM][LoopVersioning] Port LoopVersioning to NPM (details)
  13. [X86][SSE] Shuffle combine blends to OR(X,Y) if the relevant elements are known zero. (details)
  14. [X86] Make ENDBR instruction a scheduling boundary (details)
  15. [compiler-rt][profile] Fix various InstrProf tests on Solaris (details)
  16. [PGO] Extend the value profile buckets for mem op sizes. (details)
  17. [gn build] Port f78f509c758 (details)
  18. [ArgPromotion] Replace all md uses of promoted values with undef. (details)
  19. [X86] support .nops directive (details)
  20. Fix layering violation Transforms/Utils -> Scalar (details)
  21. [InstSimplify] add tests for min-of-max variants; NFC (details)
  22. [InstSimplify] fold variations of max-of-min with common operand (details)
  23. [flang] Fix bug detecting intrinsic function (details)
  24. [PGO] Enable the extended value profile buckets for mem op sizes. (details)
  25. [llvm-jitlink] Add support for static archives and MachO universal archives. (details)
  26. [AArch64] Add missing isel patterns for fcvtzs/u intrinsic on v1f64. (details)
  27. Fix typo: s/epomymous/eponymous/ NFC (details)
  28. Allow .dSYM's to be directly placed in an alternate directory (details)
  29. [CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as notail on x86-64 (details)
  30. [MC] Set sh_link to 0 if the associated symbol is undefined (details)
  31. [ARM] Test for converting VPSEL to VMOVT. NFC (details)
  32. Revert "[X86][SSE] Shuffle combine blends to OR(X,Y) if the relevant elements are known zero." (details)
  33. [WebAssembly] Implement prototype v128.load{32,64}_zero instructions (details)
  34. [ARM] Convert VPSEL to VMOV in tail predicated loops (details)
  35. [HWASan] [GlobalISel] Add +tagged-globals backend feature for GlobalISel (details)
  36. [mlir][OpFormatGen] Add support for eliding UnitAttr when used to anchor an optional group (details)
  37. [MemorySSA] Restrict optimizations after a PhiTranslation. (details)
Commit 7ba82a7320df82d07d3d5679bce89b14526b536c by joker.eph
Revert "[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies."

This reverts commit 35b65be041127db9fe23d3128a004c888893cbae.

Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined
references like:

VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'
The file was modified mlir/include/mlir/Dialect/Vector/VectorTransforms.h (diff)
The file was modified mlir/test/lib/Transforms/TestVectorTransforms.cpp (diff)
The file was removed mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
The file was modified mlir/include/mlir/Interfaces/VectorInterfaces.td (diff)
The file was modified mlir/lib/Dialect/Vector/VectorTransforms.cpp (diff)
Commit ee1c12708a4519361729205168dedb2b61bc2638 by flo
[SCEV] If Start>=RHS, simplify (Start smin RHS) = RHS for trip counts.

In some cases, it seems like we can get rid of unnecessary s/umins by
using information from the loop guards (unless I am missing something).

One place where this seems to be helpful in practice is when computing
loop trip counts. This patch just changes howManyGreaterThans for now.
Note that this requires a loop for which we can check 'is guarded'.
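
As a rough illustration of the simplification (a sketch, not the ScalarEvolution code; the function names and the rounded-up division are assumptions made for this example), consider a count-down loop `for (i = Start; i > RHS; i -= Stride)`. The generic count clamps the end value with a signed min; once a guard establishes Start >= RHS, the clamp is redundant:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Generic form: clamp the end value with smin so that Start - End is never
// negative, then divide by the stride, rounding up.
int64_t tripCountGeneric(int64_t start, int64_t rhs, int64_t stride) {
  assert(stride > 0);
  int64_t end = std::min(start, rhs);  // (Start smin RHS)
  return (start - end + stride - 1) / stride;
}

// Guarded form: only valid when the loop guard proves start >= rhs, in which
// case (Start smin RHS) is simply RHS and the min can be dropped.
int64_t tripCountGuarded(int64_t start, int64_t rhs, int64_t stride) {
  assert(stride > 0 && start >= rhs);
  return (start - rhs + stride - 1) / stride;
}
```

Both functions agree whenever the guard holds; only the generic one is safe to call when Start < RHS.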

On SPEC2000/SPEC2006/MultiSource, there are some notable changes for
some programs in the number of loops unrolled and trip counts computed.

```
Same hash: 179 (filtered out)
Remaining: 58
Metric: scalar-evolution.NumTripCountsComputed

Program                                        base    patch   diff
test-suite...langs-C/compiler/compiler.test    25.00   31.00  24.0%
test-suite.../Applications/SPASS/SPASS.test   2020.00 2323.00 15.0%
test-suite...langs-C/allroots/allroots.test    29.00   32.00  10.3%
test-suite.../Prolangs-C/loader/loader.test    17.00   18.00   5.9%
test-suite...fice-ispell/office-ispell.test   253.00  265.00   4.7%
test-suite...006/450.soplex/450.soplex.test   3552.00 3692.00  3.9%
test-suite...chmarks/MallocBench/gs/gs.test   453.00  470.00   3.8%
test-suite...ngs-C/assembler/assembler.test    29.00   30.00   3.4%
test-suite.../Benchmarks/Ptrdist/bc/bc.test   263.00  270.00   2.7%
test-suite...rks/FreeBench/pifft/pifft.test   722.00  741.00   2.6%
test-suite...count/automotive-bitcount.test    41.00   42.00   2.4%
test-suite...0/253.perlbmk/253.perlbmk.test   1417.00 1451.00  2.4%
test-suite...000/197.parser/197.parser.test   387.00  396.00   2.3%
test-suite...lications/sqlite3/sqlite3.test   1168.00 1189.00  1.8%
test-suite...000/255.vortex/255.vortex.test   173.00  176.00   1.7%

Metric: loop-unroll.NumUnrolled

Program                                        base   patch  diff
test-suite...langs-C/compiler/compiler.test     1.00   3.00 200.0%
test-suite.../Applications/SPASS/SPASS.test   134.00 234.00 74.6%
test-suite...count/automotive-bitcount.test     3.00   4.00 33.3%
test-suite.../Prolangs-C/loader/loader.test     3.00   4.00 33.3%
test-suite...langs-C/allroots/allroots.test     3.00   4.00 33.3%
test-suite...Source/Benchmarks/sim/sim.test    10.00  12.00 20.0%
test-suite...fice-ispell/office-ispell.test    21.00  25.00 19.0%
test-suite.../Benchmarks/Ptrdist/bc/bc.test    32.00  38.00 18.8%
test-suite...006/450.soplex/450.soplex.test   300.00 352.00 17.3%
test-suite...rks/FreeBench/pifft/pifft.test    60.00  69.00 15.0%
test-suite...chmarks/MallocBench/gs/gs.test    57.00  63.00 10.5%
test-suite...ngs-C/assembler/assembler.test    10.00  11.00 10.0%
test-suite...0/253.perlbmk/253.perlbmk.test   145.00 157.00  8.3%
test-suite...000/197.parser/197.parser.test    43.00  46.00  7.0%
test-suite...TimberWolfMC/timberwolfmc.test   205.00 214.00  4.4%
Geomean difference                                           7.6%
```

Fixes https://bugs.llvm.org/show_bug.cgi?id=46939
Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D85046
The file was modified llvm/test/Transforms/HardwareLoops/scalar-while.ll (diff)
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp (diff)
The file was modified llvm/test/Analysis/ScalarEvolution/pr46939-trip-count-count-down.ll (diff)
Commit 3ebd1ba64f3d6f1e75f43213c50f0d1bd3902228 by guiand
[MSAN] Instrument freeze instruction by clearing shadow

Freeze always returns a defined value. This also prevents msan from
checking the input shadow, which happened because freeze wasn't
explicitly visited.
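
The effect can be pictured with a toy shadow model (a sketch only; `Shadowed`, `addShadowed`, and `freezeShadowed` are hypothetical names, and the OR-based propagation is a simplification, not MemorySanitizer's actual instrumentation):

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a set shadow bit means the matching value bit is uninitialized.
struct Shadowed {
  uint32_t value;
  uint32_t shadow;
};

// Ordinary operations propagate shadow; here a conservative OR of both inputs.
Shadowed addShadowed(Shadowed a, Shadowed b) {
  return {a.value + b.value, a.shadow | b.shadow};
}

// freeze always yields a defined value, so the result shadow is cleared and
// the input shadow is neither checked nor propagated.
Shadowed freezeShadowed(Shadowed a) {
  return {a.value, 0u};
}
```

In this model, anything computed from a frozen value starts from a clean shadow, even if the frozen input was fully uninitialized.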

Differential Revision: https://reviews.llvm.org/D85040
The file was modified llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (diff)
The file was added llvm/test/Instrumentation/MemorySanitizer/freeze.ll
Commit caf002c7be44cb6c54de5a1b19aa177f18b6b0c1 by guiand
[Utils] Add noundef attribute to vim/emacs/vscode syntax scripts

Differential Revision: https://reviews.llvm.org/D84553
The file was modified llvm/utils/emacs/llvm-mode.el (diff)
The file was modified llvm/utils/vim/syntax/llvm.vim (diff)
The file was modified llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml (diff)
Commit 4b1b109c5126efc963cc19949df5201e40f1bcc1 by mtrofin
[llvm] Add a parser from JSON to TensorSpec

A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.

Differential Revision: https://reviews.llvm.org/D84976
The file was modified llvm/lib/Analysis/TFUtils.cpp (diff)
The file was modified llvm/include/llvm/Analysis/Utils/TFUtils.h (diff)
The file was modified llvm/unittests/Analysis/TFUtilsTest.cpp (diff)
Commit d313e9c12ed3541f63a36e3b0d59e9e1185603d2 by ntv
[mlir][Vector] Add transformation + pattern to split vector.transfer_read into full and partial copies.

This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad` into a pattern resembling:

```
   %1:3 = scf.if (%inBounds) {
      scf.yield %view : memref<A...>, index, index
    } else {
      %2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
      %3 = vector.type_cast %extra_alloc : memref<...> to memref<vector<...>>
      store %2, %3[] : memref<vector<...>>
      %4 = memref_cast %extra_alloc: memref<B...> to memref<A...>
      scf.yield %4 : memref<A...>, index, index
   }
   %res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a buffer of one vector, alloca'ed at the top of the function.

This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
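
A scalar analogue of the split may help picture it. This is a sketch under assumptions, not the MLIR pattern itself: `kVectorSize`, `readFullTile`, and the use of `std::vector<int>` as the "view" are hypothetical stand-ins.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kVectorSize = 4;  // hypothetical vector width

// Reads kVectorSize elements starting at `offset`, padding with `pad` past
// the end of `view`. The final read is always a full, unmasked one.
std::array<int, kVectorSize> readFullTile(const std::vector<int>& view,
                                          std::size_t offset, int pad) {
  std::array<int, kVectorSize> extraAlloc;  // plays the role of %extra_alloc
  const int* src = nullptr;
  if (offset + kVectorSize <= view.size()) {
    // In bounds: read straight from the original buffer, no copy.
    src = view.data() + offset;
  } else {
    // Boundary tile: build a padded copy in the scratch buffer.
    extraAlloc.fill(pad);
    for (std::size_t i = 0; i < kVectorSize; ++i)
      if (offset + i < view.size())
        extraAlloc[i] = view[offset + i];
    src = extraAlloc.data();
  }
  std::array<int, kVectorSize> result;
  std::copy_n(src, kVectorSize, result.begin());  // the unmasked full read
  return result;
}
```

Only the `else` branch pays for the padded copy, which mirrors the note that the extra work is confined to boundary tiles.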

Differential Revision: https://reviews.llvm.org/D84631
The file was modified mlir/include/mlir/Interfaces/VectorInterfaces.td (diff)
The file was modified mlir/include/mlir/Dialect/Vector/VectorTransforms.h (diff)
The file was modified mlir/lib/Dialect/Vector/VectorTransforms.cpp (diff)
The file was added mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
The file was modified mlir/lib/Dialect/Vector/CMakeLists.txt (diff)
The file was modified mlir/test/lib/Transforms/TestVectorTransforms.cpp (diff)
Commit e888886cc3daf2c2d6c20cad51cd5ec2ffc24789 by ravishankarm
[mlir][DialectConversion] Add support for mergeBlocks in ConversionPatternRewriter.

Differential Revision: https://reviews.llvm.org/D84795
The file was modified mlir/test/lib/Dialect/Test/TestOps.td (diff)
The file was modified mlir/test/lib/Dialect/Test/TestPatterns.cpp (diff)
The file was modified mlir/lib/Transforms/DialectConversion.cpp (diff)
The file was added mlir/test/Transforms/test-merge-blocks.mlir
Commit 32f3a9a9d68eea7d40a19767b591622b4b737990 by ravishankarm
[mlir][DialectConversion] Remove usage of std::distance to track position.

Remove use of iterator::difference_type to know where to insert a
moved or erased block during undo actions.

Differential Revision: https://reviews.llvm.org/D85066
The file was modified mlir/lib/Transforms/DialectConversion.cpp (diff)
Commit ac82b918c74f3fab8d4a7c1905277bda6b9bccb4 by craig.topper
[X86] Use h-register for final XOR of __builtin_parity on 64-bit targets.

This adds an isel pattern and special XOR8rr_NOREX instruction
to enable the use of h-registers for __builtin_parity. This avoids
a copy and a shift instruction. The NOREX instruction is in case
register allocation doesn't use the matching l-register for some
reason. If a R8-R15 register gets picked instead, we won't be
able to encode the instruction since an h-register can't be used
with a REX prefix.

Fixes PR46954
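
For orientation, parity is typically computed by XOR-folding the value down to one byte. The sketch below (hypothetical function name; the final bit-folding stands in for reading the x86 parity flag, which reflects the low 8 bits of a result) marks the byte XOR that corresponds to the "final XOR" this patch targets:

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `x` has an odd number of set bits.
bool parity32(uint32_t x) {
  x ^= x >> 16;  // fold the upper half into the lower 16 bits
  uint8_t lo = static_cast<uint8_t>(x & 0xff);
  uint8_t hi = static_cast<uint8_t>((x >> 8) & 0xff);
  // Final byte XOR: with an h-register the high byte is used directly,
  // so no separate copy and shift are needed to extract it.
  uint8_t folded = static_cast<uint8_t>(lo ^ hi);
  // Stand-in for the hardware parity flag on the low byte.
  folded ^= folded >> 4;
  folded ^= folded >> 2;
  folded ^= folded >> 1;
  return (folded & 1) != 0;
}
```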
The file was modified llvm/lib/Target/X86/X86InstrCompiler.td (diff)
The file was modified llvm/lib/Target/X86/X86InstrArithmetic.td (diff)
The file was modified llvm/test/CodeGen/X86/parity.ll (diff)
The file was modified llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll (diff)
Commit 317e00dc54c74a2e0fd0c62bdc6a6d68b0d2ca7e by i
[PGO] Change a `NumVSites == 0` workaround to assert

The root cause was fixed by 3d6f53018f845e893ad34f64ff2851a2e5c3ba1d.
The workaround added in 99ad956fdaee5398fdcf46fa49cb433cf52dc461 can be changed
to an assert now. (In case the fix regresses, there will be a heap-use-after-free.)
The file was modified compiler-rt/lib/profile/InstrProfilingValue.c (diff)
Commit d535a91d13b88b547ba24ec50337aa0715d74d4d by kevin.neal
[FPEnv] IRBuilder fails to add strictfp attribute

The strictfp attribute is required on all function calls in a function
that is itself marked with the strictfp attribute. The IRBuilder knows
this and has a method for adding the attribute to function call instructions.

If a function being called has the strictfp attribute itself then the
IRBuilder will refuse to add the attribute to the calling instruction
despite being asked to add it. Eliminate this error.

Differential Revision: https://reviews.llvm.org/D84878
The file was modified llvm/include/llvm/IR/IRBuilder.h (diff)
The file was modified llvm/unittests/IR/IRBuilderTest.cpp (diff)
Commit 7c19c89dd5c532fef533e008fb5911d20992d2ac by aeubanks
[NewPM][LoopVersioning] Port LoopVersioning to NPM

Reviewed By: ychen, fhahn

Differential Revision: https://reviews.llvm.org/D85063
The file was modified llvm/include/llvm/InitializePasses.h (diff)
The file was modified llvm/lib/Transforms/Scalar/Scalar.cpp (diff)
The file was modified llvm/lib/Passes/PassRegistry.def (diff)
The file was modified llvm/lib/Passes/PassBuilder.cpp (diff)
The file was modified llvm/test/Transforms/LoopVersioning/basic.ll (diff)
The file was modified llvm/include/llvm/Transforms/Utils/LoopVersioning.h (diff)
The file was modified llvm/lib/Transforms/Utils/LoopVersioning.cpp (diff)
Commit 219f32f4b68679563443cdaae7b8174c9976409a by llvm-dev
[X86][SSE] Shuffle combine blends to OR(X,Y) if the relevant elements are known zero.

This allows us to remove the (depth violating) code in getFauxShuffleMask where we were combining the OR(SHUFFLE,SHUFFLE) shuffle inputs as well, and not just the OR().

This is a minor step toward being able to shuffle combine from/to SELECT/BLENDV as a faux shuffle.
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp (diff)
The file was modified llvm/test/CodeGen/X86/shuffle-vs-trunc-256.ll (diff)
The file was modified llvm/test/CodeGen/X86/insertelement-ones.ll (diff)
The file was modified llvm/test/CodeGen/X86/vector-shuffle-128-v8.ll (diff)
The file was modified llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll (diff)
Commit f208c659fb76b1ad8ae83dd10c4f0c30986d48ee by craig.topper
[X86] Make ENDBR instruction a scheduling boundary

Instructions should not be scheduled across ENDBR instructions, as this would result in the ENDBR being displaced, breaking the parity needed for the Indirect Branch Tracking feature of CET.

Currently, the X86IndirectBranchTracking pass runs later than instruction scheduling in the pipeline, which makes the bug unnoticeable and very hard (if not infeasible) to trigger while compiling C files with the standard LLVM setup. Yet, for correctness and to prevent issues in future changes, the compiler should prevent such scheduling.

Differential Revision: https://reviews.llvm.org/D84862
The file was modified llvm/lib/Target/X86/X86InstrInfo.cpp (diff)
The file was modified llvm/lib/Target/X86/X86InstrInfo.h (diff)
Commit 39494d9c21bab3281e4af30578af10f37ea09470 by ro
[compiler-rt][profile] Fix various InstrProf tests on Solaris

Currently, several InstrProf tests `FAIL` on Solaris (both sparc and x86):

  Profile-i386 :: Posix/instrprof-visibility.cpp
  Profile-i386 :: instrprof-merging.cpp
  Profile-i386 :: instrprof-set-file-object-merging.c
  Profile-i386 :: instrprof-set-file-object.c

On sparc there's also

  Profile-sparc :: coverage_comments.cpp

The failure mode is always the same:

  error: /var/llvm/local-amd64/projects/compiler-rt/test/profile/Profile-i386/Posix/Output/instrprof-visibility.cpp.tmp: Failed to load coverage: Malformed coverage data

The error is from `llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp`
(`loadBinaryFormat`), l.926:

  InstrProfSymtab ProfileNames;
  std::vector<SectionRef> NamesSectionRefs = *NamesSection;
  if (NamesSectionRefs.size() != 1)
    return make_error<CoverageMapError>(coveragemap_error::malformed);

where .size() is 2 instead.

Looking at the executable, I find (with `elfdump -c -N __llvm_prf_names`):

  Section Header[15]:  sh_name: __llvm_prf_names
      sh_addr:      0x8053ca5       sh_flags:   [ SHF_ALLOC ]
      sh_size:      0x86            sh_type:    [ SHT_PROGBITS ]
      sh_offset:    0x3ca5          sh_entsize: 0
      sh_link:      0               sh_info:    0
      sh_addralign: 0x1

  Section Header[31]:  sh_name: __llvm_prf_names
      sh_addr:      0x8069998       sh_flags:   [ SHF_WRITE SHF_ALLOC ]
      sh_size:      0               sh_type:    [ SHT_PROGBITS ]
      sh_offset:    0x9998          sh_entsize: 0
      sh_link:      0               sh_info:    0
      sh_addralign: 0x1

Unlike GNU `ld` (which primarily operates on section names) the Solaris
linker, following the ELF spirit, only merges input sections into an output
section if both section name and section flags match, so two separate
sections are maintained.
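
The difference between the two merge policies can be modelled in a few lines. This is a simplified sketch (the struct and function names are hypothetical, and real linkers consider more than name and flags); the flag constants use the standard ELF values for SHF_WRITE and SHF_ALLOC:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

constexpr uint64_t kShfWrite = 0x1;  // SHF_WRITE
constexpr uint64_t kShfAlloc = 0x2;  // SHF_ALLOC

struct InputSection {
  std::string name;
  uint64_t flags;
};

// Name-only policy: every input section with the same name lands in one
// output section.
std::size_t countOutputsByName(const std::vector<InputSection>& inputs) {
  std::set<std::string> keys;
  for (const auto& s : inputs) keys.insert(s.name);
  return keys.size();
}

// Name-and-flags policy: inputs merge only if both name and flags match.
std::size_t countOutputsByNameAndFlags(const std::vector<InputSection>& inputs) {
  std::set<std::pair<std::string, uint64_t>> keys;
  for (const auto& s : inputs) keys.insert({s.name, s.flags});
  return keys.size();
}
```

Feeding it one read-only and one read-write `__llvm_prf_names` input yields one output section under the first policy and two under the second, which matches the two section headers shown above.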

The read-write one comes from `lib/clang/12.0.0/lib/sunos/libclang_rt.profile-i386.a(InstrProfilingPlatformLinux.c.o)`
while the read-only one is generated by
`llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp` (`InstrProfiling::emitNameData`)
at l.1004 where `isConstant = true`.

The easiest way to avoid the mismatch is to change the definition in
`compiler-rt/lib/profile/InstrProfilingPlatformLinux.c` to `const`.

This fixes all failures observed.

Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D85116
The file was modified compiler-rt/lib/profile/InstrProfilingPlatformLinux.c (diff)
Commit f78f509c75861dc4e26f9a22ad12996bf8005a2e by yamauchi
[PGO] Extend the value profile buckets for mem op sizes.

Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).
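
To make "a mix of individual values and ranges" concrete, here is a table-driven sketch. The bucket table below is purely hypothetical and much smaller than the real one; it is not the layout introduced by this patch, and all names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Bucket {
  uint64_t low;   // inclusive; also used as the bucket's representative value
  uint64_t high;  // inclusive
};

// Hypothetical layout mixing single-value buckets with ranges.
const std::vector<Bucket>& hypotheticalBuckets() {
  static const std::vector<Bucket> buckets = {
      {0, 0}, {1, 1}, {2, 2}, {3, 3},  // individual values
      {4, 7},                          // a range
      {8, 8},                          // an individual value again
      {9, 15},                         // a range
      {16, UINT64_MAX}};               // open-ended tail
  return buckets;
}

// Maps a mem op size to the representative value of its bucket.
uint64_t representativeValue(uint64_t size) {
  for (const Bucket& b : hypotheticalBuckets())
    if (size >= b.low && size <= b.high) return b.low;
  return size;  // unreachable with the table above, which covers all values
}
```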

Disabled behind a flag (to be enabled separately); the existing code will be
removed later.

Differential Revision: https://reviews.llvm.org/D81682
The file was modified llvm/include/llvm/Transforms/Instrumentation/InstrProfiling.h (diff)
The file was modified llvm/lib/ProfileData/InstrProf.cpp (diff)
The file was modified compiler-rt/lib/profile/InstrProfilingValue.c (diff)
The file was modified llvm/include/llvm/ProfileData/InstrProf.h (diff)
The file was modified llvm/test/Transforms/PGOProfile/memcpy.ll (diff)
The file was modified compiler-rt/include/profile/InstrProfData.inc (diff)
The file was modified llvm/test/Transforms/PGOProfile/memop_profile_funclet.ll (diff)
The file was modified llvm/include/llvm/ProfileData/InstrProfData.inc (diff)
The file was modified llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp (diff)
The file was modified llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp (diff)
The file was modified llvm/unittests/ProfileData/CMakeLists.txt (diff)
The file was added llvm/unittests/ProfileData/InstrProfDataTest.cpp
Commit c12bd8dac91adac81cd9721fe34daf473ebd5e10 by llvmgnsyncbot
[gn build] Port f78f509c758
The file was modified llvm/utils/gn/secondary/llvm/unittests/ProfileData/BUILD.gn (diff)
Commit 1e392fc44584a4909b4dced02b8386b48963002b by flo
[ArgPromotion] Replace all md uses of promoted values with undef.

Currently, ArgPromotion may leave metadata uses of promoted values,
which will end up in the wrong function, creating invalid IR.

PR33641 fixed this for dead arguments, but it can also be triggered by
arguments with users that are promoted (see the updated test case).

We also have to drop uses to them after promoting them. We need to do
this after dealing with the non-metadata uses, so I also moved the empty
use case to the loop that deals with updating the arguments of the new
function.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D85127
The file was modified llvm/lib/Transforms/IPO/ArgumentPromotion.cpp (diff)
The file was modified llvm/test/Transforms/ArgumentPromotion/pr33641_remove_arg_dbgvalue.ll (diff)
Commit c6334db577e7049fe4868b1647c9f937f68ff1f5 by caij2003
[X86] support .nops directive

Add support for .nops on X86. This addresses llvm.org/PR45788.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D82826
The file was modified llvm/lib/MC/MCFragment.cpp (diff)
The file was added llvm/test/MC/X86/x86_64-directive-nops.s
The file was modified llvm/lib/MC/MCAssembler.cpp (diff)
The file was modified llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (diff)
The file was modified llvm/include/llvm/MC/MCFragment.h (diff)
The file was modified llvm/lib/MC/MCObjectStreamer.cpp (diff)
The file was modified llvm/include/llvm/MC/MCStreamer.h (diff)
The file was modified llvm/include/llvm/MC/MCObjectStreamer.h (diff)
The file was added llvm/test/MC/X86/x86-directive-nops-errors.s
The file was modified llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp (diff)
The file was modified llvm/include/llvm/MC/MCAsmBackend.h (diff)
The file was modified llvm/lib/MC/MCStreamer.cpp (diff)
The file was added llvm/test/MC/X86/x86-directive-nops.s
Commit 456f38a97199770d4ea563dec8c50eaaf20f0309 by aeubanks
Fix layering violation Transforms/Utils -> Scalar

Introduced in D85063.
The file was modified llvm/lib/Transforms/Utils/LoopVersioning.cpp (diff)
Commit 7efd9ceb588b5e76e4ce9ae0b8ed45bfc90645cd by spatel
[InstSimplify] add tests for min-of-max variants; NFC
The file was modified llvm/test/Transforms/InstSimplify/maxmin_intrinsics.ll (diff)
Commit 9e5cf6bde5963f14a38117061c7a4df064453088 by spatel
[InstSimplify] fold variations of max-of-min with common operand

https://alive2.llvm.org/ce/z/ZtxpZ3
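
The exact set of variants handled is given by the linked proof. As a sanity check of the underlying idea, and assuming the folds are instances of the absorption identities max(x, min(x, y)) = x and min(x, max(x, y)) = x, the following sketch (hypothetical helper name) verifies them, including commuted operand orders, over a small range:

```cpp
#include <algorithm>
#include <cassert>

// Returns true if the absorption identities hold for all x, y in [lo, hi].
bool maxOfMinFoldsHold(int lo, int hi) {
  for (int x = lo; x <= hi; ++x) {
    for (int y = lo; y <= hi; ++y) {
      if (std::max(x, std::min(x, y)) != x) return false;  // max-of-min
      if (std::max(std::min(y, x), x) != x) return false;  // commuted
      if (std::min(x, std::max(x, y)) != x) return false;  // min-of-max
      if (std::min(std::max(y, x), x) != x) return false;  // commuted
    }
  }
  return true;
}
```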
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp (diff)
The file was modified llvm/test/Transforms/InstSimplify/maxmin_intrinsics.ll (diff)
Commit 0d454e8e087049ae86283e73a25cf8eaad488651 by tkeith
[flang] Fix bug detecting intrinsic function

Don't set the INTRINSIC attribute on a dummy procedure.

Differential Revision: https://reviews.llvm.org/D85136
The file was modified flang/lib/Semantics/resolve-names.cpp (diff)
The file was modified flang/test/Semantics/symbol18.f90 (diff)
Commit 3e89cbf38e76d0d0ac75fe77d318a5cfeac512f5 by yamauchi
[PGO] Enable the extended value profile buckets for mem op sizes.

Follow up on D81682 and enable the new, extended value profile buckets for mem
op sizes.

Differential Revision: https://reviews.llvm.org/D83903
The file was modified llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp (diff)
Commit 777824b49d5d9e1fbc93108107fa6d12a936a2e4 by Lang Hames
[llvm-jitlink] Add support for static archives and MachO universal archives.

Archives can now be specified as input files the same way that object
files are. Archives will always be linked after all objects (regardless
of the relative order of the inputs) but before any dynamic libraries or
process symbols.

This patch also relaxes matching for slice triples in
StaticLibraryDefinitionGenerator in order to support this feature:
Vendors need not match if the source vendor is unknown.
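
The ordering rule for objects and archives can be sketched as a stable partition. This is an illustration only, not llvm-jitlink's implementation: `Input`, `InputKind`, and `linkOrder` are hypothetical, and preserving the relative order within each group is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class InputKind { Object, Archive };

struct Input {
  std::string path;
  InputKind kind;
};

// Moves all objects ahead of all archives, regardless of the order in which
// the inputs were given; relative order within each group is kept.
std::vector<Input> linkOrder(std::vector<Input> inputs) {
  std::stable_partition(
      inputs.begin(), inputs.end(),
      [](const Input& in) { return in.kind == InputKind::Object; });
  return inputs;
}
```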
The file was modified llvm/lib/ExecutionEngine/Orc/ExecutionUtils.cpp (diff)
The file was modified llvm/tools/llvm-jitlink/llvm-jitlink.cpp (diff)
Commit dca23ed8952383701a62b778104f4db6f5d4b799 by efriedma
[AArch64] Add missing isel patterns for fcvtzs/u intrinsic on v1f64.

Fixes test-suite compile failure caused by 8dfb5d7.

While I'm in the area, add some more test coverage to related
operations, to make sure we aren't missing any other patterns.
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.td (diff)
The file was modified llvm/test/CodeGen/AArch64/arm64-vcvt.ll (diff)
The file was modified llvm/test/CodeGen/AArch64/fp16_intrinsic_scalar_1op.ll (diff)
Commit 7f1556f292ccfd80c4ffa986d5b849f915e5cd82 by jonathan_roelofs
Fix typo: s/epomymous/eponymous/ NFC
The file was modified llvm/lib/CodeGen/MachineScheduler.cpp (diff)
Commit 7209f83112db4dbe15d8328705f9d2aff0624fbd by daniel_l_sanders
Allow .dSYM's to be directly placed in an alternate directory

Once available in the relevant toolchains this will allow us to implement
LLVM_EXTERNALIZE_DEBUGINFO_OUTPUT_DIR after D84127 by directly placing the dSYM
in the desired location instead of emitting next to the output file and moving
it.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D84572
The file was modified clang/lib/Driver/Driver.cpp (diff)
The file was modified clang/test/Driver/darwin-dsymutil.c (diff)
The file was modified clang/include/clang/Driver/Options.td (diff)
Commit 41b1e97b12c1407e40d8e5081bf1f9cf183934b0 by Akira
[CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as notail on x86-64

This is needed because the epilogue code inserted before tail calls on
x86-64 breaks the handshake between the caller and callee.

Calls to objc_retainAutoreleasedReturnValue used to have the same
problem, which was fixed in https://reviews.llvm.org/D59656.

rdar://problem/66029552

Differential Revision: https://reviews.llvm.org/D84540
The file was modified clang/lib/CodeGen/TargetInfo.cpp (diff)
The file was modified clang/lib/CodeGen/TargetInfo.h (diff)
The file was modified clang/lib/CodeGen/CGObjC.cpp (diff)
The file was modified clang/test/CodeGenObjC/arc-unsafeclaim.m (diff)
Commit 11bb7c220ccdff1ffec4780ff92fb5acec8f6f0b by i
[MC] Set sh_link to 0 if the associated symbol is undefined

Part of https://bugs.llvm.org/show_bug.cgi?id=41734

LTO can drop externally available definitions. Such AssociatedSymbol is
not associated with a symbol. ELFWriter::writeSection() will assert.

Allow a SHF_LINK_ORDER section to have sh_link=0.

We need to give sh_link a syntax, a literal zero in the linked-to symbol
position, e.g. `.section name,"ao",@progbits,0`

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D72899
The file was added llvm/test/MC/ELF/section-linkorder.s
The file was modified llvm/test/CodeGen/X86/elf-associated.ll (diff)
The file was modified llvm/lib/MC/MCSectionELF.cpp (diff)
The file was modified llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp (diff)
The file was modified llvm/lib/MC/MCParser/ELFAsmParser.cpp (diff)
The file was modified llvm/lib/MC/ELFObjectWriter.cpp (diff)
The file was added llvm/test/CodeGen/X86/elf-associated-discarded.ll
Commit 21de4e74acf603f02f886a9e6030945f077bca3f by david.green
[ARM] Test for converting VPSEL to VMOVT. NFC
The file was added llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll
Commit 66e7dce714fabd3ddb1aed635e4b826476d4f1a2 by 31459023+hctim
Revert "[X86][SSE] Shuffle combine blends to OR(X,Y) if the relevant elements are known zero."

This reverts commit 219f32f4b68679563443cdaae7b8174c9976409a.

Commit contains unsigned comparisons that break bots that build with
-Wsign-compare.
The file was modified llvm/test/CodeGen/X86/shuffle-vs-trunc-256.ll (diff)
The file was modified llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll (diff)
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp (diff)
The file was modified llvm/test/CodeGen/X86/vector-shuffle-128-v8.ll (diff)
The file was modified llvm/test/CodeGen/X86/insertelement-ones.ll (diff)
Commit cb327922101b28ea70ec68d7f026da0e5e388eed by tlively
[WebAssembly] Implement prototype v128.load{32,64}_zero instructions

Specified in https://github.com/WebAssembly/simd/pull/237, these
instructions load the first vector lane from memory and zero the other
lanes. Since these instructions are not officially part of the SIMD
proposal, they are only available on an opt-in basis via LLVM
intrinsics and clang builtin functions. If these instructions are
merged to the proposal, this implementation will change so that the
instructions will be generated from normal IR. At that point the
intrinsics and builtin functions would be removed.
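
A plain C++ model of the described semantics, as a sketch (the function names are hypothetical and this is not how the instructions are exposed through intrinsics or builtins):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Models v128.load32_zero: lane 0 comes from memory, lanes 1..3 are zero.
std::array<uint32_t, 4> load32Zero(const void* mem) {
  std::array<uint32_t, 4> lanes{};  // value-initialized: all lanes zero
  std::memcpy(&lanes[0], mem, sizeof(uint32_t));
  return lanes;
}

// Models v128.load64_zero: lane 0 comes from memory, lane 1 is zero.
std::array<uint64_t, 2> load64Zero(const void* mem) {
  std::array<uint64_t, 2> lanes{};
  std::memcpy(&lanes[0], mem, sizeof(uint64_t));
  return lanes;
}
```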

This PR also changes the opcodes for the experimental f32x4.qfm{a,s}
instructions because their opcodes conflicted with those of the
v128.load{32,64}_zero instructions. The new opcodes were chosen to
match those used in V8.

Differential Revision: https://reviews.llvm.org/D84820
The file was modified clang/include/clang/Basic/BuiltinsWebAssembly.def (diff)
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyInstrMemory.td (diff)
The file was modified llvm/include/llvm/IR/IntrinsicsWebAssembly.td (diff)
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (diff)
The file was added llvm/test/CodeGen/WebAssembly/simd-load-zero-offset.ll
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td (diff)
The file was modified llvm/test/MC/WebAssembly/simd-encodings.s (diff)
The file was modified clang/lib/CodeGen/CGBuiltin.cpp (diff)
The file was modified clang/test/CodeGen/builtins-wasm.c (diff)
The file was modified llvm/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCTargetDesc.h (diff)
Commit 22916481c11e1d46132752086290a668e62fc9ce by david.green
[ARM] Convert VPSEL to VMOV in tail predicated loops

VPSEL has slightly different semantics under tail predication (it can
end up selecting from Qn, Qm and Qd). We do not model that at the moment
so they block tail predicated loops from being formed.

This just converts them into a predicated VMOV instead (via a VORR),
allowing tail predication to happen whilst still modelling the original
behaviour of the input.

Differential Revision: https://reviews.llvm.org/D85110
The file was modified llvm/test/CodeGen/Thumb2/LowOverheadLoops/cond-vector-reduce-mve-codegen.ll (diff)
The file was modified llvm/lib/Target/ARM/MVEVPTOptimisationsPass.cpp (diff)
The file was modified llvm/test/CodeGen/Thumb2/mve-pred-selectop3.ll (diff)
The file was modified llvm/test/CodeGen/Thumb2/mve-pred-vctpvpsel.ll (diff)
The file was modified llvm/test/CodeGen/Thumb2/mve-vctp.ll (diff)
Commit 9a05fa10bd05525adedb6117351333699a3d4ae2 by 31459023+hctim
[HWASan] [GlobalISel] Add +tagged-globals backend feature for GlobalISel

GlobalISel is the default ISel for aarch64 at -O0. Prior to D78465, GlobalISel
didn't have support for dealing with address-of-global lowerings, so it fell
back to SelectionDAGISel.

HWASan Globals require special handling, as they contain the pointer tag in the
top 16 bits, and are thus outside the code model. We need to generate a `movk`
in the instruction sequence with a G3 relocation to ensure the bits are
relocated properly. This is implemented in SelectionDAGISel; this patch does
the same for GlobalISel.
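
The role of the `movk` can be modelled arithmetically: a MOVK with a 48-bit left shift replaces bits 48..63 of a register and keeps the rest. The sketch below uses hypothetical function names and glosses over how the low 48 bits are actually materialized and relocated:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kLow48Mask = 0x0000ffffffffffffULL;

// Models MOVK with LSL #48: overwrite bits 48..63, keep bits 0..47.
uint64_t movkShift48(uint64_t reg, uint16_t imm16) {
  return (reg & kLow48Mask) | (static_cast<uint64_t>(imm16) << 48);
}

// Rebuilds a tagged address from its in-code-model low part plus the top
// 16 bits, which is where the pointer tag lives.
uint64_t materializeTagged(uint64_t taggedAddress) {
  uint64_t reg = taggedAddress & kLow48Mask;  // part within the code model
  return movkShift48(reg, static_cast<uint16_t>(taggedAddress >> 48));
}
```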

GlobalISel and SelectionDAGISel differ in their lowering sequence, so there are
differences in the final instruction sequence, explained in
`tagged-globals.ll`. Both of these implementations are correct, but GlobalISel
produces slightly larger and slightly slower code (by a couple of arithmetic
instructions). I don't see this as a problem for now as GlobalISel is only on
by default at `-O0`.

Reviewed By: aemerson, arsenm

Differential Revision: https://reviews.llvm.org/D82615
The file was modified llvm/test/CodeGen/AArch64/tagged-globals.ll (diff)
The file was modified llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (diff)
The file was added compiler-rt/test/hwasan/TestCases/exported-tagged-global.c
Commit 8c39e70679e93da3af9f881d314940c570d5d822 by riddleriver
[mlir][OpFormatGen] Add support for eliding UnitAttr when used to anchor an optional group

Unit attributes are given meaning by their existence, and thus have no meaningful value beyond "is it present". As such, in the format of an operation, unit attributes are generally used to guard the printing of other elements and aren't generally printed themselves, as the presence of the group when parsing means that the unit attribute should be added. This revision adds support to the declarative format for eliding unit attributes in situations where they anchor an optional group, but aren't the first element.

For example,
```
let assemblyFormat = "(`is_optional` $unit_attr^)? attr-dict";
```

would print `foo.op is_optional` when $unit_attr is present, instead of the current `foo.op is_optional unit`.

Differential Revision: https://reviews.llvm.org/D84577
The file was modified mlir/docs/OpDefinitions.md (diff)
The file was modified mlir/tools/mlir-tblgen/OpFormatGen.cpp (diff)
The file was modified mlir/test/mlir-tblgen/op-format.mlir (diff)
The file was modified mlir/test/lib/Dialect/Test/TestOps.td (diff)
Commit 1ce82015f6d06f8026357e4faa925f900136b575 by asbirlea
[MemorySSA] Restrict optimizations after a PhiTranslation.

Merging alias results from different paths when a path did phi
translation is not necessarily correct. Conservatively terminate such paths.
Aimed to fix PR46156.

Differential Revision: https://reviews.llvm.org/D84905
The file was modified llvm/include/llvm/Analysis/MemorySSA.h (diff)
The file was modified llvm/lib/Analysis/MemorySSA.cpp (diff)
The file was modified llvm/test/Analysis/MemorySSA/phi-translation.ll (diff)