Changes

Summary

  1. [AArch64][GlobalISel] Make <8 x s8> integer arithmetic ops legal.
  2. [AArch64][GlobalISel] Alias rules for G_FCMP to G_ICMP.
  3. [AArch64][GlobalISel] Use emitTestBit in selection for G_BRCOND
  4. [GlobalISel][AArch64] Don't emit cset for G_FCMPs feeding into G_BRCONDs
  5. [flang] Readability improvement in binary->decimal conversion
  6. [AMDGPU] Allow SOP asm mnemonic to differ
  7. Fix a bug in memset formation with vectors of non-integral pointers
  8. [AArch64][SVE] Add lowering for llvm fabs
  9. [memcpyopt] Conservatively handle non-integral pointers
  10. [flang][msvc] Rework a MSVC work-around to avoid clang warning
  11. [flang] Fix buffering read->write transition
  12. [XCOFF] Enable -fdata-sections on AIX
  13. [flang] Fix actions at end of output record
  14. [flang] Extend runtime API for PAUSE to allow a stop code
  15. [flang][openacc] Update loop construct lowering
  16. [OpenMP] Add Missing Runtime Call for Globalization Remarks
  17. [PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC.
  18. CodeGen: Fix livein calculation in MachineBasicBlock splitAt
  19. Have kernel binary scanner load dSYMs as binary+dSYM if best thing found
  20. [AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt
  21. [gvn] Handle a corner case w/vectors of non-integral pointers
Commit e28c5899a24117cdb0b081a54508af486a2634a0 by Amara Emerson
[AArch64][GlobalISel] Make <8 x s8> integer arithmetic ops legal.
The file was modified llvm/test/CodeGen/AArch64/arm64-vabs.ll
The file was modified llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/legalize-add.mir
Commit 017b871502b0c6fe72f52c5b47780f77e38d9035 by Amara Emerson
[AArch64][GlobalISel] Alias rules for G_FCMP to G_ICMP.

No need to be different here for the vast majority of rules.
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir
The file was removed llvm/test/CodeGen/AArch64/GlobalISel/legalize-vector-icmp.mir
The file was added llvm/test/CodeGen/AArch64/GlobalISel/legalize-vector-cmp.mir
The file was modified llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
Commit 8e8664e55e8986e061283cb20c30f21fb2d2b641 by Jessica Paquette
[AArch64][GlobalISel] Use emitTestBit in selection for G_BRCOND

Partially refactoring, partially fixing a bug.

- We shouldn't use TB(N)ZX unless the bit number is >= 32
- We can fold more than xor using emitTestBit

Also remove a check which isn't relevant anymore + update tests.

Rename select-brcond-of-not.mir to select-brcond-of-binop.mir, since it now
tests more than just G_XOR.

Differential Revision: https://reviews.llvm.org/D88702
The file was added llvm/test/CodeGen/AArch64/GlobalISel/select-brcond-of-binop.mir
The file was modified llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/widen-narrow-tbz-tbnz.mir
The file was removed llvm/test/CodeGen/AArch64/GlobalISel/select-brcond-of-not.mir
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/select.mir
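
The width rule in the G_BRCOND commit above (use the X form only when the bit number is >= 32) can be sketched in a few lines. This is an illustrative Python model, not the selector's C++ code. The function name `pick_test_bit_opcode` is hypothetical; TBZW, TBNZW, TBZX and TBNZX are the AArch64 backend's 32-bit and 64-bit test-bit-and-branch opcodes.

```python
def pick_test_bit_opcode(bit: int, branch_if_set: bool) -> str:
    """Pick a test-bit-and-branch opcode for the given bit number.

    Hypothetical helper: it models only the rule that the 64-bit (X)
    form is needed when the tested bit is 32 or higher.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit number must be in [0, 64)")
    base = "TBNZ" if branch_if_set else "TBZ"
    width = "X" if bit >= 32 else "W"
    return base + width
```

For example, `pick_test_bit_opcode(31, False)` gives `"TBZW"`, while bit 32 needs `"TBZX"`.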
Commit 5402d11b1d8853ff10417b0f8d32edde3f4a51c0 by Jessica Paquette
[GlobalISel][AArch64] Don't emit cset for G_FCMPs feeding into G_BRCONDs

Similar to the FP case in `AArch64TargetLowering::LowerBR_CC`.

Instead of emitting the csets + a tbnz, just emit a compare + bcc
(or two bccs, depending on the condition code)

This improves cases like this: https://godbolt.org/z/v8hebx

This is a 0.1% geomean code size improvement for CTMark at -O3.

Differential Revision: https://reviews.llvm.org/D88624
The file was modified llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was added llvm/test/CodeGen/AArch64/GlobalISel/fold-brcond-fcmp.mir
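
Why the G_FCMP commit above sometimes needs two bccs: some floating-point predicates cannot be expressed as a single AArch64 condition code. The sketch below is a simplified Python model, not the selector's code. The table is a subset recalled from LLVM's existing FP condition-code lowering and should be treated as an assumption to check against the source; `FP_PRED_TO_CC` and `branch_sequence` are hypothetical names.

```python
from typing import List

# Assumed subset of FP predicates and the AArch64 condition codes they use.
# "one" and "ueq" need two conditional branches after a single compare.
FP_PRED_TO_CC = {
    "oeq": ("eq",),
    "ogt": ("gt",),
    "olt": ("mi",),
    "une": ("ne",),
    "one": ("mi", "gt"),
    "ueq": ("eq", "vs"),
}


def branch_sequence(pred: str, target: str) -> List[str]:
    """Return the conditional branches emitted after one FP compare."""
    return [f"b.{cc} {target}" for cc in FP_PRED_TO_CC[pred]]
```

Under these assumptions, `branch_sequence("one", "bb.1")` yields two branches (`b.mi` and `b.gt`), whereas `"oeq"` needs only `b.eq`.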
Commit e99d184d54937b56d5f4f1ba06fb984019beaee1 by pklausler
[flang] Readability improvement in binary->decimal conversion

Tweak binary->decimal conversions to avoid an integer multiplication
in a hot loop to improve readability and get a minor (~5%) speed-up.
Use native integer division by constants for more readability, too,
since current build compilers seem to optimize it correctly now.
Delete the now needless temporary work-around facility in
Common/unsigned-const-division.h.

Differential revision: https://reviews.llvm.org/D88604
The file was modified flang/lib/Decimal/big-radix-floating-point.h
The file was modified flang/runtime/edit-output.cpp
The file was modified flang/lib/Decimal/binary-to-decimal.cpp
The file was removed flang/include/flang/Common/unsigned-const-division.h
Commit caeb13aba853b949ca45627f023dbeac77c13b2f by Stanislav.Mekhanoshin
[AMDGPU] Allow SOP asm mnemonic to differ

Allows the creation of real SOP1 instructions with
assembler mnemonics that differ from their
pseudo-instruction mnemonics. The default behavior
keeps the mnemonics matching.

Corrects a subtarget label typo in a comment.

Authored By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D88708
The file was modified llvm/lib/Target/AMDGPU/SOPInstructions.td
Commit de3cb9548d77726186db2d384193e0565cb0afc5 by listmail
Fix a bug in memset formation with vectors of non-integral pointers

We were converting the non-integral store into an integer store, which is not legal.
The file was modified llvm/test/Transforms/LoopIdiom/non-integral-pointers.ll
The file was modified llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
Commit aab6f7db471d577d313f334cba37667c35158420 by muhammad.asif.manzoor
[AArch64][SVE] Add lowering for llvm fabs

Add the functionality to lower fabs for the passthru variant.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D88679
The file was modified llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.h
The file was modified llvm/lib/Target/AArch64/SVEInstrFormats.td
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-fp.ll
Commit bb0344644a656734d707ab9c0baf6eb0533ac905 by listmail
[memcpyopt] Conservatively handle non-integral pointers

If we allow the non-integral pointers to become memset and memcpy, we lose the ability to reason about pointer propagation.  This patch is modeled on changes we've carried downstream for a long time; I figured it was worth being equally conservative for other users.  There is room to refine the semantics and handling here if anyone is motivated.
The file was added llvm/test/Transforms/MemCpyOpt/non-integral.ll
The file was modified llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
Commit 75a5ec1bad18ae1d741830cc46946da00fed6ed9 by pklausler
[flang][msvc] Rework a MSVC work-around to avoid clang warning

A recent MSVC work-around patch is eliciting unused variable
warnings from clang; package the lambda reference arguments
into a struct to avoid the warning.

Differential revision: https://reviews.llvm.org/D88695
The file was modified flang/lib/Evaluate/fold-implementation.h
Commit 61687f3a48c254436cbdd55e10bfb23b727f3eb5 by pklausler
[flang] Fix buffering read->write transition

The buffer needs to be Reset() after a Flush(), since the
Flush() can be a no-op after a read->write transition.
And record numbers are 1-based, not 0-based.
This fixes a bug with rewrites of records that have been
recently read.

Differential revision: https://reviews.llvm.org/D88612
The file was modified flang/runtime/buffer.h
The file was modified flang/runtime/io-api.cpp
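
The "1-based, not 0-based" point in the buffering commit above is the usual off-by-one in record addressing. A minimal Python sketch, assuming fixed-length records laid out back to back; `record_offset` is a hypothetical helper, not the flang runtime's code:

```python
def record_offset(record_number: int, record_length: int) -> int:
    """Byte offset of a 1-based record in a file of fixed-length records."""
    if record_number < 1:
        raise ValueError("record numbers are 1-based")
    if record_length <= 0:
        raise ValueError("record length must be positive")
    # Record 1 starts at offset 0, so subtract one before scaling.
    return (record_number - 1) * record_length
```

Treating the record number as 0-based would place every rewrite one record too late.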
Commit 78a9e62aa6f8f39fe8141e5486fca6db29947ecf by jasonliu
[XCOFF] Enable -fdata-sections on AIX

Summary:
Some design decisions worth noting:

I noticed a recent mailing-list discussion about why string literals are
not affected by -fdata-sections for ELF targets:
http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html

But on AIX, our linker cannot split mergeable strings like other targets can.
So I think it makes more sense for us to emit a separate csect for
every mergeable string in -fdata-sections mode,
as there might not be another way for the linker to garbage-collect
unused mergeable strings.

Reviewed By: daltenty, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88339
The file was modified llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
The file was modified llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
The file was added llvm/test/CodeGen/PowerPC/aix-xcoff-data-sections.ll
Commit a94d943f1a3f42efede7e908bb250c84f9f442b1 by pklausler
[flang] Fix actions at end of output record

It turns out that unformatted fixed-size output records
do need to be padded out if short, in order to avoid a
spurious EOF crash on a short record at the end of the file.
While here in AdvanceRecord(), move the unformatted
variable-length record header/footer writing code to here
from EndIoStatement().

Differential revision: https://reviews.llvm.org/D88685
The file was modified flang/runtime/unit.cpp
The file was modified flang/runtime/io-stmt.cpp
The file was modified flang/runtime/io-stmt.h
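
The padding described in the end-of-record commit above can be illustrated with a small Python sketch. It is not the runtime's implementation: `pad_record` is a hypothetical helper, and the zero fill byte is an assumption (the commit message does not say what the padding consists of).

```python
def pad_record(data: bytes, record_length: int, fill: bytes = b"\x00") -> bytes:
    """Pad a short fixed-size record out to its full length.

    The fill byte is an assumption made for illustration only.
    """
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    if len(data) > record_length:
        raise ValueError("record is longer than the fixed record length")
    return data + fill * (record_length - len(data))
```

With every record written at full length, a later read of the last record in the file finds all the bytes it expects instead of running into end-of-file.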
Commit 3261aefc72b3769e8b3eccbb67e1145e195ffa8d by pklausler
[flang] Extend runtime API for PAUSE to allow a stop code

Support integer and default character stop codes on PAUSE
statements.  Add length argument to STOP statement with a
character stop code.

Differential revision: https://reviews.llvm.org/D88692
The file was modified flang/runtime/stop.h
The file was modified flang/runtime/stop.cpp
Commit c1dcb573a861dc45be6e4cfc598b340c9079fc1f by clementval
[flang][openacc] Update loop construct lowering

Update the loop construct lowering to support multiple occurrences of the same clauses
such as private. Add some utility functions used by other constructs.

Upstreaming part of https://github.com/flang-compiler/f18-llvm-project/pull/438/

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D88253
The file was modified flang/lib/Lower/OpenACC.cpp
Commit 82453e759c77941cf2281ade79fb9b945b7e9458 by jhuber6
[OpenMP] Add Missing Runtime Call for Globalization Remarks

Summary:
Add a missing runtime call to perform data globalization checks.

Reviewers: jdoerfert

Subscribers: guansong hiraditya llvm-commits sstefan1 yaxunl

Tags: #LLVM #OpenMP

Differential Revision: https://reviews.llvm.org/D88621
The file was modified llvm/test/Transforms/OpenMP/globalization_remarks.ll
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
Commit c4690b007743d2f564bc1156fdbdbcaad2adddcc by esme.yi
[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC.

Summary: We currently copy CRRC to GRC with a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of the GRC register. That is not correct, because we expect the value of the destination register to equal the source, so we have to put the contents of the CR field in the lowest 4 bits. This patch adds an RLWINM after the MFOCRF to achieve that.
The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in a CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can't directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. We then confirmed that this is a bug in the POST-RA PSEUDO PASS.

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D88274
The file was modified llvm/lib/Target/PowerPC/PPCInstrHTM.td
The file was modified llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
The file was modified llvm/test/CodeGen/PowerPC/htm-ttest.ll
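
The bit arithmetic behind the PowerPC fix above can be checked with a small Python model. It is a sketch under stated assumptions, not the backend's code: only the low 32 bits of the register are modeled, the MFOCRF model zeroes the bits that the real instruction leaves undefined, and the function names are hypothetical. The rotate amount 4×(n+1) with a low-nibble mask is what moves field n into the lowest 4 bits.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by `shift` bits."""
    shift %= 32
    value &= MASK32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def mfocrf_model(cr: int, field: int) -> int:
    """Keep CR field `field` (0..7) in place; other bits are zeroed here."""
    if not 0 <= field <= 7:
        raise ValueError("CR field must be in 0..7")
    # Field 0 is the most significant nibble of the 32-bit CR.
    return cr & (0xF << (28 - 4 * field))


def cr_field_to_low_bits(cr: int, field: int) -> int:
    """MFOCRF followed by a rotate-and-mask, as an RLWINM would do."""
    moved = mfocrf_model(cr, field)
    return rotl32(moved, 4 * (field + 1)) & 0xF
```

With `cr = 0x12345678`, field 0 holds 1 and field 7 holds 8; `cr_field_to_low_bits` returns those values directly, whereas the MFOCRF result alone leaves field 0 in the top nibble.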
Commit 5136f4748a2b3302da581f6140ca453bb37f11e9 by carl.ritson
CodeGen: Fix livein calculation in MachineBasicBlock splitAt

Fix and simplify computation of liveins for new block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88535
The file was modified llvm/lib/CodeGen/MachineBasicBlock.cpp
The file was modified llvm/test/CodeGen/AMDGPU/si-lower-control-flow.mir
Commit a1e97923a025d09934b557ca4343d8e4b5a9973d by Jason Molenda
Have kernel binary scanner load dSYMs as binary+dSYM if best thing found

lldb's PlatformDarwinKernel scans the local filesystem (well known
locations, plus user-specified directories) for kernels and kexts
when doing kernel debugging, and loads them automatically.  Sometimes
kernel developers want to debug with *only* a dSYM, in which case they
give lldb the DWARF binary + the dSYM as a binary and symbol file.
This patch adds code to lldb to do this automatically if that's the
best thing lldb can find.

A few other bits of cleanup in PlatformDarwinKernel that I undertook
at the same time:

1. Remove the 'platform.plugin.darwin-kernel.search-locally-for-kexts'
setting.  When I added the local filesystem index at start of kernel
debugging, I thought people might object to the cost of the search
and want a way to disable it.  No one has.

2. Change the behavior of
'plugin.dynamic-loader.darwin-kernel.load-kexts' setting so it does
not disable the local filesystem scan, or use of the local filesystem
binaries.

3. Split PlatformDarwinKernel::GetSharedModule into GetSharedModuleKext and
GetSharedModuleKernel for easier readability & maintenance.

4. Added accounting of .dSYM.yaa files (an archive format akin to tar)
that I come across during the scan.  I'm not using these for now; it
would be very expensive to expand the archives & see if the UUID matches
what I'm searching for.

<rdar://problem/69774993>
Differential Revision: https://reviews.llvm.org/D88632
The file was modified lldb/source/Plugins/Platform/MacOSX/PlatformDarwinKernel.cpp
The file was modified lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
The file was modified lldb/source/Plugins/Platform/MacOSX/PlatformDarwinKernel.h
The file was modified lldb/source/Plugins/Platform/MacOSX/PlatformMacOSXProperties.td
Commit 2ef9d21e1a3cf8a58049921c785de1487fbcd7e1 by carl.ritson
[AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt

Convert to use new MachineBasicBlock splitAt function.
Place code in splitBlock function for reuse in future changes.
Should yield no functional change.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88537
The file was modified llvm/lib/Target/AMDGPU/SIInsertSkips.cpp
Commit f29645e7afdbb8d1fc2dd603c0b128bac055625c by listmail
[gvn] Handle a corner case w/vectors of non-integral pointers

If we try to coerce a vector of non-integral pointers to a narrower type (either a narrower vector or a single pointer), we use inttoptr and violate the semantics of non-integral pointers.  In theory, we can handle many of these cases; we just need to use a different code idiom to convert without going through inttoptr and back.

This shows up as wrong code bugs, and in some cases, crashes due to failed asserts.  Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.
The file was modified llvm/lib/Transforms/Utils/VNCoercion.cpp
The file was modified llvm/test/Transforms/GVN/non-integral-pointers.ll