Changes

Summary

  1. [ELF] Add -Bsymbolic-non-weak-functions (details)
  2. [mlir][linalg] Fix pad tensor cast folding with changed type (details)
  3. [GWP-ASan] Add version header. (details)
  4. [ARM] Define a couple more ssub indexes. NFC (details)
  5. Fix unit test checks for the scalar cases of all/any intrinsics. I (details)
  6. Simplify testcase to use v instead of p (NFC) (details)
  7. [MLIR][Python] Use DEST_PREFIX when installing. (details)
  8. [mlir][tosa] Fix tosa.reshape failures due to implicit broadcasting (details)
  9. security: highlight phab accounts; recommend phab for nominations (details)
  10. [mlir] Set insertion point of vector constant to the top of the vectorized loop body (details)
  11. GlobalISel/AArch64: don't optimize away redundant branches at -O0 (details)
  12. [InstCombine] add tests for vector cmp-bitcast; NFC (details)
  13. Fix typo (details)
  14. Make testcase more robust against codegen changes (details)
  15. [OpenMP] Adding flags for disabling the following optimizations: Deglobalization, SPMDization, State machine rewrites, Folding (details)
  16. [ARC] Add additional mov immediate instruction formats with a fix for u6 decoding (details)
  17. [compiler-rt] Fix COMPILER_RT_OS_DIR for Android (details)
  18. [GlobalISel] Refactor the unmerge artifact value finder code. (details)
  19. [AVR][clang] Pass '--start-group' and '--end-group' options to avr-ld (details)
Commit b06426da764a8d0254521b33d667db8f26ae5e2f by i
[ELF] Add -Bsymbolic-non-weak-functions

This option is a subset of -Bsymbolic-functions. It applies to STB_GLOBAL
STT_FUNC definitions.

The address of a vague linkage function (STB_WEAK STT_FUNC, e.g. an inline
function, a template instantiation) seen by a -Bsymbolic-functions linked
shared object may be different from the address seen from outside the shared
object. Such cases are uncommon. (ELF/Mach-O programs may use
`-fvisibility-inlines-hidden` to break such pointer equality.  On Windows,
correct dllexport and dllimport are needed to make pointer equality work.
Windows link.exe enables /OPT:ICF by default so different inline functions may
have the same address.)

```
// a.cc -> a.o -> a.so (-Bsymbolic-functions)
inline void f() {}
void *g() { return (void *)&f; }

// b.cc -> b.o -> exe
// The address is different!
inline void f() {}
```

-Bsymbolic-non-weak-functions is a safer (C++ conforming) subset of
-Bsymbolic-functions, which can make such programs work.

Implementations usually emit a vague linkage definition in a COMDAT group.  We
could detect the group (with more code) but I feel that we should just check
STB_WEAK for simplicity. A weak definition will thus serve as an escape hatch
for rare cases when users want interposition on definitions.
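The binding rule described above can be sketched as a small predicate. This is only an illustration under stated assumptions: the `BsymbolicKind` enum and the `bindsLocally` function are hypothetical names, not lld's actual code, and the sketch considers only default-visibility definitions inside the shared object. The numeric binding and type values are the ones from the ELF specification.

```cpp
#include <cstdint>

// Symbol binding and type values as defined by the ELF specification.
constexpr std::uint8_t kStbGlobal = 1;  // STB_GLOBAL
constexpr std::uint8_t kStbWeak = 2;    // STB_WEAK
constexpr std::uint8_t kSttObject = 1;  // STT_OBJECT
constexpr std::uint8_t kSttFunc = 2;    // STT_FUNC

// Hypothetical option enum for this sketch; not lld's configuration type.
enum class BsymbolicKind { None, NonWeakFunctions, Functions, All };

// Returns true if a reference from inside the shared object to one of its own
// definitions is bound locally, i.e. the definition cannot be interposed.
bool bindsLocally(BsymbolicKind kind, std::uint8_t binding, std::uint8_t type) {
  switch (kind) {
  case BsymbolicKind::None:
    return false;
  case BsymbolicKind::NonWeakFunctions:
    // Only STB_GLOBAL STT_FUNC definitions; weak ones stay interposable.
    return type == kSttFunc && binding == kStbGlobal;
  case BsymbolicKind::Functions:
    return type == kSttFunc;
  case BsymbolicKind::All:
    return true;
  }
  return false;
}
```

Under this model a weak `inline void f()` keeps a single address across the shared object and the executable, while ordinary global functions are still bound locally.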

GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27871

Longer write-up: https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic

If Linux distributions migrate to protected non-vague-linkage external linkage
functions by default, the linker option can still be handy because it allows
rapid experimentation without recompilation. Protected function addresses
currently have deep issues in GNU ld.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D102570
The file was modified lld/ELF/Driver.cpp
The file was modified lld/docs/ReleaseNotes.rst
The file was modified lld/docs/ld.lld.1
The file was modified lld/ELF/Symbols.cpp
The file was modified lld/ELF/Options.td
The file was modified lld/test/ELF/bsymbolic.s
The file was modified lld/ELF/SyntheticSections.cpp
The file was modified lld/ELF/Config.h
Commit 9a824823131600bca71406f533c2ba051c23c7d7 by cathyzhyi
[mlir][linalg] Fix pad tensor cast folding with changed type

`PadTensorOp` has verification logic that requires the result dimensions to
be static if all the padding values are static.
Cast folding can add more static information to the src operand
of `PadTensorOp`, which can turn a valid operation into an invalid one.
Change the canonicalizing pattern to fix this.
The file was modified mlir/test/Dialect/Linalg/canonicalize.mlir
The file was modified mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
Commit 8e167f66b27fe9d2573eb149f736700302675297 by 31459023+hctim
[GWP-ASan] Add version header.

Adds magic version header to AllocatorState. This can be used by
out-of-process crash handlers, like Crashpad on Fuchsia, to do offline
reconstruction of GWP-ASan crash metadata.

Crashpad on Fuchsia intends to dump the AllocationMetadata pool
and the AllocatorState directly into the minidump. Then, using the
version number, the data can be unpacked on the server side with a
versioned unpack tool.

Also add some asserts to make sure the version number gets bumped if the
internal structs get changed.
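A minimal sketch of the idea, assuming a hypothetical layout: the `VersionHeader` and `State` structs, the magic bytes, and the `canUnpack` helper below are illustrative and are not GWP-ASan's actual definitions. The `static_assert` stands in for the kind of guard that forces a version bump when a struct changes, and it assumes a common ABI where `std::uint64_t` is 8 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header placed at the start of the dumped state.
struct VersionHeader {
  std::uint8_t magic[4];
  std::uint16_t version;
  std::uint16_t reserved;
};

constexpr std::uint8_t kMagic[4] = {'D', 'E', 'M', 'O'};  // illustrative value
constexpr std::uint16_t kCurrentVersion = 1;              // illustrative value

// Hypothetical stand-in for the allocator state that gets dumped.
struct State {
  VersionHeader header;
  std::uint64_t poolBase;
  std::uint64_t poolSize;
};

// If a field is added or resized, this fails to compile, which is a reminder
// to bump kCurrentVersion and update the expected size together.
static_assert(sizeof(State) == 24, "State layout changed; bump kCurrentVersion");

// Offline side: accept a raw dump only if the magic and version both match.
bool canUnpack(const unsigned char *data, std::size_t size) {
  if (size < sizeof(VersionHeader))
    return false;
  VersionHeader header;
  std::memcpy(&header, data, sizeof(header));
  return std::memcmp(header.magic, kMagic, sizeof(kMagic)) == 0 &&
         header.version == kCurrentVersion;
}
```

A real unpack tool would presumably dispatch on the version rather than accept only one, but the check has the same shape.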

Reviewed By: eugenis, mcgrathr

Differential Revision: https://reviews.llvm.org/D106690
The file was modified compiler-rt/lib/gwp_asan/common.h
Commit d4a2daa919272081582c6c14141dd57a6eab9832 by david.green
[ARM] Define a couple more ssub indexes. NFC

Same as 91bd3ad128f7b3b28bd98242e9a5df214eb04eea, this doesn't really
change anything but gives the registers better names than the ones
tablegen would define, and fills in the gaps.
The file was modified llvm/lib/Target/ARM/ARMRegisterInfo.td
Commit 2ca8295c860f1e8e300c2fde5c4e84b72d8248aa by leairmark
Fix unit test checks for the scalar cases of all/any intrinsics. I
accidentally used int64 when they should have been int32. This led to
a Windows build unit test error (Linux did not catch the problem).

Differential Revision: https://reviews.llvm.org/D107107
The file was modified flang/unittests/RuntimeGTest/Reduction.cpp
Commit 26ba774f6865f0c6bb15dbbe80dc23e1db33d54b by Adrian Prantl
Simplify testcase to use v instead of p (NFC)
The file was modified lldb/test/API/commands/process/attach/TestProcessAttach.py
Commit cf36ab1d6c39e80a70b5a1dc80120cccccb5301c by stellaraccident
[MLIR][Python] Use DEST_PREFIX when installing.

Differential Revision: https://reviews.llvm.org/D107100
The file was modified mlir/cmake/modules/AddMLIRPython.cmake
Commit 2d0ba5e1446f0025603bbe064090737c5510bcf4 by rob.suderman
[mlir][tosa] Fix tosa.reshape failures due to implicit broadcasting

The TosaMakeBroadcastable pass needs the output shape to determine whether the
operation includes additional broadcasting. Also include some canonicalizations
for TOSA to remove unneeded reshapes.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D106846
The file was modified mlir/lib/Dialect/Tosa/Transforms/TosaMakeBroadcastable.cpp
The file was modified mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
The file was modified mlir/test/Dialect/Tosa/broadcast.mlir
The file was modified mlir/include/mlir/Dialect/Tosa/IR/TosaOps.td
The file was modified mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
Commit 4c98e9455aadee2629dd4f117b6c4e7e9e87cc01 by George Burgess IV
security: highlight phab accounts; recommend phab for nominations

This commit contains two mildly separate concepts.

First, sending out reviews for things like this is a bit of a
complicated endeavor, since the reviewer list is relatively long, and I
generally rely on prior CLs in this area to find an authoritative list.
Life's quite a bit easier if phab usernames are readily available on the
doc. So part 1 is making those available.

Second, it seems to me that, at the moment, Phabricator makes the most
sense for membership changes (incl. security group nominations). My
reasoning for this is detailed in the diff, and to some extent in
comment #1 of this bug
<https://bugs.chromium.org/p/llvm/issues/detail?id=12#c1>. This change
adds prose to recommend the use of Phabricator for nominations as a
result.

Differential Revision: https://reviews.llvm.org/D106917
The file was modified llvm/docs/Security.rst
Commit a8b7e56f65c78a49ba0297c4ecabbd643fa40c25 by amy.zhuang
[mlir] Set insertion point of vector constant to the top of the vectorized loop body

When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.
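The search for the insertion point can be sketched with a toy loop nest. The `Loop` struct and `innermostVectorizedLoop` function are hypothetical stand-ins for this illustration and do not use MLIR's affine loop types. Walking outward from the loop that directly contains the first user, the first vectorized loop found is the innermost one. A constant placed at the top of that loop's body comes before every other operation in the body.

```cpp
// Toy stand-in for a loop in a loop nest; not an MLIR type.
struct Loop {
  const Loop *parent;  // enclosing loop, or nullptr at the top level
  bool vectorized;     // whether this loop is one of the loops being vectorized
};

// Starting at the loop that directly encloses the first user, walk outward and
// return the first vectorized loop, i.e. the innermost vectorized loop that
// encloses the user. Returns nullptr if no enclosing loop is vectorized.
const Loop *innermostVectorizedLoop(const Loop *enclosing) {
  for (const Loop *loop = enclosing; loop != nullptr; loop = loop->parent)
    if (loop->vectorized)
      return loop;
  return nullptr;
}
```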

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106609
The file was modified mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
The file was modified mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
The file was modified mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
Commit c5d84d2eb35c1f0d3b08e1d1e95c1a22a28904d1 by Adrian Prantl
GlobalISel/AArch64: don't optimize away redundant branches at -O0

This patch prevents GlobalISel from optimizing out redundant branch
instructions when compiling without optimizations.

The motivating example is code like the following common pattern in
Swift, where users expect to be able to set a breakpoint on the early
exit:

public func f(b: Bool) {
  guard b else {
    return // I would like to set a breakpoint here.
  }
  ...
}

The patch modifies two places in GlobalISel. The first one is in
IRTranslator.cpp, where the removal of redundant branches is made
conditional on the optimization level. The second one is in
AArch64InstructionSelector.cpp, where an optimization that applied
*only* at -O0 is being removed.
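The first change amounts to a check of roughly the following shape. This is a sketch under assumptions: `OptLevel` and `shouldEmitBranch` are hypothetical names for illustration, not the IRTranslator's API, and a branch is treated as redundant when its target is the block laid out immediately after the current one.

```cpp
// Hypothetical optimization levels for this sketch.
enum class OptLevel { None, Less, Default, Aggressive };

// Decides whether an unconditional branch should be emitted.
// targetIsLayoutSuccessor: the branch target immediately follows the current
// block, so control would fall through to it even without the branch.
bool shouldEmitBranch(bool targetIsLayoutSuccessor, OptLevel level) {
  // Without optimizations, always keep the branch so that a source line
  // consisting only of an early exit still has an instruction of its own.
  if (level == OptLevel::None)
    return true;
  // With optimizations, drop branches that merely fall through.
  return !targetIsLayoutSuccessor;
}
```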

Disabling these optimizations increases code size at -O0 by
~8%. However, doing so improves debuggability, and debug builds are
the primary reason why developers compile without optimizations. We
thus concluded that this is the right trade-off.

rdar://79515454

This tentatively reapplies the patch without modifications. The LLDB
test that previously blocked this from landing has since been
modified so that it is hopefully no longer sensitive to this change.

Differential Revision: https://reviews.llvm.org/D105238
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
The file was modified llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was modified llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s32.ll
The file was modified llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/phi.ll
The file was added llvm/test/DebugInfo/AArch64/fallthrough-branch.ll
The file was modified llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
The file was modified llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/jump_table_and_brjt.ll
The file was modified llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/long_ambiguous_chain_s64.ll
The file was modified llvm/test/CodeGen/AArch64/unwind-preserved.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
Commit f3c39ee84ad6950035a6083c076fdebddebefb04 by spatel
[InstCombine] add tests for vector cmp-bitcast; NFC
The file was modified llvm/test/Transforms/InstCombine/icmp-vec.ll
Commit 0fd813cf19c754a6cb3f2483332038a49eac33bd by Adrian Prantl
Fix typo
The file was modified lldb/test/API/commands/process/attach/TestProcessAttach.py
Commit 648844fd69fa304a04aab2e39dcb114c89d078bf by Adrian Prantl
Make testcase more robust against codegen changes
The file was modified lldb/test/API/commands/process/attach/main.cpp
Commit cd0dd8ece8e67a3bea96056b2ef15afb481155ad by huberjn
[OpenMP] Adding flags for disabling the following optimizations: Deglobalization, SPMDization, State machine rewrites, Folding

This work provides four flags to disable four different sets of OpenMP optimizations. These flags take effect in llvm/lib/Transforms/IPO/OpenMPOpt.cpp and include the following:
- openmp-opt-disable-deglobalization: Defaults to false. Adding this flag sets the variable DisableOpenMPOptDeglobalization to true, which prevents AA registration for HeapToStack and HeapToShared.
- openmp-opt-disable-spmdization: Defaults to false. Adding this flag sets the variable DisableOpenMPOptSPMDization to true, which indicates a pessimistic fixpoint in changeToSPMDMode.
- openmp-opt-disable-folding: Defaults to false. Adding this flag sets the variable DisableOpenMPOptFolding to true, which indicates a pessimistic fixpoint in the attributor init for AAFoldRuntimeCall.
- openmp-opt-disable-state-machine-rewrite: Defaults to false. Adding this flag sets the variable DisableOpenMPOptStateMachineRewrite to true. This first prevents changes to the state machine in rewriteDeviceCodeStateMachine by returning before changes are made. If a custom state machine is built in buildCustomStateMachine, it stops by returning a pessimistic fixpoint.
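The mapping from flag names to the four switches can be modelled as below. This is a standalone sketch, not how OpenMPOpt.cpp declares its options: the `OpenMPOptToggles` struct and `parseToggles` function are hypothetical, and stripping every leading dash is a simplification made for the example. Only the four flag names are taken from the list above.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the four switches; all default to false, so the
// optimizations stay enabled unless a flag is passed.
struct OpenMPOptToggles {
  bool disableDeglobalization = false;
  bool disableSPMDization = false;
  bool disableFolding = false;
  bool disableStateMachineRewrite = false;
};

// Scans the arguments and turns on the switch for each recognized flag.
// Unrecognized arguments are ignored in this sketch.
OpenMPOptToggles parseToggles(const std::vector<std::string> &args) {
  OpenMPOptToggles toggles;
  for (const std::string &arg : args) {
    std::string name = arg;
    while (!name.empty() && name.front() == '-')
      name.erase(name.begin());  // simplification: drop all leading dashes
    if (name == "openmp-opt-disable-deglobalization")
      toggles.disableDeglobalization = true;
    else if (name == "openmp-opt-disable-spmdization")
      toggles.disableSPMDization = true;
    else if (name == "openmp-opt-disable-folding")
      toggles.disableFolding = true;
    else if (name == "openmp-opt-disable-state-machine-rewrite")
      toggles.disableStateMachineRewrite = true;
  }
  return toggles;
}
```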

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D106802
The file was modified llvm/test/Transforms/OpenMP/fold_generic_main_thread.ll
The file was modified llvm/test/Transforms/OpenMP/remove_globalization.ll
The file was modified llvm/test/Transforms/OpenMP/custom_state_machines.ll
The file was modified llvm/test/Transforms/OpenMP/spmdization.ll
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
Commit cc238a6e038832b804c8010e02b87ff9e84e0bfe by marksl
[ARC] Add additional mov immediate instruction formats with a fix for u6 decoding

Differential Revision: https://reviews.llvm.org/D107088
The file was modified llvm/lib/Target/ARC/Disassembler/ARCDisassembler.cpp
The file was modified llvm/test/MC/Disassembler/ARC/misc.txt
The file was modified llvm/lib/Target/ARC/ARCInstrInfo.td
The file was modified llvm/test/MC/Disassembler/ARC/alu.txt
The file was modified llvm/lib/Target/ARC/ARCInstrFormats.td
Commit a68ccba77a48494a5200245ddcd085e49a77a2d1 by smeenai
[compiler-rt] Fix COMPILER_RT_OS_DIR for Android

Android has its own CMAKE_SYSTEM_NAME, but the OS is Linux (Android
target triples look like aarch64-none-linux-android21). The driver will
therefore search for compiler-rt libraries in the "linux" directory and
not the "android" directory, so the default placement of Android
compiler-rt libraries was incorrect. You could fix it by specifying
COMPILER_RT_OS_DIR manually, but it also makes sense to fix the default,
to save others from having to discover and fix the issue for themselves.
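The corrected default can be modelled as a small function. This is a C++ model of the idea rather than the CMake change itself: `defaultOsDir` is a hypothetical name, and the sketch assumes the default directory is otherwise the lowercased CMAKE_SYSTEM_NAME, with "Android" as the system name reported for Android builds.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns the directory name under which compiler-rt libraries would be
// placed by default for the given CMAKE_SYSTEM_NAME value.
std::string defaultOsDir(std::string systemName) {
  // Android target triples name the OS as linux, so the driver looks in the
  // "linux" directory; match that instead of using "android".
  if (systemName == "Android")
    return "linux";
  // Assumed default for other systems: the lowercased system name.
  std::transform(systemName.begin(), systemName.end(), systemName.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return systemName;
}
```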
The file was modified compiler-rt/cmake/base-config-ix.cmake
Commit f984b0e177f8342a66bde66f65c71ac68bc9acf0 by Amara Emerson
[GlobalISel] Refactor the unmerge artifact value finder code.

I moved the code that tries to combine away each unmerge def into a method in
ArtifactValueFinder class itself. This removes a logically messy lambda and
makes it easier to use the value-finder in more places in future.
The file was modified llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
Commit 1e6a93f15c7ea890c876a3a10c7e3970a1710dff by powerman1st
[AVR][clang] Pass '--start-group' and '--end-group' options to avr-ld

Reviewed By: Ben Shi

Differential Revision: https://reviews.llvm.org/D106854
The file was modified clang/test/Driver/avr-ld.c
The file was modified clang/lib/Driver/ToolChains/AVR.cpp