Changes

Summary

  1. Fix a crash on targets where __bf16 isn't supported
  2. [compiler-rt][AVR] Use correct return value for __ledf2 etc
  3. [clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.
  4. [BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
  5. [BOLT][NFC] Move getInliningInfo out of Inliner class
  6. [ARM] Delay creation of MVE Imm shifts to legalization
  7. [LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calculation
  8. [RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.
  9. [amdgpu] Elide module lds allocation in kernels with no callees
  10. [AMDGPU] Handle LDS DMA and LDS_DIRECT hazards
  11. [RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]
  12. Revert "Revert "[clang][extract-api] Use relative includes""
  13. Update the CFA to use $sp when $fp is restored on arm64
  14. [lld-macho][nfc] Set test min version to 11.0
  15. [DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.
  16. Remove expected fail for TestStepNoDebug on AArch64
  17. [sanitizer] Use newfstatat for x32
  18. [PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC
  19. [llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION
  20. [lld/mac] Support writing zippered dylibs and bundles
Commit b1a55d0895249a493da5a442e44ee0a846410e88 by aaron
Fix a crash on targets where __bf16 isn't supported

We'd nondeterministically assert (and later crash) when calculating the size or
alignment of a __bf16 type when the type isn't supported on a target because of
reading uninitialized values. Now we check whether the type is supported first.

Fixes #50171
The file was modified clang/lib/AST/ASTContext.cpp
The file was modified clang/lib/Sema/SemaType.cpp
The file was added clang/test/Sema/vector-decl-crash.c
The file was modified clang/docs/ReleaseNotes.rst
Commit c1d6dca694d001efe3d332db539348a9829d3869 by aykevanlaethem
[compiler-rt][AVR] Use correct return value for __ledf2 etc

Previously the default was long, which is 32-bit on AVR. But avr-gcc
expects a smaller value: it reads the return value from r24.

This is actually a regression from https://reviews.llvm.org/D98205.
Before D98205, the return value was an enum (which was 2 bytes in size)
which was compatible with the 1-byte return value that avr-gcc was
expecting. But long is 4 bytes and thus places the significant return
value in a different register.

Differential Revision: https://reviews.llvm.org/D124939
The file was modified compiler-rt/lib/builtins/fp_compare_impl.inc
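
The return-width issue described in the compiler-rt AVR commit above can be illustrated with a minimal sketch. The alias `CmpResult` and the function `compareLE` are hypothetical names, and the use of `signed char` on AVR is an assumption made for illustration; the actual change is in fp_compare_impl.inc and may differ.

```cpp
// Hypothetical alias: pick a comparison result type narrow enough that the
// significant value lands where the caller reads it. The AVR branch is an
// assumption for illustration, not the real compiler-rt definition.
#if defined(__AVR__)
using CmpResult = signed char; // single byte, read from one register
#else
using CmpResult = int;
#endif

// Hypothetical three-way "less or equal" style comparison:
// -1 if a < b, 0 if a == b, 1 if a > b or if either operand is NaN.
CmpResult compareLE(double a, double b) {
  if (a != a || b != b)
    return 1; // unordered: at least one operand is NaN
  if (a < b)
    return -1;
  if (a == b)
    return 0;
  return 1;
}
```

Only the width of the result type changes between targets; the values returned stay the same.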
Commit 9a8d33dbd8a851ccb9821d5d1346aa225398cadc by yitzhakm
[clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.

Messages generated by Transformer rules may have `%` in them, which
needs to be escaped before being passed to `diag`, which interprets them
specially (and crashes if they are misused).

Differential Revision: https://reviews.llvm.org/D124952
The file was modified clang-tools-extra/clang-tidy/utils/TransformerClangTidyCheck.cpp
The file was modified clang-tools-extra/unittests/clang-tidy/TransformerClangTidyCheckTest.cpp
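
The escaping described in the clang-tidy Transformer commit above can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name `escapePercent` is hypothetical, and it assumes that a doubled `%%` is how a literal percent sign is written in the format string.

```cpp
#include <string>

// Hypothetical helper: doubles every '%' so that a formatter which treats
// '%' as the start of a placeholder sees a literal percent sign instead.
std::string escapePercent(const std::string &message) {
  std::string escaped;
  escaped.reserve(message.size());
  for (char c : message) {
    if (c == '%')
      escaped += "%%"; // escape the placeholder introducer
    else
      escaped += c;
  }
  return escaped;
}
```

A message such as `100% done` would be passed on as `100%% done`, so the formatter no longer tries to interpret the text after `%` as an argument reference.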
Commit 2ad1c7540eb0e07047911a39d12a12d062d4bbf4 by aaupov
[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite

Minor refactoring. NFC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124898
The file was modified bolt/lib/Passes/IndirectCallPromotion.cpp
Commit f8d2d8b587db6255bbb8ca7b87091dabb9dbecf0 by aaupov
[BOLT][NFC] Move getInliningInfo out of Inliner class

`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124899
The file was modified bolt/lib/Passes/Inliner.cpp
The file was modified bolt/include/bolt/Passes/Inliner.h
Commit f848798b7d3f59cf1f4de4187618eaad10b0ae86 by david.green
[ARM] Delay creation of MVE Imm shifts to legalization

The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine
(because matching i64 constants is difficult) does not apply to MVE,
as there are no v2i64 shifts. Delaying the creation of the nodes can
allow extra transforms on target-independent shl/shr.
The file was modified llvm/test/CodeGen/Thumb2/mve-laneinterleaving.ll
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit 5e004fb787698440a387750db7f8028e7cb14cfc by congzhecao
[LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calculation

Added a motivating test case for D123400 where the loopnest has a
suboptimal loop order j-i-k. After D123400 we ensure that the order
of loop cache analysis output is loop i-j-k, despite the suboptimal
order in the original loopnest.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D122776
The file was modified llvm/test/Analysis/LoopCacheAnalysis/PowerPC/single-store.ll
Commit 411bb42eed723ba8e8ae29a59cbc7aacc6bab774 by craig.topper
[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.

riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.

This patch adds support for treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D124960
The file was modified llvm/lib/Target/RISCV/RISCVSubtarget.cpp
The file was modified llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
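
The special case described in the RISCV riscv-v-vector-bits-min commit above can be sketched as a small resolution function. The function name `resolveMinVectorBits` and its parameters are hypothetical; the real logic is in RISCVSubtarget.cpp and may handle further cases that are not shown here.

```cpp
#include <cassert>

// Hypothetical sketch. `option` stands for the value given to
// riscv-v-vector-bits-min, and `zvlLen` for the minimum vector length
// implied by Zvl*b (or by -mcpu).
unsigned resolveMinVectorBits(int option, unsigned zvlLen) {
  // This sketch only models -1 and non-negative values.
  assert(option >= -1 && "other negative values are not modelled here");
  if (option == -1)
    return zvlLen; // -1 means: reuse the width already given by Zvl*b
  return static_cast<unsigned>(option);
}
```

With this shape, a user who already specified a vector width through Zvl*b can opt in to autovectorization by passing -1 rather than repeating the width.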
Commit bc78c099524283b5de44517ee5fbb805d09a7cdc by jonathanchesterfield
[amdgpu] Elide module lds allocation in kernels with no callees

Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.

Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.

Tests updated to match; coverage was already good. The interesting case is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122091
The file was modified llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
The file was modified llvm/test/CodeGen/AMDGPU/lower-module-lds.ll
The file was modified llvm/lib/Target/AMDGPU/SIISelLowering.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
The file was modified llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h
The file was modified llvm/test/CodeGen/AMDGPU/lower-kernel-and-module-lds.ll
The file was modified llvm/test/CodeGen/AMDGPU/lower-module-lds-offsets.ll
The file was modified llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll
Commit 63f21f4cc7bb12614e7049c464f115bdcb8b7fe5 by Stanislav.Mekhanoshin
[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards

There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.

Differential Revision: https://reviews.llvm.org/D124550
The file was modified llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
The file was modified llvm/lib/Target/AMDGPU/GCNSubtarget.h
The file was added llvm/test/CodeGen/AMDGPU/lds-dma-hazards.mir
The file was modified llvm/test/CodeGen/AMDGPU/hazard.mir
Commit 18ed2ee80c540d8b9c389d3722c70f89593c2b44 by preames
[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]

This is to simplify the final version of D124869.
The file was modified llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
Commit cb5bb28511f2c7530806af7ef53696deed453ca1 by Zixu Wang
Revert "Revert "[clang][extract-api] Use relative includes""

Reapply the change after fixing sanitizer errors.
The original problem was that `StringRef`s in `Matches` are pointing to
temporary local `std::string`s created by `path::convert_to_slash` in
the regex match call. This patch does the conversion up front in
container `FilePath`.

This reverts commit 2966f0fa505266735dbc8324b8821b7f0aa901ff.

Differential Revision: https://reviews.llvm.org/D124964
The file was removed clang/test/ExtractAPI/known_files_only_hmap.c
The file was added clang/test/ExtractAPI/relative_include.m
The file was modified clang/lib/ExtractAPI/ExtractAPIConsumer.cpp
The file was modified clang/include/clang/ExtractAPI/FrontendActions.h
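
The lifetime problem described in the extract-api reapply commit above (match results pointing into a temporary string) and its fix (converting the path up front) can be sketched with the standard library. This is an analogy under stated assumptions, not the patch itself: `toForwardSlashes` and `relativeInclude` are hypothetical names, and `std::regex` with `std::string` stands in for the LLVM regex and `StringRef` types.

```cpp
#include <algorithm>
#include <regex>
#include <string>

// Hypothetical stand-in for a path normalizer: returns a new string with
// backslashes replaced by forward slashes.
std::string toForwardSlashes(std::string path) {
  std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}

// Hypothetical example: returns the part of the path after "/include/",
// or an empty string if the path does not contain that component.
std::string relativeInclude(const std::string &rawPath) {
  // Convert once, up front, into a named local that outlives the match.
  // Matching against the temporary returned by toForwardSlashes(rawPath)
  // would leave the match results referring to a destroyed string.
  const std::string filePath = toForwardSlashes(rawPath);

  static const std::regex pattern(".*/include/(.*)");
  std::smatch matches;
  if (!std::regex_match(filePath, matches, pattern))
    return "";
  // Copy the capture out while filePath is still alive.
  return matches[1].str();
}
```

The key design choice is that the converted string is owned by a variable whose lifetime covers every use of the match results.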
Commit df552edb08c4a1f7600b97444f6315a63c998c59 by Jason Molenda
Update the CFA to use $sp when $fp is restored on arm64

In UnwindAssemblyInstEmulation we correctly recognize when an LDP
restores the fp & lr in an epilogue, and mark them as having the
caller's contents now, but we don't update the CFA register rule
at that point to indicate that the CFA is now calculated in terms
of $sp.  This doesn't impact the backtrace because the register
contents are all <same> now, but it can confuse the stepper when
the StackID changes mid-epilogue.

Differential Revision: https://reviews.llvm.org/D124492
rdar://92064415
The file was modified lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
The file was modified lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp
Commit 19bb38b9c93c7b94dfbe89a67f8a97d0cbf8e23a by jezng
[lld-macho][nfc] Set test min version to 11.0

The arm64-apple-macos triple is only valid for versions >= 11.0. (If
one passes arm64-apple-macos10.15 to llvm-mc, the output's min version is still
11.0). In order to write tests easily for both target archs, let's up the
default min version in our tests.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D124562
The file was modified lld/test/MachO/lit.local.cfg
Commit a0fb387941cd26c9fbaa695dd7ddbfdc9a48eb46 by hoy
[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.

llvm-profgen gives an error message when the input binary contains a premature terminator in the .debug_aranges section. These zero-length items point to rodata with a zero-size type in an embedded Rust library. Zero-sized types are a valid feature in Rust, so these are not real errors. This change turns the "error:" message into a warning to avoid misleading users.

Why do we still want a warning in this case? Because it doesn't follow the DWARF standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains earlier discussion.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D124121
The file was modified llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAranges.h
The file was added llvm/test/tools/llvm-symbolizer/debug-aranges-premature-end.yaml
The file was modified llvm/lib/DebugInfo/DWARF/DWARFDebugAranges.cpp
Commit a6553d97df39ee810aedb6ba27522910986b4bf5 by Jason Molenda
Remove expected fail for TestStepNoDebug on AArch64

My fix in https://reviews.llvm.org/D124492 should fix
this. I got an "unexpected pass" failure from an
AArch64 Ubuntu bot when I landed my fix.
The file was modified lldb/test/API/functionalities/step-avoids-no-debug/TestStepNoDebug.py
Commit f52e365092aa08f35b9e2b0d9b8410780c7be39a by hjl.tools
[sanitizer] Use newfstatat for x32

Since newfstatat is supported on x32, use it for x32.

Differential Revision: https://reviews.llvm.org/D124968
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
Commit ef849f50481667a9e1d8261811345cefa53315eb by craig.topper
[PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC

This test was previously generated by the script, but the script
now uses CHECK-NEXT instead of CHECK.

This is preparation for a strictfp related patch I'm working on.
The file was modified llvm/test/CodeGen/PowerPC/nofpexcept.ll
Commit ddef1ed4e7932e09c9316681779509f980ba8d13 by thakis
[llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION

Namely, only "symbolize" platform and tool names if `-v` is passed.

(`llvm-otool -lv` output still isn't quite the same as `otool -lv` output, but
`-v` output is arguably for consumption by humans, so I'm not changing that
at this point. Someone else could change it if it was important to them.)

Differential Revision: https://reviews.llvm.org/D124920
The file was modified llvm/test/tools/llvm-objdump/MachO/build-version.yaml
The file was modified llvm/tools/llvm-objdump/MachODump.cpp
Commit 895a72111b0f1b6b29683d254dae4ebda9081eb4 by thakis
[lld/mac] Support writing zippered dylibs and bundles

With -platform_version flags for two distinct platforms,
this writes an LC_BUILD_VERSION header for each.

The motivation is that this is needed for self-hosting with lld as linker
after D124059.

To create a zippered dylib at the clang driver level, pass

    -target arm64-apple-macos -darwin-target-variant arm64-apple-ios-macabi

(In Xcode's clang, `-darwin-target-variant` is spelled just `-target-variant`.)

(If you pass `-target arm64-apple-ios-macabi -target-variant arm64-apple-macos`
instead, ld64 crashes!)

This results in two -platform_version flags being passed to the linker.

ld64 also verifies that the iOS SDK version is at least 13.1. We don't do that
yet. But ld64 also does that for other platforms and we don't. So we need to
do that at some point, but not in this patch.

Only dylib and bundle outputs can be zippered.

I verified that a Catalyst app linked against a dylib created with

    clang -shared foo.cc -o libfoo.dylib \
          -target arm64-apple-macos \
          -target-variant arm64-apple-ios-macabi \
          -Wl,-install_name,@rpath/libfoo.dylib \
          -fuse-ld=$PWD/out/gn/bin/ld64.lld

runs successfully. (The app calls a function `f()` in libfoo.dylib
that returns a const char* "foo", and NSLog(@"%s")s it.)

ld64 is a bit more permissive when writing zippered outputs,
see references to "unzippered twins". That's not implemented yet.
(If anybody wants to implement that, D124275 is a good start.)

Differential Revision: https://reviews.llvm.org/D124887
The file was modified lld/test/MachO/zippered.yaml
The file was modified lld/MachO/Config.h
The file was modified lld/MachO/Writer.cpp
The file was modified lld/test/MachO/platform-version.s
The file was modified lld/MachO/Driver.cpp