Success

Changes

Changes from Git (git http://labmaster3.local/git/llvm-project.git)

Summary

  1. scudo: Make it thread-safe to set some runtime configuration flags. (details)
  2. [test][SampleProfile][NewPM] Fix some tests under NPM (details)
  3. [asan][test] Several Posix/unpoison-alternate-stack.cpp fixes (details)
  4. [AArch64] Avoid pairing loads when the base reg is modified (details)
  5. [CodeGen] add test for NAN creation; NFC (details)
  6. [Sema] Support Comma operator for fp16 vectors. (details)
  7. Fix interaction of `constinit` and `weak`. (details)
  8. [OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload (details)
  9. [OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def (details)
  10. [AIX][Clang][Driver] Link libm in c++ mode (details)
  11. Exception support for basic block sections (details)
  12. [lldb/ipv6] Support running lldb tests in an ipv6-only environment. (details)
  13. [MLIR] Add async.value type to Async dialect (details)
  14. [lldb-vscode] Allow an empty 'breakpoints' field to clear breakpoints. (details)
  15. Fix crash in SBStructuredData::GetDescription() when there's no StructuredDataPlugin. (details)
  16. [test][NewPM][SampleProfile] Fix more tests under NPM (details)
  17. [libc++] Make sure we don't attempt to run check-cxx-abilist when libc++ doesn't define new/delete (details)
  18. Revert "[OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload" (details)
  19. [CodeGen] improve coverage for float (32-bit) type of NAN; NFC (details)
  20. Revert "[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def" (details)
  21. Add GDB prettyprinters for a few more MLIR types. (details)
  22. [mlir][vector] First step of vector distribution transformation (details)
  23. [NPM] Add target specific hook to add passes for New Pass Manager (details)
  24. [X86] Canonicalize (x > 1) ? x : 1 -> (x >= 1) ? x : 1 for sign and unsigned to enable the use of test instructions for the compare. (details)
  25. [asan][test] XFAIL Posix/no_asan_gen_globals.c on Solaris (details)
  26. [NFC] Fix spacing in clang/test/Driver/aix-ld.c (details)
  27. [flang] Fix descriptor-based array data item I/O for list-directed CHARACTER & LOGICAL (details)
  28. [clangd] Remove dead variable. NFC (details)
  29. [PDB] Merge types in parallel when using ghashing (details)
  30. Revert "[PDB] Merge types in parallel when using ghashing" (details)
  31. [mlir][Linalg] Add pattern to tile and fuse Linalg operations on buffers. (details)
  32. [Msan] Add ptsname, ptsname_r interceptors (details)
  33. [AMDGPU] Reorganize VOP3P encoding (details)
  34. Re-land "[PDB] Merge types in parallel when using ghashing" (details)
  35. [flang] Semantic analysis for FINAL subroutines (details)
  36. [OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant (details)
  37. [OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches (details)
  38. Handle unknown OSes in DarwinTargetInfo::getExnObjectAlignment (details)
  39. [PowerPC] Add outer product instructions for MMA (details)
  40. Patch IEEEFloat::isSignificandAllZeros and IEEEFloat::isSignificandAllOnes (bug 34579) (details)
  41. [OpenMP][libarcher] Allow all possible argument separators in TSAN_OPTIONS (details)
  42. [ARM] Add missing target for Arm neon test case. (details)
  43. [AArch64][GlobalISel] NFC: Refactor G_FCMP selection code (details)
  44. [lldb] Make TestGuiBasicDebug more lenient (details)
  45. [flang] Allow record advancement in external formatted sequential READ (details)
  46. [AArch64][GlobalISel] Add some more legal types for G_PHI, G_IMPLICIT_DEF, G_FREEZE. (details)
  47. [WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass (details)
  48. Try to fix build. May have used a C++ feature too new/not supported on all platforms. (details)
  49. [lld][WebAssembly] Allow exporting of mutable globals (details)
  50. Remove `Ops` suffix from dialect library names (details)
  51. [flang] Fix Gw.d format output (details)
  52. [mlir] Split Dialect::addOperations into two functions (details)
  53. [AArch64][GlobalISel] Clamp oversize FP arithmetic vectors. (details)
  54. [flang][msvc] Avoid ReferenceVariantBase ctor ambiguity. NFC. (details)
  55. [WebAssembly] New-style command support (details)
  56. [flang][msvc] Workaround 'forgotten' symbols in FoldOperation. NFC. (details)
  57. [APFloat] Improve asserts in isSignificandAllOnes and isSignificandAllZeros so they protect shift operations from undefined behavior. (details)
  58. [ELF] --wrap: don't unnecessarily expose __real_ (details)
  59. Revert "[llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking." (details)
Commit 719ab7309eb7b7b5d802273b0f1871d6cdb965b1 by peter
scudo: Make it thread-safe to set some runtime configuration flags.

Move some of the flags previously in Options, as well as the
UseMemoryTagging flag previously in the primary allocator, into an
atomic variable so that it can be updated while other threads are
running. Relaxed accesses are used because we only have the requirement
that the other threads see the new value eventually.

The code is set up so that the variable is generally loaded once per
allocation function call with the exception of some rarely used code
such as error handlers. The flag bits can generally stay in a register
during the execution of the allocation function which means that they
can be branched on with minimal overhead (e.g. TBZ on aarch64).
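
A minimal sketch of the pattern described above, assuming a single 32-bit word of flag bits. The names `OptionBit`, `AtomicOptions` and `fillPatternFor` are hypothetical and are not taken from the scudo sources; only the use of relaxed atomic accesses and "load once per call" comes from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; illustrative names, not scudo's.
enum OptionBit : uint32_t {
  kZeroContents = 1u << 0,
  kUseMemoryTagging = 1u << 1,
};

class AtomicOptions {
public:
  // Intended to be called once near the top of an allocation function.
  // Relaxed ordering is enough because other threads only need to see
  // the new value eventually.
  uint32_t load() const { return Bits.load(std::memory_order_relaxed); }

  // Atomic read-modify-write, so flags can change while other threads run.
  void set(OptionBit Bit) { Bits.fetch_or(Bit, std::memory_order_relaxed); }
  void clear(OptionBit Bit) {
    Bits.fetch_and(~static_cast<uint32_t>(Bit), std::memory_order_relaxed);
  }

private:
  std::atomic<uint32_t> Bits{0};
};

// Stand-in for an allocation path: one load, then plain branches on the
// local snapshot. Returns 0 when contents should be zeroed, -1 otherwise.
int fillPatternFor(const AtomicOptions &Opts) {
  const uint32_t Snapshot = Opts.load();
  return (Snapshot & kZeroContents) ? 0 : -1;
}
```

Because the snapshot is an ordinary local integer, a compiler is free to keep it in a register and test single bits, which is the cheap-branch property the message refers to.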

Differential Revision: https://reviews.llvm.org/D88523
The file was modified compiler-rt/lib/scudo/standalone/atomic_helpers.h
The file was modified compiler-rt/lib/scudo/standalone/primary64.h
The file was modified compiler-rt/lib/scudo/standalone/combined.h
The file was added compiler-rt/lib/scudo/standalone/options.h
The file was modified compiler-rt/lib/scudo/standalone/wrappers_c.inc
The file was modified compiler-rt/lib/scudo/standalone/primary32.h
Commit 2ab87702231e193ca170aa8ad4caa9f98bc7ced1 by aeubanks
[test][SampleProfile][NewPM] Fix some tests under NPM
The file was modified llvm/test/Transforms/SampleProfile/discriminator.ll
The file was modified llvm/test/Transforms/SampleProfile/calls.ll
The file was modified llvm/test/Transforms/SampleProfile/propagate.ll
The file was modified llvm/test/Transforms/SampleProfile/remap.ll
The file was modified llvm/test/Transforms/SampleProfile/branch.ll
The file was modified llvm/test/Transforms/SampleProfile/offset.ll
The file was modified llvm/test/Transforms/SampleProfile/fnptr.ll
Commit 73fb9698c0573778787e77a8ffa57e7fa3caebd4 by ro
[asan][test] Several Posix/unpoison-alternate-stack.cpp fixes

`Posix/unpoison-alternate-stack.cpp` currently `FAIL`s on Solaris/i386.
Some of the problems are generic:

- `clang` warns compiling the testcase:

  compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp:83:7: warning: nested designators are a C99 extension [-Wc99-designator]
        .sa_sigaction = signalHandler,
        ^~~~~~~~~~~~~
  compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp:84:7: warning: ISO C++ requires field designators to be specified in declaration order; field '_funcptr' will be initialized after field 'sa_flags' [-Wreorder-init-list]
        .sa_flags = SA_SIGINFO | SA_NODEFER | SA_ONSTACK,
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  and some more instances.  This can all easily be avoided by initializing
  each field separately.

- The test `SEGV`s in `__asan_memcpy`.  The default Solaris/i386 stack size
  is only 4 kB, while `__asan_memcpy` tries to allocate either 5436
  (32-bit) or 10688 bytes (64-bit) on the stack.  This patch avoids this by
  requiring at least 16 kB stack size.

- Even without `-fsanitize=address` I get an assertion failure:

  Assertion failed: !isOnSignalStack(), file compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp, line 117

  The fundamental problem with this testcase is that `longjmp` from a
  signal handler is highly unportable; XPG7 strongly warns against it and
  it is thus unspecified which stack is used when `longjmp`ing from a
  signal handler running on an alternative stack.

  So I'm `XFAIL`ing this testcase on Solaris.
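
  For the designated-initializer warnings in the first item above, initializing each field separately could look like the sketch below. It assumes a POSIX system; `makeAction` is a hypothetical helper and the handler body is a placeholder, not the code from the test.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

// Placeholder handler with the SA_SIGINFO signature.
static void signalHandler(int, siginfo_t *, void *) {}

// Builds the action by assigning fields one at a time, which avoids
// both -Wc99-designator and -Wreorder-init-list regardless of how the
// platform lays out struct sigaction.
struct sigaction makeAction() {
  struct sigaction Action;
  std::memset(&Action, 0, sizeof(Action));
  Action.sa_sigaction = signalHandler;
  Action.sa_flags = SA_SIGINFO | SA_NODEFER | SA_ONSTACK;
  sigemptyset(&Action.sa_mask);
  return Action;
}
```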

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D88501
The file was modified compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp
Commit 8d8cb1ad80b7074ac60d070fae89261894d34a0d by dancgr
[AArch64] Avoid pairing loads when the base reg is modified

When pairing loads, we should check if in between the two loads the
base register has been modified. If that is the case then avoid pairing
them because the second load actually loads from a different address.
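
The check can be illustrated on a toy instruction list. This is only a sketch under simplifying assumptions: `ToyInst` and `canPairLoads` are hypothetical and bear no relation to the data structures in AArch64LoadStoreOptimizer.cpp, and the scan is deliberately conservative in that it also rejects a first load that overwrites its own base register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction model; LLVM's MachineInstr is far richer.
struct ToyInst {
  bool IsLoad;  // true for a load from [BaseReg + Offset]
  int DefReg;   // register written by this instruction
  int BaseReg;  // base register of a load, -1 otherwise
  int Offset;   // immediate offset of a load
};

// Returns true only if both instructions are loads off the same base
// register and nothing from First up to (not including) Second writes
// that register.
bool canPairLoads(const std::vector<ToyInst> &Insts, size_t First,
                  size_t Second) {
  assert(First < Second && Second < Insts.size());
  const ToyInst &A = Insts[First];
  const ToyInst &B = Insts[Second];
  if (!A.IsLoad || !B.IsLoad || A.BaseReg != B.BaseReg)
    return false;
  for (size_t I = First; I < Second; ++I)
    if (Insts[I].DefReg == A.BaseReg)
      return false; // the second load would read a different address
  return true;
}
```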

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D86956
The file was modified llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
The file was added llvm/test/CodeGen/AArch64/aarch64-ldst-modified-baseReg.mir
Commit 187686bea3878c0bf2b150d784e7eab223434e25 by spatel
[CodeGen] add test for NAN creation; NFC

This goes with the APFloat change proposed in
D88238.
This is copied from the MIPS-specific test in
builtin-nan-legacy.c to verify that the normal
behavior is correct on other targets without the
complication of an inverted quiet bit.
The file was added clang/test/CodeGen/builtin-nan-exception.c
Commit 700e63293eea4a23440f300b1e9125ca2e80c6e9 by flo
[Sema] Support Comma operator for fp16 vectors.

The current half-vector handling asserted that
"(LHS is half vector) == (RHS is half vector)"
holds for the comma operator.

Reviewed By: ahatanak, fhahn

Differential Revision: https://reviews.llvm.org/D88265
The file was modified clang/lib/Sema/SemaExpr.cpp
The file was modified clang/test/Sema/fp16vec-sema.c
Commit 892df30a7f344b6cb9995710efbc94bb25cfb95b by richard
Fix interaction of `constinit` and `weak`.

We previously took a shortcut and said that weak variables never have
constant initializers (because those initializers are never correct to
use outside the variable). We now say that weak variables can have
constant initializers, but are never usable in constant expressions.
The file was modified clang/lib/AST/ExprConstant.cpp
The file was added clang/test/SemaCXX/cxx20-constinit.cpp
The file was modified clang/lib/AST/Decl.cpp
The file was modified clang/lib/Sema/SemaDeclCXX.cpp
Commit 9d2378b59150f6f1cb5c9cf42ea06b0bb57029a1 by huberjn
[OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload

Summary:
This patch adds an error to Clang that detects if OpenMP offloading is used
between two architectures with incompatible pointer sizes. This ensures that
the data mapping can be done correctly and solves an issue in code generation
generating the wrong size pointer.

Reviewer: jdoerfert

Subscribers:

Tags: #OpenMP #Clang

Differential Revision:
The file was modified clang/lib/Frontend/CompilerInvocation.cpp
The file was added clang/test/OpenMP/target_incompatible_architecture_messages.cpp
The file was modified clang/include/clang/Basic/DiagnosticDriverKinds.td
The file was modified clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
Commit 90eaedda9b8ef46e2c0c1b8bce33e98a3adbb68c by jhuber6
[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def

Summary:
Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU
for OpenMP device code generation with ones in OMPKinds.def and use
OMPIRBuilder for generating runtime calls. This allows us to consolidate
more OpenMP code generation into the OMPIRBuilder. This patch also
invalidates specifying target architectures with conflicting pointer
sizes.

Reviewers: jdoerfert

Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl

Tags: #OpenMP #Clang #LLVM

Differential Revision: https://reviews.llvm.org/D88430
The file was modified clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
The file was modified clang/lib/CodeGen/CodeGenModule.h
The file was modified llvm/test/Transforms/OpenMP/add_attributes.ll
The file was modified clang/lib/CodeGen/CGOpenMPRuntime.h
The file was modified clang/test/OpenMP/nvptx_parallel_codegen.cpp
The file was modified llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
Commit afc277b0ed0dcd9fbbde6015bbdf289349fb2104 by daltenty
[AIX][Clang][Driver] Link libm in c++ mode

since that is the normal behaviour of other compilers on the platform.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88500
The file was modified clang/test/Driver/aix-ld.c
The file was modified clang/lib/Driver/ToolChains/AIX.cpp
Commit 8955950c121c97a686310991203c89ba14c90b82 by rahmanl
Exception support for basic block sections

This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf

This patch provides exception support for basic block sections by splitting the call-site table into call-site ranges corresponding to different basic block sections. All landing pads must still reside in the same basic block section (which is guaranteed by the core basic block section patch D73674 (ExceptionSection)). Each call-site table refers to the landing pad fragment by explicitly specifying @LPStart (which is omitted in the normal non-basic-block-section case). All these call-site tables share their action and type tables.

The C++ ABI implicitly assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. With basic block sections, where one section contains all the landing pads, the landing pad offset relative to LPStart could actually be zero. Thus, we avoid zero-offset landing pads by inserting a **nop** operation as the first non-CFI instruction in the exception section.

**Background on Exception Handling in C++ ABI**
https://github.com/itanium-cxx-abi/cxx-abi/blob/master/exceptions.pdf

The compiler emits an exception table for every function. When an exception is thrown, the stack unwinding library queries the unwind table (which includes the start and end of each function) to locate the exception table for that function.

The exception table includes a call site table for the function, which is used to guide the exception handling runtime to take the appropriate action upon an exception. Each call site record in this table is structured as follows:

| CallSite              |  -->  Position of the call site (relative to the function entry)
| CallSite length       |  -->  Length of the call site.
| Landing Pad           |  -->  Position of the landing pad (relative to the landing pad fragment’s begin label)
| Action record offset  |  -->  Position of the first action record

The call site records partition a function into different pieces and describe what action must be taken for each callsite. The callsite fields are relative to the start of the function (as captured in the unwind table).

The landing pad entry is a reference into the function and corresponds roughly to the catch block of a try/catch statement. When execution resumes at a landing pad, it receives an exception structure and a selector value corresponding to the type of the exception thrown, and executes similar to a switch-case statement. The landing pad field is relative to the beginning of the procedure fragment which includes all the landing pads (@LPStart). The C++ ABI requires all landing pads to be in the same fragment. Nonetheless, without basic block sections, @LPStart is the same as the function @Start (found in the unwind table) and can be omitted.

The action record offset is an index into the action table which includes information about which exception types are caught.

**C++ Exceptions with Basic Block Sections**
Basic block sections break the contiguity of a function fragment. Therefore, call sites must be specified relative to the beginning of the basic block section. Furthermore, the unwinding library should be able to find the corresponding callsites for each section. To do so, the .cfi_lsda directive for a section must point to the range of call-sites for that section.
This patch introduces a new **CallSiteRange** structure which specifies the range of call-sites which correspond to every section:

  `struct CallSiteRange {
    // Symbol marking the beginning of the procedure fragment.
    MCSymbol *FragmentBeginLabel = nullptr;
    // Symbol marking the end of the procedure fragment.
    MCSymbol *FragmentEndLabel = nullptr;
    // LSDA symbol for this call-site range.
    MCSymbol *ExceptionLabel = nullptr;
    // Index of the first call-site entry in the call-site table which
    // belongs to this range.
    size_t CallSiteBeginIdx = 0;
    // Index just after the last call-site entry in the call-site table which
    // belongs to this range.
    size_t CallSiteEndIdx = 0;
    // Whether this is the call-site range containing all the landing pads.
    bool IsLPRange = false;
  };`

With N basic-block-sections, the call-site table is partitioned into N call-site ranges.
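
A rough sketch of that partitioning, under stated assumptions: `ToyCallSite`, `ToyRange` and `partitionCallSites` are hypothetical simplifications in which a plain section id replaces the MCSymbol labels, the call sites are assumed to be already grouped by section, and a section without any call site produces no range here (the patch itself may treat that case differently).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the real structures.
struct ToyCallSite {
  unsigned SectionId; // basic block section the call site belongs to
};

struct ToyRange {
  unsigned SectionId;
  size_t BeginIdx; // index of the first call site in this range
  size_t EndIdx;   // index just after the last call site in this range
};

// Walks the call-site table once and opens a new range whenever the
// section changes. Assumes entries of one section are contiguous.
std::vector<ToyRange>
partitionCallSites(const std::vector<ToyCallSite> &Sites) {
  std::vector<ToyRange> Ranges;
  for (size_t I = 0; I < Sites.size(); ++I) {
    if (Ranges.empty() || Ranges.back().SectionId != Sites[I].SectionId)
      Ranges.push_back({Sites[I].SectionId, I, I});
    Ranges.back().EndIdx = I + 1;
  }
  return Ranges;
}
```

The half-open `[BeginIdx, EndIdx)` convention mirrors the `CallSiteBeginIdx` and `CallSiteEndIdx` comments in the struct above.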

Conceptually, we emit the call-site ranges for sections sequentially in the exception table as if each section has its own exception table. In the example below, two sections result in two call-site ranges (denoted by LSDA1 and LSDA2) placed next to each other. However, their call-sites will refer to records in the shared Action Table. We also emit the header fields (@LPStart and CallSite Table Length) for each call-site range in order to place the call-site ranges in separate LSDAs. We note that with -basic-block-sections, the CallSiteTableLength will not actually represent the length of the call-site table, but rather the reference to the action table. Since the only purpose of this field is to locate the action table, correctness is guaranteed.

Finally, every call site range has one @LPStart pointer so the landing pads of each section must all reside in one section (not necessarily the same section). To make this easier, we decide to place all landing pads of the function in one section (hence the `IsLPRange` field in CallSiteRange).

| @LPStart              |  --->  Landing pad fragment (LSDA1 points here)
| CallSite Table Length |  --->  Used to find the action table.
| CallSites             |
| …                     |
| …                     |
| @LPStart              |  --->  Landing pad fragment (LSDA2 points here)
| CallSite Table Length |
| CallSites             |
| …                     |
| …                     |

| Action Table          |
| Types Table           |

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73739
The file was modified llvm/include/llvm/CodeGen/AsmPrinterHandler.h
The file was added llvm/test/CodeGen/X86/gcc_except_table_bb_sections_ehpad_groups_with_cold.ll
The file was modified llvm/lib/CodeGen/AsmPrinter/EHStreamer.h
The file was modified llvm/lib/CodeGen/AsmPrinter/WasmException.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/WasmException.h
The file was modified llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
The file was modified llvm/lib/Target/ARM/ARMAsmPrinter.cpp
The file was added llvm/test/CodeGen/X86/gcc_except_table_bb_sections.ll
The file was modified llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp
The file was modified llvm/include/llvm/CodeGen/AsmPrinter.h
The file was modified llvm/lib/CodeGen/BasicBlockSections.cpp
Commit c3193e464cbd5e8b7cade103032c222bf8bc0e27 by rupprecht
[lldb/ipv6] Support running lldb tests in an ipv6-only environment.

When running in an ipv6-only environment where `AF_INET` sockets are not available, many lldb tests (mostly gdb remote tests) fail because things like `127.0.0.1` don't work there.

Use `localhost` instead of `127.0.0.1` whenever possible, or include a fallback of creating `AF_INET6` sockets when `AF_INET` fails.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D87333
The file was modified lldb/unittests/Host/SocketTest.cpp
The file was modified lldb/test/API/functionalities/gdb_remote_client/gdbclientutils.py
The file was modified lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
The file was modified lldb/test/API/tools/lldb-server/commandline/TestStubReverseConnect.py
The file was modified lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp
The file was modified lldb/tools/lldb-server/lldb-gdbserver.cpp
The file was modified lldb/unittests/Host/SocketTestUtilities.cpp
Commit 655af658c93bf7f133341e7eb5a2dfa176282781 by ezhulenev
[MLIR] Add async.value type to Async dialect

Return values from async regions as !async.value<...>.

Reviewed By: mehdi_amini, csigg

Differential Revision: https://reviews.llvm.org/D88510
The file was modified mlir/include/mlir/Dialect/Async/IR/Async.h
The file was modified mlir/test/Dialect/Async/ops.mlir
The file was modified mlir/lib/Dialect/Async/IR/Async.cpp
The file was modified mlir/include/mlir/Dialect/Async/IR/AsyncBase.td
The file was modified mlir/include/mlir/Dialect/Async/IR/AsyncOps.td
Commit ad865d9d10b8cf93738470175aae1be7a4a3eb6b by rupprecht
[lldb-vscode] Allow an empty 'breakpoints' field to clear breakpoints.

Per the DAP spec for SetBreakpoints [1], the way to clear breakpoints is: `To clear all breakpoint for a source, specify an empty array.`

However, leaving the breakpoints field unset is also a well-formed request (note the `breakpoints?:` in the `SetBreakpointsArguments` definition). If it's unset, we have a couple of choices:

1. Crash (current behavior)
2. Clear breakpoints
3. Return an error response that the breakpoints field is missing.

I propose we do (2) instead of (1), and treat an unset breakpoints field the same as an empty breakpoints field.
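
Option (2) amounts to mapping a missing field onto an empty list before the usual handling runs. A minimal sketch, assuming the parsed request exposes the field as a possibly-null pointer; `requestedLines` is a hypothetical helper and is not the code in lldb-vscode.cpp:

```cpp
#include <cassert>
#include <vector>

// nullptr models a SetBreakpoints request whose optional "breakpoints"
// field is absent. An absent field and an empty array both yield an
// empty list, which the caller then treats as "clear all breakpoints".
std::vector<int> requestedLines(const std::vector<int> *BreakpointsField) {
  if (BreakpointsField == nullptr)
    return std::vector<int>();
  return *BreakpointsField;
}
```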

[1] https://microsoft.github.io/debug-adapter-protocol/specification#Requests_SetBreakpoints

Reviewed By: wallace, labath

Differential Revision: https://reviews.llvm.org/D88513
The file was modified lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
The file was modified lldb/test/API/tools/lldb-vscode/breakpoint/TestVSCode_setBreakpoints.py
The file was modified lldb/tools/lldb-vscode/lldb-vscode.cpp
Commit afaeb6af79a4278249ef9114755e5685d0b35984 by jingham
Fix crash in SBStructuredData::GetDescription() when there's no StructuredDataPlugin.

Also, use the StructuredData::Dump method to print the StructuredData if there
is no plugin, rather than just returning an error.

Differential Revision: https://reviews.llvm.org/D88266
The file was modified lldb/include/lldb/Core/StructuredDataImpl.h
The file was modified lldb/test/API/python_api/sbstructureddata/TestStructuredDataAPI.py
Commit 2d761a368c3637cb6a6b05eb10ac8d839efe77cc by aeubanks
[test][NewPM][SampleProfile] Fix more tests under NPM

These all have separate legacy and new PM RUN lines.
The file was modified llvm/test/Transforms/SampleProfile/profile-sample-accurate.ll
The file was modified llvm/test/Transforms/SampleProfile/flattened.ll
The file was modified llvm/test/Transforms/SampleProfile/inline-mergeprof.ll
Commit 490b556a0f3c9daddd05651d945662b93b3b13b9 by Louis Dionne
[libc++] Make sure we don't attempt to run check-cxx-abilist when libc++ doesn't define new/delete

That would make the test fail spuriously because we don't generate
an ABI list for that configuration.
The file was modified libcxx/lib/abi/CMakeLists.txt
Commit bdc85292fb0f2a3965c8c65f9461d285b04841ed by huberjn
Revert "[OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload"

Failing tests on Arm due to the tests automatically populating
incompatible pointer width architectures. Reverting until the tests are
updated. Failing tests:

OpenMP/distribute_parallel_for_num_threads_codegen.cpp
OpenMP/distribute_parallel_for_if_codegen.cpp
OpenMP/distribute_parallel_for_simd_if_codegen.cpp
OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp
OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
OpenMP/teams_distribute_parallel_for_if_codegen.cpp
OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp

This reverts commit 9d2378b59150f6f1cb5c9cf42ea06b0bb57029a1.
The file was modified clang/include/clang/Basic/DiagnosticDriverKinds.td
The file was modified clang/lib/Frontend/CompilerInvocation.cpp
The file was removed clang/test/OpenMP/target_incompatible_architecture_messages.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
Commit 81921ebc430536ae5718da70a54328c790c8ae19 by spatel
[CodeGen] improve coverage for float (32-bit) type of NAN; NFC

Goes with D88238
The file was modified clang/test/CodeGen/builtin-nan-exception.c
Commit 1b60f63e4fd041550019b692dc7bf490dce2c75c by jhuber6
Revert "[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def"

Failing tests on Arm due to the tests automatically populating
incompatible pointer width architectures. Reverting until the tests are
updated. Failing tests:

OpenMP/distribute_parallel_for_num_threads_codegen.cpp
OpenMP/distribute_parallel_for_if_codegen.cpp
OpenMP/distribute_parallel_for_simd_if_codegen.cpp
OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp
OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp
OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp
OpenMP/teams_distribute_parallel_for_if_codegen.cpp
OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp

This reverts commit 90eaedda9b8ef46e2c0c1b8bce33e98a3adbb68c.
The file was modified clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
The file was modified llvm/test/Transforms/OpenMP/add_attributes.ll
The file was modified clang/lib/CodeGen/CGOpenMPRuntime.h
The file was modified clang/test/OpenMP/nvptx_parallel_codegen.cpp
The file was modified llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
The file was modified clang/lib/CodeGen/CodeGenModule.h
Commit e9b38841619f20a6f4c8657880fd487083ba499a by csigg
Add GDB prettyprinters for a few more MLIR types.

Reviewed By: dblaikie, jpienaar

Differential Revision: https://reviews.llvm.org/D87159
The file was added debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.gdb
The file was modified debuginfo-tests/CMakeLists.txt
The file was added debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp
The file was added mlir/utils/gdb-scripts/prettyprinters.py
The file was modified debuginfo-tests/llvm-prettyprinters/gdb/llvm-support.cpp
The file was modified debuginfo-tests/llvm-prettyprinters/gdb/lit.local.cfg
The file was modified debuginfo-tests/lit.site.cfg.py.in
The file was modified debuginfo-tests/lit.cfg.py
Commit dd14e5825209386129770296f9bc3a14ab0b4592 by thomasraoux
[mlir][vector] First step of vector distribution transformation

This is the first of several steps to support distributing large vectors. This
adds instructions extract_map and insert_map that allow us to do incremental
lowering. Right now the transformation only applies to simple pointwise operations
with a vector size matching the multiplicity of the IDs used to distribute the
vector.
This can be used to distribute large vectors to loops or SPMD.

Differential Revision: https://reviews.llvm.org/D88341
The file was modified mlir/include/mlir/Dialect/Vector/VectorOps.td
The file was modified mlir/lib/Dialect/Vector/VectorOps.cpp
The file was modified mlir/test/Dialect/Vector/ops.mlir
The file was modified mlir/test/lib/Transforms/TestVectorTransforms.cpp
The file was modified mlir/test/Dialect/Vector/invalid.mlir
The file was modified mlir/include/mlir/Dialect/Vector/VectorTransforms.h
The file was modified mlir/lib/Dialect/Vector/VectorTransforms.cpp
The file was added mlir/test/Dialect/Vector/vector-distribution.mlir
Commit ce5379f0f0675592fd10a522009fd5b1561ca72b by aeubanks
[NPM] Add target specific hook to add passes for New Pass Manager

The patch adds a new TargetMachine member "registerPassBuilderCallbacks" for targets to add passes to the pass pipeline using the New Pass Manager (similar to adjustPassManager for the Legacy Pass Manager).

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D88138
The file was added llvm/test/CodeGen/Hexagon/registerpassbuildercallbacks.ll
The file was modified llvm/include/llvm/Target/TargetMachine.h
The file was modified llvm/tools/opt/NewPMDriver.cpp
The file was modified clang/lib/CodeGen/BackendUtil.cpp
The file was modified llvm/lib/Target/Hexagon/HexagonTargetMachine.cpp
The file was modified llvm/lib/Target/Hexagon/HexagonTargetMachine.h
Commit d1d7fc98325d948bede85e6304c5ca93f79e050e by craig.topper
[X86] Canonicalize (x > 1) ? x : 1 -> (x >= 1) ? x : 1 for sign and unsigned to enable the use of test instructions for the compare.

This will be further canonicalized to a compare involving 0,
which will enable the use of test instructions, using either
cmovg for signed or cmovne for unsigned.
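
The equivalences behind this canonicalization can be checked at the source level. The sketch below is illustrative only: the function names are hypothetical and it says nothing about the instructions any particular compiler emits. For signed integers `x >= 1` is the same as `x > 0`, and for unsigned integers `x >= 1` is the same as `x != 0`, so every form selects the same value.

```cpp
#include <cassert>
#include <cstdint>

// Signed variants of max(x, 1).
int32_t maxWithOneOriginal(int32_t X) { return X > 1 ? X : 1; }
int32_t maxWithOneGE(int32_t X) { return X >= 1 ? X : 1; }
int32_t maxWithOneZeroCmp(int32_t X) { return X > 0 ? X : 1; }

// Unsigned variants of max(x, 1).
uint32_t umaxWithOneOriginal(uint32_t X) { return X > 1 ? X : 1; }
uint32_t umaxWithOneZeroCmp(uint32_t X) { return X != 0 ? X : 1; }

// Brute-force comparison of all forms over [Lo, Hi]; the unsigned forms
// are fed the same bit pattern reinterpreted as uint32_t.
bool formsAgree(int32_t Lo, int32_t Hi) {
  for (int64_t V = Lo; V <= Hi; ++V) {
    const int32_t S = static_cast<int32_t>(V);
    const uint32_t U = static_cast<uint32_t>(S);
    if (maxWithOneOriginal(S) != maxWithOneGE(S) ||
        maxWithOneOriginal(S) != maxWithOneZeroCmp(S))
      return false;
    if (umaxWithOneOriginal(U) != umaxWithOneZeroCmp(U))
      return false;
  }
  return true;
}
```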

Fixes more cases for PR47049
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
The file was modified llvm/test/CodeGen/X86/cmov.ll
Commit 8a1084a9486313e9f46e61ab69f80309c7050e1f by ro
[asan][test] XFAIL Posix/no_asan_gen_globals.c on Solaris

`Posix/no_asan_gen_globals.c` currently `FAIL`s on Solaris:

  $ nm no_asan_gen_globals.c.tmp.exe | grep ___asan_gen_
  0809696a r .L___asan_gen_.1
  0809a4cd r .L___asan_gen_.2
  080908e2 r .L___asan_gen_.4
  0809a4cd r .L___asan_gen_.5
  0809a529 r .L___asan_gen_.7
  0809a4cd r .L___asan_gen_.8

As detailed in Bug 47607, there are two factors here:

- `clang` plays games by emitting some local labels into the symbol
  table.  When instead one uses `-fno-integrated-as` to have `gas` create
  the object files, they don't land in the objects in the first place.
- Unlike GNU `ld`, the Solaris `ld` doesn't support
  `-X`/`--discard-locals` but instead relies on the assembler to follow its
  specification and not emit local labels.

Therefore this patch `XFAIL`s the test on Solaris.

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D88218
The file was modified compiler-rt/test/asan/TestCases/Posix/no_asan_gen_globals.c
Commit ae4c400e02fc3f7cff11cc332e6b107353b3e6a2 by hubert.reinterpretcast
[NFC] Fix spacing in clang/test/Driver/aix-ld.c

Fix one line with mismatch in indentation after afc277b0ed0d.
The file was modified clang/test/Driver/aix-ld.c
Commit 0c3c8f4ae69a619efd8dc088e2572db172d40547 by pklausler
[flang] Fix descriptor-based array data item I/O for list-directed CHARACTER & LOGICAL

These types have to distinguish list-directed I/O from formatted I/O,
and the subscript incrementation call was in the formatted branch
of the if() rather than after the if().

Differential revision: https://reviews.llvm.org/D88606
The file was modified flang/runtime/descriptor-io.h
The file was modified flang/unittests/Runtime/hello.cpp
Commit 85fc5bf341395171e67490061f6fbc76b297b78d by sam.mccall
[clangd] Remove dead variable. NFC
The file was modified clang-tools-extra/clangd/URI.cpp
Commit 49b3459930655d879b2dc190ff8fe11c38a8be5f by rnk
[PDB] Merge types in parallel when using ghashing

This makes type merging much faster (-24% on chrome.dll) when multiple
threads are available, but it slightly increases the time to link (+10%)
when /threads:1 is passed. With only one more thread, the new type
merging is faster (-11%). The output PDB should be identical to what it
was before this change.

To give an idea, here is the /time output placed side by side:
                              BEFORE    | AFTER
  Input File Reading:           956 ms  |  968 ms
  Code Layout:                  258 ms  |  190 ms
  Commit Output File:             6 ms  |    7 ms
  PDB Emission (Cumulative):   6691 ms  | 4253 ms
    Add Objects:               4341 ms  | 2927 ms
      Type Merging:            2814 ms  | 1269 ms  -55%!
      Symbol Merging:          1509 ms  | 1645 ms
    Publics Stream Layout:      111 ms  |  112 ms
    TPI Stream Layout:          764 ms  |   26 ms  trivial
    Commit to Disk:            1322 ms  | 1036 ms  -300ms
----------------------------------------- --------
Total Link Time:               8416 ms    5882 ms  -30% overall

The main source of the additional overhead in the single-threaded case
is the need to iterate all .debug$T sections up front to check which
type records should go in the IPI stream. See fillIsItemIndexFromDebugT.
With changes to the .debug$H section, we could pre-calculate this info
and eliminate the need to do this walk up front. That should restore
single-threaded performance back to what it was before this change.

This change will cause LLD to be much more parallel than it used to, and
for users who do multiple links in parallel, it could regress
performance. However, when the user is only doing one link, it's a huge
improvement. In the future, we can use NT worker threads to avoid
oversaturating the machine with work, but for now, this is such an
improvement for the single-link use case that I think we should land
this as is.

Algorithm
----------

Before this change, we essentially used a
DenseMap<GloballyHashedType, TypeIndex> to check if a type has already
been seen, and if it hasn't been seen, insert it now and use the next
available type index for it in the destination type stream. DenseMap
does not support concurrent insertion, and even if it did, the linker
must be deterministic: it cannot produce different PDBs by using
different numbers of threads. The output type stream must be in the same
order regardless of the order of hash table insertions.
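
For illustration, the serial scheme described above can be sketched as follows. This is a minimal sketch, not the LLD code: `GHash` is a hypothetical 64-bit stand-in for GloballyHashedType, `std::unordered_map` stands in for DenseMap, and `mergeSerially` is an invented name.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a global type hash; the real GloballyHashedType
// is a wider hash, not a 64-bit integer.
using GHash = std::uint64_t;

// Assigns destination indices in first-seen order, as a serial map-based
// merger would. Returns the destination index for each input record and
// stores the number of distinct records in numUnique.
std::vector<std::uint32_t> mergeSerially(const std::vector<GHash> &input,
                                         std::uint32_t &numUnique) {
  std::unordered_map<GHash, std::uint32_t> seen;
  std::vector<std::uint32_t> dest;
  dest.reserve(input.size());
  for (GHash h : input) {
    // try_emplace inserts only if the hash is new; the candidate index is
    // the next free slot in the destination stream.
    auto it =
        seen.try_emplace(h, static_cast<std::uint32_t>(seen.size())).first;
    dest.push_back(it->second);
  }
  numUnique = static_cast<std::uint32_t>(seen.size());
  return dest;
}
```

The assigned indices depend on the order in which records are visited, which is why this approach cannot simply be run from several threads at once.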

In order to create a hash table that supports concurrent insertion, the
table cells must be small enough that they can be updated atomically.
The algorithm I used for updating the table using linear probing is
described in this paper, "Concurrent Hash Tables: Fast and General(?)!":
https://dl.acm.org/doi/10.1145/3309206

The GHashCell in this change is essentially a pair of 32-bit integer
indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the
TpiSource object, and it represents an input type stream. The typeIndex
is the index of the type in the stream. Together, we have something like
a ragged 2D array of ghashes, which can be looked up as:
  tpiSources[tpiSrcIndex]->ghashes[typeIndex]

By using these side tables, we can omit the key data from the hash
table, and keep the table cell small. There is a cost to this: resolving
hash table collisions requires many more loads than simply looking at
the key in the same cache line as the insertion position. However, most
supported platforms should have a 64-bit CAS operation to update the
cell atomically.

To make the result of concurrent insertion deterministic, the cell
payloads must have a priority function. Defining one is pretty
straightforward: compare the two 32-bit numbers as a combined 64-bit
number. This means that types coming from inputs earlier on the command
line have a higher priority and are more likely to appear earlier in the
final PDB type stream than types from an input appearing later on the
link line.
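
A rough sketch of the priority idea, assuming a hypothetical `Cell` layout (source index in the high 32 bits, type index in the low 32 bits, zero reserved for "empty"); the actual GHashCell may be laid out differently. It shows a single slot only and omits linear probing and the ghash comparison that decides whether two cells refer to the same type.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit cell. Stored values are offset by one so that zero
// can mean "empty"; the pair (0xFFFFFFFF, 0xFFFFFFFF) is therefore unusable.
struct Cell {
  std::uint64_t bits = 0;
  static Cell make(std::uint32_t sourceIndex, std::uint32_t typeIndex) {
    return Cell{((static_cast<std::uint64_t>(sourceIndex) << 32) | typeIndex) +
                1};
  }
  bool empty() const { return bits == 0; }
  std::uint32_t sourceIndex() const {
    return static_cast<std::uint32_t>((bits - 1) >> 32);
  }
  std::uint32_t typeIndex() const {
    return static_cast<std::uint32_t>(bits - 1);
  }
};

// Smaller bits means higher priority: earlier input first, then earlier
// type within that input. Publishes candidate into slot unless a cell of
// equal or higher priority is already there, so the final slot value does
// not depend on the order in which threads arrive.
void publish(std::atomic<std::uint64_t> &slot, Cell candidate) {
  std::uint64_t cur = slot.load(std::memory_order_relaxed);
  while (cur == 0 || candidate.bits < cur) {
    if (slot.compare_exchange_weak(cur, candidate.bits,
                                   std::memory_order_acq_rel,
                                   std::memory_order_relaxed))
      return;
    // On failure cur holds the value another thread stored; re-check it.
  }
}
```

Because the whole payload fits in one 64-bit word, a single compare-and-swap is enough to replace a lower-priority cell.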

After table insertion, the non-empty cells in the table can be copied
out of the main table and sorted by priority to determine the ordering
of the final type index stream. At this point, item and type records
must be separated, either by sorting or by splitting into two arrays,
and I chose sorting. This is why the GHashCell must contain the isItem
bit.
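
One way to picture the sorting step, assuming the isItem flag occupies the most significant bit of a 64-bit cell (an assumption made for this sketch; `packCell` and `sortAndSplit` are hypothetical names): a plain integer sort then places all type records before all item records, each group in priority order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packing: isItem in bit 63, source index in bits 32..62,
// type index in bits 0..31.
std::uint64_t packCell(bool isItem, std::uint32_t sourceIndex,
                       std::uint32_t typeIndex) {
  return (static_cast<std::uint64_t>(isItem) << 63) |
         (static_cast<std::uint64_t>(sourceIndex) << 32) | typeIndex;
}

// Sorts cells by priority and returns the position of the first item
// record: type records occupy [0, result), item records [result, size).
std::size_t sortAndSplit(std::vector<std::uint64_t> &cells) {
  std::sort(cells.begin(), cells.end());
  auto firstItem = std::lower_bound(cells.begin(), cells.end(),
                                    std::uint64_t{1} << 63);
  return static_cast<std::size_t>(firstItem - cells.begin());
}
```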

Once the final PDB TPI stream ordering is known, we need to compute a
mapping from source type index to PDB type index. To avoid starting over
from scratch and looking up every type again by its ghash, we save the
insertion position of every hash table insertion during the first
insertion phase. Because the table does not support rehashing, the
insertion position is stable. Using the array of insertion positions
indexed by source type index, we can replace the source type indices in
the ghash table cells with the PDB type indices.

Once the table cells have been updated to contain PDB type indices, the
mapping for each type source can be computed in parallel. Simply iterate
the list of cell positions and replace them with the PDB type index,
since the insertion positions are no longer needed.
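
The per-source remapping can be pictured with the sketch below. It assumes the table has already been rewritten so that each slot holds a final PDB type index, and that `positions[i]` is the slot recorded for source type `i` during insertion; both containers and the function name are hypothetical simplifications.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Turns recorded insertion positions into final type indices by looking
// each position up in the rewritten table. Each type source owns its own
// positions array, so different sources can run this independently.
std::vector<std::uint32_t>
buildIndexMap(const std::vector<std::uint32_t> &table,
              std::vector<std::uint32_t> positions) {
  for (std::uint32_t &p : positions)
    p = table[p]; // the slot number is no longer needed once it is resolved
  return positions;
}
```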

Once we have a source to destination type index mapping for every type
source, there are no more data dependencies. We know which type records
are "unique" (not duplicates), and what their final type indices will
be. We can do the remapping in parallel, and accumulate type sizes and
type hashes in parallel by type source.

Lastly, TPI stream layout must be done serially. Accumulate all the type
records, sizes, and hashes, and add them to the PDB.

Differential Revision: https://reviews.llvm.org/D87805
The file was modifiedlld/COFF/DebugTypes.cpp
The file was modifiedlld/test/COFF/pdb-global-hashes.test
The file was modifiedlld/COFF/PDB.cpp
The file was modifiedlld/COFF/TypeMerger.h
The file was modifiedlld/COFF/PDB.h
The file was modifiedlld/include/lld/Common/ErrorHandler.h
The file was modifiedlld/test/COFF/precomp-link.test
The file was modifiedlld/test/COFF/pdb-procid-remapping.test
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeIndex.h
The file was modifiedlld/test/COFF/pdb-type-server-simple.test
The file was modifiedllvm/lib/DebugInfo/PDB/Native/TpiStreamBuilder.cpp
The file was modifiedllvm/include/llvm/DebugInfo/PDB/Native/TpiStreamBuilder.h
The file was modifiedlld/COFF/Driver.cpp
The file was modifiedlld/test/COFF/s_udt.s
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeHashing.h
The file was modifiedlld/test/COFF/pdb-type-server-missing.yaml
The file was modifiedllvm/lib/DebugInfo/CodeView/RecordName.cpp
The file was modifiedlld/COFF/DebugTypes.h
Commit 8d250ac3cd48d0f17f9314685a85e77895c05351 by rnk
Revert "[PDB] Merge types in parallel when using ghashing"

This reverts commit 49b3459930655d879b2dc190ff8fe11c38a8be5f.
The file was modifiedlld/COFF/PDB.cpp
The file was modifiedlld/include/lld/Common/ErrorHandler.h
The file was modifiedlld/test/COFF/precomp-link.test
The file was modifiedlld/COFF/TypeMerger.h
The file was modifiedlld/test/COFF/pdb-global-hashes.test
The file was modifiedlld/test/COFF/pdb-procid-remapping.test
The file was modifiedllvm/include/llvm/DebugInfo/PDB/Native/TpiStreamBuilder.h
The file was modifiedlld/test/COFF/s_udt.s
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeHashing.h
The file was modifiedllvm/lib/DebugInfo/PDB/Native/TpiStreamBuilder.cpp
The file was modifiedlld/COFF/DebugTypes.cpp
The file was modifiedlld/test/COFF/pdb-type-server-missing.yaml
The file was modifiedlld/COFF/DebugTypes.h
The file was modifiedllvm/lib/DebugInfo/CodeView/RecordName.cpp
The file was modifiedlld/COFF/PDB.h
The file was modifiedlld/test/COFF/pdb-type-server-simple.test
The file was modifiedlld/COFF/Driver.cpp
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeIndex.h
Commit c694588fc52a8845174fee06ad0bcfa338e87816 by ravishankarm
[mlir][Linalg] Add pattern to tile and fuse Linalg operations on buffers.

The pattern is structured similarly to other patterns like
LinalgTilingPattern. The fusion pattern takes options that allow you
to fuse with producers of multiple operands at once.
- The pattern fuses only at the level that is known to be legal, i.e.,
  if a reduction loop in the consumer is tiled, then fusion should
  happen "before" this loop. Some refactoring of the fusion code is
  needed to fuse only where it is legal.
- Since the fusion on buffers uses the LinalgDependenceGraph, which is
  not mutable in place, the fusion pattern keeps the original
  operations in the IR, but tags them with a marker that can later be
  used to find the original operations.

This change also fixes an issue with tiling and
distribution/interchange where, if the tile size of a loop was 0, it
wasn't accounted for in these.

Differential Revision: https://reviews.llvm.org/D88435
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/Fusion.cpp
The file was addedmlir/test/lib/Transforms/TestLinalgFusionTransforms.cpp
The file was modifiedmlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOpsInterface.td
The file was modifiedmlir/include/mlir/Dialect/Linalg/Utils/Utils.h
The file was modifiedmlir/test/lib/Transforms/CMakeLists.txt
The file was modifiedmlir/tools/mlir-opt/mlir-opt.cpp
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/Tiling.cpp
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
The file was modifiedmlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
The file was addedmlir/test/Dialect/Linalg/fusion-pattern.mlir
Commit 7475bd5411a3f62a7860db09a5bcf1fc147c43d6 by Vitaly Buka
[Msan] Add ptsname, ptsname_r interceptors

Reviewed By: eugenis, MaskRay

Differential Revision: https://reviews.llvm.org/D88547
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h
The file was addedcompiler-rt/test/sanitizer_common/TestCases/Linux/ptsname.c
Commit 722d792499a4b60dd582f870cbdfb572897906b4 by Stanislav.Mekhanoshin
[AMDGPU] Reorganize VOP3P encoding

This changes width of encoding and opcode fields to match the
documentation.

Differential Revision: https://reviews.llvm.org/D88619
The file was modifiedllvm/lib/Target/AMDGPU/VOPInstructions.td
The file was modifiedllvm/lib/Target/AMDGPU/VOP3PInstructions.td
Commit 5519e4da83d1abc66620334692394749eceb0e50 by rnk
Re-land "[PDB] Merge types in parallel when using ghashing"

Stored Error objects have to be checked, even if they are success
values.

This reverts commit 8d250ac3cd48d0f17f9314685a85e77895c05351.
Relands commit 49b3459930655d879b2dc190ff8fe11c38a8be5f.

Original commit message:
-----------------------------------------

This makes type merging much faster (-24% on chrome.dll) when multiple
threads are available, but it slightly increases the time to link (+10%)
when /threads:1 is passed. With only one more thread, the new type
merging is faster (-11%). The output PDB should be identical to what it
was before this change.

To give an idea, here is the /time output placed side by side:
                              BEFORE    | AFTER
  Input File Reading:           956 ms  |  968 ms
  Code Layout:                  258 ms  |  190 ms
  Commit Output File:             6 ms  |    7 ms
  PDB Emission (Cumulative):   6691 ms  | 4253 ms
    Add Objects:               4341 ms  | 2927 ms
      Type Merging:            2814 ms  | 1269 ms  -55%!
      Symbol Merging:          1509 ms  | 1645 ms
    Publics Stream Layout:      111 ms  |  112 ms
    TPI Stream Layout:          764 ms  |   26 ms  trivial
    Commit to Disk:            1322 ms  | 1036 ms  -300ms
----------------------------------------- --------
Total Link Time:               8416 ms    5882 ms  -30% overall

The main source of the additional overhead in the single-threaded case
is the need to iterate all .debug$T sections up front to check which
type records should go in the IPI stream. See fillIsItemIndexFromDebugT.
With changes to the .debug$H section, we could pre-calculate this info
and eliminate the need to do this walk up front. That should restore
single-threaded performance back to what it was before this change.

This change will cause LLD to be much more parallel than it used to, and
for users who do multiple links in parallel, it could regress
performance. However, when the user is only doing one link, it's a huge
improvement. In the future, we can use NT worker threads to avoid
oversaturating the machine with work, but for now, this is such an
improvement for the single-link use case that I think we should land
this as is.

Algorithm
----------

Before this change, we essentially used a
DenseMap<GloballyHashedType, TypeIndex> to check if a type has already
been seen, and if it hasn't been seen, insert it now and use the next
available type index for it in the destination type stream. DenseMap
does not support concurrent insertion, and even if it did, the linker
must be deterministic: it cannot produce different PDBs by using
different numbers of threads. The output type stream must be in the same
order regardless of the order of hash table insertions.

In order to create a hash table that supports concurrent insertion, the
table cells must be small enough that they can be updated atomically.
The algorithm I used for updating the table using linear probing is
described in this paper, "Concurrent Hash Tables: Fast and General(?)!":
https://dl.acm.org/doi/10.1145/3309206

The GHashCell in this change is essentially a pair of 32-bit integer
indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the
TpiSource object, and it represents an input type stream. The typeIndex
is the index of the type in the stream. Together, we have something like
a ragged 2D array of ghashes, which can be looked up as:
  tpiSources[tpiSrcIndex]->ghashes[typeIndex]

By using these side tables, we can omit the key data from the hash
table, and keep the table cell small. There is a cost to this: resolving
hash table collisions requires many more loads than simply looking at
the key in the same cache line as the insertion position. However, most
supported platforms should have a 64-bit CAS operation to update the
cell atomically.

To make the result of concurrent insertion deterministic, the cell
payloads must have a priority function. Defining one is pretty
straightforward: compare the two 32-bit numbers as a combined 64-bit
number. This means that types coming from inputs earlier on the command
line have a higher priority and are more likely to appear earlier in the
final PDB type stream than types from an input appearing later on the
link line.

After table insertion, the non-empty cells in the table can be copied
out of the main table and sorted by priority to determine the ordering
of the final type index stream. At this point, item and type records
must be separated, either by sorting or by splitting into two arrays,
and I chose sorting. This is why the GHashCell must contain the isItem
bit.

Once the final PDB TPI stream ordering is known, we need to compute a
mapping from source type index to PDB type index. To avoid starting over
from scratch and looking up every type again by its ghash, we save the
insertion position of every hash table insertion during the first
insertion phase. Because the table does not support rehashing, the
insertion position is stable. Using the array of insertion positions
indexed by source type index, we can replace the source type indices in
the ghash table cells with the PDB type indices.

Once the table cells have been updated to contain PDB type indices, the
mapping for each type source can be computed in parallel. Simply iterate
the list of cell positions and replace them with the PDB type index,
since the insertion positions are no longer needed.

Once we have a source to destination type index mapping for every type
source, there are no more data dependencies. We know which type records
are "unique" (not duplicates), and what their final type indices will
be. We can do the remapping in parallel, and accumulate type sizes and
type hashes in parallel by type source.

Lastly, TPI stream layout must be done serially. Accumulate all the type
records, sizes, and hashes, and add them to the PDB.

Differential Revision: https://reviews.llvm.org/D87805
The file was modifiedlld/COFF/Driver.cpp
The file was modifiedlld/COFF/PDB.h
The file was modifiedlld/test/COFF/pdb-type-server-missing.yaml
The file was modifiedlld/test/COFF/pdb-type-server-simple.test
The file was modifiedllvm/lib/DebugInfo/CodeView/RecordName.cpp
The file was modifiedllvm/lib/DebugInfo/PDB/Native/TpiStreamBuilder.cpp
The file was modifiedlld/COFF/DebugTypes.cpp
The file was modifiedlld/test/COFF/s_udt.s
The file was modifiedllvm/include/llvm/DebugInfo/PDB/Native/TpiStreamBuilder.h
The file was modifiedlld/test/COFF/pdb-global-hashes.test
The file was modifiedlld/COFF/TypeMerger.h
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeIndex.h
The file was modifiedlld/include/lld/Common/ErrorHandler.h
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/TypeHashing.h
The file was modifiedlld/COFF/DebugTypes.h
The file was modifiedlld/test/COFF/pdb-procid-remapping.test
The file was modifiedlld/test/COFF/precomp-link.test
The file was modifiedlld/COFF/PDB.cpp
Commit 37b2e2b04cf434b368b1edf29609be21952316f9 by pklausler
[flang] Semantic analysis for FINAL subroutines

Represent FINAL subroutines in the symbol table entries of
derived types.  Enforce constraints.  Update tests that have
inadvertent violations or modified messages.  Added a test.

The specific procedure distinguishability checking code for generics
was used to enforce distinguishability of FINAL procedures.
(Also cleaned up some confusion and redundancy noticed in the
type compatibility infrastructure while digging into that area.)

Differential revision: https://reviews.llvm.org/D88613
The file was modifiedflang/lib/Semantics/symbol.cpp
The file was modifiedflang/include/flang/Semantics/tools.h
The file was modifiedflang/test/Semantics/resolve32.f90
The file was addedflang/test/Semantics/final01.f90
The file was modifiedflang/lib/Semantics/mod-file.h
The file was modifiedflang/lib/Semantics/tools.cpp
The file was modifiedflang/lib/Semantics/pointer-assignment.cpp
The file was modifiedflang/include/flang/Evaluate/characteristics.h
The file was modifiedflang/lib/Evaluate/type.cpp
The file was modifiedflang/test/Semantics/modfile10.f90
The file was modifiedflang/include/flang/Semantics/symbol.h
The file was modifiedflang/lib/Semantics/mod-file.cpp
The file was modifiedflang/lib/Semantics/check-call.cpp
The file was modifiedflang/test/Semantics/call03.f90
The file was modifiedflang/test/Semantics/resolve55.f90
The file was modifiedflang/lib/Semantics/check-declarations.cpp
The file was modifiedflang/lib/Evaluate/characteristics.cpp
The file was modifiedflang/lib/Evaluate/tools.cpp
The file was modifiedflang/test/Semantics/call05.f90
The file was modifiedflang/include/flang/Evaluate/type.h
The file was modifiedflang/lib/Semantics/resolve-names.cpp
Commit 55cff5b288650f0ce814c3c85041852bbed554b8 by protze
[OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant

OpenMP 5.1 defines omp_get_initial_device to return the same value as omp_get_num_devices.
Since this change is also 5.0 compliant, no versioning is needed.

Differential Revision: https://reviews.llvm.org/D88149
The file was modifiedopenmp/runtime/src/kmp.h
The file was modifiedopenmp/libomptarget/src/api.cpp
The file was modifiedopenmp/runtime/src/kmp_ftn_entry.h
The file was modifiedopenmp/libomptarget/include/omptarget.h
Commit 6104b30446aa976006fd322af4a57a8f0124f94f by protze
[OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches

This patch updates the expected results for the GOMP interface patches: D87267, D87269, and D87271.
The taskwait-depend test is changed to really use taskwait-depend and copied to a task_if0-depend test.

To pass the tests, the handling of the return address was fixed.

Differential Revision: https://reviews.llvm.org/D87680
The file was modifiedopenmp/runtime/test/ompt/tasks/dependences_mutexinoutset.c
The file was modifiedopenmp/runtime/src/kmp_taskdeps.cpp
The file was modifiedopenmp/runtime/test/ompt/tasks/taskwait-depend.c
The file was modifiedopenmp/runtime/src/kmp_gsupport.cpp
The file was modifiedopenmp/runtime/src/ompt-specific.h
The file was addedopenmp/runtime/test/ompt/tasks/task_if0-depend.c
Commit 21cf2e6c263d7a50654653bce4e83ab463fae580 by Akira
Handle unknown OSes in DarwinTargetInfo::getExnObjectAlignment

rdar://problem/69727650
The file was modifiedclang/lib/Basic/Targets/OSTargets.h
The file was modifiedclang/test/SemaCXX/warn-overaligned-type-thrown.cpp
Commit 66d2e3f495948412602db4507359b4612639e523 by saghir
[PowerPC] Add outer product instructions for MMA

This patch adds outer product instructions for MMA, including related infrastructure, and their tests.

Depends on D84968.

Reviewed By: #powerpc, bsaleil, amyk

Differential Revision: https://reviews.llvm.org/D88043
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrPrefix.td
The file was modifiedllvm/lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.h
The file was modifiedllvm/lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
The file was modifiedllvm/test/MC/Disassembler/PowerPC/ppc64-encoding-ISA31.txt
The file was modifiedllvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp
The file was modifiedllvm/lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp
The file was modifiedllvm/test/MC/PowerPC/ppc64-encoding-ISA31.s
Commit b23916504a1a9f29c7519ed83813774eecce1789 by craig.topper
Patch IEEEFloat::isSignificandAllZeros and IEEEFloat::isSignificandAllOnes (bug 34579)

Patch IEEEFloat::isSignificandAllZeros and IEEEFloat::isSignificandAllOnes to behave correctly in the case that the size of the significand is a multiple of the width of the integerParts making up the significand.

The patch to IEEEFloat::isSignificandAllOnes fixes bug 34579, and the patch to IEEEFloat::isSignificandAllZeros fixes the unit test "APFloatTest.x87Next" I added here. I have included both in this diff since the changes are very similar.

Patch by Andrew Briand
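
The kind of edge case involved can be illustrated with a generic sketch. This is not the APFloat code: `lowBitsAllOnes` is a hypothetical helper that checks whether the low `bits` bits of an array of 64-bit parts are all ones, and it only shows why a bit count that is an exact multiple of the part width needs care when building the mask for the top part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the low `bits` bits of a little-endian array of 64-bit
// parts are all ones. Assumes bits > 0 and parts.size() == ceil(bits / 64).
bool lowBitsAllOnes(const std::vector<std::uint64_t> &parts, unsigned bits) {
  const unsigned partWidth = 64;
  const std::size_t last = parts.size() - 1;
  // Every part below the top one must be completely filled.
  for (std::size_t i = 0; i < last; ++i)
    if (~parts[i] != 0)
      return false;
  // Number of meaningful bits in the top part. When bits is a multiple of
  // partWidth this must be partWidth, not 0: a naive bits % partWidth would
  // yield an empty mask or require a shift by the full width, which is
  // undefined behavior.
  const unsigned used = (bits - 1) % partWidth + 1;
  const std::uint64_t mask = used == partWidth
                                 ? ~std::uint64_t{0}
                                 : (std::uint64_t{1} << used) - 1;
  return (parts[last] & mask) == mask;
}
```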
The file was modifiedllvm/lib/Support/APFloat.cpp
The file was modifiedllvm/unittests/ADT/APFloatTest.cpp
Commit 23419bfd1c8f26617bda47e6d4732dcbfe0c09a3 by protze
[OpenMP][libarcher] Allow all possible argument separators in TSAN_OPTIONS

Currently, the parser used to tokenize the TSAN_OPTIONS in libomp uses
only spaces as separators, even though TSAN in compiler-rt supports
other separators like ':' or ','.
CTest uses ':' to separate sanitizer options by default.
The documentation for other sanitizers mentions ':' as separator,
but TSAN only lists spaces, which is probably where this mismatch originated.
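
As an illustration of tokenizing on all three separators, here is a small sketch. It is not the code in ompt-tsan.cpp; `splitOptions` is a hypothetical helper, and the option strings in the usage note are only examples.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits an options string on any of the separators space, ':' or ','.
// Empty tokens produced by repeated separators are skipped.
std::vector<std::string> splitOptions(const std::string &options) {
  const std::string separators = " :,";
  std::vector<std::string> tokens;
  std::string::size_type start = options.find_first_not_of(separators);
  while (start != std::string::npos) {
    std::string::size_type end = options.find_first_of(separators, start);
    // If end is npos, substr takes everything up to the end of the string.
    tokens.push_back(options.substr(start, end - start));
    start = options.find_first_not_of(separators, end);
  }
  return tokens;
}
```

With this, a value such as `ignore_noninstrumented_modules=1:halt_on_error=1` splits into two tokens whether the separator is a colon, a comma, or a space.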

Patch provided by upsj

Differential Revision: https://reviews.llvm.org/D87144
The file was modifiedopenmp/tools/archer/ompt-tsan.cpp
The file was addedopenmp/tools/archer/tests/parallel/parallel-nosuppression.c
The file was modifiedopenmp/tools/archer/tests/lit.cfg
The file was modifiedopenmp/tools/archer/tests/parallel/parallel-simple.c
Commit e4f50e587f077c246b7f29db0b7daddf583e2b64 by ranjeet.singh
[ARM] Add missing target for Arm neon test case.

This is a follow-up to https://reviews.llvm.org/D61717, where Richard
described the issue with compiling arm_neon.h under
-flax-vector-conversions=none. It looks like the example reproducer does
actually work, but what was missing was a test entry for that target.

Differential Revision: https://reviews.llvm.org/D88546
The file was modifiedclang/test/Headers/arm-neon-header.c
Commit bc43ddf42fff5a43f23354e25a32aca19541fec5 by Jessica Paquette
[AArch64][GlobalISel] NFC: Refactor G_FCMP selection code

Refactor this so it's similar to the existing integer comparison code.

Also add some missing 64-bit testcases to select-fcmp.mir.

Refactoring to prep for improving selection for G_FCMP-related conditional
branches etc.

Differential Revision: https://reviews.llvm.org/D88614
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/select-fcmp.mir
Commit d689570d7dcb16ee241676e22324dc456837eb23 by Jonas Devlieghere
[lldb] Make TestGuiBasicDebug more lenient

Matt's change to the register allocator in 89baeaef2fa9 changed where we
end up after the `finish`. Before we'd end up on line 4.

* thread #1, queue = 'com.apple.main-thread', stop reason = step out
Return value: (int) $0 = 1
    frame #0: 0x0000000100003f7d a.out`main(argc=1, argv=0x00007ffeefbff630) at main.c:4:3
   1    extern int func();
   2
   3    int main(int argc, char **argv) {
-> 4      func(); // Break here
   5      func(); // Second
   6      return 0;
   7    }

Now, we end up on line 5.

* thread #1, queue = 'com.apple.main-thread', stop reason = step out
Return value: (int) $0 = 1

    frame #0: 0x0000000100003f8d a.out`main(argc=1, argv=0x00007ffeefbff630) at main.c:5:3
   2
   3    int main(int argc, char **argv) {
   4      func(); // Break here
-> 5      func(); // Second
   6      return 0;
   7    }

Given that this is not expected to be stable, I've made the test a
bit more lenient to accept both scenarios.
The file was modifiedlldb/test/API/commands/gui/basicdebug/TestGuiBasicDebug.py
Commit e24f0ac7a389fcb5c2f5295e717d9f7d3fcd4cea by pklausler
[flang] Allow record advancement in external formatted sequential READ

The '/' control edit descriptor causes a runtime crash for an
external formatted sequential READ because the AdvanceRecord()
member function for external units implemented only the tasks
to finish reading the current record.  Split those out into
a new FinishReadingRecord() member function, call that instead
from EndIoStatement(), and change AdvanceRecord() to both
finish reading the current record and to begin reading the next
one.

Differential revision: https://reviews.llvm.org/D88607
The file was modifiedflang/runtime/io-stmt.h
The file was modifiedflang/runtime/unit.cpp
The file was modifiedflang/runtime/io-stmt.cpp
The file was modifiedflang/runtime/unit.h
Commit 4ab45cc2260d87f18e1b05517d5d366b2e754b72 by Amara Emerson
[AArch64][GlobalISel] Add some more legal types for G_PHI, G_IMPLICIT_DEF, G_FREEZE.

Also use this opportunity to start cleaning up the mess of vector type lists we
have in the LegalizerInfo. Unfortunately since the legalizer rule builders require
std::initializer_list objects as parameters we can't programmatically generate the
type lists.
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/legalize-freeze.mir
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/legalize-phi.mir
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
Commit 460dda071e091df3b5584f21954c9209e7334c50 by aeubanks
[WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass

The legacy pass's default constructor sets UseCommandLine = true and
goes down a separate testing route. Match that in the NPM pass.

This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88588
The file was modifiedllvm/test/Transforms/WholeProgramDevirt/import.ll
The file was modifiedllvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
The file was modifiedllvm/lib/Passes/PassRegistry.def
The file was modifiedllvm/include/llvm/Transforms/IPO/WholeProgramDevirt.h
Commit 93a1fc2e18b452216be70f534da42f7702adbe1d by Amara Emerson
Try to fix build. May have used a C++ feature too new/not supported on all platforms.
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
Commit 3c45a06f26edfb7e94003adf58cb8951ea9c2ce6 by sbc
[lld][WebAssembly] Allow exporting of mutable globals

In particular, allow explicit exporting of `__stack_pointer` but
exclude this from `--export-all` to avoid requiring the mutable
globals feature whenever `--export-all` is used.

This uncovered a bug in populateTargetFeatures regarding checking
if the mutable-globals feature is allowed.

See: https://github.com/WebAssembly/binaryen/issues/2934

Differential Revision: https://reviews.llvm.org/D88506
The file was modifiedlld/docs/WebAssembly.rst
The file was modifiedlld/wasm/Writer.cpp
The file was addedlld/test/wasm/mutable-global-exports.s
The file was modifiedlld/test/wasm/mutable-globals.s
Commit d4e889f1f5723105dbab12b749503d2462eb1755 by stellaraccident
Remove `Ops` suffix from dialect library names

Dialects include more than just ops, so this suffix is outdated. Follows
discussion in
https://llvm.discourse.group/t/rfc-canonical-file-paths-to-dialects/621

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D88530
The file was modifiedmlir/lib/Dialect/Linalg/Utils/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/LinalgToLLVM/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/SCF/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/LinalgToStandard/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/LinalgToSPIRV/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Linalg/EDSC/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/StandardOps/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/SCF/Transforms/CMakeLists.txt
The file was modifiedmlir/test/lib/Dialect/Test/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Affine/Transforms/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Vector/CMakeLists.txt
The file was modifiedmlir/lib/Transforms/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/StandardOps/Transforms/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/SCFToSPIRV/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/StandardToSPIRV/CMakeLists.txt
The file was modifiedmlir/lib/Analysis/CMakeLists.txt
The file was modifiedflang/lib/Lower/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/GPUToVulkan/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Affine/EDSC/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/AffineToStandard/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/SCFToGPU/CMakeLists.txt
The file was modifiedmlir/lib/ExecutionEngine/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Linalg/IR/CMakeLists.txt
The file was modifiedmlir/test/EDSC/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Affine/IR/CMakeLists.txt
The file was modifiedmlir/docs/Tutorials/CreatingADialect.md
The file was modifiedmlir/lib/Transforms/Utils/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Shape/IR/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Linalg/Analysis/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Quant/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/CMakeLists.txt
The file was modifiedmlir/test/lib/Transforms/CMakeLists.txt
The file was modifiedmlir/lib/CAPI/Standard/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/Affine/Utils/CMakeLists.txt
The file was modifiedmlir/lib/Conversion/GPUToSPIRV/CMakeLists.txt
The file was modifiedmlir/lib/Dialect/GPU/CMakeLists.txt
Commit 4fb679d3b159f0a5e4ff87f4e7ecf44fbbf331b9 by pklausler
[flang] Fix Gw.d format output

The estimation of the decimal exponent needs to allow for all
'd' of the requested significant digits.

Also accept a plus sign on a "+kP" scaling factor in a format.

Differential revision: https://reviews.llvm.org/D88618
The file was modifiedflang/runtime/edit-output.cpp
The file was modifiedflang/runtime/format-implementation.h
Commit f0505534900bb1fcdee368136cd733aefd20ce39 by riddleriver
[mlir] Split Dialect::addOperations into two functions

The current implementation uses a fold expression to add all of the operations at once. This is really nice, but apparently the lifetime of each of the AbstractOperation instances is for the entire expression which may lead to a stack overflow for large numbers of operations. This splits the method in two to allow for the lifetime of the AbstractOperation to be properly scoped.
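
A minimal sketch of the split, using hypothetical names (`Registry`, `AddOp`, `MulOp`) rather than the real Dialect and AbstractOperation classes: the variadic function only expands calls to a single-op helper, and the per-op record lives inside that helper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry standing in for a dialect.
struct Registry {
  std::vector<std::string> names;

  // Registers every op in the pack by expanding a call to the single-op
  // helper; no per-op temporary is created in this expression itself.
  template <typename... Ops> void addOperations() {
    (addOperation<Ops>(), ...);
  }

private:
  // The per-op record is a local of this function, so its lifetime ends
  // when the function returns rather than at the end of the whole fold
  // expression.
  template <typename Op> void addOperation() {
    std::string record = Op::getName();
    names.push_back(record);
  }
};

// Illustrative op types; each only needs a static getName().
struct AddOp {
  static const char *getName() { return "add"; }
};
struct MulOp {
  static const char *getName() { return "mul"; }
};
```

Calling `addOperations<AddOp, MulOp>()` on a `Registry` registers the ops left to right, and at most one record is alive on the stack at any time.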
The file was modifiedmlir/include/mlir/IR/Dialect.h
Commit 196c097bba8b0b3932f3fcdcd5310f78ebaa43a3 by Amara Emerson
[AArch64][GlobalISel] Clamp oversize FP arithmetic vectors.
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/legalize-fp-arith.mir
Commit b656189e6a602aaf86714ccbf89d94f2ef05b644 by llvm-project
[flang][msvc] Avoid ReferenceVariantBase ctor ambiguity. NFC.

Msvc reports the following error when a ReferenceVariantBase is constructed using an r-value reference or instantiated as std::vector template parameter.  The error message is:
```
PFTBuilder.h(59,1): error C2665: 'std::variant<...>::variant': none of the 2 overloads could convert all the argument types
variant(1248,1): message : could be 'std::variant<...>::variant(std::variant<...> &&) noexcept(false)'
variant(1248,1): message : or       'std::variant<...>::variant(const std::variant<...> &) noexcept(false)'
PFTBuilder.h(59,1): message : while trying to match the argument list '(common::Reference<lower::pft::ReferenceVariantBase<false,...>>)'
```

Work around the ambiguity by only taking `common::Reference` arguments in the constructor. That is, conversion to common::Reference has to be done by the caller instead of inside the ctor. Unfortunately, with this change clang/gcc (but not msvc) insist that the ReferenceVariantBase be stored in a `std::initializer_list`-initialized variable before being used, for example before being passed to a function or returned.

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: DavidTruby

Differential Revision: https://reviews.llvm.org/D88109
The file was modified flang/lib/Lower/PFTBuilder.cpp
The file was modified flang/include/flang/Lower/PFTBuilder.h
Commit 6cd8511e5932e4a53b2bb7780f69489355fc7783 by Dev
[WebAssembly] New-style command support

This adds new-style command support. In this mode, all exports
are considered command entrypoints, and the linker inserts calls to
`__wasm_call_ctors` and `__wasm_call_dtors` for all such entrypoints.

This enables support for:

- Command entrypoints taking arguments other than strings and return values
   other than `int`.
- Multicall executables without requiring the use of string-based
   command-line arguments.

This new behavior is disabled when the input has an explicit call to
`__wasm_call_ctors`, indicating code not expecting new-style command
support.

This change does mean that wasm-ld no longer supports DCE-ing the
`__wasm_call_ctors` function when there are no calls to it. If there are no
calls to it, and there are ctors present, we assume it's wasm-ld's job to
insert the calls. This seems ok though, because if there are ctors present,
the program is expecting them to be called. This change affects the
init-fini-gc.ll test.
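
As a rough illustration of the behaviour described above, the sketch below wraps an entrypoint so that constructors run before it and destructors after it. It models the idea in ordinary C++ and is not how wasm-ld implements it; `callCtors`, `callDtors`, `commandExport` and `scale` are hypothetical names, with the first two standing in for `__wasm_call_ctors` and `__wasm_call_dtors`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of calls so the wrapping can be observed.
std::vector<std::string> trace;

// Hypothetical stand-ins for __wasm_call_ctors / __wasm_call_dtors.
void callCtors() { trace.push_back("ctors"); }
void callDtors() { trace.push_back("dtors"); }

// Wraps an entrypoint: constructors first, then the entrypoint itself,
// then destructors. Assumes the entrypoint returns a non-void value.
template <typename Fn, typename... Args>
auto commandExport(Fn entry, Args... args) {
  callCtors();
  auto result = entry(args...);
  callDtors();
  return result;
}

// Example entrypoint whose arguments and return value are not
// strings or `int`.
double scale(double x, double factor) {
  trace.push_back("scale");
  assert(factor != 0.0 && "illustrative precondition only");
  return x * factor;
}
```

Calling `commandExport(scale, 2.0, 3.0)` runs the three steps in order, which is the per-entrypoint contract the commit message describes.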
The file was modified lld/wasm/Driver.cpp
The file was modified lld/wasm/Symbols.h
The file was removed lld/test/wasm/init-fini-gc.ll
The file was modified lld/wasm/InputChunks.h
The file was added lld/test/wasm/command-exports.s
The file was modified lld/wasm/MarkLive.cpp
The file was modified lld/wasm/Writer.cpp
The file was modified lld/wasm/Symbols.cpp
The file was added lld/test/wasm/init-fini-no-gc.ll
The file was added lld/test/wasm/command-exports-no-tors.s
Commit d4a1db4f3fd7ce701454127465dd0ddbdb7face2 by llvm-project
[flang][msvc] Workaround 'forgotten' symbols in FoldOperation. NFC.

This resolves an issue where the Microsoft compiler 'forgets' symbols when using constexpr in a lambda in a templated function. The symbols are:

1. The implicit lambda captures `context` and `convert`. Fix by making them explicit captures. The error message was:
```
fold-implementation.h(1220): error C2065: 'convert': undeclared identifier
```

2. The function template argument FROMCAT. Fix by storing it in a temporary constexpr variable inside the function. The error message was:
```
fold-implementation.h(1216): error C2065: 'FROMCAT': undeclared identifier
```
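
The two fixes can be sketched together as follows. This is a simplified, hypothetical function, not the code in fold-implementation.h: the `Category` enum, the `int` parameter types and the arithmetic are assumptions made only to keep the example self-contained.

```cpp
#include <cassert>

// Hypothetical category enum standing in for the real type categories.
enum class Category { Integer, Real };

template <Category FROMCAT> int foldOperation(int context, int convert) {
  // Fix 2: copy the template argument into a local constexpr variable so
  // the lambda refers to a named local rather than to FROMCAT directly.
  constexpr Category fromCat = FROMCAT;

  // Fix 1: list the captures explicitly instead of relying on an implicit
  // capture default such as [&].
  auto fold = [&context, &convert](int x) {
    if constexpr (fromCat == Category::Integer) {
      return x + context + convert;
    } else {
      return x * context + convert;
    }
  };

  assert(context != 0 && "illustrative precondition only");
  return fold(1);
}
```

The constexpr local does not need to be captured because the lambda only reads its value in a constant expression.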

This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D88504
The file was modified flang/lib/Evaluate/fold-implementation.h
Commit 12bdd427b33a75bd7abb5d4cb095d0b983328034 by craig.topper
[APFloat] Improve asserts in isSignificandAllOnes and isSignificandAllZeros so they protect shift operations from undefined behavior.

For example, the assert in isSignificandAllZeros allowed NumHighBits
to be integerPartWidth. But since it is used directly as a shift amount
it must be less than integerPartWidth.
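
The constraint can be shown with a small sketch. `kPartWidth` and `lowBitsMask` are hypothetical and this is not the APFloat code; it only demonstrates that an assert guarding a shift must be strict, because shifting a 64-bit value by 64 or more is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstdint>

// Assumed width of one significand part; stands in for integerPartWidth.
constexpr unsigned kPartWidth = 64;

// Returns a mask that clears the top `numHighBits` bits of a part.
std::uint64_t lowBitsMask(unsigned numHighBits) {
  // The bound is strict: `<=` would let numHighBits == kPartWidth through
  // and the shift below would then be undefined.
  assert(numHighBits < kPartWidth && "shift amount must be less than width");
  return ~std::uint64_t{0} >> numHighBits;
}
```

For instance, `lowBitsMask(60)` keeps only the low four bits, while a call with 64 trips the assert in a debug build instead of reaching the shift.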
The file was modified llvm/lib/Support/APFloat.cpp
Commit 4e9277eda1874ead60f2c9d7cdb558fd19b32076 by i
[ELF] --wrap: don't unnecessarily expose __real_

The routing rules are:

sym -> __wrap_sym
__real_sym -> sym

__wrap_sym and sym are routing targets, so they need to be exposed to the symbol
table. __real_sym is not, and can be eliminated if it is not used by a regular object.
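
The routing rules above can be written out as a tiny lookup. `routeSymbol` is a hypothetical helper for illustration only and does not reflect how lld's symbol table implements `--wrap`.

```cpp
#include <cassert>
#include <string>

// Given a referenced symbol name and the name passed to --wrap, returns
// the symbol the reference is routed to.
std::string routeSymbol(const std::string &ref, const std::string &wrapped) {
  assert(!wrapped.empty() && "--wrap needs a symbol name");
  if (ref == wrapped)
    return "__wrap_" + wrapped; // sym -> __wrap_sym
  if (ref == "__real_" + wrapped)
    return wrapped;             // __real_sym -> sym
  return ref;                   // every other symbol is left alone
}
```

Note that `__real_sym` appears only as a source of routing, never as a target, which is why it does not have to be exposed.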
The file was modified lld/test/ELF/lto/wrap-1.ll
The file was modified lld/ELF/Driver.cpp
Commit 2c9dc7bbbf514b1ed7bdefacb3213beae5916b3d by michael.hliao
Revert "[llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking."

This reverts commit 4fcd1a8e6528ca42fe656f2745e15d2b7f5de495 as
`llvm/test/tools/llvm-exegesis/X86/lbr/mov-add.s` fails on hosts
without LBR support if the build has LIBPFM enabled. On such a host,
`perf_event_open` fails with `EOPNOTSUPP` on the LBR config. That change's
basic assumption

> If this is run on a non-supported hardware, it will produce all zeroes for latency.

does not hold, as the `perf_event_open` system call fails if the
underlying hardware really does not support LBR.
The file was modified llvm/tools/llvm-exegesis/lib/X86/X86Counter.h
The file was modified llvm/tools/llvm-exegesis/llvm-exegesis.cpp
The file was modified llvm/tools/llvm-exegesis/lib/Target.h
The file was modified llvm/test/tools/llvm-exegesis/X86/lbr/lit.local.cfg
The file was modified llvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp
The file was modified llvm/tools/llvm-exegesis/lib/X86/Target.cpp