Changes (build status: Success)

Summary

  1. [OpenCL] Add clang extension for non-portable kernel parameters.
  2. [AArch64] Fix for the pre-indexed paired load/store optimization.
  3. [AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM
  4. [AMDGPU][OpenMP] Fix clang driver crash when provided -c
  5. [mlir][linalg] Fix bug in the fusion on tensors index op handling.
  6. [AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks
  7. [MIPS][MSA] Regenerate bitwise tests. NFCI.
  8. [MIPS][MSA] Regenerate i5-b tests. NFCI.
  9. [MIPS][MSA] Regenerate immediates tests. NFCI.
  10. [InstCombine] improve readability; NFC
  11. [GlobalISel] Fix buildZExtInReg creating new register.
  12. [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
  13. [RISCV][NFC] Fix up pseudoinstruction name in comment
  14. [libc] Normalize LIBC_TARGET_MACHINE
  15. Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
  16. [docs] Update the llvm/example section
  17. Added a faster method to clone llvm project [DOCS]
  18. [clang][Driver] Add -fintegrate-as to debug-pass-structure test
  19. [mlir][Affine][Vector] Support vectorizing reduction loops
  20. [AMDGPU] Pre-commit 2 new saddr load tests. NFC.
  21. [clang] remove an incremental build workaround
  22. [mlir][ArmSVE] Add masked arithmetic operations
  23. [LV] Workaround PR49900 (a crash due to analyzing partially mutated IR)
  24. [MC] Untangle MCContext and MCObjectFileInfo
Commit e994e74bca49831eb649e7c67955e9de7a1784b6 by anastasia.stulova
[OpenCL] Add clang extension for non-portable kernel parameters.

Added __cl_clang_non_portable_kernel_param_types extension that
allows using non-portable types as kernel parameters. This allows
bypassing the portability guarantees from the restrictions specified
in C++ for OpenCL v1.0 s2.4.

Currently this only disables the restrictions related to the data
layout. The programmer should ensure the compiler generates the same
layout for host and device or otherwise the argument should only be
accessed on the device side. This extension could be extended to other
cases (e.g. permitting size_t) if desired in the future.

Patch by olestrohm (Ole Strohm)!

https://reviews.llvm.org/D101168
The file was modified clang/lib/Sema/SemaDecl.cpp
The file was modified clang/lib/Basic/Targets/AMDGPU.h
The file was modified clang/docs/LanguageExtensions.rst
The file was modified clang/test/SemaOpenCLCXX/invalid-kernel.clcpp
The file was modified clang/test/Misc/nvptx.languageOptsOpenCL.cl
The file was modified clang/test/Misc/amdgcn.languageOptsOpenCL.cl
The file was modified clang/test/Misc/r600.languageOptsOpenCL.cl
The file was modified clang/lib/Basic/Targets/NVPTX.h
The file was modified clang/include/clang/Basic/OpenCLExtensions.def
Commit 3f4bad5eadacfc5322817eaa062dd272b52cfc54 by stelios.ioannou
[AArch64] Fix for the pre-indexed paired load/store optimization.

This patch fixes an issue where a pre-indexed store, e.g.
STR x1, [x0, #24]!, and a store like STR x0, [x0, #8] are
merged into a single store: STP x1, x0, [x0, #24]!.
They shouldn't be merged because the second store uses x0 as both
the stored value and the address, so it needs to use the updated x0.
Therefore, it should not be folded into a STP <>pre.

Additionally, a new test case is added to verify this fix.
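
The difference between the two instruction sequences can be sketched with a toy C++ model. This is only an illustration: `Machine`, `runSeparateStores` and `runMergedPair` are hypothetical names, and the merged form is modelled under the assumption that the pair store reads both data registers before the base register is written back.

```cpp
#include <array>
#include <cstdint>

// Toy machine state: two registers and a small memory of 8-byte slots.
// All names here are illustrative and are not LLVM APIs.
struct Machine {
  std::uint64_t x0 = 0;
  std::uint64_t x1 = 0;
  std::array<std::uint64_t, 16> mem{}; // mem[i] models the 8 bytes at address 8*i

  void store(std::uint64_t addr, std::uint64_t value) { mem[addr / 8] = value; }
};

// STR x1, [x0, #24]!   followed by   STR x0, [x0, #8]
void runSeparateStores(Machine &m) {
  m.x0 += 24;              // pre-index: the base is updated first
  m.store(m.x0, m.x1);     // x1 goes to the updated base
  m.store(m.x0 + 8, m.x0); // the second store writes the *updated* x0
}

// Model of the wrongly merged STP x1, x0, [x0, #24]!, assuming both data
// registers are read before the write-back of the base happens.
void runMergedPair(Machine &m) {
  std::uint64_t oldX0 = m.x0;
  std::uint64_t addr = m.x0 + 24;
  m.store(addr, m.x1);
  m.store(addr + 8, oldX0); // stale value of x0 ends up in memory
  m.x0 = addr;
}
```

Starting both models from the same state, the second slot holds the updated base in the first case and the old base in the second, which is the mismatch the patch avoids by not merging.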

Differential Revision: https://reviews.llvm.org/D101888

Change-Id: I26f1985ac84e970961e2cdca23c590fa6773851a
The file was modified llvm/test/CodeGen/AArch64/strpre-str-merge.mir
The file was modified llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
Commit ae2aef13618beb8cb86e8b137a8ddbc846461169 by anirudh_prasad
[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM

- As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information)
- This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set
- This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings`

Reviewed By: abhina.sreeskantharajan, Kai

Differential Revision: https://reviews.llvm.org/D101660
The file was modified llvm/lib/MC/MCParser/AsmLexer.cpp
The file was modified llvm/include/llvm/MC/MCParser/MCAsmLexer.h
The file was modified llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp
Commit 1f5cacfcb845fd4163dec5a8c7991934c53d6cb3 by Pushpinder.Singh
[AMDGPU][OpenMP] Fix clang driver crash when provided -c

The offload action is used in four different ways as explained
in Driver.cpp:4495. When -c is present, the final phase will be
assemble (linker when -c is not present). However, this phase
is skipped according to D96769 for amdgcn. So the offload action
ends up in the following situation:

compile (device) ---> offload ---> offload

Without -c the chain looks like:
compile (device) ---> offload ---> linker (device)
---> offload

The former situation creates an unhandled case, which causes the
crash. The solution presented in this patch delays the D96769
logic until job creation time. This keeps the offload action
in one of the four specified situations.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D101901
The file was modified clang/test/Driver/amdgpu-openmp-toolchain.c
The file was modified clang/lib/Driver/Driver.cpp
Commit 4a6ee23d832f823d71faf7d0dca1b6eec71df253 by gysit
[mlir][linalg] Fix bug in the fusion on tensors index op handling.

The old index op handling let the new index operations point back to the
producer block. As a result, after fusion some index operations in the
fused block had back references to the old producer block resulting in
illegal IR. The patch now relies on a block and value mapping to avoid
such back references.

Differential Revision: https://reviews.llvm.org/D101887
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
The file was modified mlir/test/Dialect/Linalg/fusion-tensor.mlir
Commit 83646f60a8a499473aad0fa591195065fca9d7b2 by baptiste.saleil
[AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks
The file was modified llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
Commit c673a95cb46aacc9631dbc7d1a07851d951f2e64 by llvm-dev
[MIPS][MSA] Regenerate bitwise tests. NFCI.

Simplifies an upcoming patch diff
The file was modified llvm/test/CodeGen/Mips/msa/bitwise.ll
Commit 679e30dc3f50b5fc6adc3a67dc2a4d1b23e8656e by llvm-dev
[MIPS][MSA] Regenerate i5-b tests. NFCI.

Simplifies an upcoming patch diff
The file was modified llvm/test/CodeGen/Mips/msa/i5-b.ll
Commit 0f97afe32044ab5b7e4d20090952143d8a5547e5 by llvm-dev
[MIPS][MSA] Regenerate immediates tests. NFCI.

Simplifies an upcoming patch diff
The file was modified llvm/test/CodeGen/Mips/msa/immediates.ll
Commit 00341978745d93e22560ae394a57e80a3fd29bf7 by spatel
[InstCombine] improve readability; NFC
The file was modified llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
Commit a3d273c9ff4c789aec0dc743fa2dc846b5987312 by Vang.Thao
[GlobalISel] Fix buildZExtInReg creating new register.

Fix a bug where buildZExtInReg will create and use a new register instead of using the register from parameter DstOp Res.
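
As a rough illustration of the intended behaviour, the toy builder below emits the masking instruction directly into the destination the caller supplied instead of allocating a fresh register. `ToyInstr` and `ToyBuilder` are hypothetical stand-ins and do not mirror the real MachineIRBuilder interface.

```cpp
#include <cstdint>
#include <vector>

// One "dst = src & mask" instruction over integer virtual register ids.
struct ToyInstr {
  int dst;
  int src;
  std::uint64_t mask;
};

// Hypothetical miniature builder; not the LLVM MachineIRBuilder.
struct ToyBuilder {
  int nextVReg = 0;
  std::vector<ToyInstr> instrs;

  int createVReg() { return nextVReg++; }

  // Zero-extend the low `bits` bits of `src` in place by masking, writing
  // the result into the caller-provided `dst` (no new register is created).
  void buildZExtInReg(int dst, int src, unsigned bits) {
    std::uint64_t mask =
        bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
    instrs.push_back({dst, src, mask});
  }
};
```

A caller that created `dst` beforehand can rely on the emitted instruction defining exactly that register, and the register counter is left untouched by the call.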

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D101871
The file was modified llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
Commit 6e876f9dedf00b24a96b8781e3b39d5282c43e91 by jrtc27
[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics

Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.
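
The kind of fact this lets the optimiser derive can be sketched as follows. This is a simplified model, not the SelectionDAG code: `ExtKind`, `LoadFacts` and `factsForExtendingLoad` are hypothetical names, and widths are limited to 64 bits.

```cpp
#include <cstdint>

enum class ExtKind { Zero, Sign };

struct LoadFacts {
  std::uint64_t knownZeroMask; // bits guaranteed to be 0 in the result
  unsigned numSignBits;        // lower bound on identical top bits (>= 1)
};

// Facts about a `memBits`-wide atomic load whose result is widened into a
// `regBits`-wide register. Assumes 0 < memBits <= regBits <= 64.
LoadFacts factsForExtendingLoad(unsigned memBits, unsigned regBits,
                                ExtKind kind) {
  LoadFacts facts{0, 1};
  if (memBits >= regBits)
    return facts; // nothing is extended, so nothing extra is known

  if (kind == ExtKind::Zero) {
    std::uint64_t regMask =
        regBits == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << regBits) - 1;
    std::uint64_t lowMask = (std::uint64_t{1} << memBits) - 1;
    facts.knownZeroMask = regMask & ~lowMask; // all extended bits are zero
    facts.numSignBits = regBits - memBits;    // those zero bits are identical
  } else {
    // The extended bits all copy the loaded value's top bit.
    facts.numSignBits = regBits - memBits + 1;
  }
  return facts;
}
```

With facts like these, an explicit extension that follows the load and only re-establishes bits already known can be dropped.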

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
The file was modified llvm/test/CodeGen/RISCV/atomic-signext.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i8-ldst.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i64-ldst.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
The file was modified llvm/test/CodeGen/Mips/atomic.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i32-ldst.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i16-ldst.ll
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
Commit efc31be7f8e8487c774dd9052980b67f0d5e70e2 by fraser
[RISCV][NFC] Fix up pseudoinstruction name in comment
The file was modified llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td
The file was modified llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
Commit 7c2ece523d7ff74f3eeabce1b9685f3eaae8cff4 by gchatelet
[libc] Normalize LIBC_TARGET_MACHINE

Current implementation defines LIBC_TARGET_MACHINE with the use of CMAKE_SYSTEM_PROCESSOR.
Unfortunately CMAKE_SYSTEM_PROCESSOR is OS dependent and can produce different results.
Evidence of this is the various matchers used to detect whether the architecture is x86.

This patch normalizes LIBC_TARGET_MACHINE and renames it LIBC_TARGET_ARCHITECTURE.
I've added many architectures but we may want to limit ourselves to x86 and ARM.
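
The idea of normalizing the processor name can be sketched outside of CMake. The C++ function below is only an illustration of the technique: the name `normalizeArchitecture` and the spelling table are assumptions, not the contents of the new CMake module.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Map OS-dependent processor spellings onto one canonical architecture name.
// The spellings listed here are assumed examples, not an exhaustive table.
std::string normalizeArchitecture(std::string processor) {
  std::transform(processor.begin(), processor.end(), processor.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  if (processor == "x86_64" || processor == "amd64" || processor == "x64")
    return "x86_64";
  if (processor == "aarch64" || processor == "arm64")
    return "aarch64";
  return processor; // unknown spellings pass through, lower-cased
}
```

Code that previously needed several matchers per architecture can then compare against a single canonical string.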

Differential Revision: https://reviews.llvm.org/D101524
The file was modified libc/src/string/aarch64/CMakeLists.txt
The file was added libc/src/string/x86_64/memcpy.cpp
The file was removed libc/src/string/x86/memcpy.cpp
The file was added libc/cmake/modules/LLVMLibCArchitectures.cmake
The file was modified libc/config/linux/CMakeLists.txt
The file was modified libc/src/string/CMakeLists.txt
The file was modified libc/utils/FPUtil/CMakeLists.txt
The file was added libc/src/string/x86_64/CMakeLists.txt
The file was modified libc/loader/linux/CMakeLists.txt
The file was modified libc/src/math/CMakeLists.txt
The file was modified libc/test/src/math/CMakeLists.txt
The file was removed libc/src/string/x86/CMakeLists.txt
The file was modified libc/test/loader/linux/CMakeLists.txt
The file was modified libc/test/utils/FPUtil/CMakeLists.txt
The file was modified libc/src/threads/linux/CMakeLists.txt
The file was modified libc/test/config/linux/CMakeLists.txt
The file was modified libc/cmake/modules/LLVMLibCCheckCpuFeatures.cmake
The file was modified libc/CMakeLists.txt
Commit 897d7bceb90f1ef4807c0f698eaff3c10b471cb9 by jrtc27
Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"

This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9dedf00b24a96b8781e3b39d5282c43e91.
The file was modified llvm/test/CodeGen/PowerPC/atomics-i32-ldst.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
The file was modified llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td
The file was modified llvm/test/CodeGen/PowerPC/atomics-i16-ldst.ll
The file was modified llvm/test/CodeGen/RISCV/atomic-signext.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i8-ldst.ll
The file was modified llvm/test/CodeGen/Mips/atomic.ll
The file was modified llvm/test/CodeGen/PowerPC/atomics-i64-ldst.ll
Commit 0b9447157b01ac18bb9f4d865920027cbd7df840 by shivam98.tkg
[docs] Update the llvm/example section

Added details about the llvm/example section.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D101284
The file was modified llvm/docs/GettingStarted.rst
Commit 67ee2f870d3b06a5684251272eae36d6e0f519b0 by shivam98.tkg
Added a faster method to clone llvm project [DOCS]

Reviewed By: xgupta, amccarth

Differential Revision: https://reviews.llvm.org/D101433
The file was modified clang/www/get_started.html
Commit 20d0aca43073f18f70b1c5a665631dee1be1598d by Jinsong Ji
[clang][Driver] Add -fintegrate-as to debug-pass-structure test

CGProfilePass is not always on; it is disabled when using a
non-integrated assembler.

  // Only enable CGProfilePass when using integrated assembler, since
  // non-integrated assemblers don't recognize .cgprofile section.
  PMBuilder.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;

Add -fintegrate-as to make sure the output doesn't rely on the platform default.

Reviewed By: evgeny777

Differential Revision: https://reviews.llvm.org/D101918
The file was modified clang/test/Driver/debug-pass-structure.c
Commit d80b04ab0015b218b613f8fe59506d45739817b8 by sergei.grechanik
[mlir][Affine][Vector] Support vectorizing reduction loops

This patch adds support for vectorizing loops with 'iter_args'
implementing known reductions along the vector dimension. Compared to
the non-vector-dimension case, two additional things are done during
vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar
  using `vector.reduction`.
- In some cases a mask is applied to the vector yielded at the end of
  the loop to prevent garbage values from being written to the
  accumulator.
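
The two extra steps can be sketched in plain C++ for a sum reduction. This is a conceptual model rather than the SuperVectorize implementation: `kLanes`, `sumScalar` and `sumVectorized` are hypothetical names, and the vector is emulated with a fixed-size array.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 4; // illustrative vector width

// Original scalar reduction loop: one accumulator carried across iterations.
float sumScalar(const std::vector<float> &data) {
  float acc = 0.0f;
  for (float v : data)
    acc += v;
  return acc;
}

// Vectorized form: the accumulator becomes a vector of partial sums.
float sumVectorized(const std::vector<float> &data) {
  std::array<float, kLanes> acc{}; // starts at the neutral element 0
  for (std::size_t base = 0; base < data.size(); base += kLanes) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      bool inBounds = base + lane < data.size(); // mask bit for this lane
      float loaded = inBounds ? data[base + lane] : 0.0f;
      float updated = acc[lane] + loaded;
      // Masked yield: lanes past the end keep their old accumulator value,
      // so out-of-range data never reaches the accumulator.
      acc[lane] = inBounds ? updated : acc[lane];
    }
  }
  // Final horizontal reduction of the vector accumulator to a scalar.
  float result = 0.0f;
  for (float lane : acc)
    result += lane;
  return result;
}
```

For inputs whose length is not a multiple of `kLanes`, the mask on the last iteration is what keeps the two functions in agreement.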

Vectorization of reduction loops is disabled by default. To enable it, a
map from loops to arrays of reduction descriptors should be explicitly passed to
`vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed
to the SuperVectorize pass.

Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D100694
The file was modified mlir/include/mlir/Dialect/Affine/Passes.td
The file was added mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction.mlir
The file was modified mlir/lib/Analysis/AffineAnalysis.cpp
The file was modified mlir/include/mlir/Dialect/StandardOps/IR/Ops.h
The file was modified mlir/lib/Dialect/Vector/VectorOps.cpp
The file was modified mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
The file was added mlir/test/Dialect/Affine/SuperVectorize/vectorize_reduction_2d.mlir
The file was modified mlir/include/mlir/Analysis/AffineAnalysis.h
The file was modified mlir/include/mlir/Dialect/Vector/VectorOps.h
The file was modified mlir/include/mlir/Dialect/Affine/Utils.h
The file was modified mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp
The file was modified mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
The file was modified mlir/lib/Dialect/StandardOps/IR/Ops.cpp
Commit 4c178d809b1df3216de251d5345b8ecc9cc3990e by Stanislav.Mekhanoshin
[AMDGPU] Pre-commit 2 new saddr load tests. NFC.
The file was modified llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
Commit f16afcd9b5ce3054aac2b08b3a20472c07b6773a by thakis
[clang] remove an incremental build workaround

This cleaned up an oversight over a year ago. Should no longer be needed.
The file was modified clang/test/CoverageMapping/coroutine.cpp
Commit 95861216ac6558dc0dbcf638902feb9072c84661 by javier.setoain
[mlir][ArmSVE] Add masked arithmetic operations

These instructions map to SVE-specific intrinsics that accept a
predicate operand to support control flow in vector code.
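
A lane-wise model of a predicated addition might look like the sketch below. It assumes merging predication, where inactive lanes keep the first source operand; the commit message does not state the inactive-lane behaviour, so treat that choice and the name `maskedAdd` as assumptions.

```cpp
#include <array>
#include <cstddef>

// Lane-wise add under a predicate. Active lanes compute a[i] + b[i];
// inactive lanes are assumed to keep a[i] (merging predication).
template <std::size_t N>
std::array<int, N> maskedAdd(const std::array<bool, N> &mask,
                             const std::array<int, N> &a,
                             const std::array<int, N> &b) {
  std::array<int, N> out = a;
  for (std::size_t i = 0; i < N; ++i) {
    if (mask[i])
      out[i] = a[i] + b[i];
  }
  return out;
}
```

An `if` inside a vectorized loop body can be expressed this way: the condition becomes the mask and both branches run on whole vectors.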

Differential Revision: https://reviews.llvm.org/D100982
The file was modified mlir/lib/Dialect/ArmSVE/IR/ArmSVEDialect.cpp
The file was modified mlir/test/Dialect/ArmSVE/legalize-for-llvm.mlir
The file was modified mlir/lib/Dialect/ArmSVE/Transforms/LegalizeForLLVMExport.cpp
The file was modified mlir/test/Target/LLVMIR/arm-sve.mlir
The file was modified mlir/include/mlir/Dialect/ArmSVE/ArmSVE.td
The file was modified mlir/test/Dialect/ArmSVE/roundtrip.mlir
Commit 80e8025083982f4eca8ca8200eafecf4a5c3ae6e by listmail
[LV] Workaround PR49900 (a crash due to analyzing partially mutated IR)

LoopVectorize has a fairly deeply baked in design problem where it will try to query analysis (primarily SCEV, but also ValueTracking) in the midst of mutating IR. In particular, the intermediate IR state does not represent the semantics of the original (or final) program.

Fixing this for real is hard, but all of the cases seen so far share a common symptom. In cases seen to date, the analysis being queried is the computation of the original loop's trip count. We can fix this particular instance of the issue by simply computing the trip count early, and caching it.
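
The "compute early and cache" pattern can be sketched with a toy loop description. Nothing here is LoopVectorize code: `ToyLoop`, `computeTripCount` and `ToyVectorizer` are hypothetical, and the mutation is reduced to changing the step.

```cpp
#include <cstdint>
#include <optional>

// Toy loop: for (i = begin; i < end; i += step), with step > 0 assumed.
struct ToyLoop {
  std::int64_t begin;
  std::int64_t end;
  std::int64_t step;
};

// "Analysis" that reads the loop's current state.
std::int64_t computeTripCount(const ToyLoop &loop) {
  if (loop.end <= loop.begin)
    return 0;
  return (loop.end - loop.begin + loop.step - 1) / loop.step;
}

class ToyVectorizer {
public:
  explicit ToyVectorizer(ToyLoop &loop) : loop_(loop) {}

  // Trip count of the original loop, computed once and then cached.
  std::int64_t originalTripCount() {
    if (!cachedTripCount_)
      cachedTripCount_ = computeTripCount(loop_);
    return *cachedTripCount_;
  }

  // Mutates the loop; the original trip count is captured beforehand so
  // later queries do not analyze the half-transformed state.
  void widen(std::int64_t vectorFactor) {
    originalTripCount();
    loop_.step *= vectorFactor;
  }

private:
  ToyLoop &loop_;
  std::optional<std::int64_t> cachedTripCount_;
};
```

After `widen`, asking the analysis directly describes the mutated loop, while the cached value still describes the original one.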

I want to be really clear that this is nothing but a workaround. It does nothing to fix the root issue, and at best, delays the time until we have to fix this for real. Florian and I have discussed an eventual solution in the review comments for https://reviews.llvm.org/D100663, but it's a lot of work.

Test taken from https://reviews.llvm.org/D100663.

Differential Revision: https://reviews.llvm.org/D101487
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
The file was added llvm/test/Transforms/LoopVectorize/scev-during-mutation.ll
Commit 632ebc4ab4374e53fce1ec870465c587e0a33668 by i
[MC] Untangle MCContext and MCObjectFileInfo

This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: You can't construct a MOFI without a MCContext
without constructing the MCContext with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.

This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:

- TargetTriple
- ObjectFileType
- SubtargetInfo
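
The construction order this enables can be sketched with two toy classes. They are hypothetical stand-ins (`ToyContext`, `ToyObjectFileInfo`) that only show how the cycle is broken at construction time; they are not the MC classes themselves.

```cpp
#include <string>
#include <utility>

class ToyObjectFileInfo;

// Stand-in for the context: constructible on its own and owning the target
// information. The object file info is attached afterwards.
class ToyContext {
public:
  explicit ToyContext(std::string triple) : triple_(std::move(triple)) {}

  void setObjectFileInfo(const ToyObjectFileInfo *info) { info_ = info; }
  const ToyObjectFileInfo *getObjectFileInfo() const { return info_; }
  const std::string &getTargetTriple() const { return triple_; }

private:
  std::string triple_;
  const ToyObjectFileInfo *info_ = nullptr;
};

// Stand-in for the object file info: still depends on the context, but
// only needs a fully constructed one, not a dummy.
class ToyObjectFileInfo {
public:
  explicit ToyObjectFileInfo(ToyContext &ctx) : ctx_(ctx) {}
  const std::string &triple() const { return ctx_.getTargetTriple(); }

private:
  ToyContext &ctx_;
};
```

Usage then follows three steps: build the context, build the object file info from it, and register the latter with the former through the setter.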

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101462
The file was modified llvm/tools/sancov/sancov.cpp
The file was modified llvm/tools/llvm-exegesis/lib/Analysis.cpp
The file was modified llvm/tools/llvm-profgen/ProfiledBinary.cpp
The file was modified llvm/lib/MC/MCWinCOFFStreamer.cpp
The file was modified llvm/lib/MC/MCParser/DarwinAsmParser.cpp
The file was modified lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
The file was modified llvm/lib/Target/TargetLoweringObjectFile.cpp
The file was modified llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
The file was modified llvm/lib/MC/MCParser/COFFAsmParser.cpp
The file was modified llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
The file was modified llvm/tools/llvm-cfi-verify/lib/FileAnalysis.cpp
The file was modified llvm/unittests/CodeGen/MachineOperandTest.cpp
The file was modified llvm/lib/Object/ModuleSymbolTable.cpp
The file was modified llvm/tools/llvm-rtdyld/llvm-rtdyld.cpp
The file was modified llvm/unittests/MC/DwarfLineTables.cpp
The file was modified llvm/lib/MC/MCMachOStreamer.cpp
The file was modified llvm/unittests/CodeGen/MachineInstrTest.cpp
The file was modified llvm/lib/DWARFLinker/DWARFStreamer.cpp
The file was modified llvm/tools/llvm-ml/llvm-ml.cpp
The file was modified llvm/tools/llvm-objdump/llvm-objdump.cpp
The file was modified clang/lib/Parse/ParseStmtAsm.cpp
The file was modified lldb/source/Plugins/Instruction/MIPS/EmulateInstructionMIPS.cpp
The file was modified llvm/lib/MC/MCContext.cpp
The file was modified llvm/unittests/MC/SystemZ/SystemZAsmLexerTest.cpp
The file was modified llvm/lib/MC/MCAsmStreamer.cpp
The file was modified llvm/tools/llvm-mc-assemble-fuzzer/llvm-mc-assemble-fuzzer.cpp
The file was modified llvm/lib/CodeGen/MachineModuleInfo.cpp
The file was modified llvm/include/llvm/MC/MCObjectFileInfo.h
The file was modified llvm/lib/MC/MCParser/AsmParser.cpp
The file was modified llvm/tools/llvm-dwp/llvm-dwp.cpp
The file was modified llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
The file was modified lldb/source/Plugins/Instruction/MIPS64/EmulateInstructionMIPS64.cpp
The file was modified llvm/tools/llvm-exegesis/lib/SnippetFile.cpp
The file was modified llvm/lib/MC/MCParser/MasmParser.cpp
The file was modified llvm/tools/llvm-jitlink/llvm-jitlink.cpp
The file was modified llvm/tools/llvm-mc/llvm-mc.cpp
The file was modified llvm/tools/llvm-ml/Disassembler.cpp
The file was modified llvm/lib/MC/MCStreamer.cpp
The file was modified llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
The file was modified llvm/unittests/CodeGen/TestAsmPrinter.cpp
The file was modified llvm/unittests/DebugInfo/DWARF/DwarfGenerator.cpp
The file was modified mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp
The file was modified llvm/lib/MC/MCObjectFileInfo.cpp
The file was modified llvm/tools/llvm-objdump/MachODump.cpp
The file was modified llvm/include/llvm/MC/MCContext.h
The file was modified llvm/lib/MC/MCDisassembler/Disassembler.cpp
The file was modified llvm/tools/llvm-exegesis/lib/LlvmState.cpp
The file was modified clang/tools/driver/cc1as_main.cpp
The file was modified llvm/tools/llvm-mca/llvm-mca.cpp