Commit
ec725b307f3fdc5656459047bab6e69669d9534f
by marek.kurdej+llvm.org
[clang-format] Fix C# nullable-related errors
This fixes two errors:
Previously, clang-format was splitting up type identifiers from the nullable ?. This changes this behavior so that the type name sticks with the operator.
Additionally, nullable operators attached to return types in interface functions were not parsed correctly. Digging deeper, it looks like interface bodies were being parsed differently than classes and structs, causing MustBeDeclaration to be incorrect for interface members. They now share the same logic.
One other change is reintroducing the CSharpNullable type independent of JsTypeOptionalQuestion. Despite having a similar semantic purpose, their actual syntax differs quite a bit.
Reviewed By: MyDeveloperDay, curdeius
Differential Revision: https://reviews.llvm.org/D101860
|
 | clang/lib/Format/UnwrappedLineParser.cpp |
 | clang/lib/Format/FormatToken.h |
 | clang/unittests/Format/FormatTestCSharp.cpp |
 | clang/lib/Format/TokenAnnotator.cpp |
 | clang/lib/Format/UnwrappedLineParser.h |
Commit
cdf33962d9768fbd8d6b193aff463a21eaa984f3
by marek.kurdej+llvm.org
[clang-format] Rename common types between C#/JS
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D101862
|
 | clang/lib/Format/FormatToken.h |
 | clang/lib/Format/TokenAnnotator.cpp |
 | clang/lib/Format/UnwrappedLineParser.cpp |
 | clang/lib/Format/FormatTokenLexer.cpp |
Commit
8c9742bd239af602ee2743baa3c4281f24d45df1
by kerry.mclaughlin
[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences
Adds support for scalable vectorization of loops containing first-order recurrences, e.g.:
```
for(int i = 0; i < n; i++)
  b[i] = a[i] + a[i - 1]
```
This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into account when inserting into and extracting from the last lane of a vector. CreateVectorSplice has been added to construct a vector for the recurrence, which returns a splice intrinsic for scalable types. For fixed-width vectors the behaviour remains unchanged, as CreateVectorSplice will return a shufflevector instead.
The tests included here are the same as test/Transforms/LoopVectorize/first-order-recurrence.ll
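As an illustration of the recurrence described above, here is a minimal fixed-width sketch in plain C++. It is not LLVM code: `spliceLastLane` and `sumWithPrevious` are hypothetical names, the `init` argument stands in for the value of `a[-1]`, and the input length is assumed to be a multiple of the vector width `VF`. The splice step takes the last lane of the previous vector followed by the first `VF - 1` lanes of the current one, which is the vector of "previous iteration" values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of the recurrence vector for a fixed width: the last lane of `prev`
// followed by the first VF-1 lanes of `cur`.
std::vector<int> spliceLastLane(const std::vector<int> &prev,
                                const std::vector<int> &cur) {
  assert(!prev.empty() && prev.size() == cur.size());
  std::vector<int> out;
  out.reserve(cur.size());
  out.push_back(prev.back());
  out.insert(out.end(), cur.begin(), cur.end() - 1);
  return out;
}

// Computes b[i] = a[i] + a[i - 1] in chunks of VF elements, with `init`
// used in place of a[-1]. Assumes a.size() is a multiple of VF.
std::vector<int> sumWithPrevious(const std::vector<int> &a, int init,
                                 std::size_t VF) {
  assert(VF > 0 && a.size() % VF == 0);
  std::vector<int> b(a.size());
  std::vector<int> prev(VF, init); // only the last lane is ever read
  for (std::size_t i = 0; i < a.size(); i += VF) {
    std::vector<int> cur(a.begin() + i, a.begin() + i + VF);
    std::vector<int> shifted = spliceLastLane(prev, cur);
    for (std::size_t lane = 0; lane < VF; ++lane)
      b[i + lane] = cur[lane] + shifted[lane];
    prev = cur; // carry the current vector into the next iteration
  }
  return b;
}
```

For scalable vectors the lane count is only known at run time (a multiple of vscale), which is why the patch cannot rely on a shuffle with a constant mask.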
Reviewed By: david-arm, fhahn
Differential Revision: https://reviews.llvm.org/D101076
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
 | llvm/include/llvm/IR/IRBuilder.h |
 | llvm/lib/IR/IRBuilder.cpp |
 | llvm/test/Transforms/LoopVectorize/AArch64/first-order-recurrence.ll |
 | llvm/test/Transforms/LoopVectorize/scalable-first-order-recurrence.ll |
Commit
a0da66bc1330f9808ed9814aaa9c3c3d3244852d
by paulsson
[SystemZ] Support builtin_frame_address with packed stack without backchain.
In order to use __builtin_frame_address(0) with packed stack and no backchain, the address of where the backchain would have been written is returned (like GCC).
This address may either contain a saved register or be unused.
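For context, a minimal sketch of how the builtin is typically called from C++ with GCC or Clang. The helper name `currentFrameAddress` is hypothetical, and nothing here is SystemZ-specific; what the returned address points at depends on the target and on the stack-layout options in effect.

```cpp
#include <cstdint>

// Returns the frame address of this function as an integer.
// __builtin_frame_address(0) is a GCC/Clang builtin; noinline keeps the
// call from being folded into the caller's frame.
__attribute__((noinline)) std::uintptr_t currentFrameAddress() {
  return reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
}
```

As the commit message notes, with packed stack and no backchain the slot at this address may hold a saved register or be unused, so callers should treat it as an address only and not read through it expecting a backchain.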
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D101897
|
 | llvm/test/CodeGen/SystemZ/frameaddr-02.ll |
 | llvm/lib/Target/SystemZ/SystemZISelLowering.cpp |
Commit
20e976e2487f5b52541772e6e92954ebf2dcf13e
by llvm-dev
[AMDGPU] Regenerate shift tests. NFCI.
|
 | llvm/test/CodeGen/AMDGPU/sra.ll |
 | llvm/test/CodeGen/AMDGPU/srl.ll |
 | llvm/test/CodeGen/AMDGPU/shl.ll |
Commit
0fdce16efb281ab52e1aa5a7a760aebcb7a59163
by llvm-dev
[AMDGPU] Regenerate fp2int tests. NFCI.
|
 | llvm/test/CodeGen/AMDGPU/fp_to_sint.ll |
 | llvm/test/CodeGen/AMDGPU/fp_to_uint.ll |
Commit
a0d019fc89c57736e54a476aa4db63027a2dace2
by csigg
[mlir] Add support for ops with regions in 'gpu-async-region' rewriter.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D101757
|
 | mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp |
Commit
5dd9f44c17ec0d8b6b88bb015560b3c566622fdc
by Ben.Dunbobbin
[LLD] Improve --strip-all help text
This is a slight improvement to the help text, as I was slightly surprised when strip-all did more than remove the symbol table.
Currently, we match gold's help text for strip-all and strip-debug. I think that the GNU documentation for these options is not particularly clear. However, I have opted to make only a minor change here and keep the help text similar to gold's as these are mature options that are well understood.
ld.bfd (https://sourceware.org/binutils/docs/ld/Options.html) has a similar implication, although it defines strip-debug as a subset of strip-all. However, I felt that noting that strip-all implies strip-debug is better because, with the ld.bfd approach, you have to read both the --strip-debug and the --strip-all help text to understand the behaviour of --strip-all (and the --strip-all help text doesn't indicate that the --strip-debug help text is related).
Differential Revision: https://reviews.llvm.org/D101890
|
 | lld/ELF/Options.td |
 | lld/docs/ld.lld.1 |
Commit
4979c90458628c9463815d81c637f8787f72fff0
by david.green
[LV] Account for tripcount when calculating vectorization profitability
The loop vectorizer will currently assume a large trip count when calculating which of several vectorization factors is most profitable. That is often not a terrible assumption to make, as small trip count loops will usually have been fully unrolled. There are cases, however, where we will try to vectorize them, and especially when folding the tail by masking we can incorrectly choose to vectorize loops where it is not beneficial, because the folded tail rounds the iteration count up for the vectorized loop.
The motivating example here has a trip count of 5, so either performs 5 scalar iterations or 2 vector iterations (with VF=4). At a high enough trip count the vectorization becomes profitable, but the rounding up to 2 vector iterations vs only 5 scalar makes it unprofitable.
This adds an alternative cost calculation when we know the max trip count and are folding tail by masking, rounding the iteration count up to the correct number for the vector width. We still do not account for anything like setup cost or the mixture of vector and scalar loops, but this is at least an improvement in a few cases that we have had reported.
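The rounding effect can be illustrated with a small sketch. This is not the LoopVectorize cost model: the function names and the per-iteration cost numbers used with them are hypothetical, and setup cost is ignored just as the message says. The only point is that with a folded tail the vector loop runs ceil(TC / VF) iterations, so a trip count of 5 at VF=4 pays for 2 full vector iterations.

```cpp
#include <cassert>
#include <cstdint>

// Number of vector loop iterations when the tail is folded by masking:
// the trip count is rounded up to a whole number of vectors.
std::uint64_t vectorIterations(std::uint64_t tripCount, std::uint64_t vf) {
  assert(vf > 0);
  return (tripCount + vf - 1) / vf;
}

// Compares total loop costs given assumed (made-up) per-iteration costs.
bool isVectorizationProfitable(std::uint64_t tripCount, std::uint64_t vf,
                               std::uint64_t scalarCostPerIter,
                               std::uint64_t vectorCostPerIter) {
  return vectorIterations(tripCount, vf) * vectorCostPerIter <
         tripCount * scalarCostPerIter;
}
```

With an assumed scalar cost of 1 and vector cost of 3 per iteration, a trip count of 5 gives 2 * 3 = 6 against 5 * 1 = 5, so vectorizing loses; at a trip count of 400 the comparison is 300 against 400 and vectorizing wins.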
Differential Revision: https://reviews.llvm.org/D101726
|
 | llvm/test/Transforms/LoopVectorize/ARM/mve-known-trip-count.ll |
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
Commit
3d746962ed1831987c6a1ab54fe8f6cbb6477e0e
by benny.kra
[ORC] Silence unused variable warnings in Release builds. NFC.
|
 | llvm/unittests/ExecutionEngine/Orc/OrcCAPITest.cpp |
Commit
fc690777fce0bf50a8f424b05993b1e218713ae5
by malhar.jajoo
Revert "[ARM] Transforming memcpy to Tail predicated Loop"
Reverting commit since it causes failure (10462). This reverts commit b856f4a232cbd43476e9b9f75c80aacfc6f5c152.
|
 | llvm/lib/Target/ARM/ARMSubtarget.h |
 | llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll |
 | llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp |
 | llvm/lib/Target/ARM/ARMISelLowering.h |
 | llvm/lib/Target/ARM/ARMTargetTransformInfo.h |
 | llvm/lib/Target/ARM/ARMInstrMVE.td |
 | llvm/test/CodeGen/Thumb2/mve-tp-loop.ll |
 | llvm/test/CodeGen/Thumb2/mve-tp-loop.mir |
 | llvm/lib/Target/ARM/ARMISelLowering.cpp |
Commit
67cfefebbbbb3a5923c47c31293a8f76596de8be
by carl.ritson
[AMDGPU] Fix WQM failure with single block inactive demote
The instruction test for inactive kill/demote needs to be based on the actual opcode, not on whether the instruction would be lowered to demote.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D101966
|
 | llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wqm.demote.ll |
Commit
b24e9f82b71f325214c41fdc3f106207cc2244a6
by jonathanchesterfield
[amdgpu-arch] Fix rpath to run from build dir
Prior to this, amdgpu-arch has RUNPATH set to $ORIGIN/../lib which works for some installs, but not from the build directory where clang executes the tool from when running tests.
This cmake option adds the location of the rocr runtime to the RUNPATH (note, it amends RUNPATH here, despite the cmake option referring to RPATH) to create a binary that runs from build or install location.
Before: RUNPATH [$ORIGIN/../lib] After: RUNPATH [$ORIGIN/../lib:$HOME/llvm-install/lib]
Credit to Greg for knowing this trick and pointing to examples of it in use for the aomp build scripts.
Reviewed By: pdhaliwal
Differential Revision: https://reviews.llvm.org/D101926
|
 | clang/tools/amdgpu-arch/CMakeLists.txt |
Commit
c28a602329a78db5c02cc85679b5035aaf6753b4
by anastasia.stulova
[OpenCL] Remove subgroups pragma in enqueue kernel and pipe builtins.
This patch simplifies the parser and makes the language semantics consistent. There is no extension pragma requirement in the spec for the subgroup functions in enqueue kernel or pipes, and all other builtin functions are available without the pragma.
Differential Revision: https://reviews.llvm.org/D100984
|
 | clang/test/SemaOpenCL/cl20-device-side-enqueue.cl |
 | clang/lib/Sema/SemaChecking.cpp |