Commit
57b8b5c114b6e595f8d9118807d741ef30518777
by Justas.Janickas [OpenCL] Test case for C++ for OpenCL 2021 in OpenCL C header test
Add a RUN line for C++ for OpenCL 2021 to the test. This should have been done as part of the earlier commit fb321c2ea274 but was missed during rebasing.
Differential Revision: https://reviews.llvm.org/D109492
|
 | clang/test/Headers/opencl-c-header.cl |
Commit
7b4cc09b1424c7f53051f971347c00d5f27fbb4e
by david.stenberg [LowerConstantIntrinsics] Fix heap-use-after-free bug in worklist
This fixes PR51730, a heap-use-after-free bug in replaceConditionalBranchesOnConstant().
With the attached reproducer we were left with a function looking something like this after replaceAndRecursivelySimplify():
[...]
cont2.i:
  br i1 %.not1.i, label %handler.type_mismatch3.i, label %cont4.i

handler.type_mismatch3.i:
  %3 = phi i1 [ %2, %cont2.thread.i ], [ false, %cont2.i ]
  unreachable

cont4.i:
  unreachable
[...]
with both the branch instruction and PHI node being in the worklist. As a result of replacing the branch instruction with an unconditional branch, the PHI node in %handler.type_mismatch3.i would be removed. This then resulted in a heap-use-after-free bug due to accessing that removed PHI node in the next worklist iteration.
This is solved by using a value handle worklist. I am a bit unsure if this is the most idiomatic solution. Another solution could have been to produce a worklist containing just the interesting branch instructions, but I thought it was perhaps a bit cleaner to keep all worklist filtering in the loop that does the rewrites.
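The idea of a worklist that tolerates deletion can be sketched outside LLVM. The following is a minimal illustration only, not the code from this commit: `Inst` and `runWorklist` are hypothetical names, and `std::weak_ptr` stands in for LLVM's value handles, assuming the handle's job is to report that the tracked instruction has been deleted.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for an IR instruction; not an LLVM class.
struct Inst {
  bool isConditionalBranch;
};

// Visits every instruction that is still alive and returns how many were
// visited. Rewriting a branch deletes all non-branch instructions from
// `function`, which models a PHI node disappearing while it is still queued.
int runWorklist(std::vector<std::shared_ptr<Inst>> &function) {
  // Weak handles observe the instructions without keeping them alive.
  std::vector<std::weak_ptr<Inst>> worklist(function.begin(), function.end());

  int visited = 0;
  for (const std::weak_ptr<Inst> &handle : worklist) {
    std::shared_ptr<Inst> inst = handle.lock();
    if (!inst)
      continue; // Deleted by an earlier rewrite: skip instead of dereferencing.
    ++visited;
    if (inst->isConditionalBranch) {
      // Simulated rewrite: removing the branch also removes dependent nodes.
      function.erase(
          std::remove_if(function.begin(), function.end(),
                         [](const std::shared_ptr<Inst> &i) {
                           return !i->isConditionalBranch;
                         }),
          function.end());
    }
  }
  return visited;
}
```

With a raw-pointer worklist, the second iteration would touch freed memory. With handles, the stale entry is detected and skipped.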
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D109221
|
 | llvm/test/Transforms/LowerConstantIntrinsics/stale-worklist-phi.ll |
 | llvm/lib/Transforms/Scalar/LowerConstantIntrinsics.cpp |
Commit
4d5d72542839b11455a0c261b66a0426c1530d52
by mkazantsev [SCEV] Add some asserts on availability of arguments of isLoopEntryGuardedByCond
The logic in howManyLessThans is fishy. It first checks invariance of RHS, and then uses OrigRHS as the argument for isLoopEntryGuardedByCond, which is, strictly speaking, a different thing. We are seeing a very rare intermittent failure of availability checks, and it looks like this precondition is sometimes broken. Until we can figure out what is going on, add asserts that all involved values that may possibly go to isLoopEntryGuardedByCond are available at loop entry.
If either of these asserts fails (OrigRHS is the most likely suspect), it means that the logic here is flawed.
|
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
8bc71856681c235a3192813947308a19577c9236
by petar.avramovic GlobalISel/Utils: Refactor constant splat match functions
Add a generic helper function that matches a constant splat. It has an option to match a constant splat with undef (some elements can be undef, but not all). Add a util function and matcher for G_FCONSTANT splat.
Differential Revision: https://reviews.llvm.org/D104410
|
 | llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h |
 | llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h |
 | llvm/include/llvm/CodeGen/GlobalISel/Utils.h |
 | llvm/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp |
 | llvm/lib/CodeGen/GlobalISel/Utils.cpp |
Commit
cd166fb2ef9c8fde374cb5de9c57802536d9b79e
by mkazantsev [SCEV] Use isAvailableAtLoopEntry in the asserts
This is what is supposed to be there.
|
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
e83629280f32102cd93a216490188922843af06c
by david.green [AArch64] Regenerate test lines in sve-implicit-zero-filling.ll
|
 | llvm/test/CodeGen/AArch64/sve-implicit-zero-filling.ll |
Commit
86dcb592069f2d18a183fa1daa611029ae80ef4c
by jay.foad [AMDGPU] Prefer v_fmac over v_fma only when no source modifiers are used
v_fmac with source modifiers forces VOP3 encoding, but it is strictly better to use the VOP3-only v_fma instead, because $dst and $src2 are not tied so it gives the register allocator more freedom and avoids a copy in some cases.
This is the same strategy we already use for v_mad vs v_mac and v_fma_legacy vs v_fmac_legacy.
Differential Revision: https://reviews.llvm.org/D110070
|
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fma.s32.mir |
 | llvm/test/CodeGen/AMDGPU/frem.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll |
 | llvm/test/CodeGen/AMDGPU/strict_fma.f32.ll |
 | llvm/lib/Target/AMDGPU/SIInstructions.td |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fma.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll |
 | llvm/test/CodeGen/AMDGPU/fmad-formation-fmul-distribute-denormal-mode.ll |
 | llvm/test/CodeGen/AMDGPU/udiv.ll |
 | llvm/test/CodeGen/AMDGPU/fdiv.ll |
 | llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll |
 | llvm/test/CodeGen/AMDGPU/fma.f64.ll |
 | llvm/test/CodeGen/AMDGPU/dagcombine-fma-fmad.ll |
 | llvm/test/CodeGen/AMDGPU/mad-mix.ll |
Commit
598bebeaa645049d13f1d3d1c8b8b821bb97283f
by jay.foad [AMDGPU] Prefer fmac over fma when selecting FMA_W_CHAIN
FMA_W_CHAIN is used when lowering fdiv f32. Prefer to select it to fmac if there are no source modifiers, just like we do for other mad/mac and fma/fmac cases.
Differential Revision: https://reviews.llvm.org/D110074
|
 | llvm/test/CodeGen/AMDGPU/fdiv.ll |
 | llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp |
 | llvm/test/CodeGen/AMDGPU/frem.ll |
Commit
6fe35ef419391215e21e51b859f74e6b7b8819d4
by dvyukov tsan: fix debug format strings
Some of the DPrintf's currently produce -Wformat warnings if enabled. Fix these format strings.
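As a generic illustration of the kind of mismatch -Wformat reports, here is a small sketch. It is not tsan code: `formatDebug`, `describeAlloc`, and the `PRINTF_LIKE` macro are hypothetical, and the GCC/Clang `format(printf, ...)` attribute is used on the assumption that the compiler should check the arguments of a printf-style wrapper.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Ask GCC/Clang to type-check the variadic arguments against the format string.
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define PRINTF_LIKE(fmt, args)
#endif

// Hypothetical printf-style helper that returns the formatted text.
std::string formatDebug(const char *fmt, ...) PRINTF_LIKE(1, 2);

std::string formatDebug(const char *fmt, ...) {
  char buf[256];
  va_list ap;
  va_start(ap, fmt);
  int n = std::vsnprintf(buf, sizeof(buf), fmt, ap);
  va_end(ap);
  assert(n >= 0 && "formatting failed");
  (void)n;
  return std::string(buf);
}

std::string describeAlloc(int tid, std::size_t size) {
  // %d matches int and %zu matches size_t. Writing %d for `size` is the
  // sort of mismatch that -Wformat flags.
  return formatDebug("#%d: alloc size=%zu", tid, size);
}
```

Because the wrapper carries the attribute, a wrong specifier at any call site is diagnosed at compile time when the warning is enabled.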
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D110131
|
 | compiler-rt/lib/tsan/rtl/tsan_mman.cpp |
 | compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp |
 | compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp |
 | compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp |
 | compiler-rt/lib/tsan/rtl/tsan_rtl.cpp |
 | compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp |
Commit
908256b0ea3e3ba3b80dbe9c81fc68a1ee35ac33
by dvyukov tsan: rearrange thread state callbacks (NFC)
Thread state functions are split into 2 parts: tsan entry function (e.g. ThreadStart) and thread registry state change callback (e.g. OnStart). Currently these pairs of functions are located far from each other and in reverse order. This makes it hard to read and follow the logic. Reorder the code so that OnFoo directly follows ThreadFoo. No other code changes.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D110132
|
 | compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp |
Commit
9d7b7350c9e00e2a43585328e875bec11c8c8c17
by dvyukov tsan: simplify thread context setting
Currently we set thr->tctx after the OnStarted callback, taking the thread registry mutex again and searching for the context. But OnStarted already runs under the thread registry mutex and has access to the context, so set it in OnStarted. This makes the code simpler and faster.
Depends on D110132.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D110133
|
 | compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp |
Commit
0f83456cf5bfaa731f9d5aa2500a5f10dd9de1b7
by llvm-dev [CodeGen] SDDbgValue::getSDNodes() - use const-ref to avoid unnecessary copies. NFCI.
Reported by MSVC static analyzer.
|
 | llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h |
Commit
f5d23d36de87f0cef3117df657d4f1d9133749c0
by llvm-dev RewriteStatepointsForGC - Use const-ref iterator in for-range loops. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
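The cost being avoided can be made visible with a small standalone sketch. The `Tracked` type and both functions are hypothetical and only count copy constructions. They are not taken from the LLVM sources.

```cpp
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
  static int copies;
  int value = 0;
  Tracked() = default;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copies; }
  Tracked &operator=(const Tracked &) = default;
};
int Tracked::copies = 0;

// Iterating by value copy-constructs each element once per iteration.
int sumByValue(const std::vector<Tracked> &items) {
  int sum = 0;
  for (Tracked item : items)
    sum += item.value;
  return sum;
}

// Iterating by const reference binds to the existing element: no copies.
int sumByConstRef(const std::vector<Tracked> &items) {
  int sum = 0;
  for (const Tracked &item : items)
    sum += item.value;
  return sum;
}
```

Both loops compute the same result. Only the by-value loop increments `Tracked::copies`, which matters when the element type is expensive to copy.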
|
 | llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp |
Commit
20b58855e0cfc263d609e8bb59e692024ecb42aa
by llvm-dev [CodeGen] SelectionDAGBuilder - Use const-ref iterator in for-range loops. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
|
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp |
Commit
fc8f1e4419d338a347bade7cfc76f73052f00739
by llvm-dev [InstCombine] foldConstantInsEltIntoShuffle - bail if we fail to find constant element (PR51824)
If getAggregateElement() returns null for any element, early out, as otherwise we will assert when creating a new constant vector.
Fixes PR51824 and OSS-Fuzz issue https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38057
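The early-out pattern can be sketched in isolation. This is a simplified model, not the InstCombine code: `collectConstantElements` is a hypothetical name, and an empty `std::optional` stands in for getAggregateElement() returning null.

```cpp
#include <optional>
#include <vector>

// Gathers all elements of a vector constant, or gives up as soon as one
// element cannot be resolved. An empty optional models a null element.
std::optional<std::vector<int>>
collectConstantElements(const std::vector<std::optional<int>> &elements) {
  std::vector<int> values;
  values.reserve(elements.size());
  for (const std::optional<int> &element : elements) {
    if (!element)
      return std::nullopt; // Bail out rather than build from a missing value.
    values.push_back(*element);
  }
  return values;
}
```

A caller checks the result before constructing anything from it, so an unresolved element makes the fold decline rather than trip an assertion later.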
|
 | llvm/test/Transforms/InstCombine/pr51824.ll |
 | llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp |
Commit
9e4d72675f476386cf6555cd5a1014cdd8d9facb
by nicholas.guy [AArch64] Improve schedule modelling on the Cortex-A55
Enables the FuseAddress feature in the Cortex-A55 scheduling model.
Differential Revision: https://reviews.llvm.org/D109323
|
 | llvm/lib/Target/AArch64/AArch64.td |
 | llvm/test/CodeGen/AArch64/a55-fuse-address.mir |
Commit
ea27dd74972e95e513fefcf96067522364f4e3d7
by flo [VectorCombine] Add tests which require DT to use info from assumes.
|
 | llvm/test/Transforms/VectorCombine/AArch64/load-extract-insert-store-scalarization.ll |
 | llvm/test/Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll |
Commit
a48b43f9816aa3a3ccc9ca13e7767ccf70756729
by paulsson [SystemZ] Emit EXRL target instructions before text section is ended.
SystemZ adds the EXRL target instructions at the end of each file. This must be done before debug info emission, since that may end the text section, so it is now done in emitConstantPools() (instead of in emitEndOfAsmFile).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109513
|
 | llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp |
 | llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp |
 | llvm/lib/Target/SystemZ/SystemZTargetStreamer.h |
 | llvm/lib/Target/SystemZ/SystemZAsmPrinter.h |
 | llvm/test/CodeGen/SystemZ/memset-06.ll |