1. [AArch64] Fix over-eager fusing of NEON SIMD MUL/ADD (details)
  2. [lldb/Reproducer] Add version check (details)
  3. [OpenCL] Use generic addr space for lambda call operator (details)
  4. [EditLine] Fix RecallHistory to make it go in the right direction. (details)
  5. [SYCL] Add sycl_kernel attribute for accelerated code outlining (details)
  6. [SLP] Enhance SLPVectorizer to vectorize different combinations of (details)
  7. [scudo][standalone] Add chunk ownership function (details)
  8. Reland [clangd] Rethink how SelectionTree deals with macros and (details)
  9. llvm-config: do not link absolute paths with `-l` (details)
  10. [NFC][KnownBits] Add getMinValue() / getMaxValue() methods (details)
  11. [clang-format] Add new option to add spaces around conditions (details)
  12. Revert "Temporarily revert "build: avoid hardcoding the libxml2 library (details)
  13. Revert "[libomptarget] Build a minimal deviceRTL for amdgcn" (details)
  14. Rename `tsan/` to `test/tsan/race_range_pc.cpp`. (details)
  15. [LV] Scalar with predication must not be uniform (details)
Commit f2e7de81c625413a7f682c757ab64e7b63b48800 by Sanne.Wouda
[AArch64] Fix over-eager fusing of NEON SIMD MUL/ADD
Summary: The ISel pattern for SIMD MLA is a bit too eager: it replaces
the ADD with an MLA even when the MUL cannot be eliminated, e.g. when it
has another use. An MLA usually has a higher latency than an ADD
(and there are fewer pipes available that can execute it), so trading an
MLA for an ADD is not great.
ISel is not taking the number of uses of the MUL result into account,
nor any other factors such as the length of the critical path or other
resource pressure.
The MachineCombiner is able to make these judgments so this patch ports
the ISel pattern for MUL/ADD fusing to the MachineCombiner.
Similarly for MUL/SUB -> MLS, as well as the indexed variants.
The change has no impact on SPEC CPU© intrate nor fprate.
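The trade-off described above can be sketched as a small cost check. This is
only an illustration under simplifying assumptions: `Latencies` and
`shouldFuse` are hypothetical names, not part of LLVM, and the real
MachineCombiner also weighs critical-path length and resource pressure.

```cpp
// Hypothetical cost sketch; none of these names come from LLVM.
struct Latencies {
  unsigned add; // assumed latency of a NEON ADD
  unsigned mla; // assumed latency of a NEON MLA
};

// Returns true when replacing MUL+ADD with MLA looks profitable.
// mulUses is the number of instructions that read the MUL result.
bool shouldFuse(unsigned mulUses, Latencies lat) {
  // With a single use the MUL disappears: two instructions become one.
  if (mulUses == 1)
    return true;
  // Otherwise the MUL stays, so the instruction count is unchanged and
  // fusing only pays off if the MLA is no slower than the ADD it replaces.
  return lat.mla <= lat.add;
}
```

With an assumed ADD latency of 2 and MLA latency of 4, a MUL with two uses
is left unfused, while a single-use MUL is still folded into an MLA.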
Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision:
The file was modified llvm/include/llvm/CodeGen/MachineCombinerPattern.h
The file was modified llvm/test/CodeGen/AArch64/overeager_mla_fusing.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/select-with-no-legality-check.mir
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
The file was modified llvm/lib/Target/AArch64/
Commit 62827737acd878af6cd8930758b0d6f297173f40 by Jonas Devlieghere
[lldb/Reproducer] Add version check
To ensure a reproducer works correctly, the version of LLDB used for
capture and replay must match. Right now the reproducer already contains
the LLDB version. However, this is purely informative. LLDB will happily
replay a reproducer generated with a different version of LLDB, which
can cause subtle differences.
This patch adds a version check which compares the current LLDB version
with the one in the reproducer. If the version doesn't match, LLDB will
refuse to replay. It also adds an escape hatch to make it possible to
still replay the reproducer without having to mess with the recorded
version. This might prove useful when you know two versions of LLDB
match, even though the version string doesn't. This behavior is
triggered by passing a new flag -reproducer-skip-version-check to the
lldb driver.
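The check and its escape hatch amount to the following logic. This is a
minimal sketch, not LLDB's implementation: `ReplayDecision` and
`checkReproducerVersion` are hypothetical names, and comparing the version
strings for exact equality is an assumption.

```cpp
#include <string>

// Hypothetical result type; not part of LLDB's API.
enum class ReplayDecision { Replay, RefuseVersionMismatch };

// Decides whether a reproducer may be replayed.
// skipVersionCheck models the -reproducer-skip-version-check flag.
ReplayDecision checkReproducerVersion(const std::string &currentVersion,
                                      const std::string &recordedVersion,
                                      bool skipVersionCheck) {
  // The escape hatch bypasses the comparison entirely.
  if (skipVersionCheck)
    return ReplayDecision::Replay;
  // Assumed here: versions match only if the strings are identical.
  return currentVersion == recordedVersion
             ? ReplayDecision::Replay
             : ReplayDecision::RefuseVersionMismatch;
}
```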
Differential revision:
The file was modified lldb/tools/driver/
The file was modified lldb/include/lldb/API/SBReproducer.h
The file was modified lldb/tools/driver/Driver.cpp
The file was added lldb/test/Shell/Reproducer/TestVersionCheck.test
The file was modified lldb/source/API/SBReproducer.cpp
Commit 980133a2098cf6159785b8ac0cbe4d8fbf99bfea by anastasia.stulova
[OpenCL] Use generic addr space for lambda call operator
Since lambdas are represented by callable objects, we add generic addr
space for implicit object parameter in call operator.
Any lambda variable declared in __constant addr space
(which is not convertible to generic) fails to compile with a
diagnostic. To support constant addr space we need to add a way to
qualify the lambda call operators.
Tags: #clang
Differential Revision:
The file was added clang/test/SemaOpenCLCXX/
The file was modified clang/lib/Sema/SemaType.cpp
The file was modified clang/lib/Sema/SemaLambda.cpp
The file was modified clang/lib/Sema/Sema.cpp
The file was modified clang/lib/Sema/SemaDeclCXX.cpp
The file was modified clang/include/clang/Sema/Sema.h
Commit 0e9b0b6d11e882efec8505d97c4b65e1562e6715 by Jonas Devlieghere
[EditLine] Fix RecallHistory to make it go in the right direction.
The naming used by editline for the history operations is
counterintuitive given how they are used in lldb for the REPL.
- The H_PREV operation returns the previous element in the history,
  which is newer than the current one.
- The H_NEXT operation returns the next element in the history, which
  is older than the current one.
This exposed itself as a bug in the REPL where the behavior of up- and
down-arrow was inverted. This wasn't immediately obvious because of how
we save the current "live" entry.
This patch fixes the bug and introduces an enum that wraps the editline
operations to match the semantics of lldb.
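A wrapper of this kind might look like the sketch below. It is illustrative
only: the enum and function names are hypothetical, and `kEditlinePrev` and
`kEditlineNext` are placeholder stand-ins for editline's H_PREV and H_NEXT so
that the example builds without <histedit.h>.

```cpp
// Stand-ins for editline's H_PREV / H_NEXT; the numeric values are
// placeholders chosen for this sketch, not libedit's actual constants.
constexpr int kEditlinePrev = 1;
constexpr int kEditlineNext = 2;

// Hypothetical enum named after what the REPL user means.
enum class HistoryOperation { Older, Newer };

// Maps the REPL-facing direction onto the editline operation.
int toEditlineOp(HistoryOperation op) {
  switch (op) {
  case HistoryOperation::Older:
    return kEditlineNext; // "next" in editline terms is the older entry
  case HistoryOperation::Newer:
    return kEditlinePrev; // "previous" in editline terms is the newer entry
  }
  return kEditlineNext; // not reached for valid enum values
}
```

Under this mapping, up-arrow would request `HistoryOperation::Older` and
down-arrow `HistoryOperation::Newer`, so callers never touch the inverted
editline names directly.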
Differential revision:
The file was modified lldb/source/Host/common/Editline.cpp
The file was modified lldb/include/lldb/Host/Editline.h
Commit c094e7dc4b3f9d1c1e590b008bb1cc46e3496abd by alexey.bader
[SYCL] Add sycl_kernel attribute for accelerated code outlining
SYCL is a single-source offload programming model that relies on the
compiler to separate device code (i.e. code offloaded to an accelerator)
from the code executed on the host.
Here is a code example of a SYCL program that demonstrates the compiler's
outlining work:
```
int foo(int x) { return ++x; }
int bar(int x) { throw std::exception("CPU code only!"); }
...
using namespace cl::sycl;
queue Q;
buffer<int, 1> a(range<1>{1024});
Q.submit([&](handler& cgh) {
  auto A = a.get_access<access::mode::write>(cgh);
  cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) {
    A[index] = index[0] + foo(42);
  });
});
```
The SYCL device compiler must compile the lambda expression passed to the
cl::sycl::handler::parallel_for method, and the function foo called from
this lambda expression, for an "accelerator". The SYCL device compiler
must also ignore the bar function, as it is not required for offloaded
code execution.
This patch adds the sycl_kernel attribute, which is used to mark code
passed to cl::sycl::handler::parallel_for as "accelerated code".
The attribute must be applied to function templates whose parameters
include at least a "kernel name" and a "kernel function object". These
parameters will be used to establish an ABI between the host application
and the offloaded part.
Reviewers: jlebar, keryell, Naghasan, ABataev, Anastasia, bader,
aaron.ballman, rjmccall, rsmith
Reviewed By: keryell, bader
Subscribers: mgorny, OlegM, ArturGainullin, agozillon, aaron.ballman,
ebevhan, Anastasia, cfe-commits
Tags: #clang
Differential Revision:
Signed-off-by: Alexey Bader <>
The file was added clang/test/SemaSYCL/kernel-attribute.cpp
The file was modified clang/include/clang/Basic/
The file was modified clang/include/clang/Basic/
The file was modified clang/lib/Sema/SemaDeclAttr.cpp
The file was added clang/test/SemaSYCL/kernel-attribute-on-non-sycl.cpp
The file was modified clang/include/clang/Basic/
Commit a315519c17abaa621eddd30fd116ac2e030a36e9 by anton.a.afanasyev
[SLP] Enhance SLPVectorizer to vectorize different combinations of
Summary: Make SLPVectorizer recognize homogeneous aggregates like
`{<2 x float>, <2 x float>}`, `{{float, float}, {float, float}}`,
`[2 x {float, float}]` and so on. It's a follow-up of Merged `findBuildVector()` and
`findBuildAggregate()` to one `findBuildAggregate()` function making it
recursive to recognize multidimensional aggregates. Aggregates are
required to be homogeneous.
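The recursive homogeneity requirement can be illustrated with a toy type
model. This is a sketch under stated assumptions, not the SLPVectorizer code:
`Ty`, `sameType` and `aggregateSize` are hypothetical, and vectors, structs
and arrays are all modelled as one generic "aggregate of elements".

```cpp
#include <cstddef>
#include <vector>

// Toy type model (hypothetical): either a scalar leaf or an aggregate.
struct Ty {
  bool isScalar;         // true for a leaf such as float
  int scalarKind;        // hypothetical tag, e.g. 0 = float, 1 = i32
  std::vector<Ty> elems; // sub-elements when !isScalar
};

// Structural equality of two toy types.
bool sameType(const Ty &a, const Ty &b) {
  if (a.isScalar != b.isScalar)
    return false;
  if (a.isScalar)
    return a.scalarKind == b.scalarKind;
  if (a.elems.size() != b.elems.size())
    return false;
  for (std::size_t i = 0; i < a.elems.size(); ++i)
    if (!sameType(a.elems[i], b.elems[i]))
      return false;
  return true;
}

// Returns the number of scalar leaves if the type is homogeneous at every
// nesting level, or 0 if it is not (or if an aggregate is empty).
unsigned aggregateSize(const Ty &t) {
  if (t.isScalar)
    return 1;
  if (t.elems.empty())
    return 0;
  for (std::size_t i = 1; i < t.elems.size(); ++i)
    if (!sameType(t.elems[0], t.elems[i]))
      return 0;
  // All elements are identical, so recursing into the first one suffices;
  // a 0 from the inner level propagates through the multiplication.
  return aggregateSize(t.elems[0]) * static_cast<unsigned>(t.elems.size());
}
```

In this model `{{float, float}, {float, float}}` yields 4 scalar leaves,
while `{float, i32}` is rejected with 0.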
Reviewers: RKSimon, ABataev, dtemirbulatov, spatel, vporpo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision:
The file was modified llvm/test/Transforms/SLPVectorizer/X86/pr42022.ll
The file was modified llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
Commit 5595249e48ef83bae5f2e61c0190332534902051 by kostyak
[scudo][standalone] Add chunk ownership function
Summary: In order to be compliant with tcmalloc's extension ownership
determination function, we have to expose a function that will say if a
chunk was allocated by us.
As to whether or not this has security consequences: someone able to
call this function repeatedly could use it to determine secrets (cookie)
or craft a valid header. So this should not be exposed directly to
untrusted user input.
Add related tests.
Additionally clang-format caught a few things to change.
Reviewers: hctim, pcc, cferris, eugenis, vitalybuka
Subscribers: JDevlieghere, jfb, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision:
The file was modified compiler-rt/lib/scudo/standalone/tests/combined_test.cpp
The file was modified compiler-rt/lib/scudo/standalone/combined.h
The file was modified compiler-rt/lib/scudo/standalone/chunk.h
Commit c9c714c7054d555398c767cb39d7d97600b3d9d1 by sam.mccall
Reland [clangd] Rethink how SelectionTree deals with macros and
This reverts commit 905b002c139f039a32ab9bf1fad63d745d12423f.
Avoid tricky (and invalid) comparator for std::set.
The file was modified clang-tools-extra/clangd/Selection.h
The file was modified clang-tools-extra/clangd/Selection.cpp
The file was modified clang/lib/Tooling/Syntax/Tokens.cpp
The file was modified clang-tools-extra/clangd/unittests/TweakTests.cpp
The file was modified clang/unittests/Tooling/Syntax/TokensTest.cpp
The file was modified clang-tools-extra/clangd/unittests/SelectionTests.cpp
The file was modified clang/include/clang/Tooling/Syntax/Tokens.h
Commit 372ad32734ecb455f9fb4d0601229ca2dfc78b66 by Saleem Abdulrasool
llvm-config: do not link absolute paths with `-l`
When dealing with system libraries which are absolute paths, use the
absolute path rather than the `-l` option.  This ensures that the system
library can be properly linked against.  This is needed to enable using
proper link dependencies in CMake.
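The rule can be sketched as a small helper. This is an illustration only:
the actual change is in CMake, `linkArgument` is a hypothetical name, and
treating a leading '/' as "absolute" covers POSIX-style paths only.

```cpp
#include <string>

// Hypothetical helper; not the code added by this commit.
// Produces the linker argument for one system library entry.
std::string linkArgument(const std::string &lib) {
  // Assumption: a leading '/' marks an absolute (POSIX-style) path.
  if (!lib.empty() && lib.front() == '/')
    return lib;      // pass the full path straight to the linker
  return "-l" + lib; // plain library name: let the linker search for it
}
```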
The file was modified llvm/tools/llvm-config/CMakeLists.txt
Commit 9a20c79ddc2523fb68be4d4246d7835c761c382f by lebedev.ri
[NFC][KnownBits] Add getMinValue() / getMaxValue() methods
As can be seen from the accompanying cleanup, it is not unheard of to
write `~Known.Zero` meaning "what maximal value can this KnownBits
produce". But I think `~Known.Zero` isn't *that* self-explanatory
compared to a method with a name.
Note that not all `~Known.Zero` places were cleaned up, only those where
this arguably improves things.
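The idea behind the two methods can be shown with a simplified stand-in.
`KnownBits8` is a hypothetical 8-bit model written for this example; the real
llvm::KnownBits stores its masks as APInt.

```cpp
#include <cstdint>

// Simplified 8-bit stand-in for llvm::KnownBits (hypothetical type).
struct KnownBits8 {
  std::uint8_t Zero; // bits known to be 0
  std::uint8_t One;  // bits known to be 1

  // Smallest possible value: every unknown bit is taken to be 0.
  std::uint8_t getMinValue() const { return One; }

  // Largest possible value: every unknown bit is taken to be 1,
  // which is exactly what `~Zero` spells out.
  std::uint8_t getMaxValue() const {
    return static_cast<std::uint8_t>(~Zero);
  }
};
```

For example, if the top four bits are known zero and the lowest bit is known
one, the value must lie between 0x01 and 0x0F.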
The file was modified llvm/lib/Analysis/ValueTracking.cpp
The file was modified llvm/lib/Support/KnownBits.cpp
The file was modified llvm/include/llvm/Support/KnownBits.h
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
The file was modified llvm/unittests/Support/KnownBitsTest.cpp
The file was modified llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
The file was modified llvm/lib/IR/ConstantRange.cpp
Commit 26748a321e20a7aa952ce8daa4f030c384ae7032 by mitchell
[clang-format] Add new option to add spaces around conditions
Summary: This diff adds a new option SpacesAroundConditions that inserts
spaces inside the braces for conditional statements.
Reviewers: klimek, owenpan, mitchell-stellar, MyDeveloperDay
Patch by: timwoj
Subscribers: rsmmr, cfe-commits
Tags: clang, clang-format
Differential Revision:
The file was modified clang/lib/Format/TokenAnnotator.cpp
The file was modified clang/include/clang/Format/Format.h
The file was modified clang/docs/ClangFormatStyleOptions.rst
The file was modified clang/unittests/Format/FormatTest.cpp
The file was modified clang/lib/Format/Format.cpp
Commit abe8de29c4ae5eca86f3594d2edd43b2fcbda623 by Saleem Abdulrasool
Revert "Temporarily revert "build: avoid hardcoding the libxml2 library
This reverts commit 2e75681b55ab55301022533b203269f5f3d6f909.  Restore
the clean up change.  The underlying CMake issue was resolved in
The file was modified llvm/cmake/config-ix.cmake
The file was modified llvm/lib/WindowsManifest/CMakeLists.txt
Commit 02b9c5d963c6c87a5dba46642c63bdb6b35901f1 by a.bataev
Revert "[libomptarget] Build a minimal deviceRTL for amdgcn"
This reverts commit 877ffa716fba52251a7454ffd3727d025b617a1f because it
breaks the build.
The file was modified openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h
The file was removed openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
The file was removed openmp/libomptarget/deviceRTLs/common/support.h
The file was added openmp/libomptarget/deviceRTLs/nvptx/src/support.h
The file was modified openmp/libomptarget/deviceRTLs/CMakeLists.txt
The file was removed openmp/libomptarget/deviceRTLs/amdgcn/src/device_environment.h
Commit 96c8024e2eb05278206b1eb59208bad0f3c68f2e by Dan Liew
Rename `tsan/` to `test/tsan/race_range_pc.cpp`.
The old suffix was preventing it from being executed by default.
The file was added compiler-rt/test/tsan/race_range_pc.cpp
The file was removed compiler-rt/test/tsan/
Commit 6ed9cef25f915d4533f261c401cee29d8d8012d5 by ayal.zaks
[LV] Scalar with predication must not be uniform
Fix PR40816: avoid considering scalar-with-predication instructions as
also uniform-after-vectorization.
Instructions identified as "scalar with predication" will be
"vectorized" using a replicating region. If such instructions are also
optimized as "uniform after vectorization", namely when only the first
of VF lanes is used, such a replicating region becomes erroneous - only
the first instance of the region can and should be formed. Fix such
cases by not considering such instructions as
"uniform after vectorization".
Differential Revision:
The file was modified llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp