Changes from Git (git http://labmaster3.local/git/llvm-project.git)


  1. [CMake] Add check-debuginfo-* targets (details)
  2. [AArch64] add vector test for merged condition branching; NFC (details)
  3. [DAGCombiner] rename variables for readability; NFC (details)
  4. [flang] Port remaining tests to FileCheck (details)
  5. [NFCI] Cleanup range checks in Register/MCRegister (details)
  6. AMDGPU: Add llvm.amdgcn.sqrt intrinsic (details)
  7. [libc++] Remove support for building through llvm-config (details)
  8. [CostModel] Avoid traditional ConstantExpr crashy pitfails (details)
  9. Correct documented spelling of ffinite-math to ffinite-math-only (details)
  10. [clang][SourceManager] cache Macro Expansions (details)
  11. [SVE] Code generation for fixed length vector adds. (details)
  12. [NFC] Builtins: list 'R' for restrict (details)
  13. [VPlan] Add & use VPValue for VPWidenGEPRecipe operands (NFC). (details)
  14. More corrections to documented spelling of ffinite-math to ffinite-math-only (details)
  15. Revert "[sve][acle] Add reinterpret intrinsics for brain float." (details)
  16. [InstCombine] Drop debug loc in TryToSinkInstruction (details)
  17. Extend or truncate __ptr32/__ptr64 pointers when dereferenced. (details)
Commit ac567eec119f7d288c6f47921b348aeea7d743cd by maskray
[CMake] Add check-debuginfo-* targets

* check-debuginfo-dexter runs lit tests under debuginfo-tests/dexter/
* check-debuginfo-llgdb-tests runs lit tests under debuginfo-tests/llgdb-tests/
* ...

Reviewed By: tbosch

Differential Revision:
The file was modified debuginfo-tests/CMakeLists.txt
Commit 67043ed8853569d25ad4f38c4626522b4958c914 by spatel
[AArch64] add vector test for merged condition branching; NFC
The file was added llvm/test/CodeGen/AArch64/vec-extract-branch.ll
Commit e7f7715eb9ba223c8e754604c0fd9e3ab0c3a044 by spatel
[DAGCombiner] rename variables for readability; NFC

PR46406 shows a pattern where we can do better, so try to clean this up
before adding more code.
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit b537c81b5fde154062ecf7521d41902b130dffe6 by richard.barton
[flang] Port remaining tests to FileCheck

Port the remaining tests which only require mechanical changes and delete

    * Delete old RUN lines
    * Replace:
        EXEC: ${F18} ... | ${FileCheck} ...
      with:
        RUN: %f18 .. | FileCheck ...
    * Prepend the RUN line with `not` when it is expected to fail

Also reinstate a de-activated EXEC line and port it in the same way.

Differential Revision:
The file was modified flang/test/Semantics/label07.f90
The file was modified flang/test/Semantics/label03.f90
The file was modified flang/test/Semantics/label08.f90
The file was modified flang/test/Semantics/label05.f90
The file was modified flang/test/Semantics/label14.f90
The file was modified flang/test/Semantics/canondo05.f90
The file was modified flang/test/Semantics/label13.f90
The file was modified flang/test/Semantics/canondo19.f90
The file was modified flang/test/Semantics/label04.f90
The file was modified flang/test/Semantics/doconcurrent03.f90
The file was modified flang/test/Semantics/doconcurrent07.f90
The file was modified flang/test/Semantics/label10.f90
The file was modified flang/test/Semantics/critical04.f90
The file was modified flang/test/Semantics/canondo03.f90
The file was modified flang/test/Semantics/doconcurrent02.f90
The file was modified flang/test/Semantics/canondo04.f90
The file was modified flang/test/Semantics/label02.f90
The file was modified flang/test/Semantics/label09.f90
The file was modified flang/test/Semantics/canondo07.f90
The file was modified flang/test/Semantics/canondo06.f90
The file was removed flang/test/Semantics/
The file was modified flang/test/Semantics/label12.f90
The file was modified flang/test/Semantics/canondo01.f90
The file was modified flang/test/Semantics/canondo02.f90
The file was modified flang/test/Semantics/label06.f90
Commit 16dae81edc240b7c9f58e4b5ec0cf8d5ba0d847d by daltenty
[NFCI] Cleanup range checks in Register/MCRegister

Remove casts from unsigned to int, which may be implementation-defined
according to C++14 (and thus trip up the XL compiler on AIX), by
using unsigned comparisons/masks instead. Also refactor out the range
constants to clean things up a bit while we are at it.
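
As a rough illustration of the idea (not the code from this commit), the
sketch below assumes a 32-bit register id in which the top bit marks
virtual registers and the range [2^30, 2^31) holds stack slots. The
namespace, the constant names and the helper names are all hypothetical.
A cast-based check such as `int(Reg) < 0` depends on how an out-of-range
unsigned value converts to int, whereas the unsigned comparisons and
masks below do not.

```cpp
#include <cstdint>

namespace sketch {
// Assumed layout (hypothetical, for illustration only):
//   0                 no register
//   [1, 2^30)         physical registers
//   [2^30, 2^31)      stack slots
//   [2^31, 2^32)      virtual registers (top bit set)
constexpr std::uint32_t FirstStackSlot = 1u << 30;
constexpr std::uint32_t VirtualRegFlag = 1u << 31;

// Range check done purely with unsigned comparisons.
constexpr bool isStackSlot(std::uint32_t Reg) {
  return FirstStackSlot <= Reg && Reg < VirtualRegFlag;
}

// Mask test instead of casting to int and comparing against zero.
constexpr bool isVirtualRegister(std::uint32_t Reg) {
  return (Reg & VirtualRegFlag) != 0;
}

// Everything non-zero below the first stack slot is a physical register.
constexpr bool isPhysicalRegister(std::uint32_t Reg) {
  return Reg != 0 && Reg < FirstStackSlot;
}

// The checks are constexpr, so the boundaries can be verified at compile time.
static_assert(isVirtualRegister(0x80000000u), "top bit marks virtual registers");
static_assert(!isStackSlot(0x80000000u), "virtual registers are not stack slots");
} // namespace sketch
```

Naming the range constants once keeps each boundary in a single place, so
all three predicates stay consistent with each other.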

Reviewers: hubert.reinterpretcast, arsenm

Reviewed By: hubert.reinterpretcast

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision:
The file was modified llvm/include/llvm/MC/MCRegister.h
The file was modified llvm/include/llvm/CodeGen/Register.h
Commit 9e03bdebc17a223416d682f64ef2046b8bf0fc98 by arsenm2
AMDGPU: Add llvm.amdgcn.sqrt intrinsic

I spread the GlobalISel test into the regular one, which I've been
avoiding so far.
The file was modified clang/test/SemaOpenCL/
The file was modified llvm/include/llvm/IR/
The file was added llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sqrt.f16.ll
The file was modified clang/lib/CodeGen/CGBuiltin.cpp
The file was modified clang/test/CodeGenOpenCL/
The file was modified clang/include/clang/Basic/BuiltinsAMDGPU.def
The file was modified clang/test/CodeGenOpenCL/
The file was modified llvm/lib/Target/AMDGPU/
The file was added llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sqrt.ll
The file was modified llvm/lib/Target/AMDGPU/
The file was modified llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
Commit 8bc62db272448035d60ca9f31ff67040a288063c by Louis Dionne
[libc++] Remove support for building through llvm-config

A while ago, we decided to move away from that by requiring that libc++
be built as part of the monorepo. This commit removes code pertaining
to that unsupported use case and produces a clear error when the user
violates that requirement.

In fact, building outside of the monorepo will still work as long as
LLVM_PATH is pointing to the root of the LLVM project, although that
is not officially supported.
The file was modified libcxx/include/CMakeLists.txt
The file was modified libcxx/cmake/Modules/HandleOutOfTreeLLVM.cmake
The file was modified libcxx/cmake/Modules/HandleLibCXXABI.cmake
Commit 64258773ad99b3b6a3eb2a456b79518c1444d9f3 by lebedev.ri
[CostModel] Avoid traditional ConstantExpr crashy pitfails

I'm not sure if this is a regression from D81448 + D81643,
which moved at least the code cast from elsewhere,
or somehow no one triggered that before.
But now we can reach it with a non-instruction.

It is not straightforward to write cost-model tests for constantexprs:
`-cost-model -analyze -cost-kind=` does not appear to look at them,
or maybe I'm doing it wrong.

I've encountered that via a SimplifyCFG crash,
so a reduced (currently-crashing) test is added.
There are likely other instances.

For now, simply restore previous status quo of
not crashing and returning TTI::TCC_Basic.
The file was modified llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
The file was added llvm/test/Transforms/SimplifyCFG/constantexprs.ll
Commit 7cc5307c73caa72f22edd4208b175e3c36eec46e by melanie.blower
Correct documented spelling of ffinite-math to ffinite-math-only

This is to correct
ffinite-math-only is a gcc option.  That is the correct spelling.
File modified is clang/docs/UsersManual.rst
The file was modified clang/docs/UsersManual.rst
Commit dffc1420451f674731cb36799c8ae084104ff0b5 by ndesaulniers
[clang][SourceManager] cache Macro Expansions

A seemingly innocuous Linux kernel change [0] appears to have blown up
our compile times by over 3x, as reported by @nathanchance in [1].

The code in question uses a doubly nested macro containing GNU C
statement expressions that are then passed to typeof(), which is then
used in a very important macro for atomic variable access throughout
most of the kernel. The innermost macro is passed a GNU C statement
expression.  In this case, we have macro arguments that are GNU C
statement expressions, which can contain a significant number of tokens.
The upstream kernel patch caused significant build time regressions for
both Clang and GCC. Since then, some of the nesting has been removed via
@melver, which helps gain back most of the lost compilation time. [2]

Profiles collected [3] from compilations of the slowest TU for us in the
kernel show:
* 51.4% time spent in clang::TokenLexer::updateLocForMacroArgTokens
* 48.7% time spent in clang::SourceManager::getFileIDLocal
* 35.5% time spent in clang::SourceManager::isOffsetInFileID
(mostly calls from the former through to the latter).

So it seems we have a pathological case for which properly tracking the
SourceLocation of macro arguments is significantly harming build
performance. This stands out in the referenced flame graph.

In fact, this case was identified previously as being problematic in
commit 3339c568c4 ("[Lex] Speed up updateConsecutiveMacroArgTokens (NFC)")

Looking at the above call chain, there are three things we can do to speed up
this case.

1. TokenLexer::updateConsecutiveMacroArgTokens() calls
   SourceManager::isWrittenInSameFile() which calls
   SourceManager::getFileID(), which is both very hot and very expensive
   to call. SourceManager has a one entry cache, member LastFileIDLookup.
   If that isn't the FileID for a given source location offset, we fall
   back to a linear probe, and then to a binary search for the FileID.
   These fallbacks update the one entry cache, but notably they do
   not for the case of macro expansions!

   For the slowest TU to compile in the Linux kernel, it seems that we
   miss about 78.67% of the 68 million queries we make to getFileIDLocal
   that we could have had cache hits for, had we saved the macro
   expansion source location's FileID in the one entry cache. [4]

   I tried adding a separate cache item for macro expansions, and to
   check that before the linear then binary search fallbacks, but did
   not find it faster than simply allowing macro expansions into the one
   item cache.  This alone nets us back a lot of the performance loss.

   That said, this is a modification of caching logic, which is playing
   with a double edged sword.  While it significantly improves the
   pathological case, it's hard to say that there's not an equal but
   opposite pathological case that isn't regressed by this change.
   Though non-pathological cases of builds of the Linux kernel before
   [0] are only slightly improved (<1%) and builds of LLVM itself don't
   change due to this patch.

   Should future travelers find this change to significantly harm their
   build times, I encourage them to feel empowered to revert this

2. SourceManager::getFileIDLocal has a FIXME hinting that the call to
   SourceManager::isOffsetInFileID could be made much faster since
   isOffsetInFileID is generic in the sense that it tries to handle the
   more generic case of "local" (as opposed to "loaded") files, though
   the caller has already determined the file to be local. This patch
   implements a new method that is specialized for use when the caller
   already knows the file is local, then uses that in
   TokenLexer::updateLocForMacroArgTokens.  This should be less
   controversial than 1, and is likely an across the board win. It's
   much less significant for the pathological case, but still a
   measurable win once we have fallen to the final case of binary
   search.  D82497

3. A bunch of methods in SourceManager take a default argument.
   SourceManager::getLocalSLocEntry doesn't do anything with this
   argument, yet many callers of getLocalSLocEntry setup, pass, then
   check this argument. This is wasted work.  D82498
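
The lookup strategy described in item 1 (a one entry cache, then a linear
probe, then a binary search) can be sketched as follows. This is a
simplified stand-in, not the SourceManager code: the class name, the
member names, the probe length and the probe direction are all
hypothetical. The point it tries to show is that every successful
fallback refreshes the one entry cache, so a run of nearby queries hits
the cache instead of repeating the search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical table of entries sorted by start offset. Entry I covers
// [Starts[I], Starts[I + 1]); the last entry is open-ended.
class IntervalTable {
public:
  explicit IntervalTable(std::vector<unsigned> SortedStarts)
      : Starts(std::move(SortedStarts)) {
    assert(!Starts.empty() && Starts.front() == 0 && "table must start at 0");
  }

  // Returns the index of the entry that contains Offset.
  std::size_t lookup(unsigned Offset) {
    ++Queries;
    // 1. One entry cache.
    if (contains(Last, Offset)) {
      ++Hits;
      return Last;
    }
    // 2. Short linear probe over the entries right after the cached one.
    std::size_t End = std::min(Last + 1 + ProbeLimit, Starts.size());
    for (std::size_t I = Last + 1; I < End; ++I)
      if (contains(I, Offset))
        return Last = I; // refresh the cache on success
    // 3. Binary search: the last entry whose start is <= Offset.
    auto It = std::upper_bound(Starts.begin(), Starts.end(), Offset);
    return Last = static_cast<std::size_t>(It - Starts.begin()) - 1;
  }

  std::size_t hits() const { return Hits; }
  std::size_t queries() const { return Queries; }

private:
  bool contains(std::size_t I, unsigned Offset) const {
    return Starts[I] <= Offset &&
           (I + 1 == Starts.size() || Offset < Starts[I + 1]);
  }

  static constexpr std::size_t ProbeLimit = 4; // arbitrary, for illustration
  std::vector<unsigned> Starts;
  std::size_t Last = 0; // index held by the one entry cache
  std::size_t Hits = 0;
  std::size_t Queries = 0;
};
```

Comparing hits() against queries() on a representative workload gives
the kind of miss rate quoted above. Dropping the cache refresh from one
of the fallback paths would show how quickly that rate degrades.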

With this patch applied, the above profile [5] for the same pathological
input looks like:
* 25.1% time spent in clang::TokenLexer::updateLocForMacroArgTokens
* 17.2% time spent in clang::SourceManager::getFileIDLocal
and clang::SourceManager::isOffsetInFileID is no longer called, and thus
falls out of the profile.

There may be further improvements to the general problem of "what
interval contains one number out of millions" than the current use of a
one item cache, followed by linear probing, followed by binary
searching. We might even be able to do something smarter in


Reviewed By: kadircet

Differential Revision:
The file was modified clang/lib/Basic/SourceManager.cpp
Commit 3a98d5d7e7f5c651f1f22bf8dc552d5161cb999e by paul.walker
[SVE] Code generation for fixed length vector adds.

Teach LowerToPredicatedOp to lower fixed length vector operations.

Add AArch64ISD nodes and isel patterns for predicated integer
and floating point adds.

Together this enables SVE code generation for fixed length vector adds.

Reviewers: rengolin, efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision:
The file was modified llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
The file was modified llvm/lib/Target/AArch64/
The file was added llvm/test/CodeGen/AArch64/sve-fixed-length-fp-arith.ll
The file was modified llvm/lib/Target/AArch64/
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.h
The file was added llvm/test/CodeGen/AArch64/sve-fixed-length-int-arith.ll
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Commit 13fdcd37b325f62ff2513c59807de9ad0a9d2a51 by JF Bastien
[NFC] Builtins: list 'R' for restrict

It was added to the list of builtin modifiers in r148573 on 2012-01-20, but the comment wasn't updated.
The file was modified clang/include/clang/Basic/Builtins.def
Commit c0cdba727ab29fb8ed2758a93a61d9658036ffe7 by flo
[VPlan] Add & use VPValue for VPWidenGEPRecipe operands (NFC).

This patch adds VPValue versions of the GEP's operands to
VPWidenGEPRecipe and uses them during code generation.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision:
The file was modified llvm/lib/Transforms/Vectorize/VPlan.h
The file was modified llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Commit 467ba4c92f5bfafd88e154c1657d6ac11dfb34df by melanie.blower
More corrections to documented spelling of ffinite-math to ffinite-math-only
The file was modified clang/docs/UsersManual.rst
Commit ff5ccf258e297df29f32d6b5e4fa0a7b95c44f9c by francesco.petrogalli
Revert "[sve][acle] Add reinterpret intrinsics for brain float."

This reverts commit a15722c5ce4759c12960fe434ee6bd8aac70bb16.

The commit has to be reverted because I accidentally submitted it without the C tests that were added in
an earlier version of the patch.
The file was modified llvm/lib/Target/AArch64/
The file was modified llvm/test/CodeGen/AArch64/sve-bitcast.ll
The file was modified clang/utils/TableGen/SveEmitter.cpp
Commit 903cf140d0118cf0d3f0f6f8967c6a20d9c5be6b by Vedant Kumar
[InstCombine] Drop debug loc in TryToSinkInstruction

The advice in HowToUpdateDebugInfo.rst is to "... preserve the debug
location of an instruction if the instruction either remains in its
basic block, or if its basic block is folded into a predecessor that
branches unconditionally".

TryToSinkInstruction doesn't seem to satisfy the criteria as it's
sinking an instruction to some successor block. Preserving the debug loc
can make single-stepping appear to go backwards, or make a breakpoint
hit on that location happen "too late" (since single-stepping from that
breakpoint can cause the function to return unexpectedly).

So, drop the debug location.

Reviewers: aprantl, davide

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision:
The file was added llvm/test/Transforms/InstCombine/sink_to_unreachable_dbg.ll
The file was modified llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
Commit 8b59c26bf347be5d96487c89849c0c1108bb3c42 by akhuang
Extend or truncate __ptr32/__ptr64 pointers when dereferenced.

A while ago I implemented the functionality to lower Microsoft __ptr32
and __ptr64 pointers, which are stored as 32-bit and 64-bit pointers
and are extended/truncated to the appropriate pointer size when
This patch adds an addrspacecast to cast from the __ptr32/__ptr64
pointer to a default address space when dereferencing.


Reviewers: hans, arsenm, RKSimon

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision:
The file was modified llvm/test/CodeGen/X86/mixed-ptr-sizes.ll
The file was modified llvm/test/CodeGen/X86/mixed-ptr-sizes-i686.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp