Started 3 days 16 hr ago
Took 5 hr 10 min

Success Build clang-r364076-t57556-b57556.tar.gz (Jun 21, 2019 12:42:35 PM)

Issues

No known issues detected

Build Log

Revision: 362564
Changes
  1. Tabs to spaces. NFC. (detail)
    by gkistanova
  2. Allow merge build requests for lld-x86_64-win7. (detail)
    by gkistanova
  3. [GN] Report check-clang-tools error as warning (detail)
    by Vitaly Buka
  4. [lldb-cmake-standalone] Streamline labels of build stages (detail)
    by stefan.graenitz
  5. [lldb-cmake-standalone] Build and test CMake-generated Xcode project in Debug mode (detail)
    by stefan.graenitz
  6. [lldb-cmake-standalone] Build provided LLVM tree as RelWithDebInfo (detail)
    by stefan.graenitz
  7. [zorg] Add solaris11-amd64, solaris11-sparcv9 builders

    I'm working to provide two Solaris 11.4 build slaves with a clang builder
    each, one on amd64, the other on sparcv9.  I'm still working out the
    details like parallelism and max_builds, but the attached patch captures
    the basics, intended to be minimal.

    Differential Revision: https://reviews.llvm.org/D63495 (detail)
    by ro
  8. Separate options into separate array elements. (detail)
    by Adrian Prantl
  9. Run the debuginfo-tests as part of the lldb-cmake bot.

    rdar://problem/51799130 (detail)
    by Adrian Prantl
  10. Remove the debuginfo-tests from the default llvm configuration.

    I'm going to make them a separate bot where we have more control over what the host LLDB is.

    rdar://problem/51799130 (detail)
    by Adrian Prantl
  11. Add lit timeout for lldb arm/aarch64 ubuntu builders

    This patch adds a timeout interval of 200 seconds for any hanging tests. (detail)
    by omjavaid
  12. [zorg] Add lldb-arm-ubuntu builder

    This patch adds an lldb arm linux builder. It'll run on staging master until tests become stable.

    Differential revision:https://reviews.llvm.org/D63441 (detail)
    by omjavaid
  13. Moved builder lld-x86_64-win7 to another machine.
    Removed slave/worker as-bldslv4. (detail)
    by gkistanova
  14. [lldb-cmake-standalone] CMake-generated Xcode project should build the LLDB.framework (detail)
    by stefan.graenitz
  15. [lldb-cmake-standalone] Invoke llvm-lit manually for CMake-generated Xcode project in order to pass --verbose flag (detail)
    by stefan.graenitz
Revision: 362564
Changes
  1. [DAGCombine] narrowExtractedVectorBinOp - pull out repeated getOpcode(). NFCI. (detail)
    by rksimon
  2. [AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal.

    We sometimes get poor code size because constants of types < 32b are legalized
    as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the
    localizer can no longer sink them (although it's possible to extend it to do so).

    On AArch64 however s8 and s16 constants can be selected in the same way as s32
    constants, with a mov pseudo into a W register. If we make s8 and s16 constants
    legal then we can avoid unnecessary truncates, they can be CSE'd, and the
    localizer can sink them as normal.

    There is a caveat: if the user of a smaller constant has to widen the sources,
    we end up with an anyext of the smaller typed G_CONSTANT. This can cause
    regressions because of the additional extend and missed pattern matching. To
    remedy this, there's a new artifact combiner to generate the wider G_CONSTANT
    if it's legal for the target.

    Differential Revision: https://reviews.llvm.org/D63587 (detail)
    by aemerson
  3. [AMDGPU] hazard recognizer for fp atomic to s_denorm_mode

    This requires 3 wait states unless there is a wait or VALU in
    between.

    Differential Revision: https://reviews.llvm.org/D63619 (detail)
    by rampitec
  4. [InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1

    Summary:
    ```
    %a = sub i32 31, %x
    %r = shl i32 1, %a
      =>
    %d = shl i32 1, 31
    %r = lshr i32 %d, %x

    Done: 1
    Optimization is correct!
    ```

    https://rise4fun.com/Alive/btZm

    Reviewers: spatel, lebedev.ri, nikic

    Reviewed By: lebedev.ri

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63652 (detail)
    by xbolva00
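
The fold in entry 4 can be checked by brute force outside of Alive. The following is a minimal sketch, not part of the commit: it models i32 `shl`/`lshr` with Python integers, assumes the shift amount `x` stays in range (0..31), and uses the hypothetical helper names `before` and `after`.

```python
# Brute-force check of (1 << (C - x)) -> ((1 << C) >> x) for i32.
# Assumption: x is an in-range shift amount (0..31); out-of-range
# shifts are poison in LLVM IR and are not modelled here.
BITS = 32
MASK = (1 << BITS) - 1
C = BITS - 1  # the fold is stated for C == bitwidth - 1


def before(x):
    # %a = sub i32 31, %x ; %r = shl i32 1, %a
    return (1 << (C - x)) & MASK


def after(x):
    # %d = shl i32 1, 31 ; %r = lshr i32 %d, %x
    return ((1 << C) & MASK) >> x


mismatches = [x for x in range(BITS) if before(x) != after(x)]
print(mismatches)
```

An empty list means both forms agree for every in-range shift amount.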
  5. [X86] isBinOp - move commutative ops to isCommutativeBinOp. NFCI.

    TargetLoweringBase::isBinOp checks isCommutativeBinOp as a fallback, so don't duplicate. (detail)
    by rksimon
  6. [NFC] Added more tests for D63652 (detail)
    by xbolva00
  7. Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. (detail)
    by rksimon
  8. [InstCombine] cttz(abs(x)) -> cttz(x)

    Summary: Signedness does not change number of trailing zeros.

    Reviewers: spatel, lebedev.ri, nikic

    Reviewed By: lebedev.ri

    Differential Revision: https://reviews.llvm.org/D63546 (detail)
    by xbolva00
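
The claim in entry 8 that taking the absolute value does not change the trailing-zero count can be checked exhaustively on a small type. This is a sketch under assumptions: it uses an 8-bit width instead of the widths handled by the patch, and `cttz` and `abs_wrapped` are hypothetical helpers that model two's-complement behaviour.

```python
# Exhaustive check of cttz(abs(x)) == cttz(x) on 8-bit patterns.
BITS = 8
MASK = (1 << BITS) - 1


def cttz(v):
    # Count trailing zeros; defined here as BITS for a zero input.
    v &= MASK
    if v == 0:
        return BITS
    return (v & -v).bit_length() - 1


def abs_wrapped(v):
    # Two's-complement abs on a BITS-wide pattern; abs(-128) wraps to 0x80.
    v &= MASK
    signed = v - (1 << BITS) if v >> (BITS - 1) else v
    return abs(signed) & MASK


all_equal = all(cttz(abs_wrapped(v)) == cttz(v) for v in range(1 << BITS))
print(all_equal)
```

Negation in two's complement keeps the lowest set bit in place, which is why the count is unchanged.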
  9. [GVNSink] prevent crashing on mismatched instructions (PR42346)

    Patch based on suggestion by James Molloy (@jmolloy) in:
    https://bugs.llvm.org/show_bug.cgi?id=42346 (detail)
    by spatel
  10. [NFC] Added tests for (1 << (C - x)) -> ((1 << C) >> x) (detail)
    by xbolva00
  11. [DAGCombine] narrowInsertExtractVectorBinOp - reuse "extract from insert" detection code.

    Move the "extract from insert detection code" into a lambda helper function. (detail)
    by rksimon
  12. [docs][llvm-objdump] Fix bad merge of docs (detail)
    by jhenderson
  13. [llvm-objcopy] - Get rid of dynrel.elf precompiled binary from inputs.

    We do not have to keep using precompiled binaries in the tests,
    when we can use YAML. This patch removes the dynrel.elf binary and adds
    a few comments to the test cases.

    Differential revision: https://reviews.llvm.org/D63641 (detail)
    by grimar
  14. [Scalarizer] Propagate IR flags

    Summary:
    The motivation for this was to propagate fast-math flags like nnan and
    ninf on vector floating point operations to the corresponding scalar
    operations to take advantage of follow-on optimizations. But I think
    the same argument applies to all of our IR flags: if they apply to the
    vector operation then they also apply to all the individual scalar
    operations, and they might enable follow-on optimizations.

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63593 (detail)
    by foad
  15. [llvm-readobj] - Inline a few yaml inputs into test cases.

    There are some tests that are split into a main part + input yaml for no visible reason.
    This patch inlines the yaml part for the 3 test cases I found.

    Differential revision: https://reviews.llvm.org/D63644 (detail)
    by grimar
  16. Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045. NFC

    This should unbreak the ppc64 buildbots. (detail)
    by adibiagio
  17. [RISCV] Add RISCV-specific TargetTransformInfo

    Summary:
    LLVM allows targets to provide information that guides optimisations
    made to LLVM IR. This is done with callbacks on a TargetTransformInfo object.

    This patch adds a TargetTransformInfo class for RISC-V. This will allow us to
    implement RISC-V specific callbacks as they become necessary.

    This commit also adds the getIntImmCost callbacks, and tests them with a simple
    constant hoisting test. Our immediate costs are on the conservative side, for
    the moment, but we prevent hoisting in most circumstances anyway.

    Previous review was on D63007

    Reviewers: asb, luismarques

    Reviewed By: asb

    Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63433 (detail)
    by lenary
  18. [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation.

    This patch teaches the bottleneck analysis how to identify and print the most
    expensive sequence of instructions according to the simulation. Fixes PR37494.

    The goal is to help users identify the sequence of instructions which is most
    critical for performance.

    A dependency graph is internally used by the bottleneck analysis to describe
    data dependencies and processor resource interferences between instructions.

    There is one node in the graph for every instruction in the input assembly
    sequence. The number of nodes in the graph is independent from the number of
    iterations simulated by the tool. It means that a single node of the graph
    represents all the possible instances of the same instruction contributed by the
    simulated iterations.

    Edges are dynamically "discovered" by the bottleneck analysis by observing
    instruction state transitions and "backend pressure increase" events generated
    by the Execute stage. Information from the events is used to identify critical
    dependencies, and materialize edges in the graph. A dependency edge is uniquely
    identified by a pair of node identifiers plus an instance of struct
    DependencyEdge::Dependency (which provides more details about the actual
    dependency kind).

    The bottleneck analysis internally ranks dependency edges based on their impact
    on the runtime (see field DependencyEdge::Dependency::Cost). To this end, each
    edge of the graph has an associated cost. By default, the cost of an edge is a
    function of its latency (in cycles). In practice, the cost of an edge is also a
    function of the number of cycles where the dependency has been seen as
    'contributing to backend pressure increases'. The idea is that the higher the
    cost of an edge, the higher is the impact of the dependency on performance. To
    put it in another way, the cost of an edge is a measure of criticality for
    performance.

    Note how the same edge may be found in multiple iterations of the simulated loop.
    The logic that adds new edges to the graph checks if an equivalent dependency
    already exists (duplicate edges are not allowed). If an equivalent dependency
    edge is found, field DependencyEdge::Frequency of that edge is incremented by
    one, and the new cost is cumulatively added to the existing edge cost.

    At the end of simulation, costs are propagated to nodes through the edges of the
    graph. The goal is to identify a critical sequence from a node of the root-set
    (composed of the nodes of the graph with no predecessors) to a 'sink node' with no
    successors.  Note that the graph is intentionally kept acyclic to minimize the
    complexity of the critical sequence computation algorithm (complexity is
    currently linear in the number of nodes in the graph).

    The critical path is finally computed as a sequence of dependency edges. For
    edges describing processor resource interferences, the view also prints a
    so-called "interference probability" value (by dividing field
    DependencyEdge::Frequency by the total number of iterations).

    Examples of critical sequence computations can be found in tests added/modified
    by this patch.

    On output streams that support colored output, instructions from the critical
    sequence are rendered with a different color.

    Strictly speaking the analysis conducted by the bottleneck analysis view is not
    a critical path analysis. The cost of an edge doesn't only depend on the
    dependency latency. More importantly, the cost of the same edge may be computed
    differently by different iterations.

    Dependencies are discovered dynamically based on the events
    generated by the simulator, so their number is not fixed. This is
    especially true for edges that model processor resource interferences; an
    interference may not occur in every iteration. For that reason, it makes sense
    to also print out a "probability of interference".

    By construction, the accuracy of this analysis (as always) is strongly dependent
    on the simulation (and therefore the quality of the information available in the
    scheduling model).

    That being said, the critical sequence effectively identifies a performance
    criticality. Instructions from that sequence are expected to have a very big
    impact on performance. So, users can take advantage of this information to focus
    their attention on specific interactions between instructions.
    In my experience, it works quite well in practice, and produces useful
    output (in a reasonable amount of time).

    Differential Revision: https://reviews.llvm.org/D63543 (detail)
    by adibiagio
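
The edge-merging and critical-sequence steps described in entry 18 can be illustrated with a small model. This is a hedged sketch, not the llvm-mca implementation: the names `add_edge` and `critical_sequence`, the dictionary layout, and the "largest accumulated cost" propagation rule are all assumptions made for illustration. It also assumes every edge points from an earlier instruction to a later one, so index order is a valid topological order.

```python
from collections import defaultdict


def add_edge(edges, src, dst, kind, cost):
    """Record a dependency; an equivalent edge is merged, never duplicated."""
    assert src < dst, "sketch assumes edges go forward, keeping the graph acyclic"
    entry = edges.setdefault((src, dst, kind), {"cost": 0, "frequency": 0})
    entry["cost"] += cost        # costs accumulate across iterations
    entry["frequency"] += 1      # how many iterations saw this edge


def critical_sequence(num_nodes, edges):
    """Return the most expensive chain of edges from a root to a sink."""
    incoming = defaultdict(list)
    for key in edges:
        incoming[key[1]].append(key)

    total = [0] * num_nodes      # best accumulated cost reaching each node
    pred = [None] * num_nodes    # edge chosen to reach each node
    for node in range(num_nodes):            # index order is topological
        for key in incoming[node]:
            candidate = total[key[0]] + edges[key]["cost"]
            if candidate > total[node]:
                total[node], pred[node] = candidate, key

    has_successor = {key[0] for key in edges}
    sinks = [n for n in range(num_nodes) if n not in has_successor]
    node = max(sinks, key=lambda n: total[n])

    path = []
    while pred[node] is not None:            # walk back to a root node
        path.append(pred[node])
        node = pred[node][0]
    return path[::-1]


# Two simulated iterations of a three-instruction loop body.
iterations = 2
edges = {}
add_edge(edges, 0, 1, "register", 3)
add_edge(edges, 1, 2, "register", 1)
add_edge(edges, 0, 2, "resource", 2)   # interference seen only once
add_edge(edges, 0, 1, "register", 3)
add_edge(edges, 1, 2, "register", 1)

path = critical_sequence(3, edges)
probability = edges[(0, 2, "resource")]["frequency"] / iterations
print(path, probability)
```

Each node and edge is visited a constant number of times, so the work grows linearly with the size of the graph. The final division mirrors the "interference probability" the commit message describes for resource edges.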
  19. [ARM] Add MVE 64-bit GPR <-> vector move instructions.

    These instructions let you load half a vector register at once from
    two general-purpose registers, or vice versa.

    The assembly syntax for these instructions mentions the vector
    register name twice. For the move _into_ a vector register, the MC
    operand list also has to mention the register name twice (once as the
    output, and once as an input to represent where the unchanged half of
    the output register comes from). So we can conveniently assign one of
    the two asm operands to be the output $Qd, and the other $QdSrc, which
    avoids confusing the auto-generated AsmMatcher too much. For the move
    _from_ a vector register, there's no way to get round the fact that
    both instances of that register name have to be inputs, so we need a
    custom AsmMatchConverter to avoid generating two separate output MC
    operands. (And even that wouldn't have worked if it hadn't been for
    D60695.)

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62679 (detail)
    by statham
  20. [ARM] Add MVE vector instructions that take a scalar input.

    This adds the `MVE_qDest_rSrc` superclass and all its instances, plus
    a few other instructions that also take a scalar input register or two.

    I've also belatedly added custom diagnostic messages to the operand
    classes for odd- and even-numbered GPRs, which required matching
    changes in two of the existing MVE assembly test files.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62678 (detail)
    by statham
  21. Fix a crash with assembler source and -g.

    llvm-mc or clang with -g normally produces debug info describing the
    assembler source itself; however, if that source already contains some
    .file/.loc directives, we should instead emit the debug info described
    by those directives.  For certain assembler sources seen in the wild
    (particularly in the Chrome build) this was causing a crash due to
    incorrect assumptions about legal sequences of assembler source text.

    Fixes PR38994.

    Differential Revision: https://reviews.llvm.org/D63573 (detail)
    by probinson
  22. [X86] X86ISD::ANDNP is a (non-commutative) binop

    The sat add/sub tests still have unnecessary extract_subvector((vandnps ymm, ymm), 0) uses that should be split to (vandnps (extract_subvector(ymm, 0), extract_subvector(ymm, 0))), but it's getting better. (detail)
    by rksimon
  23. [ARM] Add a batch of similarly encoded MVE instructions.

    Summary:
    This adds the `MVE_qDest_qSrc` superclass and all instructions that
    inherit from it. It's not the complete class of _everything_ with a
    q-register as both destination and source; it's a subset of them that
    all have similar encodings (but it would have been hopelessly unwieldy
    to call it anything like MVE_111x11100).

    This category includes add/sub with carry; long multiplies; halving
    multiplies; multiply and accumulate, and some more complex
    instructions.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62677 (detail)
    by statham
  24. [binutils] Add response file option to help and docs

    Many LLVM-based tools already support response files (i.e. files
    containing a list of options, specified with '@'). This change simply
    updates the documentation and help text for some of these tools to
    include it. I haven't attempted to fix all tools, just a selection that
    I am interested in.

    I've taken the opportunity to add some tests for --help behaviour, where
    they were missing. We could expand these tests, but I don't think that's
    within scope of this patch.

    This fixes https://bugs.llvm.org/show_bug.cgi?id=42233 and
    https://bugs.llvm.org/show_bug.cgi?id=42236.

    Reviewed by: grimar, MaskRay, jkorous

    Differential Revision: https://reviews.llvm.org/D63597 (detail)
    by jhenderson
  25. [X86] createMMXBuildVector - call with BuildVectorSDNode directly. NFCI. (detail)
    by rksimon
  26. [llvm-dwarfdump] Remove unnecessary explicit -h behaviour

    --help and -h are automatically supported by the command-line parser,
    unless overridden by the tool. The behaviour of the PrintHelpMessage
    being used for -h prior to this patch is subtly different to that
    provided by --help automatically (it omits certain elements of help text
    and options, such as --help-list), so overriding the default is not
    desirable, without good reason. This patch removes the explicit
    specification of -h and its behaviour, so that the default behaviour is
    used.

    Reviewed by: hintonda

    Differential Revision: https://reviews.llvm.org/D63565 (detail)
    by jhenderson
  27. [ARM] Fix -Wimplicit-fallthrough after D62675 (detail)
    by maskray
  28. [ARM] Add MVE vector compare instructions.

    Summary:
    These take a pair of vector registers to compare, and a comparison type
    (written in the form of an Arm condition suffix); they output a vector
    of booleans in the VPR register, where predication can conveniently
    use them.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62676 (detail)
    by statham
  29. [X86] combineAndnp - use isNOT instead of manually checking for (XOR x, -1) (detail)
    by rksimon
  30. [Symbolize] Avoid lifetime extension and simplify std::map find/insert. NFC (detail)
    by maskray
  31. [X86] foldVectorXorShiftIntoCmp - use isConstOrConstSplat. NFCI.

    Use the isConstOrConstSplat helper instead of inspecting the build vector manually. (detail)
    by rksimon
  32. [X86][AVX] isNOT - handle concat_vectors(xor X, -1, xor Y, -1) pattern (detail)
    by rksimon
  33. [docs][llvm-objdump] Improve llvm-objdump documentation

    The llvm-objdump document was missing many options, and there were also
    some style issues with it. This patch fixes all but the first issue
    listed in https://bugs.llvm.org/show_bug.cgi?id=42249 by:

        1. Adding missing options and commands.
        2. Standardising on double dashes for long-options throughout.
        3. Moving Mach-O specific options to a separate section.
        4. Removing options that don't exist or aren't relevant to
           llvm-objdump.

    Reviewed by: MaskRay, mtrent, alexshap

    Differential Revision: https://reviews.llvm.org/D63606 (detail)
    by jhenderson
  34. [GN] Fix check-clang by disabling plugins

    We can't link Analysis/plugins without -fPIC (detail)
    by Vitaly Buka
  35. [GN] Put libcxx include into the same place as cmake to fix Driver/print-file-name.c test (detail)
    by Vitaly Buka
  36. [ARM] Add a batch of MVE floating-point instructions.

    Summary:
    This includes floating-point basic arithmetic (add/sub/multiply),
    complex add/multiply, unary negation and absolute value, rounding to
    integer value, and conversion to/from integer formats.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62675 (detail)
    by statham
  37. Use std::iterator_traits to infer result type of llvm::enumerate iterator wrapper

    Update the llvm::enumerate helper class result_pair<R> to use the 'iterator_traits<R>::reference'
    type as the result of 'value()' instead of 'ValueOfRange<R> &'. This enables support for iterators
    that return value types, i.e. non reference. This is a common pattern for some classes of
    iterators, e.g. mapped_iterator.

    Patch by: River Riddle <riverriddle@google.com>

    Differential Revision: https://reviews.llvm.org/D63632 (detail)
    by Mehdi Amini
  38. Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC (detail)
    by maskray
  39. [LICM & MSSA] Fixed test to run only with assertions enabled as it uses -debug-only (detail)
    by yrouban
  40. [GN] Fix build (detail)
    by Vitaly Buka
  41. [MIPS GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63541 (detail)
    by maskray
  42. [GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions.

    G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's
    essentially a side effect free cast instruction we can remat both instructions.
    This patch changes the localizer to enable localization of the chains by
    iterating over the entry block instructions in reverse order. That way, uses will
    be localized first, and then the defs are free to be localized as well.

    This also changes the previous SmallPtrSet of localized instructions to use a
    SetVector instead. We're dealing with pointers and need deterministic iteration
    order.

    Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean.

    Differential Revision: https://reviews.llvm.org/D63630 (detail)
    by aemerson
  43. [llvm-objcopy][MachO] Rebuild the symbol/string table in the writer

    Summary: Build the string table using StringTableBuilder, reassign symbol indices, and update symbol indices in relocations to allow adding/modifying/removing symbols from the object.

    Reviewers: alexshap, rupprecht, jhenderson

    Reviewed By: alexshap

    Subscribers: mgorny, jakehehrlich, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63309 (detail)
    by seiya
  44. [Reassociate] Remove bogus assert reported in PR42349.

    Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.

    The bogus assert was introduced in D63445. (detail)
    by mcinally
  45. [InstSimplify] simplify power-of-2 (single bit set) sequences

    As discussed in PR42314:
    https://bugs.llvm.org/show_bug.cgi?id=42314

    Improving the canonicalization for these patterns:
    rL363956
    ...means we should adjust/enhance the related simplification.

    https://rise4fun.com/Alive/w1cp

      Name: isPow2 or zero
      %x = and i32 %xx, 2048
      %a = add i32 %x, -1
      %r = and i32 %a, %x
      =>
      %r = i32 0 (detail)
    by spatel
  46. [CodeGen] Refactor check of suitability for a jump table (NFC) (detail)
    by evandro
  47. [ARM GlobalISel] Tests for s64 G_ADD and G_SUB.

    Forgot to commit these in r363989 (https://reviews.llvm.org/D63585) (detail)
    by efriedma
  48. AMDGPU: Always use s33 for global scratch wave offset

    Every called function could possibly need this to calculate the
    absolute address of stack objects, and this avoids inserting a copy
    around every call site in the kernel. It's also somewhat cleaner to
    keep this in a callee saved SGPR. (detail)
    by arsenm
  49. [ARM GlobalISel] Add support for s64 G_ADD and G_SUB.

    Teach RegisterBankInfo to use the correct register class, and tell the
    legalizer it's legal.  Everything else just works.

    The one thing that's slightly weird about this compared to SelectionDAG
    isel is that legalization can't distinguish between i64 and <1 x i64>,
    so we might end up with more NEON instructions than the user expects.

    Differential Revision: https://reviews.llvm.org/D63585 (detail)
    by efriedma
  50. [PowerPC][NFC] Fix comments for AltVSXFMARel mapping. (detail)
    by jsji
  51. [profile] Solaris ld supports __start___llvm_prof_data etc. labels

    Currently, many profiling tests on Solaris FAIL like

      Command Output (stderr):
      --
      Undefined                       first referenced
       symbol                             in file
      __llvm_profile_register_names_function /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o
      __llvm_profile_register_function    /tmp/lit_tmp_Nqu4eh/infinite_loop-9dc638.o

    Solaris 11.4 ld supports the non-standard GNU ld extension of adding
    __start_SECNAME and __stop_SECNAME labels to sections whose names are valid
    as C identifiers.  Given that we already use Solaris 11.4-only features
    like ld -z gnu-version-script-compat and fully working .preinit_array
    support in compiler-rt, we don't need to worry about older versions of
    Solaris ld.

    The patch documents that support (although the comment in
    lib/Transforms/Instrumentation/InstrProfiling.cpp
    (needsRuntimeRegistrationOfSectionRange) is quite cryptic about what it's
    actually about), and adapts the affected testcase not to expect the
    alternative __llvm_profile_register_functions and __llvm_profile_init.
    It fixes all affected tests.

    Tested on amd64-pc-solaris2.11.

    Differential Revision: https://reviews.llvm.org/D41111 (detail)
    by ro
  52. AMDGPU: Add intrinsics for DS GWS semaphore instructions (detail)
    by arsenm
  53. [LICM & MSSA] Limit unsafe sinking and hoisting.

    Summary:
    The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
    clobbered by the loop in a previous iteration, depending on how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
    If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
    Given:
    ```
    for i
      load a[i]
      store a[i]
    ```
    The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
    In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
    Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.

    Caught by PR42294.

    This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
    hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.

    Reviewers: george.burgess.iv

    Subscribers: jlebar, Prazek, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63582 (detail)
    by asbirlea
  54. [InstSimplify] add tests for known-not-a-power-of-2; NFC

    I added a canonicalization to create this general pattern in:
    rL363956

    But as noted in PR42314:
    https://bugs.llvm.org/show_bug.cgi?id=42314#c11

    ...we have a (potentially expensive) simplification for the version
    of the code that we just canonicalized away from, so we should
    add/adjust that code to match. (detail)
    by spatel
  55. AMDGPU: Insert mem_viol check loop around GWS pre-GFX9

    It is necessary to emit this loop around GWS operations in case the
    wave is preempted pre-GFX9. (detail)
    by arsenm
  56. [NFC][SLP] Pre-commit unary FNeg test to X86/propagate_ir_flags.ll (detail)
    by mcinally
  57. Update LLVM test to not check for the EliminateAvailableExternallyPass
    for lto-pre-link O2 pipeline runs. (detail)
    by leonardchan
  58. [InstCombine] fix typo in comment; NFC (detail)
    by spatel
  59. [clang][NewPM] Do not eliminate available_externally during `-O2 -flto` runs

    This fixes CodeGen/available-externally-suppress.c when the new pass manager is
    turned on by default. available_externally was not emitted during -O2 -flto
    runs when it should still be retained for link time inlining purposes. This can
    be fixed by checking that we aren't LTOPrelinking when adding the
    EliminateAvailableExternallyPass.

    Differential Revision: https://reviews.llvm.org/D63580 (detail)
    by leonardchan
  60. [NFC] Add more tests for D46262 (detail)
    by xbolva00
  61. [NFC] Updated tests for D63546 (detail)
    by xbolva00
  62. [LFTR] Fix a (latent?) bug related to nested loops

    I can't actually come up with a test case this triggers on without an out of tree change, but in theory, it's a bug in the recently added multiple exit LFTR support.  The root issue is that an exiting block common to two loops can (in theory) have computable exit counts for both loops.  Rewriting the exit of an inner loop in terms of the outer loop's IV would cause the inner loop to either a) run forever, or b) terminate on the first iteration.

    In practice, we appear to get lucky and not have the exit count computable for the outer loop, except when it's trivially zero.  Given we bail on zero exit counts, we don't appear to ever trigger this.  But I can't come up with a reason we *can't* compute an exit count for the outer loop on the common exiting block, so this may very well be triggering in some cases. (detail)
    by reames
  63. gn build: Merge r363948 (detail)
    by nico
  64. [X86] Add BLSI to isUseDefConvertible.

    Summary:
    BLSI sets the C flag if the input is not zero. So if it's followed
    by a TEST of the input where only the Z flag is consumed, we can
    replace it with the opposite check of the C flag.

    We should be able to do the same for BLSMSK and BLSR, but the
    naive test case for those is being optimized to a subo by
    CodeGenPrepare.

    Reviewers: spatel, RKSimon

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63589 (detail)
    by ctopper
  65. [InstCombine] canonicalize check for power-of-2

    The form that compares against 0 is better because:
    1. It removes a use of the input value.
    2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
    3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).

    This is a root cause for PR42314, but probably doesn't completely answer the codegen request:
    https://bugs.llvm.org/show_bug.cgi?id=42314

    Alive proof:
    https://rise4fun.com/Alive/9kG

      Name: is power-of-2
      %neg = sub i32 0, %x
      %a = and i32 %neg, %x
      %r = icmp eq i32 %a, %x
      =>
      %dec = add i32 %x, -1
      %a2 = and i32 %dec, %x
      %r = icmp eq i32 %a2, 0

      Name: is not power-of-2
      %neg = sub i32 0, %x
      %a = and i32 %neg, %x
      %r = icmp ne i32 %a, %x
      =>
      %dec = add i32 %x, -1
      %a2 = and i32 %dec, %x
      %r = icmp ne i32 %a2, 0 (detail)
    by spatel
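
The two forms of the power-of-2 check in entry 65 can be compared exhaustively on a narrow type. This is a minimal sketch, assuming an 8-bit width instead of the i32 used in the Alive proof; `old_form` and `new_form` are hypothetical names that model wrapping arithmetic with a mask.

```python
# Compare the old and the canonical power-of-2 checks on all 8-bit values.
BITS = 8
MASK = (1 << BITS) - 1


def old_form(x):
    # %neg = sub 0, %x ; %a = and %neg, %x ; %r = icmp eq %a, %x
    return ((-x & MASK) & x) == x


def new_form(x):
    # %dec = add %x, -1 ; %a2 = and %dec, %x ; %r = icmp eq %a2, 0
    return (((x - 1) & MASK) & x) == 0


agree = all(old_form(x) == new_form(x) for x in range(1 << BITS))
accepted = [x for x in range(1 << BITS) if new_form(x)]
print(agree, accepted)
```

Both forms also accept zero, so the pattern is really "power of 2 or zero"; the canonical form only changes which comparison is emitted.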
  66. [DAGCombiner] Use getAPIntValue() instead of getZExtValue() where possible.

    Better handling of out-of-i64-range values due to large integer types or from fuzz tests. (detail)
    by rksimon
  67. [DAGCombiner][NFC] Remove unused var (detail)
    by rupprecht
  68. [Tests] Add a tricky LFTR case for documentation purposes

    Thought of this case while working on something else.  We appear to get it right in all of the variations I tried, but that's by accident.  So, add a test which would catch the potential bug. (detail)
    by reames
  69. Store a pointer to the return value in a static alloca and let the debugger use that
    as the variable address for NRVO variables.

    Subscribers: hiraditya, cfe-commits, llvm-commits

    Tags: #clang, #llvm

    Differential Revision: https://reviews.llvm.org/D63361 (detail)
    by akhuang
  70. [InstCombine] cttz(-x) -> cttz(x)

    Summary: Signedness does not change number of trailing zeros.

    Reviewers: spatel, lebedev.ri, nikic

    Reviewed By: spatel

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63534 (detail)
    by xbolva00
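    The fold can be checked exhaustively at a small width. The following is a Python sketch with 8-bit two's complement negation; the `cttz` helper is hypothetical and returns the bit width for a zero input, which is an assumption of this model.

```python
BITS = 8
MASK = (1 << BITS) - 1


def cttz(x):
    """Count trailing zeros of an 8-bit value; returns BITS for zero."""
    if x == 0:
        return BITS
    return (x & -x).bit_length() - 1


# Negating in two's complement never changes the trailing zero count.
assert all(cttz(x) == cttz(-x & MASK) for x in range(1 << BITS))
```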
  71. AMDGPU: Eliminate test usage of legacy FP elim attributes (detail)
    by arsenm
  72. AMDGPU: Fix ignoring DisableFramePointerElim in leaf functions

    The attribute can specify elimination for leaf or non-leaf, so it
    should always be considered. I copied this bug from AArch64, which
    probably should also be fixed. (detail)
    by arsenm
  73. [CodeGen] Fix formatting and comments (NFC) (detail)
    by evandro
  74. [AMDGPU] gfx10 tests. NFC. (detail)
    by rampitec
  75. [InstCombine] add commuted variants for power-of-2 checks; NFC (detail)
    by spatel
  76. AMDGPU: Treat undef as an inline immediate

    This should only matter in vectors with an undef component, since a
    full undef vector would have been folded out. (detail)
    by arsenm
  77. AMDGPU: Make test functions hidden

    Reduces amount of code in the function from eliminating the GOT load. (detail)
    by arsenm
  78. [InstCombine] add tests for checking power-of-2; NFC (detail)
    by spatel
  79. [NFC][SLP] Pre-commit unary FNeg test to X86/phi3.ll (detail)
    by mcinally
  80. [ARM] Add a batch of MVE integer instructions.

    This includes integer arithmetic of various kinds (add/sub/multiply,
    saturating and not), and the immediate forms of VMOV and VMVN that
    load an immediate into all lanes of a vector.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62674 (detail)
    by statham
  81. [AMDGPU] gfx1010 core wave32 changes

    Differential Revision: https://reviews.llvm.org/D63204 (detail)
    by rampitec
  82. Virtualize TargetInstrInfo::getRegClass()

    AMDGPU target needs to override getRegClass() used during
    instruction selection. We now may have either 32 or 64 bit
    conditional registers used in the same instructions. For
    that purpose special SReg_1 register class is created which
    is dynamically resolved to either SReg_64 or SGPR_32 depending
    on the subtarget attributes.

    Differential Revision: https://reviews.llvm.org/D63205 (detail)
    by rampitec
  83. [yaml2obj] - Convert `ELFState<ELFT>::addSymbols` method to `toELFSymbols` helper. NFCI.

    ELFState<ELFT>::addSymbols method looks a bit strange.
    User code has to create the destination symbols vector outside,
    add a null symbol and then pass it to addSymbols when it seems
    the more natural logic is to isolate all work with symbols inside some
    function, build the list right there and return it.

    Differential revision: https://reviews.llvm.org/D63493 (detail)
    by grimar
  84. [DAGCombiner] Support (shl (zext (srl x, C)), C) -> (zext (shl (srl x, C), C)) non-uniform folds.

    Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. (detail)
    by rksimon
  85. [SLP][X86] Add lookahead reordering tests from D60897 (detail)
    by rksimon
  86. [DAGCombine] Add TODOs for some combines that should support non-uniform vectors

    We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc. (detail)
    by rksimon
  87. [X86] LowerAVXExtend - handle ANY_EXTEND_VECTOR_INREG lowering as well. (detail)
    by rksimon
  88. [DAGCombine] Reduce scope of ShAmtVal variable. NFCI.

    Fixes cppcheck warning.

    Use the more capable getAPIntVal() instead of getZExtValue() as well since I'm here. (detail)
    by rksimon
  89. [llvm-nm] Generalize ELF symbol types 'N' and 'n'

    Reviewed By: grimar, jhenderson

    Differential Revision: https://reviews.llvm.org/D63588 (detail)
    by maskray
  90. [NFC] Update documentation for AtomicCmpXchgInst

    Fix bz#42325 (detail)
    by serge_sans_paille
  91. TargetParserTest.ARMExtensionFeatures run out of memory on 32-bit (PR42316)

    None of these tests made much sense. Loops were iterating too much, and I
    also don't think it was actually testing anything. I think we simply want to
    check that AEK_SOME_EXT returns "+some_ext".

    I've given the AArch64 tests the same treatment as they very similarly didn't
    make any sense either.

    This fixes PR42316.

    Differential Revision: https://reviews.llvm.org/D63569 (detail)
    by sjoerdmeijer
  92. [MIPS GlobalISel] Select integer to floating point conversions

    Select G_SITOFP and G_UITOFP for MIPS32.

    Differential Revision: https://reviews.llvm.org/D63542 (detail)
    by petar.avramovic
  93. [MIPS GlobalISel] Select floating point to integer conversions

    Select G_FPTOSI and G_FPTOUI for MIPS32.

    Differential Revision: https://reviews.llvm.org/D63541 (detail)
    by petar.avramovic
  94. [X86] Add test cases showing missed opportunities to use the C flag from the BLSI instruction to avoid a TEST instruction (detail)
    by ctopper
  95. [X86] Remove memory instructions from isUseDefConvertible.

    The caller of this is looking for comparisons of the input
    to these instructions with 0. But the memory instructions'
    input is an address, not a value input in a register. (detail)
    by ctopper
  96. [X86] Add v64i8/v32i16 to several places in X86CallingConv.td where they seemed obviously missing. (detail)
    by ctopper
  97. AMDGPU: Don't clobber VCC in MUBUF addr64 emulation

    Introducing VCC defs during SIFixSGPRCopies is generally
    problematic. Avoid it by starting with the VOP3 form with the general
    condition register. This is the easiest to fix instance, but doesn't
    solve any specific problems I'm looking at. (detail)
    by arsenm
  98. [llvm-objdump] Switch between ARM/Thumb based on mapping symbols.

    The ARMDisassembler changes allow changing between ARM and Thumb mode
    based on the MCSubtargetInfo, rather than the Target, which simplifies
    the other changes a bit.

    I'm not really happy with adding more target-specific logic to
    tools/llvm-objdump/, but there isn't any easy way around it: the logic
    in question specifically applies to disassembling an object file, and
    that code simply isn't located in lib/Target, at least at the moment.

    Differential Revision: https://reviews.llvm.org/D60927 (detail)
    by efriedma
  99. AMDGPU: Consolidate some getGeneration checks

    This is incomplete, and ideally these would all be removed, but it's
    better to localize them to the subtarget first with comments about
    what they're for. (detail)
    by arsenm
  100. [FileCheck] Stop qualifying expressions as numeric

    Summary:
    Stop referring to "numeric expression", using simply the term
    "expression" instead. Likewise for numeric operation since operations
    are only used in numeric expressions.

    Reviewers: jhenderson, jdenny, probinson, arichardson

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63500 (detail)
    by thopre
  101. FileCheck: Return parse error w/ Error & Expected

    Summary:
    Make use of Error and Expected to bubble up diagnostics and force
    checking of errors in the callers.

    Reviewers: jhenderson, jdenny, probinson, arichardson

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63125 (detail)
    by thopre
  102. AMDGPU: Undo sub x, c canonicalization for v2i16

    Should avoid regression from D62341 (detail)
    by arsenm
  103. AMDGPU: Add baseline test for vector sub x, c canonicalization

    This will catch regressions from D62341, and show improvements from a
    future patch to fix them. (detail)
    by arsenm
  104. [DAGCombine] Use ConstantSDNode::getAPIntValue() instead of getZExtValue().

    Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code or someone decides to create i65536 or something..... (detail)
    by rksimon
  105. [mips] Mark the `lwupc` instruction as MIPS64 R6 only

    "The MIPS64 Instruction Set Reference Manual" [1] states that
    `lwupc` is MIPS64 Release 6 only. It should not be supported
    for 32-bit CPUs.

    [1] https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00087-2B-MIPS64BIS-AFP-6.06.pdf (detail)
    by atanasyan
  106. [mips] Add (GPR|PTR)_64 predicates to PseudoReturn64 and PseudoIndirectHazardBranch64

    This patch is one of a series of patches. The goal is to make P5600
    scheduler model complete and turn on the `CompleteModel` flag. (detail)
    by atanasyan
  107. [Util] Add a helper script for converting -print-before-all output into a file-based equivalent

    Simple little utility which takes an opt logfile generated with "opt -print-before-all -print-module-scope -o /dev/null <args> 2&>1", and splits it into a series of individual "chunk-X.ll" files. The intended purpose is to help automate one step in failure reduction.

    The imagined workflow is:

        1. New crasher bug reported against clang or other frontend
        2. Frontend run with -emit-llvm equivalent and manually confirmed that opt -O2 <emit.ll> crashes
        3. Run this splitter script
        4. Manually map pass name to invocation command (next on the to automate list)
        5. Run bugpoint on last chunk file + manual command

    I chose to dump every chunk rather than only the last since miscompile debugging frequently requires either manual step by step reduction, or cross feeding IR into different compiler versions. Not an immediate target, but there may be applications.

    Differential Revision: https://reviews.llvm.org/D63461 (detail)
    by reames
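    The splitting step described in this entry could look roughly like the sketch below. This is not the script from the commit. The banner pattern (`*** IR Dump Before ... ***`, optionally prefixed with `; `) is an assumption about what opt prints, and `split_chunks` is a hypothetical helper. Only the "chunk-X.ll" naming comes from the commit message.

```python
import re

# Assumed banner format; adjust to match the actual opt output.
BANNER = re.compile(r"^(?:; )?\*\*\* IR Dump Before .*$", re.MULTILINE)


def split_chunks(log_text):
    """Split a -print-before-all log into {"chunk-N.ll": IR text}."""
    banners = list(BANNER.finditer(log_text))
    chunks = {}
    for i, match in enumerate(banners):
        # A chunk runs from the end of its banner line to the next banner.
        if i + 1 < len(banners):
            end = banners[i + 1].start()
        else:
            end = len(log_text)
        chunks[f"chunk-{i}.ll"] = log_text[match.end():end].lstrip("\n")
    return chunks
```

    Writing each value out under its key would give one file per pass invocation, and the last chunk is then the input for the bugpoint step in the workflow above.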
  108. LFTR for multiple exit loops

    Teach IndVarSimplify's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

    This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

    I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

    I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why I didn't see it previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute an ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplification work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

    Differential Revision: https://reviews.llvm.org/D62625 (detail)
    by reames
  109. [Tests] Autogen a test so that future changes are understandable (detail)
    by reames
  110. [MemorySSA] Cleanup trivial phis.

    Summary:
    This is unfortunately needed for correctness, if we are to extend the tolerance of the update API to the way simple loop unswitch is doing cloning.

    In simple loop unswitch (as opposed to loop unswitch), not all blocks are cloned. This can create unreachable cloned blocks (no predecessor), which are later cleaned up.

    In MemorySSA, the APIs for supporting these kinds of updates (clone + update exit blocks) make certain assumptions about the integrity of the CFG. When cloning, if something was not cloned, its values in MemorySSA default to LiveOnEntry. When updating exit blocks, it is safe to assume that we can first insert phis in the blocks merging two clones, then add additional phis in the IDF of the blocks that received phis. This no longer holds true if one of the clones being merged comes from an unreachable block. We'd conservatively need to add all phis before filling in their incoming definitions. In practice this restriction can be relaxed if we clean up trivial phis after the first round of insertion.

    Reviewers: george.burgess.iv

    Subscribers: jlebar, Prazek, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63354 (detail)
    by asbirlea
  111. [MemorySSA] Use GraphDiff info when computing IDF.

    Summary:
    When computing IDF for insert updates, ensure we use the snapshot CFG offered by GraphDiff.
    Caught by D63389.

    Reviewers: kuhar, george.burgess.iv

    Subscribers: jlebar, Prazek, llvm-commits, Szelethus

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63443 (detail)
    by asbirlea
  112. [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]

    (Resubmit of r363292 which was reverted along w/an earlier patch) (detail)
    by reames
  113. AMDGPU: Fix folding immediate into readfirstlane through reg_sequence

    The def instruction for the vreg may not match, because it may be
    folding through a reg_sequence. The assert was overly conservative and
    not necessary. It's not actually important if DefMI really defined the
    register, because the fold that will be done cares about the def of
    the value that will be folded.

    For some reason copies aren't making it through the reg_sequence,
    although they should. (detail)
    by arsenm
  114. [LFTR] Rename variable to minimize confusion [NFC]

    (Recommit of r363293 which was reverted when a dependent patch was.)

    As pointed out by Nikita in D62625, BackedgeTakenCount is generally used to refer to the backedge taken count of the loop. A conditional backedge taken count - one which only applies if a particular exit is taken - is called an ExitCount in SCEV code, so be consistent here. (detail)
    by reames
  115. hwasan: Shrink outlined checks by 1 instruction.

    Turns out that we can save an instruction by folding the right shift into
    the compare.

    Differential Revision: https://reviews.llvm.org/D63568 (detail)
    by pcc
  116. Reapply "AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics"

    This reapplies r363678, using the correct chain for the CopyToReg for
    v0. glueCopyToM0 counterintuitively changes the operands of the
    original node. (detail)
    by arsenm
  117. [llvm-readobj] Match GNU output for DT_RPATH and DT_RUNPATH when dumping dynamic symbol table.

    Reviewers: jhenderson, grimar, MaskRay, rupprecht, espindola

    Subscribers: emaste, nemanjai, arichardson, kbarton, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63347 (detail)
    by yuanfang
  118. [SCEV] Revise a method description to match actual behavior [NFC]

    Reword the ScalarEvolution::getExitCount comment in the same terminology as used by getBackedgeTakenCount since they're equivalent for single exit loops.  Also, strengthen the comment to indicate exiting on the exact iteration specified is guaranteed.  Several transforms implicitly rely on this; and the actual implementation checks for it (via dominating latch checks).  So, spell out the guarantee in the comment. (detail)
    by reames
  119. gn build: Merge r363757. (detail)
    by pcc
  120. gn build: Merge r363848. (detail)
    by pcc
  121. gn build: Merge r363846. (detail)
    by pcc
  122. gn build: Merge r363794. (detail)
    by pcc
  123. gn build: Merge r363680. (detail)
    by pcc
  124. gn build: Merge r363712. (detail)
    by pcc
  125. [llvm-objdump] Remove unnecessary indentation when dumping ELF data.

    Reviewers: MaskRay, jhenderson, rupprecht

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63393 (detail)
    by yuanfang
  126. [TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support

    Move 'lowest' demanded elt -> bitcast fold out of ZERO_EXTEND_VECTOR_INREG into ANY_EXTEND_VECTOR_INREG case. (detail)
    by rksimon
  127. Fix GlobalISel MachineVerifier tests. NFC.

    These tests were failing when building llvm with
    `-DLLVM_DEFAULT_TARGET_TRIPLE=''`. Add `-march` to the
    run line to fix the issue. (detail)
    by volkan
  128. [x86] avoid vector load narrowing with extracted store uses (PR42305)

    This is an exception to the rule that we should prefer xmm ops to ymm ops.
    As shown in PR42305:
    https://bugs.llvm.org/show_bug.cgi?id=42305
    ...the store folding opportunity with vextractf128 may result in better
    perf by reducing the instruction count.

    Differential Revision: https://reviews.llvm.org/D63517 (detail)
    by spatel
  129. [x86] add test for unaligned 32-byte load/store splitting; NFC (detail)
    by spatel
  130. [test] Fix TargetParserTest runtime.

    r363780 fixes extreme memory growth by using a new std::vector every loop iteration, but causes runtime to go up (and occasionally timeout in certain situations) because of constructor cost every loop iteration. Fix this by moving the constructor back out, but clearing contents in the loop.

    Also apply this to the AArch64 features test case, which seems to use the same pattern. (detail)
    by rupprecht
  131. [TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG

    Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required.

    Matches what we already do for ZERO_EXTEND. (detail)
    by rksimon
  132. [X86][SSE] combineToExtendVectorInReg - add ANY_EXTEND support TODO. NFCI.

    So I don't forget - there's a load of yak shaving to do first. (detail)
    by rksimon
  133. [InstCombine] Fold  icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0  earlier

    Summary:
    To generate simplified IR, make sure fold
    ```
      ((X & signbit) ==/!= 0) -> X s>=/s< 0;
    ```
    is scheduled before fold
    ```
      ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
    ```

    https://rise4fun.com/Alive/fbdh

    Reviewers: lebedev.ri, efriedma, spatel, craig.topper

    Reviewed By: lebedev.ri

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63026 (detail)
    by huihuiz
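    The sign-bit fold in this entry can be verified exhaustively at a small width. The following is a Python sketch for 8-bit values; `as_signed` is a hypothetical helper that reinterprets an unsigned bit pattern as a signed integer.

```python
BITS = 8
SIGNBIT = 1 << (BITS - 1)


def as_signed(x):
    """Reinterpret an 8-bit pattern as a two's complement signed value."""
    return x - (1 << BITS) if x & SIGNBIT else x


for x in range(1 << BITS):
    # (X & signbit) == 0  <=>  X s>= 0
    assert ((x & SIGNBIT) == 0) == (as_signed(x) >= 0)
    # (X & signbit) != 0  <=>  X s< 0
    assert ((x & SIGNBIT) != 0) == (as_signed(x) < 0)
```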
  134. [InstSimplify] add a phi test with 1 incoming value; NFC

    D63489 proposes to change this behavior, but there's no
    direct -instsimplify test to verify that the transform exists. (detail)
    by spatel
  135. [X86][SSE] Combine shuffles to ANY_EXTEND/ANY_EXTEND_VECTOR_INREG.

    We already do this for ZERO_EXTEND/ZERO_EXTEND_VECTOR_INREG - this just extends the pattern matcher to recognize cases where we don't need the zeros in the extension. (detail)
    by rksimon
  136. [AArch64] Improve jump tables testing (NFC)

    Improve testing of the minimum and maximum sizes of jump tables. (detail)
    by evandro
  137. [ARM] Add MVE vector bit-operations (register inputs).

    This includes all the obvious bitwise operations (AND, OR, BIC, ORN,
    MVN) in register-to-register forms, and the immediate forms of
    AND/OR/BIC/ORN; byte-order reverse instructions; and the VMOVs that
    access a single lane of a vector.

    Some of those VMOVs (specifically, the ones that access a 32-bit lane)
    share an encoding with existing instructions that were disassembled as
    accessing half of a d-register (e.g. `vmov.32 r0, d1[0]`), but in
    8.1-M they're now written as accessing a quarter of a q-register (e.g.
    `vmov.32 r0, q0[2]`). The older syntax is still accepted by the
    assembler.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62673 (detail)
    by statham
  138. [AArch64] Improve jump tables testing (NFC)

    Improve testing of the minimum and maximum sizes of jump tables. (detail)
    by evandro
  139. [NFC][IR] Move CreateFNegFMF(...) next to CreateFNeg(...).

    This is now in line with the other Create*FMF(...) functions. (detail)
    by mcinally
  140. [test][llvm-dwarfdump] Remove pointless CHECK-NOT lines

    The original line was there from when this test was added, but it is
    checking for a switch that doesn't exist, so really has no purpose, at
    least any more. (detail)
    by jhenderson
  141. [AVR] Change limit type to match the argument type (NFC) (detail)
    by evandro
  142. [Hexagon] Change limit type to match the argument type (NFC) (detail)
    by evandro
  143. [llvm-mca][docs] clarify how the quality of the perf report is affected by the quality of the scheduling models.

    Differential Revision: https://reviews.llvm.org/D63556 (detail)
    by adibiagio
  144. [NFC][llvm-objcopy] Fix overly restrictive od output check

    The checks against the output of `od` in the affected tests expect a
    specific input offset format. They also expect a specific offset value,
    not consistent with the EXAMPLE section for `od` in POSIX.1-2017
    Chapter 4, while using the `-j` option. In particular, the example shows
    that the input offset begins at 0 following the bytes skipped.

    This patch adjusts the matching of the input offset to be more generic.
    In order to avoid false matches, it restricts the number of bytes to be
    formatted. (detail)
    by hubert.reinterpretcast
  145. [NFC][LSR] Avoid undefined grep in pr2570.ll

    greater-than-sign is not a BRE special character.

    POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\>`
    is undefined. This patch replaces the pattern. (detail)
    by hubert.reinterpretcast
  146. Specify log level for CMake messages (less stderr)

    Summary:
    Specify message levels in CMake. Prefer STATUS (stdout).

    As the default message mode (i.e. level) is NOTICE in CMake, more messages than necessary get printed to stderr. Some tools, notably ccmake, treat this as an error and require additional confirmation and re-running CMake's configuration step.

    This commit specifies a mode (either STATUS or WARNING or FATAL_ERROR)  instead of the default.

    * I used `csearch -f 'llvm-project/.+(CMakeLists\.txt|cmake)' -l 'message\("'` to find all locations.
    * Reviewers were chosen by the most common authors of specific files. If there are more suitable reviewers for these CMake changes, please let me know.

    Patch by: Christoph Siedentop

    Reviewers: zturner, beanz, xiaobai, kbobyrev, lebedev.ri, sgraenitz

    Reviewed By: sgraenitz

    Subscribers: mgorny, lebedev.ri, #sanitizers, lldb-commits, llvm-commits

    Tags: #sanitizers, #lldb, #llvm

    Differential Revision: https://reviews.llvm.org/D63370 (detail)
    by stefan.graenitz
  147. [X86] getExtendInVec - take a ISD::*_EXTEND opcode instead of a IsSigned bool flag. NFCI.

    Prep work to support ANY_EXTEND/ANY_EXTEND_VECTOR_INREG without needing another flag. (detail)
    by rksimon
  148. [DFSan] Add UnaryOperator visitor to DataFlowSanitizer

    Differential Revision: https://reviews.llvm.org/D62815 (detail)
    by mcinally
  149. [Reassociate] Handle unary FNeg in the Reassociate pass

    Differential Revision: https://reviews.llvm.org/D63445 (detail)
    by mcinally
  150. [X86] Add *_EXTEND -> *_EXTEND_VECTOR_INREG opcode conversion helper. NFCI.

    Given a *_EXTEND or *_EXTEND_VECTOR_INREG opcode, convert it to *_EXTEND_VECTOR_INREG. (detail)
    by rksimon
  151. [ConstantFolding] Add constant folding for smul.fix and smul.fix.sat

    Summary:
    This patch teaches ConstantFolding to constant fold
    both scalar and vector variants of llvm.smul.fix and
    llvm.smul.fix.sat.

    As described in the LangRef rounding is unspecified for
    these intrinsics. If the result cannot be represented
    exactly the default behavior in ConstantFolding is to
    round down towards negative infinity. If a target has a
    preferred rounding that is different some kind of target
    hook would be needed (same strategy as used by the
    SelectionDAG legalizer).

    Reviewers: nikic, leonardchan, RKSimon

    Reviewed By: leonardchan

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63385 (detail)
    by bjope
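    The rounding behavior described in this entry can be illustrated with a small model. This is a Python sketch, not the ConstantFolding implementation. `smul_fix` is a hypothetical helper, and raising an error on overflow in the non-saturating case is a modeling choice made here.

```python
def smul_fix(a, b, scale, bits=32, saturate=False):
    """Model a signed fixed-point multiply with `scale` fractional bits."""
    # Python's >> on integers floors, i.e. rounds toward negative infinity.
    product = (a * b) >> scale
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if saturate:
        return max(lo, min(hi, product))
    if not lo <= product <= hi:
        raise OverflowError("result does not fit in the fixed-point type")
    return product
```

    With scale 1, `smul_fix(-3, 1, 1)` multiplies -1.5 by 0.5. The exact result -0.75 is not representable, so the model rounds down to -2, which encodes -1.0.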
  152. [ConstantFolding] Refactor ConstantFoldScalarCall. NFC

    This patch splits ConstantFoldScalarCall into several
    functions.

    Benefits:
    - Reduces indentation levels and avoids long if-statements.
    - Makes it easier to add support for > 3 operands. (detail)
    by bjope
  153. [X86] Merge extract_subvector(*_EXTEND) and extract_subvector(*_EXTEND_VECTOR_INREG) handling. NFCI. (detail)
    by rksimon
  154. [SystemZ] Support vector load/store alignment hints

    Vector load/store instructions support an optional alignment field
    that the compiler can use to provide known alignment info to the
    hardware.  If the field is used (and the information is correct),
    the hardware may be able (on some models) to perform faster memory
    accesses than otherwise.

    This patch adds support for alignment hints in the assembler and
    disassembler, and fills in known alignment during codegen. (detail)
    by uweigand
  155. [TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG

    Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero.

    Matches what we already do for SIGN_EXTEND. (detail)
    by rksimon
  156. [llvm-dwarfdump] --gdb-index: fix uninitialized TuListOffset

    The test only checks the existence of the `Types CU list` line.
    Unfortunately I can't make a better test because
    {gcc,clang} -fuse-ld={lld,gold} --gdb-index do not give me a non-empty types CU list.

    Reviewed By: ikudrin

    Differential Revision: https://reviews.llvm.org/D63537 (detail)
    by maskray
  157. Revert rL363678 : AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics

    There may or may not be additional work to handle this correctly on
    SI/CI.
    ........
    Breaks EXPENSIVE_CHECKS buildbots - http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/78/ (detail)
    by rksimon
  158. [NFC] Added tests for D63534 (detail)
    by xbolva00
  159. [NFC] Added tests for cttz(abs(x)) -> cttz(x) fold (detail)
    by xbolva00
  160. [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) non-uniform folds.

    Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases. (detail)
    by rksimon
  161. [DAGCombiner] Support (shl (ext (shl x, c1)), c2) -> 0 non-uniform folds.

    Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases.

    This requires us to tweak matchBinaryPredicate to allow it to (optionally) handle constants with different type widths. (detail)
    by rksimon
  162. [X86] Add non-uniform (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) test (detail)
    by rksimon
  163. [DAGCombiner] visitSHL - pull out repeated shift amount VT. NFCI. (detail)
    by rksimon
  164. [DAGCombine] Fix (shl (ext (shl x, c1)), c2) -> (shl (ext x), (add c1, c2)) comment. NFCI.

    We pre-extend, not post. (detail)
    by rksimon
  165. [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion

    Summary:
    Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

    The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

    A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
    B) Instructions in the middle block have different line numbers which give the impression of another iteration.

    In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

    I have set up a separate review D61933 for a fix which is required for this patch.

    Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

    Reviewed By: hfinkel, jmorse

    Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

    Tags: #llvm, #debug-info

    Differential Revision: https://reviews.llvm.org/D60831

    llvm-svn: 363046 (detail)
    by orlandoch
  166. [ConstantFolding] Fix assertion failure on non-power-of-two vector load.

    Summary:
    The test case does an (out of bounds) load from a global constant with
    type <3 x float>. InstSimplify tried to turn this into an integer load
    of the whole alloc size of the vector, which is 128 bits due to
    alignment padding, and then bitcast this to <3 x float> which failed
    an assertion due to the type size mismatch.

    The fix is to do an integer load of the normal size of the vector, with
    no alignment padding.

    Reviewers: tpr, arsenm, majnemer, dstuttard

    Reviewed By: arsenm

    Subscribers: hfinkel, wdng, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63375 (detail)
    by foad
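    The size mismatch behind this assertion comes down to simple arithmetic. The Python sketch below assumes a 128-bit alignment for the vector, consistent with the 128-bit alloc size quoted in this entry; `vector_sizes_bits` is a hypothetical helper.

```python
def vector_sizes_bits(num_elts, elt_bits, align_bits):
    """Return (unpadded size, alloc size) of a vector type in bits."""
    unpadded = num_elts * elt_bits
    # Round up to the next multiple of the alignment.
    alloc = -(-unpadded // align_bits) * align_bits
    return unpadded, alloc


# <3 x float>: 96 bits of data, padded to 128 bits when allocated.
assert vector_sizes_bits(3, 32, 128) == (96, 128)
```

    An integer load of the 128-bit alloc size cannot be bitcast to a 96-bit vector, while a load of the unpadded 96 bits can.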
  167. [RISCV] Allow parsing immediates that use tilde & exclaim

    This patch allows immediates (and CSR alias immediates) which start with
    a tilde token or an exclaim (!) token to be parsed as intended.

    Differential Revision: https://reviews.llvm.org/D57320 (detail)
    by lewis-revill
  168. [RISCV] Fix failure to parse parenthesized immediates

    Since the parser attempts to parse an operand as a register with
    parentheses before parsing it as an immediate, immediates in
    parentheses should not be parsed by parseRegister. However in the case
    where the immediate does not start with an identifier, the LParen is not
    unlexed and so the RParen causes an unexpected token error.

    This patch adds the missing UnLex, and modifies the existing UnLex to
    not use a buffered token, as it should always be unlexing an LParen.

    Differential Revision: https://reviews.llvm.org/D57319 (detail)
    by lewis-revill
  169. Fix r363773: Update Barcelona MCA tests. (detail)
    by courbet
  170. Make TargetParserTest.ARMExtensionFeatures not run out of memory on 32-bit (PR42316)

    The test still probably shouldn't run this loop 17 million times, but at
    least now it won't run out of memory. (detail)
    by hans
  171. [yaml2obj/obj2yaml] - Make RawContentSection::Info Optional<>

    This allows this field to be customized properly for "implicit" sections.

    Differential revision: https://reviews.llvm.org/D63487 (detail)
    by grimar
  172. [NFC][X86][MCA] Barcelona: add load/store/load-store-throughput tests (detail)
    by lebedevri
  173. [NFC][X86][MCA] BdVer2: add load-store-throughput test (detail)
    by lebedevri
  174. [X86] Add missing properties on llvm.x86.sse.{st,ld}mxcsr

    Summary:
    llvm.x86.sse.stmxcsr only writes to memory.
    llvm.x86.sse.ldmxcsr only reads from memory, and might generate an FPE.

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62896 (detail)
    by courbet
  175. [RISCV] Add lowering of global TLS addresses

    This patch adds lowering for global TLS addresses for the TLS models of
    InitialExec, GlobalDynamic, LocalExec and LocalDynamic.

    LocalExec support required using a 4-operand add instruction, which uses
    the fourth operand to express a relocation on the symbol. The necessary
    fixup is emitted when the instruction is emitted.

    Differential Revision: https://reviews.llvm.org/D55305 (detail)
    by lewis-revill
  176. vs integration: bump version nbr (detail)
    by hans
  177. Revert r359557 "vs integration: vs2019 support"

    Turns out this worked on my machine because I still had VS2017 installed, but
    it didn't actually work in general.

    Since the extension is unmaintained and MS is doing their own LLVM toolset
    integration for VS2019, let's just revert. (detail)
    by hans
  178. Test commit access (detail)
    by yuanfang
  179. [RISCV] Fix test after r363757

    r363757 renamed ExpandISelPseudo to FinalizeISel, so the RUN line in
    select-optimize-multiple.mir needed updating to refer to finalize-isel. (detail)
    by asb
  180. [NFC] Move some hardware loop checking code to a common place for reuse.
    Differential Revision: https://reviews.llvm.org/D63478 (detail)
    by shchenz
  181. Rename ExpandISelPseudo->FinalizeISel, delay register reservation

    This allows targets to make more decisions about reserved registers
    after isel. For example, it should now be certain whether there are
    calls or stack objects in the frame, which could have been introduced
    by legalization.

    Patch by Matthias Braun (detail)
    by arsenm
  182. [WebAssembly] Optimize ISel for SIMD Boolean reductions

    Summary:
    Converting the result *.{all,any}_true to a bool at the source level
    generates LLVM IR that compares the result to 0. This check is
    redundant since these instructions already return either 0 or 1 and
    therefore conform to the BooleanContents setting for WebAssembly. This
    CL adds patterns to detect and remove such redundant operations on the
    result of Boolean reductions.

    Reviewers: dschuff, aheejin

    Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63529 (detail)
    by tlively
  183. Re-commit r363744: [tblgen][disasm] Allow multiple encodings to disassemble to the same instruction

    It seems macOS lets you have ArrayRef<const X> even though this is apparently
    forbidden by the language standard (Thanks MSVC++ for the clear error message).
    Removed the problematic const's to fix this.

    (It also seems I'm not receiving buildbot emails anymore and I'm trying to find
    out why. In the meantime I'll be polling lab.llvm.org to hopefully see if/when
    failures occur) (detail)
    by dsanders
  184. [demangle] Special case clang's creative mangling of __uuidof expressions. (detail)
    by epilk
  185. [test] Change comment wording (NFC) (detail)
    by evandro
  186. Revert [tblgen][disasm] Allow multiple encodings to disassemble to the same instruction

    This reverts r363744 (git commit 9b2252123d1e79d2b3594097a9d9cc60072b83d9)

    This breaks many buildbots, e.g. http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/203/steps/build%20stage%201/logs/stdio (detail)
    by rupprecht
  187. Print dylib load kind (weak, reexport, etc) in llvm-objdump -m -dylibs-used

    Summary:
    Historically llvm-objdump prints the path to a dylib as well as the
    dylib's compatibility version and current version number. This change
    extends this information by adding the kind of dylib load: weak,
    reexport, etc.

    rdar://51383512

    Reviewers: pete, lhames

    Reviewed By: pete

    Subscribers: rupprecht, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62866 (detail)
    by mtrent
  188. [GlobalISel][Localizer] Remove redundant set lookup.

    After changing the algorithm to only process the entry block we never revisit
    a processed instruction. (detail)
    by aemerson
  189. [tblgen][disasm] Allow multiple encodings to disassemble to the same instruction

    Summary:
    Add an AdditionalEncoding class which can be used to define additional encodings
    for a given instruction. This causes the disassembler to add an additional
    encoding to its matching tables that map to the specified instruction.

    Usage:
      def ADD1 : Instruction {
        bits<8> Reg;
        bits<32> Inst;

        let Size = 4;
        let Inst{0-7} = Reg;
        let Inst{8-14} = 0;
        let Inst{15} = 1; // Continuation bit
        let Inst{16-31} = 0;
        ...
      }
      def : AdditionalEncoding<ADD1> {
        bits<8> Reg;
        bits<16> Inst; // You can also have bits<32> and it will still be a 16-bit encoding
        let Size = 2;
        let Inst{0-3} = 0;
        let Inst{4-7} = Reg;
        let Inst{8-15} = 0;
        ...
      }
    with those definitions, llvm-mc will successfully disassemble both of these:
      0x01 0x00
      0x10 0x80 0x00 0x00
    to:
      ADD1 r1

    Depends on D52366

    Reviewers: bogner, charukcs

    Reviewed By: bogner

    Subscribers: nlguillemot, nhaehnle, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D52369 (detail)
    by dsanders
  190. Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas

    [SROA] Enhance SROA to handle `addrspacecast`ed allocas

    - Fix typo in original change
    - Add additional handling to ensure all return pointers are properly
      casted.

    Summary:
    - After `addrspacecast` is allowed to be eliminated in SROA, the
      adjusting of storage pointer (from `alloca`) needs to handle the
      potentially different address spaces between the storage pointer
      (from alloca) and the pointer being used.

    Reviewers: arsenm

    Subscribers: wdng, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63501 (detail)
    by hliao
  191. InstCombine: Pre-commit test for reassociating nuw

    D39417 (detail)
    by arsenm
  192. [ARM] Comply with rules on ARMv8-A thumb mode partial deprecation of IT.

    Summary:
    When identifying instructions that can be folded into a MOVCC instruction,
    checking for a predicate operand is not enough. For a thumb2 function
    with restrict-IT, we also need to check whether the machine instruction
    is eligible for ARMv8 IT.

    Notes in ARMv8-A Architecture Reference Manual, section "Partial deprecation of IT"
      https://usermanual.wiki/Pdf/ARM20Architecture20Reference20ManualARMv8.1667877052.pdf

    "ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to
    instructions other than a single subsequent 16-bit instruction from a restricted set
    are deprecated, as are explicit references to the PC within that single 16-bit
    instruction. This permits the non-deprecated forms of IT and subsequent instructions
    to be treated as a single 32-bit conditional instruction."

    Reviewers: efriedma, lebedev.ri, t.p.northover, jmolloy, aemerson, compnerd, stoklund, ostannard

    Reviewed By: ostannard

    Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63474 (detail)
    by huihuiz
  193. [RISCV] Prevent re-ordering some adds after shifts

    Summary:
    DAGCombine will normally turn a `(shl (add x, c1), c2)` into `(add (shl x, c2), c1 << c2)`, where `c1` and `c2` are constants. This can be prevented by a callback in TargetLowering.

    On RISC-V, materialising the constant `c1 << c2` can be more expensive than materialising `c1`, because materialising the former may take more instructions, and may use a register, where materialising the latter would not.

    This patch implements the hook in RISCVTargetLowering to prevent this transform, in the cases where:
    - `c1` fits into the immediate field in an `addi` instruction.
    - `c1` takes fewer instructions to materialise than `c1 << c2`.

    In future, DAGCombine could do the check to see whether `c1` fits into an add immediate, which might simplify more targets hooks than just RISC-V.
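
    The transform and the first condition can be sketched in Python. This is
    only an illustration: the function names are hypothetical, the 32-bit
    width is an assumption, and the instruction-count comparison from the
    second condition is not modelled. The 12-bit signed range is the `addi`
    immediate range on RISC-V.

```python
def fits_addi_imm(value):
    """True if value fits a 12-bit signed immediate (the addi range)."""
    return -2048 <= value <= 2047


def shl_of_add(x, c1, c2, bits=32):
    """(shl (add x, c1), c2), wrapped to the given bit width."""
    mask = (1 << bits) - 1
    return ((x + c1) << c2) & mask


def add_of_shl(x, c1, c2, bits=32):
    """(add (shl x, c2), c1 << c2), wrapped to the given bit width."""
    mask = (1 << bits) - 1
    return ((x << c2) + (c1 << c2)) & mask


# Both forms compute the same value, so the choice is purely about cost.
c1, c2 = 2047, 4
print(shl_of_add(5, c1, c2) == add_of_shl(5, c1, c2))
# c1 fits an addi immediate, but the shifted constant no longer does.
print(fits_addi_imm(c1), fits_addi_imm(c1 << c2))
```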

    Reviewers: asb, luismarques, efriedma

    Reviewed By: asb

    Subscribers: xbolva00, lebedev.ri, craig.topper, lewis-revill, Jim, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62857 (detail)
    by lenary
  194. [MachinePipeliner][NFC] Do resource tracking log only when requested.

    In most cases we don't need to do resource tracking debug,
    so leave them off by default. (detail)
    by jsji
  195. [x86] add another test for load splitting with extracted stores (PR42305); NFC (detail)
    by spatel
  196. Add debug location verification for !llvm.loop attachments.

    This patch teaches the Verifier how to detect broken !llvm.loop
    attachments as discussed in https://reviews.llvm.org/D60831. This
    allows LLVM to warn and strip out the broken debug info before
    attempting an LTO compilation with input generated by LLVM predating
    https://reviews.llvm.org/rL361149.

    rdar://problem/51631158

    Differential Revision: https://reviews.llvm.org/D63499

    [Re-applies r363725 without changes after fixing a broken testcase.] (detail)
    by Adrian Prantl
  197. Fix broken debug info in an !llvm.loop attachment in this testcase. (detail)
    by Adrian Prantl
  198. [AMDGPU] gfx10 wave32 patterns

    Differential Revision: https://reviews.llvm.org/D63511 (detail)
    by rampitec
  199. Revert Add debug location verification for !llvm.loop attachments.

    This reverts r363725 (git commit 8ff822d61dacf5a9466755eedafd3eeb54abc00d) (detail)
    by Adrian Prantl
  200. [coroutines] Add missing pass dependency.

    Summary:
    CoroSplit depends on CallGraphWrapperPass, but it was not explicitly adding it as a pass dependency.

    This missing dependency can trigger errors / assertions / crashes in PMTopLevelManager::schedulePass() under certain configurations.

    Author: ben-clayton

    Reviewers: GorNishanov

    Reviewed By: GorNishanov

    Subscribers: capn, EricWF, modocache, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63144 (detail)
    by gornishanov
  201. Add debug location verification for !llvm.loop attachments.

    This patch teaches the Verifier how to detect broken !llvm.loop
    attachments as discussed in https://reviews.llvm.org/D60831. This
    allows LLVM to warn and strip out the broken debug info before
    attempting an LTO compilation with input generated by LLVM predating
    https://reviews.llvm.org/rL361149.

    rdar://problem/51631158

    Differential Revision: https://reviews.llvm.org/D63499 (detail)
    by Adrian Prantl
  202. [PDB] Ignore .debug$S subsections with high bit set

    Some versions of the Visual C++ 2015 runtime have line tables with the
    subsection kind of 0x800000F2. In cvinfo.h, 0x80000000 is documented to
    be DEBUG_S_IGNORE. This appears to implement the intended behavior. (detail)
    by rnk
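
    For the .debug$S change above, the high-bit check can be sketched in
    Python. The function name is hypothetical; the constant and the example
    subsection kind are the values quoted in the commit message.

```python
DEBUG_S_IGNORE = 0x80000000  # high bit, as documented in cvinfo.h per the message


def should_ignore_subsection(kind):
    """True if the subsection kind has the ignore bit set."""
    return (kind & DEBUG_S_IGNORE) != 0


print(should_ignore_subsection(0x800000F2))
print(should_ignore_subsection(0x000000F2))
```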
  203. [AMDGPU] gfx1010 disassembler changes for wave32

    Differential Revision: https://reviews.llvm.org/D63506 (detail)
    by rampitec
  204. [X86] Remove unnecessary line that makes v4f32 FP_ROUND Legal. NFC

    FP_ROUND defaults to Legal for all MVT types and nothing changes
    the v4f32 entry away from this default. If we needed this line
    we'd also need one for v8f32 with AVX512 which we don't have. (detail)
    by ctopper
  205. Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas

    This reverts r363711 (git commit 76a149ef8187310a60fd20481fdb2a10c8ba968e)

    This causes stage2 build failures, e.g.:
    http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio
    http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio (detail)
    by rupprecht
  206. [TargetLowering] SimplifyDemandedBits - Cleanup ANY_EXTEND handling

    Match SIGN_EXTEND + ZERO_EXTEND handling - will be adding ANY_EXTEND_VECTOR_INREG support in a future patch. (detail)
    by rksimon
  207. [TargetLowering] SimplifyDemandedBits - Merge ZERO_EXTEND+ZERO_EXTEND_VECTOR_INREG handling

    Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. (detail)
    by rksimon
  208. [SROA] Enhance SROA to handle `addrspacecast`ed allocas

    Summary:
    - After `addrspacecast` is allowed to be eliminated in SROA, the
      adjusting of storage pointer (from `alloca`) needs to handle the
      potentially different address spaces between the storage pointer
      (from alloca) and the pointer being used.

    Reviewers: arsenm

    Subscribers: wdng, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63501 (detail)
    by hliao
  209. [TargetLowering] SimplifyDemandedBits - Merge SIGN_EXTEND+SIGN_EXTEND_VECTOR_INREG handling

    Other than adding consistent demanded elts handling which was a trivial addition, the other differences in functionality will be added in later patches. (detail)
    by rksimon
  210. [x86] add test for load splitting with extracted store (PR42305); NFC (detail)
    by spatel
  211. [mips] Add more strict predicates to the RSQRT_S_MM and TAILCALL_MM

    This patch is one of a series of patches. The goal is to make P5600
    scheduler model complete and turn on the `CompleteModel` flag. (detail)
    by atanasyan
  212. [mips] Add PTR_64 and GPR_64 predicates to some MIPS 64-bit instructions

    Add `IsGP64bit` and `IsPTR64bit` to the list of `UnsupportedFeatures`
    of the P5600 scheduling definitions. Also mark some MIPS 64-bit
    instructions with PTR_64 and GPR_64 predicates. This reduces the number
    of "No schedule information for" and "lacks information for" errors
    when this scheduler model is marked as complete.

    This patch is one of a series of patches. The goal is to make P5600
    scheduler model complete and turn on the `CompleteModel` flag.

    Differential Revision: https://reviews.llvm.org/D63237 (detail)
    by atanasyan
  213. [mips] Set the hasNoSchedulingInfo flag for the `MipsAsmPseudoInst`

    Set the hasNoSchedulingInfo flag for the `MipsAsmPseudoInst`. These
    pseudo-instructions are never used by codegen. This flag reduces the
    number of "No schedule information for" and "lacks information for"
    errors when a scheduler model is marked as complete.

    This patch is one of a series of patches. The goal is to make P5600
    scheduler model complete and turn on the `CompleteModel` flag.

    Differential Revision: https://reviews.llvm.org/D63236 (detail)
    by atanasyan
  214. Fix some lit test ResourceWarnings on Windows

    When running LLDB lit tests on Windows, the system selects a debug version
    of Python, which was issuing lots of ResourceWarnings about files that
    weren't closed.  There are two kinds of them, and each test triggered one
    of each.

    This patch fixes one kind by ensuring TestRunner explicitly closes the
    temporary files created for routing stderr.  This is important on Windows
    but has no net effect on Posix systems.

    The remaining ResourceWarnings are more elusive; the bug may lie in
    the Python library subprocess.py, and it may be Windows-specific.

    Differential Revision: https://reviews.llvm.org/D63102 (detail)
    by amccarth
  215. [ARM] Add MVE vector shift instructions.

    This includes saturating and non-saturating shifts, both with
    immediate shift count and with the shift counts given by another
    vector register; VSHLC (in which the bits shifted out of each active
    vector lane are shifted in to the next active lane); and also VMOVL,
    which is enough like an immediate shift that it didn't fit too badly
    in this category.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62672 (detail)
    by statham
  216. [ARM] Add MVE integer vector min/max instructions.

    Summary:
    These form a small family of their own, to go with the floating-point
    VMINNM/VMAXNM instructions added in a previous commit.

    They introduce the first of many special cases in the mnemonic
    recognition code, because VMIN with the E suffix used by the VPT
    predication system needs to avoid being interpreted as the nonexistent
    instruction 'VMI' with an ordinary 'NE' condition suffix.

    Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62671 (detail)
    by statham
  217. [TargetLowering] SimplifyDemandedVectorElts - support MUL and ANY_EXTEND_VECTOR_INREG

    Also fold ANY_EXTEND_VECTOR_INREG -> BITCAST if we only need the bottom element.

    Fixes temporary regression introduced in rL363693. (detail)
    by rksimon
  218. [X86][AVX] extract_subvector(any_extend(x)) -> any_extend_vector_inreg(x)

    Part of fixing the X86 regression noted in D63281 - I've split this into X86 and generic parts - the generic commit will be coming shortly and will fix the vector-reduce-mul-widen.ll regression introduced here. (detail)
    by rksimon
  219. [ARM] Rename MVE instructions in Tablegen for consistency.

    Summary:
    Their names began with a mishmash of `MVE_`, `t2` and no prefix at
    all. Now they all start with `MVE_`, which seems like a reasonable
    choice on the grounds that (a) NEON is the thing they're most at risk
    of being confused with, and (b) MVE implies Thumb-2, so a prefix
    indicating MVE is strictly more specific than one indicating Thumb-2.

    Reviewers: ostannard, SjoerdMeijer, dmgreen

    Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63492 (detail)
    by statham
  220. [RISCV] Lower calls through PLT

    This patch adds support for generating calls through the procedure
    linkage table where required for a given ExternalSymbol or GlobalAddress
    callee.

    Differential Revision: https://reviews.llvm.org/D55304 (detail)
    by lewis-revill
  221. Fix -Wunused-but-set-variable warning. NFCI. (detail)
    by rksimon
  222. [llvm-readobj] Allow --hex-dump/--string-dump to dump multiple sections

    1) `-x foo` currently dumps one `foo`. This change makes it dump all `foo`.
    2) `-x foo -x foo` currently dumps `foo` twice. This change makes it dump `foo` once.
       In addition, if foo has section index 9, `-x foo -x 9` dumps `foo` once.
    3) Give a warning instead of an error if `foo` does not exist.

    The new behaviors match GNU readelf.

    Also, print a new line as a separator between two section dumps.
    GNU readelf uses two lines, but one seems good enough.
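
    The three behaviors listed above can be sketched in Python. This is a
    simplified model, not the llvm-readobj implementation: the function
    name, the (index, name) section representation, and the warning text
    are all hypothetical, and a request made only of digits is assumed to
    be a section index.

```python
def select_sections(sections, requests):
    """Resolve -x style requests to a de-duplicated list of section indices.

    sections: list of (index, name) pairs.
    requests: list of strings, each a section name or a decimal index.
    Returns (sorted indices to dump, list of warning strings).
    """
    selected = set()
    warnings = []
    for req in requests:
        if req.isdigit():
            matches = [i for i, _ in sections if i == int(req)]
        else:
            # Every section with this name is selected, not only the first.
            matches = [i for i, name in sections if name == req]
        if not matches:
            # A missing section is a warning, not a fatal error.
            warnings.append(f"could not find section '{req}'")
        selected.update(matches)
    return sorted(selected), warnings


sections = [(8, "bar"), (9, "foo"), (10, "foo")]
print(select_sections(sections, ["foo", "foo", "9", "missing"]))
```

    Collecting indices in a set is what makes `-x foo -x foo` and
    `-x foo -x 9` dump each section only once.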

    Reviewed By: grimar, jhenderson

    Differential Revision: https://reviews.llvm.org/D63475 (detail)
    by maskray
  223. AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics

    There may or may not be additional work to handle this correctly on
    SI/CI. (detail)
    by arsenm
  224. [MCA] Slightly refactor the bottleneck analysis view. NFCI

    This patch slightly refactors data structures internally used by the bottleneck
    analysis to track data and resource dependencies.
    This patch also updates methods used to print out information about dependency
    edges when in debug mode.
    This is the last of a sequence of commits done in preparation for an upcoming
    patch that fixes PR37494. No functional change intended. (detail)
    by adibiagio
  225. AMDGPU: Change API for checking for exec modification

    Invert the name and return value to better reflect the imprecise
    nature.

    Force passing in the DefMI, since it's known in the 2 users and could
    possibly fail for an arbitrary vreg.

    Allow specifying a specific user instruction. Scan through use
    instructions, instead of use operands. Add scan thresholds instead of
    searching infinitely.

    Stop using a set to track seen uses. I didn't understand this usage,
    or why it would not check the last use. I don't think the use list has
    any particular order. (detail)
    by arsenm
  226. MCContext: Delete unused functions (detail)
    by maskray
  227. gn build: Merge r363658 (detail)
    by nico
  228. gn build: Merge r363649

    This reverts commit "gn build: Merge r363626" because r363626
    was reverted in r363649. (detail)
    by nico
  229. [SelectionDAG] Legalize vaargs that require vector splitting

    This adds vector splitting for vaarg instructions during type legalization

    Committed on behalf of @luke (Luke Lau)

    Differential Revision: https://reviews.llvm.org/D60762 (detail)
    by rksimon
  230. AMDGPU: Fold readlane from copy of SGPR or imm

    These may be inserted to assert uniformity somewhere. (detail)
    by arsenm
  231. AMDGPU: Remove unnecessary check for virtual register

    The copy was found by searching the uses of a virtual register, so
    it's already known to be virtual. (detail)
    by arsenm
  232. AMDGPU: Fix iterator crash in AMDGPUPromoteAlloca

    The lifetime intrinsic was erased, which was the next iterator. (detail)
    by arsenm
  233. AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.scale (detail)
    by arsenm
  234. [ARM] Some Thumb2ITBlock clean ups. NFC

    Some more refactoring, like registering the IT Block pass, less cryptic
    variable names, and some simplification of loops.

    Differential Revision: https://reviews.llvm.org/D63419 (detail)
    by sjoerdmeijer
  235. [SystemZ]  Fix AHIMuxK pseudo expansion.

    Do not emit a copy if the source and destination registers are the same.

    Review: Ulrich Weigand (detail)
    by jonpa
  236. [AMDGPU] Speed up live-in virtual register set computation in GCNScheduleDAGMILive.

    Differential revision: https://reviews.llvm.org/D62401 (detail)
    by vpykhtin
  237. [SVE][IR] Scalable Vector IR Type with pr42210 fix

    Recommit of D32530 with a few small changes:
      - Stopped recursively walking through aggregates in
        the verifier, so that we don't impose too much
        overhead on large modules under LTO (see PR42210).
      - Changed tests to match; the errors are slightly
        different since they only report the array or
        struct that actually contains a scalable vector,
        rather than all aggregates which contain one in
        a nested member.
      - Corrected an older comment

    Reviewers: thakis, rengolin, sdesmalen

    Reviewed By: sdesmalen

    Differential Revision: https://reviews.llvm.org/D63321 (detail)
    by huntergr
  238. [X86] Regenerate promote.ll. NFC. (detail)
    by rksimon
  239. [NFC] Improve triple match of scripts that update tests

    Summary:
    The triple matcher previously stopped
    at the first matched triple. It was not possible to
    create specific matches for sub-sets of a triple
    (e.g aarch64-apple-darwin would never be used after
    aarch64 was matched).

    This patch:
    1) Lets specialized triples take priority,
    using the string length of a triple as the
    measure of how specialized it is. If two
    triples of the same length match, the one matched first
    prevails, preserving the old behavior.

    2) Removes 20 duplicated triples of arm, thumb,
    aarch64 options with same arguments, matching
    the common prefix (aarch64, arm, thumb) of them.

    3) Creates three new function matching regexes and
    five triple options for arm64-apple-ios,
    (arm|thumb)-apple-ios and thumb(v5)?-macho
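
    The priority rule in item 1 can be sketched in Python. The function
    name and the (pattern, handler) list are hypothetical, and prefix
    matching via `startswith` is an assumption about how the scripts
    compare triples.

```python
def pick_handler(triple, handlers):
    """Return the handler of the longest pattern that prefixes the triple.

    handlers: list of (pattern, handler) pairs in priority order.
    On equal length, the earlier entry wins (strict '>' comparison).
    """
    best_pattern = None
    best_handler = None
    for pattern, handler in handlers:
        if triple.startswith(pattern):
            if best_pattern is None or len(pattern) > len(best_pattern):
                best_pattern, best_handler = pattern, handler
    return best_handler


handlers = [
    ("aarch64", "generic-aarch64"),
    ("aarch64-apple-darwin", "darwin-aarch64"),
]
print(pick_handler("aarch64-apple-darwin", handlers))
print(pick_handler("aarch64-linux-gnu", handlers))
```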

    Reviewers: lebedev.ri, RKSimon, MaskRay, gbedwell

    Reviewed By: MaskRay

    Subscribers: javed.absar, kristof.beyls, llvm-commits, carwil

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63145 (detail)
    by dnsampaio
  240. [X86] Replace any_extend* vector extensions with zero_extend* equivalents

    First step toward addressing the vector-reduce-mul-widen.ll regression in D63281 - we should replace ANY_EXTEND/ANY_EXTEND_VECTOR_INREG in X86ISelDAGToDAG to avoid having to add duplicate patterns when treating any extensions as legal.

    In future patches this will also allow us to keep any extension nodes around a lot longer in the DAG, which should mean that we can keep better track of undef elements that otherwise become zeros that we think we have to keep.

    Differential Revision: https://reviews.llvm.org/D63326 (detail)
    by rksimon
  241. [DebugInfo][Docs] Document that prologue/epilogue variable location changes are ignored

    This patch documents that LLVM does not describe all changes in variable
    locations during the prologue and the epilogue. The debugger doesn't /
    shouldn't step through that portion of the function anyway, and describing
    every location through such stages would bloat location lists.

    Perform some minor cleanup at the same time,
    * Fix an enumerated list
    * Document that dbg.declare intrinsics have their variable location recorded
       in a MachineFunction table, not with DBG_VALUE meta-insts
    * Add frame-indexes to the list of things that can be operands to
       DBG_VALUEs.

    Differential Revision: https://reviews.llvm.org/D63083 (detail)
    by jmorse
  242. [SimplifyCFG] NFC, prof branch_weights handling is simplified

    Using the new SwitchInstProfUpdateWrapper this patch
    simplifies 3 places of prof branch_weights handling.

    Differential Revision: https://reviews.llvm.org/D62123 (detail)
    by yrouban
  243. [llvm-objdump] Tidy up AMDGCNPrettyPrinter (detail)
    by maskray
  244. [X86] Add i128 ctpop and i32/i64/i128 optsize test cases to popcnt.ll

    Test cases for PR41151 and D59909. (detail)
    by ctopper
  245. [X86] Move code that shrinks immediates for ((x << C1) op C2) into a helper function. NFCI

    Preliminary step for D59909 (detail)
    by ctopper
  246. [X86] Remove MOVDI2SSrm/MOV64toSDrm/MOVSS2DImr/MOVSDto64mr CodeGenOnly instructions.

    The isel patterns for these use a bitcast and load/store, but
    DAG combine should have canonicalized those away.

    For the purposes of the memory folding table these opcodes can be
    replaced by the MOVSSrm_alt/MOVSDrm_alt and MOVSSmr/MOVSDmr opcodes. (detail)
    by ctopper
  247. [X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class.

    Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt.

    Use the new versions in patterns that previously used a COPY_TO_REGCLASS
    to VR128. These patterns expect the upper bits to be zero. The
    current set up appears to work, but I'm not sure we should be
    enforcing upper bits being zero through a COPY_TO_REGCLASS.

    I wanted to flip the arrangement and use a COPY_TO_REGCLASS to
    FR32/FR64 for the patterns that need an f32/f64 result, but that
    complicated fastisel and globalisel.

    I've been doing some experiments with reducing some isel patterns
    and ended up in a situation where I had a
    (SUBREG_TO_REG (COPY_TO_REGCLASS (VMOVSSrm), VR128)) and our
    post-isel peephole was unable to avoid using an instruction for
    the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128
    instruction removes the COPY_TO_REGCLASS that was breaking this. (detail)
    by ctopper
  248. GlobalISel: Remove redundant pass initialization

    Summary:
    All the GlobalISel passes are initialized when the target calls
    initializeGlobalISel(), so we don't need to call the initializers
    from the pass constructors.

    Reviewers: qcolombet, t.p.northover, paquette, dsanders, aemerson, aditya_nandakumar

    Reviewed By: aemerson

    Subscribers: rovka, kristof.beyls, hiraditya, volkan, Petar.Avramovic, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63235 (detail)
    by tstellar
  249. [llvm-strip] Error when using stdin twice

    Summary: Implements bug [[ https://bugs.llvm.org/show_bug.cgi?id=42204 | 42204 ]]. llvm-strip now warns when the same input file is used more than once, and errors when stdin is used more than once.
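
    The described policy can be sketched in Python. This is not the
    llvm-strip code: the function name and message texts are hypothetical,
    and "-" is assumed to be the spelling of stdin on the command line.

```python
def check_inputs(paths):
    """Return warnings for repeated files; raise if stdin is repeated."""
    seen = set()
    warnings = []
    for path in paths:
        if path in seen:
            if path == "-":
                # Stdin can only be consumed once, so this is a hard error.
                raise ValueError("cannot specify '-' as an input file more than once")
            warnings.append(f"'{path}' was already specified")
        seen.add(path)
    return warnings


print(check_inputs(["a.o", "b.o", "a.o"]))
```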

    Reviewers: jhenderson, rupprecht, espindola, alexshap

    Reviewed By: jhenderson, rupprecht

    Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63122 (detail)
    by abrachet
  250. GlobalISel: Use the original flags when lowering fneg to fsub

    This was ignoring the flag on fneg, and using the source instruction's
    flags. Also fixes tests missing from r358702.

    Note the expansion itself isn't correct without nnan, but that should
    be fixed separately. (detail)
    by arsenm
  251. hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag.

    This saves roughly 32 bytes of instructions per function with stack objects
    and causes us to preserve enough information that we can recover the original
    tags of all stack variables.
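
    Extracting bits [3..11) of an address can be sketched in Python as
    follows. The function name is hypothetical; the half-open range is
    read as eight bits starting at bit 3.

```python
def base_stack_tag(ring_buffer_entry_addr):
    """Return bits [3..11) of the address: eight bits starting at bit 3."""
    return (ring_buffer_entry_addr >> 3) & 0xFF


print(hex(base_stack_tag(0x7F8)))
```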

    Now that stack tags are deterministic, we no longer need to pass
    -hwasan-generate-tags-with-calls during check-hwasan. This also means that
    the new stack tag generation mechanism is exercised by check-hwasan.

    Differential Revision: https://reviews.llvm.org/D63360 (detail)
    by pcc
  252. hwasan: Add a tag_offset DWARF attribute to instrumented stack variables.

    The goal is to improve hwasan's error reporting for stack use-after-return by
    recording enough information to allow the specific variable that was accessed
    to be identified based on the pointer's tag. Currently we record the PC and
    lower bits of SP for each stack frame we create (which will eventually be
    enough to derive the base tag used by the stack frame) but that's not enough
    to determine the specific tag for each variable, which is the stack frame's
    base tag XOR a value (the "tag offset") that is unique for each variable in
    a function.
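
    The relationship between the base tag, the tag offset and a variable's
    tag can be sketched in Python. The function names and the example
    offsets are hypothetical, and the 8-bit tag width is an assumption.

```python
def variable_tag(base_tag, tag_offset):
    """Tag of one stack variable: the frame's base tag XOR its tag offset."""
    return (base_tag ^ tag_offset) & 0xFF


def find_variable(pointer_tag, base_tag, tag_offsets):
    """Return the name of the variable whose tag matches the pointer's tag.

    tag_offsets: dict mapping variable name -> tag offset (unique per function).
    Returns None if no variable in this frame matches.
    """
    for name, offset in tag_offsets.items():
        if variable_tag(base_tag, offset) == pointer_tag:
            return name
    return None


offsets = {"buf": 0, "tmp": 1, "len": 2}
tag = variable_tag(0x5A, offsets["tmp"])
print(find_variable(tag, 0x5A, offsets))
```

    Because XOR with a fixed base tag is a bijection, distinct offsets
    always yield distinct tags within one frame.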

    In IR, the tag offset is most naturally represented as part of a location
    expression on the llvm.dbg.declare instruction. However, the presence of the
    tag offset in the variable's actual location expression is likely to confuse
    debuggers which won't know about tag offsets, and moreover the tag offset
    is not required for a debugger to determine the location of the variable on
    the stack, so at the DWARF level it is represented as an attribute so that
    it will be ignored by debuggers that don't know about it.

    Differential Revision: https://reviews.llvm.org/D63119 (detail)
    by pcc
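The tag arithmetic described in entries 251 and 252 can be sketched as follows. This is a minimal illustration, assuming 8-bit tags; the function names and the sample values are hypothetical and are not part of hwasan's runtime or API.

```python
def base_stack_tag(ring_buffer_entry_addr: int) -> int:
    """Take bits [3..11) of the address, i.e. 8 bits starting at bit 3."""
    return (ring_buffer_entry_addr >> 3) & 0xFF


def variable_tag(base_tag: int, tag_offset: int) -> int:
    """A variable's tag is the frame's base tag XOR its per-variable offset."""
    return (base_tag ^ tag_offset) & 0xFF


def recover_tag_offset(base_tag: int, pointer_tag: int) -> int:
    """XOR is its own inverse, so the offset falls out of the two tags."""
    return (base_tag ^ pointer_tag) & 0xFF
```

Because XOR is self-inverse, knowing the frame's base tag and the tag on a faulting pointer is enough to recover the tag offset, which is what would let a report name the specific variable.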
  253. gn build: Merge r363626. (detail)
    by pcc
  254. [GlobalISel][Localizer] Rewrite localizer to run in 2 phases, inter & intra block.

    Inter-block localization is the same as what currently happens, except now it
    only runs on the entry block because that's where the problematic constants with
    long live ranges come from.

    The second phase is a new intra-block localization phase which attempts to
    re-sink the already localized instructions further right before one of the
    multiple uses.

    One additional change is to also localize G_GLOBAL_VALUE as they're constants
    too. However, on some targets like arm64 it takes multiple instructions to
    materialize the value, so some additional heuristics with a TTI hook have been
    introduced to attempt to prevent code size regressions when localizing these.

    Overall, these changes improve CTMark code size on arm64 by 1.2%.

    Full code size results:

    Program                                         baseline       new       diff
    ------------------------------------------------------------------------------
    test-suite...-typeset/consumer-typeset.test    1249984      1217216     -2.6%
    test-suite...:: CTMark/ClamAV/clamscan.test    1264928      1232152     -2.6%
    test-suite :: CTMark/SPASS/SPASS.test          1394092      1361316     -2.4%
    test-suite...Mark/mafft/pairlocalalign.test    731320       714928      -2.2%
    test-suite :: CTMark/lencod/lencod.test        1340592      1324200     -1.2%
    test-suite :: CTMark/kimwitu++/kc.test         3853512      3820420     -0.9%
    test-suite :: CTMark/Bullet/bullet.test        3406036      3389652     -0.5%
    test-suite...ark/tramp3d-v4/tramp3d-v4.test    8017000      8016992     -0.0%
    test-suite...TMark/7zip/7zip-benchmark.test    2856588      2856588      0.0%
    test-suite...:: CTMark/sqlite3/sqlite3.test    765704       765704       0.0%
    Geomean difference                                                      -1.2%

    Differential Revision: https://reviews.llvm.org/D63303 (detail)
    by aemerson
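The "Geomean difference" row in entry 254 can be checked from the table itself. The sketch below assumes the figure is the geometric mean of the per-program new/baseline ratios, expressed as a percent change; the helper name is hypothetical and this is not the script that produced the table.

```python
import math

# (baseline, new) code sizes copied from the table in entry 254.
sizes = [
    (1249984, 1217216),
    (1264928, 1232152),
    (1394092, 1361316),
    (731320, 714928),
    (1340592, 1324200),
    (3853512, 3820420),
    (3406036, 3389652),
    (8017000, 8016992),
    (2856588, 2856588),
    (765704, 765704),
]


def geomean_diff_percent(pairs):
    """Geometric mean of new/baseline ratios, as a percent change."""
    log_sum = sum(math.log(new / base) for base, new in pairs)
    return (math.exp(log_sum / len(pairs)) - 1.0) * 100.0


print(f"{geomean_diff_percent(sizes):.1f}%")
```

Under that assumption the result rounds to the -1.2% reported in the table.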
  255. Propagate fmf in IRTranslate for fneg

    Summary: This case is related to D63405 in that we need to be propagating FMF on negates.

    Reviewers: volkan, spatel, arsenm

    Reviewed By: arsenm

    Subscribers: wdng, javed.absar

    Differential Revision: https://reviews.llvm.org/D63458 (detail)
    by mcberg2017
  256. Use VR128X instead of FR32X/FR64X for the register class in VMOVSSZmrk/VMOVSDZmrk.

    Removes COPY_TO_REGCLASS from some patterns. (detail)
    by ctopper
  257. [X86] Make an assert in LowerSCALAR_TO_VECTOR stricter to make it clear what types are allowed here. NFC

    Make it clear that only integer types with i32 or smaller elements should get to this part of the code. (detail)
    by ctopper
  258. [AMDGPU] Use custom inserter for gfx10 VOP2b

    This is part of the approved D63204 pending parent revision.
    This small change is in fact a part of the VOP2b legalization which
    does not technically belong to wave32 support, so extracted
    separately. (detail)
    by rampitec
  259. [AMDGPU] gfx1010 subvector test. NFC. (detail)
    by rampitec
  260. [test][AArch64] Relax the check line for G_BRJT in legalizer-info-validation.mir

    Replace the specific number with a pattern to relax the test. (detail)
    by volkan
  261. Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges

    This patch really contains two pieces:
    - Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    - Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

    Differential Revision: https://reviews.llvm.org/D63224 (detail)
    by reames
  262. Add convenience utility for replacing a range within a container with a
    different range, in preparation for use in Clang. (detail)
    by rsmith
  263. [globalisel] Fix iterator invalidation in the extload combines

    Summary:
    Change the way we deal with iterator invalidation in the extload combines as it
    was still possible to neglect to visit a use. Even worse, it happened in the
    in-tree test cases and the checks weren't good enough to detect it.

    We now take a cheap copy of the use list before iterating over it. This
    prevents iterator invalidation from occurring and has the nice side effect
    of making the existing schedule-for-erase/schedule-for-insert mechanism
    moot.

    Reviewers: aditya_nandakumar

    Reviewed By: aditya_nandakumar

    Subscribers: rovka, kristof.beyls, javed.absar, volkan, Petar.Avramovic, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D61813 (detail)
    by dsanders
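The hazard fixed in entry 263, and the "cheap copy" remedy, can be illustrated outside LLVM. This is a minimal sketch using a plain Python list as a stand-in for a use list; both function names are hypothetical.

```python
def visit_uses_unsafely(uses, should_erase):
    """Mutating the list while iterating over it can skip elements."""
    visited = []
    for use in uses:
        visited.append(use)
        if should_erase(use):
            uses.remove(use)
    return visited


def visit_uses_with_snapshot(uses, should_erase):
    """Iterate over a shallow copy so edits to `uses` cannot skip anything."""
    visited = []
    for use in list(uses):
        visited.append(use)
        if should_erase(use):
            uses.remove(use)
    return visited
```

With the snapshot, every use is visited exactly once, even though the underlying list shrinks during the walk.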
  264. [AMDGPU] Propagate function attributes thru bitcasts

    AMDGPUPropagateAttributes will not work on function bitcasts,
    so move AMDGPUFixFunctionBitcasts before it.

    Differential Revision: https://reviews.llvm.org/D63455 (detail)
    by rampitec
  265. Fix a bug w/inbounds invalidation in LFTR (recommit)

    Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

    Original commit comment follows:

    This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

    The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

    As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

    (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

    Differential Revision: https://reviews.llvm.org/D62939 (detail)
    by reames
  266. gn build: Merge r363483. (detail)
    by pcc
  267. gn build: Merge r363584. (detail)
    by pcc
  268. AMDGPU/GFX10: Don't generate s_code_end padding in the asm-printer

    Summary:
    The purpose of the padding is to guard against stale code being
    fetched into the instruction cache by the lowest level prefetching.
    We're generating relocatable ELF here, and so the padding should
    arguably be added by the linker. This is in fact what Mesa does.

    This also fixes multi-part shaders for Mesa.

    Change-Id: I6bfede58f20e9f337762ccf39ef9e0e263e69e82

    Reviewers: arsenm, rampitec, t-tye

    Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63427 (detail)
    by nha
  269. Reduced test case for pr42279 in advance of the relevant re-commit + fix (detail)
    by reames
  270. AMDGPU: Explicitly define a triple for some tests

    Summary:
    This is related to the changes to the groupstaticsize intrinsic in
    D61494 which would otherwise make the related tests in these files
    fail or be much less useful.

    Note that for some reason, SOPK generation is less effective in the
    amdhsa OS, which is why I chose PAL. I haven't investigated this
    deeper.

    Change-Id: I6bb99569338f7a433c28b4c9eb1e3e036b00d166

    Reviewers: arsenm

    Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63392 (detail)
    by nha
  271. [EarlyCSE] Fix hashing of self-compares

    Summary:
    Update compare normalization in SimpleValue hashing to break ties (when
    the same value is being compared to itself) by switching to the swapped
    predicate if it has a lower numerical value.  This brings the hashing in
    line with isEqual, which already recognizes the self-compares with
    swapped predicates as equal.

    Fixes PR 42280.

    Reviewers: spatel, efriedma, nikic, fhahn, uabelho

    Reviewed By: nikic

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63349 (detail)
    by josepht
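The tie-breaking rule from entry 271 can be sketched as a key-normalization function. This is an illustration only: operands are modelled as strings ordered by name rather than as IR values, and `normalized_key` is a hypothetical helper, not the EarlyCSE code.

```python
# Numeric values mirror LLVM's unsigned integer compare predicates; they are
# used here purely as an illustrative ordering.
ICMP_UGT, ICMP_UGE, ICMP_ULT, ICMP_ULE = 34, 35, 36, 37
SWAPPED = {
    ICMP_UGT: ICMP_ULT,
    ICMP_ULT: ICMP_UGT,
    ICMP_UGE: ICMP_ULE,
    ICMP_ULE: ICMP_UGE,
}


def normalized_key(pred, lhs, rhs):
    """Return a canonical (pred, lhs, rhs) tuple suitable for hashing."""
    if lhs > rhs:
        # Order the operands, swapping the predicate to keep the meaning.
        lhs, rhs, pred = rhs, lhs, SWAPPED[pred]
    elif lhs == rhs and SWAPPED[pred] < pred:
        # Self-compare: operand order cannot break the tie, so pick the
        # numerically smaller of the predicate and its swapped form.
        pred = SWAPPED[pred]
    return (pred, lhs, rhs)
```

With this rule, `x ult x` and `x ugt x` map to the same key, so a hash built from the key agrees with an equality check that treats the two forms as equal.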
  272. [MemorySSA] Don't use template when the clone is a simplified instruction.

    Summary:
    LoopRotate doesn't create a faithful clone of an instruction, it may
    simplify it beforehand. Hence the clone of an instruction that has a
    MemoryDef associated may not be a definition, but a use or not a memory
    altering instruction.
    Don't rely on the template when the clone may be simplified.

    Reviewers: george.burgess.iv

    Subscribers: jlebar, Prazek, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63355 (detail)
    by asbirlea
  273. [GlobalISel][AArch64] Fold G_SUB into G_ICMP when it's safe to do so

    Basically porting over the behaviour in AArch64ISelLowering to GISel. See
    emitComparison for reference.

    When we have something like this:

    ```
      lhs = G_SUB 0, y
      ...
      G_ICMP lhs, rhs
    ```

    We can fold away the G_SUB and produce a cmn instead, given that we produce
    the same value in NZCV.

    Add a test showing that the transformation works, and also showing that we
    don't perform the transformation when it's unsafe.

    Also factor out the CSet emission into emitCSetForICMP.

    Differential Revision: https://reviews.llvm.org/D63163 (detail)
    by paquette
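For the equality case, the fold in entry 273 rests on a simple modular identity: `(0 - y) - rhs` is zero exactly when `y + rhs` is zero. The sketch below checks this exhaustively at an assumed width of 8 bits. It models only the Z flag; it says nothing about N, C or V, and the function names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def z_flag_cmp_of_negated(y, rhs):
    """Z flag of `cmp (0 - y), rhs`, which computes (0 - y) - rhs."""
    lhs = (0 - y) & MASK
    return ((lhs - rhs) & MASK) == 0


def z_flag_cmn(y, rhs):
    """Z flag of `cmn y, rhs`, which computes y + rhs."""
    return ((y + rhs) & MASK) == 0


def z_flags_agree():
    """Compare the two Z flags for every pair of 8-bit operands."""
    return all(
        z_flag_cmp_of_negated(y, rhs) == z_flag_cmn(y, rhs)
        for y in range(1 << BITS)
        for rhs in range(1 << BITS)
    )
```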
  274. [X86] Add TB_NO_REVERSE to some memory folding table entries where the register form requires 64-bit mode, but the memory form does not.

    We don't know if it's safe to unfold if we're in 32-bit mode.

    This is similar to what was done to some load opcodes in r363523.

    I think it's pretty unlikely we will try to unfold these anyway so
    I don't think this is testable. (detail)
    by ctopper
  275. LiveInterval.h: add LiveRange::findIndexesLiveAt function - returns a list of SlotIndexes the LiveRange is live at.

    Differential revision: https://reviews.llvm.org/D62411 (detail)
    by vpykhtin
  276. [X86][SSE] Scalarize under-aligned XMM vector nt-stores (PR42026)

    If an XMM non-temporal store has less than natural alignment, scalarize the vector - with SSE4A we can stay on the vector and use MOVNTSD(f64), else we must move to GPRs and use MOVNTI(i32/i64). (detail)
    by rksimon
  277. AMDGPU: Make getreg intrinsic inaccessiblememonly (detail)
    by arsenm
  278. [MemorySSA] Add all MemoryPhis before filling their values.

    Summary:
    Add all MemoryPhis in IDF before filling in their incoming values.
    Otherwise, a new Phi can be added that needs to become the incoming
    value of another Phi.
    Test fails the verification in verifyPrevDefInPhis.

    Reviewers: george.burgess.iv

    Subscribers: jlebar, Prazek, zzheng, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63353 (detail)
    by asbirlea
  279. [AMDGPU] gfx1010 wavefrontsize intrinsic folding

    Differential Revision: https://reviews.llvm.org/D63206 (detail)
    by rampitec
  280. AMDGPU: Fold readlane/readfirstlane calls (detail)
    by arsenm
  281. [AMDGPU] Pass to propagate ABI attributes from kernels to the functions

    The pass works in two modes:

    Mode 1: Just set attributes starting from kernels. This can work at
    the very beginning of opt and llc pipeline, but cannot clone functions
    because it must be a function pass.

    Mode 2: Actually clone functions for new attributes. This can only work
    after all function passes in the opt pipeline because it has to be a
    module pass.

    Differential Revision: https://reviews.llvm.org/D63208 (detail)
    by rampitec
  282. gn build: Merge r363541 (detail)
    by nico
  283. [X86][AVX] Split under-aligned vector nt-stores.

    If a YMM/ZMM non-temporal store has less than natural alignment, split the vector - either they will be satisfactorily aligned or will continue to be split until they are XMMs - at which point the legalizer will scalarize it. (detail)
    by rksimon
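The split-then-scalarize behaviour described in entries 276 and 283 can be modelled as a small recursive planner. This is a rough sketch under stated assumptions: sizes and alignments are powers of two in bytes, scalar pieces are assumed to be 8 bytes wide, and `plan_nt_store` is a hypothetical name rather than anything in the X86 backend.

```python
XMM_BYTES = 16
SCALAR_BYTES = 8  # assumption: scalarize into 64-bit pieces


def plan_nt_store(size, align, offset=0):
    """Return (offset, size) pieces for a `size`-byte store aligned to `align`."""
    if align >= size:
        # Naturally aligned: keep it as a single vector store.
        return [(offset, size)]
    if size > XMM_BYTES:
        half = size // 2
        # Each half keeps the original alignment, capped at its own size.
        half_align = min(align, half)
        return (plan_nt_store(half, half_align, offset)
                + plan_nt_store(half, half_align, offset + half))
    # Under-aligned XMM-sized store: fall back to scalar pieces.
    return [(offset + i, SCALAR_BYTES) for i in range(0, size, SCALAR_BYTES)]
```

For example, a 64-byte store with 32-byte alignment becomes two aligned 32-byte stores, while a 32-byte store with 8-byte alignment is split down to XMM size and then into four 8-byte pieces.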
  284. [LV] Suppress vectorization in some nontemporal cases

    When considering a loop containing nontemporal stores or loads for
    vectorization, suppress the vectorization if the corresponding
    vectorized store or load with the alignment of the original scalar
    memory op is not supported with the nontemporal hint on the target.

    This adds two new functions:
      bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
      bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

    to TTI, leaving the target independent default implementation as
    returning true, but with overriding implementations for X86 that
    check the legality based on available Subtarget features.

    This fixes https://llvm.org/PR40759

    Differential Revision: https://reviews.llvm.org/D61764 (detail)
    by wristow
  285. GlobalISel: Ignore callsite attributes when picking intrinsic type

    A target intrinsic may be defined as possibly reading memory, but the
    call site may have additional knowledge that it doesn't read
    memory. The intrinsic lowering will expect the pessimistic assumption
    of the intrinsic definition, so the chain should still be used.

    I fixed the same bug in SelectionDAG in r287593. (detail)
    by arsenm
  286. GlobalISel: Verify intrinsics

    I keep using the wrong instruction when manually writing tests. This
    really needs to check the number of operands, but I don't see an easy
    way to do that right now. (detail)
    by arsenm
  287. AMDGPU/GlobalISel: Account for multiple defs when finding intrinsic ID (detail)
    by arsenm
  288. [AMDGPU] gfx1010 wave32 metadata

    Differential Revision: https://reviews.llvm.org/D63207 (detail)
    by rampitec
  289. AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT

    Reviewers: arsenm

    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D60640 (detail)
    by tstellar
  290. [Remarks] Extend -fsave-optimization-record to specify the format

    Use -fsave-optimization-record=<format> to specify a different format
    than the default, which is YAML.

    For now, only YAML is supported. (detail)
    by thegameg
  291. [X86] combineLoad - begun making the load split code more generic. NFCI.

    This is currently only used for ymm->xmm splitting but we shouldn't hardcode the offsets/alignment.

    This is necessary for an upcoming patch to split under-aligned non-temporal vector loads. (detail)
    by rksimon
  292. PHINode: introduce setIncomingValueForBlock() function, and use it.

    Summary:
    There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue()
    but no function to replace incoming value for a specified BasicBlock*
    predecessor.
    Clearly, there are a lot of places that could use that functionality.

    Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn
    Reviewed By: Meinersbur, fhahn
    Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits
    Tag: LLVM
    Differential Revision: https://reviews.llvm.org/D63338 (detail)
    by whitneyt
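The convenience function from entry 292 combines a lookup by block with a value update. The toy model below shows the idea; the `Phi` class is a hypothetical stand-in and is not LLVM's PHINode. Updating every edge that comes from the given block is a choice made by this sketch.

```python
class Phi:
    """Toy stand-in for a phi node: parallel lists of blocks and values."""

    def __init__(self, incoming):
        self.blocks = [block for block, _ in incoming]
        self.values = [value for _, value in incoming]

    def get_basic_block_index(self, block):
        """Index of the first entry for `block`, or -1 if absent."""
        return self.blocks.index(block) if block in self.blocks else -1

    def set_incoming_value(self, index, value):
        self.values[index] = value

    def set_incoming_value_for_block(self, block, value):
        """Replace the value on every incoming edge from `block`."""
        found = False
        for index, incoming_block in enumerate(self.blocks):
            if incoming_block == block:
                self.set_incoming_value(index, value)
                found = True
        assert found, "block is not a predecessor of this phi"
```

Callers no longer need to pair `get_basic_block_index` with `set_incoming_value` by hand, which is the repetition the entry sets out to remove.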
  293. [X86][SSE] Add tests for underaligned nt loads

    Test both 'unaligned' (for which we should just use regular unaligned loads) and 'subvector aligned' (which we should split) (detail)
    by rksimon
  294. [X86][SSE] Prevent misaligned non-temporal vector load/store combines

    For loads, pre-SSE41 we can't perform NT loads at all, and after that we can only perform vector aligned loads, so if the alignment is less than that of an xmm we'll just end up using the regular unaligned vector loads anyway.

    First step towards fixing PR42026 - the next step for stores will be to use SSE4A movntsd where possible and to avoid the stack spill on SSE2 targets.

    Differential Revision: https://reviews.llvm.org/D63246 (detail)
    by rksimon
  295. InferAddressSpaces: Fix cloning original addrspacecast

    If an addrspacecast needed to be inserted again, this was creating a
    clone of the original cast for each user. Just use the original, which
    also saves losing the value name. (detail)
    by arsenm
  296. AMDGPU: Ignore subtarget for InferAddressSpaces

    Even if the target doesn't have flat instructions, addrspace(0) is
    still flat. It just happens to not work. (detail)
    by arsenm
  297. AMDGPU: Mark exp/exp.compr as inaccessiblememonly

    Should also be marked writeonly, but I think that would require
    splitting the version with done set to a separate intrinsic

    Test change is only from renumbering the attribute group numbers,
    which for some reason the generated check lines consider. (detail)
    by arsenm
  298. AMDGPU/GlobalISel: Fix default mapping for non-register operands

    Tests will be in future commits when new intrinsics are handled here. (detail)
    by arsenm
  299. AMDGPU: Cleanup custom PseudoSourceValue definitions

    Use separate enums for each kind, avoid repeating overloads, and add
    missing classof implementation. (detail)
    by arsenm
  300. [CodeGen] Check for HardwareLoop Latch ExitBlock

    The HardwareLoops pass finds exit blocks with a scevable exit count.
    If the target specifies to update the loop counter in a register,
    through a phi, we need to ensure that the exit block is a latch so
    that we can insert the phi with the correct value for the incoming
    edge.

    Differential Revision: https://reviews.llvm.org/D63336 (detail)
    by sam_parker
  301. [X86][SSE] Avoid unnecessary stack codegen in NT store codegen tests. (detail)
    by rksimon
  302. AsmPrinter: add doc-string for EmitLinkage

    Change-Id: I376fcbd58f84a2aac6aaf744bc1665c92d312b25 (detail)
    by nha
  303. gn build: Merge r363530 (detail)
    by nico
  304. [LV] Deny irregular types in interleavedAccessCanBeWidened

    Summary:
    Prevent the loop vectorizer from creating loads/stores of vectors
    with "irregular" types when interleaving. An example of
    an irregular type is x86_fp80 that is 80 bits, but that
    may have an allocation size that is 96 bits. So an array
    of x86_fp80 is not bitcast compatible with a vector
    of the same type.

    Not sure if interleavedAccessCanBeWidened is the best
    place for this check, but it solves the problem seen
    in the added test case. And it is the same kind of check
    that already exists in memoryInstructionCanBeWidened.

    Reviewers: fhahn, Ayal, craig.topper

    Reviewed By: fhahn

    Subscribers: hiraditya, rkruppe, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63386 (detail)
    by bjope
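The size mismatch behind entry 304 is easy to see with the numbers from the commit message. This is a minimal sketch, assuming array elements occupy their allocation size while vector elements are packed at their bit size; the helper names are hypothetical.

```python
def has_irregular_type(size_in_bits, alloc_size_in_bits):
    """A type is 'irregular' when its allocation size differs from its bit size."""
    return size_in_bits != alloc_size_in_bits


def array_bits(count, alloc_size_in_bits):
    """Array elements are laid out at their allocation size."""
    return count * alloc_size_in_bits


def vector_bits(count, size_in_bits):
    """Assumption of this sketch: vector elements are packed at their bit size."""
    return count * size_in_bits
```

With an 80-bit type allocated in 96 bits, four array elements span 384 bits but a four-element vector spans 320 bits, so one cannot be reinterpreted as the other.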
  305. Test forward references in IntrinsicEmitter on Neon LD(2|3|4)

    This patch tests the forward-referencing added in D62995 by changing
    some existing intrinsics to use forward referencing of overloadable
    parameters, rather than backward referencing.

    This patch changes the TableGen definition/implementation of
    llvm.aarch64.neon.ld2lane and llvm.aarch64.neon.ld2lane intrinsics
    (and similar for ld3 and ld4). This change is intended to be
    non-functional, since the behaviour of the intrinsics is
    expected to be the same.

    Reviewers: arsenm, dmgreen, RKSimon, greened, rnk

    Reviewed By: RKSimon

    Differential Revision: https://reviews.llvm.org/D63189 (detail)
    by s.desmalen
  306. [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting

    Some GEPs were not being split, presumably because that split would just be
    undone by the DAGCombiner. Not performing those splits can prevent important
    optimizations, such as preventing the element indices / member offsets from
    being (partially) folded into load/store instruction immediates. This patch:

    - Makes the splits also occur in the cases where the base address and the GEP
      are in the same BB.
    - Ensures that the DAGCombiner doesn't reassociate them back again.

    Differential Revision: https://reviews.llvm.org/D60294 (detail)
    by luismarques
  307. Fix clang -Wcovered-switch-default after stack-id change by D60137 (detail)
    by maskray
  308. [SelectionDAG] Fold insert_subvector(undef, extract_subvector(v, c), c) -> v in getNode

    This is already done in DAGCombiner::visitINSERT_SUBVECTOR, but this helps a number of shuffles across different vector widths recognise when they come from the same source. (detail)
    by rksimon
  309. [SCEV] Use NoWrapFlags when expanding a simple mul

    Second functional change following on from rL362687. Pass the
    NoWrapFlags from the MulExpr to InsertBinop when we're generating a
    shl or mul.

    Differential Revision: https://reviews.llvm.org/D61934 (detail)
    by sam_parker
  310. [llvm-objdump] Use %08 instead of %016 to print leading addresses for 32-bit binaries

    Reviewed By: grimar

    Differential Revision: https://reviews.llvm.org/D63398 (detail)
    by maskray
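The width change in entry 310 amounts to choosing the zero-padding by target bitness. A minimal sketch, with a hypothetical helper name that is not part of llvm-objdump:

```python
def format_leading_address(address, is_64_bit):
    """Zero-pad to 16 hex digits for 64-bit binaries, 8 for 32-bit ones."""
    width = 16 if is_64_bit else 8
    return f"{address:0{width}x}"
```

For address 0x1000 this gives `00001000` for a 32-bit binary and `0000000000001000` for a 64-bit one.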
  311. [lit] Delete empty lines at the end of lit.local.cfg NFC (detail)
    by maskray
  312. [NFC][Codegen] Standalone tests for icmp eq/ne (urem %x, C), 0 -> icmp eq/ne %x, 0 fold (D63390) (detail)
    by lebedevri
  313. [ARM] Fix another -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265 (detail)
    by maskray
  314. [ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265 (detail)
    by maskray
  315. Describe stack-id as an enum

    This patch changes MIR stack-id from an integer to an enum,
    and adds printing/parsing support for this in MIR files. The default
    stack-id '0' is now renamed to 'default'.

    This should make MIR tests that have stack objects with different stack-ids
    more descriptive. It also clarifies code operating on StackID.

    Reviewers: arsenm, thegameg, qcolombet

    Reviewed By: arsenm

    Differential Revision: https://reviews.llvm.org/D60137 (detail)
    by s.desmalen
  316. [ARM] Remove ARMComputeBlockSize

    Forgot to remove file! (detail)
    by sam_parker
  317. [ARM] Add ARMBasicBlockInfo.cpp

    Forgot to add file! (detail)
    by sam_parker
  318. [ARM] Extract some code from ARMConstantIslandPass

    Create the ARMBasicBlockUtils class for tracking and querying basic
    blocks sizes so we can use them when generating low-overhead loops.

    Differential Revision: https://reviews.llvm.org/D63265 (detail)
    by sam_parker
  319. Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"

    Third time's the charm.

    This was reverted in r363220 due to being suspected of an internal benchmark
    regression and a test failure, none of which turned out to be caused by this. (detail)
    by hans
  320. [SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases

    SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
    if unreachable switch cases are removed. This patch fixes this bug by making use
    of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
    A new test is created.

    Differential Revision: https://reviews.llvm.org/D62186 (detail)
    by yrouban
  321. PowerPC: Optimize SPE double parameter calling setup

    Summary:
    SPE passes doubles the same as soft-float, in register pairs as i32
    types.  This is all handled by the target-independent layer.  However,
    this is not optimal when splitting or reforming the doubles, as it
    pushes to the stack and loads from, on either side.

    For instance, to pass a double argument to a function, assuming the
    double value is in r5, the sequence currently looks like this:

        evstdd      5, X(1)
        lwz         3, X(1)
        lwz         4, X+4(1)

    Likewise, to form a double into r5 from args in r3 and r4:

        stw         3, X(1)
        stw         4, X+4(1)
        evldd       5, X(1)

    This optimizes the fence to use SPE instructions.  Now, to pass a double
    to a function:

        mr          4, 5
        evmergehi   3, 5, 5

    And to form a double into r5 from args in r3 and r4:

        evmergelo   5, 3, 4

    This is comparable to the way that gcc generates the double splits.

    This also fixes a bug with expanding builtins to libcalls, where the
    LowerCallTo() code path was generating intermediate illegal type nodes.

    Reviewers: nemanjai, hfinkel, joerg

    Subscribers: kbarton, jfb, jsji, llvm-commits

    Differential Revision: https://reviews.llvm.org/D54583 (detail)
    by jhibbits
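The register shuffling in entry 321 can be modelled with plain integer arithmetic. The sketch assumes 64-bit SPE registers whose low 32 bits carry the i32 argument, and it reflects one reading of the evmergehi/evmergelo semantics; the Python function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def evmergehi(ra, rb):
    """High word of ra goes to the high word, high word of rb to the low word."""
    return (((ra >> 32) & MASK32) << 32) | ((rb >> 32) & MASK32)


def evmergelo(ra, rb):
    """Low word of ra goes to the high word, low word of rb to the low word."""
    return ((ra & MASK32) << 32) | (rb & MASK32)


def split_double(r5):
    """Model `mr 4, 5` then `evmergehi 3, 5, 5`; the halves land in the low words."""
    r4 = r5
    r3 = evmergehi(r5, r5)
    return r3, r4


def form_double(r3, r4):
    """Model `evmergelo 5, 3, 4`."""
    return evmergelo(r3, r4)
```

Splitting and re-forming round-trips the 64-bit pattern without touching memory, which is the point of replacing the store/load sequences.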
  322. [yaml2obj][MachO] Don't fill dummy data for virtual sections

    Summary:
    Currently, MachOWriter::writeSectionData writes dummy data (0xdeadbeef) to fill section data areas in the file even if the section is a virtual one. Since virtual sections don't occupy any space in the file, writing dummy data could result in the "OS.tell() - fileStart <= Sec.offset" assertion failure.

    This patch fixes the bug by simply not writing any dummy data for virtual sections.

    Reviewers: beanz, jhenderson, rupprecht, alexshap

    Reviewed By: alexshap

    Subscribers: compnerd, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62991 (detail)
    by seiya
  323. [llvm-objcopy] Add elf32-sparc and elf32-sparcel target

    Summary:
    The "sparc"/"sparcel" architectures appear in ArchMap (used by -B option) but not in OutputFormatMap (used by -I/-O option). Add their targets into OutputFormatMap for consistency.

    Note that AFAIK there're no targets for 32-bit little-endian SPARC ("elf32-sparcel") in GNU binutils.

    Reviewers: espindola, alexshap, rupprecht, jhenderson, compnerd, jakehehrlich

    Reviewed By: jhenderson, compnerd, jakehehrlich

    Subscribers: jyknight, emaste, arichardson, fedor.sergeev, jakehehrlich, MaskRay, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63238 (detail)
    by seiya
  324. [X86] Add TB_NO_REVERSE to some folding table entries where the register form uses the REX prefix, but the memory form does not.

    It would not be safe to unfold the memory form to the register form
    without checking that we are compiling for 64-bit mode.

    This probably isn't a real functional issue since we are unlikely
    to unfold any of these instructions since they don't have any
    tied registers, aren't commutable, and don't have any inputs
    other than the address. (detail)
    by ctopper
  325. [InstSimplify] Fix addo/subo undef folds (PR42209)

    Fix folds of addo and subo with an undef operand to be:

    `@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
    as per LLVM undef rules.
    Same for commuted variants.

    Based on the original version of the patch by @nikic.

    Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

    Differential Revision: https://reviews.llvm.org/D63065 (detail)
    by lebedevri
Revision: 362564
Changes
  1. [OpenCL][PR41963] Add generic addr space to old atomics in C++ mode

    Add overloads with generic address space pointer to old atomics.
    This is currently only added for C++ compilation mode.

    Differential Revision: https://reviews.llvm.org/D62335 (detail)
    by stulova
  2. Print more type node information when dumping the AST to JSON. (detail)
    by aaronballman
  3. [clang][NewPM] Add -fno-experimental-new-pass-manager to tests

    As per the discussion on D58375, we disable tests that have optimizations under
    the new PM. This patch adds -fno-experimental-new-pass-manager to RUNS that:

    - Already run with optimizations (-O1 or higher) that were missed in D58375.
    - Explicitly test new PM behavior alongside some new PM RUNS, but are missing
      this flag if new PM is enabled by default.
    - Specify -O without the number. Based on getOptimizationLevel(), it seems the
      default is 2, and the IR appears to be the same when changed to -O2, so
      update the test to explicitly say -O2 and provide -fno-experimental-new-pass-manager.

    Differential Revision: https://reviews.llvm.org/D63156 (detail)
    by leonardchan
  4. [OPENMP]Fix PR42159: do not capture threadprivate variables.

    The threadprivate variables should not be captured in the outlined
    regions, otherwise it leads to a compiler crash. (detail)
    by abataev
  5. Add an automated note to files produced by gen_ast_dump_json_test.py.

    This also details what filters, if any, were used to generate the test output. Updates all the current JSON testing files to include the automated note. (detail)
    by aaronballman
  6. Print information about various type nodes when dumping the AST to JSON. (detail)
    by aaronballman
  7. Fix test/AST/ast-dump-records-json.cpp after ConstantExpr change in D63376 (detail)
    by maskray
  8. [Sema] Fix diagnostic for addr spaces in reference binding

    Extend reference binding behavior to account for address spaces.

    Differential Revision: https://reviews.llvm.org/D62914 (detail)
    by stulova
  9. [Sema] Improved diagnostic for qualifiers in reference binding

    Improved wording and also simplified by using printing
    method from qualifiers.

    Differential Revision: https://reviews.llvm.org/D62914 (detail)
    by stulova
  10. [cmake] Add llvm-dwarfdump to clang test dependencies

    Commit r363496 ("[Clang] Harmonize Split DWARF options with llc",
    2019-06-15) introduced the use of llvm-dwarfdump in the clang tests,
    so ensure the clang tests are dependent on llvm-dwarfdump. (detail)
    by svenvh
  11. [OpenCL] Remove duplicate read_image declarations

    Patch by Pierre Gondois. (detail)
    by svenvh
  12. [RISC-V] Add -msave-restore and -mno-save-restore to clang driver

    Summary:
    The GCC RISC-V toolchain accepts `-msave-restore` and `-mno-save-restore`
    to control whether libcalls are used for saving and restoring the stack within
    prologues and epilogues.

    Clang currently errors if someone passes -msave-restore or -mno-save-restore.
    This means that people need to change build configurations to use clang. This
    patch adds these flags, so that clang invocations can now match gcc.

    As the RISC-V backend does not currently have a `save-restore` target feature,
    we emit a warning if someone requests `-msave-restore`. LLVM does not error if
    we pass the (unimplemented) target features `+save-restore` or `-save-restore`.

    Reviewers: asb, luismarques

    Reviewed By: asb

    Subscribers: rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63498 (detail)
    by lenary
  13. [git-clang-format] recognize hxx as a C++ file

    clangd, clang-tidy, etc. do that already, no reason why
    git-clang-format should skip hxx files.

    Reviewed By: ilya-biryukov

    Differential Revision: https://reviews.llvm.org/D63621 (detail)
    by vmiklos
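
    A minimal sketch of the kind of extension filtering this change affects.
    The helper name `keep_cpp_files` and the allow-list below are hypothetical
    and chosen for illustration; they are not taken from the git-clang-format
    source.

```python
import os

# Hypothetical allow-list; the real default list in git-clang-format may differ.
CPP_EXTENSIONS = {"c", "h", "cc", "cpp", "cxx", "hh", "hpp", "hxx"}


def keep_cpp_files(paths, allowed=CPP_EXTENSIONS):
    """Return only the paths whose extension is in the allow-list."""
    kept = []
    for path in paths:
        # splitext returns (".hxx"-style) extensions; drop the dot and ignore case.
        ext = os.path.splitext(path)[1].lstrip(".").lower()
        if ext in allowed:
            kept.append(path)
    return kept
```

    With `hxx` in the list, a file such as `widget.hxx` is kept rather than
    silently skipped.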
  14. [clang] Small improvements after Adding APValue to ConstantExpr

    Summary:
    This patch has multiple small improvements related to the APValue in ConstantExpr.

    Changes:
    - APValues in ConstantExpr are now cleaned up using ASTContext::addDestruction instead of their own system.
    - ConstantExprBits stores the ValueKind of the result being stored.
    - VerifyIntegerConstantExpression now stores the evaluated value in ConstantExpr.
    - The constant evaluator uses the stored value of ConstantExpr when available.

    Reviewers: rsmith

    Reviewed By: rsmith

    Subscribers: cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63376 (detail)
    by tyker
  15. [CodeGen][test] Use FileCheck variable matchers for better test support

    Summary: Depending on how clang is built, it may discard the IR names and use names like `%2` instead of `%result.ptr`, causing tests that rely on the IR name to fail. Using FileCheck matchers makes the test work regardless of how clang is built.

    This test passes with both `-fno-discard-value-names` and `-fdiscard-value-names` to make sure it passes regardless of the build mode.

    Reviewers: rnk, akhuang, aprantl, lebedev.ri

    Subscribers: cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63625 (detail)
    by rupprecht
  16. [analyzer] DeadStores: Update the crude suppression for files generated by IIG.

    They changed the comments that we were looking for. (detail)
    by dergachev
  17. [X86] Change LL to O in the definitions for the vp2intersect builtins.

    This is needed to support OpenCL where long long is 128 bits.

    This was done for the other builtins already, but I think
    vp2intersect was in phabricator at the time. (detail)
    by ctopper
  18. Print information about various ObjC expression nodes when dumping the AST to JSON. (detail)
    by aaronballman
  19. AMDGPU: Add DS GWS sema builtins (detail)
    by arsenm
  20. [test][Driver] Fix Clang :: Driver/cl-response-file.c

    Clang :: Driver/cl-response-file.c currently FAILs on Solaris:

      Command Output (stderr):
      --
      /vol/llvm/src/clang/dist/test/Driver/cl-response-file.c:10:11: error: CHECK: expected string not found in input
      // CHECK: "-I" "{{.*}}\\Inputs\\cl-response-file\\" "-D" "FOO=2"
                ^

    Looking at the generated response file reveals that this is no surprise:

      /I/vol/llvm/src/clang/dist/test/Driver\Inputs

    with no newline at the end.  The echo command used to create it boils down to

      echo 'a\cb'

    However, one cannot expect \c to be emitted literally: e.g. bash's builtin
    echo has

      \c        suppress further output

    I've tried various combinations of builtin echo, /usr/bin/echo, GNU echo if
    different, the same for printf, and the backslash unescaped and quoted
    (a\cb and a\\cb).  The only combination that worked reliably on Solaris,
    Linux, and macOS was

      printf 'a\\cb'

    so this is what this patch uses.  Tested on amd64-pc-solaris2.11 and
    x86_64-pc-linux-gnu.

    Differential Revision: https://reviews.llvm.org/D63600 (detail)
    by ro
  21. Rename CodeGenFunction::overlapFor* to getOverlapFor*. (detail)
    by rsmith
  22. P0840R2: support for [[no_unique_address]] attribute

    Summary:
    Add support for the C++2a [[no_unique_address]] attribute for targets using the Itanium C++ ABI.

    This depends on D63371.

    Reviewers: rjmccall, aaron.ballman

    Subscribers: dschuff, aheejin, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63451 (detail)
    by rsmith
  23. [clang-tidy] Fail gracefully upon empty database fields

    Fix bz#42281

    Differential Revision: https://reviews.llvm.org/D63613 (detail)
    by serge_sans_paille
  24. Fix passing structs and AVX vectors through sysv_abi

    Do this the same way we did it for ms_abi in r324594.

    Fixes PR36806. (detail)
    by rnk
  25. Fix crash and rejects-valid when a later template parameter or default
    template argument contains a backreference to a dependently-typed
    earlier parameter.

    In a case like:
      template<typename T, T A, decltype(A) = A> struct X {};
      template<typename U> auto Y = X<U, 0>();
    we previously treated both references to `A` in the third parameter as
    being of type `int` when checking the template-id in `Y`. That's wrong;
    the type of `A` in these contexts is the dependent type `U`.

    When we encounter a non-type template argument that we can't convert to
    the parameter type because of type-dependence, we now insert a dependent
    conversion node so that the SubstNonTypeTemplateParmExpr for the
    template argument will have the parameter's type rather than whatever
    type the argument had. (detail)
    by rsmith
  26. [clang][NewPM] Do not eliminate available_externally during `-O2 -flto` runs

    This fixes CodeGen/available-externally-suppress.c when the new pass manager is
    turned on by default. available_externally was not emitted during -O2 -flto
    runs when it should still be retained for link time inlining purposes. This can
    be fixed by checking that we aren't LTOPrelinking when adding the
    EliminateAvailableExternallyPass.

    Differential Revision: https://reviews.llvm.org/D63580 (detail)
    by leonardchan
  27. [clang][NewPM] Move EntryExitInstrumenterPass to the start of the pipeline

    This fixes CodeGen/x86_64-instrument-functions.c when running under the new
    pass manager. The pass should go before any other pass to prevent
    `__cyg_profile_func_enter/exit()` from not being emitted by inlined functions.

    Differential Revision: https://reviews.llvm.org/D63577 (detail)
    by leonardchan
  28. Print additional information about @encode expressions when dumping the AST to JSON. (detail)
    by aaronballman
  29. Print additional information on dependent scopes when dumping the AST to JSON. (detail)
    by aaronballman
  30. [NFC] Fix for InterfaceStubs tests (adding REQUIRES: x86-registered-target).

    clang-hexagon-elf bot was failing with:

    'No available targets are compatible with triple "x86_64-unknown-linux-gnu"'

    Adding a "// REQUIRES: x86-registered-target" to these tests to quiet the bot. (detail)
    by zer0
  31. [X86] Make _mm_mask_cvtps_ph, _mm_maskz_cvtps_ph, _mm256_mask_cvtps_ph, and _mm256_maskz_cvtps_ph aliases for their corresponding cvt_roundps_ph intrinsic.

    These intrinsics should always take an immediate for the rounding mode.
    The base instruction comes from before EVEX embedded rounding. The
    user should always provide the immediate rather than us assuming
    CUR_DIRECTION.

    Make the 512-bit versions also explicit aliases instead of copy
    pasting the code. (detail)
    by ctopper
  32. [OpenMP] Add support for handling declare target to clause when unified memory is required

    Summary:
    This patch adds support for the handling of the variables under the declare target to clause.

    The variables in this case are handled like link variables are. A pointer is created on the host and then mapped to the device. The runtime will then copy the address of the host variable in the device pointer.

    Reviewers: ABataev, AlexEichenberger, caomhin

    Reviewed By: ABataev

    Subscribers: guansong, jdoerfert, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63108 (detail)
    by gbercea
  33. Store a pointer to the return value in a static alloca and let the debugger use that
    as the variable address for NRVO variables.

    Subscribers: hiraditya, cfe-commits, llvm-commits

    Tags: #clang, #llvm

    Differential Revision: https://reviews.llvm.org/D63361 (detail)
    by akhuang
  34. [clang-ifs] Clang Interface Stubs, first version (second landing attempt).

    This change reverts r363649; effectively re-landing r363626. At this point
    clang::Index::CodegenNameGeneratorImpl has been refactored into
    clang::AST::ASTNameGenerator. This makes it so that the previous circular link
    dependency no longer exists, fixing the previous shared lib
    (-DBUILD_SHARED_LIBS=ON) build issue which was the reason for r363649.

    Clang interface stubs (previously referred to as clang-ifsos) is a new frontend
    action in clang that allows the generation of stub files that contain mangled
    name info that can be used to produce a stub library. These stub libraries can
    be useful for breaking up build dependencies and controlling access to a
    library's internal symbols. Generation of these stubs can be invoked by:

    clang -fvisibility=<visibility> -emit-interface-stubs \
                                    -interface-stub-version=<interface format>

    Notice that -fvisibility (along with use of visibility attributes) can be used
    to control what symbols get generated. Currently the interface format is
    experimental but there are a wide range of possibilities here.

    Currently clang-ifs produces .ifs files that can be thought of as analogous to
    object (.o) files, but just for the mangled symbol info. In a subsequent patch
    I intend to add support for merging the .ifs files into one .ifs/.ifso file
    that can be the input to something like llvm-elfabi to produce something like a
    .so file or .dll (but without any of the code, just symbols).

    Differential Revision: https://reviews.llvm.org/D60974 (detail)
    by zer0
  35. [Sema] Diagnose addr space mismatch while constructing objects

    If we construct an object in some arbitrary non-default addr space
    it should fail unless either:
    - There is an implicit conversion from the address space to default
    /generic address space.
    - There is a matching ctor qualified with an address space that is
    either exactly matching or convertible to the address space of an
    object.

    Differential Revision: https://reviews.llvm.org/D62156 (detail)
    by stulova
  36. Dump more information about expressions involving temporaries when dumping the AST to JSON. (detail)
    by aaronballman
  37. AIX system headers need stdint.h and inttypes.h to be re-enterable

    Summary:
    AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX.

    Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF

    Reviewed by: hubert.reinterpretcast, mclow.lists

    Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits

    Tags: #LLVM, #clang, #libc++

    Differential Revision: https://reviews.llvm.org/D59253 (detail)
    by xingxue
  38. Removing a helper function that was trivial to inline into its only use; NFC. (detail)
    by aaronballman
  39. Add test cases for explicit casts when dumping the AST to JSON; NFC. (detail)
    by aaronballman
  40. Dump more information about construct expressions (resolved and unresolved) when dumping the AST to JSON. (detail)
    by aaronballman
  41. Revert "[clang] Fixing windows buildbot after D61552"

    This reverts commit 5d5d2ca69e2b29b36db1a7dd1993ead7b7d2680f.

    has already been fixed by c230eea2f349533468e14672eee94c2016476784 (detail)
    by tyker
  42. [clang] Fixing windows buildbot after D61552

    Summary:
    Original review: https://reviews.llvm.org/D61552

    Build bot failure: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/110

    This adds a missing definition of cxxDeductionGuideDecl.
    Surprisingly, it was still working on Linux without it.

    Reviewers: aaron.ballman

    Differential Revision: https://reviews.llvm.org/D63592 (detail)
    by tyker
  43. [clang][ASTMatchers] Add definition for cxxDeductionGuideDecl introduced in rL363855 (detail)
    by kadircet
  44. [Testing] Dumping the graph requires assertions be enabled (detail)
    by davezarzycki
  45. [clang][AST] Refactoring ASTNameGenerator to use pimpl pattern (NFC).

    The original pimpl pattern used between CodegenNameGenerator and
    CodegenNameGeneratorImpl did a good job of hiding DataLayout making it so that
    users of CodegenNameGenerator did not need to link with llvm core.  This is an
    NFC change to neatly wrap ASTNameGenerator in a pimpl.

    Differential Revision: https://reviews.llvm.org/D63584 (detail)
    by zer0
  46. [analyzer] exploded-graph-rewriter: Implement a --diff mode.

    In this mode the tool would avoid duplicating the contents of the
    program state on every node, replacing them with a diff-like dump
    of changes that happened on that node.

    This is useful because most of the time we are only interested in whether
    the effect of the statement was modeled correctly. A diffed graph would
    also be much faster to load and navigate, being much smaller than
    the original graph.

    The diffs are computed "semantically" as opposed to plain text diffs.
    I.e., the diff algorithm is hand-crafted separately for every state trait,
    taking the underlying data structures into account. This is especially nice
    for Environment because textual diffs would have been terrible.
    On the other hand, it requires some boilerplate to implement.

    Differential Revision: https://reviews.llvm.org/D62761 (detail)
    by dergachev
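
    A minimal sketch of a "semantic" diff for one map-like state trait,
    as opposed to a plain text diff. The function name `diff_mapping` and the
    dictionary representation of a trait are assumptions made for
    illustration; the actual exploded-graph-rewriter code is organized
    differently and handles each trait separately.

```python
def diff_mapping(prev, curr):
    """Compare two key/value snapshots of a state trait.

    Returns (removed, added): entries that disappeared or changed value
    since the previous node, and entries that are new or changed in the
    current node. Unchanged entries are left out of both results.
    """
    removed = {k: v for k, v in prev.items() if k not in curr or curr[k] != v}
    added = {k: v for k, v in curr.items() if k not in prev or prev[k] != v}
    return removed, added
```

    Only the entries touched by a statement show up in the result, which is
    what keeps a diffed node small compared to a full state dump.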
  47. [analyzer] exploded-graph-rewriter: Fix escaping StringRegions.

    Quotes around StringRegions are now escaped and unescaped correctly,
    producing valid JSON.

    Additionally, add a forgotten escape for Store values.

    Differential Revision: https://reviews.llvm.org/D63519 (detail)
    by dergachev
  48. [analyzer] Fix JSON dumps for store clusters.

    Include a unique pointer so that it is possible to figure out if it's
    the same cluster in different program states. This allows comparing
    dumps of different states against each other.

    Differential Revision: https://reviews.llvm.org/D63362 (detail)
    by dergachev
  49. [analyzer] Fix JSON dumps for location contexts.

    Location context ID is a property of the location context, not of an item
    within it. It's useful to know the ID even when there are no items
    in the context, e.g. for the purposes of figuring out how the contents
    of the Environment for the same location context changed across states.

    Differential Revision: https://reviews.llvm.org/D62754 (detail)
    by dergachev
  50. [analyzer] Fix JSON dumps for dynamic type information.

    They're now valid JSON.

    Differential Revision: https://reviews.llvm.org/D62716 (detail)
    by dergachev
  51. [analyzer] NFC: Change evalCall() to provide a CallEvent.

    This changes the checker callback signature to use the modern, easy to
    use interface. Additionally, this unblocks future work on allowing
    checkers to implement evalCall() for calls that don't correspond to any
    call-expression or require additional information that's only available
    as part of the CallEvent, such as C++ constructors and destructors.

    Differential Revision: https://reviews.llvm.org/D62440 (detail)
    by dergachev
  52. [analyzer] DeadStores: Add a crude suppression for files generated by DriverKit IIG.

    IIG is a replacement for MIG in DriverKit: IIG is autogenerating C++ code.
    Suppress dead store warnings on such code, as the tool seems to be producing
    them regularly, and the users of IIG are not in position to address these
    warnings, as they don't control the autogenerated code. IIG-generated code
    is identified by looking at the comments at the top of the file.

    Differential Revision: https://reviews.llvm.org/D63118 (detail)
    by dergachev
  53. [analyzer] RetainCount: Add support for OSRequiredCast().

    It's a new API for custom RTTI in Apple IOKit/DriverKit framework that is
    similar to OSDynamicCast() that's already supported, but crashes instead of
    returning null (and therefore causing UB when the cast fails unexpectedly).
    Kind of like cast_or_null<> as opposed to dyn_cast_or_null<> in LLVM's RTTI.

    Historically, RetainCountChecker was responsible for modeling OSDynamicCast.
    This is simply an extension of the same functionality.

    Differential Revision: https://reviews.llvm.org/D63117 (detail)
    by dergachev
  54. [X86] Correct the __min_vector_width__ attribute on a few intrinsics. (detail)
    by ctopper
  55. [clang][AST] ASTNameGenerator: A refactoring of CodegenNameGeneratorImpl (NFC).

    This is an NFC refactoring move of CodegenNameGeneratorImpl from clang::Index to
    clang::AST (renamed to ASTNameGenerator). The purpose is to make the
    high-level mangling code more reusable inside of clang (say in places like clang
    FrontendAction). This does not affect anything in CodegenNameGenerator, except
    that CodegenNameGenerator will now use ASTNameGenerator (in AST).

    Differential Revision: https://reviews.llvm.org/D63535 (detail)
    by zer0
  56. Print whether a generic selection expression is result dependent when dumping the AST to JSON. (detail)
    by aaronballman
  57. Reapply "r363684: AMDGPU: Add GWS instruction builtins" (detail)
    by arsenm
  58. Print out the union field being initialized by an InitListExpr when dumping the AST to JSON. (detail)
    by aaronballman
  59. Dump the value calculated by a constant expression when dumping the AST to JSON. (detail)
    by aaronballman
  60. Switching this test to use output generated by script; NFC. (detail)
    by aaronballman
  61. [AST] Fixed extraneous warnings for binary conditional operator

    Summary:
    Binary conditional operator gave warnings where ternary operators
    did not. They have been fixed to warn similarly to ternary operators.

    Link: https://bugs.llvm.org/show_bug.cgi?id=42239

    Reviewers: rsmith, aaron.ballman, nickdesaulniers

    Reviewed By: rsmith, nickdesaulniers

    Subscribers: srhines, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63369 (detail)
    by nathan-huckleberry
  62. [clang] Adapt ASTMatcher to explicit(bool) specifier

    Summary:
    Changes:
    - Add an AST matcher for deduction guides.
    - Allow the isExplicit matcher for deduction guides.
    - Add a hasExplicitSpecifier matcher which gives access to the expression of the explicit specifier if present.

    Reviewers: klimek, rsmith, aaron.ballman

    Reviewed By: aaron.ballman

    Subscribers: aaron.ballman, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D61552 (detail)
    by tyker
  63. Add test cases for dumping record definition data to JSON; NFC. (detail)
    by aaronballman
  64. [clang][test] Add missing LambdaTemplateParams test and migrate from getLocStart

    These were removed a long time ago in r341573, but this test was missed because it was not in cmake (detail)
    by rupprecht
  65. [clang][NewPM] Fixing remaining -O0 tests that are broken under new PM

    - CodeGen/flatten.c will fail under new PM because the new PM AlwaysInliner
      seems to intentionally inline functions but not call sites marked with
      alwaysinline (D23299)
    - Tests that check remarks happen to check them for the inliner which is not
      turned on at O0. These tests just check that remarks work, but we can make
      separate tests for the new PM with -O1 so we can turn on the inliner and
      check the remarks with minimal changes.

    Differential Revision: https://reviews.llvm.org/D62225 (detail)
    by leonardchan
  66. Unify DependencyFileGenerator class and DependencyCollector interface (NFCI)

    Make DependencyFileGenerator a DependencyCollector as it was intended when
    DependencyCollector was introduced. The missing PPCallbacks overrides are added to
    the DependencyCollector as well.

    This change will allow clang-scan-deps to access the produced dependencies without
    writing them out to .d files to disk, so that it will be able to collate them and
    report them to the user.

    Differential Revision: https://reviews.llvm.org/D63290 (detail)
    by arphaman
  67. [NFC][codeview] Avoid undefined grep in debug-info-codeview-display-name.cpp

    vertical-line is not a BRE special character.

    POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\|`
    is undefined. This patch uses an ERE instead. (detail)
    by hubert.reinterpretcast
  68. Revert rL363684 : AMDGPU: Add GWS instruction builtins
    ........
    Depends on rL363678 which was reverted at rL363797 (detail)
    by rksimon
  69. [analyzer] SARIF: Add EOF newline; replace diff_sarif

    Summary:
    This patch applies a change similar to rC363069, but for SARIF files.

    The `%diff_sarif` lit substitution invokes `diff` with a non-portable
    `-I` option. The intended effect can be achieved by normalizing the
    inputs to `diff` beforehand. Such normalization can be done with
    `grep -Ev`, which is also used by other tests.

    Additionally, this patch updates the SARIF output to have a newline at
    the end of the file. This makes it so that the SARIF file qualifies as a
    POSIX text file, which increases the consumability of the generated file
    in relation to various tools.

    Reviewers: NoQ, sfertile, xingxue, jasonliu, daltenty, aaron.ballman

    Reviewed By: aaron.ballman

    Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, Charusso, jsji, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D62952 (detail)
    by hubert.reinterpretcast
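
    A minimal sketch of the "normalize first, then compare" idea described
    above, in place of a diff option that ignores matching lines. The helper
    names and the ignore pattern are hypothetical; the actual tests use
    `grep -Ev` with their own pattern.

```python
import re


def normalize(text, ignore_pattern):
    """Drop every line matching ignore_pattern, like `grep -Ev`."""
    ignore = re.compile(ignore_pattern)
    return [line for line in text.splitlines() if not ignore.search(line)]


def same_after_normalizing(actual, expected, ignore_pattern):
    """Compare two files after removing lines that are expected to differ."""
    return normalize(actual, ignore_pattern) == normalize(expected, ignore_pattern)
```

    For example, two outputs that differ only in a machine-specific `"uri"`
    line compare equal once lines matching `"uri":` are filtered out of both.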
  70. Add a script to help generate expected test output for dumping the AST to JSON.

    Patch by Abhishek Bhaskar. (detail)
    by aaronballman
  71. Change the way we output templates for JSON AST dumping and dump information about template arguments.

    Previously, we attempted to write out template parameters and specializations to their own array, but due to the architecture of the ASTNodeTraverser, this meant that other nodes were not being written out. This now follows the same behavior as the regular AST dumper and puts all the (correct) information into the "inner" array. When we correct the AST node traverser itself, we can revisit splitting this information into separate arrays again. (detail)
    by aaronballman
  72. [OpenMP] Strengthen regression tests for task allocation under nowait depend clauses NFC

    Summary:
    This patch strengthens the tests introduced in D63009 by:
    - adding new test for default device ID.
    - modifying existing tests to pass device ID local variable to the task allocation function.

    Reviewers: ABataev, Hahnfeld, caomhin, jdoerfert

    Reviewed By: ABataev

    Subscribers: guansong, jdoerfert, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63454 (detail)
    by gbercea
  73. Allow copy/move assignment operator to be coroutine as per N4775

    This change fixes https://bugs.llvm.org/show_bug.cgi?id=40997.

    Reviewers: GorNishanov, rsmith
    Reviewed by: GorNishanov
    Subscribers: cfe-commits, lewissbaker, modocache, llvm-commits

    Differential Revision: https://reviews.llvm.org/D63381 (detail)
    by vivekvpandya
  74. [Syntax] Fix a crash when dumping empty token buffer (detail)
    by ibiryukov
  75. [OpenCL] Split type and macro definitions into opencl-c-base.h

    Using the -fdeclare-opencl-builtins option will require a way to
    predefine types and macros such as `int4`, `CLK_GLOBAL_MEM_FENCE`,
    etc.  Move these out of opencl-c.h into opencl-c-base.h such that the
    latter can be shared by -fdeclare-opencl-builtins and
    -finclude-default-header.

    This changes the behaviour of -finclude-default-header when
    -fdeclare-opencl-builtins is specified: instead of including the full
    header, it will include the header with only the base definitions.

    Differential revision: https://reviews.llvm.org/D63256 (detail)
    by svenvh
  76. Revert r363116 "[X86] [ABI] Fix i386 ABI "__m64" type bug"

    This introduced MMX instructions in code that wasn't previously using
    them, breaking programs using 64-bit vectors and x87 floating-point in
    the same application. See discussion on the code review for more
    details.

    > According to System V i386 ABI: the  __m64 type parameter and return
    > value are passed by MMX registers. But current implementation treats
    > __m64 as i64 which results in parameter passing by stack and returning
    > by EDX and EAX.
    >
    > This patch fixes the bug (https://bugs.llvm.org/show_bug.cgi?id=41029)
    > for Linux and NetBSD.
    >
    > Patch by Wei Xiao (wxiao3)
    >
    > Differential Revision: https://reviews.llvm.org/D59744 (detail)
    by hans
  77. [analyzer][NFC][tests] Pre-normalize expected-sarif files

    As discussed in the review for D62952, this patch pre-normalizes the
    reference expected output sarif files by removing lines containing
    fields for which we expect differences that should be ignored. (detail)
    by hubert.reinterpretcast
  78. [RISCV] Mark TLS as supported

    Inform Clang that TLS is implemented by LLVM for RISC-V

    Differential Revision: https://reviews.llvm.org/D57055 (detail)
    by lewis-revill
  79. git-clang-format: Remove trailing whitespace in docstring. NFC.

    Differential Revision: https://reviews.llvm.org/D62915 (detail)
    by sbc
  80. Fix tests after r363749

    We changed -Wmissing-prototypes there, which was used in these tests via
    -Weverything. (detail)
    by aaronpuchert
  81. Suggestions to fix -Wmissing-{prototypes,variable-declarations}

    Summary:
    I've found that most often the proper way to fix this warning is to add
    `static`, because if the code otherwise compiles and links, the function
    or variable is apparently not needed outside of the TU.

    We can't provide a fix-it hint for variable declarations, because
    multiple VarDecls can share the same type, and if we put static in front
    of that, we affect all declared variables, some of which might have
    previous declarations.

    We also provide no fix-it hint for the rare case of an `extern` function
    definition, because that would require removing `extern` and I have no
    idea how to get the source location of the storage class specifier from
    a FunctionDecl. I believe this information is only available earlier in
    the AST construction from DeclSpec::getStorageClassSpecLoc(), but we
    don't have that here.

    Reviewed By: aaron.ballman

    Differential Revision: https://reviews.llvm.org/D59402 (detail)
    by aaronpuchert
  82. Show note for -Wmissing-prototypes for functions with parameters

    Summary:
    There was a search for non-prototype declarations for the function, but
    we only showed the results for zero-parameter functions. Now we show the
    note for functions with parameters as well, but we omit the fix-it hint
    suggesting to add `void`.

    Reviewed By: aaron.ballman

    Differential Revision: https://reviews.llvm.org/D62750 (detail)
    by aaronpuchert
  83. [test] NFC, update clang-scan-deps tests to not use -c to avoid driver issues when no integrated assembler is present

    Caught by Douglas Yung. (detail)
    by arphaman
  84. [OPENMP]Use host's mangling for 128 bit float types on the device.

    The device has to use the same mangling as the host for 128-bit float types. Otherwise, the codegen for the device is unable to find the parent function when it tries to generate the outlined function for the target region, which leads to incorrect compilation and a crash at runtime.
    by abataev
  85. [OPENMP][NVPTX]Correct codegen for 128 bit long double.

    If the host uses 128 bit long doubles, the compiler should generate correct code for NVPTX devices. If the return type has 128 bit long doubles, in LLVM IR this type must be coerced to int array instead. (detail)
    by abataev
  86. [OPENMP]Use host's long double when compiling the code for device.

    The device code must use the same long double type as the host.
    Otherwise the code cannot be linked and executed properly. The patch adds
    only basic support and checks for support of the host long double
    on the device.  (detail)
    by abataev
  87. Add test cases for dumping AST function decl nodes to JSON; NFC. (detail)
    by aaronballman
  88. Add test cases for dumping AST decl nodes to JSON; NFC. (detail)
    by aaronballman
  89. [Syntax] Add a helper to find expansion by its first spelled token

    Summary: Used in clangd for a code tweak that expands a macro.

    Reviewers: sammccall

    Reviewed By: sammccall

    Subscribers: kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D62954 (detail)
    by ibiryukov
  90. [CodeGen][ARM] Fix FP16 vector coercion

    Summary:
    When a function argument or return type is a homogeneous aggregate
    which contains an FP16 vector but the target does not support FP16
    operations natively, the type must be converted into an array of
    integer vectors by the front end (otherwise LLVM will handle FP16
    vectors incorrectly by scalarizing them and promoting FP16 to float,
    see https://reviews.llvm.org/D50507).

    Currently the logic for checking whether or not a given homogeneous
    aggregate contains FP16 vectors is incorrect: it only looks at the
    type of the first vector.

    This patch fixes the issue by adding a new method
    ARMABIInfo::containsAnyFP16Vectors and using it. The traversal logic
    of this method is largely the same as in
    ABIInfo::isHomogeneousAggregate.

    Reviewers: eli.friedman, olista01, ostannard

    Reviewed By: ostannard

    Subscribers: ostannard, john.brawn, javed.absar, kristof.beyls, pbarrio, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63437 (detail)
    by miyuki
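
    A minimal sketch of the difference between checking only the first
    vector and checking every member of an aggregate. The tuple-based type
    model and the function name are hypothetical and only mimic the shape of
    the traversal; they are not the ARMABIInfo implementation.

```python
def contains_any_fp16_vector(ty):
    """Recursively check whether a (toy) type contains an FP16 vector.

    Toy type model, assumed for illustration:
      ("vector", element_name, count)
      ("array", element_type, count)
      ("struct", [member_type, ...])
    """
    kind = ty[0]
    if kind == "vector":
        return ty[1] == "half"
    if kind == "array":
        return contains_any_fp16_vector(ty[1])
    if kind == "struct":
        # Look at every member, not just the first one.
        return any(contains_any_fp16_vector(member) for member in ty[1])
    return False
```

    An aggregate whose first member is a float vector and whose second
    member is a half vector is reported as containing FP16 vectors here,
    whereas a check of the first member alone would miss it.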
  91. AMDGPU: Add GWS instruction builtins (detail)
    by arsenm
  92. AMDGPU: Disable errno by default (detail)
    by arsenm
  93. Require commas to separate multiple GNU-style attributes in the same attribute list.

    Fixes PR38352. (detail)
    by aaronballman
  94. Fix compiler warning by removing unused variable (detail)
    by uabelho
  95. Revert D60974 "[clang-ifs] Clang Interface Stubs, first version."

    This reverts commit rC363626.

    clangIndex depends on clangFrontend. r363626 adds a dependency from
    clangFrontend to clangIndex, which creates a circular dependency.

    This is disallowed by -DBUILD_SHARED_LIBS=on builds:

        CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
          "clangFrontend" of type SHARED_LIBRARY
            depends on "clangIndex" (weak)
          "clangIndex" of type SHARED_LIBRARY
            depends on "clangFrontend" (weak)
        At least one of these targets is not a STATIC_LIBRARY.  Cyclic dependencies are allowed only among static libraries.

    Note, the dependency on clangIndex cannot be removed because
    libclangFrontend.so is linked with -Wl,-z,defs: a shared object must
    have its full direct dependencies specified on the linker command line.

    In -DBUILD_SHARED_LIBS=off builds, this appears to work when linking
    `bin/clang-9`. However, it can cause trouble to downstream clang library
    users. The llvm build system links libraries this way:

        clang main_program_object_file ... lib/libclangIndex.a ...  lib/libclangFrontend.a -o exe

    libclangIndex.a etc are not wrapped in --start-group.

    If the downstream application depends on libclangFrontend.a but not any
    other clang libraries that depend on libclangIndex.a, this can cause undefined
    reference errors when the linker is ld.bfd or gold.

    The proper fix is to not include clangIndex files in clangFrontend. (detail)
    by maskray
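    The link-order problem described above can be sketched with a toy model.
    This is a simplified illustration, not how ld.bfd or gold are implemented:
    it assumes each archive is a single member, scans archives once from left
    to right, and pulls an archive in only when it resolves a symbol that is
    currently undefined. The symbol names and the `single_pass_link` helper
    are hypothetical.

```python
def single_pass_link(main_needs, archives):
    """Return the symbols left undefined after one left-to-right pass.

    Simplified model: each archive is one member, described as
    (name, defines, needs). Real archives hold many members.
    """
    undefined = set(main_needs)
    defined = set()
    for _name, defines, needs in archives:
        # An archive is pulled in only if it resolves a currently undefined symbol.
        if undefined & defines:
            defined |= defines
            undefined |= needs
            undefined -= defined
    return undefined


# Hypothetical symbol names, chosen only for illustration.
index = ("libclangIndex.a", {"index_sym"}, {"frontend_sym"})
frontend = ("libclangFrontend.a", {"frontend_sym"}, {"index_sym"})

# Index before Frontend: nothing needs Index yet, so it is skipped.
print(single_pass_link({"frontend_sym"}, [index, frontend]))
# Frontend before Index: every reference is resolved.
print(single_pass_link({"frontend_sym"}, [frontend, index]))
```

    In this model the first ordering leaves the Index symbol unresolved, which
    matches the undefined reference errors the commit message warns about.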
  96. [NFC] Undoing r363646 to fix bots.

    -DBUILD_SHARED_LIBS=ON still has problems caused by layering issues with
    D60974. Locally there weren't problems building with shared libs on or off but
    the bots appear to be acting up. (detail)
    by zer0
  97. [NFC] Fixing -DBUILD_SHARED_LIBS=ON problem caused by layering issue in  D60974 (detail)
    by zer0
  98. [Remarks][Driver] Use the specified format in the remarks file extension

    By default, use `.opt.yaml`, but when a format is specified with
    `-fsave-optimization-record=<format>`, use `.opt.<format>`. (detail)
    by thegameg
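    The naming rule in this commit message can be written as a small sketch.
    The helper name `remarks_file_extension` is hypothetical and this is not
    the driver's code; it only restates the rule that the suffix is
    `.opt.yaml` by default and `.opt.<format>` when a format is given.

```python
def remarks_file_extension(fmt=None):
    """Return the remarks file suffix: '.opt.yaml' unless a format is given."""
    return ".opt." + (fmt if fmt else "yaml")


print(remarks_file_extension())
print(remarks_file_extension("yaml"))
```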
  99. [clang-ifs] Clang Interface Stubs, first version.

    Clang interface stubs (previously referred to as clang-ifsos) is a new frontend
    action in clang that allows the generation of stub files that contain mangled
    name info that can be used to produce a stub library. These stub libraries can
    be useful for breaking up build dependencies and controlling access to a
    library's internal symbols. Generation of these stubs can be invoked by:

    clang -fvisibility=<visibility> -emit-interface-stubs \
                                    -interface-stub-version=<interface format>

    Notice that -fvisibility (along with use of visibility attributes) can be used
    to control what symbols get generated. Currently the interface format is
    experimental but there are a wide range of possibilities here.

    Differential Revision: https://reviews.llvm.org/D60974 (detail)
    by zer0
  100. Fix crash when checking a dependently-typed reference that is
    initialized from a non-value-dependent initializer. (detail)
    by rsmith
  101. Rewrite ConstStructBuilder with a mechanism that can cope with splitting and updating constants.

    Summary:
    This adds a ConstantBuilder class that deals with incrementally building
    an aggregate constant, including support for overwriting
    previously-emitted parts of the aggregate with new values.

    This fixes a bunch of cases where we used to be unable to reduce a
    DesignatedInitUpdateExpr down to an IR constant, and also lays some
    groundwork for emission of class constants with [[no_unique_address]]
    members.

    Reviewers: rjmccall

    Subscribers: cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63371 (detail)
    by rsmith
  102. Clang :: Sema/wchar.c has long been failing on Solaris:

      error: 'error' diagnostics expected but not seen:
        File /vol/llvm/src/clang/local/test/Sema/wchar.c Line 22: initializing wide char array with non-wide string literal
      error: 'error' diagnostics seen but not expected:
        File /vol/llvm/src/clang/local/test/Sema/wchar.c Line 20: array initializer must be an initializer list
        File /vol/llvm/src/clang/local/test/Sema/wchar.c Line 22: array initializer must be an initializer list

    It turns out the definition is wrong, as can be seen in GCC's gcc/config/sol2.h:

      /* wchar_t is called differently in <wchar.h> for 32 and 64-bit
         compilations.  This is called for by SCD 2.4.1, p. 6-83, Figure 6-65
         (32-bit) and p. 6P-10, Figure 6.38 (64-bit).  */
     
      #undef WCHAR_TYPE
      #define WCHAR_TYPE (TARGET_64BIT ? "int" : "long int")

    The following patch implements this, and at the same time corrects the wint_t
    definition which is the same:

      /* Same for wint_t.  See SCD 2.4.1, p. 6-83, Figure 6-66 (32-bit).  There's
         no corresponding 64-bit definition, but this is what Solaris 8
         <iso/wchar_iso.h> uses.  */
     
      #undef WINT_TYPE
      #define WINT_TYPE (TARGET_64BIT ? "int" : "long int")

    Clang :: Preprocessor/wchar_t.c and Clang :: Sema/format-strings.c need to
    be adjusted to account for that.

    Tested on i386-pc-solaris2.11, x86_64-pc-solaris2.11, and x86_64-pc-linux-gnu.

    Differential Revision: https://reviews.llvm.org/D62944 (detail)
    by ro
  103. PR42205: DebugInfo: Do not attempt to emit debug info metadata for static member variable template partial specializations

    Would cause a crash in an attempt to create the type for the still
    unresolved 'auto' in the partial specialization (& even without the use
    of 'auto', the expression would be value dependent &
    crash/assertion-fail there). (detail)
    by dblaikie
  104. [clang][AST] Remove unnecessary 'const'. (detail)
    by hliao
  105. Various improvements to Clang MSVC Visualizer

    This change adds/improves MSVC visualizers for many Clang types, including array types, trailing return types in function, deduction guides, a fix for OpaquePtr, etc. It also replaces all of the view(deref) with the "na" formatter, which is a better built-in natvis technique for doing the same thing.

    Differential Revision: https://reviews.llvm.org/D63039 (detail)
    by mps
  106. [Remarks] Extend -fsave-optimization-record to specify the format

    Use -fsave-optimization-record=<format> to specify a different format
    than the default, which is YAML.

    For now, only YAML is supported. (detail)
    by thegameg
  107. [clang][CodeGen] Remove std::move on temporary (detail)
    by kadircet
  108. [HIP] Add the interface deriving the stub name of device kernels.

    Summary:
    - Revise the interface to derive the stub name and simplify the
      assertion of it.

    Reviewers: yaxunl, tra

    Subscribers: cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63335 (detail)
    by hliao
  109. Promote -fdebug-compilation-dir from a cc1 flag to clang and clang-cl driver flags

    The flag is useful when wanting to create .o files that are independent
    from the absolute path to the build directory. -fdebug-prefix-map= can
    be used to the same effect, but it requires putting the absolute path
    to the build directory on the build command line, so it still requires
    the build command line to be dependent on the absolute path of the build
    directory. With this flag, "-fdebug-compilation-dir ." makes it so that
    both debug info and the compile command itself are independent of the
    absolute path of the build directory, which is good for build
    determinism (in the sense that the build is independent of which
    directory it happens in) and for caching compile results.
    (The tradeoff is that the debugger needs explicit configuration to know
    the build directory. See also http://dwarfstd.org/ShowIssue.php?issue=171130.2)

    Differential Revision: https://reviews.llvm.org/D63387 (detail)
    by nico
  110. Recommit [OpenCL] Move OpenCLBuiltins.td and remove unused include

    Reland r363242 after fixing an issue with the tablegen dependence.

    Patch by Pierre Gondois and Sven van Haastregt.

    Differential revision: https://reviews.llvm.org/D62849 (detail)
    by svenvh
  111. Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"

    Third time's the charm.

    This was reverted in r363220 due to being suspected of an internal benchmark
    regression and a test failure, neither of which turned out to be caused by this. (detail)
    by hans
  112. [docs] Fix another bot error by setting highlight language of objc code-block to objc instead of c++. (detail)
    by dhinton
Revision: 362564
Changes
  1. [clangd] Add include-mapping for C symbols.

    Summary:
    This resolves the issue of introducing c++-style includes for C files.

    - refactor the gen_std.py, make it reusable for parsing C symbols.
    - add a language mode to the mapping method to use different mapping for
      C and C++ files.

    Reviewers: kadircet

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63270 (detail)
    by hokein
  2. [clang-tidy] Fix a typo in the doc. (detail)
    by hokein
  3. [clang-tidy] Move test files of rL363975 into Inputs directory (detail)
    by kadircet
  4. [clang-tidy] Fail gracefully upon empty database fields

    Fix bz#42281

    Differential Revision: https://reviews.llvm.org/D63613 (detail)
    by serge_sans_paille
  5. [clangd] Include the diagnostics's code when comparing diagnostics

    Summary: This fixes https://github.com/clangd/clangd/issues/60

    Reviewers: kadircet

    Reviewed By: kadircet

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63316 (detail)
    by nridge
  6. [clangd] Consume error returned by cleanupAndFormat

    When called by ClangdServer::applyTweak.
    No idea how to actually trigger this in practice, so no tests. (detail)
    by ibiryukov
  7. [clangd] Format changes produced by rename

    Reviewers: hokein, kadircet, sammccall

    Reviewed By: kadircet

    Subscribers: MaskRay, jkorous, arphaman, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63562 (detail)
    by ibiryukov
  8. [clangd] Collect tokens of main files when building the AST

    Summary:
    The first use of this is a code tweak to expand macro calls.
    Will later be used to build syntax trees.

    The memory overhead is small as we only store tokens of the main file.

    Reviewers: sammccall

    Reviewed By: sammccall

    Subscribers: mgorny, MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D62956 (detail)
    by ibiryukov
  9. [clangd] Correct the MessageType enum values. (detail)
    by hokein
  10. Revert "[clangd] Return vector<TextEdit> from applyTweak. NFC"

    This reverts commit r363691. (detail)
    by sammccall
  11. [clangd] Add ClangdServer accessor for buffer contents (detail)
    by sammccall
  12. [TEST] Fix test on Windows by looking for substrings rather than a regex
    since the escaping of special characters appears to break on Windows. (detail)
    by dyung
  13. Fix more tests after r363749

    Apparently -Wmissing-prototypes is used for quite a few integration
    tests. (detail)
    by aaronpuchert
  14. [clang-tidy] Split fuchsia-default-arguments

    Splits fuchsia-default-arguments check into two checks. fuchsia-default-arguments-calls warns if a function or method is called with default arguments. fuchsia-default-arguments-declarations warns if a function or method is declared with default parameters.

    Committed on behalf of @diegoast (Diego Astiazarán).

    Resolves b38051.

    Differential Revision: https://reviews.llvm.org/D62437 (detail)
    by juliehockett
  15. [clangd] Return vector<TextEdit> from applyTweak. NFC

    For the same reasons as r363150, which got overwritten by changes in
    r363680.

    Sending without review to unbreak our integrate. (detail)
    by ibiryukov
  16. [clangd] Remove the extra ";", NFC (detail)
    by hokein
  17. [clangd] Add hidden tweaks to dump AST/selection.

    Summary:
    This introduces a few new concepts:
    - tweaks have an Intent (they don't all advertise as refactorings)
    - tweaks may produce messages (for ShowMessage notification). Generalized
       Replacements -> Effect.
    - tweaks (and other features) may be hidden (clangd -hidden-features flag).
       We may choose to promote these one day. I'm not sure they're worth their own
       feature flags though.

    Verified it in vim-clangd (not yet open source), curious if the UI is ok in VSCode.

    Reviewers: ilya-biryukov

    Subscribers: mgorny, MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D62538 (detail)
    by sammccall
  18. [clangd] Add a capability to enable completions with fixes.

    Reviewers: ilya-biryukov

    Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63091 (detail)
    by sammccall
  19. [clangd] Parse files without extensions if we don't have a compile command.

    Summary: This would enable clangd for C++ standard library files.

    Reviewers: sammccall

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63481 (detail)
    by hokein
  20. [clangd] Detect C++ language based on well-known file path in vscode extension

    Summary:
    Matching the "C++" pattern on the first line of the file doesn't cover
    all cases; MSVC C++ headers don't have such a pattern. This patch
    introduces a new heuristic to detect the language based on the file path.

    MSVC C++ standard headers are in the directory like
    "c:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\VC\Tools\MSVC\14.15.26726\include"

    Reviewers: sammccall

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63483 (detail)
    by hokein
  21. [clangd] Perform merge for main file symbols.

    Summary:
    Previously, we randomly picked one main file symbol in the dynamic index, so we
    could lose the ideal symbol (with definition location) in the index.

    It fixes the issue where sometimes we fail to go to the symbol definition, see:

    1. call go-to-decl on Foo in Foo.cpp
    2. jump to Foo.h, call go-to-def on Foo in Foo.h

    we can't go back to Foo.cpp, because both Foo.cpp and Foo.h are open in clangd,
    both files have a Foo symbol (one with def&decl, one with decl only), and we
    randomly choose one.

    Reviewers: kadircet

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63425 (detail)
    by hokein
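    A minimal sketch of the merge idea, under stated assumptions: symbols are
    modeled as plain dictionaries with optional `declaration` and `definition`
    locations, and `merge_symbol` is a hypothetical helper, not clangd's
    actual merge function. The point is only that merging keeps the
    definition location no matter which record is seen first.

```python
def merge_symbol(a, b):
    """Merge two records for the same symbol, keeping every known location."""
    return {
        "name": a["name"],
        "declaration": a.get("declaration") or b.get("declaration"),
        "definition": a.get("definition") or b.get("definition"),
    }


# Hypothetical records: Foo.h only declares Foo, Foo.cpp also defines it.
from_header = {"name": "Foo", "declaration": "Foo.h"}
from_source = {"name": "Foo", "declaration": "Foo.h", "definition": "Foo.cpp"}

print(merge_symbol(from_header, from_source)["definition"])
print(merge_symbol(from_source, from_header)["definition"])
```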
  22. [clangd] Bump vscode-clangd v0.0.15.

    CHANGELOG:
    - support detecting C++ language from first line (`-*- C++ -*-`) of the file. (detail)
    by hokein
  23. [clangd] Detect C++ for extension-less source files in vscode extension

    Summary:
    Extend our extension to support detecting these files as C++ files based on the first
    line (`-*- C++ -*-`), it will make clangd work on C++ standard headers
    (e.g. iostream).

    We use the contributes.languages[1] to enrich the builtin VScode C++
    support.

    [1]: https://code.visualstudio.com/api/references/contribution-points#contributes.languages

    Reviewers: kadircet

    Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63397 (detail)
    by hokein
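    The first-line check can be sketched as follows. The regular expression is
    an assumption made for illustration and is not copied from the extension;
    it accepts the Emacs-style mode marker with optional spaces around "C++".
    The name `looks_like_cxx` is hypothetical.

```python
import re

# Assumed pattern: the Emacs-style mode marker, with optional spaces around "C++".
CXX_MODE_LINE = re.compile(r"-\*-\s*C\+\+\s*-\*-")


def looks_like_cxx(first_line):
    """Return True if the first line of a file carries the C++ mode marker."""
    return CXX_MODE_LINE.search(first_line) is not None


print(looks_like_cxx("// -*- C++ -*-"))
print(looks_like_cxx("#include <stdio.h>"))
```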
  24. [docs] Fix another bot warning by adding a blank line to separate the `option::` command from the text below. (detail)
    by dhinton
Revision: 362564
Changes
  1. [asan] Avoid two compiler-synthesized calls to memset & memcpy

    Otherwise the tests hang on Windows attempting to report nested errors.

    Reviewed By: vitalybuka

    Differential Revision: https://reviews.llvm.org/D63627 (detail)
    by rnk
  2. [libFuzzer] split DataFlow.cpp into two .cpp files, one of which can be compiled w/o dfsan to speed things up (~25% speedup) (detail)
    by kcc
  3. [libFuzzer] ensure that DFT and autofocus work for C++ (mangled) functions (detail)
    by kcc
  4. Specify log level for CMake messages (less stderr)

    Summary:
    Specify message levels in CMake. Prefer STATUS (stdout).

    As the default message mode (i.e. level) is NOTICE in CMake, more messages than necessary get printed to stderr. Some tools, notably ccmake, treat this as an error and require additional confirmation and re-running CMake's configuration step.

    This commit specifies a mode (either STATUS or WARNING or FATAL_ERROR) instead of the default.

    * I used `csearch -f 'llvm-project/.+(CMakeLists\.txt|cmake)' -l 'message\("'` to find all locations.
    * Reviewers were chosen by the most common authors of specific files. If there are more suitable reviewers for these CMake changes, please let me know.

    Patch by: Christoph Siedentop

    Reviewers: zturner, beanz, xiaobai, kbobyrev, lebedev.ri, sgraenitz

    Reviewed By: sgraenitz

    Subscribers: mgorny, lebedev.ri, #sanitizers, lldb-commits, llvm-commits

    Tags: #sanitizers, #lldb, #llvm

    Differential Revision: https://reviews.llvm.org/D63370 (detail)
    by stefan.graenitz
  5. [libFuzzer] Remove too aggressive static_assert in FuzzedDataProvider.

    Summary:
    http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/31

    error: static_assert failed due to requirement
    'std::numeric_limits<char>::is_signed' "Destination type must be
    signed."
        static_assert(std::numeric_limits<TS>::is_signed,
        ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /home/buildslave/buildslave/clang-cmake-aarch64-full/llvm/projects/compiler-rt/lib/fuzzer/utils/FuzzedDataProvider.h:126:19:
    note: in instantiation of function template specialization
    'FuzzedDataProvider::ConvertUnsignedToSigned<char, unsigned char>'
    requested here
          char next = ConvertUnsignedToSigned<char>(data_ptr_[0]);
                      ^
    1 error generated.

    Reviewers: Dor1s

    Reviewed By: Dor1s

    Subscribers: javed.absar, kristof.beyls, delcypher, #sanitizers, llvm-commits

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D63553 (detail)
    by dor1s
  6. Revert r363633 "[CMake] Fix the value of `config.target_cflags` for non-macOS Apple platforms. Attempt #2."

    This caused Chromium's clang package to stop building, see comment on
    https://reviews.llvm.org/D61242 for details.

    > Summary:
    > The main problem here is that `-*-version_min=` was not being passed to
    > the compiler when building test cases. This can cause problems when
    > testing on devices running older OSs because Clang would previously
    > assume the minimum deployment target is the latest OS in the SDK
    > which could be much newer than what the device is running.
    >
    > Previously the generated value looked like this:
    >
    > `-arch arm64 -isysroot
    > <path_to_xcode>/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk`
    >
    > With this change it now looks like:
    >
    > `-arch arm64 -stdlib=libc++ -miphoneos-version-min=8.0 -isysroot
    > <path_to_xcode>/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk`
    >
    > This mirrors the setting of `config.target_cflags` on macOS.
    >
    > This change is made for ASan, LibFuzzer, TSan, and UBSan.
    >
    > To implement this a new `get_test_cflags_for_apple_platform()` function
    > has been added that when given an Apple platform name and architecture
    > returns a string containing the C compiler flags to use when building
    > tests. This also calls a new helper function `is_valid_apple_platform()`
    > that validates Apple platform names.
    >
    > This is the second attempt at landing the patch. The first attempt (r359305)
    > had to be reverted (r359327) due to a buildbot failure. The problem was
    > that calling `get_test_cflags_for_apple_platform()` can trigger a CMake
    > error if the provided architecture is not supported by the current
    > CMake configuration. Previously, this could be triggered by passing
    > `-DCOMPILER_RT_ENABLE_IOS=OFF` to CMake. The root cause is that we were
    > generating test configurations for a list of architectures without
    > checking if the relevant Sanitizer actually supported that architecture.
    > We now intersect the list of architectures for an Apple platform
    > with `<SANITIZER>_SUPPORTED_ARCH` (where `<SANITIZER>` is a Sanitizer
    > name) to iterate through the correct list of architectures.
    >
    > rdar://problem/50124489
    >
    > Reviewers: kubamracek, yln, vsk, juliehockett, phosek
    >
    > Subscribers: mgorny, javed.absar, kristof.beyls, #sanitizers, llvm-commits
    >
    > Tags: #llvm, #sanitizers
    >
    > Differential Revision: https://reviews.llvm.org/D61242 (detail)
    by hans
  7. [Sanitizers] Fix sanitizer_posix_libcdep.cc compilation on Solaris 11.5

    A recent build of Solaris 11.5 Beta (st_047) gained madvise(MADV_DONTDUMP)
    support for Linux compatibility.  This broke the compiler-rt build:

      /vol/llvm/src/llvm/dist/projects/compiler-rt/lib/sanitizer_comm/sanitizer_posix_libcdep.cc: In function ‘bool __sanitizer::DontDumpShadowMemory(__sanitizer::uptr, __sanitizer::uptr)’:
      /vol/llvm/src/llvm/dist/projects/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cc:81:18: error: invalid conversion from ‘void*’ to ‘caddr_t’ {aka ‘char*’} [-fpermissive]
         81 |   return madvise((void *)addr, length, MADV_DONTDUMP) == 0;
            |                  ^~~~~~~~~~~~
            |                  |
            |                  void*
      In file included from
    /vol/llvm/src/llvm/dist/projects/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cc:32:
      /usr/include/sys/mman.h:231:20: note: initializing argument 1 of ‘int
    madvise(caddr_t, std::size_t, int)’
        231 | extern int madvise(caddr_t, size_t, int);
            |                    ^~~~~~~

    The obvious fix is to use the same solution that has already been used a
    couple of lines earlier:

      // In the default Solaris compilation environment, madvise() is declared
      // to take a caddr_t arg; casting it to void * results in an invalid
      // conversion error, so use char * instead.

    This allowed the compiler-rt build to finish and was tested successfully on
    i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

    Differential Revision: https://reviews.llvm.org/D62892 (detail)
    by ro
  8. Don't crash if PR_SET_VMA_ANON_NAME fails.

    This prctl is not implemented on very old devices.
    It is not necessary for the core functionality of the tool. Simply
    ignore the failure. (detail)
    by eugenis
  9. [libFuzzer] Improve FuzzedDataProvider helper.

    Summary:
    The following changes are made based on the feedback from Tim King:
    - Removed default template parameters, to have fewer assumptions.
    - Implemented `ConsumeBytesWithTerminator` method.
    - Made `PickValueInArray` method work with `initializer_list` argument.
    - Got rid of `data_type` type alias, that was redundant.
    - Refactored `ConsumeBytes` logic into a private method for better code reuse.
    - Replaced implementation defined unsigned to signed conversion.
    - Fixed `ConsumeRandomLengthString` to always call `shrink_to_fit`.
    - Clarified and fixed some comments.
    - Applied clang-format to both the library and the unittest source.

    Tested on Linux, Mac, Windows.

    Reviewers: morehouse, metzman

    Reviewed By: morehouse

    Subscribers: delcypher, #sanitizers, llvm-commits, kcc

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D63348 (detail)
    by dor1s
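    The list above mentions replacing an implementation-defined
    unsigned-to-signed conversion. The arithmetic such a conversion has to
    produce can be sketched as below. This is only the value mapping for a
    two's-complement target, written with arbitrary-precision integers; it is
    not the code from the patch, and `to_signed` is a hypothetical name.

```python
def to_signed(value, bits):
    """Map an unsigned `bits`-wide integer to its two's-complement signed value."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given width")
    signed_max = (1 << (bits - 1)) - 1
    # Values up to signed_max map to themselves; larger ones wrap to negatives.
    return value if value <= signed_max else value - (1 << bits)


print(to_signed(127, 8))
print(to_signed(128, 8))
print(to_signed(255, 8))
```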
  10. [scudo][standalone] Fuchsia related changes

    Summary:
    Fuchsia wants to use mutexes with PI in the Scudo code, as opposed to
    our own implementation. This required making `lock` & `unlock` platform
    specific (as opposed to `wait` & `wake`) [code courtesy of John
    Grossman].
    There is an additional flag required now for mappings as well:
    `ZX_VM_ALLOW_FAULTS`.

    Reviewers: morehouse, mcgrathr, eugenis, vitalybuka, hctim

    Reviewed By: morehouse

    Subscribers: delcypher, jfb, #sanitizers, llvm-commits

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D63435 (detail)
    by cryptoad
  11. [compiler-rt][SystemZ] Work around ASAN failures via -fno-partial-inlining

    Since updating the SystemZ LLVM build bot system to Ubuntu 18.04, all bots
    are red due to two ASAN failures.  It turns out these are triggered due to
    building the ASAN support libraries, in particular the interceptor routines
    using GCC 7.  Specifically, at least on our platform, this compiler decides
    to "partially inline" some of those interceptors, creating intermediate
    stub routines like "__interceptor_recvfrom.part.321".  These will show up
    in the backtraces at interception points, causing testsuite failures.

    As a workaround to get the build bots green again, this patch adds the
    -fno-partial-inlining command line option when building the common
    sanitizer support libraries on s390x, if that option is supported by
    the compiler. (detail)
    by uweigand
  12. Disable recently added Darwin symbolization tests for iOS.

    These tests won't necessarily work because the reported modules paths
    from the device don't match what's on the host and so offline
    symbolization fails. (detail)
    by delcypher
  13. [NFC] Split `Darwin/asan-symbolize-partial-report-with-module-map.cc`.

    Split `Darwin/asan-symbolize-partial-report-with-module-map.cc` into two
    separate test cases due to them testing slightly different things. (detail)
    by delcypher
  14. [asan_symbolize] Teach `asan_symbolize.py` to symbolicate partially symbolicated ASan reports.

    Summary:
    The use case here is to be able to symbolicate ASan reports that might be
    partially symbolicated, in particular where the function name is known but no source
    location is available. This can be caused by missing debug info. Previously we
    would only try to symbolicate completely unsymbolicated reports.

    The code currently contains an unfortunate quirk to handle a darwin
    specific bug (rdar://problem/49784442) in the way partially symbolicated
    reports are emitted when the source location is missing.

    rdar://problem/49476995

    Reviewers: kubamracek, yln, samsonov, dvyukov, vitalybuka

    Subscribers: aprantl, #sanitizers, llvm-commits

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D60533 (detail)
    by delcypher
  15. hwasan: Use bits [3..11) of the ring buffer entry address as the base stack tag.

    This saves roughly 32 bytes of instructions per function with stack objects
    and causes us to preserve enough information that we can recover the original
    tags of all stack variables.

    Now that stack tags are deterministic, we no longer need to pass
    -hwasan-generate-tags-with-calls during check-hwasan. This also means that
    the new stack tag generation mechanism is exercised by check-hwasan.

    Differential Revision: https://reviews.llvm.org/D63360 (detail)
    by pcc
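    The half-open bit range [3..11) in the title covers eight bits. A minimal
    sketch of extracting those bits from an address follows; it shows only the
    bit arithmetic, not the instrumentation emitted by hwasan, and
    `base_stack_tag` is a hypothetical name.

```python
def base_stack_tag(entry_address):
    """Extract bits [3, 11) of an address: drop the low 3 bits, keep the next 8."""
    return (entry_address >> 3) & 0xFF


print(hex(base_stack_tag(0x7F8)))
print(hex(base_stack_tag(0x800)))
```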
  16. [CMake] Fix the value of `config.target_cflags` for non-macOS Apple platforms. Attempt #2.

    Summary:
    The main problem here is that `-*-version_min=` was not being passed to
    the compiler when building test cases. This can cause problems when
    testing on devices running older OSs because Clang would previously
    assume the minimum deployment target is the latest OS in the SDK
    which could be much newer than what the device is running.

    Previously the generated value looked like this:

    `-arch arm64 -isysroot
    <path_to_xcode>/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk`

    With this change it now looks like:

    `-arch arm64 -stdlib=libc++ -miphoneos-version-min=8.0 -isysroot
    <path_to_xcode>/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.1.sdk`

    This mirrors the setting of `config.target_cflags` on macOS.

    This change is made for ASan, LibFuzzer, TSan, and UBSan.

    To implement this a new `get_test_cflags_for_apple_platform()` function
    has been added that when given an Apple platform name and architecture
    returns a string containing the C compiler flags to use when building
    tests. This also calls a new helper function `is_valid_apple_platform()`
    that validates Apple platform names.

    This is the second attempt at landing the patch. The first attempt (r359305)
    had to be reverted (r359327) due to a buildbot failure. The problem was
    that calling `get_test_cflags_for_apple_platform()` can trigger a CMake
    error if the provided architecture is not supported by the current
    CMake configuration. Previously, this could be triggered by passing
    `-DCOMPILER_RT_ENABLE_IOS=OFF` to CMake. The root cause is that we were
    generating test configurations for a list of architectures without
    checking if the relevant Sanitizer actually supported that architecture.
    We now intersect the list of architectures for an Apple platform
    with `<SANITIZER>_SUPPORTED_ARCH` (where `<SANITIZER>` is a Sanitizer
    name) to iterate through the correct list of architectures.

    rdar://problem/50124489

    Reviewers: kubamracek, yln, vsk, juliehockett, phosek

    Subscribers: mgorny, javed.absar, kristof.beyls, #sanitizers, llvm-commits

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D61242 (detail)
    by delcypher
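    The fix described at the end of this message is a list intersection. A
    sketch under stated assumptions: the real logic lives in CMake, the
    architecture lists below are made up for illustration, and
    `archs_to_test` is a hypothetical helper that keeps the platform's
    ordering.

```python
def archs_to_test(platform_archs, sanitizer_supported_archs):
    """Keep only the platform architectures the sanitizer supports, in order."""
    supported = set(sanitizer_supported_archs)
    return [arch for arch in platform_archs if arch in supported]


# Hypothetical inputs: architectures for a platform and for one sanitizer.
print(archs_to_test(["armv7", "armv7s", "arm64"], ["arm64", "x86_64"]))
```

    Iterating over the intersection avoids generating a test configuration
    for an architecture that the sanitizer in question does not support.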
  17. [GWP-ASan] Disable GWP-ASan on Android for now.

    Summary:
    Temporarily disable GWP-ASan for android until the bugs at:
    http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/87
    ... can be fixed. See comments for the full bug trace.

    Reviewers: eugenis

    Reviewed By: eugenis

    Subscribers: srhines, kubamracek, mgorny, cryptoad, jfb, #sanitizers, llvm-commits

    Tags: #sanitizers, #llvm

    Differential Revision: https://reviews.llvm.org/D63460 (detail)
    by hctim
  18. Stop counting pops in tsan/check_analyze.sh.

    Summary:
    It looks like LLVM has started doing less tail duplication in this code,
    or something like that, resulting in a significantly smaller number of
    pop instructions (16 -> 12). Removing the check.

    Reviewers: vitalybuka, dvyukov

    Subscribers: kubamracek, #sanitizers, llvm-commits

    Tags: #sanitizers, #llvm

    Differential Revision: https://reviews.llvm.org/D63450 (detail)
    by eugenis
  19. Attempt to fix GWP-ASan build failure on sanitizer-android. Add -fPIC. (detail)
    by hctim
  20. [GWP-ASan] Integration with Scudo [5].

    Summary:
    See D60593 for further information.

    This patch adds GWP-ASan support to the Scudo hardened allocator. It also
    implements end-to-end integration tests using Scudo as the backing allocator.
    The tests include crash handling for buffer over/underflow as well as
    use-after-free detection.

    Reviewers: vlad.tsyrklevich, cryptoad

    Reviewed By: vlad.tsyrklevich, cryptoad

    Subscribers: kubamracek, mgorny, #sanitizers, llvm-commits, morehouse

    Tags: #sanitizers, #llvm

    Differential Revision: https://reviews.llvm.org/D62929 (detail)
    by hctim
  21. [scudo][standalone] Introduce the combined allocator

    Summary:
    The Combined allocator holds together all the other components, and
    provides a memory allocator interface based on various template
    parameters. This will be in turn used by "wrappers" that will provide
    the standard C and C++ memory allocation functions, but can be
    used as is as well.

    This doesn't depart significantly from the current Scudo implementation
    except for a few details:
    - Quarantine batches are now protected by a header as well;
    - an Allocator instance has its own TSD registry, as opposed to a
      static one for everybody;
    - a function to iterate over busy chunks has been added, for Android
      purposes;

    This also adds the associated tests, and a few default configurations
    for several platforms, that will likely be further tuned later on.

    Reviewers: morehouse, hctim, eugenis, vitalybuka

    Reviewed By: morehouse

    Subscribers: srhines, mgorny, delcypher, jfb, #sanitizers, llvm-commits

    Tags: #llvm, #sanitizers

    Differential Revision: https://reviews.llvm.org/D63231 (detail)
    by cryptoad
Revision: 362564
Changes
  1. Use rvalue references throughout the is_constructible traits. (detail)
    by ericwf
  2. Make move and forward work in C++03.

    These functions are key to allowing the use of rvalues and variadics
    in C++03 mode. Everything works the same as in C++11, except for one
    tangentially related case:

    struct T {
      T(T &&) = default;
    };

    In C++11, T has a deleted copy constructor. But in C++03 Clang gives
    it both a move and a copy constructor. This seems reasonable enough
    given the extensions it's using.

    The other changes in this patch were the minimal set required
    to keep the tests passing after the move/forward change. Most notably
    the removal of the `__rv<unique_ptr>` hack that was present
    in an attempt to make unique_ptr move only without language support. (detail)
    by ericwf
  3. Enable aligned_union in C++03 (detail)
    by ericwf
  4. Get is_convertible tests passing in C++03 (except the fallback). (detail)
    by ericwf
  5. Remove dead non-variadic workarounds in <type_traits>

    We can use variadics with clang (detail)
    by ericwf
  6. Make rvalue metaprogramming traits work in C++03.

    The next step is to get move and forward working in C++03. (detail)
    by ericwf
  7. Remove even more dead code. (detail)
    by ericwf
  8. Assume __is_final, __is_base_of, and friends.

    All the compilers we support provide these builtins. We don't
    need to do a configuration dance anymore.

    This patch also cleans up some dead or almost dead
    C++11 feature detection macros. (detail)
    by ericwf
  9. Remove dead config now that C++03 requires Clang. (detail)
    by ericwf
  10. [libc++] Avoid using timespec when it might not be available

    Summary:
    The type timespec is unconditionally used in __threading_support.
    Since the C library is only required to provide it in C11, this might
    cause problems for platforms with an external thread porting layer (i.e.
    when _LIBCPP_HAS_THREAD_API_EXTERNAL is defined) with pre-C11
    C libraries.

    In our downstream port of libc++ we used to provide a definition of
    timespec in __external_threading, but this solution is not ideal
    because timespec is not a reserved name.

    This patch renames timespec into __libcpp_timespec_t in the
    thread-related parts of libc++. For all cases except external
    threading this type is an alias for ::timespec (and no functional
    changes are intended).

    In case of external threading it is expected that the
    __external_threading header will either provide a similar typedef (if
    timespec is available in the vendor's C library) or provide a
    definition of __libcpp_timespec_t compatible with POSIX timespec.

    Reviewers: ldionne, mclow.lists, EricWF

    Reviewed By: ldionne

    Subscribers: dexonsmith, libcxx-commits, christof, carwil

    Tags: #libc

    Differential Revision: https://reviews.llvm.org/D63328 (detail)
    by miyuki
  11. [libc++] Recommit r363692 to implement P0608R3

    Re-apply the change which was reverted in r363764 as-is, now that the
    breakages have been resolved.  Thanks Eric Fiselier for working
    hard on this.

    See also: https://bugs.llvm.org/show_bug.cgi?id=42330

    Differential Revision: https://reviews.llvm.org/D44865 (detail)
    by lichray
  12. [libc++] Take 2: Implement CTAD for map and multimap

    This is a re-application of r362986 (which was reverted in r363688) with fixes
    for the issue that caused it to be reverted.

    Thanks to Arthur O'Dwyer for the patch.

    Differential Revision: https://reviews.llvm.org/D58587 (detail)
    by Louis Dionne
  13. AIX system headers need stdint.h and inttypes.h to be re-enterable

    Summary:
    AIX system headers need stdint.h and inttypes.h to be re-enterable when macro _STD_TYPES_T is defined so that limit macro definitions such as UINT32_MAX can be found. This patch attempts to allow that on AIX.

    Reviewers: hubert.reinterpretcast, jasonliu, mclow.lists, EricWF

    Reviewed by: hubert.reinterpretcast, mclow.lists

    Subscribers: jfb, jsji, christof, cfe-commits, libcxx-commits, llvm-commits

    Tags: #LLVM, #clang, #libc++

    Differential Revision: https://reviews.llvm.org/D59253 (detail)
    by xingxue
  14. [NFC][libc++] Remove stray semi-colon after function definition (detail)
    by Louis Dionne
  15. Mark papers P1458, P1459, P1462 and P1464 as complete. No changes needed to either the library or the tests. (detail)
    by marshall
  16. [libc++] Revert r363692 which implements P0608R3

    The change caused a large number of compiler failures in
    Google's codebase.  People need time to evaluate the impact. (detail)
    by lichray
  17. Disable the 'nextafter' portions of these tests on PPC when using 128-bit doubles because the 'nextafter' call doesn't work right. Reviewed as https://reviews.llvm.org/D62384. Thanks to Xing Xue for the patch, and Hubert for the explanation. (detail)
    by marshall
  18. Remove GCC C++03 fallbacks for decltype and static_assert.

    This means libc++ no longer needs to write extra braces in
    static asserts: Ex `static_assert((is_same_v<T, V>), "msg")`. (detail)
    by ericwf
  19. Reconfigure docker builders to be more modular.

    And other various cleanups to the configuration. (detail)
    by ericwf
  20. Fix the floating point version of midpoint. It wasn't constexpr, among other things. Add more tests. As a drive-by, the LCD implementation had a class named '__abs' which did an 'absolute value to a common-type' conversion. Rename that to be '__ct_abs'. (detail)
    by marshall
  21. [libc++] Implement P0608R3 - A sane variant converting constructor

    Summary:
    Prefer user-defined conversions over narrowing conversions and conversions to bool.

    References:
    http://wg21.link/p0608

    Reviewers: EricWF, mpark, mclow.lists

    Reviewed By: mclow.lists

    Subscribers: zoecarver, ldionne, libcxx-commits, cfe-commits, christof

    Differential Revision: https://reviews.llvm.org/D44865 (detail)
    by lichray
  22. [libc++] Re-apply XFAIL to is_base_of test that was inadvertently reverted (detail)
    by Louis Dionne
  23. [libc++] Revert the addition of map/multimap CTAD

    This was found to be broken on Clang trunk. This is a revert of the
    following commits (the subsequent commits added XFAILs to the tests
    that were missing from the original submission):

        r362986: Implement deduction guides for map/multimap.
        r363014: Add some XFAILs
        r363097: Add more XFAILs
        r363197: Add even more XFAILs (detail)
    by Louis Dionne
  24. [NFC] Assign a couple of LWG issues to myself (detail)
    by Louis Dionne
  25. [libc++] Update ABI list for ABI v2

    I forgot to add symbols for filesystem. (detail)
    by Louis Dionne
  26. Update status of issue 3209 (detail)
    by marshall
  27. Add tests for LWG 3206. NFC (detail)
    by marshall
  28. Update the meeting page with papers/issues that are ready for Cologne (detail)
    by marshall
  29. Fix a '>= 0' test on unsigned that I inadvertently introduced. Now correctly '!= 0'. Thanks to Arthur for the catch (detail)
    by marshall
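
Item 29 above replaces a '>= 0' test on an unsigned value with '!= 0'. The diff for that commit is not shown in this log, so the following is only a minimal sketch of why such a test is a bug in general. The function names are hypothetical and are not taken from libc++.

```cpp
#include <cassert>
#include <limits>

// Hypothetical guard, for illustration only (not libc++ code).
// An unsigned value can never be negative, so this comparison is true
// for every input and cannot distinguish any case. Compilers may warn
// about the tautological comparison.
constexpr bool always_true_guard(unsigned x) { return x >= 0; }

// Hypothetical corrected guard: true only for non-zero values.
constexpr bool nonzero_guard(unsigned x) { return x != 0; }

// Compile-time checks of the two behaviours at zero.
static_assert(always_true_guard(0u), "x >= 0 holds even for zero");
static_assert(!nonzero_guard(0u), "x != 0 rejects zero");

// Runtime checks over the boundary values of unsigned.
inline void demo_guards() {
  assert(always_true_guard(std::numeric_limits<unsigned>::max()));
  assert(nonzero_guard(1u));
  assert(nonzero_guard(std::numeric_limits<unsigned>::max()));
}
```

Calling demo_guards() from a test exercises both guards at the extremes of the unsigned range. Only the second guard ever returns false.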

Started by upstream project relay-test-suite-verify-machineinstrs build number 5435
originally caused by:

This run spent:

  • 4 hr 54 min waiting;
  • 5 hr 10 min build duration;
  • 5 hr 11 min total from scheduled to completion.