Started 1 mo 3 days ago
Took 54 min on green-dragon-02

Failed Build #14709 (Sep 16, 2019 12:59:50 PM)

  • : 372025
  • : 372026
  • : 371997
  • : 371835
  • : 372027
  • : 372008
  1. Open fstream files in O_CLOEXEC mode when possible.

    Reviewers: EricWF, mclow.lists, ldionne

    Reviewed By: ldionne

    Subscribers: smeenai, dexonsmith, christof, ldionne, libcxx-commits

    Tags: #libc

    Differential Revision: (detail)
    by danalbert
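
As a rough illustration of what close-on-exec means at the file-descriptor level, the sketch below opens a file with `O_CLOEXEC` when the platform defines it, then checks the `FD_CLOEXEC` flag. The helper names `open_cloexec` and `is_cloexec` are hypothetical. This uses plain POSIX `open`/`fcntl` and is not the libc++ change itself.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not libc++ code): open `path` read-only and request
// close-on-exec atomically when the platform defines O_CLOEXEC.
int open_cloexec(const char* path) {
  int flags = O_RDONLY;
#ifdef O_CLOEXEC
  flags |= O_CLOEXEC;  // descriptor is not inherited across exec()
#endif
  return ::open(path, flags);
}

// Hypothetical helper: report whether `fd` has the FD_CLOEXEC flag set.
bool is_cloexec(int fd) {
  int fdflags = ::fcntl(fd, F_GETFD);
  return fdflags != -1 && (fdflags & FD_CLOEXEC) != 0;
}
```

Setting the flag at open time avoids the window between `open` and a later `fcntl(fd, F_SETFD, FD_CLOEXEC)` in which another thread could fork and exec.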
  2. do not emit -Wunused-macros warnings in -frewrite-includes mode (PR15614)

    -frewrite-includes calls PP.SetMacroExpansionOnlyInDirectives() to avoid
    macro expansions that are useless in that mode, but this can lead
    to -Wunused-macros false positives. As -frewrite-includes does not emit
    normal warnings, block -Wunused-macros too.

    Differential Revision: (detail)
    by llunak
  3. [Coverage] Speed up file-based queries for coverage info, NFC

    Speed up queries for coverage info in a file by reducing the amount of
    time spent determining whether a function record corresponds to a file.

    This gives a 36% speedup when generating a coverage report for `llc`.
    The reduction is entirely in user time.


    Differential Revision: (detail)
    by Vedant Kumar
  4. [Coverage] Assert that filenames in a TU are unique, NFC (detail)
    by Vedant Kumar
  5. [LTO][Legacy] Add new C interface to query libcall functions

    This is needed to implement the same approach as lld (implemented in r338434)
    for handling symbols that can be generated by the LTO code generator
    but are not present in the symbol table, for linkers that use the legacy C APIs.

    libLTO is in charge of providing the list of symbols. Linker is in
    charge of implementing the eager loading from static libraries using
    the list of symbols.


    Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

    Reviewed By: tejohnson

    Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

    Tags: #llvm

    Differential Revision: (detail)
    by steven_wu
  6. [PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups

    This fixes relocations against __profd_ symbols in discarded sections,
    which is PR41380.

    In general, instrumentation happens very early, and optimization and
    inlining happens afterwards. The counters for a function are calculated
    early, and after inlining, counters for an inlined function may be
    widely referenced by other functions.

    For C++ inline functions of all kinds (linkonce_odr &
    available_externally mainly), instr profiling wants to deduplicate these
    __profc_ and __profd_ globals. Otherwise the binary would be quite large.

    I made __profd_ and __profc_ comdat in r355044, but I chose to make
    __profd_ internal. At the time, I was only dealing with coverage, and in
    that case, none of the instrumentation needs to reference __profd_.
    However, if you use PGO, then instrumentation passes add calls to
    __llvm_profile_instrument_range which reference __profd_ globals. The
    solution is to make these globals externally visible by using
    linkonce_odr linkage for data as was done for counters.

    This is safe because PGO adds a CFG hash to the names of the data and
    counter globals, so if different TUs have different globals, they will
    get different data and counter arrays.

    Reviewers: xur, hans

    Differential Revision: (detail)
    by rnk
  7. [ARM][Codegen] Autogenerate arm-cgp-casts.ll test.

    Apparently it got broken by r372009 while I thought it was r372012. (detail)
    by lebedevri
  8. Implement std::condition_variable via pthread_cond_clockwait() where available

    std::condition_variable is currently implemented via
    pthread_cond_timedwait() on systems that use pthread. This is
    problematic, since that function waits by default on CLOCK_REALTIME
    and libc++ does not provide any mechanism to change from this default.

    Due to this, regardless of whether condition_variable::wait_until() is
    called with a chrono::system_clock or chrono::steady_clock parameter,
    condition_variable::wait_until() will wait using CLOCK_REALTIME. This
    does not conform to the C++ standard, since calling
    condition_variable::wait_until() with a chrono::steady_clock parameter
    should use CLOCK_MONOTONIC.

    This is particularly problematic because CLOCK_REALTIME is a bad
    choice: it is subject to discontinuous time adjustments that may
    cause condition_variable::wait_until() to time out immediately or wait
    indefinitely.

    This change fixes this issue with a newly proposed POSIX function,
    pthread_cond_clockwait(). The new function is
    similar to pthread_cond_timedwait() with the addition of a clock
    parameter that allows it to wait using either CLOCK_REALTIME or
    CLOCK_MONOTONIC, thus allowing condition_variable::wait_until() to
    wait using CLOCK_REALTIME for chrono::system_clock and CLOCK_MONOTONIC
    for chrono::steady_clock.

    pthread_cond_clockwait() is implemented in glibc (2.30 and later) and
    Android's bionic (Android API version 30 and later).

    This change additionally makes wait_for() and wait_until() with clocks
    other than chrono::system_clock use CLOCK_MONOTONIC. (detail)
    by danalbert
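
The user-facing call affected by this change can be sketched as follows. The helper name `wait_times_out_on_steady_clock` is hypothetical and the code uses only the standard `std::condition_variable` API. Which underlying pthread function and clock the library picks is an implementation detail that this sketch does not observe directly.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper for illustration; not part of libc++.
// Waits on a condition variable that nobody signals, with a deadline
// expressed on std::chrono::steady_clock. Returns true if the wait ended
// because the deadline passed.
bool wait_times_out_on_steady_clock(std::chrono::milliseconds timeout) {
  std::mutex m;
  std::condition_variable cv;
  std::unique_lock<std::mutex> lock(m);

  const auto start = std::chrono::steady_clock::now();
  const auto deadline = start + timeout;

  // The predicate never becomes true, so only the deadline can end the wait.
  // The predicate overload also absorbs spurious wakeups.
  const bool signalled = cv.wait_until(lock, deadline, [] { return false; });

  const auto elapsed = std::chrono::steady_clock::now() - start;
  return !signalled && elapsed >= timeout;
}
```

Because the deadline is a `steady_clock` time point, a wall-clock adjustment during the wait should not shorten or lengthen it on an implementation that waits on CLOCK_MONOTONIC, which is the behavior the commit message describes.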
  9. [Clang][Codegen] Disable arm_acle.c test.

    This test is broken by design. Clang codegen tests should not depend
    on llvm middle-end behaviour, they should *only* test clang codegen.
    Yet this test runs the whole optimization pipeline.
    I've really tried to fix it, but there aren't just a few things
    that depend on passes; everything there does. (detail)
    by lebedevri
  10. [Clang][Codegen] Relax available-externally-suppress.c test

    That test is broken by design.
    It depends on llvm middle-end behavior.
    No clang codegen test should be doing that.
    This one is salvageable by relaxing check lines. (detail)
    by lebedevri
  11. [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands

    Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector. (detail)
    by rksimon
  12. [ARM] A predicate cast of a predicate cast is a predicate cast

    This adds some very basic folding of PREDICATE_CASTS, removing cases when they
    are chained together. These would already be removed eventually, as these are
    lowered to copies. This just allows it to happen earlier, which can help other

    Differential Revision: (detail)
    by dmgreen
  13. [OPENMP]Fix parsing/sema for function templates with declare simd.

    Need to return original declaration group with FunctionTemplateDecl, not
    the inner FunctionDecl, to correctly handle parsing of directives with
    the template parameters. (detail)
    by abataev
  14. [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost

    Previously, if the threshold was 2, we were willing to speculatively
    execute 2 cheap instructions in both basic blocks (thus we were willing
    to speculatively execute cost = 4), but weren't willing to speculate
    when one BB had 3 instructions and the other one had no instructions,
    even though that would have a total cost of 3.

    This looks inconsistent to me.
    I don't think `cmov`-like instructions will start executing
    until both of their inputs are available,
    so I don't see why the existing behavior is the correct one.

    Also, let's add its own `cl::opt` for this threshold,
    with default=4, so it is not stricter than the previous threshold:
    it will still allow folding when there are 2 BBs each with cost=2.
    And since the logic has changed, it will also allow folding when
    one BB has cost=3 and the other cost=1, or when there is only one BB with cost=4.

    This is an alternative solution to D65148.
    This fix is mainly motivated by `signbit-like-value-extension.ll` test.
    That pattern comes up in JPEG decoding, see e.g.
    `Figure F.12 – Extending the sign bit of a decoded value in V`
    of `ITU T.81` (JPEG specification).
    That branch is not predictable, and it is within the innermost loop,
    so the fact that that pattern ends up being stuck with a branch
    instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

    This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
    | metric                                 |     old |     new | delta |      % |
    | x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
    | x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
    | x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
    | x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
    | x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
    | x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
    | x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

    We significantly decrease basic block count, notably decrease instruction count,
    significantly decrease branch count and very significantly increase `cmov` count.

    Performance-wise, unsurprisingly, this has a great effect on
    the target RawSpeed benchmark. I'm seeing 5 **major** improvements:
    Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
    Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
    Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
    Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
    Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
    Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
    Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
    Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
    Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
    Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
    Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
    Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
    Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
    Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
    Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
    Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
    Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
    Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
    Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
    Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
    Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681

    That being said, it's always best-effort, so there will likely
    be cases where this worsens things.

    Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

    Reviewed By: jmolloy

    Subscribers: xbolva00, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: (detail)
    by lebedevri
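
The difference between the per-BB rule and the total-cost rule described in item 14 can be modeled with two small predicates. This is a hypothetical model for illustration only: the function names and the plain integer costs are assumptions, and it is not the SimplifyCFG code.

```cpp
#include <cassert>

// Hypothetical model of the old rule: each basic block is checked against
// the threshold on its own.
bool folds_per_block(int cost_a, int cost_b, int per_block_threshold) {
  return cost_a <= per_block_threshold && cost_b <= per_block_threshold;
}

// Hypothetical model of the new rule: the sum of both blocks' costs is
// checked against a single threshold.
bool folds_total(int cost_a, int cost_b, int total_threshold) {
  return cost_a + cost_b <= total_threshold;
}
```

With a per-block threshold of 2, the old rule accepts costs (2, 2), a total of 4, but rejects (3, 0), a total of 3. With a total threshold of 4, the new rule accepts (2, 2), (3, 1) and (4, 0), and rejects (3, 2).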
  15. [clangd] Simplify semantic highlighting visitor

    - Functions to compute highlighting kinds for things are separated from
      the ones that add highlighting tokens.
      This keeps each of them more focused on what they're doing: getting
      locations and figuring out the kind of the entity, correspondingly.

    - Fewer special cases in the visitor for various nodes.

    This change is an NFC.

    Reviewers: hokein

    Reviewed By: hokein

    Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

    Tags: #clang

    Differential Revision: (detail)
    by ibiryukov
  16. [InstCombine] remove unneeded one-use checks for icmp fold

    Related folds were added in:
    ...the code comment about register pressure is discussed in
    more detail in:

    But 10 years later, perf testing bzip2 with this change now
    shows a slight (0.2% average) improvement on Haswell although
    that's probably within test noise.

    Given that this is IR canonicalization, we shouldn't be worried
    about register pressure though; the backend should be able to
    adjust for that as needed.

    This is part of solving PR43310 the theoretically right way: if we don't cripple basic transforms, then we won't
    need to add special-case code to detect larger patterns.

    rL371940 and rL371981 are related patches in this series. (detail)
    by spatel

Started by timer (4 times)

This run spent:

  • 3 hr 48 min waiting;
  • 54 min build duration;
  • 4 hr 43 min total from scheduled to completion.

Identified problems

Compile Error

This build failed because of a compile error. Below is a list of all errors in the build log:
Indication 1

Missing test results

The test result file Jenkins is looking for does not exist after the build.
Indication 2

Ninja target failed

Below is a link to the first failed ninja target.
Indication 3