Started 7 mo 29 days ago
Took 3 min 42 sec on green-dragon-15

Failed Build #306 (Feb 20, 2019 1:37:07 AM)

Revisions
  • http://llvm.org/svn/llvm-project/llvm/trunk : 354441
  • http://llvm.org/svn/llvm-project/cfe/trunk : 354435
  • http://llvm.org/svn/llvm-project/compiler-rt/trunk : 354402
  • http://llvm.org/svn/llvm-project/debuginfo-tests/trunk : 346271
  • http://llvm.org/svn/llvm-project/libcxx/trunk : 354212
  • http://llvm.org/svn/llvm-project/clang-tools-extra/trunk : 354349
Changes
  1. [llvm-exegesis] Opcode stabilization / reclusterization (PR40715)

    Summary:
    Given an instruction `Opcode`, we can make benchmarks (measurements) of the
    instruction characteristics/performance. Then, to facilitate further analysis
    we group the benchmarks with *similar* characteristics into clusters.
    Now, this is all not entirely deterministic. Some instructions have variable
    characteristics, depending on their arguments. And thus, if we do several
    benchmarks of the same instruction `Opcode`, we may end up with *different*
    performance characteristics measurements. And when we then do clustering,
    these several benchmarks of the same instruction `Opcode` may end up being
    clustered into *different* clusters. This is not great for further analysis.

    We shall find every `Opcode` with benchmarks not in just one cluster, and move
    *all* the benchmarks of said `Opcode` into one new unstable cluster per `Opcode`.

    I have solved this by making `ClusterId` a bit field, adding an `IsUnstable` bit,
    and introducing the `-analysis-display-unstable-clusters` switch to toggle between
    displaying stable-only clusters and unstable-only clusters.
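
    A minimal sketch of the reclustering step described above. The `ClusterId`
    layout, the `Point` record and the `makePoint`/`stabilize` helpers are
    hypothetical stand-ins for illustration, not the code from the patch.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <map>
    #include <set>
    #include <vector>

    // Hypothetical stand-in for the cluster id: an index plus an IsUnstable bit
    // packed into one 32-bit word.
    struct ClusterId {
      unsigned Index : 31;
      unsigned IsUnstable : 1;
    };

    struct Point {
      unsigned Opcode;   // opcode of the benchmarked instruction
      ClusterId Cluster; // cluster assigned by the initial clustering
    };

    // Builds a point that sits in a stable cluster with the given index.
    inline Point makePoint(unsigned Opcode, unsigned Index) {
      assert(Index < (1u << 31) && "index must fit in 31 bits");
      Point P;
      P.Opcode = Opcode;
      P.Cluster.Index = Index;
      P.Cluster.IsUnstable = 0;
      return P;
    }

    // Moves all points of every opcode that spans several clusters into one
    // fresh unstable cluster per opcode. Returns how many clusters were created.
    inline std::size_t stabilize(std::vector<Point> &Points, unsigned NextIndex) {
      std::map<unsigned, std::set<unsigned>> ClustersOfOpcode;
      for (const Point &P : Points)
        ClustersOfOpcode[P.Opcode].insert(P.Cluster.Index);

      // Reserve one new cluster index for each opcode seen in > 1 cluster.
      std::map<unsigned, unsigned> UnstableIndexOfOpcode;
      for (const auto &Entry : ClustersOfOpcode)
        if (Entry.second.size() > 1)
          UnstableIndexOfOpcode[Entry.first] = NextIndex++;

      for (Point &P : Points) {
        auto It = UnstableIndexOfOpcode.find(P.Opcode);
        if (It == UnstableIndexOfOpcode.end())
          continue;
        P.Cluster.Index = It->second;
        P.Cluster.IsUnstable = 1;
      }
      return UnstableIndexOfOpcode.size();
    }
    ```

    In this sketch a report printer would simply filter points on
    `Cluster.IsUnstable` to show either the stable or the unstable set.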

    The reclusterization is deterministic: it produces identical reports
    between runs. (Or at least that is what I'm seeing; maybe it isn't.)

    Timings/comparisons:
    old (current trunk/head) {F8303582}
    ```
    $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
    ...
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-old.html'

    Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (25 runs):

               6624.73 msec task-clock                #    0.999 CPUs utilized            ( +-  0.53% )
                   172      context-switches          #   25.965 M/sec                    ( +- 29.89% )
                     0      cpu-migrations            #    0.042 M/sec                    ( +- 56.54% )
                 31073      page-faults               # 4690.754 M/sec                    ( +-  0.08% )
           26538711696      cycles                    # 4006230.292 GHz                   ( +-  0.53% )  (83.31%)
            2017496807      stalled-cycles-frontend   #    7.60% frontend cycles idle     ( +-  0.93% )  (83.32%)
           13403650062      stalled-cycles-backend    #   50.51% backend cycles idle      ( +-  0.33% )  (33.37%)
           19770706799      instructions              #    0.74  insn per cycle
                                                      #    0.68  stalled cycles per insn  ( +-  0.04% )  (50.04%)
            4419821812      branches                  # 667207369.714 M/sec               ( +-  0.03% )  (66.69%)
             121741669      branch-misses             #    2.75% of all branches          ( +-  0.28% )  (83.34%)

                6.6283 +- 0.0358 seconds time elapsed  ( +-  0.54% )
    ```

    patch, with reclustering but without filtering (i.e. outputting all the stable *and* unstable clusters) {F8303586}
    ```
    $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'
    ...
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-all.html'

    Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-all.html' (25 runs):

               6475.29 msec task-clock                #    0.999 CPUs utilized            ( +-  0.31% )
                   213      context-switches          #   32.952 M/sec                    ( +- 23.81% )
                     1      cpu-migrations            #    0.130 M/sec                    ( +- 43.84% )
                 31287      page-faults               # 4832.057 M/sec                    ( +-  0.08% )
           25939086577      cycles                    # 4006160.279 GHz                   ( +-  0.31% )  (83.31%)
            1958812858      stalled-cycles-frontend   #    7.55% frontend cycles idle     ( +-  0.68% )  (83.32%)
           13218961512      stalled-cycles-backend    #   50.96% backend cycles idle      ( +-  0.29% )  (33.37%)
           19752995402      instructions              #    0.76  insn per cycle
                                                      #    0.67  stalled cycles per insn  ( +-  0.04% )  (50.04%)
            4417079244      branches                  # 682195472.305 M/sec               ( +-  0.03% )  (66.70%)
             121510065      branch-misses             #    2.75% of all branches          ( +-  0.19% )  (83.34%)

                6.4832 +- 0.0229 seconds time elapsed  ( +-  0.35% )
    ```
    Funnily, *this* measurement shows that said reclustering actually improved performance.

    patch, with reclustering, only the stable clusters {F8303594}
    ```
    $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'
    ...
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-stable.html'

    Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-stable.html' (25 runs):

               6387.71 msec task-clock                #    0.999 CPUs utilized            ( +-  0.13% )
                   133      context-switches          #   20.792 M/sec                    ( +- 23.39% )
                     0      cpu-migrations            #    0.063 M/sec                    ( +- 61.24% )
                 31318      page-faults               # 4903.256 M/sec                    ( +-  0.08% )
           25591984967      cycles                    # 4006786.266 GHz                   ( +-  0.13% )  (83.31%)
            1881234904      stalled-cycles-frontend   #    7.35% frontend cycles idle     ( +-  0.25% )  (83.33%)
           13209749965      stalled-cycles-backend    #   51.62% backend cycles idle      ( +-  0.16% )  (33.36%)
           19767554347      instructions              #    0.77  insn per cycle
                                                      #    0.67  stalled cycles per insn  ( +-  0.04% )  (50.03%)
            4417480305      branches                  # 691618858.046 M/sec               ( +-  0.03% )  (66.68%)
             118676358      branch-misses             #    2.69% of all branches          ( +-  0.07% )  (83.33%)

                6.3954 +- 0.0118 seconds time elapsed  ( +-  0.18% )
    ```
    Performance improved even further?! Makes sense I guess, fewer clusters to print.

    patch, with reclustering, only the unstable clusters {F8303601}
    ```
    $ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'
    ...
    no exegesis target for x86_64-unknown-linux-gnu, using default
    Parsed 43970 benchmark points
    Printing sched class consistency analysis results to file '/tmp/clusters-new-unstable.html'

    Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=0.5 -benchmarks-file=/home/lebedevri/PileDriver-Sched/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters-new-unstable.html -analysis-display-unstable-clusters' (25 runs):

               6124.96 msec task-clock                #    1.000 CPUs utilized            ( +-  0.20% )
                   194      context-switches          #   31.709 M/sec                    ( +- 20.46% )
                     0      cpu-migrations            #    0.039 M/sec                    ( +- 49.77% )
                 31413      page-faults               # 5129.261 M/sec                    ( +-  0.06% )
           24536794267      cycles                    # 4006425.858 GHz                   ( +-  0.19% )  (83.31%)
            1676085087      stalled-cycles-frontend   #    6.83% frontend cycles idle     ( +-  0.46% )  (83.32%)
           13035595603      stalled-cycles-backend    #   53.13% backend cycles idle      ( +-  0.16% )  (33.36%)
           18260877653      instructions              #    0.74  insn per cycle
                                                      #    0.71  stalled cycles per insn  ( +-  0.05% )  (50.03%)
            4112411983      branches                  # 671484364.603 M/sec               ( +-  0.03% )  (66.68%)
             114066929      branch-misses             #    2.77% of all branches          ( +-  0.11% )  (83.32%)

                6.1278 +- 0.0121 seconds time elapsed  ( +-  0.20% )
    ```
    This tells us that the actual `-analysis-inconsistencies-output-file=` outputting only takes ~0.4 sec for 43970 benchmark points (3 whole sweeps)
    (Also, wow this is fast, it used to take several minutes originally)

    Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40715 | PR40715 ]].

    Reviewers: courbet, gchatelet

    Reviewed By: courbet

    Subscribers: tschuett, jdoerfert, llvm-commits, RKSimon

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58355 (detail/ViewSVN)
    by lebedevri
  2. [RegAllocGreedy] Take last chance recoloring into account in split and assign

    Summary:
    This is a follow-up to r353988 where tryEvict was extended to take last
    chance recoloring into account. Now we do the same thing for trySplit and
    tryAssign.

    Now we always pass a "FixedRegisters" argument to canEvictInterference and
    tryEvict so it doesn't need to have a default value anymore.

    The need for this was found long ago in an out-of-tree target.
    Unfortunately I don't have a reproducer for an in-tree target.

    Reviewers: qcolombet, rudkx

    Reviewed By: qcolombet, rudkx

    Subscribers: rudkx, MatzeB, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58376 (detail/ViewSVN)
    by uabelho
  3. [NFC] add/modify wrapper function for findRegisterDefOperand(). (detail/ViewSVN)
    by shchenz
  4. [DTU] Refine the document of mutation APIs [NFC] (PR40528)

    Summary:
    It was pointed out in [[ https://bugs.llvm.org/show_bug.cgi?id=40528 | Bug 40528 ]] that it is not clear whether insert/deleteEdge can be used to perform multiple updates and [[ https://reviews.llvm.org/D57316#1388344 | a comment in D57316 ]] reveals that the difference between several ways to update the DominatorTree is confusing.

    This patch tries to address the issues above.

    Reviewers: mkazantsev, kuhar, asbirlea, chandlerc, brzycki

    Reviewed By: mkazantsev, kuhar, brzycki

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D57881 (detail/ViewSVN)
    by sima
  5. [X86] Remove FeatureSlowIncDec from Sandy Bridge and later Intel Core CPUs

    Summary:
    Inc and Dec were at one point slow on Intel CPUs due to their tendency to cause partial flag stalls on P6 derived CPU cores. This is because these instructions are defined to preserve the carry flag. This partial flag stall issue persisted until Sandy Bridge when flag merging was changed to be handled as a data dependency instead of as a stall until retirement. Sandy Bridge and later CPUs rename the C flag separately from OSPAZ so there is no flag merge needed on INC/DEC to preserve the C flag.
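
    The flag behavior behind this can be illustrated with a toy model. The
    sketch below tracks only the carry and zero flags and uses hypothetical
    helper names; it shows that `ADD r, 1` rewrites the carry flag while `INC`
    leaves it as it was, which is why older cores had to merge flags.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Toy model of two x86 status flags; the remaining flags are omitted.
    struct Flags {
      bool Carry;
      bool Zero;
    };

    // ADD r32, 1: writes all status flags, including the carry flag.
    inline std::uint32_t addOne(std::uint32_t Value, Flags &F) {
      std::uint32_t Result = Value + 1u;
      F.Carry = Result < Value; // unsigned wrap-around sets CF
      F.Zero = Result == 0;
      return Result;
    }

    // INC r32: updates the other status flags but preserves CF.
    inline std::uint32_t inc(std::uint32_t Value, Flags &F) {
      std::uint32_t Result = Value + 1u;
      F.Zero = Result == 0; // CF is deliberately left untouched
      return Result;
    }
    ```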

    Given these improvements I don't know why INC/DEC was ever considered slow on Sandy Bridge. If anything they should have been disabled on the earlier CPUs instead.

    Note that after this patch, INC/DEC are still considered slow on Silvermont, Goldmont, Knights Landing and our generic "x86-64" CPU.

    Reviewers: spatel, RKSimon, chandlerc

    Reviewed By: chandlerc

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D58412 (detail/ViewSVN)
    by ctopper
  6. Limit new PM tests to X86 registered targets. (detail/ViewSVN)
    by leonardchan
  7. Temporarily Revert "[X86][SLP] Enable SLP vectorization for 128-bit horizontal X86 instructions (add, sub)"

    As this has broken the lto bootstrap build for 3 days and is
    showing a significant regression on the Dither_benchmark results (from
    the LLVM benchmark suite) -- specifically, on the
    BENCHMARK_FLOYD_DITHER_128, BENCHMARK_FLOYD_DITHER_256, and
    BENCHMARK_FLOYD_DITHER_512; the others are unchanged.  These have
    regressed by about 28% on Skylake, 34% on Haswell, and over 40% on
    Sandybridge.

    This reverts commit r353923. (detail/ViewSVN)
    by echristo
  8. [Dominators] Simplify and optimize path compression used in link-eval forest.

    Summary:
    * NodeToInfo[*] have been allocated so the addresses are stable. We can store them instead of NodePtr to save NumToNode lookups.
    * Nodes are traversed twice. Using `Visited` to check the traversal number is expensive and obscure. Just split the two traversals into two loops explicitly.
    * The check `VInInfo.DFSNum < LastLinked` is redundant as it is implied by `VInInfo->Parent < LastLinked`.
    * VLabelInfo and PLabelInfo are used to save a NodeToInfo lookup in the second traversal.

    Also add some comments explaining eval().
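
    A simplified sketch of path compression written as two explicit loops, as
    the list above describes. The `Info` record and this `eval` signature are
    assumptions made for illustration (nodes are plain DFS numbers indexing a
    vector); it is not the code from the patch.

    ```cpp
    #include <cassert>
    #include <vector>

    // Hypothetical per-node record; nodes are identified by DFS number.
    struct Info {
      unsigned Parent; // DFS number of the parent in the link-eval forest
      unsigned Semi;   // DFS number of the semidominator candidate
      unsigned Label;  // node with the smallest Semi on the compressed path
    };

    // Returns the label of V. Nodes numbered >= LastLinked are already linked.
    inline unsigned eval(std::vector<Info> &Infos, unsigned V, unsigned LastLinked) {
      assert(V < Infos.size() && "node number out of range");
      Info *VInfo = &Infos[V];
      if (VInfo->Parent < LastLinked)
        return VInfo->Label; // V is the root of its tree: nothing to compress

      // First loop: collect V and its ancestors below the root of V's tree.
      std::vector<Info *> Stack;
      do {
        Stack.push_back(VInfo);
        VInfo = &Infos[VInfo->Parent];
      } while (VInfo->Parent >= LastLinked);

      // Second loop: walk back down, re-parenting each node and propagating
      // the label whose Semi is smallest.
      const Info *PInfo = VInfo;
      const Info *PLabelInfo = &Infos[PInfo->Label];
      do {
        VInfo = Stack.back();
        Stack.pop_back();
        VInfo->Parent = PInfo->Parent;
        const Info *VLabelInfo = &Infos[VInfo->Label];
        if (PLabelInfo->Semi < VLabelInfo->Semi)
          VInfo->Label = PInfo->Label;
        else
          PLabelInfo = VLabelInfo;
        PInfo = VInfo;
      } while (!Stack.empty());
      return VInfo->Label;
    }
    ```

    After one call, every node on the walked path has a `Parent` below
    `LastLinked`, so a repeated `eval` on any of them returns immediately.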

    This shows a ~4.5% improvement (9.8444s -> 9.3996s) on

        perf stat -r 10 taskset -c 0 opt -passes=$(printf '%.0srequire<domtree>,invalidate<domtree>,' {1..1000})'require<domtree>' -disable-output sqlite-autoconf-3270100/sqlite3.bc

    Reviewers: kuhar, sanjoy, asbirlea

    Reviewed By: kuhar

    Subscribers: brzycki, NutshellySima, kristina, jdoerfert, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58327 (detail/ViewSVN)
    by maskray
  9. Remove test on incompatible mips target. (detail/ViewSVN)
    by leonardchan
  10. [NewPM] Add other sanitizers at O0

    This allows MSan and TSan to be used without requiring optimizations.

    Differential Revision: https://reviews.llvm.org/D58424 (detail/ViewSVN)
    by leonardchan
  11. [RISCV] Implement pseudo instructions for load/store from a symbol address.

    Summary:
    These pseudo-instructions let load/store instructions load from or store
    to a symbol. They always use PC-relative addressing to generate the
    symbol address.
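
    PC-relative addressing on RISC-V is usually an AUIPC carrying the upper
    20 bits followed by a load/store with a signed 12-bit immediate. Assuming
    that split applies here, the sketch below shows the arithmetic; the
    `PcRelParts` type and `splitPcRel` helper are hypothetical names.

    ```cpp
    #include <cassert>
    #include <cstdint>

    struct PcRelParts {
      std::int32_t Hi20; // immediate for AUIPC (upper 20 bits of the offset)
      std::int32_t Lo12; // signed 12-bit immediate for the load/store
    };

    // Splits a PC-relative offset so that Hi20 * 4096 + Lo12 == Offset.
    inline PcRelParts splitPcRel(std::int32_t Offset) {
      // Adding 0x800 rounds to the nearest 4 KiB boundary, which compensates
      // for the low part being sign-extended by the hardware.
      std::int64_t Biased = static_cast<std::int64_t>(Offset) + 0x800;
      // Floor division by 4096 that does not rely on shifting negative values.
      std::int64_t Hi = Biased >= 0 ? Biased / 4096 : -((-Biased + 4095) / 4096);
      std::int64_t Lo = static_cast<std::int64_t>(Offset) - Hi * 4096;
      assert(Lo >= -2048 && Lo <= 2047 && "low part must fit in 12 bits");
      PcRelParts Parts;
      Parts.Hi20 = static_cast<std::int32_t>(Hi);
      Parts.Lo12 = static_cast<std::int32_t>(Lo);
      return Parts;
    }
    ```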

    Reviewers: asb, apazos, rogfer01, jrtc27

    Differential Revision: https://reviews.llvm.org/D50496 (detail/ViewSVN)
    by kito
  12. [Dominators] Delete UpdateLevelsAfterInsertion in edge insertion of depth-based search for release builds

    Summary:
    After insertion of (From, To), v is affected iff
    depth(NCD)+1 < depth(v) and there exists a path P from To to v such that depth(v) <= depth(w) for every w on P.

    All affected vertices change their idom to NCD.

    If a vertex u has changed its depth, it must be a descendant of an
    affected vertex v. Its depth must have been updated by UpdateLevel()
    called by setIDom() of the first affected ancestor.

    So UpdateLevelsAfterInsertion and its bookkeeping variable VisitedNotAffectedQueue are redundant.
    Run them only in debug builds as a sanity check.
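
    The argument rests on setIDom() refreshing the depth of a whole subtree.
    A minimal sketch of that idea, using a hypothetical `Node` type and free
    functions rather than the real dominator tree classes:

    ```cpp
    #include <cassert>
    #include <vector>

    // Hypothetical, simplified dominator-tree node.
    struct Node {
      Node *IDom = nullptr;
      unsigned Level = 0;
      std::vector<Node *> Children;
    };

    // Recomputes the level of N and of every descendant from N's current idom.
    inline void updateLevel(Node &N) {
      N.Level = N.IDom ? N.IDom->Level + 1 : 0;
      for (Node *Child : N.Children)
        updateLevel(*Child);
    }

    // Re-parents N under NewIDom and refreshes the levels in N's subtree.
    inline void setIDom(Node &N, Node &NewIDom) {
      assert(&N != &NewIDom && "a node cannot be its own immediate dominator");
      if (N.IDom) {
        // Detach N from the child list of its previous idom.
        std::vector<Node *> &Siblings = N.IDom->Children;
        for (auto It = Siblings.begin(); It != Siblings.end(); ++It)
          if (*It == &N) {
            Siblings.erase(It);
            break;
          }
      }
      N.IDom = &NewIDom;
      NewIDom.Children.push_back(&N);
      updateLevel(N);
    }
    ```

    Because the descendants are fixed up as part of re-parenting, a separate
    pass over them afterwards has nothing left to change in this model.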

    Reviewers: kuhar

    Reviewed By: kuhar

    Subscribers: kristina, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58369 (detail/ViewSVN)
    by maskray
  13. [PowerPC] exploit P9 instruction maddld.
    Differential Revision: https://reviews.llvm.org/D58364 (detail/ViewSVN)
    by shchenz
  14. [WebAssembly] Generalize section ordering constraints

    Summary:
    Changes from using a total ordering of known sections to using a
    dependency graph approach. This allows our tools to accept and process
    binaries that are compliant with the spec and tool conventions that
    would have been previously rejected. It also means our own tools can
    do less work to enforce an artificially imposed ordering. Using a
    general mechanism means fewer special cases and exceptions in the
    ordering logic.
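
    A rough sketch of what a dependency-graph check can look like, as opposed
    to comparing positions in one total order. The `OrderGraph` type, both
    helpers and any edges passed in are illustrative assumptions, not the
    constraints or code from the patch.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // An edge Key -> Value means: if both sections are present, Key must come
    // before Value. Sections without edges may appear anywhere.
    using OrderGraph = std::map<std::string, std::set<std::string>>;

    // Returns true if Later is reachable from Earlier in the graph.
    inline bool mustPrecede(const OrderGraph &G, const std::string &Earlier,
                            const std::string &Later) {
      std::set<std::string> Seen;
      std::vector<std::string> Work{Earlier};
      while (!Work.empty()) {
        std::string Cur = Work.back();
        Work.pop_back();
        if (!Seen.insert(Cur).second)
          continue; // already expanded
        auto It = G.find(Cur);
        if (It == G.end())
          continue;
        for (const std::string &Next : It->second) {
          if (Next == Later)
            return true;
          Work.push_back(Next);
        }
      }
      return false;
    }

    // Accepts any order in which no section follows one it must precede.
    inline bool isValidOrder(const OrderGraph &G,
                             const std::vector<std::string> &Sections) {
      for (std::size_t I = 0; I < Sections.size(); ++I)
        for (std::size_t J = 0; J < I; ++J)
          if (mustPrecede(G, Sections[I], Sections[J]))
            return false;
      return true;
    }
    ```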

    Reviewers: aheejin, dschuff

    Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58312 (detail/ViewSVN)
    by tlively
  15. [WebAssembly] Refactor atomic operation definitions (NFC)

    Summary:
    - Make `ATOMIC_I`, `ATOMIC_NRI`, `AtomicLoad`, `AtomicStore` classes and
      make other operations inherit from them
    - Factor the common opcode prefix '0xfe' out from the opcodes into the
      common class
    - Reorder instructions in the order of increasing opcodes

    Reviewers: tlively

    Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58338 (detail/ViewSVN)
    by aheejin
  16. [InstCombine] regenerate test checks; NFC (detail/ViewSVN)
    by spatel
  17. [WebAssembly] Fix load/store name detection for atomic instructions

    Summary:
    Fixed a bug in the routine in AsmParser that determines whether the
    current instruction is a load or a store. Atomic instructions' prefixes
    are not `atomic_` but `atomic.`, and all atomic instructions are also
    memory instructions. Also fixed the printing format of atomic
    instructions to match other memory instructions and added encoding tests
    for atomic instructions.
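
    As an illustration only, a name-based check along these lines could look
    like the sketch below. `isMemoryInstruction` is a hypothetical helper, not
    the AsmParser routine, and the substring rules are an assumption drawn
    from the description above.

    ```cpp
    #include <cassert>
    #include <string>

    // Treats a mnemonic as a memory instruction if it names a load, a store,
    // or any atomic operation (matched by the "atomic." prefix segment).
    inline bool isMemoryInstruction(const std::string &Name) {
      return Name.find(".load") != std::string::npos ||
             Name.find(".store") != std::string::npos ||
             Name.find("atomic.") != std::string::npos;
    }
    ```

    A check for `atomic_` would never match names spelled with a dot, which is
    the kind of mismatch the fix describes.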

    Reviewers: aardappel, tlively

    Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58337 (detail/ViewSVN)
    by aheejin
  18. CMake: Fix stand-alone clang builds since r353268

    Summary:
    Handle the case where LLVM_MAIN_SRC_DIR is not set and also use
    LLVM_CMAKE_DIR for locating installed cmake files rather than
    LLVM_CMAKE_PATH.

    Reviewers: phosek, andrewrk, smeenai

    Reviewed By: phosek, andrewrk, smeenai

    Subscribers: mgorny, cfe-commits, llvm-commits

    Tags: #clang, #llvm

    Differential Revision: https://reviews.llvm.org/D58204 (detail/ViewSVN)
    by tstellar
  19. [WebAssembly] Fixed disassembler not knowing about OPERAND_EVENT

    Reviewers: aheejin

    Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D58414 (detail/ViewSVN)
    by aardappel

Started by upstream project Clang Stage 2: cmake, R -g Asan&UBSan, using Stage 1 RA, Phase 1 build number 6151
originally caused by:

This run spent:

  • 6.5 sec waiting;
  • 3 min 42 sec build duration;
  • 3 min 49 sec total from scheduled to completion.

Identified problems

Missing test results

The test result file Jenkins is looking for does not exist after the build.
Indication 1

CMake Error

This build failed because of a CMake configuration error. Below is a list of all errors in the build log:
Indication 2