Failed
Changes

Summary

  1. Let mlir-nvidia and mlir-windows builders combine commits, otherwise the waiting queue is too long.
Commit 8d3a31cb12b51456e276a19baf6694cc44ff8c59 by gkistanova
Let mlir-nvidia and mlir-windows builders combine commits, otherwise the waiting queue is too long.
The file was modified: buildbot/osuosl/master/config/builders.py

Summary

  1. [DWARFYAML] Use writeDWARFOffset() to write the prologue_length field. NFC.
  2. [libc] Extend MPFRMatcher to handle multiple-input-multiple-output functions.
  3. [libc][obvious] Add back the accidentally removed MPFRNumber destructor.
  4. Remove the use of global dialect registration from the standalone-translate.cpp example (NFC)
  5. Fix a 32-bit overflow issue when reading LTO-generated bitcode files whose strtab are of size > 2^29
  6. [InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad
  7. Revert "[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad"
Commit 75e0b5866869ea1feb140d6f718d74c786547113 by Xing
[DWARFYAML] Use writeDWARFOffset() to write the prologue_length field. NFC.

Use writeDWARFOffset() to simplify the logic. NFC.
The file was modified: llvm/lib/ObjectYAML/DWARFEmitter.cpp
Commit 3f4674a5577dcc63a846d33f61e9bd95e388223d by sivachandra
[libc] Extend MPFRMatcher to handle multiple-input-multiple-output functions.

Tests for frexp[f|l] now use the new capability. This change does not cover
every input-output combination; support for further combinations can be added
in the future as needed.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D86506
The file was modified: libc/test/src/math/frexpl_test.cpp
The file was modified: libc/test/src/math/frexp_test.cpp
The file was modified: libc/utils/MPFRWrapper/MPFRUtils.h
The file was modified: libc/test/src/math/CMakeLists.txt
The file was modified: libc/test/src/math/frexpf_test.cpp
The file was modified: libc/utils/MPFRWrapper/MPFRUtils.cpp
Commit 1948acb61b1d900b43fa457b3517de2d7beacd63 by sivachandra
[libc][obvious] Add back the accidentally removed MPFRNumber destructor.
The file was modified: libc/utils/MPFRWrapper/MPFRUtils.cpp
Commit a3ef1054fd5ba3177e2fe33d45e40de00b79f1f0 by joker.eph
Remove the use of global dialect registration from the standalone-translate.cpp example (NFC)
The file was modified: mlir/examples/standalone/standalone-translate/standalone-translate.cpp
Commit 47849870278ce05cde03d41f03fd3a1e65ee22a6 by jianzhouzh
Fix a 32-bit overflow issue when reading LTO-generated bitcode files whose strtab are of size > 2^29

This happens when using -flto and -Wl,--plugin-opt=emit-llvm to create a linked LTO bitcode file, and the bitcode file has a strtab with size > 2^29.

All the issues relate to a pattern like this:
  size_t x64 = y64 + z32 * C
  When z32 is >= (2^32)/C, z32 * C overflows, because the multiplication is done in 32 bits before the result is widened for the addition.
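
A minimal sketch of the pattern above, not the code from the patch itself. The names `kScale` (standing in for C, here assumed to be 8), `offsetNarrow`, `offsetWide`, and `overflowsIn32Bits` are hypothetical and chosen only for illustration. It assumes a platform where `unsigned int` is 32 bits wide.

```cpp
#include <cstdint>

// Hypothetical constant standing in for C in the pattern above.
constexpr uint32_t kScale = 8;

// Problematic shape: z32 * kScale is evaluated in 32-bit unsigned arithmetic,
// so it wraps modulo 2^32 before being widened and added to y64.
uint64_t offsetNarrow(uint64_t y64, uint32_t z32) {
  return y64 + z32 * kScale;
}

// One possible fix: widen z32 first so the multiplication happens in 64 bits.
uint64_t offsetWide(uint64_t y64, uint32_t z32) {
  return y64 + static_cast<uint64_t>(z32) * kScale;
}

// True when z32 * kScale does not fit in 32 bits, i.e. z32 >= 2^32 / kScale.
bool overflowsIn32Bits(uint32_t z32) {
  return z32 > UINT32_MAX / kScale;
}
```

With `kScale` equal to 8, the threshold is z32 = 2^29, which matches the strtab size mentioned above: `offsetNarrow` wraps at that point while `offsetWide` does not.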

Reviewed-by: MaskRay

Differential Revision: https://reviews.llvm.org/D86500
The file was modified: llvm/lib/Bitstream/Reader/BitstreamReader.cpp
Commit fcb51d8c2460faa23b71e06abb7e826243887dd6 by lebedev.ri
[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad

Since D86306 we already do the sibling fold for `insertvalue`,
so we should also do this for `extractvalue`.

Unlike that one, the results here are, quite honestly, shocking,
as can be observed in these vanilla llvm test-suite + RawSpeed results:

```
| statistic name                                     | baseline  | proposed  |       Δ |       % |    |%| |
|----------------------------------------------------|-----------|-----------|--------:|--------:|-------:|
| asm-printer.EmittedInsts                           | 7945095   | 7942507   |   -2588 |  -0.03% |  0.03% |
| assembler.ObjectBytes                              | 273209920 | 273069800 | -140120 |  -0.05% |  0.05% |
| early-cse.NumCSE                                   | 2183363   | 2183398   |      35 |   0.00% |  0.00% |
| early-cse.NumSimplify                              | 541847    | 550017    |    8170 |   1.51% |  1.51% |
| instcombine.NumAggregateReconstructionsSimplified  | 2139      | 108       |   -2031 | -94.95% | 94.95% |
| instcombine.NumCombined                            | 3601364   | 3635448   |   34084 |   0.95% |  0.95% |
| instcombine.NumConstProp                           | 27153     | 27157     |       4 |   0.01% |  0.01% |
| instcombine.NumDeadInst                            | 1694521   | 1765022   |   70501 |   4.16% |  4.16% |
| instcombine.NumPHIsOfExtractValues                 | 0         | 37546     |   37546 |   0.00% |  0.00% |
| instcombine.NumSunkInst                            | 63158     | 63686     |     528 |   0.84% |  0.84% |
| instcount.NumBrInst                                | 874304    | 871857    |   -2447 |  -0.28% |  0.28% |
| instcount.NumCallInst                              | 1757657   | 1758402   |     745 |   0.04% |  0.04% |
| instcount.NumExtractValueInst                      | 45623     | 11483     |  -34140 | -74.83% | 74.83% |
| instcount.NumInsertValueInst                       | 4983      | 580       |   -4403 | -88.36% | 88.36% |
| instcount.NumInvokeInst                            | 61018     | 59478     |   -1540 |  -2.52% |  2.52% |
| instcount.NumLandingPadInst                        | 35334     | 34215     |   -1119 |  -3.17% |  3.17% |
| instcount.NumPHIInst                               | 344428    | 331116    |  -13312 |  -3.86% |  3.86% |
| instcount.NumRetInst                               | 100773    | 100772    |      -1 |   0.00% |  0.00% |
| instcount.TotalBlocks                              | 1081154   | 1077166   |   -3988 |  -0.37% |  0.37% |
| instcount.TotalFuncs                               | 101443    | 101442    |      -1 |   0.00% |  0.00% |
| instcount.TotalInsts                               | 8890201   | 8833747   |  -56454 |  -0.64% |  0.64% |
| instsimplify.NumSimplified                         | 75822     | 75707     |    -115 |  -0.15% |  0.15% |
| simplifycfg.NumHoistCommonCode                     | 24203     | 24197     |      -6 |  -0.02% |  0.02% |
| simplifycfg.NumHoistCommonInstrs                   | 48201     | 48195     |      -6 |  -0.01% |  0.01% |
| simplifycfg.NumInvokes                             | 2785      | 4298      |    1513 |  54.33% | 54.33% |
| simplifycfg.NumSimpl                               | 997332    | 1018189   |   20857 |   2.09% |  2.09% |
| simplifycfg.NumSinkCommonCode                      | 7088      | 6464      |    -624 |  -8.80% |  8.80% |
| simplifycfg.NumSinkCommonInstrs                    | 15117     | 14021     |   -1096 |  -7.25% |  7.25% |
```
... which tells us that this new fold fires a whopping 38k times,
increasing the number of SimplifyCFG's `invoke`->`call` transforms by 54% (+1513) (again, D85787 did that last time),
decreasing the total instruction count by 0.64% (-56454),
and sharply decreasing the count of `insertvalue`s (-88.36%, i.e. roughly 8.6 times fewer)
and `extractvalue`s (-74.83%, i.e. roughly four times fewer).

This yields a geomean binary size decrease of 0.01%:
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=size-text
and, ignoring `O0-g`, a geomean compile-time improvement of 0.01% to 0.05%:
http://llvm-compile-time-tracker.com/compare.php?from=4d5ca22b8adfb6643466e4e9f48ba14bb48938bc&to=97dacca0111cb2ae678204e52a3cee00e3a69208&stat=instructions

The other thing this tells us is that, while this is a massive win for the `invoke`->`call` transform,
the `InstCombinerImpl::foldAggregateConstructionIntoAggregateReuse()` fold,
which is supposed to deal with such aggregate reconstructions,
fires a lot less now. There are two reasons why:
1. After this fold, as can be seen in the tests, we may (will) end up with trivially redundant PHI nodes.
   We don't CSE them in InstCombine presently, which means that EarlyCSE needs to run and then InstCombine must be rerun.
2. But then, EarlyCSE not only manages to fold such redundant PHIs,
   it also sees that the extract-insert chain recreates the original aggregate,
   and replaces it with the original aggregate.

The takeaways are:
1. We should perhaps do the most trivial, same-BB PHI CSE in InstCombine.
2. I need to check what other patterns remain, and how they can be resolved
   (i.e. I wonder if `foldAggregateConstructionIntoAggregateReuse()` might go away).

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D86530
The file was modified: llvm/lib/Transforms/InstCombine/InstCombineInternal.h
The file was modified: llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll
The file was modified: llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modified: llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
Commit c295c6f2c04e771f44322e5dfdd681a5ace300e5 by lebedev.ri
Revert "[InstCombine] PHI-of-extractvalues -> extractvalue-of-PHI, aka invokes are bad"

This reverts commit fcb51d8c2460faa23b71e06abb7e826243887dd6.

As the buildbots report, there is apparently a missing check to ensure
that the types of the incoming values match the type of the PHI.
Let's revert for a moment.
The file was modified: llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modified: llvm/test/Transforms/InstCombine/phi-of-extractvalues.ll
The file was modified: llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
The file was modified: llvm/lib/Transforms/InstCombine/InstCombineInternal.h
