FailedChanges

Changes from Git (git http://labmaster3.local/git/llvm-zorg.git)

Summary

  1. Do not set default cmake options in CmakeCommand. (details)
  2. Removed some default cmake options which doesn't seem worth being (details)
  3. Changed clang-x86_64-debian-fast builder to use UnifiedTreeBuilder. (details)
  4. NFC. Few cosmetic changes. (details)
  5. Incremental. (details)
Commit ccd52d237bf9347eb1d3524327ee0323f796f4f0 by gkistanova
Do not set default cmake options in CmakeCommand.
llvm-svn: 374689
The file was modifiedzorg/buildbot/commands/CmakeCommand.py
Commit a5e0c6003e20b7d5f909dc9f7d1b401363c1d858 by gkistanova
Removed some default cmake options which doesn't seem worth being
default from UnifiedTreeBuilder.addCmakeSteps.
llvm-svn: 374690
The file was modifiedzorg/buildbot/builders/UnifiedTreeBuilder.py
Commit a9bfe55fd764d032485342c46313ecc398c20f95 by gkistanova
Changed clang-x86_64-debian-fast builder to use UnifiedTreeBuilder.
llvm-svn: 374691
The file was modifiedbuildbot/osuosl/master/config/builders.py
Commit 8986c669370acbb545339657e5dbe95d59c848d1 by gkistanova
NFC. Few cosmetic changes.
llvm-svn: 374693
The file was modifiedbuildbot/osuosl/master/config/builders.py
Commit 7047cf7740c0528b2665d7b099cce4885d24a2cc by gkistanova
Incremental.
llvm-svn: 374695
The file was modifiedbuildbot/osuosl/master/config/builders.py

Changes from Git (git http://labmaster3.local/git/llvm-project.git)

Summary

  1. [X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction (details)
  2. remove an useless allocation found by scan-build - the new Dead nested (details)
  3. [NFC][LoopIdiom] Move one bcmp test into the proper place (details)
  4. [NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206. (details)
  5. [LoopIdiomRecognize] Recommit: BCmp loop idiom recognition (details)
  6. [clang-format] Proposal for clang-format to give compiler style warnings (details)
  7. Revert r374657: "[lit] Try again to fix new tests that fail on Windows (details)
  8. [lit] Try yet again to fix new tests that fail on Windows bots (details)
  9. [lit] Adjust error handling for decode introduced by r374665 (details)
  10. [X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. (details)
  11. Fix cppcheck shadow variable name warnings. NFCI. (details)
  12. Replace for-loop of SmallVector::push_back with SmallVector::append. (details)
  13. [NFC][LoopIdiom] Adjust FIXME to be self-explanatory (details)
  14. [lit] Try errors="ignore" for decode introduced by r374665 (details)
  15. SymbolRecord - fix uninitialized variable warnings. NFCI. (details)
  16. SymbolRecord - consistently use explicit for single operand constructors (details)
  17. [X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings (details)
  18. Revert r374671: "[lit] Try errors="ignore" for decode introduced by (details)
  19. Revert r374666: "[lit] Adjust error handling for decode introduced by (details)
  20. Revert r374665: "[lit] Try yet again to fix new tests that fail on (details)
  21. Revert r374653: "[lit] Fix a few oversights in r374651 that broke some (details)
  22. Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use (details)
  23. Revert 374651: "Reland r374392: [lit] Extend internal diff to support (details)
  24. Revert r374650: "Reland r374390: [lit] Extend internal diff to support (details)
  25. Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding (details)
  26. Revert r374648: "Reland r374388: [lit] Make internal diff work in (details)
  27. [libc++][test] Silence MSVC warning in std::optional test (details)
  28. gn build: (manually) merge r374663 (details)
  29. Revert r374663 "[clang-format] Proposal for clang-format to give (details)
Commit 9f0885d38d821080a7cf48b2507ae19fb3e70eb9 by llvm-dev
[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction
This should go away once D66004 has landed and we can simplify shuffle
chains using demanded elts.
llvm-svn: 374658
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
The file was modifiedllvm/test/CodeGen/X86/vector-reduce-add.ll
Commit 4644e9a50a4fb2ad52ad6089f9e322f88ee23d6c by sylvestre
remove an useless allocation found by scan-build - the new Dead nested
assignment check
llvm-svn: 374659
The file was modifiedclang/lib/CodeGen/CGExprScalar.cpp
Commit c41e9f6bbfd0d58655464edd8b691a6e24ec781d by lebedev.ri
[NFC][LoopIdiom] Move one bcmp test into the proper place
llvm-svn: 374660
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-negative-tests.ll
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-basic.ll
Commit 45539737ddb7bf0e6ade616cccc6502a830e1d50 by lebedev.ri
[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206.
The transform forgot to check SCEV loop scopes.
https://bugs.llvm.org/show_bug.cgi?id=43206
llvm-svn: 374661
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-negative-tests.ll
Commit 76cdcf25b883751d83402baea6316772aa73865c by lebedev.ri
[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition
Summary: This is a recommit, this originally landed in rL370454 but was
subsequently reverted in  rL370788 due to
https://bugs.llvm.org/show_bug.cgi?id=43206 The reduced testcase was
added to bcmp-negative-tests.ll as @pr43206_different_loops - we must
ensure that the SCEV's we got are both for the same loop we are
currently investigating.
Original commit message:
@mclow.lists brought up this issue up in IRC. It is a reasonably common
problem to compare some two values for equality. Those may be just some
integers, strings or arrays of integers.
In C, there is `memcmp()`, `bcmp()` functions. In C++, there exists
`std::equal()` algorithm. One can also write that function manually.
libstdc++'s `std::equal()` is specialized to directly call `memcmp()`
for various types, but not `std::byte` from C++2a.
https://godbolt.org/z/mx2ejJ
libc++ does not do anything like that, it simply relies on simple C++'s
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)
So likely, there exists a certain performance opportunities. Let's
compare performance of naive `std::equal()` (no `memcmp()`) with one
that is using `memcmp()` (in this case, compiled with modified
compiler). {F8768213}
```
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>
#include "benchmark/benchmark.h"
template <class T> bool equal(T* a, T* a_end, T* b) noexcept {
for (; a != a_end; ++a, ++b) {
   if (*a != *b) return false;
}
return true;
}
template <typename T> std::vector<T> getVectorOfRandomNumbers(size_t
count) {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                      std::numeric_limits<T>::max());
std::vector<T> v;
v.reserve(count);
std::generate_n(std::back_inserter(v), count,
                 [&dis, &gen]() { return dis(gen); });
assert(v.size() == count);
return v;
}
struct Identical {
template <typename T>
static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
   auto Tmp = getVectorOfRandomNumbers<T>(count);
   return std::make_pair(Tmp, std::move(Tmp));
}
};
struct InequalHalfway {
template <typename T>
static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
   auto V0 = getVectorOfRandomNumbers<T>(count);
   auto V1 = V0;
   V1[V1.size() / size_t(2)]++;  // just change the value.
   return std::make_pair(std::move(V0), std::move(V1));
}
};
template <class T, class Gen> void BM_bcmp(benchmark::State& state) {
const size_t Length = state.range(0);
  const std::pair<std::vector<T>, std::vector<T>> Data =
     Gen::template Gen<T>(Length);
const std::vector<T>& a = Data.first;
const std::vector<T>& b = Data.second;
assert(a.size() == Length && b.size() == a.size());
  benchmark::ClobberMemory();
benchmark::DoNotOptimize(a);
benchmark::DoNotOptimize(a.data());
benchmark::DoNotOptimize(b);
benchmark::DoNotOptimize(b.data());
  for (auto _ : state) {
   const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
   benchmark::DoNotOptimize(is_equal);
}
state.SetComplexityN(Length);
state.counters["eltcnt"] =
     benchmark::Counter(Length,
benchmark::Counter::kIsIterationInvariant);
state.counters["eltcnt/sec"] =
     benchmark::Counter(Length,
benchmark::Counter::kIsIterationInvariantRate);
const size_t BytesRead = 2 * sizeof(T) * Length;
state.counters["bytes_read/iteration"] =
     benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                        benchmark::Counter::OneK::kIs1024);
state.counters["bytes_read/sec"] = benchmark::Counter(
     BytesRead, benchmark::Counter::kIsIterationInvariantRate,
     benchmark::Counter::OneK::kIs1024);
}
template <typename T> static void
CustomArguments(benchmark::internal::Benchmark* b) {
const size_t L2SizeBytes = []() {
   for (const benchmark::CPUInfo::CacheInfo& I :
        benchmark::CPUInfo::Get().caches) {
     if (I.level == 2) return I.size;
   }
   return 0;
}();
// What is the largest range we can check to always fit within given L2
cache?
const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                       /*maximal elt size*/ sizeof(T) / /*safety
margin*/ 2;
b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}
BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
   ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint16_t, Identical)
   ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint32_t, Identical)
   ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint64_t, Identical)
   ->Apply(CustomArguments<uint64_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
   ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint16_t, InequalHalfway)
   ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint32_t, InequalHalfway)
   ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp,
uint64_t, InequalHalfway)
   ->Apply(CustomArguments<uint64_t>);
```
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks
build-{old,new}/test/llvm-bcmp-bench RUNNING:
build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx 2019-04-25
21:17:11 Running build-old/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU
s) CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1) Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU 
Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...> BM_bcmp<uint8_t, Identical>/512000           432131 ns     
432101 ns         1613 bytes_read/iteration=1000k
bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...> BM_bcmp<uint16_t, Identical>/256000          161408 ns     
161409 ns         4027 bytes_read/iteration=1000k
bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...> BM_bcmp<uint32_t, Identical>/128000           81497 ns      
81488 ns         8415 bytes_read/iteration=1000k
bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...> BM_bcmp<uint64_t, Identical>/64000            50138 ns      
50138 ns        10909 bytes_read/iteration=1000k
bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...> BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns     
192392 ns         3638 bytes_read/iteration=1000k
bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...> BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns     
127860 ns         5477 bytes_read/iteration=1000k
bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...> BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns      
49140 ns        14281 bytes_read/iteration=1000k
bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...> BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns      
32099 ns        21786 bytes_read/iteration=1000k
bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29 Running build-new/test/llvm-bcmp-bench Run on (8 X
4000 MHz CPU s) CPU Caches:
L1 Data 16K (x8)
L1 Instruction 64K (x4)
L2 Unified 2048K (x4)
L3 Unified 8192K (x1) Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU 
Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...> BM_bcmp<uint8_t, Identical>/512000            18593 ns      
18590 ns        37565 bytes_read/iteration=1000k
bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...> BM_bcmp<uint16_t, Identical>/256000           18950 ns      
18948 ns        37223 bytes_read/iteration=1000k
bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...> BM_bcmp<uint32_t, Identical>/128000           18627 ns      
18627 ns        37895 bytes_read/iteration=1000k
bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...> BM_bcmp<uint64_t, Identical>/64000            18855 ns      
18855 ns        37458 bytes_read/iteration=1000k
bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...> BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns       
9569 ns        73500 bytes_read/iteration=1000k
bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...> BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns       
9547 ns        74343 bytes_read/iteration=1000k
bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...> BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns       
9394 ns        73521 bytes_read/iteration=1000k
bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...> BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns       
9498 ns        73802 bytes_read/iteration=1000k
bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to
build-new/test/llvm-bcmp-bench Benchmark                               
                 Time             CPU      Time Old      Time New     
CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...> BM_bcmp<uint8_t, Identical>/512000                      -0.9570  
     -0.9570        432131         18593        432101         18590
<...> BM_bcmp<uint16_t, Identical>/256000                     -0.8826  
     -0.8826        161408         18950        161409         18948
<...> BM_bcmp<uint32_t, Identical>/128000                     -0.7714  
     -0.7714         81497         18627         81488         18627
<...> BM_bcmp<uint64_t, Identical>/64000                      -0.6239  
     -0.6239         50138         18855         50138         18855
<...> BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503  
     -0.9503        192405          9570        192392          9569
<...> BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253  
     -0.9253        127858          9547        127860          9547
<...> BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088  
     -0.8088         49140          9396         49140          9394
<...> BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041  
     -0.7041         32101          9499         32099          9498
```
What can we tell from the benchmark?
* Performance of naive equality check somewhat improves with element
size,
maxing out at eltcnt/sec=1.58603G/s for uint16_t, or
bytes_read/sec=19.0209G/s
for uint64_t. I think, that instability implies performance problems.
* Performance of `memcmp()`-aware benchmark always maxes out at around
bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput
of the
naive variant!
* eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x)
and
linearly decreases with element size.
For uint64_t, it's ~4x+ the elements/second.
* The call obvious is more pricey than the loop, with small element
count.
As it can be seen from the full output {F8768210}, the `memcmp()` is
almost
universally worse, independent of the element size (and thus buffer
size) when
element count is less than 8.
So all in all, bcmp idiom does indeed pose untapped performance
headroom. This diff does implement said idiom recognition. I think a
reasonable test coverage is present, but do tell if there is anything
obvious missing.
Now, quality. This does succeed to build and pass the test-suite, at
least without any non-bundled elements. {F8768216} {F8768217} This
transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp
result-new.json Tests: 1149 Metric: loop-idiom.NumBCmp
Program                                         result-new
MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
``` The size changes are: I'm not sure what's going on with
SingleSource/UnitTests/vla.test yet, did not look.
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json
--filter-hash Tests: 1149 Same hash: 907 (filtered out) Remaining: 242
Metric: size..text
Program                                        result-old result-new
diff test-suite...ingleSource/UnitTests/vla.test   753.00     833.00   
10.6% test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00
-3.5% test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00
  -0.1% test-suite...plications/d/make_dparser.test   89585.00 
89505.00   -0.1% test-suite...ce/Applications/Burg/burg.test   40817.00
40785.00   -0.1% test-suite.../Applications/lemon/lemon.test   47281.00
  47249.00   -0.1% test-suite...TimberWolfMC/timberwolfmc.test 
250065.00  250113.00   0.0% test-suite...chmarks/MallocBench/gs/gs.test
149889.00  149873.00  -0.0% test-suite...ications/JM/lencod/lencod.test
  769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00 
0.0% test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN     
nan% test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN    
  nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN   
   nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN  
    nan% test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN 
     nan% Geomean difference                                           
      nan%
        result-old    result-new       diff count  1.000000e+01
10.00000      10.000000 mean   3.152090e+05  311695.40000  0.006749 std
  3.790398e+05  372091.42232  0.036605 min    7.530000e+02  833.00000  
-0.034981 25%    4.243300e+04  42401.00000  -0.000866 50%  
1.197370e+05  119689.00000 -0.000392 75%    6.397050e+05  639705.00000
-0.000005 max    1.001697e+06  966657.00000  0.106242
```
I don't have timings though.
And now to the code. The basic idea is to completely replace the whole
loop. If we can't fully kill it, don't transform. I have left one or two
comments in the code, so hopefully it can be understood.
Also, there is a few TODO's that i have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???
Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet
Reviewed By: courbet
Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet,
courbet, llvm-commits, mclow.lists
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61144
llvm-svn: 374662
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-basic.ll
The file was modifiedllvm/docs/ReleaseNotes.rst
The file was modifiedllvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-debugify-remarks.ll
The file was modifiedllvm/test/Transforms/LoopIdiom/bcmp-widening.ll
Commit 1f20bc17d005fb37816fac95955ec4a36f9de26f by mydeveloperday
[clang-format] Proposal for clang-format to give compiler style warnings
Summary: Related somewhat to {D29039}
On seeing a quote on twitter by @invalidop
> If it's not formatted with clang-format it's a build error.
This made me want to change the way I use clang-format into a tool that
could optionally show me where my source code violates clang-format
syle.
When I'm making a change to clang-format itself, one thing I like to do
to test the change is to ensure I didn't cause a huge wave of changes,
what I want to do is simply run this on a known formatted directory and
see if any new differences arrive in a manner I'm used to.
This started me thinking that we should allow build systems to run
clang-format on a whole tree and emit compiler style warnings about
files that fail clang-format in a form that would make them as a warning
in most build systems and because those build systems range in their
construction I don't think its unreasonable to NOT expect them to have
to do the directory searching or parsing the output replacements
themselves, but simply transform that into an error code when there are
changes required.
I am starting this by suggesing adding a -n or -dry-run command line
argument which would emit a warning/error of the form
Support for various common compiler command line argumuments like
'-Werror' and '-ferror-limit' could make this very flexible to be
integrated into build systems and CI systems.
```
> $ /usr/bin/clang-format --dry-run ClangFormat.cpp -ferror-limit=3
-fcolor-diagnostics
> ClangFormat.cpp:54:29: warning: code should be clang-formatted
[-Wclang-format-violations]
> static cl::list<std::string>
>                             ^
> ClangFormat.cpp:55:20: warning: code should be clang-formatted
[-Wclang-format-violations]
> LineRanges("lines", cl::desc("<start line>:<end line> - format a range
of\n"
>                    ^
> ClangFormat.cpp:55:77: warning: code should be clang-formatted
[-Wclang-format-violations]
> LineRanges("lines", cl::desc("<start line>:<end line> - format a range
of\n"
>                                                                      
     ^
```
Reviewers: mitchell-stellar, klimek, owenpan
Reviewed By: klimek
Subscribers: mgorny, cfe-commits
Tags: #clang-format, #clang-tools-extra, #clang
Differential Revision: https://reviews.llvm.org/D68554
llvm-svn: 374663
The file was modifiedclang/tools/clang-format/CMakeLists.txt
The file was addedclang/test/Format/dry-run-alias.cpp
The file was addedclang/test/Format/dry-run.cpp
The file was modifiedclang/tools/clang-format/ClangFormat.cpp
Commit 8259f7ca120734b6fba0e67ae445492009f9871d by jdenny.ornl
Revert r374657: "[lit] Try again to fix new tests that fail on Windows
bots"
llvm-svn: 374664
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
Commit a271acbf79326da83cab36a02938fc6cd5740b1b by jdenny.ornl
[lit] Try yet again to fix new tests that fail on Windows bots
I seem to have misread the bot logs on my last attempt.  When lit's
internal diff runs on Windows under Python 2.7, it's text diffs not
binary diffs that need decoding to avoid this error when writing the
diff to stdout:
``` UnicodeEncodeError: 'ascii' codec can't encode characters in
position 7-8: ordinal not in range(128)
```
There is no `decode` attribute in this case under Python 3.6.8 under
Ubuntu, so this patch checks for the `decode` attribute before using it
here.  Hopefully nothing else is needed when `decode` isn't available.
It might take a couple more attempts to figure out what error handling,
if any, is needed for this decoding.
llvm-svn: 374665
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit 64c00893fae650bdfd1ae05136de3216f70c69b4 by jdenny.ornl
[lit] Adjust error handling for decode introduced by r374665
On that decode, Windows bots fail with:
``` UnicodeEncodeError: 'ascii' codec can't encode characters in
position 7-8: ordinal not in range(128)
```
That's the same error as before r374665 except it's now at the decode
before the write to stdout.
llvm-svn: 374666
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit 6446079add805b3e0478f8709a30fd6b0434b4e1 by llvm-dev
[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers.
NFCI.
llvm-svn: 374667
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
Commit 37041c7d226621b19492bb54da959f3374658727 by llvm-dev
Fix cppcheck shadow variable name warnings. NFCI.
llvm-svn: 374668
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
Commit 66417a9f036d1b653298fae9d541b71cfcfd3447 by llvm-dev
Replace for-loop of SmallVector::push_back with SmallVector::append.
NFCI.
llvm-svn: 374669
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
Commit c8ac97edc899515fe676dea89b828cbd9630e6c1 by lebedev.ri
[NFC][LoopIdiom] Adjust FIXME to be self-explanatory
llvm-svn: 374670
The file was modifiedllvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
Commit 1e98a6c57a0f71b5a5b06c7d6de5f35d89fd102d by jdenny.ornl
[lit] Try errors="ignore" for decode introduced by r374665
Still trying to fix the same error as in r374666.
llvm-svn: 374671
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit 936c6b57be163b621056a0e11c3bcc540d0dcb8a by llvm-dev
SymbolRecord - fix uninitialized variable warnings. NFCI.
llvm-svn: 374672
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/SymbolRecord.h
Commit 5f2543f8dc09268a7844e4731076f8073376517b by llvm-dev
SymbolRecord - consistently use explicit for single operand constructors
llvm-svn: 374673
The file was modifiedllvm/include/llvm/DebugInfo/CodeView/SymbolRecord.h
Commit 6716512670396634e73861eb291b59ab2ea1e588 by llvm-dev
[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings
llvm-svn: 374674
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.h
Commit 459a93659a9b3a34027e9e0dac7be5d80914c122 by jdenny.ornl
Revert r374671: "[lit] Try errors="ignore" for decode introduced by
r374665"
This series of patches still breaks a Windows bot.
llvm-svn: 374675
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit b005d9e86ffcc2cf3a3403106b6342e5ede27429 by jdenny.ornl
Revert r374666: "[lit] Adjust error handling for decode introduced by
r374665"
This series of patches still breaks a Windows bot.
llvm-svn: 374676
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit e9d3b8192ed63127a60b9dc7657f5157536d4fbd by jdenny.ornl
Revert r374665: "[lit] Try yet again to fix new tests that fail on
Windows bots"
This series of patches still breaks a Windows bot.
llvm-svn: 374677
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
Commit 9abfa5817116d04439773586b8dcd41a8540baba by jdenny.ornl
Revert r374653: "[lit] Fix a few oversights in r374651 that broke some
bots"
This series of patches still breaks a Windows bot.
llvm-svn: 374678
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
The file was modifiedllvm/test/MC/ARM/preserve-comments-arm.s
Commit 57046e8fd9b9dd91358b92862d0591c6ac717750 by jdenny.ornl
Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use
it"
This series of patches still breaks a Windows bot.
llvm-svn: 374679
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/test/tools/llvm-profdata/roundtrip.test
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-in.dos
The file was modifiedllvm/utils/lit/tests/max-failures.py
The file was modifiedllvm/test/MC/AsmParser/preserve-comments.s
The file was modifiedllvm/test/tools/llvm-profdata/profile-symbol-list.test
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-strip-trailing-cr.txt
The file was modifiedllvm/test/tools/llvm-profdata/sample-remap.test
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-in.unix
The file was modifiedllvm/test/tools/llvm-cxxmap/remap.test
Commit b3f157a90087858fb45de4ecdafcae9d4246fb62 by jdenny.ornl
Revert 374651: "Reland r374392: [lit] Extend internal diff to support
-U"
This series of patches still breaks a Windows bot.
llvm-svn: 374680
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-unified.txt
The file was modifiedllvm/utils/lit/tests/max-failures.py
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
Commit f6210fc24f2160fa36e8980e74418ff7535e4b9f by jdenny.ornl
Revert r374650: "Reland r374390: [lit] Extend internal diff to support
`-` argument"
This series of patches still breaks a Windows bot.
llvm-svn: 374681
The file was modifiedllvm/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt
The file was modifiedllvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/utils/lit/tests/max-failures.py
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-7.txt
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-r-error-8.txt
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
Commit 199f5cd8636dbc6571c5eb4a147a210f372a8e37 by jdenny.ornl
Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding
handling"
This series of patches still breaks a Windows bot.
llvm-svn: 374682
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-encodings.txt
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf8
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-in.utf16
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-in.bin
The file was modifiedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/utils/lit/tests/max-failures.py
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
Commit 3049748e15b76465ad21e1d68c48fc1dcf7def09 by jdenny.ornl
Revert r374648: "Reland r374388: [lit] Make internal diff work in
pipelines"
This series of patches still breaks a Windows bot.
llvm-svn: 374683
The file was removedllvm/utils/lit/tests/Inputs/shtest-shell/diff-pipes.txt
The file was addedllvm/utils/lit/tests/Inputs/shtest-shell/diff-error-0.txt
The file was removedllvm/utils/lit/lit/builtin_commands/diff.py
The file was modifiedllvm/utils/lit/tests/shtest-shell.py
The file was modifiedllvm/utils/lit/lit/TestRunner.py
Commit e198823b87b20c8dacb1bc950e9c44f1ce46f702 by Casey
[libc++][test] Silence MSVC warning in std::optional test
`make_optional<string>(4, 'X')` passes `4` (an `int`) as the first
argument to `string`'s `(size_t, charT)` constructor, triggering a
signed/unsigned mismatch warning when compiling with MSVC at `/W4`. The
incredibly simple fix is to instead use an unsigned literal (`4u`).
llvm-svn: 374684
The file was modifiedlibcxx/test/std/utilities/optional/optional.specalg/make_optional_explicit.pass.cpp
Commit b12012cc980fe125121b821eda86076bcaa6d58b by nicolasweber
gn build: (manually) merge r374663
llvm-svn: 374686
The file was modifiedllvm/utils/gn/secondary/clang/tools/clang-format/BUILD.gn
Commit e95d1ca1e2e24ffddd10bd890ed18280b8e415a2 by nicolasweber
Revert r374663 "[clang-format] Proposal for clang-format to give
compiler style warnings"
The test fails on macOS and looks a bit wrong, see comments on the
review.
Also revert follow-up r374686.
llvm-svn: 374688
The file was removedclang/test/Format/dry-run-alias.cpp
The file was modifiedllvm/utils/gn/secondary/clang/tools/clang-format/BUILD.gn
The file was modifiedclang/tools/clang-format/ClangFormat.cpp
The file was modifiedclang/tools/clang-format/CMakeLists.txt
The file was removedclang/test/Format/dry-run.cpp