Success

Changes

Summary

  1. [docs][llvm-strings] Write llvm-strings documentation
  2. Move some definitions from Sema to Basic to fix shared libs build
  3. [clangd][vscode] update the development doc.
  4. [InstCombine] add/move tests for icmp with add operand; NFC
  5. [X86][NFC] Add a `use-aa` feature.
  6. AMDGPU/GlobalISel: Remove another illegal select test
  7. AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
  8. AMDGPU/GlobalISel: Fix some broken run lines
  9. AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
  10. [NFC] remove unused functions
  11. [SystemZ] Call erase() on the right MBB in
  12. [LV] Add ARM MVE tail-folding tests
  13. [libFuzzer] Remove unused version of FuzzedDataProvider.h.
  14. [ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
  15. [ARM] Add patterns for CTLZ on MVE
  16. [ARM] Lower CTTZ on MVE
  17. [ARM] Add patterns for bitreverse intrinsic on MVE
  18. [ARM] Add patterns for BSWAP intrinsic on MVE
  19. [InstCombine] move tests for icmp+add; NFC
  20. [InstCombine] remove unneeded one-use checks for icmp fold
  21. [clangd] Simplify semantic highlighting visitor
  22. [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost,
  23. [OPENMP]Fix parsing/sema for function templates with declare simd.
  24. [ARM] A predicate cast of a predicate cast is a predicate cast
  25. [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
  26. [Clang][Codegen] Relax available-externally-suppress.c test
  27. [Clang][Codegen] Disable arm_acle.c test.
Commit 75b6279c5e7b2a77fd65c6f3c3b5f126b728ceb5 by jh7370
[docs][llvm-strings] Write llvm-strings documentation
Previously we only had a stub document.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D67554
llvm-svn: 371984
The file was modified llvm/docs/CommandGuide/llvm-strings.rst
Commit b79f3319584c31c6ea7a93bc7d6f99c46eda5923 by erich.keane
Move some definitions from Sema to Basic to fix shared libs build
r371875 moved some functionality around to a Basic header file, but
didn't move its definitions as well.  This patch moves some things
around so that shared library building can work.
llvm-svn: 371985
The file was modified clang/utils/TableGen/ClangAttrEmitter.cpp
The file was modified clang/lib/Sema/ParsedAttr.cpp
The file was modified clang/lib/Basic/Attributes.cpp
Commit 91154d65165e5bc757d307051d6d6daf2e91e697 by hokein
[clangd][vscode] update the development doc.
llvm-svn: 371986
The file was modified clang-tools-extra/clangd/clients/clangd-vscode/DEVELOPING.md
Commit f201b1c91875024224a945862cf394c24c6a29e3 by spatel
[InstCombine] add/move tests for icmp with add operand; NFC
llvm-svn: 371988
The file was modified llvm/test/Transforms/InstCombine/icmp-add.ll
The file was modified llvm/test/Transforms/InstCombine/icmp.ll
Commit 44bfbcc28e715212f9f8ac104424d72e76d38acf by courbet
[X86][NFC] Add a `use-aa` feature.
Summary: This allows enabling useaa on the command-line and will allow
enabling the feature on a per-CPU basis where benchmarking shows
improvements.
This is modelled after the ARM/AArch64 target.
Reviewers: RKSimon, andreadb, craig.topper
Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67266
llvm-svn: 371989
The file was modified llvm/lib/Target/X86/X86.td
The file was modified llvm/lib/Target/X86/X86Subtarget.h
Commit bf7524db35befce9c90d4372571efdfde75740ba by Matthew.Arsenault
AMDGPU/GlobalISel: Remove another illegal select test
llvm-svn: 371990
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
Commit 1fc07d66488b914cc8b26e817618a2a490ef2b32 by Matthew.Arsenault
AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
llvm-svn: 371991
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-fceil.mir
The file was modified llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-frint.mir
Commit 07b85976566d6d0da100fcbde1a5cd9c78ac9259 by Matthew.Arsenault
AMDGPU/GlobalISel: Fix some broken run lines
llvm-svn: 371992
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.ldexp.s16.mir
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-fadd-local.mir
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fabs.mir
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fneg.mir
Commit fb51e64eaccb49199f06982f4170b160c5737420 by Matthew.Arsenault
AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
This was producing an illegal copy which would hit an assert later.
Error on selection for now until this is implemented.
llvm-svn: 371993
The file was modified llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
Commit 98cb8db836b155d0497f9d609a43510fd8eaf83d by gchatelet
[NFC] remove unused functions
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67616
llvm-svn: 371994
The file was modified llvm/lib/Analysis/TargetTransformInfo.cpp
The file was modified llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
The file was modified llvm/include/llvm/Analysis/TargetTransformInfo.h
Commit b7dadc3562d0488127727c924b8735e4780a2b69 by paulsson
[SystemZ] Call erase() on the right MBB in
SystemZTargetLowering::emitSelect()
Since MBB was split *before* MI, the MI(s) will reside in JoinMBB (MBB)
at the point of erasing them, so calling StartMBB->erase() is actually
wrong, although it is "working" by all appearances.
Review: Ulrich Weigand
llvm-svn: 371995
The file was modified llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
Commit c2bafadd7a3338ab0d7f9c5543754ec1a803b17b by sjoerd.meijer
[LV] Add ARM MVE tail-folding tests
Now that the vectorizer can do tail-folding (rL367592), and the ARM
backend understands MVE masked loads/stores (rL371932), it's time to add
the MVE tail-folding equivalent of the X86 tests that I added.
llvm-svn: 371996
The file was added llvm/test/Transforms/LoopVectorize/ARM/tail-loop-folding.ll
Commit d0f63f83e7c5c6fc11e964f848d1496234695182 by mmoroz
[libFuzzer] Remove unused version of FuzzedDataProvider.h.
Summary: The actual version lives in compiler-rt/include/fuzzer/.
Reviewers: Dor1s
Reviewed By: Dor1s
Subscribers: delcypher, #sanitizers, llvm-commits
Tags: #llvm, #sanitizers
Differential Revision: https://reviews.llvm.org/D67623
llvm-svn: 371997
The file was removed compiler-rt/lib/fuzzer/utils/FuzzedDataProvider.h
Commit a48b6e98abc15b2f41570396040454e3c093b568 by llvm-dev
[ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of
dyn_cast<> results. In these cases we can safely use cast<> directly, as
we know that they should all be the correct type (which is why it
currently works), and cast<> will assert if they aren't.
llvm-svn: 371998
The file was modified llvm/lib/ExecutionEngine/ExecutionEngine.cpp
Commit cd1a0b92710e567c00f6d2b932b197e9a1773f7d by oliver.cruickshank
[ARM] Add patterns for CTLZ on MVE
The CTLZ intrinsic can use the VCLZ instruction on MVE, which produces
better results than expanding.
llvm-svn: 371999
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
The file was modified llvm/lib/Target/ARM/ARMInstrMVE.td
The file was added llvm/test/CodeGen/Thumb2/mve-ctlz.ll
Commit 5f799ef1627f6f4f548f411a40fb94c620af25b6 by oliver.cruickshank
[ARM] Lower CTTZ on MVE
Lower CTTZ on MVE using VBRSR and VCLZ, which reverse the bits and then
count the leading zeros, equivalent to a count trailing zeros (CTTZ).
llvm-svn: 372000
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
The file was added llvm/test/CodeGen/Thumb2/mve-cttz.ll
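The identity behind the CTTZ lowering in this commit (reverse the bits, then count leading zeros) can be checked with a small sketch. This is plain Python over 32-bit integers, not the MVE patterns themselves; the helper names `bitreverse32`, `ctlz32`, and `cttz32` are hypothetical and exist only for illustration.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def bitreverse32(x):
    """Reverse the order of the low 32 bits of x."""
    x &= MASK
    result = 0
    for _ in range(WIDTH):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result

def ctlz32(x):
    """Count leading zeros of a 32-bit value; returns 32 for x == 0."""
    x &= MASK
    return WIDTH - x.bit_length()

def cttz32(x):
    """Count trailing zeros by reversing the bits and counting leading zeros."""
    return ctlz32(bitreverse32(x))
```

For example, `cttz32(8)` reverses bit 3 into bit 28 and then counts the three zero bits above it.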
Commit e9510a6cadb1aeb407184514803065413f8dd7bf by oliver.cruickshank
[ARM] Add patterns for bitreverse intrinsic on MVE
BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.
llvm-svn: 372001
The file was added llvm/test/CodeGen/Thumb2/mve-bitreverse.ll
The file was modified llvm/lib/Target/ARM/ARMInstrMVE.td
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit ee6fbebbaff5af0a0fbe58a0e33ef191340223ea by oliver.cruickshank
[ARM] Add patterns for BSWAP intrinsic on MVE
BSWAP can use the VREV instruction on MVE to produce better results than
expanding.
llvm-svn: 372002
The file was modified llvm/lib/Target/ARM/ARMInstrMVE.td
The file was added llvm/test/CodeGen/Thumb2/mve-bswap.ll
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
Commit 4d9d0f9cf532ec40f07178693f1c37049c18bc79 by spatel
[InstCombine] move tests for icmp+add; NFC
llvm-svn: 372004
The file was removed llvm/test/Transforms/InstCombine/2009-01-31-Pressure.ll
The file was modified llvm/test/Transforms/InstCombine/icmp-add.ll
Commit 3961a143e13a9cd7fdfec74a9f26e86117618708 by spatel
[InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in: rL125734
...the code comment about register pressure is discussed in more detail
in: https://bugs.llvm.org/show_bug.cgi?id=2698
But 10 years later, perf testing bzip2 with this change now shows a
slight (0.2% average) improvement on Haswell although that's probably
within test noise.
Given that this is IR canonicalization, we shouldn't be worried about
register pressure though; the backend should be able to adjust for that
as needed.
This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...i.e., if we don't cripple basic transforms, then we won't need to add
special-case code to detect larger patterns.
rL371940 and rL371981 are related patches in this series.
llvm-svn: 372007
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
The file was modified llvm/test/Transforms/InstCombine/icmp-add.ll
Commit 685d8a95c5a392b016b259b41c17065d30b66afe by ibiryukov
[clangd] Simplify semantic highlighting visitor
Summary:
- Functions to compute highlighting kinds for things are separated from
the ones that add highlighting tokens.
This keeps each of them more focused on what they're doing: getting
locations and figuring out the kind of the entity, correspondingly.
- Fewer special cases in the visitor for various nodes.
This change is an NFC.
Reviewers: hokein
Reviewed By: hokein
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67341
llvm-svn: 372008
The file was modified clang-tools-extra/clangd/SemanticHighlighting.cpp
Commit 10151f661854e3ee4922662f1d0f62b327cbfa8c by lebedev.ri
[SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost,
not per-BB cost
Summary: Previously, if the threshold was 2, we were willing to
speculatively execute 2 cheap instructions in both basic blocks (thus we
were willing to speculatively execute cost = 4), but weren't willing to
speculate when one BB had 3 instructions and the other one had no
instructions, even though that would have a total cost of 3.
This looks inconsistent to me. I don't think `cmov`-like instructions
will start executing until both of their inputs are available:
https://godbolt.org/z/zgHePf So I don't see why the existing behavior is
the correct one.
Also, let's add a dedicated `cl::opt` for this threshold, with default=4,
so it is not stricter than the previous threshold: it will allow folding
when there are 2 BBs each with cost=2. And since the logic has changed,
it will also allow folding when one BB has cost=3 and the other cost=1, or
when there is only one BB with cost=4.
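The difference between the two policies described in this summary can be sketched as follows. This is an illustrative Python model, not the SimplifyCFG code: the function names are hypothetical, and each basic block is reduced to a single integer speculation cost.

```python
def folds_per_block(block_costs, threshold=2):
    """Old policy as described: each block is checked against the threshold on its own."""
    return all(cost <= threshold for cost in block_costs)

def folds_total(block_costs, threshold=4):
    """New policy as described: the summed cost of all blocks is checked once."""
    return sum(block_costs) <= threshold
```

Under this model, `[3, 0]` is rejected by the per-block check even though its total cost is 3, while the total check accepts `[2, 2]`, `[3, 1]`, and `[4]` alike.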
This is an alternative solution to D65148: This fix is mainly motivated
by `signbit-like-value-extension.ll` test. That pattern comes up in JPEG
decoding, see e.g.
`Figure F.12 – Extending the sign bit of a decoded value in V` of `ITU
T.81` (JPEG specification). That branch is not predictable, and it is
within the innermost loop, so the fact that that pattern ends up being
stuck with a branch instead of `select` (i.e. `CMOV` for x86) is
unlikely to be beneficial.
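For orientation, the sign-extension step referred to here can be sketched as below. This is a Python rendering of the EXTEND procedure from ITU T.81 Figure F.12, written under the assumption that `t` is at least 1 and `v` is a non-negative `t`-bit value; the function names are hypothetical. The second variant only illustrates the select-style form (both values computed, one chosen) that the fold is meant to enable.

```python
def extend_branch(v, t):
    """Branching form: adjust v only when it lies in the lower half of the t-bit range."""
    if v < (1 << (t - 1)):
        v += (-1 << t) + 1
    return v

def extend_select(v, t):
    """Select-style form: compute the adjustment unconditionally, then pick one value."""
    adjusted = v + (-1 << t) + 1
    return adjusted if v < (1 << (t - 1)) else v
```

Both forms return the same value for every input; only the control flow differs.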
This has great results on the final assembly (vanilla test-suite +
RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |
We significantly decrease basic block count, notably decrease
instruction count, significantly decrease branch count and very
significantly increase `cmov` count.
Performance-wise, unsurprisingly, this has great effect on target
RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                   Time         CPU    Time Old    Time New     CPU Old     CPU New
--------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue       0.0000      0.0000    U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean        -0.3064     -0.3064    226.9913    157.4452    226.9800    157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median      -0.3057     -0.3057    226.8407    157.4926    226.8282    157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev      -0.4985     -0.4954      0.3051      0.1530      0.3040      0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue        0.0000      0.0000    U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean         -0.1747     -0.1747     80.4787     66.4227     80.4771     66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median       -0.1742     -0.1743     80.4686     66.4542     80.4690     66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev       +0.6089     +0.5797      0.0670      0.1078      0.0673      0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue       0.0000      0.0000    U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean        -0.1598     -0.1598    171.6996    144.2575    171.6915    144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median      -0.1598     -0.1597    171.7109    144.2755    171.7018    144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev      +0.4024     +0.3850      0.0847      0.1187      0.0848      0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue        0.0000      0.0000    U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean         -0.0550     -0.0551    280.3046    264.8800    280.3017    264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median       -0.0554     -0.0554    280.2628    264.7360    280.2574    264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev       +0.7005     +0.7041      0.2779      0.4725      0.2775      0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue        0.0000      0.0000    U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean         -0.0354     -0.0355    316.7396    305.5208    316.7342    305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median       -0.0354     -0.0356    316.6969    305.4798    316.6917    305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev       +0.0493     +0.0330      0.3562      0.3737      0.3563      0.3681
```
That being said, it's always best-effort, so there will likely be cases
where this worsens things.
Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot,
hfinkel, chandlerc
Reviewed By: jmolloy
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67318
llvm-svn: 372009
The file was modified llvm/lib/Transforms/Utils/SimplifyCFG.cpp
The file was modified llvm/test/Transforms/SimplifyCFG/PhiEliminate3.ll
The file was modified llvm/test/Transforms/SimplifyCFG/X86/speculate-cttz-ctlz.ll
The file was modified llvm/test/Transforms/SimplifyCFG/safe-low-bit-extract.ll
The file was modified llvm/test/Transforms/SimplifyCFG/speculate-math.ll
The file was modified llvm/test/Transforms/IndVarSimplify/loop_evaluate_1.ll
The file was modified llvm/test/Transforms/PGOProfile/chr.ll
The file was modified llvm/test/Transforms/SimplifyCFG/safe-abs.ll
The file was modified llvm/test/Transforms/SimplifyCFG/signbit-like-value-extension.ll
The file was modified llvm/test/Transforms/SimplifyCFG/SpeculativeExec.ll
The file was modified llvm/test/Transforms/SimplifyCFG/X86/switch_to_lookup_table.ll
Commit a00630785fc7bbc723950543b0d82b6a70bfc797 by a.bataev
[OPENMP]Fix parsing/sema for function templates with declare simd.
We need to return the original declaration group with the
FunctionTemplateDecl, not the inner FunctionDecl, to correctly handle
parsing of directives with the template parameters.
llvm-svn: 372011
The file was modified clang/lib/Sema/SemaOpenMP.cpp
The file was modified clang/test/OpenMP/declare_simd_ast_print.cpp
Commit 8d21460dc50e4b2adabdffc19537b970b5fa7094 by david.green
[ARM] A predicate cast of a predicate cast is a predicate cast
This adds some very basic folding of PREDICATE_CASTS, removing cases when
they are chained together. These would already be removed eventually, as
these are lowered to copies. This just allows it to happen earlier,
which can help other simplifications.
Differential Revision: https://reviews.llvm.org/D67591
llvm-svn: 372012
The file was modified llvm/test/CodeGen/Thumb2/mve-masked-ldst.ll
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
The file was modified llvm/test/CodeGen/Thumb2/mve-pred-loadstore.ll
The file was modified llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
Commit 3df0daddfd466cfc33124379a9c43a781bb6da13 by llvm-dev
[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Determine if all of the uses of LHS/RHS operands can be replaced with a
zero vector.
llvm-svn: 372013
The file was modified llvm/test/CodeGen/X86/vector-shuffle-512-v8.ll
The file was modified llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit b9909ffed82f7ed8a211b21146225badc0f22428 by lebedev.ri
[Clang][Codegen] Relax available-externally-suppress.c test
That test is broken by design. It depends on llvm middle-end behavior.
No clang codegen test should be doing that. This one is salvageable by
relaxing check lines.
llvm-svn: 372014
The file was modified clang/test/CodeGen/available-externally-suppress.c
Commit 6fcd4e080f09c9765d6e0ea03b1da91669c8509a by lebedev.ri
[Clang][Codegen] Disable arm_acle.c test.
This test is broken by design. Clang codegen tests should not depend on
llvm middle-end behaviour, they should *only* test clang codegen. Yet
this test runs the whole optimization pipeline. I've really tried to fix
it, but it isn't just a few things that depend on passes; everything
there does.
llvm-svn: 372015
The file was modified clang/test/CodeGen/arm_acle.c