Commit
5f5a2547c174cf1eaf7874ff02c198629fe02c22
by llvm-dev
[X86] LowerBUILD_VECTOR - track zero/nonzero elements with APInt masks. NFCI.
Prep work for undef/zero 'upper elements' handling as proposed in D92645.
|
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
Commit
aefedb170734d680516c3875873c80fc29498b43
by marukawa
[VE] Add logical mask intrinsic instructions
Add andm, orm, xorm, eqvm, nndm, negm, pcvm, lzvm, and tovm intrinsic instructions, a few pseudo instructions to expand logical intrinsics using VM512, a mechanism to expand such pseudo instructions, and regression tests. Also, assign vector mask types and vector mask register classes correctly. This is required to use VM512 registers as function arguments.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D93093
|
 | llvm/test/CodeGen/VE/VELIntrinsics/tovm.ll |
 | llvm/lib/Target/VE/VEInstrInfo.cpp |
 | llvm/test/CodeGen/VE/VELIntrinsics/eqvm.ll |
 | llvm/test/CodeGen/VE/VELIntrinsics/pcvm.ll |
 | llvm/include/llvm/IR/IntrinsicsVEVL.gen.td |
 | llvm/test/CodeGen/VE/VELIntrinsics/lzvm.ll |
 | llvm/test/CodeGen/VE/VELIntrinsics/xorm.ll |
 | llvm/test/CodeGen/VE/VELIntrinsics/nndm.ll |
 | llvm/test/CodeGen/VE/VELIntrinsics/orm.ll |
 | llvm/lib/Target/VE/VEInstrIntrinsicVL.gen.td |
 | llvm/lib/Target/VE/VEInstrVec.td |
 | llvm/test/CodeGen/VE/VELIntrinsics/negm.ll |
 | llvm/test/CodeGen/VE/VELIntrinsics/andm.ll |
Commit
07e92e6b6002d95d438d24eaabf4452ad6e4ef8f
by jay.foad
[AMDGPU] Make use of HasSMemRealTime predicate. NFC.
We have this subtarget feature so it makes sense to use it here. This is NFC because it's always defined by default on GFX8+.
Differential Revision: https://reviews.llvm.org/D93202
|
 | llvm/lib/Target/AMDGPU/AMDGPU.td |
 | llvm/lib/Target/AMDGPU/SMInstructions.td |
Commit
c21df2a79c268d1e0f467ec25a1ec7cb4aff5dfb
by raul
Revert "Re-apply "[CMake][compiler-rt][AArch64] Avoid preprocessing LSE builtins separately""
This reverts commit 03ebe1937192c247c4a7b8ec19dde2cf9845c914.
It's still breaking bots, e.g. http://green.lab.llvm.org/green/job/clang-stage1-RA/17027/console although it doesn't change any actual code. The compile errors don't make much sense either. Revert for now.
Differential Revision: https://reviews.llvm.org/D93228
|
 | compiler-rt/lib/builtins/aarch64/lse.S |
 | compiler-rt/cmake/Modules/CompilerRTDarwinUtils.cmake |
 | compiler-rt/lib/builtins/CMakeLists.txt |
Commit
87d7757bbe14fed420092071ded3430072053316
by Stanislav.Mekhanoshin
[SLP] Control maximum vectorization factor from TTI
D82227 added a proper check to limit PHI vectorization to the maximum vector register size. Unfortunately, that resulted in at least a couple of regressions on SystemZ and x86.
This change reverts the PHI handling from D82227 and replaces it with a more general check in SLPVectorizerPass::tryToVectorizeList(). Placing the check in tryToVectorizeList() allows vectorization to restart if the initial chunk fails.
However, this function is more general: it handles not only PHIs but everything that SLP handles. Limiting the vectorization factor to the maximum vector register size there would restrict far more vectorization than before and lead to further regressions. Therefore a new TTI callback, getMaximumVF(), is added with a default of 0 to preserve current behavior and limit nothing. Targets can then decide what is better for them.
The callback receives ElementSize, just like the similar getMinimumVF() function, together with the main opcode of the chain. The opcode is needed to avoid regressions, at least on AMDGPU, which can have loads and stores up to 128 bits wide and <2 x 16> bit vector math on some subtargets, while the rest should not be vectorized. That is, the limit has to differentiate based on both the element size and the operation itself.
Differential Revision: https://reviews.llvm.org/D92059
|
 | llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp |
 | llvm/test/Transforms/SLPVectorizer/slp-max-phi-size.ll |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h |
 | llvm/include/llvm/Analysis/TargetTransformInfo.h |
 | llvm/lib/Analysis/TargetTransformInfo.cpp |
 | llvm/test/Transforms/SLPVectorizer/AMDGPU/add_sub_sat.ll |
 | llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp |
 | llvm/test/Transforms/SLPVectorizer/AMDGPU/round.ll |
 | llvm/include/llvm/Analysis/TargetTransformInfoImpl.h |
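The getMaximumVF() description in the commit message above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from D92059: the Opcode enum, the DefaultTarget and ExampleTarget types, and the chunkSizes() helper are hypothetical names invented for illustration, and the limits in ExampleTarget merely mirror the AMDGPU example given in the message (128-bit memory operations, two 16-bit elements for math, scalar otherwise).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the main opcode of a vectorization chain.
enum class Opcode { Load, Store, Add, PHI };

// Hypothetical default hook: returning 0 means "no limit", which keeps
// the pre-existing behavior for targets that do not override it.
struct DefaultTarget {
  virtual ~DefaultTarget() = default;
  virtual unsigned getMaximumVF(unsigned ElemWidth, Opcode Op) const {
    (void)ElemWidth;
    (void)Op;
    return 0;
  }
};

// Illustrative target (assumed limits, loosely following the AMDGPU
// example in the commit message): memory operations up to 128 bits,
// two lanes for 16-bit math, and no vectorization for anything else.
struct ExampleTarget : DefaultTarget {
  unsigned getMaximumVF(unsigned ElemWidth, Opcode Op) const override {
    assert(ElemWidth > 0 && "element width must be nonzero");
    if (Op == Opcode::Load || Op == Opcode::Store)
      return std::max(1u, 128u / ElemWidth);
    return ElemWidth == 16 ? 2 : 1;
  }
};

// Split NumElts scalars into chunks no larger than the target's limit.
// A limit of 0 is treated as "unlimited", so everything is one chunk.
std::vector<unsigned> chunkSizes(const DefaultTarget &TTI, unsigned NumElts,
                                 unsigned ElemWidth, Opcode Op) {
  unsigned MaxVF = TTI.getMaximumVF(ElemWidth, Op);
  if (MaxVF == 0)
    MaxVF = NumElts;
  std::vector<unsigned> Chunks;
  for (unsigned Done = 0; Done < NumElts; Done += MaxVF)
    Chunks.push_back(std::min(MaxVF, NumElts - Done));
  return Chunks;
}
```

With the default hook, eight 32-bit adds stay in one chunk of 8. With the example target, eight 32-bit loads are split into two chunks of 4, and five 16-bit adds into chunks of 2, 2, and 1. Passing the opcode alongside the element size is what lets one target apply different limits to memory operations and arithmetic.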
Commit
9ad2091e78eb47e6707abbc7c83e208ea1150589
by sivachandra
[libc][Obvious] Include <fenv.h> from DummyFenv.h.
|
 | libc/utils/FPUtil/DummyFEnv.h |