Commit
b6ee5f2b1df66987e65e1b636ba9ae1554b0334b
by listmail
Move code for checking loop metadata into Analysis [nfc]
I need the mustprogress loop metadata in ScalarEvolution, and it makes sense to keep all the accessors for querying loop metadata together.
|
 | llvm/include/llvm/Analysis/LoopInfo.h (diff) |
 | llvm/lib/Transforms/Utils/LoopUtils.cpp (diff) |
 | llvm/include/llvm/Transforms/Utils/LoopUtils.h (diff) |
 | llvm/lib/Analysis/LoopInfo.cpp (diff) |
|
 | clang/lib/Serialization/ASTReaderStmt.cpp (diff) |
Commit
aaaeb4b160fe94e0ad3bcd6073eea4807f84a33a
by listmail
[SCEV] Use mustprogress flag on loops (in addition to function attribute)
This addresses a performance regression reported against 3c6e4191. That change (correctly) limited a transform based on assumed finiteness to mustprogress loops, but the previous change (38540d7) which introduced the mustprogress check utility only handled function attributes, not the loop metadata form.
It turns out that clang uses the function attribute form for C++, and the loop metadata form for C. As a result, 3c6e4191 ended up being a large regression in practice for C code as loops weren't being considered mustprogress despite the language semantics.
|
 | llvm/lib/Analysis/ScalarEvolution.cpp (diff) |
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll (diff) |
Commit
c03b6305d8419fda84a67f4fe357b69a86e4b54f
by i
[ELF][RISCV] Resolve branch relocations referencing undefined weak to current location if not using PLT
In a -no-pie link, we optimize R_PLT_PC to R_PC. Currently, we resolve a branch relocation referencing an undefined weak symbol to the link-time zero address. However, this choice tends to cause relocation overflows on RISC architectures.
* aarch64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the next instruction
* mips: GNU ld: branch to the start of the text segment (?); ld.lld: branch to zero
* ppc32: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* ppc64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* riscv: GNU ld: branch to the absolute zero address (with instruction rewriting)
* i386/x86_64: GNU ld/ld.lld: branch to the link-time zero address
I think that resolving to the same location is a good choice. The instruction, if triggered, is clearly undefined behavior. Resolving to the same location can cause an infinite loop (making the user aware of the issue) while ensuring no overflow.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D103001
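The overflow argument can be made concrete with a small sketch. It is not lld's code: `branchDisplacement` and `fitsInJal` are hypothetical helpers, and the only outside facts assumed are the usual PC-relative formula S + A - P and the 21-bit signed range of a RISC-V JAL immediate.

```cpp
#include <cassert>
#include <cstdint>

// True if a displacement fits the signed 21-bit immediate of a RISC-V JAL,
// i.e. the range [-2^20, 2^20).
bool fitsInJal(std::int64_t disp) {
  const std::int64_t limit = std::int64_t(1) << 20;
  return disp >= -limit && disp < limit;
}

// Hypothetical helper: PC-relative value S + A - P. When the symbol is an
// undefined weak and no PLT is used, S is replaced by P (the location of the
// branch itself), so the result is just the addend.
std::int64_t branchDisplacement(std::uint64_t sym, std::int64_t addend,
                                std::uint64_t place, bool undefWeakNoPlt) {
  assert(place % 2 == 0 && "RISC-V instructions are at least 2-byte aligned");
  const std::uint64_t target = undefWeakNoPlt ? place : sym;
  return static_cast<std::int64_t>(target + static_cast<std::uint64_t>(addend) - place);
}
```

For a branch placed at 0x80000000, resolving the symbol to zero yields a displacement of -0x80000000, which does not fit, whereas resolving to the current location yields 0, which always fits.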
|
 | lld/ELF/InputSection.cpp (diff) |
 | lld/test/ELF/riscv-undefined-weak.s (diff) |
|
 | lld/ELF/InputSection.cpp (diff) |
|
 | libcxx/include/__ranges/enable_view.h (diff) |
Commit
7629b2a09c169bfd7f7295deb3678f3fa7755eee
by listmail
[LI] Add a cover function for checking if a loop is mustprogress [nfc]
Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.
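As a rough illustration of what such a cover function does, here is a minimal sketch. `FunctionModel`, `LoopModel`, and `isMustProgress` are hypothetical stand-ins, not the LLVM classes or the function added by this commit; the two booleans model the function attribute and the loop metadata respectively.

```cpp
#include <cassert>

// Hypothetical stand-in for a function; the flag models the function attribute.
struct FunctionModel {
  bool hasMustProgressAttr;
};

// Hypothetical stand-in for a loop; the flag models the loop metadata.
struct LoopModel {
  const FunctionModel *parent;
  bool hasMustProgressMetadata;
};

// Cover function: a loop must make progress if either its own metadata or
// the attribute on the enclosing function says so.
bool isMustProgress(const LoopModel &loop) {
  assert(loop.parent != nullptr && "a loop always lives inside a function");
  return loop.hasMustProgressMetadata || loop.parent->hasMustProgressAttr;
}
```

Callers then ask one question instead of two, so a loop marked only through metadata (the form the earlier SCEV commit says clang emits for C) is no longer missed.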
|
 | llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (diff) |
 | llvm/lib/Analysis/LoopInfo.cpp (diff) |
 | llvm/lib/Transforms/Utils/LoopUtils.cpp (diff) |
 | llvm/lib/Analysis/ScalarEvolution.cpp (diff) |
 | llvm/include/llvm/Analysis/LoopInfo.h (diff) |
|
 | llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp (diff) |
Commit
667fbcdd0b2ee5e78f5ce9789b862e3bbca94644
by mizvekov
[clang] NRVO: Improvements and handling of more cases.
This expands NRVO propagation to more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their variables' NRVO status propagated. For Blocks with non-auto return type, as a limitation, this propagation does not consider the actual return type.
This also implements exclusion of VarDecls which are references to dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
|
 | clang/lib/Sema/SemaCoroutine.cpp (diff) |
 | clang/lib/Sema/Sema.cpp (diff) |
 | clang/lib/Sema/SemaExprCXX.cpp (diff) |
 | clang/lib/Sema/SemaStmt.cpp (diff) |
 | clang/test/CodeGen/nrvo-tracking.cpp (diff) |
 | clang/lib/Sema/SemaTemplateInstantiateDecl.cpp (diff) |
 | clang/include/clang/Sema/Sema.h (diff) |
|
 | llvm/test/Transforms/SimplifyCFG/two-entry-phi-return.ll (diff) |
Commit
4f01122c3f6c70beee8f736f196a09976602685f
by joachim
[LV] Parallel annotated loop does not imply all loads can be hoisted.
As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety when a loop is annotated parallel (`!llvm.loop.parallel_accesses`) is unexpected: the documentation for this behavior has since been removed from the LangRef, and the behavior can lead to invalid reads. This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL.
The question remains why this was initially added and what the implications of removing this optimization would be. Do we need an alternative mechanism to propagate the information about legality of if-conversion? Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general? I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous.
Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D103907
|
 | llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h (diff) |
 | llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (diff) |
 | llvm/test/Transforms/LoopVectorize/X86/force-ifcvt.ll |
 | llvm/test/Transforms/LoopVectorize/X86/tail_folding_and_assume_safety.ll (diff) |
Commit
20daedacca803b81db6d8773b705345702bf0fc3
by ataei
2d Arm Neon sdot op, and lowering to the intrinsic.
This adds Sdot2d op, which is similar to the usual Neon intrinsic except that it takes 2d vector operands, reflecting the structure of the arithmetic that it's performing: 4 separate 4-dimensional dot products, whence the vector<4x4xi8> shape.
This also adds a new pass, arm-neon-2d-to-intr, lowering this new 2d op to the 1d intrinsic.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D102504
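The arithmetic described above (4 separate 4-dimensional dot products over i8 values) can be sketched in scalar code. This is only an illustration of the shape of the computation: `sdot2d` and the type aliases are hypothetical names, and the assumption that results accumulate into a 4-lane i32 vector is ours, not a statement of the op's exact definition.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4x4i8 = std::array<std::array<std::int8_t, 4>, 4>;
using Vec4i32 = std::array<std::int32_t, 4>;

// For each of the 4 rows, add the dot product of the 4 signed 8-bit
// elements of a and b to the matching 32-bit accumulator lane.
Vec4i32 sdot2d(Vec4i32 acc, const Vec4x4i8 &a, const Vec4x4i8 &b) {
  for (std::size_t row = 0; row < 4; ++row)
    for (std::size_t col = 0; col < 4; ++col)
      acc[row] += std::int32_t(a[row][col]) * std::int32_t(b[row][col]);
  return acc;
}

// Usage example: every row contributes 4 * (1 * 2) = 8 to its lane.
void demoSdot2d() {
  Vec4x4i8 a{}, b{};
  for (auto &row : a) row.fill(1);
  for (auto &row : b) row.fill(2);
  const Vec4i32 acc = {{0, 1, 2, 3}};
  const Vec4i32 expected = {{8, 9, 10, 11}};
  assert(sdot2d(acc, a, b) == expected);
}
```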
|
 | mlir/lib/Conversion/CMakeLists.txt (diff) |
 | mlir/include/mlir/Conversion/Passes.h (diff) |
 | mlir/include/mlir/Dialect/ArmNeon/ArmNeon.td (diff) |
 | mlir/lib/Conversion/PassDetail.h (diff) |
 | mlir/include/mlir/Conversion/Passes.td (diff) |
 | mlir/test/Dialect/ArmNeon/invalid.mlir |
 | mlir/test/Target/LLVMIR/arm-neon-2d.mlir |
 | mlir/include/mlir/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.h |
 | mlir/lib/Conversion/ArmNeon2dToIntr/ArmNeon2dToIntr.cpp |
 | mlir/lib/Conversion/ArmNeon2dToIntr/CMakeLists.txt |
|
 | mlir/docs/DialectConversion.md (diff) |
Commit
933df6ca796c0ace889bcc64706ec53462bd859a
by Jessica Paquette
[AArch64][GlobalISel] Legalize scalar G_CTTZ + G_CTTZ_ZERO_UNDEF
This adds legalization for scalar G_CTTZ and G_CTTZ_ZERO_UNDEF. Vector support requires handling vector G_BITREVERSE, which I haven't gotten around to yet.
For G_CTTZ_ZERO_UNDEF, we just lower it to G_CTTZ.
For G_CTTZ, we match SelectionDAG's lowering to a G_BITREVERSE + G_CTLZ.
e.g. https://godbolt.org/z/nPEseYh1s
(With this patch, we have slightly worse codegen than SDAG for types smaller than s32; it seems like we're missing a combine.)
Also, this adds in a function to build G_BITREVERSE to MachineIRBuilder.
Differential Revision: https://reviews.llvm.org/D104065
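The identity behind the G_CTTZ lowering, cttz(x) = ctlz(bitreverse(x)), can be checked with a plain scalar sketch. The helper names below are hypothetical and the code is deliberately loop-based rather than a model of the generated machine code.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the order of the 32 bits of x.
std::uint32_t bitReverse32(std::uint32_t x) {
  std::uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (x & 1u);
    x >>= 1;
  }
  return r;
}

// Count leading zero bits; returns 32 for x == 0.
unsigned ctlz32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
    ++n;
  return n;
}

// Count trailing zeros as leading zeros of the bit-reversed value.
unsigned cttz32(std::uint32_t x) { return ctlz32(bitReverse32(x)); }

// Usage example.
void demoCttz() {
  assert(cttz32(8u) == 3);
  assert(cttz32(0u) == 32);
}
```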
|
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (diff) |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir (diff) |
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h (diff) |
 | llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h (diff) |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-cttz-zero-undef.mir |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-cttz.mir |
Commit
462f8f06113616ac5646144972d3f453639aac69
by cjdb
[libcxx][ranges] removes default_initializable from weakly_incrementable and view
also:
* removes default constructors from predefined iterators
* makes span and string_view views
Partially implements P2325. Partially resolves LWG3326.
Differential Revision: https://reviews.llvm.org/D102468
|
 | libcxx/test/std/iterators/iterator.requirements/iterator.concepts/iterator.concept.winc/weakly_incrementable.compile.pass.cpp (diff) |
 | libcxx/test/std/iterators/predef.iterators/insert.iterators/insert.iter.ops/insert.iter.cons/default.pass.cpp |
 | libcxx/include/__ranges/enable_view.h (diff) |
 | libcxx/include/iterator (diff) |
 | libcxx/test/std/iterators/predef.iterators/insert.iterators/back.insert.iter.ops/back.insert.iter.cons/default.pass.cpp |
 | libcxx/test/std/iterators/iterator.requirements/iterator.concepts/iterator.concept.winc/subsumption.compile.pass.cpp |
 | libcxx/docs/Cxx2aStatusPaperStatus.csv (diff) |
 | libcxx/test/std/containers/views/range_concept_conformance.compile.pass.cpp (diff) |
 | libcxx/test/std/strings/string.view/range_concept_conformance.compile.pass.cpp (diff) |
 | libcxx/include/string_view (diff) |
 | libcxx/test/std/ranges/range.req/range.view/view.subsumption.compile.pass.cpp (diff) |
 | libcxx/test/std/iterators/stream.iterators/ostreambuf.iterator/ostreambuf.iter.cons/default.pass.cpp |
 | libcxx/test/std/iterators/stream.iterators/ostream.iterator/ostream.iterator.cons.des/default.pass.cpp |
 | libcxx/include/__ranges/concepts.h (diff) |
 | libcxx/docs/Cxx2aStatusIssuesStatus.csv (diff) |
 | libcxx/test/std/iterators/predef.iterators/insert.iterators/front.insert.iter.ops/front.insert.iter.cons/default.pass.cpp |
 | libcxx/test/std/ranges/range.req/range.view/view.compile.pass.cpp (diff) |
 | libcxx/include/__iterator/concepts.h (diff) |
 | libcxx/include/span (diff) |
Commit
41555eaf65b12db00c8a18e7fe530f72ab9ebfc0
by andrew.kaylor
Preserve more MD_mem_parallel_loop_access and MD_access_group in SROA
SROA sometimes preserves MD_mem_parallel_loop_access and MD_access_group metadata on loads/stores, and sometimes fails to do so. This change adds copying of the MD after other CreateAlignedLoad/CreateAlignedStores. Also fix a case where the metadata was being copied from a load, rather than the store.
Added a LIT test to catch one case.
Patch by Mark Mendell
Differential Revision: https://reviews.llvm.org/D103254
|
 | llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll |
 | llvm/lib/Transforms/Scalar/SROA.cpp (diff) |
Commit
cbd0054b9eb17ec48f0702e3828209646c8f5ebd
by mizvekov
[clang] Implement P2266 Simpler implicit move
This implements P2266 Simpler implicit move (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r1.html).
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99005
|
 | clang/lib/Sema/SemaType.cpp (diff) |
 | clang/test/SemaCXX/return-stack-addr.cpp (diff) |
 | clang/lib/Sema/SemaExprCXX.cpp (diff) |
 | clang/test/SemaCXX/coroutines.cpp (diff) |
 | clang/test/CXX/dcl.dcl/dcl.spec/dcl.type/dcl.spec.auto/p7-cxx14.cpp (diff) |
 | clang/test/SemaCXX/warn-return-std-move.cpp (diff) |
 | clang/test/SemaCXX/coroutine-rvo.cpp (diff) |
 | clang/test/CXX/drs/dr3xx.cpp (diff) |
 | clang/include/clang/Sema/Sema.h (diff) |
 | clang/test/SemaCXX/constant-expression-cxx14.cpp (diff) |
 | clang/test/SemaCXX/deduced-return-type-cxx14.cpp (diff) |
 | clang/lib/Sema/SemaCoroutine.cpp (diff) |
 | clang/lib/Sema/SemaStmt.cpp (diff) |
 | clang/test/CXX/class/class.init/class.copy.elision/p3.cpp (diff) |
 | clang/test/CXX/temp/temp.decls/temp.mem/p5.cpp (diff) |
 | clang/test/CXX/expr/expr.prim/expr.prim.lambda/p4-cxx14.cpp (diff) |
 | clang/test/SemaCXX/constant-expression-cxx11.cpp (diff) |
Commit
189428c8fc2465c25efbf4f0bb73e26fecf150ce
by aeubanks
[Profile] Handle invalid profile data
This mostly follows LLVM's InstrProfReader.cpp error handling. Previously, attempting to merge corrupted profile data would result in crashes. See https://crbug.com/1216811#c4.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D104050
|
 | compiler-rt/lib/profile/InstrProfilingFile.c (diff) |
 | compiler-rt/test/profile/instrprof-merge.c (diff) |
 | compiler-rt/test/profile/Linux/instrprof-merge-vp.c (diff) |
 | compiler-rt/test/profile/instrprof-without-libc.c (diff) |
 | compiler-rt/lib/profile/InstrProfiling.h (diff) |
 | compiler-rt/lib/profile/InstrProfilingMerge.c (diff) |
 | compiler-rt/test/profile/Linux/corrupted-profile.c |
Commit
fc018ebb608ee0c1239b405460e49f1835ab6175
by ndesaulniers
[IR] make -warn-frame-size into a module attr
-Wframe-larger-than= is an interesting warning; we can't know the frame size until PrologueEpilogueInsertion (PEI), which runs very late in the compilation pipeline.
-Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO and needed to be re-specified via -plugin-opt.
Instead, make it part of the IR proper as a module level attribute, similar to D103048. Introduce -fwarn-stack-size CC1 option.
Reviewed By: rsmith, qcolombet
Differential Revision: https://reviews.llvm.org/D103928
|
 | clang/include/clang/Driver/Options.td (diff) |
 | llvm/lib/CodeGen/PrologEpilogInserter.cpp (diff) |
 | clang/include/clang/Basic/CodeGenOptions.def (diff) |
 | llvm/lib/IR/Module.cpp (diff) |
 | clang/test/Frontend/backend-diagnostic.c (diff) |
 | clang/lib/CodeGen/CodeGenModule.cpp (diff) |
 | llvm/test/CodeGen/ARM/warn-stack.ll (diff) |
 | clang/test/Driver/Wframe-larger-than.c |
 | llvm/test/Linker/warn-stack-frame.ll |
 | llvm/include/llvm/IR/Module.h (diff) |
 | clang/test/Misc/backend-stack-frame-diagnostics-fallback.cpp (diff) |
 | llvm/test/CodeGen/X86/warn-stack.ll (diff) |
 | clang/lib/Driver/ToolChains/Clang.cpp (diff) |
|
 | compiler-rt/lib/profile/InstrProfilingMerge.c (diff) |
Commit
119965865cc730060e4cc95690ee7dab91c2c440
by vkeles
LoadStoreVectorizer: support different operand orders in the add sequence match
First, we refactor the code that matches no-wrap add sequences: we need to allow different operand orders for the key add instructions involved in the match.
Then we use the refactored code to try 4 variants of matching operands.
Originally, the code relied on the matching operands of the last two add instructions of the memory index calculations having the same LHS argument. But which operand is shared between the two instructions is not essential, so now we allow it to be either the LHS or the RHS of each of the two instructions. This increases the chances of vectorization.
Reviewed By: volkan
Differential Revision: https://reviews.llvm.org/D103912
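The "4 variants" amount to trying every LHS/RHS pairing of the two adds. A minimal sketch follows, assuming operands are identified by plain integer ids; `AddOperands` and `matchCommonOperand` are hypothetical names, not the vectorizer's code, and the no-wrap checks of the real match are omitted.

```cpp
#include <cassert>

// Hypothetical model of an add instruction: operands are value ids.
struct AddOperands {
  int lhs;
  int rhs;
};

// Try all four pairings (LHS/LHS, LHS/RHS, RHS/LHS, RHS/RHS). On success,
// report the operand of each add that is NOT shared.
bool matchCommonOperand(const AddOperands &a, const AddOperands &b,
                        int &otherA, int &otherB) {
  const int aOps[2] = {a.lhs, a.rhs};
  const int bOps[2] = {b.lhs, b.rhs};
  for (int i = 0; i < 2; ++i) {
    for (int j = 0; j < 2; ++j) {
      if (aOps[i] == bOps[j]) {
        otherA = aOps[1 - i];
        otherB = bOps[1 - j];
        return true;
      }
    }
  }
  return false;
}

// Usage example: value 1 is the LHS of the first add and the RHS of the second.
void demoMatch() {
  int otherA = 0, otherB = 0;
  const bool matched = matchCommonOperand({1, 2}, {3, 1}, otherA, otherB);
  assert(matched && otherA == 2 && otherB == 3);
}
```

A matcher that only compared the two LHS operands would reject the pair in the usage example even though the adds share an operand.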
|
 | llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp (diff) |
 | llvm/test/Transforms/LoadStoreVectorizer/X86/vectorize-i8-nested-add.ll (diff) |
|
 | llvm/test/CodeGen/X86/constructor.ll (diff) |
 | llvm/test/CodeGen/SPARC/constructor.ll (diff) |
 | llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (diff) |
 | llvm/test/CodeGen/X86/2011-08-29-InitOrder.ll (diff) |
Commit
ffaca140d01b0b93723c3322b08351b03b95831f
by ndesaulniers
[IR] Value: Fix OpCode checks
Value::SubclassID cannot be directly compared to Instruction enums, such as Instruction::{Call,Invoke,CallBr}. We have to first subtract InstructionVal from the SubclassID to get the OpCode, similar to Instruction::getOpCode().
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D104043
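A toy model shows why the direct comparison is wrong. All enum values below are made up for illustration and do not match LLVM's real numbering; the only property carried over from the description is that instruction subclass ids start at an `InstructionVal` base, so the opcode is the id minus that base.

```cpp
#include <cassert>

// Hypothetical numbering: non-instruction kinds come first, and every
// instruction kind is InstructionVal + opcode.
enum ValueKind : unsigned { ArgumentVal = 0, ConstantVal = 1, InstructionVal = 2 };
enum Opcode : unsigned { Ret = 1, Call = 2, Invoke = 3 };

// Subclass id stored on a value that is an instruction with opcode op.
unsigned subclassIDFor(Opcode op) { return InstructionVal + op; }

// Recover the opcode by subtracting the base first.
unsigned opcodeOf(unsigned subclassID) {
  assert(subclassID >= InstructionVal && "not an instruction");
  return subclassID - InstructionVal;
}

// Correct check: compare the recovered opcode, not the raw subclass id.
bool isCall(unsigned subclassID) {
  return subclassID >= InstructionVal && opcodeOf(subclassID) == Call;
}
```

In this model the raw id of a call is 4, so comparing it against `Call` (2) never matches a real call and instead matches an unrelated id.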
|
 | llvm/lib/IR/Value.cpp (diff) |
Commit
b35a842581f089daa57dd7e6b78ccb08d92709b2
by craig.topper
[RISCV] Add test cases that show failure to use some W instructions if they are preceded by a load. NFC
The loads end up becoming sextload/zextload, which prevents our isel patterns from finding the sign_extend_inreg or AND instruction we need.
The easiest way to fix this is to use computeKnownBits or ComputeNumSignBits in our isel matching to catch this.
|
 | llvm/test/CodeGen/RISCV/half-convert.ll (diff) |
 | llvm/test/CodeGen/RISCV/double-convert.ll (diff) |
 | llvm/test/CodeGen/RISCV/rv64zbb.ll (diff) |
 | llvm/test/CodeGen/RISCV/float-convert.ll (diff) |
Commit
cfbb92441f17d1f5a9d9c3e195646df4117cb0ca
by carl.ritson
[SDAG] Fix pow2 assumption when splitting vectors
When reducing vector builds to shuffles, it is possible that the DAG combiner may try to extract invalid subvectors.
This happens because the existing code assumes vectors will have power-of-2 sizes, which is already untrue but becomes more noticeable with v6 and v7 types. Specifically, the existing code assumes that half the PowerOf2Ceil of a given vector index will fit twice into a given vector.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D103880
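The arithmetic behind the broken assumption can be sketched in isolation. This is not the DAGCombiner logic: `powerOf2Ceil` is a local reimplementation for non-zero inputs, and `halfCeilFitsTwice` is a hypothetical helper that applies the "half of PowerOf2Ceil fits twice" idea to an element count.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= n (sketch for non-zero n only).
std::uint64_t powerOf2Ceil(std::uint64_t n) {
  assert(n > 0 && "sketch only handles non-zero sizes");
  std::uint64_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Does a chunk of half the rounded-up size fit twice into numElts elements?
bool halfCeilFitsTwice(std::uint64_t numElts) {
  const std::uint64_t half = powerOf2Ceil(numElts) / 2;
  return 2 * half <= numElts;
}
```

For 8 elements the half is 4 and two halves fit exactly; for 6 or 7 elements the half is still 4, and two chunks of 4 overrun the vector.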
|
 | llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (diff) |
|
 | llvm/include/llvm/Support/MachineValueType.h (diff) |
 | llvm/lib/CodeGen/ValueTypes.cpp (diff) |
 | llvm/utils/TableGen/CodeGenTarget.cpp (diff) |
 | llvm/include/llvm/CodeGen/ValueTypes.td (diff) |
Commit
670edf3ee0045ce007f2f6aec94a2c3344c5682e
by Amara Emerson
[AArch64][GlobalISel] Fix incorrectly generating uxtw/sxtw for addressing modes.
When the extend is from 8 or 16 bits, the addressing modes don't support those extensions, but we weren't checking that and therefore always generated the 32->64b extension mode. Fun.
Differential Revision: https://reviews.llvm.org/D104070
|
 | llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (diff) |
 | llvm/test/CodeGen/AArch64/GlobalISel/select-arith-extended-reg.mir (diff) |
Commit
f8a1d652da00ecff448213c58522da5a61d9bc4b
by riddleriver
[mlir][IR] Move MemRefElementTypeInterface to a new BuiltinTypeInterfaces file
This allows for using other type interfaces in the builtin dialect, which currently results in a compile time failure (as it generates duplicate interface declarations).
|
 | mlir/include/mlir/IR/BuiltinTypes.td (diff) |
 | mlir/include/mlir/IR/BuiltinTypeInterfaces.td |
 | mlir/lib/IR/CMakeLists.txt (diff) |
 | mlir/include/mlir/IR/CMakeLists.txt (diff) |
Commit
c42dd5dbb015afaef99cf876195c474c63c2393e
by riddleriver
[mlir] Add new SubElementAttr/SubElementType Interfaces
These interfaces allow for a composite attribute or type to opaquely provide access to any held attributes or types. There are several intended use cases for this interface. The first is to allow the printer to create aliases for non-builtin dialect attributes and types. In the future, this interface will also be extended to allow for SymbolRefAttr to be placed on other entities aside from just DictionaryAttr and ArrayAttr.
To limit potential test breakages, this revision only adds the new interfaces to the builtin attributes/types that are currently hardcoded during AsmPrinter alias generation. In a followup the remaining builtin attributes/types, and non-builtin attributes/types can be extended to support it.
Differential Revision: https://reviews.llvm.org/D102945
|
 | mlir/include/mlir/IR/BuiltinAttributes.h (diff) |
 | mlir/include/mlir/IR/BuiltinTypes.td (diff) |
 | mlir/lib/IR/SubElementInterfaces.cpp |
 | mlir/unittests/IR/SubElementInterfaceTest.cpp |
 | mlir/include/mlir/IR/BuiltinTypes.h (diff) |
 | mlir/include/mlir/IR/SubElementInterfaces.td |
 | mlir/lib/IR/BuiltinAttributes.cpp (diff) |
 | mlir/lib/IR/BuiltinTypes.cpp (diff) |
 | mlir/lib/IR/AsmPrinter.cpp (diff) |
 | mlir/lib/IR/CMakeLists.txt (diff) |
 | mlir/include/mlir/IR/BuiltinAttributes.td (diff) |
 | mlir/unittests/IR/CMakeLists.txt (diff) |
 | mlir/include/mlir/IR/SubElementInterfaces.h |
 | mlir/include/mlir/IR/CMakeLists.txt (diff) |
Commit
8800047707a9cd86fb7143699af0e5564c28f4aa
by riddleriver
[mlir-ir-printing] Prefix the dump message with the split marker(// -----)
This allows for better interaction with tools (such as mlir-lsp-server), as it separates the IR into separate modules for consecutive dumps.
Differential Revision: https://reviews.llvm.org/D104073
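To illustrate why the marker helps tools, here is a sketch of a consumer that cuts a stream of consecutive dumps into separate chunks. `splitDumps` is a hypothetical helper, and the assumption that each dump header line begins with `// -----` is taken from the commit title rather than from the exact printer output.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Start a new chunk at every line that begins with the split marker; text
// before the first marker (if any) forms its own chunk.
std::vector<std::string> splitDumps(const std::string &text) {
  const std::string marker = "// -----";
  std::vector<std::string> chunks;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    const bool startsWithMarker = line.compare(0, marker.size(), marker) == 0;
    if (startsWithMarker || chunks.empty())
      chunks.emplace_back();
    chunks.back() += line;
    chunks.back() += '\n';
  }
  return chunks;
}

// Usage example with two made-up dump headers.
void demoSplitDumps() {
  const std::string text =
      "// -----// dump A\nmodule {}\n// -----// dump B\nmodule {}\n";
  assert(splitDumps(text).size() == 2);
}
```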
|
 | mlir/lib/Pass/IRPrinting.cpp (diff) |
 | mlir/test/Pass/run-reproducer.mlir (diff) |
 | mlir/test/Pass/ir-printing.mlir (diff) |
Commit
7836d058c7e115eace62e324ef6c01670326f518
by llvm-project
[Flang] Compile fix after D99459.
Fix the Flang build after the addition of a new OpenMP clause in a Clang patch (D99459). Flang uses TableGen to generate the declarations of clause checks, and the new clause was missing a definition.
|
 | flang/lib/Semantics/check-omp-structure.cpp (diff) |
Commit
420bd5ee8ec996a2c2e305541e59465a5ba436e3
by craig.topper
[RISCV] Use ComputeNumSignBits/MaskedValueIsZero in RISCVDAGToDAGISel::selectSExti32/selectZExti32.
This helps us select W instructions in more cases. Most of the affected tests have had the sign_extend_inreg or AND folded into sextload/zextload.
Differential Revision: https://reviews.llvm.org/D104079
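The reasoning behind the two queries can be shown on concrete 64-bit values: a value is already sign-extended from 32 bits when its top 33 bits are identical, and already zero-extended when its upper 32 bits are zero. The helpers below are hypothetical value-level analogues, not the SelectionDAG APIs, which reason about what is provable for a node rather than about a known constant.

```cpp
#include <cassert>
#include <cstdint>

// Number of identical leading bits, counting the sign bit itself.
unsigned numSignBits(std::int64_t v) {
  const std::uint64_t u = static_cast<std::uint64_t>(v);
  const std::uint64_t sign = u >> 63;
  unsigned n = 1;
  for (int bit = 62; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
    ++n;
  return n;
}

// Already a sign-extended i32: bits 63..31 all equal, i.e. more than 32 sign bits.
bool isSignExtendedFrom32(std::int64_t v) { return numSignBits(v) > 32; }

// Already a zero-extended i32: the upper 32 bits are all zero.
bool isZeroExtendedFrom32(std::uint64_t v) {
  return (v & 0xFFFFFFFF00000000ULL) == 0;
}

// Usage example: 2^31 as a positive 64-bit value is not a sign-extended i32.
void demoExtensionChecks() {
  assert(isSignExtendedFrom32(-2147483647LL - 1));
  assert(!isSignExtendedFrom32(2147483648LL));
  assert(isZeroExtendedFrom32(0xFFFFFFFFULL));
}
```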
|
 | llvm/lib/Target/RISCV/RISCVInstrInfoB.td (diff) |
 | llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (diff) |
 | llvm/test/CodeGen/RISCV/double-convert.ll (diff) |
 | llvm/test/CodeGen/RISCV/rem.ll (diff) |
 | llvm/test/CodeGen/RISCV/rv64zbb.ll (diff) |
 | llvm/test/CodeGen/RISCV/half-convert.ll (diff) |
 | llvm/test/CodeGen/RISCV/float-convert.ll (diff) |
Commit
2670c7dd5b25e87825edc0aca7729c1d3dba5afc
by qiucofan
[VectorCombine] Fix alignment in single element store
This fixes the concern in single element store scalarization that the alignment of the new store may be larger than it should be. It calculates the largest alignment if the index is constant, and a safe one if not.
Reviewed By: lebedev.ri, spatel
Differential Revision: https://reviews.llvm.org/D103419
|
 | llvm/lib/Transforms/Vectorize/VectorCombine.cpp (diff) |
 | llvm/test/Transforms/VectorCombine/load-insert-store.ll (diff) |