Commit
d89d3dfae17d7795dc1ef013db66272020de1959
by dvyukov
sanitizer_common: optimize memory drain
Currently we allocate a MemoryMapper per size class. MemoryMapper mmap's and munmap's an internal buffer. This results in 50 mmap/munmap calls under the global allocator mutex. Reuse MemoryMapper and the buffer for all size classes. This radically reduces the number of mmap/munmap calls. Smaller size classes tend to have more objects allocated, so it's highly likely that the buffer allocated for the first size class will be enough for all subsequent size classes.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105778
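The reuse idea in the message above can be sketched roughly as follows. This is only an illustration, not the sanitizer code: `ScratchMapper`, `Acquire`, and `DrainAll` are hypothetical names, and a `std::vector` stands in for the mmap'ed buffer, with a counter modelling the mmap/munmap traffic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a mapper that owns one scratch buffer.
// map_calls() counts how often the buffer had to be (re)allocated,
// which models the mmap/munmap calls mentioned in the message.
class ScratchMapper {
 public:
  // Returns a zeroed buffer of at least `size` bytes, growing only if needed.
  unsigned char *Acquire(std::size_t size) {
    if (size > buffer_.size()) {
      buffer_.assign(size, 0);  // models munmap + mmap of a larger buffer
      ++map_calls_;
    } else {
      std::fill(buffer_.begin(), buffer_.begin() + size, 0);  // reuse in place
    }
    return buffer_.data();
  }
  int map_calls() const { return map_calls_; }

 private:
  std::vector<unsigned char> buffer_;
  int map_calls_ = 0;
};

// Drains every size class with one shared mapper instead of one per class.
// buffer_needs[i] is the scratch size (assumed) that size class i requires.
inline int DrainAll(const std::vector<std::size_t> &buffer_needs) {
  ScratchMapper mapper;  // created once, outside the loop
  for (std::size_t need : buffer_needs) {
    unsigned char *buf = mapper.Acquire(need);
    assert(buf != nullptr || need == 0);
    (void)buf;  // a real drain would record releasable pages here
  }
  return mapper.map_calls();
}
```

When the first size class needs the largest buffer, as the message expects for small size classes, `DrainAll` allocates once for the whole pass.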
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
Commit
1d8030053d46b89e3677986d059065c6a2e7a2e1
by jeroen.dobbelaere
[NFC] Do not track calls to inlined intrinsics in IFI.
Just like intrinsics are not tracked for IFI.InlinedCalls, they should not be tracked for IFI.InlinedCallSites.
In the current top-of-tree this change is an NFC, but the full restrict patches (D68484) potentially trigger a read-after-free if intrinsics are also added to the InlinedCallSites, due to a late optimization potentially removing some of the inlined intrinsics.
Also see https://lists.llvm.org/pipermail/llvm-dev/2021-July/151722.html for a discussion about the problem.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D105805
|
 | llvm/lib/Transforms/Utils/InlineFunction.cpp |
 | llvm/lib/Transforms/IPO/Inliner.cpp |
Commit
45430983ef8235f2a018e7daa10a0ad71ef7b85c
by ro
[sanitizer_common] Define internal_usleep on Solaris
The Solaris/amd64 buildbot <https://lab.llvm.org/staging/#/builders/101/builds/2845> has recently been broken several times; at least one of those breakages remains unfixed:
  [63/446] Generating Sanitizer-x86_64-Test
  [...]
  Undefined                       first referenced
   symbol                             in file
  _ZN11__sanitizer15internal_usleepEy /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/stage1/projects/compiler-rt/lib/sanitizer_common/tests/libRTSanitizerCommon.test.x86_64.a(sanitizer_common.cpp.o)
  ld: fatal: symbol referencing errors
This patch fixes it by defining the missing `internal_usleep`.
Tested on `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D105878
|
 | compiler-rt/lib/sanitizer_common/sanitizer_solaris.cpp |
Commit
90a6bb30fafa4e68d4af1fef62987fe187fa70ab
by jeroen.dobbelaere
[remangleIntrinsicFunction] Detect and resolve name clash
It is possible that the remangled name for an intrinsic already exists with a different (and wrong) prototype within the module. As the bitcode reader keeps both versions of all remangled intrinsics around for a longer time, this can result in a crash, as can be seen in https://bugs.llvm.org/show_bug.cgi?id=50923
This patch makes 'remangleIntrinsicFunction' aware of this situation. When it is detected, it moves the version with the wrong prototype to a different name. That version will be removed anyway once the module is completely loaded.
With thanks to @asbirlea for reporting this issue when trying out an lto build with the full restrict patches, and @efriedma for suggesting a sane resolution mechanism.
Reviewed By: apilipenko
Differential Revision: https://reviews.llvm.org/D105118
|
 | llvm/lib/IR/Function.cpp |
 | llvm/test/tools/llvm-link/Inputs/remangle2.ll |
 | llvm/test/tools/llvm-link/Inputs/remangle1.ll |
 | llvm/include/llvm/IR/Intrinsics.h |
 | llvm/test/Assembler/remangle.ll |
 | llvm/test/tools/llvm-link/remangle.ll |
Commit
d991b7212b4c852c29b03d6d9aec40a6e819be95
by fraser
[RISCV] Pass undef VECTOR_SHUFFLE indices on to BUILD_VECTOR
Often when lowering vector shuffles, we split the shuffle into two LHS/RHS shuffles which are then blended together. To do so we split the original indices into two, indexed into each respective vector. These two index vectors are then separately lowered as BUILD_VECTORs.
This patch forwards any undef indices on to the BUILD_VECTOR, rather than having the VECTOR_SHUFFLE lowering decide on an optimal concrete index. The motivation for this change is so that we don't duplicate optimization logic between the two lowering methods and let BUILD_VECTOR do what it does best.
Propagating undef in this way allows us, for example, to generate `vid.v` to produce the LHS indices of commonly-used interleave-type shuffles. I have designs on further optimizing interleave-type and other common shuffle patterns in the near future.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104789
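A rough model of the index split described above, for illustration only. It is not the RISCV lowering code: `SplitMask` and `SplitShuffleMask` are hypothetical names, `-1` is assumed to mark an undef mask entry, and `std::nullopt` models an undef index that a later stage is free to fill in.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using Index = std::optional<int>;  // std::nullopt models an undef index

struct SplitMask {
  std::vector<Index> lhs;      // indices into the first source vector
  std::vector<Index> rhs;      // indices into the second source vector
  std::vector<bool> take_lhs;  // blend mask: true selects the LHS result
};

// Splits a two-source shuffle mask over vectors of n elements each.
// Entries in [0, n) pick from the LHS, entries in [n, 2n) pick from the RHS,
// and -1 marks an undef lane that is passed through as undef on both sides.
inline SplitMask SplitShuffleMask(const std::vector<int> &mask, int n) {
  SplitMask out;
  for (int m : mask) {
    assert(m >= -1 && m < 2 * n);
    bool from_lhs = m < n;  // undef lanes are grouped with the LHS here
    out.take_lhs.push_back(from_lhs);
    out.lhs.push_back(from_lhs && m >= 0 ? Index(m) : std::nullopt);
    out.rhs.push_back(from_lhs ? std::nullopt : Index(m - n));
  }
  return out;
}
```

For an interleave-style mask such as `{0, 4, 1, 5}` with `n = 4`, the LHS index vector becomes `{0, undef, 1, undef}`, leaving the unused lanes open for whatever stage materializes the indices.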
|
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll |
 | llvm/test/CodeGen/RISCV/rvv/interleave-crash.ll |
 | llvm/test/CodeGen/RISCV/rvv/common-shuffle-patterns.ll |
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp |
Commit
8724a7ec1131d2d550cabab36784a30c6a97c852
by gchatelet
[libc] update benchmark distributions
All distributions (except D) have been updated using 7 days' worth of data. Distributions are smoother. This patch also moves data from the header file to individual CSV files. It helps the editor and allows easier export/plotting of the data.
Differential Revision: https://reviews.llvm.org/D105766
|
 | libc/benchmarks/distributions/MemcmpGoogleU.csv |
 | libc/benchmarks/distributions/MemcmpGoogleS.csv |
 | libc/benchmarks/distributions/MemcmpGoogleD.csv |
 | libc/benchmarks/distributions/MemsetGoogleA.csv |
 | libc/benchmarks/distributions/README.md |
 | libc/benchmarks/distributions/MemcmpGoogleW.csv |
 | libc/benchmarks/distributions/Uniform384To4096.csv |
 | libc/benchmarks/distributions/MemcpyGoogleQ.csv |
 | libc/benchmarks/distributions/MemcmpGoogleM.csv |
 | libc/benchmarks/distributions/MemcpyGoogleB.csv |
 | libc/benchmarks/distributions/MemsetGoogleD.csv |
 | libc/benchmarks/distributions/MemsetGoogleW.csv |
 | libc/benchmarks/distributions/MemsetGoogleB.csv |
 | libc/benchmarks/distributions/MemcpyGoogleU.csv |
 | libc/benchmarks/distributions/MemcmpGoogleQ.csv |
 | libc/benchmarks/distributions/MemsetGoogleS.csv |
 | libc/benchmarks/distributions/MemcpyGoogleL.csv |
 | libc/benchmarks/distributions/MemcpyGoogleS.csv |
 | libc/benchmarks/distributions/MemcpyGoogleM.csv |
 | libc/benchmarks/distributions/MemcmpGoogleL.csv |
 | libc/benchmarks/distributions/MemcpyGoogleA.csv |
 | libc/benchmarks/distributions/MemcpyGoogleW.csv |
 | libc/benchmarks/distributions/MemsetGoogleL.csv |
 | libc/benchmarks/distributions/MemcmpGoogleA.csv |
 | libc/benchmarks/distributions/MemsetGoogleM.csv |
 | libc/benchmarks/MemorySizeDistributions.cpp |
 | libc/benchmarks/distributions/MemsetGoogleQ.csv |
 | libc/benchmarks/distributions/MemcmpGoogleB.csv |
 | libc/benchmarks/distributions/MemsetGoogleU.csv |
 | libc/benchmarks/distributions/MemcpyGoogleD.csv |
Commit
7802f62b3f2c18aa689e315460539736e1c81974
by Tim Northover
AArch64: use 4-byte slots for arm64_32 pointers in a tail call
|
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/test/CodeGen/AArch64/swifttail-arm64_32.ll |
Commit
78463ebde2f8a1b8ce984c1ae7c6da0c2d323005
by anton.zabaznov
[OpenCL] Add support of __opencl_c_generic_address_space feature macro
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D103401
|
 | clang/lib/Basic/TargetInfo.cpp |
 | clang/lib/Parse/ParseDecl.cpp |
 | clang/test/CodeGenOpenCL/address-spaces.cl |
 | clang/test/CodeGenOpenCL/amdgpu-sizeof-alignof.cl |
 | clang/test/SemaOpenCL/address-spaces.cl |
 | clang/test/CodeGenOpenCL/overload.cl |
 | clang/test/SemaOpenCL/address-spaces-conversions-cl2.0.cl |
 | clang/test/CodeGenOpenCL/address-spaces-mangling.cl |
 | clang/test/CodeGenOpenCL/address-spaces-conversions.cl |
Commit
9d72c0ad43e720ef2394a23a2f4c58f79d753f03
by sebastian.neubauer
[AMDGPU] Mark waterfall loops as SI_WATERFALL_LOOP
This way, they can be detected later, e.g. by the SIOptimizeVGPRLiveRange pass.
Differential Revision: https://reviews.llvm.org/D105467
|
 | llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.mir |
 | llvm/lib/Target/AMDGPU/SIInstructions.td |
 | llvm/lib/Target/AMDGPU/SIInstrInfo.cpp |
 | llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp |
Commit
ad2c66ec5d4bb0425625155bba966732ef85e6e5
by sebastian.neubauer
[AMDGPU] Optimize VGPR LiveRange in waterfall loops
The loops are run exactly once per lane, so VGPRs do not need to be saved. Use the SIOptimizeVGPRLiveRange pass to add phi nodes that take undef when coming from the loop.
There is still a shortcoming: Return values from a function call in the loop are copied because their live range conflicts with the live range of arguments, even if arguments are only IMPLICIT_DEF after the phi insertion.
Differential Revision: https://reviews.llvm.org/D105192
|
 | llvm/lib/Target/AMDGPU/SIOptimizeVGPRLiveRange.cpp |
 | llvm/test/CodeGen/AMDGPU/indirect-call.ll |
 | llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.buffer.load.format.v3f16.ll |
 | llvm/test/CodeGen/AMDGPU/image-sample-waterfall.ll |
 | llvm/test/CodeGen/AMDGPU/vgpr-descriptor-waterfall-loop-idom-update.ll |
Commit
e312fc49ae1ec86999676edc9c02a4ac0bc39cec
by nicolas.vasilache
[mlir][Linalg] Add layout specification support to bufferization.
Previously, linalg bufferization always had to be conservative at function boundaries and assume the most dynamic strided memref layout. This revision introduces a mechanism to specify a linalg.buffer_layout function argument attribute that carries an affine map used to set a less pessimistic layout.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D105859
|
 | mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp |
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgBase.td |
 | mlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir |
Commit
85cb4f9904e9b080302e0c0874e3b441fe7062a7
by Tim Northover
Support: reduce stack used in default size test.
When the sanitizers are enabled, the test can use more than 1KB of stack, causing an overflow where there shouldn't be one.
Should fix Green Dragon test.
|
 | llvm/unittests/Support/Threading.cpp |
Commit
afdae7c5d797f952bdfbaeb2cfe41a7dcca7a7b9
by llvm-dev
[X86][SSE] Add signbit tests to show cmpss/cmpsd intrinsics not recognised as 'allbits' results.
This adds test coverage for the crash reported on rGe4aa6ad13216
|
 | llvm/test/CodeGen/X86/known-signbits-vector.ll |
Commit
af55335924ea852e1208d35e2462435f4a3d639c
by nicolas.vasilache
[mlir][Linalg] Better support for bufferizing non-tensor results.
Clean up corner cases related to elemental tensor / buffer type return values that would previously fail.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D105857
|
 | mlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp |
Commit
72748488addd651beb7b60da462c721f3e175357
by jan.kratochvil
[lldb] Fix editline unicode on Linux
Based on: [lldb-dev] proposed change to remove conditional WCHAR support in libedit wrapper https://lists.llvm.org/pipermail/lldb-dev/2021-July/016961.html
There is already a setlocale call in lldb/source/Core/IOHandlerCursesGUI.cpp, but that does not apply to Editline GUI editing.
I am not aware of a way to write an automated test for this; it requires a pty.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D105779
|
 | lldb/source/Core/IOHandlerCursesGUI.cpp |
 | lldb/tools/driver/Driver.cpp |
Commit
b6b53ffef4414ed62701a63ad28e70cfd9d26191
by jonathanchesterfield
[libomptarget][devicertl] Remove branches around setting parallelLevel
Simplifies control flow to allow store/load forwarding
This change folds two basic blocks into one, leaving a single store to parallelLevel. This is a step towards spmd kernels with sufficiently aggressive inlining folding the loads from parallelLevel and thus discarding the nested parallel handling when it is unused.
Transform:
```
int threadId = GetThreadIdInBlock();
if (threadId == 0) {
  parallelLevel[0] = expr;
} else if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// =>
if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// because
unsigned GetLaneId() { return GetThreadIdInBlock() & (WARPSIZE - 1);}
// so whenever threadId == 0, GetLaneId() is also 0.
```
That replaces a store in two distinct basic blocks with a single store.
A more aggressive follow up is possible if the threads in the warp/wave race to write the same value to the same address. This is not done as part of this change.
```
if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// =>
parallelLevel[GetWarpId()] = expr;
// because
unsigned GetWarpId() { return GetThreadIdInBlock() / WARPSIZE; }
// so GetWarpId will index the same element for every thread in the warp
// and, because expr is lane-invariant in this case, every lane stores the
// same value to this unique address
```
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D105699
|
 | openmp/libomptarget/deviceRTLs/common/src/omptarget.cu |
Commit
b205f2bb8938447638e9ddc4ee1f6b82caeb1ad3
by abidh
[AMDGPU] Handle s_branch to another section.
Currently, if the target of an s_branch instruction is in another section, it fails with an undefined label error, even though the label is not undefined but present in another section. This patch handles that case: while handling the fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned.
This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get a crash for this scenario. The crash is fixed now but we still get an undefined label error. Jumps to another section can arise with hot/cold splitting.
A patch to handle the relocation in lld will follow shortly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D105760
|
 | llvm/docs/AMDGPUUsage.rst |
 | llvm/include/llvm/BinaryFormat/ELFRelocs/AMDGPU.def |
 | llvm/test/MC/AMDGPU/reloc.s |
 | llvm/test/tools/llvm-readobj/ELF/reloc-types-elf-amdgpu.test |
 | llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp |
Commit
bb0166dc72791e2cefdb0c8dc9e495ea0555357b
by georgios.rokos
[libomptarget] Update device pointer only if needed
Currently, libomptarget will always perform a host-to-device memory transfer in order to update the device pointer of a PTR_AND_OBJ entry. This is not always necessary because the device pointer may have been set to the correct pointee address already, so we can eliminate the redundant memory transfer.
|
 | openmp/libomptarget/src/omptarget.cpp |
 | openmp/libomptarget/test/mapping/device_ptr_update.c |
Commit
9c90725eaee5a00e5dd450e51c4070afd7081472
by frgossen
[MLIR] Fix documentation of the `ExecutionEngine` in the toy tutorial example
Differential Revision: https://reviews.llvm.org/D105813
|
 | mlir/docs/Tutorials/Toy/Ch-6.md |
Commit
3cee36c5acdb292c331818c553bfb8e5abbdb95e
by llvm-dev
[X86][SSE] X86ISD::FSETCC nodes (cmpss/cmpsd) return a 0/-1 allbits signbits result (REAPPLIED)
Annoyingly, i686 cmpsd handling still fails to remove the unnecessary neg(and(x,1))
Reapplied rGe4aa6ad13216 with fix for intrinsic variants of the opcode which uses a vector return type
|
 | llvm/test/CodeGen/X86/known-signbits-vector.ll |
 | llvm/lib/Target/X86/X86ISelLowering.cpp |
Commit
4709d9d5be79835a5a8751dba83e9150dbce9e6e
by lebedev.ri
[libomp] ompd_init(): fix heap-buffer-overflow when constructing libompd.so path
There is no guarantee that the space allocated in `libname` is enough to accommodate the whole `dl_info.dli_fname`, because it could e.g. have a suffix - `.5`, and that highlights another problem - what should it do about suffixes, and should it do anything to resolve the symlinks before changing the filename?
```
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib" ./src/utilities/rstest/rstest -c /tmp/f49137920.NEF
dl_info.dli_fname "/usr/local/lib/libomp.so.5"
strlen(dl_info.dli_fname) 26
lib_path_length 14
lib_path_length + 12 26
=================================================================
==30949==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000002a at pc 0x000000548648 bp 0x7ffdfa0aa780 sp 0x7ffdfa0a9f40
WRITE of size 27 at 0x60300000002a thread T0
    #0 0x548647 in strcpy (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647)
    #1 0x7fb9e3e3d234 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:102:5
    #2 0x7fb9e3dcb446 in __kmp_do_serial_initialize() /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6742:3
    #3 0x7fb9e3dcb40b in __kmp_get_global_thread_id_reg /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:251:7
    #4 0x59e035 in main /home/lebedevri/rawspeed/build-Clang-SANITIZE/../src/utilities/rstest/rstest.cpp:491
    #5 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16
    #6 0x4df449 in _start (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x4df449)

0x60300000002a is located 0 bytes to the right of 26-byte region [0x603000000010,0x60300000002a)
allocated by thread T0 here:
    #0 0x55cc5d in malloc (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x55cc5d)
    #1 0x7fb9e3e3d224 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:101:17
    #2 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647) in strcpy
Shadow bytes around the buggy address:
  0x0c067fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c067fff8000: fa fa 00 00 00[02]fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30949==ABORTING
Aborted
```
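One way to avoid this class of overflow is to copy only the directory part of the runtime path and let a string type size the result. The sketch below illustrates that idea under assumptions; it is not the fix that landed in `ompd-specific.cpp`, and `SiblingLibompdPath` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Builds "<dir>/libompd.so" from the path of the loaded runtime library.
// Only the directory component is copied, so a version suffix such as ".5"
// on the runtime's file name cannot overrun the destination.
inline std::string SiblingLibompdPath(const std::string &runtime_path) {
  const std::string file_name = "/libompd.so";
  std::string::size_type slash = runtime_path.rfind('/');
  // With no directory component, fall back to a bare file name.
  if (slash == std::string::npos)
    return file_name.substr(1);
  std::string result = runtime_path.substr(0, slash);  // directory only
  result += file_name;
  assert(result.size() == slash + file_name.size());
  return result;
}
```

With the path from the log, `"/usr/local/lib/libomp.so.5"`, the directory is 14 characters long, matching the `lib_path_length 14` line, and the result is `"/usr/local/lib/libompd.so"`. Symlink resolution, which the message also raises, is deliberately left out of this sketch.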
|
 | openmp/runtime/src/ompd-specific.cpp |
Commit
ab76101f40f80bbec82073fc5bfddd7203e63a52
by anton.zabaznov
[OpenCL] Add support of __opencl_c_read_write_images feature macro
This feature requires support of __opencl_c_images, so a diagnostic for that is provided as well
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D104915
|
 | clang/include/clang/Basic/OpenCLOptions.h |
 | clang/lib/Sema/SemaDeclAttr.cpp |
 | clang/test/Misc/opencl-c-3.0.incorrect_options.cl |
 | clang/lib/Basic/Targets.cpp |
 | clang/lib/Basic/OpenCLOptions.cpp |
 | clang/test/SemaOpenCL/access-qualifier.cl |
 | clang/test/SemaOpenCL/unsupported-image.cl |
 | clang/include/clang/Basic/DiagnosticCommonKinds.td |
 | clang/include/clang/Basic/DiagnosticSemaKinds.td |
Commit
c99e17fef5f34ac536192fa7b915641f1962c7b9
by llvm-dev
[InstCombine] Pre-commit ashr(or(neg(x),x),bw-1) --> sext(icmp_ne(x,0)) tests from D105764
Added 'thwart complexity-based canonicalization' hacks and the lshr(or(neg(x),x),bw-1) --> zext(icmp_ne(x,0)) variants suggested by Sanjay.
|
 | llvm/test/Transforms/InstCombine/sub-lshr-or-to-icmp-select.ll |
 | llvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll |
Commit
45ffe6341d9642487785b0d0028166e6fbdbe5d7
by thakis
[clang/objc] Optimize getters for non-atomic, copied properties
Properties that were declared `@property(copy, nonatomic) id foo` make an unnecessary call to objc_get_property(). This call can be replaced with a direct access to the backing variable identical to how a `@property(nonatomic) id foo` would do it.
This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit function call per property declared as copy/nonatomic.
Differential Revision: https://reviews.llvm.org/D105311
|
 | clang/test/CodeGenObjC/arc-blocks.m |
 | clang/lib/CodeGen/CGObjC.cpp |
Commit
b2f6cf14798ac738bc2c9b35bd83171e0771b7a3
by llvm-dev
[InstCombine] Fold lshr/ashr(or(neg(x),x),bw-1) --> zext/sext(icmp_ne(x,0)) (PR50816)
Handle the missing fold reported in PR50816, which is a variant of the existing ashr(sub_nsw(X,Y),bw-1) --> sext(icmp_sgt(X,Y)) fold.
We also handle the lshr(or(neg(x),x),bw-1) --> zext(icmp_ne(x,0)) equivalent - https://alive2.llvm.org/ce/z/SnZmSj
We still allow multi uses of the neg(x) - as this is likely to let us further simplify other uses of the neg - but not multi uses of the or() which would increase instruction count.
Differential Revision: https://reviews.llvm.org/D105764
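The identity behind this fold can be spot-checked on 32-bit values with a small sketch. The helper names below are made up for illustration; the Alive2 link in the message is the actual proof. The reasoning is that `or(neg(x), x)` has its sign bit set exactly when `x` is non-zero, so shifting by `bw-1` yields the `icmp ne` result.

```cpp
#include <cassert>
#include <cstdint>

// or(neg(x), x) on i32, using unsigned arithmetic so the negate wraps.
inline uint32_t OrNeg(int32_t x) {
  uint32_t ux = static_cast<uint32_t>(x);
  return (0u - ux) | ux;
}

// lshr(or(neg(x), x), 31): moves the sign bit into bit 0.
inline uint32_t LshrForm(int32_t x) { return OrNeg(x) >> 31; }

// ashr(or(neg(x), x), 31): replicates the sign bit into every bit.
// Written as a negation of the sign bit to stay portable before C++20.
inline int32_t AshrForm(int32_t x) {
  return -static_cast<int32_t>(OrNeg(x) >> 31);
}

// The folded forms: zext(icmp ne x, 0) and sext(icmp ne x, 0).
inline uint32_t ZextIcmpNe(int32_t x) { return x != 0 ? 1u : 0u; }
inline int32_t SextIcmpNe(int32_t x) { return x != 0 ? -1 : 0; }

// Returns true when both folds agree with the original expressions for x.
inline bool FoldHolds(int32_t x) {
  return LshrForm(x) == ZextIcmpNe(x) && AshrForm(x) == SextIcmpNe(x);
}

// Spot-checks a few edge values, including both extremes of the range.
inline void CheckSamples() {
  const int32_t samples[] = {0, 1, -1, 42, INT32_MIN, INT32_MAX};
  for (int32_t x : samples)
    assert(FoldHolds(x));
}
```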
|
 | llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp |
 | llvm/test/Transforms/InstCombine/sub-lshr-or-to-icmp-select.ll |
 | llvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll |
Commit
e9533b84920798cf9b35d26586a61bad0a1f9825
by alexfh
[NFC] Add parentheses around logical expression to silence -Wlogical-op-parentheses warning.
Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D105890
|
 | clang/lib/Sema/SemaDeclAttr.cpp |
Commit
db635a28e65fa168536a100542d250f0b13c7039
by hansang.bae
[OpenMP] Minor improvement in task allocation
This patch includes a few changes to improve task allocation performance slightly. These changes are enough to recover the performance drop observed after introducing the hidden helper.
Differential Revision: https://reviews.llvm.org/D105715
|
 | openmp/runtime/src/kmp_tasking.cpp |
Commit
2a9366c0e53593b2be2b91b4a37019ca8cae4557
by Louis Dionne
[libc++] Generate ABI list for macOS arm64
|
 | libcxx/lib/abi/arm64-apple-darwin.libcxxabi.v1.stable.exceptions.no_new_in_libcxx.abilist |
Commit
c5ad8bb8d41018ba58490873e95cc841d9276702
by Louis Dionne
[libc++] Target x86_64 only for the backdeployment jobs
Differential Revision: https://reviews.llvm.org/D105846
|
 | libcxx/utils/ci/buildkite-pipeline.yml |
Commit
0da95a5cf2691c8e01ae02f108487c397ab7e0ce
by Louis Dionne
[libc++] Workaround non-constexpr std::exchange pre C++20
std::exchange is only constexpr in C++20 and later. We were using it in a constructor marked unconditionally constexpr, which caused issues when building with -std=c++17.
The weird part is that the issue only showed up when building on the arm64 macs, but that must be caused by the specific version of Clang used on those. Since the code is clearly wrong and the fix is obvious, I'm not going to investigate this further.
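A typical workaround of this kind is a small constexpr stand-in for `std::exchange`. The sketch below shows the general idea and is not necessarily what the test file does; `ConstexprExchange`, `Tracker`, and `MovedValue` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Minimal constexpr replacement for std::exchange, usable from C++14 on
// because std::move and std::forward are already constexpr there.
template <class T, class U = T>
constexpr T ConstexprExchange(T &obj, U &&new_value) {
  T old = std::move(obj);
  obj = std::forward<U>(new_value);
  return old;
}

struct Tracker {
  int value;
  constexpr explicit Tracker(int v) : value(v) {}
  // Move constructor that resets the source; valid in a constant expression
  // in C++14 and C++17, where std::exchange itself is not constexpr.
  constexpr Tracker(Tracker &&other)
      : value(ConstexprExchange(other.value, 0)) {}
};

constexpr int MovedValue() {
  Tracker a(7);
  Tracker b(std::move(a));
  return b.value * 10 + a.value;  // 70 when a was reset to 0
}

static_assert(MovedValue() == 70, "exchange must work in constant expressions");

// Runtime counterpart of the compile-time check above.
inline void CheckAtRuntime() { assert(MovedValue() == 70); }
```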
|
 | libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/explicit_optional_U.pass.cpp |
Commit
6a3904f16e8e2095082f71e862a33266e10fa871
by Matthew.Arsenault
Mips: Mark special case calling convention handling as custom
The number of registers used for passing f64 in some cases is context dependent, and thus getNumRegistersForCallingConv is sometimes inaccurate. For f64, it reports 1 but is sometimes split into 2 32-bit registers.
For GlobalISel, the generic argument assignment code expects getNumRegistersForCallingConv to return an accurate answer. Switch to marking these arguments as custom so we can deal with this case as a custom assignment instead.
This temporarily breaks a few globalisel tests which are fixed by a future change to use more of the generic infrastructure.
|
 | llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/float_args.ll |
 | llvm/lib/Target/Mips/MipsISelLowering.cpp |
 | llvm/test/CodeGen/Mips/GlobalISel/irtranslator/float_args.ll |
Commit
121541fdcd5c9760ff242451d2b682c45a2a54df
by Matthew.Arsenault
Mips/GlobalISel: Use more standard call lowering infrastructure
This also fixes some missing implicit uses on call instructions, adds missing G_ASSERT_SEXT/ZEXT annotations, and some missing outgoing sext/zexts. This also fixes not respecting tablegen requested type promotions.
This starts treating f64 passed in i32 GPRs as a type of custom assignment, which restores some previously XFAILed tests. This is because getNumRegistersForCallingConv returns a static value, but in this case it is context dependent on other arguments.
Most of the ugliness is reproducing a hack CC_MipsO32 uses in SelectionDAG. CC_MipsO32 depends on a bunch of vectors populated from the original IR argument types in MipsCCState. The way this ends up working in GlobalISel is it only ends up inspecting the most recently added vector element. I'm pretty sure there are cleaner ways to do this, but this seemed easier than fixing up the current DAG handling. This is another case where it would be easier if the CCAssignFns were passed the original type instead of only the pre-legalized ones.
There's still a lot of junk here that shouldn't be necessary. This also likely breaks big endian handling, but it wasn't complete/tested anyway since the IRTranslator gives up on big endian targets.
|
 | llvm/lib/Target/Mips/MipsCallLowering.cpp |
 | llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h |
 | llvm/lib/Target/Mips/MipsCallLowering.h |
 | llvm/lib/Target/Mips/MipsCCState.h |
 | llvm/lib/Target/Mips/MipsCCState.cpp |
 | llvm/lib/Target/ARM/ARMCallLowering.cpp |
 | llvm/test/CodeGen/Mips/GlobalISel/irtranslator/float_args.ll |
 | llvm/test/CodeGen/Mips/GlobalISel/irtranslator/extend_args.ll |
 | llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/float_args.ll |
Commit
77a608d9de472766fcab51412100764e534ceaf9
by Matthew.Arsenault
GlobalISel: Remove getIntrinsicID utility function
This is redundant with a method directly on MachineInstr
|
 | llvm/lib/CodeGen/GlobalISel/Utils.cpp |
 | llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp |
 | llvm/include/llvm/CodeGen/GlobalISel/Utils.h |
Commit
222fde1eec341a47f571a8afdf90e83c3a830c5b
by Matthew.Arsenault
GlobalISel: Use extension instead of merge with undef in common case
This fixes not respecting signext/zeroext in these cases. In the anyext case, this avoids a larger merge with undef and should be a better canonical form.
This should also handle this if a merge is needed, but I'm not aware of a case where that can happen. In a future change this will also allow AMDGPU to drop some custom code without introducing regressions.
|
 | llvm/lib/CodeGen/GlobalISel/CallLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll |
Commit
fb44c3223e0c36e969762dd182b4992061b455d3
by Matthew.Arsenault
AMDGPU: Promote signext/zeroext i16 shader returns
This makes them consistent with all the other return convention handling. If we don't do this, we lose the sext/zext flag if treated as a full assignment, which complicates a future GlobalISel patch.
|
 | llvm/lib/Target/AMDGPU/AMDGPUCallingConv.td |
Commit
1e03c37b97b6176a60404d84665c40321f4e33a4
by John.Ericson
Prepare Compiler-RT for GnuInstallDirs, matching libcxx, document all
This is a second attempt at D101497, which landed as 9a9bc76c0eb72f0f2732c729a460abbd5239c2e3 but had to be reverted in 8cf7ddbdd4e5af966a369e170c73250f2e3920e7.
The issue was that when `COMPILER_RT_INSTALL_PATH` is empty, expressions like "${COMPILER_RT_INSTALL_PATH}/bin" evaluated to "/bin", not "bin" as intended and as was originally the case.
One solution is to make `COMPILER_RT_INSTALL_PATH` always non-empty, defaulting it to `CMAKE_INSTALL_PREFIX`. D99636 adopted that approach. But, I think it is more ergonomic to allow those project-specific paths to be relative to the global ones. Also, making install paths absolute by default inhibits the proper behavior of functions like `GNUInstallDirs_get_absolute_install_dir` which make relative install paths absolute in a more complicated way.
Given all this, I will define a function like the one asked for in https://gitlab.kitware.com/cmake/cmake/-/issues/19568 (and needed for a similar use-case).
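The join behaviour described above can be modelled outside CMake. The following is a sketch in C++ rather than the CMake function the patch adds, and `JoinInstallPath` is a hypothetical name: an empty base must leave the sub-path relative instead of producing a leading slash.

```cpp
#include <cassert>
#include <string>

// Joins `base` and `sub` so that an empty base yields the relative path
// "bin" rather than "/bin", which is the failure mode described above.
// `sub` is assumed to be a relative path.
inline std::string JoinInstallPath(const std::string &base,
                                   const std::string &sub) {
  assert(sub.empty() || sub.front() != '/');
  if (base.empty())
    return sub;  // stay relative to the global install prefix
  if (sub.empty())
    return base;
  if (base.back() == '/')
    return base + sub;  // avoid a doubled separator
  return base + "/" + sub;
}
```

Plain string concatenation with a fixed `/` is what turned an empty `COMPILER_RT_INSTALL_PATH` into an absolute `/bin`; treating the empty base as a special case keeps the result relative.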
---
Original message:
Instead of using `COMPILER_RT_INSTALL_PATH` throughout the CMake for compiler-rt, just use it to define variables for the subdirs which themselves are used.
This preserves compatibility, but later on we might consider getting rid of `COMPILER_RT_INSTALL_PATH` and just changing the defaults for the subdir variables directly.
---
There was a seeming bug where the (non-Apple) per-target libdir was `${target}` not `lib/${target}`. I suspect that has to do with the docs on `COMPILER_RT_INSTALL_PATH` saying it was the library dir when that's no longer true, so I just went ahead and fixed it, allowing me to define fewer and more sensible variables.
That last part should be the only behavior changes; everything else should be a pure refactoring.
---
I added some documentation of these variables too. In particular, I wanted to highlight the gotcha where `-DSomeCachePath=...` without the `:PATH` will lead CMake to make the path absolute. See [1] for discussion of the problem, and [2] for the brief official documentation they added as a result.
[1]: https://cmake.org/pipermail/cmake/2015-March/060204.html
[2]: https://cmake.org/cmake/help/latest/manual/cmake.1.html#options
In 38b2dec37ee735d5409148e71ecba278caf0f969 the problem was somewhat misidentified and so `:STRING` was used, but `:PATH` is better as it sets the correct type from the get-go.
---
D99484 is the main thrust of the `GnuInstallDirs` work. Once this lands, it should be feasible to follow both of these up with a simple patch for compiler-rt analogous to the one for libcxx.
Reviewed By: phosek, #libc_abi, #libunwind
Differential Revision: https://reviews.llvm.org/D105765
|
 | libunwind/CMakeLists.txt |
 | compiler-rt/include/CMakeLists.txt |
 | clang/runtime/CMakeLists.txt |
 | compiler-rt/lib/dfsan/CMakeLists.txt |
 | libunwind/docs/BuildingLibunwind.rst |
 | compiler-rt/cmake/Modules/CompilerRTDarwinUtils.cmake |
 | compiler-rt/cmake/Modules/CompilerRTUtils.cmake |
 | libcxx/CMakeLists.txt |
 | libcxxabi/CMakeLists.txt |
 | compiler-rt/cmake/base-config-ix.cmake |
 | libcxx/docs/BuildingLibcxx.rst |
 | compiler-rt/docs/BuildingCompilerRT.rst |
 | compiler-rt/cmake/Modules/AddCompilerRT.cmake |
Commit
32627f4ab4b717dc1932141db99605b723037bf8
by tpopp
[mlir] Handle unused variable when assertions are disabled.
|
 | mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp |
Commit
03d8fed34951bc6e92b36615ec3afe6f36d10de6
by anton.zabaznov
[OpenCL] Add verbosity when checking support of read_write images
Parentheses were fixed incorrectly by D105890
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D105892
|
 | clang/lib/Sema/SemaDeclAttr.cpp |
Commit
10e0cdfc6526578c8892d895c0448e77cb9ba876
by wei.huang
[PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be used for semachecking builtin functions that are only valid for certain versions of ppc.
Reviewed By: nemanjai, #powerpc Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
|
 | llvm/lib/Target/PowerPC/PPC.td |
 | clang/lib/Sema/SemaChecking.cpp |
 | clang/include/clang/Basic/DiagnosticSemaKinds.td |
 | clang/lib/Basic/Targets/PPC.h |
 | clang/lib/Basic/Targets/PPC.cpp |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.td |
 | llvm/lib/Target/PowerPC/PPCSubtarget.cpp |
 | llvm/lib/Target/PowerPC/PPCSubtarget.h |
Commit
1bfec34ac3e71ae3e65d5132fb475b6f8cc0bafe
by llvm-dev[InstCombine] Regenerate select-gep.ll tests
|
 | llvm/test/Transforms/InstCombine/select-gep.ll |
Commit
4975837f1480621d9428a4be468831d07b2201de
by llvm-dev[InstCombine] Add basic (select C, (gep Ptr, Idx), Ptr) tests from PR50183
|
 | llvm/test/Transforms/InstCombine/select-gep.ll |
Commit
f1aca5ac96ebd0beadfa68a474c5947d3bc8c109
by albionapc[PowerPC] Fix L[D|W]ARX Implementation
LDARX and LWARX are sometimes optimized out by the compiler even when they are critical to the correctness of the code. Generating them as inline asm ensures that they are preserved.
Differential Revision: https://reviews.llvm.org/D105754
|
 | clang/test/CodeGen/builtins-ppc-xlcompat-LoadReseve-StoreCond-64bit-only.c |
 | clang/test/CodeGen/builtins-ppc-xlcompat-LoadReseve-StoreCond.c |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.td |
 | clang/lib/CodeGen/CGBuiltin.cpp |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond.ll |
 | llvm/include/llvm/IR/IntrinsicsPowerPC.td |
 | llvm/lib/Target/PowerPC/PPCInstr64Bit.td |
Commit
7039dfc6dd157a26de2f5a6fd15662510a1dd119
by ajcbik[mlir][memref] adjust integration tests to new lowering passes
these tests run under the emulator and thus were overlooked
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D105855
|
 | mlir/test/Integration/Dialect/Vector/CPU/AMX/test-tilezero-block.mlir |
 | mlir/test/Integration/Dialect/Vector/CPU/AMX/test-tilezero.mlir |
 | mlir/test/Integration/Dialect/Vector/CPU/X86Vector/test-sparse-dot-product.mlir |
 | mlir/test/Integration/Dialect/Vector/CPU/AMX/test-muli.mlir |
 | mlir/test/Integration/Dialect/Vector/CPU/AMX/test-mulf.mlir |
 | mlir/test/Integration/Dialect/Vector/CPU/AMX/test-muli-ext.mlir |
Commit
a006af5d6ec6280034ae4249f6d2266d726ccef4
by gchatelet[llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.
Differential Revision: https://reviews.llvm.org/D103900
|
 | llvm/include/llvm/Support/MachineValueType.h |
 | llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp |
 | llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp |
 | llvm/unittests/ADT/SequenceTest.cpp |
 | mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp |
 | llvm/unittests/CodeGen/ScalableVectorMVTsTest.cpp |
 | llvm/tools/llvm-exegesis/lib/X86/Target.cpp |
 | llvm/include/llvm/ADT/Sequence.h |
 | llvm/unittests/IR/ConstantRangeTest.cpp |
Commit
3d89fb4d13bc3af1c3643a310b90fce51a649119
by i[RISCV] Support machine constraint "S"
Similar to D46745, "S" represents an absolute symbolic operand, which can be used to specify the access models, e.g.
    extern int var;
    void *addr_via_asm() {
      void *ret;
      asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
      return ret;
    }
'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105254
|
 | llvm/test/CodeGen/RISCV/inline-asm-S-constraint.ll |
 | clang/lib/Basic/Targets/RISCV.cpp |
 | clang/test/CodeGen/RISCV/riscv-inline-asm.c |
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp |
Commit
68ae8bacfce3b9bd73fefb0d28efd461e1588586
by nicolas.vasilache[mlir][Linalg] Properly specify Linalg attribute.
This fixes undefined reference introduced by https://reviews.llvm.org/D105859
Differential Revision: https://reviews.llvm.org/D105897
|
 | mlir/include/mlir/Dialect/Linalg/IR/LinalgBase.td |
 | mlir/lib/Dialect/Linalg/IR/LinalgTypes.cpp |
Commit
1893b630fec06947b4f59e43c00db4d787f39262
by julian.lettnerAvoid triggering assert when program calls OSAtomicCompareAndSwapLong
A previous change brought the new, relaxed implementation of "on failure memory ordering" for synchronization primitives in LLVM over to TSan land [1]. It included the following assert:
```
// 31.7.2.18: "The failure argument shall not be memory_order_release
// nor memory_order_acq_rel". LLVM (2021-05) fallbacks to Monotonic
// (mo_relaxed) when those are used.
CHECK(IsLoadOrder(fmo));

static bool IsLoadOrder(morder mo) {
  return mo == mo_relaxed || mo == mo_consume || mo == mo_acquire ||
         mo == mo_seq_cst;
}
```
A previous workaround for a false positive when using an old Darwin synchronization API assumed this failure mode to be unused and passed a dummy value [2]. To avoid triggering the assert, we update this value to `mo_relaxed`, which is also the value used by the actual implementation.
[1] https://reviews.llvm.org/D99434 [2] https://reviews.llvm.org/D21733
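The constraint behind the assert can be illustrated outside of TSan with standard C++ atomics. The sketch below is not the TSan interceptor; `IsValidFailureOrder` and `CasWithRelaxedFailure` are hypothetical names, and the `seq_cst` success ordering is an arbitrary choice made for the illustration. It only shows that the failure ordering of a compare-exchange must be a load ordering, and that `memory_order_relaxed` is a safe value to pass.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical mirror of the IsLoadOrder check quoted above: the failure path
// of a compare-exchange only performs a load, so release and acq_rel are not
// valid failure orderings.
constexpr bool IsValidFailureOrder(std::memory_order mo) {
  return mo == std::memory_order_relaxed || mo == std::memory_order_consume ||
         mo == std::memory_order_acquire || mo == std::memory_order_seq_cst;
}

static_assert(IsValidFailureOrder(std::memory_order_relaxed),
              "relaxed is a valid failure ordering");
static_assert(!IsValidFailureOrder(std::memory_order_release),
              "release is not a valid failure ordering");
static_assert(!IsValidFailureOrder(std::memory_order_acq_rel),
              "acq_rel is not a valid failure ordering");

// Hypothetical helper: compare-and-swap with an explicit relaxed failure
// ordering. The seq_cst success ordering is an assumption of this sketch.
inline bool CasWithRelaxedFailure(std::atomic<long> &value, long expected,
                                  long desired) {
  return value.compare_exchange_strong(expected, desired,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed);
}
```

Passing an explicit load ordering for the failure case keeps a check like the one above satisfied regardless of which success ordering the caller needs.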
rdar://78122243
Differential Revision: https://reviews.llvm.org/D105844
|
 | compiler-rt/lib/tsan/rtl/tsan_interceptors_mac.cpp |
Commit
b25aca503d296eeeb2a174d8fb97637de74b8653
by aeubanks[OpaquePtr] Use AllocaInst::getAllocatedType()
|
 | llvm/lib/Target/NVPTX/NVPTXLowerAlloca.cpp |
Commit
693bc04bf615b63b0070c7d1ad15257a7ce31a20
by aeubanks[OpaquePtr] Use GlobalValue::getValueType() more
|
 | llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp |
 | llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp |
 | llvm/lib/Transforms/Coroutines/Coroutines.cpp |
 | llvm/lib/Transforms/IPO/MergeFunctions.cpp |
Commit
113a80797731b1d7cb20d8b42238908efc9e4f48
by aeubanks[OpaquePtr] Get load/store type without PointerType::getElementType()
|
 | llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp |
Commit
ab5693aa4ac45fed0fa4c9106f0eef6d409b6c3e
by aeubanks[OpaquePtr] Use byval type more
|
 | llvm/lib/Transforms/IPO/ArgumentPromotion.cpp |
 | llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp |
 | llvm/lib/Transforms/Coroutines/CoroFrame.cpp |
Commit
2c47b8847ec75c25187e9819abd85cc9e908d742
by gchateletRevert "[llvm] Add enum iteration to Sequence"
This reverts commit a006af5d6ec6280034ae4249f6d2266d726ccef4.
|
 | llvm/unittests/CodeGen/ScalableVectorMVTsTest.cpp |
 | llvm/tools/llvm-exegesis/lib/X86/Target.cpp |
 | llvm/unittests/ADT/SequenceTest.cpp |
 | mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp |
 | llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp |
 | llvm/unittests/IR/ConstantRangeTest.cpp |
 | llvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp |
 | llvm/include/llvm/Support/MachineValueType.h |
 | llvm/include/llvm/ADT/Sequence.h |
Commit
46e89708170c40e8cf0305b6de048ca879f43aab
by craig.topper[RISCV] Prevent use of t0(aka x5) as rs1 for jalr instructions.
Some microarchitectures treat rs1=x1/x5 on jalr as a hint to pop the return-address stack. We should avoid using x5 on jalr instructions since we aren't using x5 as an alternate link register.
Differential Revision: https://reviews.llvm.org/D105875
|
 | llvm/lib/Target/RISCV/RISCVRegisterInfo.td |
 | llvm/lib/Target/RISCV/RISCVInstrInfo.td |
 | llvm/test/CodeGen/RISCV/tail-calls.ll |
 | llvm/test/CodeGen/RISCV/calls.ll |
Commit
ae4cea38f18e32d4a106871d751af380032e16fe
by thomasraoux[mlir] Add support for tensor.extract to comprehensive bufferization
Differential Revision: https://reviews.llvm.org/D105870
|
 | mlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp |
Commit
489742991f7dc4c621264d223e8973ff876e9080
by aeubanks[NFC] Inline variable to prevent unused variable warning
|
 | llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp |
Commit
e4b43973fbd41aee3b8197cf250e9fb9ac40f986
by listmail[ScalarEvolution] Fix overflow when computing max trip counts
This is split from D105216 to reduce patch complexity. Original code by Eli with very minor modification by me.
The primary point of this patch is to add the getUDivCeilSCEV routine. I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.
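As background for why a dedicated ceiling-division routine helps, the sketch below contrasts the common `(N + D - 1) / D` idiom, whose intermediate sum can wrap, with one overflow-free formulation on plain 64-bit integers. It is only an illustration of the hazard; `UDivCeilNaive` and `UDivCeil` are hypothetical names, and the expression built by getUDivCeilSCEV may take a different form.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Naive unsigned ceiling division: n + d - 1 can wrap around for large n.
constexpr uint64_t UDivCeilNaive(uint64_t n, uint64_t d) {
  return (n + d - 1) / d;
}

// Overflow-free unsigned ceiling division (requires d != 0): no intermediate
// value exceeds n, so nothing can wrap.
constexpr uint64_t UDivCeil(uint64_t n, uint64_t d) {
  return n / d + (n % d != 0 ? 1 : 0);
}

constexpr uint64_t kMax = std::numeric_limits<uint64_t>::max();

static_assert(UDivCeil(7, 2) == 4, "rounds up");
static_assert(UDivCeil(8, 2) == 4, "exact division is unchanged");
static_assert(UDivCeil(0, 5) == 0, "zero numerator");
static_assert(UDivCeil(kMax, 2) == kMax / 2 + 1, "no wrap at the top of the range");
static_assert(UDivCeilNaive(kMax, 2) == 0, "the naive form wraps to zero");
```

The last two assertions show the difference at the top of the range: the naive form silently produces 0 where the correct result is 2^63.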
|
 | llvm/include/llvm/Analysis/ScalarEvolution.h |
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
7a20670d168af31ef77209f43ca0622800ce513a
by Saleem AbdulrasoolAST: correct name decoration for swift async functions on Windows
The name decoration scheme on Windows does not have a vendor namespace, and the decoration scheme is not shared ownership - it is controlled by Microsoft. `T` is a reserved identifier for an unknown calling convention. The `W` identifier has been discussed with Microsoft offline and is reserved as `Swift_3` as the identifier for the swift async calling convention. Adjust the name decoration accordingly.
|
 | clang/lib/AST/MicrosoftMangle.cpp |
Commit
14f77576c9c4f502267a92992abe3bdcbeb96b2c
by marcos.horro[llvm-mca] [NFC] Formatting code
Applied clang-format to all files. Discarded the 80-column width violation in BottleneckAnalysis.h since it contains an example of a report. Caught some typos and minor style details.
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D105900
|
 | llvm/tools/llvm-mca/Views/View.h |
 | llvm/tools/llvm-mca/Views/RegisterFileStatistics.cpp |
 | llvm/tools/llvm-mca/Views/SummaryView.h |
 | llvm/tools/llvm-mca/PipelinePrinter.cpp |
 | llvm/tools/llvm-mca/Views/InstructionView.cpp |
 | llvm/tools/llvm-mca/llvm-mca.cpp |
 | llvm/tools/llvm-mca/Views/InstructionView.h |
 | llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp |
 | llvm/tools/llvm-mca/Views/SummaryView.cpp |
 | llvm/tools/llvm-mca/Views/DispatchStatistics.cpp |
 | llvm/tools/llvm-mca/Views/TimelineView.h |
 | llvm/tools/llvm-mca/Views/RetireControlUnitStatistics.cpp |
 | llvm/tools/llvm-mca/Views/BottleneckAnalysis.h |
Commit
03282f2fe14e9dd61aaeeda3785f56c7ccb4f3c9
by mizvekov[clang] C++98 implicit moves are back with a vengeance
After taking C++98 implicit moves out in D104500, we put it back in, but now in a new form which preserves compatibility with pure C++98 programs, while at the same time giving almost all the goodies from P1825.
* We use the exact same rules as C++20 with regards to which id-expressions are move eligible. The previous incarnation would only benefit from the proper subset which is copy elidable. This means we can implicit move, in addition:
  * Parameters.
  * RValue references.
  * Exception variables.
  * Variables with higher-than-natural required alignment.
  * Objects with different type from the function return type.
* We preserve the two-overload resolution, with one small tweak to the first one: If we either pick a (possibly converting) constructor which does not take an rvalue reference, or a user conversion operator which is not ref-qualified, we abort into the second overload resolution.
This gives C++98 almost all the implicit move patterns which we had created test cases for, while at the same time preserving the meaning of these three patterns, which are found in pure C++98 programs:
* Classes with both const and non-const copy constructors, but no move constructors, continue to have their non-const copy constructor selected.
* We continue to reject as ambiguous the following pattern:
```
struct A {
  A(B &);
};
struct B {
  operator A();
};
A foo(B x) { return x; }
```
* We continue to pick the copy constructor in the following pattern:
```
class AutoPtrRef { };
struct AutoPtr {
  AutoPtr(AutoPtr &);
  AutoPtr();

  AutoPtr(AutoPtrRef);
  operator AutoPtrRef();
};
AutoPtr test_auto_ptr() {
  AutoPtr p;
  return p;
}
```
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D105756
|
 | clang/test/CXX/class/class.init/class.copy.elision/p3.cpp |
 | clang/test/SemaCXX/conversion-function.cpp |
 | clang/test/SemaObjCXX/block-capture.mm |
 | clang/lib/Sema/SemaStmt.cpp |
Commit
405eefe46497bde580c28ce2d2b79f0e96f2a1d0
by jonathan.l.peyton[OpenMP][NFC] Change comment style to eliminate warnings from GCC
Standalone build for OpenMP runtime using GCC is giving -Wcomment warnings where a backslash newline is encountered in the // style comment. This switches the // style for /* style to silence the warnings.
|
 | openmp/runtime/src/kmp_os.h |
Commit
b5f4ac4c11b041ab9dfed42a7133d1eca6536aaa
by amy.kwan1[PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary.
This patch adds a function that checks whether the frame index is aligned when the computed addressing mode is an aligned D-Form (DS-Form or DQ-Form). If the frame index is unaligned in either of these two modes, the mode is reset to X-Form so that selection falls back to X-Form loads.
A test case is added to ensure that the test emits X-Form loads and not DQ-Form loads since the frame index is not aligned within the test case.
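The fallback can be sketched as a small decision function. This is an illustration under stated assumptions, not the code in PPCISelLowering.cpp: `AddrMode`, `RequiredAlign`, and `AdjustForFrameIndex` are hypothetical names, and the sketch assumes the usual encoding constraints that DS-Form displacements are multiples of 4 and DQ-Form displacements are multiples of 16.

```cpp
#include <cassert>

// Hypothetical addressing-mode enum for this sketch.
enum class AddrMode { DForm, DSForm, DQForm, XForm };

// Alignment (in bytes) the displacement must have for each mode. Assumption:
// DS-Form needs a multiple of 4, DQ-Form a multiple of 16, others have no
// alignment requirement.
constexpr unsigned RequiredAlign(AddrMode mode) {
  return mode == AddrMode::DSForm ? 4u : mode == AddrMode::DQForm ? 16u : 1u;
}

// If the frame object's alignment does not satisfy the mode's requirement,
// fall back to X-Form (register + register addressing).
constexpr AddrMode AdjustForFrameIndex(AddrMode mode, unsigned frameAlign) {
  return (frameAlign % RequiredAlign(mode) != 0) ? AddrMode::XForm : mode;
}

static_assert(AdjustForFrameIndex(AddrMode::DQForm, 8) == AddrMode::XForm,
              "8-byte aligned slot cannot use DQ-Form");
static_assert(AdjustForFrameIndex(AddrMode::DQForm, 16) == AddrMode::DQForm,
              "16-byte aligned slot keeps DQ-Form");
static_assert(AdjustForFrameIndex(AddrMode::DSForm, 2) == AddrMode::XForm,
              "2-byte aligned slot cannot use DS-Form");
```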
Differential Revision: https://reviews.llvm.org/D105661
|
 | llvm/test/CodeGen/PowerPC/unaligned-dqform-ld.ll |
 | llvm/lib/Target/PowerPC/PPCISelLowering.cpp |
Commit
1e670dc7d78427156c252317b3571576d465043f
by craig.topper[RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant.
We don't really have optimizations for division with a constant LHS. If we don't use a W instruction we end up needing to sign or zero extend the RHS to use the 64-bit instruction.
I had to sign_extend i32 constants on the LHS instead of using any_extend which becomes zero_extend. If we don't do this, constants that were originally negative become harder to materialize. I think this problem exists for more of our W instruction cases. For example (i32 (shl -1, X)), but we don't have lit tests. I'll work on that as a follow up.
I also left a FIXME for enabling W instruction for RHS constants under -Oz.
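The materialization concern with negative i32 constants can be seen with plain integer arithmetic. The sketch below is not LLVM code; `SignExtend32`, `ZeroExtend32`, and `FitsSImm12` are hypothetical helpers. It relies only on the fact that a single RISC-V ADDI takes a 12-bit signed immediate, so a constant such as -1 fits in one instruction when sign-extended but not when zero-extended to 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Widen an i32 constant to 64 bits by sign extension.
constexpr int64_t SignExtend32(int32_t v) { return static_cast<int64_t>(v); }

// Widen an i32 constant to 64 bits by zero extension.
constexpr int64_t ZeroExtend32(int32_t v) {
  return static_cast<int64_t>(static_cast<uint32_t>(v));
}

// True if the value fits the 12-bit signed immediate of a single ADDI.
constexpr bool FitsSImm12(int64_t v) { return v >= -2048 && v <= 2047; }

static_assert(SignExtend32(-1) == -1, "sign extension keeps the value -1");
static_assert(ZeroExtend32(-1) == 0xFFFFFFFFLL, "zero extension yields 2^32 - 1");
static_assert(FitsSImm12(SignExtend32(-1)), "one ADDI is enough");
static_assert(!FitsSImm12(ZeroExtend32(-1)), "needs a longer sequence");
```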
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D105769
|
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp |
 | llvm/test/CodeGen/RISCV/rem.ll |
 | llvm/test/CodeGen/RISCV/div.ll |
Commit
04942a7ffc716d2f782402089201cadfbbbb2a04
by Louis Dionne[libc++] NFC: Add comment for running macOS CI setup script remotely
|
 | libcxx/utils/ci/macos-ci-setup |
Commit
424f14f0d2e98b83a41bdc7408f15d28aaa4cbd0
by jonathan.l.peyton[OpenMP] Fix one sign-compare warning from GCC
|
 | openmp/runtime/src/kmp_runtime.cpp |
Commit
303ddb60a2d28fb7603266d8977f69ac77b194dd
by tstellarFix utils/update_cc_test_checks/check-globals.test on stand-alone builds
We want to use LLVM_EXTERNAL_LIT if defined for the %lit substitution.
Reviewed By: jdenny
Differential Revision: https://reviews.llvm.org/D105873
|
 | clang/test/utils/update_cc_test_checks/lit.local.cfg |
 | clang/test/CMakeLists.txt |
 | clang/test/lit.site.cfg.py.in |
Commit
2a399e60b6ea74aca47881b48414a5198a868cc3
by Louis Dionne[libc++] Add a CI job for macOS on arm64 hardware 🥳
Differential Revision: https://reviews.llvm.org/D105848
|
 | libcxxabi/test/thread_local_destruction_order.pass.cpp |
 | libcxx/utils/ci/buildkite-pipeline.yml |
Commit
2bc07083a258fdbbafc9c0381e936f441f93af70
by Vitaly Buka[sanitizer] Fix VSNPrintf %V on Windows
|
 | compiler-rt/lib/sanitizer_common/sanitizer_common.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_printf_test.cpp |
Commit
f26deb4e6ba7e00c57b4be888c4d20c95a881154
by vsavchenko[analyzer][solver][NFC] Introduce ConstraintAssignor
The new component is a symmetric response to SymbolicRangeInferrer. While the latter is the unified component that answers all questions about what the solver knows about a particular symbolic expression, the assignor associates new constraints (aka "assumes") with symbolic expressions and can imply additional knowledge that the solver can extract and use later on.
- Why do we need it and why is SymbolicRangeInferrer not enough?
As noted before, the inferrer only helps us to get the most precise range information based on the existing knowledge and on the mathematical foundations of different operations that symbolic expressions actually represent. It doesn't introduce new constraints.
The assignor, on the other hand, can impose constraints on other symbols using the same domain knowledge.
- But for some expressions, SymbolicRangeInferrer looks into constraints for similar expressions, why can't we do that for all the cases?
That's correct! But in order to do something like this, we should have a finite number of possible "similar expressions".
Let's say we are asked about `$a - $b` and we know something about `$b - $a`. The inferrer can invert this expression and check constraints for `$b - $a`. This is simple! But let's say we are asked about `$a` and we know that `$a * $b != 0`. In this situation, we can imply that `$a != 0`, but the inferrer shouldn't try every possible symbolic expression `X` to check if `$a * X` or `X * $a` is constrained to non-zero.
With the assignor mechanism, we can catch this implication right at the moment we associate `$a * $b` with non-zero range, and set similar constraints for `$a` and `$b` as well.
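A toy model makes the division of labor concrete. The sketch below is not the analyzer's implementation; `ToySolver` and its members are hypothetical, and the "solver" only tracks which symbols are known to be non-zero. The query side (inferrer-like) merely looks up existing facts, while the update side (assignor-like) records the implied facts for both operands at the moment the product is constrained.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model, not the analyzer's data structures: remembers only which
// symbolic expressions are known to be non-zero.
struct ToySolver {
  std::set<std::string> nonZero;

  // Inferrer-style query: consults existing knowledge, adds nothing.
  bool isKnownNonZero(const std::string &sym) const {
    return nonZero.count(sym) != 0;
  }

  // Assignor-style update: when `lhs * rhs` is assumed non-zero, also record
  // that each operand is non-zero, so later queries need no search.
  void assumeProductNonZero(const std::string &lhs, const std::string &rhs) {
    nonZero.insert(lhs + " * " + rhs);
    nonZero.insert(lhs);
    nonZero.insert(rhs);
  }
};
```

With this split, a later query about `$a` is a direct lookup, and the query side never has to enumerate candidate expressions of the form `$a * X`.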
Differential Revision: https://reviews.llvm.org/D105692
|
 | clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp |
Commit
60bd8cbc0c84a41146b1ad6c832fa75f48cd2568
by vsavchenko[analyzer][solver][NFC] Refactor how we detect (dis)equalities
This patch simplifies the way we deal with (dis)equalities. Due to the symmetry between constraint handler and range inferrer, we can have very similar implementations of logic handling questions about (dis)equality and assumptions involving (dis)equality.
It also helps us to remove one more visitor, and removes uncertainty that we got all the right places to put `trackNE` and `trackEQ`.
Differential Revision: https://reviews.llvm.org/D105693
|
 | clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp |
 | clang/test/Analysis/equality_tracking.c |
Commit
ce25eb0b71bfcd104afd300c2eb2fb5982f827e8
by Vitaly Buka[NFC][sanitizer] Remove trailing whitespace
|
 | compiler-rt/lib/sanitizer_common/sanitizer_common.h |
Commit
6245252d4c8c7c9b1be5b9e6a876be9776c000e4
by listmail[test] Add a SCEV backedge computation test with an explicit zero stride
|
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
Commit
01d3a3dcabaf862581b1d1aee604fcee6a18b240
by tra[CUDA] Only allow NVIDIA offload-arch during CUDA compilation.
Otherwise, if someone specifies a valid AMD arch, we may end up triggering an assertion on unexpected arch later on.
Differential Revision: https://reviews.llvm.org/D105295
|
 | clang/test/Driver/cuda-bad-arch.cu |
 | clang/lib/Driver/Driver.cpp |
Commit
43c7ca8e4963beb2e5a57639f20b8f43608296d7
by Jon Roelofs[AArch64][GlobalISel] Legalize store <2 x i16>
Differential revision: https://reviews.llvm.org/D105912
|
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir |
Commit
eba638dbbb77ca2a446fd76b4f52ad85640da4f9
by Jon Roelofs[AArch64][GlobalISel] Legalize load <2 x i16>
Differential revision: https://reviews.llvm.org/D105913
|
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir |
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp |
Commit
e4585d3f4e1f076ff12db65259924492f5912b19
by wei.huangRevert "[PowerPC][NFC] Power ISA features for Semachecking"
This reverts commit 10e0cdfc6526578c8892d895c0448e77cb9ba876.
|
 | llvm/lib/Target/PowerPC/PPCSubtarget.h |
 | clang/include/clang/Basic/DiagnosticSemaKinds.td |
 | llvm/lib/Target/PowerPC/PPC.td |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.td |
 | clang/lib/Sema/SemaChecking.cpp |
 | clang/lib/Basic/Targets/PPC.h |
 | llvm/lib/Target/PowerPC/PPCSubtarget.cpp |
 | clang/lib/Basic/Targets/PPC.cpp |
Commit
781929b4236bc34681fb0783cf7b6021109fe28b
by wei.huang[PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be used for semachecking builtin functions that are only valid for certain versions of ppc.
Reviewed By: nemanjai, #powerpc Authored By: Quinn Pham <Quinn.Pham@ibm.com>
Differential revision: https://reviews.llvm.org/D105501
|
 | llvm/lib/Target/PowerPC/PPC.td |
 | llvm/lib/Target/PowerPC/PPCSubtarget.h |
 | clang/lib/Basic/Targets/PPC.cpp |
 | llvm/lib/Target/PowerPC/PPCSubtarget.cpp |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.td |
 | clang/lib/Basic/Targets/PPC.h |
 | clang/lib/Sema/SemaChecking.cpp |
 | clang/include/clang/Basic/DiagnosticSemaKinds.td |
 | clang/test/Driver/ppc-isa-features.cpp |
Commit
308d38128333af65455e8343a620b40a099e896a
by tlively[WebAssembly] Generate checks for simd-load-store-alignment.ll
This will make it easier to update these tests as we add support for generating more SIMD loads and stores with custom alignments.
Differential Revision: https://reviews.llvm.org/D105862
|
 | llvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll |
Commit
e56b2e57067652710418973e11bb9b118f37b177
by nikita.ppv[InstCombine] Precommit tests for D105088 (NFC)
Add tests for D105088, as well as an option to disable the (generally) unsound inttoptr of ptrtoint optimization.
Differential Revision: https://reviews.llvm.org/D105771
|
 | llvm/lib/IR/Instructions.cpp |
 | llvm/test/Transforms/InstCombine/ptr-int-ptr-icmp.ll |
Commit
3e5cff19fdae2515f87c08dc8b0e483751165153
by Jon Roelofs[Tests] Fix test broken by: 43c7ca8e4963 [AArch64][GlobalISel] Legalize store <2 x i16>
|
 | llvm/test/CodeGen/AArch64/arm64-rev.ll |
Commit
087310c71e5c1c70818ac62acd781860d59a6ce7
by listmail[SCEV] Strengthen inference of RHS > Start in howManyLessThans
Split off from D105216 to simplify review. Rewritten with a lambda to be easier to follow. Comments clarified.
Sorry for no test case, this is tricky to exercise with the current structure of the code. It's about to be hit more frequently in a follow up patch, and the change itself is simple.
|
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
25629bb45f0a4b8c8e99dbde4f4a7e3d980b9fd7
by traFix cuda-bad-arch.cu test.
Tests for correctness of HIP architecture need `-x hip`
|
 | clang/test/Driver/cuda-bad-arch.cu |
 | clang/test/Driver/cuda-flush-denormals-to-zero.cu |
 | clang/test/Driver/cuda-arch-translation.cu |
Commit
5ca9cf0e6b15647a4f6959c1fc1c23b9f6cb0cba
by listmail[tests] Precommit a test case from D105216
|
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
Commit
3ea8860afb302f628703a57226e5466091b2c418
by thakis[gn build] (manually) port 303ddb60a2d2
|
 | llvm/utils/gn/secondary/clang/test/BUILD.gn |
Commit
5d1ba534043707a7b41542e9d1e514483f88503a
by efriedma[LoopReroll] Add an extra defensive check to avoid SCEV assertion.
Make sure getMinusSCEV() didn't return a pointer. The following check would never succeed if it was a pointer, anyway, but calling getMulExpr() on a pointer SCEV now asserts.
|
 | llvm/test/Transforms/LoopReroll/basic.ll |
 | llvm/lib/Transforms/Scalar/LoopRerollPass.cpp |
Commit
b28c465e4902f579799bc94512197c04a5ad4a29
by efriedma[NFC] Use CHECK-LABEL in trip-count-unknown-stride.ll
|
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
Commit
6296e109728d58805004739530b8f265c6a130b9
by thomasraoux[mlir][Vector] Remove Vector TupleOp as it is unused
TupleOp is not used anymore after recent refactoring.
Differential Revision: https://reviews.llvm.org/D105924
|
 | mlir/include/mlir/Dialect/Vector/VectorOps.td |
 | mlir/lib/Dialect/Vector/VectorOps.cpp |
Commit
fb9c5c3dce27b352534641dbb6e3cb8c05da7bc9
by abidh[lld][AMDGPU] Handle R_AMDGPU_REL16 relocation.
This patch is a followup patch to https://reviews.llvm.org/D105760 which adds this relocation. This handles the relocation in lld.
The s_branch family of instruction does the following: PC = PC + signext(simm * 4) + 4
so we do the opposite on the target address before writing it in the instruction stream.
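The inverse computation can be sketched as follows. This is an illustration based only on the branch formula quoted above, not the code in lld/ELF/Arch/AMDGPU.cpp; `BranchTarget` and `EncodeSimm16` are hypothetical names, and rejecting misaligned or out-of-range distances is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Branch semantics from the commit message:
//   PC = PC + signext(simm * 4) + 4
constexpr int64_t BranchTarget(int64_t pc, int16_t simm) {
  return pc + static_cast<int64_t>(simm) * 4 + 4;
}

// Inverse: recover the 16-bit immediate from the target and the address of
// the branch. Returns false if the distance is not a multiple of 4 or does
// not fit in a signed 16-bit field (assumed checks for this sketch).
inline bool EncodeSimm16(int64_t target, int64_t pc, int16_t &simm) {
  const int64_t delta = target - pc - 4;  // undo the "+ 4"
  if (delta % 4 != 0)
    return false;
  const int64_t words = delta / 4;        // undo the "* 4"
  if (words < INT16_MIN || words > INT16_MAX)
    return false;
  simm = static_cast<int16_t>(words);
  return true;
}
```

For example, a branch at 0x1000 targeting 0x1010 encodes `simm = 3`, since 0x1000 + 3 * 4 + 4 = 0x1010.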
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105761
|
 | lld/test/ELF/amdgpu-relocs2.s |
 | lld/ELF/Arch/AMDGPU.cpp |
Commit
7efe3887858fe77da5c6687e3ac9ed9b00f9ed4e
by arthur.j.odwyer[libc++] [test] Add a missing `()` in TestEachIntegralType.
|
 | libcxx/test/support/atomic_helpers.h |
Commit
ba8dcaef0d79ae0174cdcea6d6f62015266c1d40
by Vitaly BukaRevert "sanitizer_common: optimize memory drain"
Breaks https://lab.llvm.org/buildbot/#/builders/anitizer-windows
This reverts commit d89d3dfae17d7795dc1ef013db66272020de1959.
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
Commit
d558bfaf8e1e8e7814053abc406cdaaed00cf784
by Vitaly Buka[NFC][sanitizer] clang-format part of D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
Commit
5105a77035d080a5f14668b136c8def52b182ce2
by Vedant Kumar[docs/llvm-cov] Document -compilation-dir
Document the `-compilation-dir` option added in D100232.
Differential Revision: https://reviews.llvm.org/D105826
|
 | llvm/docs/CommandGuide/llvm-cov.rst |
Commit
d12a7f142e2430f4983c668d910897db8cc2afc7
by hedingarcia[libc] Add on float properties for precision floating point numbers in FloatProperties.h
Defined constants that express the number of bits for the exponent in single and double precision. Added bit mask values and other properties for quad precision floating point numbers, specifically targeting the architectures defined in PlatformDefs.h. The exponentWidth values were added for use in LongDoubleBitsX86.h, where the implementation that sets the exponent component uses this and the bitWidth value. The need arose because of the 80-bit quad precision implementation.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D105153
|
 | libc/utils/FPUtil/FloatProperties.h |
Commit
9f1f666b30c03376d3816f7b2d18c93073517330
by Vitaly Buka[NFC][sanitizer] Move MemoryMapper out of SizeClassAllocator64
Part of D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
Commit
1c69005c2e11414669ac8ba094a9b059920936db
by martin[libcxx] [docs] Acknowledge that the library is known to work in some configs outside of what's tested in CI
Differential Revision: https://reviews.llvm.org/D105888
|
 | libcxx/docs/index.rst |
Commit
4df591b5c960affd1612e330d0c9cd3076c18053
by listmail[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.
Differential Revision: https://reviews.llvm.org/D105921
|
 | llvm/lib/Analysis/ScalarEvolution.cpp |
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
Commit
f990da59c5df840526baeb70bc5b5594fb5599ed
by Vitaly Buka[sanitizer] Few more NFC changes from D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
Commit
a16071e409a55cfc83e59eb738fd6144207dd5d1
by caitlyncano[libc] Don't pass -fpie/-ffreestanding on Windows
The current compile options function hardcodes the -fpie and -ffreestanding flags, which don't exist on Windows. This patch sets the compilation flags conditionally based on the OS specifics.
Reviewed By: sivachandra, aeubanks
Differential Revision: https://reviews.llvm.org/D105643
|
 | libc/cmake/modules/LLVMLibCObjectRules.cmake |
Commit
a5a337e55ed2e265358ac0a2ce6db1af2dd69e07
by hedingarcia[libc] Capture floating point encoding and arrange it sequentially in memory
Redefined FPBits.h and LongDoubleBitsX86 so that the implementation works on both Windows and Linux while keeping a packed memory layout for the floating point types, so that the size in memory matches the size of the corresponding floating point data type. This change was necessary because the previous __attribute__((packed)) specification on the struct did not work on Windows as it did on Linux, and consequently static_asserts in the FPBits.h file were failing.
Reviewed By: aeubanks, sivachandra
Differential Revision: https://reviews.llvm.org/D105561
|
 | libc/test/src/math/NextAfterTest.h |
 | libc/utils/FPUtil/Hypot.h |
 | libc/utils/FPUtil/generic/FMA.h |
 | libc/test/src/math/LdExpTest.h |
 | libc/utils/FPUtil/TestHelpers.cpp |
 | libc/test/src/math/RoundToIntegerTest.h |
 | libc/utils/FPUtil/Sqrt.h |
 | libc/utils/FPUtil/ManipulationFunctions.h |
 | libc/utils/FPUtil/FPBits.h |
 | libc/utils/FPUtil/NormalFloat.h |
 | libc/utils/FPUtil/SqrtLongDoubleX86.h |
 | libc/utils/FPUtil/NearestIntegerOperations.h |
 | libc/test/src/math/SqrtTest.h |
 | libc/utils/FPUtil/DivisionAndRemainderOperations.h |
 | libc/utils/FPUtil/BasicOperations.h |
 | libc/utils/FPUtil/LongDoubleBitsX86.h |
 | libc/utils/FPUtil/NextAfterLongDoubleX86.h |
Commit
24129fbc9aa006badc2e6e8432980cb94aba090c
by ayermolo[LLD] Adding support for RELA for CG Profile.
This is a follow-up to https://reviews.llvm.org/D104080 and https://github.com/llvm/llvm-project/commit/ca3bdb57fa1ac98b711a735de048c12b5fdd8086#diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585. The implementation of the call graph profile was changed from a black-box section to a relocation-based approach. This was done to be compatible with post-processing tools like strip/objcopy and their LLVM equivalents. With this new approach, the correctness of the symbol indices is preserved when these tools are invoked on an object file before the final linking step.
Unlike the LLVM tools, the GNU binutils tools change the REL section to a RELA section, for example when strip -S is run on the ELF object files as an intermediate step before linking. To preserve compatibility, this patch extends the implementation in LLD and ELFDumper to support both REL and RELA sections for the call graph profile.
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D105217
|
 | lld/ELF/InputFiles.h |
 | llvm/tools/llvm-readobj/ELFDumper.cpp |
 | lld/test/ELF/cgprofile-rela.test |
 | llvm/test/tools/llvm-readobj/ELF/call-graph-profile.test |
 | lld/ELF/InputFiles.cpp |
 | lld/ELF/Driver.cpp |
Commit
d4e2693a679927a62dd738dd3bba24863dcd290a
by dschuff[WebAssembly] Run varargs codegen test with non-emscripten triple
This is a followup from D105749 to cover both triples in the case where they differ.
|
 | llvm/test/CodeGen/WebAssembly/varargs.ll |
Commit
8a2720d81e159fc71550b10b4c34f1de912d5880
by jpienaarAdd more types to the LLVM dialect C API
This includes:
- void type
- array types
- function types
- literal (unnamed) struct types
Reviewed By: jpienaar, ftynse
Differential Revision: https://reviews.llvm.org/D105908
|
 | mlir/test/CAPI/llvm.c |
 | mlir/include/mlir-c/Dialect/LLVM.h |
 | mlir/lib/CAPI/Dialect/LLVM.cpp |
Commit
123e8dfcf86a74eb7ba08f33681df581d1be9dbd
by ajcbik[mlir][sparse] add support for std unary operations
Adds zero-preserving unary operators from std. Also adds xor. Performs minor refactoring to remove the "zero" node, and pushes the irregular logic for negi (not supported in std) into one place.
Reviewed By: gussmith23
Differential Revision: https://reviews.llvm.org/D105928
|
 | mlir/test/Dialect/SparseTensor/sparse_fp_ops.mlir |
 | mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp |
 | mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp |
 | mlir/unittests/Dialect/SparseTensor/MergerTest.cpp |
 | mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h |
 | mlir/test/Dialect/SparseTensor/sparse_int_ops.mlir |
Commit
f2b5e438aa3620cd60d115cad8dcb39cc417c8a8
by ravishankarm[mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.
Differential Revision: https://reviews.llvm.org/D105852
|
 | mlir/include/mlir/Dialect/Tensor/IR/Tensor.h |
 | mlir/lib/Dialect/Tensor/IR/CMakeLists.txt |
 | mlir/test/Dialect/Tensor/resolve-shaped-type-result-dims.mlir |
 | utils/bazel/llvm-project-overlay/mlir/BUILD.bazel |
 | mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td |
 | mlir/lib/Dialect/Tensor/IR/TensorOps.cpp |
Commit
18c19414eb70578d4c487d6f4b0f438aead71d6a
by wei.huang[PowerPC] Add PowerPC compare and multiply related builtins and intrinsics for XL compatibility
This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and intrinsics for compare and multiply related operations.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D102875
|
 | clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-error.c |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply-64bit-only.ll |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare.ll |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply.ll |
 | llvm/include/llvm/IR/IntrinsicsPowerPC.td |
 | clang/include/clang/Basic/BuiltinsPPC.def |
 | clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-64bit.c |
 | clang/test/CodeGen/builtins-ppc-xlcompat-pwr9.c |
 | clang/test/CodeGen/builtins-ppc-xlcompat-multiply.c |
 | llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare-64bit-only.ll |
 | clang/test/CodeGen/builtins-ppc-xlcompat-multiply-64bit-only.c |
 | clang/lib/Basic/Targets/PPC.cpp |
 | llvm/lib/Target/PowerPC/PPCInstr64Bit.td |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.td |
 | clang/lib/Sema/SemaChecking.cpp |
Commit
9955c652eafdcb5f1d16ee3db857f03ee7e5cfbc
by gcmn[NFC][MLIR][std] Clean up ArithmeticCastOps
The documentation on these was out of sync with the implementation. Also the declaration of inputs was repeated when it is already part of the ArithmeticCastOp definition.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D105934
|
 | mlir/include/mlir/Dialect/StandardOps/IR/Ops.td |
Commit
5df99954392e3a4448e4ff43d4cf644bc06bfa92
by Vitaly Buka[NFC][sanitizer] Rename some MemoryMapper members
Part of D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
Commit
afa3fedcda98db4d47694ed596270a5396074224
by Vitaly Buka[NFC][sanitizer] Extract DrainHalfMax
Part of D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
Commit
bb8c7a980fe487eb322d38641db9145a6b6cb1d4
by efriedma[ScalarEvolution] Make isKnownNonZero handle more cases.
Using an unsigned range instead of signed ranges is a bit more precise.
Differential Revision: https://reviews.llvm.org/D105941
|
 | llvm/test/Analysis/ScalarEvolution/trip-count9.ll |
 | llvm/lib/Analysis/ScalarEvolution.cpp |
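To illustrate why an unsigned range can be more precise for the non-zero check above, here is a minimal sketch on 8-bit values. It assumes the previous check relied on signed ranges (known positive or known negative); `URange`, `SRange`, `SignedHull` and both predicates are hypothetical names for this illustration, not ScalarEvolution APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive unsigned range of an 8-bit value, with min <= max.
struct URange { std::uint8_t min; std::uint8_t max; };

// Hypothetical inclusive signed range covering the same bit patterns.
struct SRange { int min; int max; };

// Smallest non-wrapping signed interval that covers the unsigned range.
SRange SignedHull(URange r) {
  assert(r.min <= r.max);
  if (r.max <= 127) return {r.min, r.max};              // all non-negative
  if (r.min >= 128) return {r.min - 256, r.max - 256};  // all negative
  return {-128, 127};  // crosses the sign boundary: the hull is everything
}

// Assumed older style: non-zero if known positive or known negative.
bool KnownNonZeroSigned(URange r) {
  SRange s = SignedHull(r);
  return s.min > 0 || s.max < 0;
}

// Unsigned style: non-zero if the unsigned minimum is not zero.
bool KnownNonZeroUnsigned(URange r) { return r.min != 0; }
```

For the range [1, 200] the signed hull is the full interval and contains zero, so the signed check gives up, while the unsigned minimum of 1 is enough to prove the value is non-zero.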
Commit
eebe841a47cbbd55bdcc32da943c92d18f88a5b8
by Matthew.ArsenaultRegAlloc: Allow targets to split register allocation
AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers need to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point.
Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill.
This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers.
In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run.
One disadvantage is that StackSlotColoring is currently no longer used for SGPR spills. It would need to be run again, which will require more work.
Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQP is not currently supported, so this also prevents using the unhandled allocator.
|
 | llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll |
 | llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll |
 | llvm/test/CodeGen/AMDGPU/pei-build-spill.mir |
 | llvm/test/CodeGen/AMDGPU/virtregrewrite-undef-identity-copy.mir |
 | llvm/lib/CodeGen/RegAllocFast.cpp |
 | llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp |
 | llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll |
 | llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp |
 | llvm/include/llvm/CodeGen/RegAllocRegistry.h |
 | llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll |
 | llvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir |
 | llvm/test/CodeGen/AMDGPU/remat-vop.mir |
 | llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx90a.mir |
 | llvm/test/CodeGen/AMDGPU/indirect-call.ll |
 | llvm/lib/CodeGen/RegAllocBase.cpp |
 | llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll |
 | llvm/lib/CodeGen/RegAllocBasic.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |
 | llvm/test/CodeGen/AMDGPU/agpr-csr.ll |
 | llvm/test/CodeGen/AMDGPU/sgpr-spill-no-vgprs.ll |
 | llvm/test/CodeGen/AMDGPU/sibling-call.ll |
 | llvm/test/CodeGen/AMDGPU/spill-empty-live-interval.mir |
 | llvm/lib/Target/AMDGPU/SIRegisterInfo.h |
 | llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx908.mir |
 | llvm/include/llvm/CodeGen/Passes.h |
 | llvm/lib/CodeGen/RegAllocGreedy.cpp |
 | llvm/test/CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll |
 | llvm/include/llvm/CodeGen/RegAllocCommon.h |
 | llvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll |
 | llvm/lib/CodeGen/RegAllocBase.h |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll |
 | llvm/lib/Target/AMDGPU/SIFrameLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/llc-pipeline.ll |
 | llvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll |
 | llvm/test/CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll |
 | llvm/test/CodeGen/AMDGPU/stack-slot-color-sgpr-vgpr-spills.mir |
 | llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp |
 | llvm/lib/CodeGen/TargetPassConfig.cpp |
 | llvm/test/CodeGen/AMDGPU/vgpr-tuple-allocation.ll |
 | llvm/lib/CodeGen/LiveIntervals.cpp |
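The callback-based split described in the RegAlloc commit above can be sketched roughly as follows. This is a simplified model, not the LLVM interface: `RegClass`, `VirtReg`, `RegClassFilter` and `RunAllocation` are hypothetical names, and "allocation" is reduced to setting a flag.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified register classes.
enum class RegClass { SGPR, VGPR };

// Hypothetical virtual register; `allocated` stands in for a real assignment.
struct VirtReg {
  std::string name;
  RegClass cls;
  bool allocated = false;
};

// Callback that reports whether a run should handle a given register class.
using RegClassFilter = std::function<bool(RegClass)>;

// One allocation run: handles only registers accepted by the filter and
// leaves the rest as leftover virtual registers for a later run.
int RunAllocation(std::vector<VirtReg> &regs,
                  const RegClassFilter &shouldAllocate) {
  int handled = 0;
  for (VirtReg &reg : regs) {
    if (reg.allocated || !shouldAllocate(reg.cls))
      continue;
    reg.allocated = true;
    ++handled;
  }
  return handled;
}
```

Running it once with a filter that accepts only `RegClass::SGPR` and then again with one that accepts only `RegClass::VGPR` mirrors the two-run structure: after the first run the number of SGPRs that were handled is known before any VGPR is touched.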
Commit
99aebb62fb4f2a39c7f03579facf3a1e176b245d
by Vitaly Buka[NFC][sanitizer] Don't store region_base_ in MemoryMapper
Part of D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
Commit
0024ec59a0f3deb206a21567ac2ebe0fc097ea9d
by aeubanks[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch
To help with debugging non-trivial unswitching issues.
Don't care about the legacy pass, nobody is using it.
If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't default to the empty constructor for the pass params. We should still let the parser take care of it in case the parser has its own defaults.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D105933
|
 | llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp |
 | llvm/test/Transforms/SimpleLoopUnswitch/options.ll |
 | llvm/test/Other/print-passes.ll |
 | llvm/lib/Passes/PassRegistry.def |
 | llvm/lib/Passes/PassBuilder.cpp |
 | llvm/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h |
Commit
832ba20710ee09b00161ea72cf80c9af800fda63
by Vitaly Bukasanitizer_common: optimize memory drain
Currently we allocate MemoryMapper per size class. MemoryMapper mmap's and munmap's internal buffer. This results in 50 mmap/munmap calls under the global allocator mutex. Reuse MemoryMapper and the buffer for all size classes. This radically reduces number of mmap/munmap calls. Smaller size classes tend to have more objects allocated, so it's highly likely that the buffer allocated for the first size class will be enough for all subsequent size classes.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D105778
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
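A rough model of the buffer reuse described in the memory drain commit above, assuming that each growth of the scratch buffer stands in for one mmap/munmap pair. `ScratchMapper`, `DrainAllShared` and `DrainAllPerClass` are hypothetical names for this sketch and do not reflect the real MemoryMapper interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MemoryMapper. It counts how often the scratch
// buffer has to be (re)allocated, which models mmap/munmap calls.
class ScratchMapper {
 public:
  // Returns a zeroed buffer of at least `size` bytes, growing only if needed.
  unsigned char *Map(std::size_t size) {
    if (size > buffer_.size()) {
      buffer_.assign(size, 0);  // models a fresh mapping
      ++num_allocations_;
    } else {
      std::fill(buffer_.begin(), buffer_.begin() + size, 0);  // reuse
    }
    return buffer_.data();
  }
  int num_allocations() const { return num_allocations_; }

 private:
  std::vector<unsigned char> buffer_;
  int num_allocations_ = 0;
};

// Drains all size classes with one shared mapper; returns allocation count.
int DrainAllShared(const std::vector<std::size_t> &bytes_per_class) {
  ScratchMapper mapper;
  for (std::size_t bytes : bytes_per_class) mapper.Map(bytes);
  return mapper.num_allocations();
}

// Drains all size classes with a fresh mapper per class (the old pattern).
int DrainAllPerClass(const std::vector<std::size_t> &bytes_per_class) {
  int total = 0;
  for (std::size_t bytes : bytes_per_class) {
    ScratchMapper mapper;
    mapper.Map(bytes);
    total += mapper.num_allocations();
  }
  return total;
}
```

When the first size class needs the largest buffer, as the commit message expects for small size classes, the shared mapper allocates once while the per-class pattern allocates once per class.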
Commit
3191ac27e396dbd141243b8ca6cf5660c10ddf5c
by Matthew.ArsenaultAMDGPU: Try to fix test failure with EXPENSIVE_CHECKS
The machine verifier is enabled by default for EXPENSIVE_CHECKS, so its pass runs would pollute the output here.
|
 | llvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll |
Commit
7140382b17df7c33145cc6e9a2df7e84a2259444
by Vitaly Buka[NFC][sanitizer] Move MemoryMapper template parameter
|
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
Commit
8725b382b0a5ea375252d966bafbace62a21e93b
by Vitaly Buka[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
Commit
5bd7cc4f42488129adb135539c64bb3933d5da4c
by Jessica Paquette[AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal
Allow
```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```
This shows up when building clang for AArch64 with GlobalISel.
Also show that we can select it.
This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv
Differential Revision: https://reviews.llvm.org/D105944
|
 | llvm/test/CodeGen/AArch64/GlobalISel/select-int-ptr-casts.mir |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-inttoptr.mir |
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp |
Commit
ed430023e864c3b3ff7f47d5740e5380828c26f6
by Vitaly BukaRevert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer"
Does not compile.
This reverts commit 8725b382b0a5ea375252d966bafbace62a21e93b.
|
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
Commit
5738819679fd3bb08c4848129b27c63690d937a5
by aeubanksRevert "[SCEV] Handle zero stride correctly in howManyLessThans"
This reverts commit 4df591b5c960affd1612e330d0c9cd3076c18053.
Causes crashes, see comments on D105921.
|
 | llvm/lib/Analysis/ScalarEvolution.cpp |
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
Commit
6377388c32ffc1f5c054a813d0bc81ac118108af
by Jon Roelofs[AArch64] Fix AArch64::dsub's size
|
 | llvm/lib/Target/AArch64/AArch64RegisterInfo.td |
Commit
87c6bf92a9c7722b18643ea73f76623f2463c5bb
by Jon Roelofs[AArch64] rm unused subreg's
|
 | llvm/lib/Target/AArch64/AArch64RegisterInfo.td |
Commit
35ce66330a2686878ea0a1da93e0a94961933006
by Vitaly Buka[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
|
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp |
 | compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h |
Commit
071203845887a2ff0347747bd5864f8738d17eef
by hoy[CSSPGO][llvm-profgen] Allow multiple executable load segments.
The linker or post-link optimizer can create an ELF image with multiple executable segments each of which will be loaded separately at run time. This breaks the assumption of llvm-profgen that currently only supports one base load address. What it ends up with is that the subsequent mmap events will be treated as an overwrite of the first mmap event which will in turn screw up address mapping. While it is non-trivial to support multiple separate load addresses and given that on x64 those segments will always be loaded at consecutive addresses (though via separate mmap sys calls), I'm adding an error checking logic to bail out if that's violated and keep using a single load address which is the address of the first executable segment.
Also changing the disassembly output from printing section offset to printing the virtual address instead, which matches the behavior of objdump.
Differential Revision: https://reviews.llvm.org/D103178
|
 | llvm/test/tools/llvm-profgen/symbolize.test |
 | llvm/test/tools/llvm-profgen/Inputs/multi-load-segs.perfscript |
 | llvm/tools/llvm-profgen/ProfiledBinary.cpp |
 | llvm/tools/llvm-profgen/ProfiledBinary.h |
 | llvm/test/tools/llvm-profgen/multi-load-segs.test |
 | llvm/test/tools/llvm-profgen/Inputs/symbolize.perfbin |
 | llvm/test/tools/llvm-profgen/disassemble.test |
 | llvm/test/tools/llvm-profgen/Inputs/multi-load-segs.perfbin |
 | llvm/test/tools/llvm-profgen/disassemble.s |
 | llvm/tools/llvm-profgen/PerfReader.cpp |
 | llvm/test/tools/llvm-profgen/mmapEvent.test |
 | llvm/test/tools/llvm-profgen/Inputs/symbolize.ll |
 | llvm/tools/llvm-profgen/PerfReader.h |
 | llvm/test/tools/llvm-profgen/symbolize.ll |
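The bail-out check described in the multiple-load-segments commit above might look roughly like this. The sketch reads "consecutive addresses" as "each segment starts where the previous one ends", which is an assumption; `MMapEvent` and `FindSingleBaseAddress` are hypothetical names, not llvm-profgen code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one executable segment being mapped at run time.
struct MMapEvent {
  std::uint64_t address;
  std::uint64_t size;
};

// Returns true and stores the first segment's address in `base` when all
// segments are laid out back to back; returns false otherwise so the
// caller can report an error instead of using a wrong address mapping.
bool FindSingleBaseAddress(const std::vector<MMapEvent> &events,
                           std::uint64_t &base) {
  if (events.empty())
    return false;
  std::uint64_t expectedNext = events.front().address + events.front().size;
  for (std::size_t i = 1; i < events.size(); ++i) {
    if (events[i].address != expectedNext)
      return false;  // assumption violated: bail out
    expectedNext += events[i].size;
  }
  base = events.front().address;
  return true;
}
```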
Commit
74b99b5c2eacbdef15b99b3e0a8073598f985bb4
by hoy[CSSPGO] Do not import pseudo probe desc in thinLTO
Previously we relied on pseudo probe descriptors to look up precomputed GUID during probe emission for inlined probes. Since we are moving to always using unique linkage names, GUID for functions can be computed in place from dwarf names. This eliminates the need of importing pseudo probe descs in thinlto, since those descs should be emitted by the original modules.
This significantly reduces thinlto memory footprint in some extreme case where the number of imported modules for a single module is massive.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D105248
|
 | llvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.h |
 | llvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll |
 | llvm/lib/Linker/IRMover.cpp |
 | llvm/test/ThinLTO/X86/Inputs/pseudo-probe-desc-import.ll |
 | llvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.cpp |
 | llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp |
Commit
cda2394d9768f97cbacbbf8a5c6288c1015b981a
by hoy[NFC][CSSPGO] Rename the name of an enum value.
|
 | llvm/tools/llvm-profgen/PerfReader.cpp |
Commit
8a0f1163d02c77c6e764929b66c26ba196cfc549
by richardFix test trying to write a spurious output file into the source directory.
This causes test failures if the source directory is read-only.
|
 | clang/test/CodeGen/builtins-ppc-xlcompat-pwr9.c |
 | clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-64bit.c |
Commit
205ed009a44c2b04a15aea039d8947e74856f158
by efriedma[SCEV] Handle zero stride correctly in howManyLessThans
This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.
Recommitting with isLoopInvariant() check.
Differential Revision: https://reviews.llvm.org/D105921
|
 | llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll |
 | llvm/lib/Analysis/ScalarEvolution.cpp |
Commit
1100e4aafea233bc8bbc307c5758a7d287ad3bae
by tianshilei1992[AbstractAttributor] Fold function calls to `__kmpc_is_spmd_exec_mode` if possible
In the device runtime there are many function calls to `__kmpc_is_spmd_exec_mode` to query the execution mode of current kernels. In many cases, user programs only contain target regions executing in one mode. As a consequence, those runtime function calls will only return one value. If we can get rid of these function calls during compilation, it can potentially improve performance.
In this patch, we use `AAKernelInfo` to analyze kernel execution. Basically, for each kernel (device) function `F`, we collect all kernel entries `K` that can reach `F`. A new AA, `AAFoldRuntimeCall`, is created for each call site. In each iteration, it will check all reaching kernel entries, and update the folded value accordingly.
In the future we will support more functions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105787
|
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | llvm/test/Transforms/OpenMP/is_spmd_exec_mode_fold.ll |
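The folding idea from the `__kmpc_is_spmd_exec_mode` commit above can be sketched as a check over the kernels that reach a call site. This is a simplified model, not the Attributor code: `ExecMode` and `TryFoldIsSPMDExecMode` are hypothetical names, and refusing to fold when no reaching kernel is known is an assumption of the sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical execution mode of a kernel entry.
enum class ExecMode { Generic, SPMD };

// Tries to fold an "is SPMD exec mode" query at one call site.
// Returns true and sets `isSPMD` only when every kernel that can reach the
// call site runs in the same mode; otherwise the call must stay.
bool TryFoldIsSPMDExecMode(const std::vector<ExecMode> &reachingKernels,
                           bool &isSPMD) {
  if (reachingKernels.empty())
    return false;  // assumption: nothing known, so do not fold
  const ExecMode first = reachingKernels.front();
  for (ExecMode mode : reachingKernels)
    if (mode != first)
      return false;  // mixed modes: the result is not a single constant
  isSPMD = (first == ExecMode::SPMD);
  return true;
}
```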
Commit
fef5f4456abcb1ea052206db6c232468d70b07f2
by hoy[CSSPGO][llvm-profgen] Fix a missing initialization
Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178 .
|
 | llvm/tools/llvm-profgen/ProfiledBinary.h |
Commit
597e9c61cee39071141f3c8f31f47561d2844196
by hoyRevert "[CSSPGO][llvm-profgen] Fix a missing initialization"
This reverts commit fef5f4456abcb1ea052206db6c232468d70b07f2.
|
 | llvm/tools/llvm-profgen/ProfiledBinary.h |
Commit
6b04ecaab355f0dfce8a980cb67a39662759734c
by hoy[CSSPGO][llvm-profgen] Fix a missing initialization
Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178 .
|
 | llvm/tools/llvm-profgen/ProfiledBinary.h |
Commit
64785ac12ef8b94fe7281e2cbe2db68d64d55e4c
by Jinsong Ji[AIX] Update testcase to use aix triple
Now that we have implemented the basic MCAsmParser, we can use the triple directly.
|
 | llvm/test/MC/PowerPC/modern-aix-as.s |
Commit
d5c0b0102a25c27f41137588422d368eb42d971e
by llvm-project[Polly] Fix typo. NFC.
Thanks to Mugerwa Martin for reporting.
|
 | polly/docs/Architecture.rst |
Commit
ba127a45701b5fa870a1df6b1fb09a351ad14051
by Vitaly Buka[sanitizer] Convert script to python 3
|
 | compiler-rt/test/sanitizer_common/android_commands/android_compile.py |
Commit
40ce58d0ca10a1195da82895749b67f30f000243
by david.greenRevert "[clang] Refactor AST printing tests to share more infrastructure"
This reverts commit 20176bc7dd3f431db4c3d59b51a9f53d52190c82 as some versions of GCC do not seem to handle the new code very well. They complain about:
/tmp/ccqUQZyw.s: Assembler messages:
/tmp/ccqUQZyw.s:1151: Error: symbol `_ZNSt14_Function_base13_Base_managerIN5clangUlPKNS1_4StmtEE2_EE10_M_managerERSt9_Any_dataRKS7_St18_Manager_operation' is already defined
/tmp/ccqUQZyw.s:11963: Error: symbol `_ZNSt17_Function_handlerIFbPKN5clang4StmtEENS0_UlS3_E2_EE9_M_invokeERKSt9_Any_dataOS3_' is already defined
This seems like it is some GCC issue, but multiple buildbots (and my local machine) are all failing because of it.
|
 | clang/unittests/AST/NamedDeclPrinterTest.cpp |
 | clang/unittests/AST/ASTPrint.h |
 | clang/unittests/AST/DeclPrinterTest.cpp |
 | clang/unittests/AST/StmtPrinterTest.cpp |
Commit
94210b12d1d6454c6de8ca4c83a82a1148b5cd1a
by Vitaly Buka[sanitizer] Upgrade android scripts to python 3
|
 | compiler-rt/test/sanitizer_common/android_commands/android_common.py |
 | compiler-rt/test/sanitizer_common/android_commands/android_run.py |
Commit
16f8207de377a055b7b75a3003d82059ca63992d
by Vitaly Buka[sanitizer] Fix type error in python 3
|
 | compiler-rt/test/sanitizer_common/android_commands/android_run.py |
Commit
08cf69c31f849310ec45945d18f0feef4ea8f2e6
by zakk.chen[RISCV] Support overloading for RVV miscellaneous functions.
Based on this update to the intrinsic doc https://github.com/riscv/rvv-intrinsic-doc/pull/103
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D105611
|
 | clang/utils/TableGen/RISCVVEmitter.cpp |
 | clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c |
 | clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c |
 | clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vreinterpret.c |
 | clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c |
 | clang/include/clang/Basic/riscv_vector.td |
Commit
8ae31b08d9da5f42dd149eb48ef3e3baae2d1b07
by joker.ephReformulate OrcJIT tutorial doc to make it more clear.
Fixed a minor writing error. The text was hard to understand.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105899
|
 | llvm/docs/tutorial/BuildingAJIT2.rst |
Commit
dfd9808b6cea59ff075498ee7e6e57f2b5b3a798
by Vitaly Bukasanitizer_common: add simpler ThreadRegistry ctor
Currently ThreadRegistry is overcomplicated because of tsan, which needs tid quarantine and reuse counters. Other sanitizers don't need that. It also seems that no other sanitizer now needs max number of threads. Asan used to need 2^24 limit, but it does not seem to be needed now. Other sanitizers blindly copy-pasted that without reasons. Lsan also uses quarantine, but I don't see why that may be potentially needed.
Add a ThreadRegistry ctor that does not require any sizes and use it in all sanitizers except for tsan. In preparation for new tsan runtime, which won't need any of these parameters as well.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D105713
|
 | compiler-rt/lib/sanitizer_common/tests/sanitizer_thread_registry_test.cpp |
 | compiler-rt/lib/memprof/memprof_thread.h |
 | compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.h |
 | compiler-rt/lib/asan/asan_thread.h |
 | compiler-rt/lib/memprof/memprof_thread.cpp |
 | compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp |
 | compiler-rt/lib/lsan/lsan_thread.cpp |
 | compiler-rt/lib/asan/asan_thread.cpp |
Commit
2c425c17e678c522d8f4961e9ad94ad718a7cba0
by martin[libcxx] [test] Clarify weak_ptr_ret on Windows, remove a LIBCXX-WINDOWS-FIXME
On Windows, structs with a destructor are always returned indirectly; add this to the list of known exceptions in the test where the class isn't returned in registers as expected.
Differential Revision: https://reviews.llvm.org/D105906
|
 | libcxx/test/libcxx/memory/trivial_abi/weak_ptr_ret.pass.cpp |
Commit
5635d2a56dab6dc64d3a3f185d68f676b81dc736
by kito.cheng[RISCV] Pass -u to linker correctly.
`-u` is a linker option used to pretend a symbol is undefined; it is commonly used to force archive member extraction.
This option should be passed to `ld`; many other toolchains in Clang, such as `tools::gnutools`, already pass it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105091
|
 | clang/lib/Driver/ToolChains/RISCVToolchain.cpp |
 | clang/test/Driver/riscv-args.c |