Commit
e582c073d19b3c15d39cbe15a2f30b6190c6d0b1
by amy.kwan1[NFC][PowerPC] Add additional load/store test cases
This patch adds additional load/store test cases involving scalars, vectors, and PC-Rel in preparation for the refactored load and store implementation introduced in D93370.
Differential Revision: https://reviews.llvm.org/D97391
|
 | llvm/test/CodeGen/PowerPC/atomics-i8-ldst.ll |
 | llvm/test/CodeGen/PowerPC/vector-ldst.ll |
 | llvm/test/CodeGen/PowerPC/atomics-i32-ldst.ll |
 | llvm/test/CodeGen/PowerPC/f128_ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-i16-ldst.ll |
 | llvm/test/CodeGen/PowerPC/int128_ldst.ll |
 | llvm/test/CodeGen/PowerPC/pcrel_ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-i8-ldst.ll |
 | llvm/test/CodeGen/PowerPC/atomics-i16-ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-float-ldst.ll |
 | llvm/test/CodeGen/PowerPC/atomics-i64-ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-i64-ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-double-ldst.ll |
 | llvm/test/CodeGen/PowerPC/scalar-i32-ldst.ll |
Commit
23cc8ebf59c661ebb988370a0edbcda37b61080a
by Jan Svoboda[clang][lex] Speculative fix for buffer overrun on raw string parse
This attempts to fix a (non-deterministic) buffer overrun when parsing raw string literals during a modular build.
Similar fix to 4e5b5c36f47c9a406ea7f6b4f89fae477693973a.
Reviewed By: beccadax
Differential Revision: https://reviews.llvm.org/D94950
|
 | clang/lib/Lex/LiteralSupport.cpp (diff) |
Commit
74c270f33eb16d336b4ab834e18b27f8efcbabe8
by n.james93[ASTMatchers] Don't forward matchers in MapAnyOf
Forwarding these means that if an r-value reference is passed, the matcher will be moved. However, it appears this happens for each mapped node matcher, resulting in use-after-move issues.
Reviewed By: steveire
Differential Revision: https://reviews.llvm.org/D98497
|
 | clang/include/clang/ASTMatchers/ASTMatchersInternal.h (diff) |
Commit
0333dde923c42219863f314d6c9fc0dcd352ef02
by n.james93[clang-tidy] Fix readability-identifier-naming duplicating prefix or suffix for replacements.
If an identifier has a correct prefix/suffix but a bad case, the fix won't strip them when computing the correct case, leading to duplication when they are added back.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98521
|
 | clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp (diff) |
 | clang-tools-extra/test/clang-tidy/checkers/readability-identifier-naming.cpp (diff) |
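The prefix/suffix duplication described in this commit can be illustrated with a small sketch. This is not the clang-tidy implementation: `to_lower_case`, `fix_naive` and `fix_stripping` are hypothetical helpers, and lower_case is assumed as the target style purely for illustration.

```python
import re


def to_lower_case(name):
    # Hypothetical case converter: insert "_" at lower-to-upper boundaries,
    # then lowercase everything (FooBar -> foo_bar).
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def fix_naive(name, prefix="", suffix=""):
    # Buggy variant: converts the whole name, then adds the prefix/suffix
    # again even if the name already carried them.
    return prefix + to_lower_case(name) + suffix


def fix_stripping(name, prefix="", suffix=""):
    # Strip an already-correct prefix/suffix before fixing the case,
    # so that adding them back cannot duplicate them.
    core = name
    if prefix and core.startswith(prefix):
        core = core[len(prefix):]
    if suffix and core.endswith(suffix):
        core = core[:-len(suffix)]
    return prefix + to_lower_case(core) + suffix
```

With a required prefix of `m_`, the naive variant turns `m_FooBar` into `m_m_foo_bar`, while the stripping variant yields `m_foo_bar`.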
Commit
da55af7f1d348c133774d8e8117d60462363fef5
by dmitry.polukhin[clang-tidy] Enable modernize-concat-nested-namespaces also on headers
For some reason the initial implementation of the check had an explicit check for the main file to avoid being applied in headers. This diff removes this check and adds a test for the check on a header.
Similar approach was proposed in D61989 but review there got stuck.
Test Plan: added new test case
Differential Revision: https://reviews.llvm.org/D97563
|
 | clang-tools-extra/clang-tidy/modernize/ConcatNestedNamespacesCheck.cpp (diff) |
 | clang-tools-extra/test/clang-tidy/checkers/Inputs/modernize-concat-nested-namespaces/modernize-concat-nested-namespaces.h |
 | clang-tools-extra/test/clang-tidy/checkers/modernize-concat-nested-namespaces.cpp (diff) |
Commit
0b2aae42e5ea16a746d91a2945bf1e399fe485e3
by david.green[AArch64] Zero extended extract_vector_elt pattern
This adds a pattern for i64 zext_inreg(i32 extract_vector_elt X), producing a single UMOVvi16 instruction that is already expected to clear the top bits. The exact pattern that this matches is and(anyext(vector_extract X, lane), 0xff), similar to the sext patterns higher up in the same file.
Differential Revision: https://reviews.llvm.org/D98599
|
 | llvm/test/CodeGen/AArch64/build-vector-extract.ll (diff) |
 | llvm/lib/Target/AArch64/AArch64InstrInfo.td (diff) |
Commit
6f37d18d8cb109848b16ab55b631a9bdb4956a7a
by vyng[asan] Fixed test failing on windows due to different printf behaviour.
%p reportedly prints uppercase hex characters on Windows. The fix is to switch to using %#lx.
Differential Revision: https://reviews.llvm.org/D98570
|
 | compiler-rt/test/asan/TestCases/wild_pointer.cpp (diff) |
Commit
814339454d9e18f3298abbfba13104ce5e01e0b2
by llvm-dev[X86][SSE] canonicalizeShuffleWithBinOps - handle target shuffles.
Fold SHUFFLE(BINOP(SHUFFLE(X),SHUFFLE(Y))) -> BINOP(SHUFFLE'(X),SHUFFLE'(Y)) style patterns as well as the existing shuffles of constants.
|
 | llvm/test/CodeGen/X86/horizontal-sum.ll (diff) |
 | llvm/test/CodeGen/X86/srem-seteq-illegal-types.ll (diff) |
 | llvm/test/CodeGen/X86/haddsub-shuf.ll (diff) |
 | llvm/test/CodeGen/X86/phaddsub.ll (diff) |
 | llvm/test/CodeGen/X86/known-signbits-vector.ll (diff) |
 | llvm/test/CodeGen/X86/vec_usubo.ll (diff) |
 | llvm/test/CodeGen/X86/haddsub-undef.ll (diff) |
 | llvm/test/CodeGen/X86/vec_uaddo.ll (diff) |
 | llvm/test/CodeGen/X86/vector-shuffle-sse4a.ll (diff) |
 | llvm/lib/Target/X86/X86ISelLowering.cpp (diff) |
 | llvm/test/CodeGen/X86/haddsub-3.ll (diff) |
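Why a fold of the form SHUFFLE(BINOP(SHUFFLE(X),SHUFFLE(Y))) -> BINOP(SHUFFLE'(X),SHUFFLE'(Y)) is sound for lane-wise operations can be sketched on plain lists. This models single-input shuffle masks only; `shuffle`, `binop` and `compose` are illustrative names, not LLVM APIs, and whether the merged masks are cheaper is a target question that the sketch does not address.

```python
import operator


def shuffle(vec, mask):
    # Lane k of the result is vec[mask[k]].
    return [vec[i] for i in mask]


def binop(op, a, b):
    # Lane-wise binary operation.
    return [op(x, y) for x, y in zip(a, b)]


def compose(outer, inner):
    # Single mask equivalent to applying `inner` first and `outer` second.
    return [inner[i] for i in outer]


def unfolded(op, x, mx, y, my, outer):
    return shuffle(binop(op, shuffle(x, mx), shuffle(y, my)), outer)


def folded(op, x, mx, y, my, outer):
    # The outer shuffle is pushed through the binop and merged into each input mask.
    return binop(op, shuffle(x, compose(outer, mx)), shuffle(y, compose(outer, my)))
```

Because the binop works lane by lane, permuting its result is the same as permuting both inputs with the same mask, which is what makes the merged masks valid.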
Commit
8e1c09ee5f8066918c6c91982395f6449f97f056
by tkeith[flang] Build intrinsic .mod files in include/flang
The build was putting .mod files for intrinsic modules in tools/flang/include/flang but the install puts them in include/flang, as does the out-of-tree build. This confused things for the driver. This change makes the build consistent with the install and simplifies the flang script accordingly.
Also, clean up the cmake commands for building the .mod files.
Differential Revision: https://reviews.llvm.org/D98522
|
 | flang/CMakeLists.txt (diff) |
 | flang/test/CMakeLists.txt (diff) |
 | flang/tools/f18/CMakeLists.txt (diff) |
 | flang/tools/f18/flang (diff) |
Commit
752f477d677b73039e9073d700c6def99c153445
by kostyak[scudo][standalone] Add shared library to makefile
Since we are looking to remove the old Scudo, we have to have a .so for parity purposes as some platforms use it.
I tested this on Fuchsia & Linux, not on Android though.
Differential Revision: https://reviews.llvm.org/D98456
|
 | compiler-rt/lib/scudo/standalone/CMakeLists.txt (diff) |
Commit
13e49dcee48f7bffec17df48b87e3237aebd5b1d
by jonathanchesterfield[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kernel functions, moves them into a new struct type, and allocates an instance of that type in every kernel. Uses are then replaced with a constantexpr offset.
Prior to this pass, accesses from a function are compiled to trap. With this pass, most such accesses are removed before reaching codegen. The trap logic is left unchanged by this pass. It is still reachable for the cases this pass misses, notably the extern shared construct from hip and variables marked constant which survive the optimizer.
This is of interest to the openmp project because the deviceRTL runtime library uses cuda shared variables from functions that cannot be inlined. Trunk llvm therefore cannot compile some openmp kernels for amdgpu. In addition to the unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled and the function pointer hashing scheme deleted passes the openmp suite.
This lowering will use more LDS than strictly necessary. It is intended to be a functionally correct fallback for cases that are difficult to target from future optimisation passes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D94648
|
 | llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll |
 | llvm/test/CodeGen/AMDGPU/lds-global-non-entry-func.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/lower-module-lds-inactive.ll |
 | llvm/test/CodeGen/AMDGPU/lower-module-lds.ll |
 | llvm/test/CodeGen/AMDGPU/addrspacecast-initializer-unsupported.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/lower-module-lds-used-list.ll |
 | llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h (diff) |
 | llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-constantexpr-use.ll (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/lower-module-lds-indirect.ll |
 | llvm/lib/Target/AMDGPU/SIISelLowering.cpp (diff) |
 | llvm/lib/Target/AMDGPU/CMakeLists.txt (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/lds-global-non-entry-func.ll (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPU.h (diff) |
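The core idea of the lowering above (gather the variables into one struct and replace each use with a constant offset into it) can be sketched as a layout computation. This is a simplified model, not the pass itself: the `(name, size, align)` tuples, the function name `layout_struct`, and the choice to order fields by descending alignment are all assumptions made for illustration.

```python
def layout_struct(variables):
    """Assign each variable an offset inside one combined struct.

    variables: iterable of (name, size_in_bytes, alignment) tuples.
    Returns (offsets, total_size), where offsets maps name -> byte offset.
    """
    offsets = {}
    offset = 0
    # Assumption: most-aligned fields first (ties broken by name) to limit padding.
    for name, size, align in sorted(variables, key=lambda v: (-v[2], v[0])):
        # Round the running offset up to the field's alignment.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
    return offsets, offset
```

Every kernel would then reserve `total_size` bytes, and a use of a variable becomes "struct base plus `offsets[name]`", which is why the lowering can use more LDS than a given kernel strictly needs.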
|
 | llvm/utils/gn/secondary/llvm/lib/Target/AMDGPU/BUILD.gn (diff) |
Commit
995a128f07b60e82fdf7eb1e13670b4cda992379
by martin[libcxx] [docs] Update docs about how to build for Windows
Refresh the existing paragraphs on building in MSVC configurations, add a sample of one working configuration for MinGW, and add more details on what's necessary to run the tests these days.
Differential Revision: https://reviews.llvm.org/D97166
|
 | libcxx/www/index.html (diff) |
 | libcxx/docs/BuildingLibcxx.rst (diff) |
Commit
f60b35340fd70acb3c8003349931f085305886ad
by thomasp Stop trapping on sNaN in __builtin_isinf
__builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case.
Reviewed By: mibintc
Differential Revision: https://reviews.llvm.org/D97125
|
 | clang/lib/CodeGen/CGBuiltin.cpp (diff) |
 | clang/test/CodeGen/aarch64-strictfp-builtins.c (diff) |
 | clang/test/CodeGen/strictfp_builtins.c (diff) |
 | clang/test/CodeGen/X86/strictfp_builtins.c (diff) |
 | clang/test/CodeGen/builtin_float_strictfp.c (diff) |
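An integer-only infinity test of the kind this commit describes can be sketched for IEEE-754 binary64 values. This is an illustration of the idea, not the IR clang emits; `is_inf_pattern`, `bits_of_double` and `isinf_by_bits` are hypothetical names.

```python
import struct

ABS_MASK = 0x7FFFFFFFFFFFFFFF  # every bit except the sign bit
INF_BITS = 0x7FF0000000000000  # exponent all ones, mantissa zero


def is_inf_pattern(bits):
    # Pure integer test on a raw binary64 bit pattern: no floating-point
    # compare is involved, so a signaling NaN is handled like any other value.
    return (bits & ABS_MASK) == INF_BITS


def bits_of_double(x):
    # Reinterpret a Python float as its 64-bit pattern.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def isinf_by_bits(x):
    return is_inf_pattern(bits_of_double(x))
```

For example, `is_inf_pattern(0x7FF0000000000001)` (a signaling NaN pattern) is false, while both `float("inf")` and `float("-inf")` are reported as infinite.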
|
 | compiler-rt/lib/builtins/riscv/save.S |
 | compiler-rt/lib/builtins/CMakeLists.txt (diff) |
 | compiler-rt/lib/builtins/riscv/restore.S |
|
 | llvm/test/Transforms/InstSimplify/call.ll (diff) |
Commit
660728acd4f09294d18b9a0b51b8e94a68efd1a5
by spatel[InstSimplify] ctlz({signbit} >>u x) --> x
The motivating pattern was handled in 0a2d69480d, but we should have this for symmetry.
But this really highlights that we could generalize for any shifted constant if we match this in instcombine.
https://alive2.llvm.org/ce/z/MrmVNt
|
 | llvm/test/Transforms/InstSimplify/call.ll (diff) |
 | llvm/lib/Analysis/InstructionSimplify.cpp (diff) |
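The identity ctlz({signbit} >>u x) == x can be checked exhaustively for in-range shift amounts with a short sketch. The helper names are illustrative; shift amounts of `width` or more are excluded because such a shift has no defined result in LLVM IR.

```python
def ctlz(value, width):
    # Count leading zeros of `value` viewed as a `width`-bit unsigned integer.
    return width - value.bit_length()


def signbit_lshr(x, width):
    # Logical right shift of the sign-bit constant (only the top bit set) by x.
    return (1 << (width - 1)) >> x


def fold_holds(width):
    # True if ctlz(signbit >>u x) == x for every valid shift amount x.
    return all(ctlz(signbit_lshr(x, width), width) == x for x in range(width))
```

Shifting the single set bit right by x leaves exactly x zero bits above it, which is the whole argument.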
Commit
33b1f3f42cb9fc4fed9501ed49e4805f134e7a1b
by melanie.blower[clang][patch] Solve PR49479, File scope fp pragma should propagate to functions nested in struct, and initialization expressions
Previously, the CurFPFeatures state was set to the command-line settings before semantic analysis of the nested member functions and initialization expressions. That is not correct: it should use the pragma state that is in effect at the lexical position.
Reviewed By: Erich Keane, Aaron Ballman
Differential Revision: https://reviews.llvm.org/D98211
|
 | clang/test/CodeGen/fp-floatcontrol-stack.cpp (diff) |
 | clang/lib/Parse/ParseDeclCXX.cpp (diff) |
|
 | llvm/include/llvm/ADT/IntrusiveRefCntPtr.h (diff) |
 | llvm/include/llvm/Support/FormatVariadicDetails.h (diff) |
 | llvm/include/llvm/Support/Error.h (diff) |
Commit
4e67ae7b6b1c6c06f40191d9c968717101903761
by jianzhouzh[dfsan] Add origin ABI wrappers for thread/signal/fork
This is a part of https://reviews.llvm.org/D95835.
See https://github.com/llvm/llvm-project/commit/bb91e02efd00eda04296069a83228c8d9db105b7 about the similar issue of fork in MSan's origin tracking.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D98359
|
 | compiler-rt/lib/dfsan/dfsan_custom.cpp (diff) |
 | compiler-rt/lib/dfsan/dfsan_thread.cpp (diff) |
 | compiler-rt/test/dfsan/atomic.cpp (diff) |
 | compiler-rt/test/dfsan/sigaction_stress_test.c (diff) |
 | compiler-rt/test/dfsan/custom.cpp (diff) |
 | compiler-rt/test/dfsan/origin_with_signals.cpp |
 | compiler-rt/lib/dfsan/dfsan.cpp (diff) |
 | compiler-rt/test/dfsan/fork.cpp |
 | compiler-rt/lib/dfsan/dfsan_thread.h (diff) |
 | compiler-rt/test/dfsan/origin_with_sigactions.c |
 | compiler-rt/lib/dfsan/done_abilist.txt (diff) |
 | compiler-rt/test/dfsan/pthread.c (diff) |
Commit
0aceb61665dadbd80b24852cbe55cf0414bd4324
by zinenko[mlir] make memref.cast implement ViewLikeOpInterface
This was seemingly dropped in e2310704d890ad252aeb1ca28b4b84d29514b1d1, potentially due to a misrebase. The absence of this trait makes aliasing analysis incorrect, leading to, e.g., buffer deallocation pass inserting deallocations too early.
|
 | mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (diff) |
Commit
772155793bd0def6e9c12c063a5fb330c416adfa
by llvm-dev[X86][SSE] isHorizontalBinOp - ensure we clear any unused source operands to improve HADD/SUB matching
Our shuffle matching for HADD/SUB patterns wasn't clearing repeated ops in 'fake unary' style shuffle masks (unpack(x,x) etc.), preventing matching of add(fakeunary(),fakeunary()) style patterns.
|
 | llvm/test/CodeGen/X86/haddsub-undef.ll (diff) |
 | llvm/lib/Target/X86/X86ISelLowering.cpp (diff) |
 | llvm/test/CodeGen/X86/haddsub-3.ll (diff) |
Commit
3dc5b533e093ee5df92b3c11ee2150869e83b8a6
by craig.topper[RISCV] Improve legalization of i32 UADDO/USUBO on RV64.
The default legalization uses zero extends that require a pair of shifts on RISCV. Instead we can take advantage of the fact that unsigned compares work equally well on sign extended inputs. This allows us to use addw/subw and sext.w.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D98233
|
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp (diff) |
 | llvm/test/CodeGen/RISCV/xaluo.ll (diff) |
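The claim that unsigned compares work equally well on sign-extended inputs can be sketched for the UADDO case by modelling 64-bit register contents as Python integers. This is a model of the arithmetic only, not of the actual lowering code; `sext32`, `uaddo_reference` and `uaddo_sext` are illustrative names.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sext32(v):
    # Sign-extend the low 32 bits of v into a 64-bit register value
    # (returned as a non-negative Python int).
    v &= MASK32
    if v & 0x80000000:
        return v | (MASK64 ^ MASK32)
    return v


def uaddo_reference(a, b):
    # Textbook definition: 32-bit sum plus a carry-out flag.
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total > MASK32


def uaddo_sext(a, b):
    # Operands are held sign-extended; the add models an addw-style
    # instruction (32-bit add, sign-extended result).
    sa = sext32(a)
    s = sext32(sa + sext32(b))
    # Sign extension preserves unsigned order, so "sum <u a" can be
    # evaluated on the sign-extended 64-bit values directly.
    return s & MASK32, s < sa
```

Sign extension maps the 32-bit unsigned range monotonically into the 64-bit unsigned range, so the usual overflow test "sum is less than an operand" gives the same answer without zero-extending first.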
Commit
39970764af39415ad62136ff75b0f89577c18037
by ctetreau[CMake] Require python 3.6 if enabling LLVM test targets
The lit test suite uses python 3.6 features. Rather than a strange python syntax error upon running the lit tests, we will require the correct version in CMake.
Reviewed By: serge-sans-paille, yln
Differential Revision: https://reviews.llvm.org/D95635
|
 | llvm/CMakeLists.txt (diff) |
 | mlir/CMakeLists.txt (diff) |
 | lld/CMakeLists.txt (diff) |
 | llvm/cmake/modules/HandleLLVMOptions.cmake (diff) |
 | clang/CMakeLists.txt (diff) |
|
 | llvm/utils/gn/secondary/compiler-rt/lib/builtins/BUILD.gn (diff) |
Commit
f5f3a59837f41a1225833c55e3d0ac39db663626
by martin[libcxx] [test] Disable some allocation checks in class.path tests on windows
On windows, the path internal representation is wchar_t, and input/output often goes through utf8 in between, which causes extra allocations.
MS STL also fails a number of strict allocation checks, so this shouldn't be a standards compliance issue.
Differential Revision: https://reviews.llvm.org/D98398
|
 | libcxx/test/std/input.output/filesystems/class.path/path.member/path.concat.pass.cpp (diff) |
 | libcxx/test/std/input.output/filesystems/class.path/path.member/path.append.pass.cpp (diff) |
 | libcxx/test/std/input.output/filesystems/class.path/path.member/path.native.obs/string_alloc.pass.cpp (diff) |
 | libcxx/test/support/test_macros.h (diff) |
 | libcxx/test/std/input.output/filesystems/class.path/path.member/path.assign/source.pass.cpp (diff) |
Commit
d07e5c23b40078dcae13f76b091c9e18763ae44a
by martin[libcxx] [test] Fix the get_temp_file_name() function for mingw
Add the missing includes for getting the defines and functions used in the mingw version of get_temp_file_name().
This fixes 31 tests when built in a mingw configuration.
Also remove a redundant ifdef; _WIN32 is defined in mingw targets too.
Differential Revision: https://reviews.llvm.org/D97456
|
 | libcxx/test/support/platform_support.h (diff) |
Commit
156842937f5117176afae659d413f2891b81d4b9
by jonathanchesterfield[libomptarget][amdgcn] Drop use of inttypes.h, moving closer to freestanding
The glibc headers are a periodic source of problems compiling the devicertl. This patch resolves the following error run into while building llvm on a slightly different linux system.
```
In file included from .../lib/clang/13.0.0/include/inttypes.h:21:
In file included from /usr/include/inttypes.h:25:
/usr/include/features.h:461:12: fatal error: 'sys/cdefs.h' file not found
# include <sys/cdefs.h>
          ^~~~~~~~~~~~~
```
As a second patch, removing assert.h from shuffle will let amdgcn build as -ffreestanding, at which point only the headers that clang itself provides are used and interactions with the host glibc are eliminated. Doing the same for nvptx is complicated by printf handling but also seems worthwhile.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D98565
|
 | openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h (diff) |
 | openmp/libomptarget/deviceRTLs/common/include/target/shuffle.h (diff) |
Commit
b7df372cdcd88b39ed3a05b5c8f09e879400f688
by llvm-project[Polly] Refactoring astScheduleDimIsParallel to take the C++ wrapper object. NFC
Polly needs to be gradually refactored to use the C++ wrapper objects, which handle the reference counters automatically. I took the function astScheduleDimIsParallel and refactored it so that it uses the C++ wrappers as much as possible.
There are some problems with IsParallel, since it expects the C objects, so the C++ wrapper objects must first be converted with .release() and .get() before they can be used with IsParallel.
When checking the ReductionDependencies Parallelism with the Build's Schedule, I opted to keep the union map as a C object rather than a C++ object. Eventually, changes will need to be made to IsParallel to refactor it to the C++ wrappers. When this is done, this function will also need to be slightly refactored to not use the C object.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98455
|
 | polly/lib/CodeGen/IslAst.cpp (diff) |
Commit
0035decae7ab9ab1c988fdcede46598540afd1a0
by fraser[CodeGen] Fix issues with scalable-vector INSERT/EXTRACT_SUBVECTORs
This patch addresses a few issues when dealing with scalable-vector INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes.
When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we store the low and high halves to the stack separately. The offset for the high half was calculated incorrectly.
Additionally, we can optimize this process when we can detect that the subvector is contained entirely within the low/high split vector type. While this optimization is valid on scalable vectors, when performing the 'high' optimization, the subvector must also be a scalable vector. Note that the 'low' optimization is still conservative: it may be possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32, but we can't guarantee it. It is always possible to insert v2i32 into nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1.
Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value type being a scalable vector, forgetting that we can also extract a fixed-length vector from a scalable one.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98495
|
 | llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (diff) |
 | llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll (diff) |
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (diff) |
 | llvm/test/CodeGen/RISCV/rvv/insert-subvector.ll (diff) |
Commit
edf634ebc2673c2f6f878390c1f90bfd799f4a6b
by aqjune[AssumeBundles] Add nonnull/align to op bundle if noundef exists
This is a patch to add nonnull and align to assume's operand bundle only if noundef exists. Since nonnull and align in fn attr have poison semantics, they should be paired with noundef or noundef-implying attributes to be immediate UB.
Reviewed By: jdoerfert, Tyker
Differential Revision: https://reviews.llvm.org/D98228
|
 | llvm/test/Transforms/Util/assume-builder.ll (diff) |
 | llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp (diff) |
 | llvm/test/Transforms/Util/assume-builder-counter.ll (diff) |
 | llvm/unittests/Analysis/AssumeBundleQueriesTest.cpp (diff) |
Commit
b5e228fc00b18b1ce2ac823fe61b03b07600a9ad
by martin[libcxx] [test] Fix the temp_directory_path test for windows
Check a different set of env vars, don't check the exact value of the fallback path. (GetTempPath falls back to returning the Windows folder if nothing better is available in env vars.)
The test still fails one check on windows (due to relying on perms::none), which will be addressed separately.
Differential Revision: https://reviews.llvm.org/D98139
|
 | libcxx/test/std/input.output/filesystems/fs.op.funcs/fs.op.temp_dir_path/temp_directory_path.pass.cpp (diff) |
|
 | flang/docs/GettingInvolved.md (diff) |
Commit
fcfd3fda71905d7c48f75a531c2265ad3b9876ea
by luke.drummond[OpenCL] Respect calling convention for builtin
`__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC.
Instruction Combining was lowering these mismatching calling conventions to `store i1* undef`, which itself was subsequently lowered to a trap instruction by simplifyCFG, resulting in a runtime `SIGILL`.
There are arguably two bugs here, but whether there's any wisdom in converting an obviously invalid call into a runtime crash over aborting with a sensible error message will require further discussion. So for now it's enough to set the right calling convention on the runtime helper.
Reviewed By: svenh, bader
Differential Revision: https://reviews.llvm.org/D98411
|
 | clang/test/CodeGenOpenCL/sampler.cl (diff) |
 | clang/lib/CodeGen/CodeGenModule.cpp (diff) |
Commit
9628cb1feef63d764c57fd0652016f9188000e2f
by sguelton[NFC] Use higher level constructs to check for whitespace/newlines in the lexer
It turns out that according to valgrind and perf, it's also slightly faster.
Differential Revision: https://reviews.llvm.org/D98637
|
 | clang/lib/Lex/Lexer.cpp (diff) |
|
 | clang/include/clang/Basic/LangOptions.def (diff) |
 | clang/test/Sema/128bitfloat.cpp (diff) |
 | clang/lib/Basic/Targets/PPC.cpp (diff) |
 | clang/lib/Basic/IdentifierTable.cpp (diff) |
Commit
b868a3edad9d453eae1c18ec8d9e1feee0cb8c9a
by zinenko[mlir] fix SPIR-V CPU and Vulkan runners after e2310704d890ad252aeb1ca28b4b84d29514b1d1
The commit in question changed the syntax but did not update the runner tests. This also required registering the MemRef dialect for the custom parser to work correctly.
|
 | mlir/tools/mlir-spirv-cpu-runner/CMakeLists.txt (diff) |
 | mlir/tools/mlir-vulkan-runner/CMakeLists.txt (diff) |
 | mlir/test/mlir-spirv-cpu-runner/double.mlir (diff) |
 | mlir/test/mlir-vulkan-runner/time.mlir (diff) |
 | mlir/tools/mlir-vulkan-runner/mlir-vulkan-runner.cpp (diff) |
 | mlir/test/mlir-vulkan-runner/addf.mlir (diff) |
 | mlir/test/mlir-vulkan-runner/mulf.mlir (diff) |
 | mlir/test/mlir-vulkan-runner/addi8.mlir (diff) |
 | mlir/test/mlir-vulkan-runner/subf.mlir (diff) |
 | mlir/test/mlir-vulkan-runner/addi.mlir (diff) |
 | mlir/tools/mlir-spirv-cpu-runner/mlir-spirv-cpu-runner.cpp (diff) |
 | mlir/test/mlir-spirv-cpu-runner/simple_add.mlir (diff) |
Commit
ab86edbc88fa41e9cb9c6b43d99b69278c9c5040
by stelios.ioannou[AArch64] Implement __rndr, __rndrrs intrinsics
This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined.
These intrinsics store the random number in their pointer argument and return a status code indicating whether the generation succeeded. The difference between __rndr and __rndrrs is that the latter intrinsic reseeds the random number generator.
The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
[2] https://bugs.llvm.org/show_bug.cgi?id=47838
Differential Revision: https://reviews.llvm.org/D98264
Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc
|
 | llvm/test/CodeGen/AArch64/rand.ll |
 | llvm/lib/Target/AArch64/AArch64InstrInfo.td (diff) |
 | clang/test/CodeGen/builtins-arm64.c (diff) |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.h (diff) |
 | clang/include/clang/Basic/BuiltinsAArch64.def (diff) |
 | clang/lib/Headers/arm_acle.h (diff) |
 | llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir (diff) |
 | clang/test/Preprocessor/aarch64-target-features.c (diff) |
 | clang/lib/Basic/Targets/AArch64.h (diff) |
 | llvm/include/llvm/IR/IntrinsicsAArch64.td (diff) |
 | clang/lib/CodeGen/CGBuiltin.cpp (diff) |
 | clang/test/CodeGen/arm_acle.c (diff) |
 | clang/lib/Basic/Targets/AArch64.cpp (diff) |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (diff) |
 | llvm/lib/Target/AArch64/AArch64InstrFormats.td (diff) |
Commit
3f170eb197906c2a793e62b54870296a4f5d722a
by llvm-project[Polly][Optimizer] Apply user-directed unrolling.
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.
While not that useful by itself (there already is an unroll pass), it introduces a mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. ISL's rescheduling would discard the manual transformations, and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.
This does not influence the SCoP detection heuristic. As a consequence, loops that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.
Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the ones supported by clang's LoopHint. See the `unroll_double.ll` test as an example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.
Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.
Reviewed By: sebastiankreutzer
Differential Revision: https://reviews.llvm.org/D97977
|
 | polly/lib/Analysis/ScopBuilder.cpp (diff) |
 | polly/include/polly/Support/ScopHelper.h (diff) |
 | polly/lib/Transform/ManualOptimizer.cpp |
 | polly/test/ScheduleOptimizer/ManualOptimization/unroll_partial_followup.ll |
 | polly/lib/CMakeLists.txt (diff) |
 | polly/lib/Support/ScopHelper.cpp (diff) |
 | polly/test/ScheduleOptimizer/ManualOptimization/unroll_partial.ll |
 | polly/include/polly/ScopInfo.h (diff) |
 | polly/lib/Transform/ScheduleTreeTransform.cpp (diff) |
 | polly/include/polly/ManualOptimizer.h |
 | polly/test/ScheduleOptimizer/ManualOptimization/disable_nonforced.ll |
 | polly/test/ScheduleOptimizer/ManualOptimization/unroll_full.ll |
 | polly/lib/CodeGen/IRBuilder.cpp (diff) |
 | polly/lib/CodeGen/IslNodeBuilder.cpp (diff) |
 | polly/test/ScheduleOptimizer/ManualOptimization/unroll_double.ll |
 | polly/include/polly/CodeGen/IRBuilder.h (diff) |
 | polly/lib/Transform/ScheduleOptimizer.cpp (diff) |
 | polly/include/polly/ScheduleTreeTransform.h (diff) |
Commit
018e96f71ff2d1617aff1ed1abd9c8ad61faf87d
by craig.topper[RISCV] Add isel-patterns to optimize (a < 1) into blez (a <= 0)
The following code sequence showed up in a testcase (isolated from SPEC2017) for if-conversion and vectorization when searching for the maximum in an array: `addi a2, zero, 1` followed by `blt a1, a2, .LBB0_5`, which can be expressed as `bge zero,a1,.LBB0_5`/`blez a1,.LBB0_5`.
More generally, we want to express (a < 1) as (a <= 0).
This adds the required isel-pattern and updates the testcases.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D98449
|
 | llvm/lib/Target/RISCV/RISCVInstrInfo.td (diff) |
 | llvm/test/CodeGen/RISCV/hoist-global-addr-base.ll (diff) |
 | llvm/test/CodeGen/RISCV/branch.ll (diff) |
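The rewrite of (a < 1) as (a <= 0) relies on the operand being an integer: no integer lies strictly between 0 and 1. A minimal check of that equivalence follows; the function names are illustrative and nothing here models the isel pattern itself.

```python
def branch_taken_original(a):
    # Original form: materialise the constant 1, then compare (addi + blt).
    one = 1
    return a < one


def branch_taken_rewritten(a):
    # Rewritten form: compare against zero directly (blez), no constant needed.
    return a <= 0


def forms_agree(values):
    # True if both forms take the branch for exactly the same inputs.
    return all(branch_taken_original(v) == branch_taken_rewritten(v) for v in values)
```

The rewritten form saves the instruction that loads the constant, since comparing against zero can use the hard-wired zero register.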
Commit
f675b3df4848bae5a8203c2508b721b41086471f
by jonathanchesterfield[libomptarget] Drop assert.h, use freestanding for amdgcn devicertl
Promotes the runtime assert to a link time error for the unimplemented fallback functions. Enables amdgcn to build with only clang provided headers, which makes it less likely to break other builds when enabled.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D98649
|
 | openmp/libomptarget/deviceRTLs/common/include/target/shuffle.h (diff) |
 | openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt (diff) |
Commit
41759c3d92c5445857a89087a877c47bef907f13
by craig.topper[RISCV] Add RISCVISD::BR_CC similar to RISCVISD::SELECT_CC.
This allows me to introduce similar combines for branches as we have recently added for SELECT_CC. Some of them are less useful for standalone setccs and only help branch instructions. By having a BR_CC node it's easier to only affect branches.
I'm using CondCodeSDNode to make isel patterns easier to write so we can refer to the codes by name. SELECT_CC uses a constant instead.
I've translated the condition code just like SELECT_CC so we need fewer patterns for the swapped conditions. This includes special cases for X < 1 and X > -1 that get translated to blez and bgez by using a 0 constant.
computeKnownBitsForTargetNode support for SELECT_CC is added to allow MaskedValueIsZero to work for cases where the true and false values of the SELECT_CC are setccs and the result of the SELECT_CC is used by a BR_CC. This was needed to avoid regressions in some of the overflow tests.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D98159
|
 | llvm/lib/Target/RISCV/RISCVISelLowering.cpp (diff) |
 | llvm/lib/Target/RISCV/RISCVISelLowering.h (diff) |
 | llvm/test/CodeGen/RISCV/xaluo.ll (diff) |
 | llvm/lib/Target/RISCV/RISCVInstrInfo.td (diff) |
|
 | compiler-rt/lib/dfsan/scripts/check_custom_wrappers.sh (diff) |
Commit
29d46760599b32504f4dfda4de69aca8b9f1f65e
by jezng[lld-macho] Place LC_FUNCTION_STARTS data at the right position
This pleases the codesign tool.
(Otherwise it complains about "function starts data out of place".)
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D98648
|
 | lld/MachO/Writer.cpp (diff) |
 | lld/test/MachO/linkedit-contiguity.s (diff) |
Commit
5d44c92bf82b7b1055a5a9826695b21eaa43530c
by i Change void getNoop(MCInst &NopInst) to MCInst getNop()
Prefer (self-documenting) return values to output parameters (which are liable to be used). While here, rename Noop to Nop which is more widely used and improves consistency with hasEmitNops/setEmitNops/emitNop/etc.
|
 | llvm/lib/Target/ARM/ARMInstrInfo.h (diff) |
 | llvm/lib/Target/ARM/Thumb2InstrInfo.h (diff) |
 | llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (diff) |
 | llvm/lib/CodeGen/BasicBlockSections.cpp (diff) |
 | llvm/lib/Target/ARM/Thumb1InstrInfo.cpp (diff) |
 | llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (diff) |
 | llvm/lib/Target/PowerPC/PPCInstrInfo.h (diff) |
 | llvm/lib/Target/ARM/Thumb1InstrInfo.h (diff) |
 | llvm/include/llvm/CodeGen/TargetInstrInfo.h (diff) |
 | llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (diff) |
 | llvm/lib/CodeGen/TargetInstrInfo.cpp (diff) |
 | llvm/lib/Target/AArch64/AArch64InstrInfo.h (diff) |
 | llvm/lib/Target/ARM/ARMInstrInfo.cpp (diff) |
 | llvm/lib/Target/X86/X86InstrInfo.cpp (diff) |
 | llvm/lib/Target/X86/X86InstrInfo.h (diff) |
|
 | compiler-rt/lib/dfsan/scripts/check_custom_wrappers.sh (diff) |
Commit
a5d30421a62cee0217afeac194d111eba9adb15e
by aktoon[CSSPGO] Load context profile for external functions in PreLink and populate ThinLTO import list
For ThinLTO's prelink compilation, we need to put external inline candidates into an import list attached to the function's entry count metadata. This enables ThinLink to treat such cross-module callees as hot in the summary index, and later helps postlink import them for profile-guided cross-module inlining.
For AutoFDO, the import list is retrieved by traversing the nested inlinee functions. For CSSPGO, since the profile is flattened, a few things need to happen for it to work:
- When loading the input profile in extended binary format, we need to load all child context profiles whose parent is in the current module, so that the context trie for the current module includes potential cross-module inlinees.
- In order to make the above happen, we need to know whether the input profile is a CSSPGO profile before we start reading function profiles, hence a flag for the profile summary section is added.
- When searching for cross-module inline candidates, we need to walk through the context trie instead of the nested inlinee profile (the callsite samples of an AutoFDO profile).
- Now that we have more accurate counts with CSSPGO, we switched to using entry count instead of total count to decide whether an external callee is potentially beneficial to inline. This makes it consistent with how we determine whether a call target is a potential inline candidate.
Differential Revision: https://reviews.llvm.org/D98590
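The trie walk described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the SampleContextTracker API: the ContextNode class, the collect_import_candidates function, and the hot_threshold parameter are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextNode:
    """Hypothetical context-trie node: one function in one calling context."""
    func_name: str
    entry_count: int = 0
    children: list = field(default_factory=list)


def collect_import_candidates(root, module_funcs, hot_threshold):
    """Walk the context trie below `root` and return external callees whose
    entry count reaches `hot_threshold`.

    `module_funcs` is the set of functions defined in the current module;
    any other function seen in the trie is treated as external.
    """
    candidates = set()
    stack = [root]
    while stack:
        node = stack.pop()
        is_external = node.func_name not in module_funcs
        # Entry count (not total count) decides whether the callee is hot.
        if is_external and node.entry_count >= hot_threshold:
            candidates.add(node.func_name)
        stack.extend(node.children)
    return sorted(candidates)
```

The point of the sketch is the selection criterion: candidates are found by visiting every context under a function of the current module and filtering on entry count, instead of recursing through nested inlinee profiles.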
|
 | llvm/include/llvm/ProfileData/SampleProf.h (diff) |
 | llvm/lib/Transforms/IPO/SampleProfile.cpp (diff) |
 | llvm/test/Transforms/SampleProfile/Inputs/csspgo-import-list.prof.extbin |
 | llvm/test/Transforms/SampleProfile/csspgo-import-list.ll |
 | llvm/tools/llvm-profgen/ProfileGenerator.cpp (diff) |
 | llvm/lib/ProfileData/SampleProfReader.cpp (diff) |
 | llvm/include/llvm/Transforms/IPO/SampleContextTracker.h (diff) |
 | llvm/lib/ProfileData/SampleProfWriter.cpp (diff) |
 | llvm/test/Transforms/SampleProfile/Inputs/csspgo-import-list.prof |
 | llvm/tools/llvm-profgen/ProfileGenerator.h (diff) |
|
 | polly/lib/CodeGen/ManagedMemoryRewrite.cpp (diff) |
 | polly/lib/CodeGen/PPCGCodeGeneration.cpp (diff) |
Commit
bcb3f0f867b27179f9cab49d2ef41fe7769112c0
by jonathanchesterfield[libomptarget] Fix devicertl build
The target-specific functions in target_interface are extern C, but the implementations for nvptx mostly used C++ mangling. That worked out as a quirk of the DEVICE macro expanding to nothing, except for shuffle.h, which only forward declared the functions with C++ linkage.
Also implements GetWarpSize, as used by shuffle, and includes target_interface in nvptx target_impl.cu to help catch future divergence between interface and implementation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D98651
|
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu (diff) |
 | openmp/libomptarget/deviceRTLs/common/include/target/shuffle.h (diff) |
 | openmp/libomptarget/deviceRTLs/target_interface.h (diff) |
 | openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip (diff) |
Commit
af2796c76d2ff4b73165ed47959afd35a769beee
by markus.boeck02[test] Add ability to get error messages from CMake for errc substitution
Visual Studio's implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code, unlike other standard libraries such as libstdc++ or libc++ that might be used.
This patch adds a CMake script that runs a C++ program to get the error messages for the POSIX error codes and passes them on to lit through an optional config parameter.
If the config parameter is not set, or getting the messages failed (due to, say, a cross-compiling configuration without an emulator), it will fall back to using Python's strerror functions.
Differential Revision: https://reviews.llvm.org/D98278
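The fallback behaviour described above might look roughly like this. It is a sketch, not the code in llvm/utils/lit/lit/llvm/config.py: the errc_message function and the configured dictionary are hypothetical stand-ins for the optional config parameter, and only os.strerror is a real library call.

```python
import errno
import os


def errc_message(code, configured=None):
    """Return the message expected for a POSIX error code.

    `configured` is a hypothetical mapping from errno value to the message
    obtained at configure time. If it is missing or has no entry for
    `code`, fall back to Python's own os.strerror.
    """
    if configured and code in configured:
        return configured[code]
    return os.strerror(code)
```

For example, `errc_message(errno.ENOENT, {errno.ENOENT: "no such file or directory"})` returns the configured string, while `errc_message(errno.ENOENT)` returns whatever the host's strerror produces.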
|
 | lld/test/lit.site.cfg.py.in (diff) |
 | llvm/test/lit.site.cfg.py.in (diff) |
 | lld/CMakeLists.txt (diff) |
 | clang/CMakeLists.txt (diff) |
 | llvm/cmake/modules/GetErrcMessages.cmake |
 | llvm/utils/lit/lit/llvm/config.py (diff) |
 | llvm/CMakeLists.txt (diff) |
 | clang/test/lit.site.cfg.py.in (diff) |
Commit
3bffb1cd0ef63858bcc88a6ef39d66c27a872df8
by Stanislav.Mekhanoshin[AMDGPU] Use single cache policy operand
Replace the individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and, I hope, the amount of code. These operands are mostly 0 anyway.
An additional advantage is that the parser will accept these flags in any order, unlike now.
Differential Revision: https://reviews.llvm.org/D96469
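To illustrate why a single bitmask makes the flag order irrelevant, here is a small sketch. The CachePolicy class, its bit values, and parse_cache_policy are assumptions made for this example; they are not copied from SIDefines.h or AMDGPUAsmParser.cpp.

```python
from enum import IntFlag


class CachePolicy(IntFlag):
    # Bit assignments are assumed for illustration only.
    GLC = 1
    SLC = 2
    DLC = 4


def parse_cache_policy(modifiers):
    """Fold whitespace-separated modifier names into one bitmask.

    Each modifier sets one bit, so the result does not depend on the order
    in which the modifiers appear. A repeated modifier is rejected.
    """
    policy = CachePolicy(0)
    for name in modifiers.split():
        flag = CachePolicy[name.upper()]  # KeyError for an unknown name
        if policy & flag:
            raise ValueError(f"duplicate cache policy modifier: {name}")
        policy |= flag
    return policy
```

With this representation, "glc slc" and "slc glc" produce the same operand value, and an instruction with no modifiers carries a single 0 operand instead of three separate zeros.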
|
 | llvm/test/CodeGen/AMDGPU/i1_copy_phi_with_phi_incoming_value.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIInstrInfo.td (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-in-bundle.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-kill.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/limit-soft-clause-reg-pressure.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fmul.mir (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (diff) |
 | llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/postra-bundle-memops.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-buffer-store-v-interp.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/wqm.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/regbank-reassign.mir (diff) |
 | llvm/test/MC/AMDGPU/gfx90a_err.s (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-imm-copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-subreg-join.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/opt-sgpr-to-vgpr-copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/bundle-latency.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/smem-no-clause-coalesced.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.s96.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/regcoalescing-remove-partial-redundancy-assert.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/mircanon-memoperands.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/dce-disjoint-intervals.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/regcoal-subrange-join-seg.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/call-waw-waitcnt.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/shrink-vop3-carry-out.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-image-load-gfx10.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/dead-lane.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/pei-build-spill.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f32.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-extend-pruned-subrange.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/flat-load-clustering.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fptoui.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/smrd-fold-offset.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.add.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/invert-br-undef-vcc.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/skip-branch-taildup-ret.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/cluster-flat-loads-postra.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/couldnt-join-subrange-3.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/spill-agpr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-no-redundant.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/vccz-corrupt-bug-workaround.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sitofp.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sched-assert-onlydbg-value-empty-region.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/stack-slot-color-sgpr-vgpr-spills.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-vop2-64bit.mir (diff) |
 | llvm/test/MC/AMDGPU/flat-gfx9.s (diff) |
 | llvm/lib/Target/AMDGPU/BUFInstructions.td (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.s.buffer.load.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-agpr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/insert-skips-flat-vmem-ds.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/amdgcn-load-offset-from-reg.ll (diff) |
 | llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-with-subregs-bad-identical.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/collapse-endcf2.mir (diff) |
 | llvm/test/MC/AMDGPU/atomic-fadd-insts.s (diff) |
 | llvm/test/CodeGen/AMDGPU/nsa-reassign.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sched-crash-dbg-value.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-scalar-ops.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-implicit-def.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.ll (diff) |
 | llvm/lib/Target/AMDGPU/SMInstructions.td (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/schedule-barrier.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-multiple-mem-operands-nontemporal-2.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-subregjoin-fullcopy.mir (diff) |
 | llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/readlane_exec0.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/transform-block-with-return-to-epilog.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/smem-war-hazard.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/limit-coalesce.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-immediate-output-mods.mir (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/regcoal-subrange-join.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/pei-reg-scavenger-position.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-load-store.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-preserve.mir (diff) |
 | llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-fi-mubuf.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/mai-hazards.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/break-vmem-soft-clauses.mir (diff) |
 | llvm/test/MC/AMDGPU/mubuf-gfx10.s (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-subranges-another-copymi-not-live.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/subreg-split-live-in-error.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/phi-elimination-end-cf.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/load-store-opt-dlc.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/insert-waitcnts-exp.mir (diff) |
 | llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-ops.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fastregalloc-self-loop-heuristic.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-inlineasm.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/schedule-regpressure.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-tbuffer.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-region.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/virtregrewrite-undef-identity-copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/soft-clause-dbg-value.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.cmpswap.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/scalar-store-cache-flush.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-multiple.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/syncscopes.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum-ieee.mir (diff) |
 | llvm/test/MC/AMDGPU/cpol-err.s |
 | llvm/test/CodeGen/AMDGPU/reserved-reg-in-clause.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/load-store-opt-scc.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-imm-f16-f32.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-image-sample.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/shrink-carry.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/break-smem-soft-clauses.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sched-handleMoveUp-subreg-def-across-subreg-def.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/si-lower-control-flow.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-hidden-bundle.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/mcp-overlap-after-propagation.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/unallocatable-bundle-regression.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/stack-id-assert.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-multiple-mem-operands-atomics.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/spill-reg-tuple-super-reg-use.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/target-index-operands.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIDefines.h (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/post-ra-sched-kill-bundle-use-inst.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-loop-irreducible.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/spill-special-sgpr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-atomic-flat.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/inserted-wait-states.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/dead_copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fold-sgpr-copy.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-local.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUGISel.td (diff) |
 | llvm/test/CodeGen/AMDGPU/flat-scratch-fold-fi.mir (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-gfx9.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm-gfx10.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-image-load.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/verify-gfx90a-aligned-vgprs.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-preexisting.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory_clause.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIInstrFormats.td (diff) |
 | llvm/test/CodeGen/AMDGPU/dbg-value-ends-sched-region.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.i8.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/schedule-barrier-fpmode.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.add.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr-gfx10.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/constant-fold-imm-immreg.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-invalid-addrspace.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/vgpr-spill.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/mai-hazards-gfx90a.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/expand-si-indirect.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.s96.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/spill-agpr-partially-undef.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sdwa-peephole-instr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-pass-ordering.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/scheduler-handle-move-bundle.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (diff) |
 | llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (diff) |
 | llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-back-edge-loop.mir (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/parse-order-reserved-regs.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx90a.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/endpgm-dce.mir (diff) |
 | llvm/lib/Target/AMDGPU/MIMGInstructions.td (diff) |
 | llvm/test/CodeGen/AMDGPU/merge-image-sample-gfx10.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/subvector-test.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/hazard-recognizer-meta-insts.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/SRSRC-GIT-clobber-check.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/coalescer-subranges-another-prune-error.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-vmem-waw.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt.mir (diff) |
 | llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (diff) |
 | llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.mir (diff) |
 | llvm/test/MC/AMDGPU/flat-gfx10.s (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-loop-single-basic-block.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/power-sched-no-instr-sunit.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/splitkit-copy-bundle.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/cluster-flat-loads.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fract.f64.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/vmem-to-salu-hazard.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/nsa-vmem-hazard.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIISelLowering.cpp (diff) |
 | llvm/test/CodeGen/MIR/AMDGPU/syncscopes.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-atomic-insert-end.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/regcoalesce-dbg.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/pei-build-spill-partial-agpr.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-meta-instructions.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-overflow.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.f16.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/debug-value-scheduler-crash.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx908.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/lds-branch-vmem-hazard.mir (diff) |
 | llvm/test/MC/AMDGPU/gfx90a_asm_features.s (diff) |
 | llvm/test/CodeGen/AMDGPU/clamp-omod-special-case.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/optimize-if-exec-masking.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/hard-clauses.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/vmem-vcc-hazard.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.cmpswap.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/memory-legalizer-multiple-mem-operands-nontemporal-1.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/waitcnt-vscnt.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/branch-relaxation-debug-info.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fmaxnum.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/fp-atomic-to-s_denormmode.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/rename-independent-subregs-mac-operands.mir (diff) |
 | llvm/lib/Target/AMDGPU/FLATInstructions.td (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum-ieee.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/sched-assert-dead-def-subreg-use-other-subreg.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/undefined-physreg-sgpr-spill.mir (diff) |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-fminnum.mir (diff) |
 | llvm/lib/Target/AMDGPU/SIFrameLowering.cpp (diff) |
|
 | llvm/test/Transforms/SLPVectorizer/X86/reduction.ll (diff) |
|
 | llvm/utils/lit/lit/llvm/config.py (diff) |
Commit
9bcf0eff99a01094c685ff375a42e3f5a9166094
by kbobyrev[clangd] Optionally add reflection for clangd-index-server
This was originally landed without the optional part and reverted later:
https://github.com/llvm/llvm-project/commit/8080ea4c4b8c456c72c617587cc32f174b3105c1
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D98404
|
 | clang-tools-extra/clangd/Features.inc.in (diff) |
 | clang-tools-extra/clangd/index/remote/server/CMakeLists.txt (diff) |
 | clang-tools-extra/clangd/index/remote/server/Server.cpp (diff) |
 | llvm/cmake/modules/FindGRPC.cmake (diff) |
 | clang-tools-extra/clangd/CMakeLists.txt (diff) |
Commit
7da76aaaf41e963a1ec3b108f2bfefd88f42858b
by jonathanchesterfield[libomptarget] Build amdgpu plugin by default
This will build the amdgpu plugin if CMake is able to find the HSA runtime library, which will be the case if ROCm is installed or if the HSA library has been installed somewhere CMake looks.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D98654
|
 | openmp/libomptarget/plugins/CMakeLists.txt (diff) |
|
 | openmp/libomptarget/deviceRTLs/common/src/omp_data.cu (diff) |
 | openmp/libomptarget/deviceRTLs/common/src/reduction.cu (diff) |
 | openmp/libomptarget/deviceRTLs/target_interface.h (diff) |
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h (diff) |
 | openmp/libomptarget/deviceRTLs/common/src/omptarget.cu (diff) |
 | openmp/libomptarget/deviceRTLs/common/support.h (diff) |
 | openmp/libomptarget/deviceRTLs/common/omptarget.h (diff) |
 | openmp/libomptarget/deviceRTLs/common/device_environment.h (diff) |
 | openmp/libomptarget/deviceRTLs/amdgcn/src/amdgcn_locks.hip (diff) |
 | openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h (diff) |
 | openmp/libomptarget/deviceRTLs/common/src/support.cu (diff) |
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu (diff) |
 | openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip (diff) |
Commit
86f2a3d17878e430109be209fdb013fb3f4716ee
by stefanp[PowerPC] Add __PCREL__ when PC Relative is enabled.
This patch adds the `__PCREL__` define when PC Relative addressing is enabled.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D98546
|
 | clang/lib/Basic/Targets/PPC.cpp (diff) |
 | clang/test/Preprocessor/init-ppc64.c (diff) |
|
 | openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt (diff) |