Commit
ea8448e3618a1581b5eca39d39bedaa55fede75d
by sam.parker
[LoopUnroll] Adjust CostKind query
When TTI was updated to use an explicit cost kind, TCK_CodeSize was used, although the default implicit cost would have been the hand-wavy combination of size and latency. So, revert to that behaviour. This is not expected to have (much) impact on targets, since most (all?) of them return the same value for SizeAndLatency and CodeSize.
When optimising for size, the logic has been changed to query CodeSize costs instead of SizeAndLatency.
This patch also adds a testing option in the unroller so that OptSize thresholds can be specified.
Differential Revision: https://reviews.llvm.org/D85723
|
 | llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp |
 | llvm/test/Transforms/LoopUnroll/ARM/unroll-optsize.ll |
 | llvm/test/Transforms/LoopUnroll/ARM/instr-size-costs.ll |
 | llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp |
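The cost-kind selection described in the [LoopUnroll] entry above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: the `CostKind` enum is a local stand-in that only mirrors the two kinds named in the commit message, and `pickUnrollCostKind` is a hypothetical helper name.

```cpp
// Local stand-in for the two cost kinds named in the commit message.
// The real enumeration lives in LLVM's TargetTransformInfo.
enum class CostKind { CodeSize, SizeAndLatency };

// Hypothetical helper: choose which cost kind the unroller queries.
// When optimising for size, ask for code-size costs; otherwise fall back
// to the combined size-and-latency cost.
constexpr CostKind pickUnrollCostKind(bool OptForSize) {
  return OptForSize ? CostKind::CodeSize : CostKind::SizeAndLatency;
}
```

Keeping the choice in one small function makes it easy to see that only the size-optimising path changes which cost is requested.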
Commit
bca1b8ed994336690db4775e67953aad533b0e31
by kai
[SystemZ/ZOS] Implement computeHostNumPhysicalCores
On z/OS, the information is stored in the Common System Data Area (CSD). It is the number of CPs allocated to the current LPAR.
Reviewers: aganea, hubert.reinterpertcast, MaskRay
Reviewed By: hubert.reinterpertcast
Differential Revision: https://reviews.llvm.org/D85531
|
 | llvm/lib/Support/Host.cpp |
 | llvm/unittests/Support/Host.cpp |
Commit
b97e402ca5ba3d1a4795ed61f8cb36783b00ed44
by spatel
[VectorCombine] add test for Hexagon that would crash; NFC
This test verifies the code change from: rGb0b95dab1ce2 (although that would not be true if PR47128 is fixed)
|
 | llvm/test/Transforms/VectorCombine/Hexagon/load.ll |
 | llvm/test/Transforms/VectorCombine/Hexagon/lit.local.cfg |
Commit
912c09e845cb1907bc44664495fc69925a1bd2a9
by spatel
[InstCombine] eliminate a pointer cast around insertelement
I'm not sure if this solves PR46839 completely, but reducing the casting should help: https://bugs.llvm.org/show_bug.cgi?id=46839
Differential Revision: https://reviews.llvm.org/D85647
|
 | llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp |
 | llvm/test/Transforms/InstCombine/cast_ptr.ll |
Commit
e859868eb3808eae7ca0f27682931c38aa875090
by david.green
[ARM] Add additional predicated VFMA tests. NFC
|
 | llvm/test/CodeGen/Thumb2/mve-fmas.ll |
Commit
89a7f64afc7968ef3337472d02f7e08681f1766e
by spatel
[VectorCombine] add test for x86 target with SSE disabled; NFC
|
 | llvm/test/Transforms/VectorCombine/X86/no-sse.ll |
Commit
cc892fd9f4cb7ad8c6b37bc260fd12c2edf3745d
by spatel
[VectorCombine] early exit if target has no vector registers
Based on post-commit discussion in: D81766
Other vectorization passes (SLP and Loop) use this TTI API similarly.
|
 | llvm/lib/Transforms/Vectorize/VectorCombine.cpp |
 | llvm/test/Transforms/VectorCombine/X86/no-sse.ll |
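The early exit described in the [VectorCombine] entry above might look something like the sketch below. It is only an illustration under stated assumptions: `MockTargetInfo` and `shouldRunVectorCombine` are hypothetical names introduced here, and the mock does not reproduce the real TTI interface.

```cpp
// Hypothetical stand-in for the target query; not the real TTI interface.
struct MockTargetInfo {
  unsigned NumVectorRegisters;
  constexpr unsigned getNumberOfVectorRegisters() const {
    return NumVectorRegisters;
  }
};

// Hypothetical guard: a vector-oriented pass has nothing useful to do on a
// target that reports zero vector registers, so it bails out up front.
constexpr bool shouldRunVectorCombine(const MockTargetInfo &TTI) {
  return TTI.getNumberOfVectorRegisters() != 0;
}
```

For example, a target configured with SSE disabled would be modelled here as `MockTargetInfo{0}`, and the guard returns false before any transform is attempted.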
Commit
aa4bc1cb7978b87bdbdb75910da0abbd27889800
by erich.keane
Limit Max Vector alignment on COFF targets to 8192.
COFF targets have a max object alignment of 8192, so trying to create an object with a larger alignment hits an unreachable in WinCOFFObjectWriter.
The reproducer I have uses thread-local storage; however, other alignments are likely affected as well.
This patch sets the MaxVectorAlign for COFF to 8192. Additionally, although I could no longer find a way to reproduce it, the patch also sets the MaxTLSAlign for COFF to that value, so that any situation that does exceed it will cause an error.
Differential Revision: https://reviews.llvm.org/D85543
|
 | clang/include/clang/Basic/TargetInfo.h |
 | clang/lib/Basic/Targets/X86.h |
 | clang/test/CodeGen/alignment.c |
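The two limits described in the COFF alignment entry above can be illustrated with a small sketch. The helper names are hypothetical, and reading 8192 as a byte count is an assumption of this example; the commit message itself does not state the unit.

```cpp
#include <cstdint>

// 8192 is the limit quoted in the commit message. Treating it as a byte
// count is an assumption of this sketch.
constexpr std::uint64_t kCoffMaxAlign = 8192;

// Hypothetical helper: cap a requested vector alignment at the COFF limit.
constexpr std::uint64_t clampVectorAlign(std::uint64_t Requested) {
  return Requested > kCoffMaxAlign ? kCoffMaxAlign : Requested;
}

// Hypothetical helper: a thread-local alignment above the limit is
// rejected (reported as an error) rather than silently reduced.
constexpr bool isValidTLSAlign(std::uint64_t Requested) {
  return Requested <= kCoffMaxAlign;
}
```

The split mirrors the message: the vector limit acts as a cap, while the TLS limit exists so that an oversized request produces a diagnostic instead of reaching the object writer.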
Commit
ec9563c54ed25e9f9cbe60985399212d50bd801d
by a.bataev
[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks.
Summary: In untied tasks, the space for local variables declared in the task region must be allocated when the memory for the task data is allocated. The function can be interrupted, and we can exit from it at an untied task switch, so the state of the local variables must be preserved in this case. Also, the compiler should not call cleanups when exiting at an untied task switch until the real exit from the declaration scope is reached during execution.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84457
|
 | clang/lib/CodeGen/CGStmtOpenMP.cpp |
 | clang/test/OpenMP/task_codegen.cpp |
 | clang/lib/CodeGen/CGOpenMPRuntime.h |
 | clang/lib/CodeGen/CGOpenMPRuntime.cpp |
Commit
386d5af04b65aca7c81eed1468e53462a6b54550
by Xing
[MachOYAML] Simplify the section data emitting function. NFC.
This patch simplifies some code in the writeSectionData() function.
Reviewed By: jhenderson, grimar
Differential Revision: https://reviews.llvm.org/D85821
|
 | llvm/lib/ObjectYAML/MachOEmitter.cpp |
Commit
e891b6a75d919b5bcb95577d1e5eb0ebad0ea427
by Xing
[DWARFYAML] Make the address size of compilation units optional.
This patch makes the 'AddrSize' field optional. If the address size is missing, yaml2obj will infer it from the object file.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85805
|
 | llvm/lib/ObjectYAML/DWARFEmitter.cpp |
 | llvm/test/tools/yaml2obj/ELF/DWARF/debug-info.yaml |
 | llvm/include/llvm/ObjectYAML/DWARFYAML.h |
 | llvm/lib/ObjectYAML/DWARFYAML.cpp |
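The fallback described in the [DWARFYAML] entry above could be sketched as below. This is an assumption-laden illustration, not the patch itself: `Unit` and `effectiveAddrSize` are hypothetical names, `std::optional` (C++17) stands in for whatever optional type the real code uses, and the 8-versus-4 inference from a 64-bit flag is a guess at what "infer it from the object file" means.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of a compilation unit whose address size may be
// omitted from the YAML description.
struct Unit {
  std::optional<std::uint8_t> AddrSize;
};

// Hypothetical helper: use the explicit value when present; otherwise
// infer it from the object file, modelled here as a single 64-bit flag
// (8 bytes for 64-bit objects, 4 bytes otherwise; an assumption).
constexpr std::uint8_t effectiveAddrSize(const Unit &U, bool Is64BitObject) {
  return U.AddrSize.value_or(Is64BitObject ? 8 : 4);
}
```

An explicit `AddrSize` always wins, so existing YAML inputs that spell out the field keep their behaviour under this model.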
Commit
3651658bdd11a085b727783f27495a198c4f3bc5
by a.bataev
Revert "[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks."
This reverts commit ec9563c54ed25e9f9cbe60985399212d50bd801d to investigate a compiler crash revealed by the buildbots.
|
 | clang/lib/CodeGen/CGOpenMPRuntime.h |
 | clang/lib/CodeGen/CGOpenMPRuntime.cpp |
 | clang/lib/CodeGen/CGStmtOpenMP.cpp |
 | clang/test/OpenMP/task_codegen.cpp |
Commit
701228c4117636e6dd46564afcb8e5fbd98c13fb
by Matthew.Arsenault
AMDGPU: Handle intrinsics in performMemSDNodeCombine
This avoids a possible regression in a future patch
|
 | llvm/lib/Target/AMDGPU/SIISelLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/shl_add_ptr_csub.ll |
 | llvm/test/CodeGen/AMDGPU/shl_add_ptr_global.ll |
Commit
e14474a39a14b3c86c6c5d5ed9bf11467a0bbe9b
by Matthew.Arsenault
AMDGPU/GlobalISel: Select llvm.amdgcn.global.atomic.fadd
Remove the intermediate transform in the DAG path. I believe this is the last non-deprecated intrinsic that needs handling.
|
 | llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.fadd.ll |
 | llvm/lib/Target/AMDGPU/BUFInstructions.td |
 | llvm/lib/Target/AMDGPU/FLATInstructions.td |
 | llvm/lib/Target/AMDGPU/SIISelLowering.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp |
 | llvm/lib/Target/AMDGPU/SIInstrInfo.td |