Commit
a8f720925599f8e44366438f1ccb4b4e9d9375ae
by Matthew.Arsenault
AMDGPU: Change internal tracking of wave size
Store the log2 wave size instead of forcing division and log2 operations when querying either.
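A minimal sketch of the idea, assuming a hypothetical `WaveInfo` class (it is not the real AMDGPUSubtarget interface): only the log2 value is stored, so the wave size is one shift away and neither query needs a division or a log2 computation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a subtarget; names are illustrative only.
class WaveInfo {
  uint8_t WavefrontSizeLog2; // the only stored value, e.g. 6 for a 64-wide wave

public:
  explicit WaveInfo(uint8_t Log2) : WavefrontSizeLog2(Log2) {
    assert(Log2 < 32 && "shift amount must fit in a 32-bit value");
  }

  // Reading the log2 is a plain load; the size is a single shift.
  unsigned getWavefrontSizeLog2() const { return WavefrontSizeLog2; }
  unsigned getWavefrontSize() const { return 1u << WavefrontSizeLog2; }

  // Number of waves needed to cover NumThreads, rounded up, using shifts only.
  unsigned wavesForThreads(unsigned NumThreads) const {
    return (NumThreads + getWavefrontSize() - 1) >> WavefrontSizeLog2;
  }
};
```

With `WaveInfo W(6)`, `W.getWavefrontSize()` is 64 and `W.wavesForThreads(65)` is 2.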
|
 | llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h |
 | llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPUFeatures.td |
Commit
776708b00bddb01f91b8d44f6853770966d335a5
by Vedant Kumar
[LiveDebugValues] Remove early-exit when testing regmasks, NFC
In transferRegisterDef, if the instruction has a regmask attached, we'll check if any currently used register is clobbered by the regmask.
The early exit in this scan isn't necessary, costs a set lookup, and is almost never taken [1]. Delete it.
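The regmask scan described above can be sketched roughly as follows. This is an illustration, not the code in LiveDebugValues.cpp: `clobberedRegs` and the container types are assumptions. The bit convention matches LLVM's regmask operands, where a set bit means the register is preserved and a clear bit means it is clobbered.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A clear bit in the mask means the register is clobbered.
inline bool clobbersPhysReg(const std::vector<uint32_t> &RegMask, unsigned Reg) {
  assert(Reg / 32 < RegMask.size() && "register number outside the mask");
  return !(RegMask[Reg / 32] & (1u << (Reg % 32)));
}

// Hypothetical helper: walk every register currently in use and collect the
// ones the mask clobbers. There is no early-exit check before the loop.
inline std::vector<unsigned> clobberedRegs(const std::vector<unsigned> &UsedRegs,
                                           const std::vector<uint32_t> &RegMask) {
  std::vector<unsigned> Dead;
  for (unsigned Reg : UsedRegs)
    if (clobbersPhysReg(RegMask, Reg))
      Dead.push_back(Reg);
  return Dead;
}
```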
[1] http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136
|
 | llvm/lib/CodeGen/LiveDebugValues.cpp |
Commit
19ff00dab875d6184618c756df01b57acb908e82
by Amara Emerson
[AArch64] Fix CollectLOH creating an AdrpAdd LOH when there's a live used reg between the two instructions.
If there's a pattern like:
  $xA = ADRP foo @PAGE
  [some killing use of reg Xb]
  $Xb = ADDXri $Xa, 0, @PAGEOFF
CollectLOH would create an AdrpAdd LOH that resulted in the linker optimizing this sequence into:
  $xB = ADR foo
  [some killing use of reg $Xb]
... and therefore clobbers the live $Xb register that was used by the instruction in between.
This was discovered by a GlobalISel patch D78465 which broke up global variable accesses into two pseudos, which in some cases could be moved apart.
Differential Revision: https://reviews.llvm.org/D80834
|
 | llvm/test/CodeGen/AArch64/loh-use-between-adrp-add.mir |
 | llvm/lib/Target/AArch64/AArch64CollectLOH.cpp |
Commit
f573d489b6fccca85e0f2b3765aa17a364a4b0a8
by Amara Emerson
[AArch64][GlobalISel] Split G_GLOBAL_VALUE into ADRP + G_ADD_LOW and optimize.
The concept of G_GLOBAL_VALUE is nice and simple, but always using it as the representation for global var addressing until selection time creates some problems in optimizing accesses in certain code/relocation models.
The problem comes from trying to optimize adrp -> add -> load/store sequences in the most common "small" code model. These accesses can be optimized into an adrp -> load with the add offset being folded into the load's immediate field. If we try to keep all global var references as a single generic instruction then by the time we get to the complex operand trying to match these, we end up generating an adrp at the point of use. The real issue here is that we don't have any form of CSE during selection, so the code size will bloat from many redundant adrp's.
This patch custom legalizes small code model non-GOT G_GLOBALs into target ADRP and a new "target specific generic opcode" G_ADD_LOW. We also teach the localizer to localize these instructions via the custom hook that was added recently. Finally, the complex pattern for indexed loads/stores is extended to try to fold these G_ADD_LOW instructions into the load immediate.
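The address arithmetic behind the adrp + add split can be sketched as below. The names `SplitAddress`, `splitAddress` and `recombine` are hypothetical and only illustrate the 4 KiB page / low 12 bits decomposition; whether the low part can actually be folded into a given load also depends on that load's immediate encoding, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative decomposition of a global's address.
struct SplitAddress {
  uint64_t Page;  // page part, as materialized by ADRP (low 12 bits cleared)
  uint64_t Low12; // low part, supplied by the add or by a folded load offset
};

inline SplitAddress splitAddress(uint64_t Addr) {
  SplitAddress S;
  S.Page = Addr & ~uint64_t(0xFFF);
  S.Low12 = Addr & uint64_t(0xFFF);
  return S;
}

// Adding the low part back (as an add, or as a load offset) yields the address.
inline uint64_t recombine(const SplitAddress &S) {
  assert(S.Low12 < 4096 && "low part must fit in 12 bits");
  return S.Page + S.Low12;
}
```

For the address 0x12345678 the page part is 0x12345000 and the low part is 0x678.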
On -O0 CTMark, we see a 0.8% geomean code size improvement. We should also see some minor performance improvements.
Differential Revision: https://reviews.llvm.org/D78465
|
 | llvm/lib/Target/AArch64/AArch64LegalizerInfo.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/combine-ext-debugloc.mir |
 | llvm/lib/Target/AArch64/AArch64InstructionSelector.cpp |
 | llvm/test/CodeGen/AArch64/dllimport.ll |
 | llvm/test/CodeGen/AArch64/GlobalISel/localizer.mir |
 | llvm/lib/Target/AArch64/AArch64LegalizerInfo.h |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-blockaddress.mir |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-global.mir |
 | llvm/lib/Target/AArch64/AArch64InstrGISel.td |
 | llvm/lib/Target/AArch64/AArch64InstrInfo.td |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir |
 | llvm/test/CodeGen/AArch64/arm64-custom-call-saved-reg.ll |
 | llvm/lib/Target/AArch64/AArch64ISelLowering.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-constant.mir |
 | llvm/test/CodeGen/AArch64/GlobalISel/call-translator-variadic-musttail.ll |
 | llvm/test/CodeGen/AArch64/arm64-ldxr-stxr.ll |
Commit
b429a0fef047867255e9cb65379677b2af7bb61b
by Vedant Kumar
[docs] Sketch outline for HowToUpdateDebugInfo.rst
Summary: Sketch the outline for a new document that explains how to update debug info in various kinds of code transformations.
Some of the guidelines that belong in HowToUpdateDebugInfo.rst were in SourceLevelDebugging.rst already under the debugify section. It seems like the distinction between the two docs ought to be that the former is more prescriptive, while the latter is more descriptive.
To that end I've consolidated the "how to update debug info" guidelines which were in SourceLevelDebugging.rst into the new doc, along with the information about using "debugify" to test transformations. Since we've added a mir-debugify pass, I've described that as well.
Reviewers: aprantl, jmorse, chrisjackson, dsanders
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80052
|
 | llvm/docs/UserGuides.rst |
 | llvm/docs/SourceLevelDebugging.rst |
 | llvm/docs/HowToUpdateDebugInfo.rst |
Commit
a66e1d2aa943959e158821be8956109cb5ef3b3b
by Vedant Kumar
[os_log][test] Remove -O1 from a test, NFC
|
 | clang/test/CodeGenObjCXX/os_log.mm |
Commit
a0b674fd7f06b86241cf19387313b508248a3868
by Adrian Prantl
Fix UB in EmulateInstructionARM64.cpp
This fixes an unhandled signed integer overflow in AddWithCarry() by using the llvm::checkedAdd() function. Thanks to Vedant Kumar for the suggestion!
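A self-contained sketch of the checked-addition technique, assuming C++17. The helper `checkedAddSketch` is hypothetical and written with the standard library only; it is in the spirit of llvm::checkedAdd() but is not that function. The point is that the overflow test happens before the addition, so the overflowing signed add is never executed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns the sum, or std::nullopt if LHS + RHS would
// overflow int64_t. The range checks never perform an overflowing operation.
inline std::optional<int64_t> checkedAddSketch(int64_t LHS, int64_t RHS) {
  using Lim = std::numeric_limits<int64_t>;
  if (RHS > 0 && LHS > Lim::max() - RHS)
    return std::nullopt; // would exceed the maximum
  if (RHS < 0 && LHS < Lim::min() - RHS)
    return std::nullopt; // would go below the minimum
  return LHS + RHS;
}

// Usage: a caller can derive an overflow flag from the empty result.
inline void usageExample() {
  assert(checkedAddSketch(1, 2).value() == 3);
  assert(!checkedAddSketch(INT64_MAX, 1).has_value());
}
```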
<rdar://problem/60926115>
Differential Revision: https://reviews.llvm.org/D80955
|
 | lldb/unittests/Instruction/CMakeLists.txt |
 | lldb/unittests/Instruction/TestAArch64Emulator.cpp |
 | lldb/unittests/CMakeLists.txt |
 | lldb/source/Plugins/Instruction/ARM64/EmulateInstructionARM64.h |
 | lldb/source/Plugins/Instruction/ARM64/EmulateInstructionARM64.cpp |
Commit
11d1aa0bcc1197f1b3010171b02c6e9662f34b75
by rnk
[COFF] Free some memory used for chunks
First, do not reserve numSections in the Chunks array. In cases where there are many non-prevailing sections, this will overallocate memory which will not be used.
Second, free the memory for sparseChunks after initializeSymbols. After that, it is never used.
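Both changes can be illustrated with a small sketch. The type `ObjFileSketch` and its members are hypothetical and use plain integers instead of lld's chunk objects; it only shows the two memory patterns: skipping an up-front reserve sized to all sections, and releasing a vector's storage once it is no longer needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative model of an input file; not lld's ObjFile.
struct ObjFileSketch {
  std::vector<int> Chunks;       // one entry per prevailing section only
  std::vector<int> SparseChunks; // indexed by section number, -1 if discarded

  void initializeChunks(const std::vector<bool> &Prevailing) {
    SparseChunks.assign(Prevailing.size(), -1);
    // Deliberately no Chunks.reserve(Prevailing.size()): with many
    // non-prevailing sections most of that capacity would stay unused.
    for (std::size_t I = 0; I < Prevailing.size(); ++I) {
      if (!Prevailing[I])
        continue;
      SparseChunks[I] = static_cast<int>(I);
      Chunks.push_back(static_cast<int>(I));
    }
  }

  void initializeSymbols() {
    // ... symbol setup would read SparseChunks here ...
    // Afterwards the table is dead, so swap with an empty vector to
    // release its storage instead of keeping it for the file's lifetime.
    std::vector<int>().swap(SparseChunks);
  }
};
```

Swapping with a temporary is used here because `clear()` alone keeps the allocated capacity.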
This saves 50MB of 627MB for my use case without affecting performance.
|
 | lld/COFF/InputFiles.cpp |
 | lld/COFF/InputFiles.h |
Commit
8a8d703be0986dd6785cba0b610c9c4708b83e89
by rjmccall
Fix how cc1 command line options are mapped into FP options.
Canonicalize on storing FP options in LangOptions instead of redundantly in CodeGenOptions. Incorporate -ffast-math directly into the values of those LangOptions rather than considering it separately when building FPOptions. Build IR attributes from those options rather than a mix of sources.
We should really simplify the driver/cc1 interaction here and have the driver pass down options that cc1 directly honors. That can happen in a follow-up, though.
Patch by Michele Scandale! https://reviews.llvm.org/D80315
|
 | clang/test/CodeGenOpenCL/relaxed-fpmath.cl |
 | clang/lib/CodeGen/BackendUtil.cpp |
 | clang/test/CodeGenCUDA/library-builtin.cu |
 | clang/lib/Frontend/CompilerInvocation.cpp |
 | clang/lib/CodeGen/CGCall.cpp |
 | clang/include/clang/Basic/LangOptions.h |
 | clang/lib/CodeGen/CGExprScalar.cpp |
 | clang/test/CodeGen/builtins-nvptx-ptx60.cu |
 | clang/test/CodeGen/libcalls.c |
 | clang/lib/CodeGen/CodeGenFunction.h |
 | clang/test/CodeGen/fp-options-to-fast-math-flags.c |
 | clang/test/CodeGenCUDA/builtins-amdgcn.cu |
 | clang/include/clang/Basic/CodeGenOptions.def |
 | clang/lib/CodeGen/CodeGenFunction.cpp |
 | clang/test/CodeGen/complex-math.c |
Commit
2e6c3e3e7b5eb46452b1819c69919fab820b4233
by kcc
add debug code to chase down a rare crash in asan/lsan https://github.com/google/sanitizers/issues/1193
Summary: add debug code to chase down a rare crash in asan/lsan https://github.com/google/sanitizers/issues/1193
Reviewers: vitalybuka
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D80967
|
 | compiler-rt/lib/asan/asan_allocator.cpp |
 | compiler-rt/lib/lsan/lsan_common.cpp |
Commit
801d823bdec182d3913aa0a27073229029359ad2
by kcc
[asan] fix a comment typo
|
 | compiler-rt/lib/asan/asan_allocator.cpp |
Commit
3bb0d95fdc2fffd193d39d14f2ef421d4b468617
by yrouban
[BranchProbabilityInfo] Rename loop variables. NFC
|
 | llvm/lib/Analysis/BranchProbabilityInfo.cpp |