Commit
82655c151450e0103a3aa60725639da607f9220c
by jianzhouzh [MSan] Tweak CopyOrigin
There could be some misalignments when copying origins that are not aligned.
I believe unaligned memcpy is rare, so these cases should not matter much in practice.
1) About the change at line 50
Let dst be (void*)5; then d = 5 and beg = 4, so we need to write 3 (4 + 4 - 5) bytes, covering addresses 5 to 7.
2) About the change around line 77.
Let dst be (void*)5. Because of lines 50-55, the bytes from 5 to 7 were already written, so the aligned copy starts at 8.
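The arithmetic in the two examples above can be sketched as follows. This is a minimal Python illustration; the helper name is hypothetical, and the actual patch is C++ in msan_poisoning.cpp:

```python
def head_bytes_to_copy(d):
    # Origins are tracked at 4-byte granularity. For an unaligned start
    # address d, compute how many bytes lie between d and the end of the
    # 4-byte granule that contains it.
    beg = d & ~3          # round d down to the start of its granule
    return beg + 4 - d    # bytes from d through the end of the granule

# Worked example from the commit message: dst = (void*)5.
# d = 5, beg = 4, so 4 + 4 - 5 = 3 bytes are written (addresses 5..7),
# and the aligned copy then starts at address 8.
```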
Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D94552
|
 | compiler-rt/lib/msan/msan_poisoning.cpp |
Commit
25b3921f2fcd8fb3241c2f79e488f25a6374b99f
by thakis [gn build] (manually) port 79f99ba65d96
|
 | llvm/utils/gn/secondary/libcxx/include/BUILD.gn |
Commit
c0f3ea8a08ca9a9ec473f6e9072ccf30dad5def8
by zhanghb97 [mlir][Python] Add a checking process before creating an AffineMap from a permutation.
An invalid permutation will trigger a C++ assertion when attempting to create an AffineMap from the permutation. This patch adds an `isPermutation` function to check the given permutation before creating the AffineMap.
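A minimal sketch of such a validity check, assuming the usual definition of a permutation of [0, n). The standalone Python form below is illustrative; the actual `isPermutation` helper in this patch lives in the C++ bindings code:

```python
def is_permutation(perm):
    # A valid permutation of size n contains each index 0..n-1 exactly once,
    # so sorting it must reproduce the identity sequence.
    return sorted(perm) == list(range(len(perm)))
```

With a check like this, an invalid input such as [1, 1, 2] can be rejected with a Python-level error before ever reaching the C++ assertion.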
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94492
|
 | mlir/lib/Bindings/Python/IRModules.cpp |
 | mlir/test/Bindings/Python/ir_affine_map.py |
Commit
055644cc459eb204613ac788b73c51d5dab2fcbb
by yuanke.luo [X86][AMX] Prohibit pointer cast on load.
The load/store instructions will be transformed to AMX intrinsics in the AMX type lowering pass. Prohibiting the pointer cast makes that pass happy.
Differential Revision: https://reviews.llvm.org/D94372
|
 | llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp |
 | llvm/test/Transforms/InstCombine/X86/x86-amx-load-store.ll |
Commit
5c7dcd7aead7b33ba065b98ab3573278feb42228
by Yuanfang Chen [Coroutine] Update promise object's final layout index
The promise is a header field, but it is not guaranteed to be the third field of the frame due to `performOptimizedStructLayout`.
Reviewed By: lxfind
Differential Revision: https://reviews.llvm.org/D94137
|
 | llvm/lib/Transforms/Coroutines/CoroFrame.cpp |
 | llvm/test/Transforms/Coroutines/coro-spill-promise.ll |
Commit
6529d7c5a45b1b9588e512013b02f891d71bc134
by rnk [PDB] Defer relocating .debug$S until commit time and parallelize it
This is a pretty classic optimization. Instead of processing symbol records and copying them to temporary storage, do a first pass to measure how large the module symbol stream will be, and then copy the data into place in the PDB file. This requires deferring relocation until much later, which accounts for most of the complexity in this patch.
This patch avoids copying the contents of all live .debug$S sections into heap memory, which is worth about 20% of private memory usage when making PDBs. However, this is not an unmitigated performance win, because it can be faster to read dense, temporary heap data than it is to iterate symbol records in object-file-backed memory a second time.
Results on release chrome.dll:
peak mem: 5164.89MB -> 4072.19MB (-1,092.7MB, -21.2%)
wall-j1: 0m30.844s -> 0m32.094s (slightly slower)
wall-j3: 0m20.968s -> 0m20.312s (slightly faster)
wall-j8: 0m19.062s -> 0m17.672s (meaningfully faster)
I gathered similar numbers for a debug, component build of content.dll in Chrome, and the performance impact of this change was in the noise. The memory usage reduction was visible and similar.
Because of the new parallelism in the PDB commit phase, more cores makes the new approach faster. I'm assuming that most C++ developer machines these days are at least quad core, so I think this is a win.
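The measure-then-commit pattern described above can be sketched generically. This is a hypothetical Python illustration with made-up names; the actual change is C++ spread across lld/COFF and LLVM's PDB library:

```python
def commit_symbol_streams(modules):
    # modules: a list of modules, each a list of raw symbol records (bytes).
    # Pass 1: measure each module's symbol stream so every record's final
    # offset in the output is known before any data is copied.
    sizes = [sum(len(rec) for rec in mod) for mod in modules]
    offsets, pos = [], 0
    for size in sizes:
        offsets.append(pos)
        pos += size
    # Pass 2: copy records straight into their final positions, skipping
    # intermediate per-module heap buffers. (In the real patch this second
    # pass also applies the deferred relocations and runs in parallel.)
    out = bytearray(pos)
    for mod, off in zip(modules, offsets):
        for rec in mod:
            out[off:off + len(rec)] = rec
            off += len(rec)
    return bytes(out)
```

Because pass 2 writes each module into a disjoint, precomputed byte range, the per-module copies are independent, which is what makes the parallel commit safe.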
Differential Revision: https://reviews.llvm.org/D94267
|
 | lld/COFF/Chunks.cpp |
 | llvm/lib/DebugInfo/PDB/Native/DbiModuleDescriptorBuilder.cpp |
 | lld/COFF/PDB.cpp |
 | llvm/include/llvm/DebugInfo/PDB/Native/DbiModuleDescriptorBuilder.h |
 | llvm/lib/DebugInfo/PDB/Native/DbiStreamBuilder.cpp |
 | lld/COFF/Chunks.h |