Changes from Git (git http://labmaster3.local/git/llvm-project.git)


  1. [BranchAlign] Compiler support for suppressing branch align
  2. [X86] Add isel patterns for bitcasting between v32i1/v64i1 and
  3. [X86] Custom type legalize v4i64->v4f32 uint_to_fp on sse4.1 targets in
  4. [mlir][Linalg] Lower linalg.reshape to LLVM for the static case
  5. [mlir] NFC: Move the state for managing aliases out of ModuleState and
  6. [clang-tidy] Remove broken test on Windows for
  7. [MLIR] Fix ML IR build on Windows with Visual Studio
  8. [X86] Keep cl::opts at top of file [NFC]
  9. Merge memtag instructions with adjacent stack slots.
  10. Add a new AST matcher 'optionally'.
Commit 29ccb12e2c12b6a50a1451ffdbf70fef29efda0e by listmail
[BranchAlign] Compiler support for suppressing branch align
As discussed heavily in the original review (D70157), there's a need for
the compiler to be able to selectively suppress padding (either nop or
prefix) to respect assumptions about the meaning of labels and
instructions in generated code.
Rather than wait for syntax to be finalized - which appears to be a very
slow process - this patch focuses on the compiler use case and *only*
worries about the integrated assembler. To my knowledge, this covers all
cases mentioned to date for clang/JIT support.
For testing purposes, I wired it up so that if the integrated assembler
was using autopadding for branch alignment (e.g. enabled at command
line) then the textual assembly output would contain a comment for each
location where padding was enabled or disabled. This seemed like the
least painful choice overall.
Note that the result of this patch effectively disables the jcc errata
mitigation for many constructs (statepoints, implicit null checks, xray,
etc...) which is non-ideal. It is at least *correct* and should allow us
to enable the mitigation for the compiler. Once that's done, and a few
other items are worked through, we probably want to come back to this and
explore a bundling-based approach instead so that we can pad
instructions while keeping labels in the right place.
Differential Revision:
The file was modified llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
The file was modified llvm/include/llvm/MC/MCStreamer.h
The file was added llvm/test/CodeGen/X86/align-branch-boundary-noautopadding.ll
The file was modified llvm/lib/Target/X86/X86MCInstLower.cpp
The file was modified llvm/lib/MC/MCAsmStreamer.cpp
The file was added llvm/test/CodeGen/X86/align-branch-boundary-suppressions.ll
The file was modified llvm/include/llvm/MC/MCAsmBackend.h
The file was modified llvm/lib/MC/MCObjectStreamer.cpp
Commit d60b3b4817cb9346b682bb75371c41642c273b13 by craig.topper
[X86] Add isel patterns for bitcasting between v32i1/v64i1 and
We have to do an intermediate jump to a GPR to make the cast.
Fixes PR43750.
The file was modified llvm/test/CodeGen/X86/avx512bw-mask-op.ll
The file was modified llvm/lib/Target/X86/
Commit 3811417f39a7d0a370fac2923060f5ef8dacd8d7 by craig.topper
[X86] Custom type legalize v4i64->v4f32 uint_to_fp on sse4.1 targets in
64-bit mode
For v4i64->v4f32 uint_to_fp on pre-avx targets where v4i64 isn't legal
we create two v2i64->v2f32 uint_to_fp operations that need to be shuffled
together. Our codegen for v2i64->v2f32 involves detecting if the number is
larger than (2^31 - 1); if so, we do a special division by 2 so we can do a
signed conversion, which we need to scalarize, then do a multiply by 2 at
the end if we divided earlier.
When v4i64 isn't legal we need to split the checking for a larger number
and dividing by 2 into two v2i64 vectors. The scalar part can extract
the 4 i64 values from those 2 splits. But we can reassemble the 4 scalar
f32 results directly into a single v4f32 vector. Then we just need to
combine the fixup indications from the 2 halves and we can do the final
multiply by 2 fixup on all 4 values if needed at once using a single
v4f32 blend and v4f32 fadd.
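
The halve, convert, then double strategy described above can be sketched for a single scalar lane as follows. This is an illustrative model only, not the code in X86ISelLowering.cpp. The function name is hypothetical, and the sketch assumes the "too large" test is whether the sign bit of the 64-bit input is set, since that is the point at which a signed conversion stops being usable.

```cpp
#include <cstdint>

// Scalar model of one lane; the name u64ToF32ViaSigned is hypothetical.
// Assumption: the fixup is needed exactly when the sign bit is set.
float u64ToF32ViaSigned(std::uint64_t x) {
  const bool needsFixup = (x >> 63) != 0;
  // "Special" division by 2: shift right, but OR the dropped low bit back in
  // as a sticky bit so the later rounding to f32 is not disturbed.
  const std::uint64_t halved = (x >> 1) | (x & 1);
  // The chosen value always fits in a non-negative int64_t.
  const std::int64_t asSigned =
      static_cast<std::int64_t>(needsFixup ? halved : x);
  const float f = static_cast<float>(asSigned);
  // Final fixup: the select plays the role of the blend, f + f the fadd.
  return needsFixup ? f + f : f;
}
```

In the vector form, the `needsFixup` flags from both v2i64 halves would be combined so that the final select and add run once over all four lanes.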
Differential Revision:
The file was modified llvm/test/CodeGen/X86/vec_int_to_fp.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit 766ce87e9bed89bc3b5c2c904f1eb2d10be0d3be by ntv
[mlir][Linalg] Lower linalg.reshape to LLVM for the static case
Summary: This diff adds lowering of the linalg.reshape op to LLVM.
A new descriptor is created with fields initialized as follows:
1. allocatedPtr, alignedPtr and offset are copied from the source
descriptor
2. sizes are copied from the static destination shape
3. strides are copied from the static strides collected with
Only the static case in which the target view conforms to strided memref
semantics is supported. Other cases are left for future work and will be
added on a per-need basis.
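
As a rough illustration of the descriptor initialization listed above, the sketch below models a strided memref descriptor as a plain C++ struct. The struct layout, the `float` element type, and the function name `reshapeStatic` are assumptions made for the example. The real lowering emits LLVM dialect operations rather than calling a helper like this, and the destination strides are simply taken as an input here because the sketch does not model how they are collected.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a strided memref descriptor of a given rank.
template <std::size_t Rank>
struct MemRefDescriptor {
  float *allocatedPtr;
  float *alignedPtr;
  std::int64_t offset;
  std::array<std::int64_t, Rank> sizes;
  std::array<std::int64_t, Rank> strides;
};

// Static-case reshape: pointers and offset come from the source descriptor,
// sizes and strides come from the statically known destination type.
template <std::size_t SrcRank, std::size_t DstRank>
MemRefDescriptor<DstRank> reshapeStatic(
    const MemRefDescriptor<SrcRank> &src,
    const std::array<std::int64_t, DstRank> &dstSizes,
    const std::array<std::int64_t, DstRank> &dstStrides) {
  MemRefDescriptor<DstRank> dst{};
  dst.allocatedPtr = src.allocatedPtr;  // step 1
  dst.alignedPtr = src.alignedPtr;      // step 1
  dst.offset = src.offset;              // step 1
  dst.sizes = dstSizes;                 // step 2
  dst.strides = dstStrides;             // step 3
  return dst;
}
```

No data is moved in this model: only the view metadata changes, which is why the target view has to conform to strided memref semantics.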
Reviewers: ftynse, mravishankar
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen,
antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits
Tags: #llvm
Differential Revision:
The file was modified mlir/lib/Conversion/LinalgToLLVM/LinalgToLLVM.cpp
The file was modified mlir/test/Dialect/Linalg/llvm.mlir
Commit 659f7d463b3d677823fdcfddc37eea481078c514 by riverriddle
[mlir] NFC: Move the state for managing aliases out of ModuleState and
into a new class AliasState.
Summary: This reduces the complexity of ModuleState and simplifies the
code. A future revision will mold ModuleState into something that can be
used by users for caching of printer state, as well as for implementing
printAsOperand style methods.
Reviewed By: antiagainst
Differential Revision:
The file was modified mlir/lib/IR/AsmPrinter.cpp
Commit 0a01ec972d2e24c721f46e55210d42391ae52b70 by abpostelnicu
[clang-tidy] Remove broken test on Windows for
`readability-misleading-indentation`. Because the Windows build uses
`-fdelayed-template-parsing` by default, we cannot have a test where we
don't instantiate the template. Please see D72333.
The file was modified clang-tools-extra/test/clang-tidy/checkers/readability-misleading-indentation.cpp
Commit 48b14e58abc57cfea7bcdc0d7165686f135a2ebd by stilis
[MLIR] Fix ML IR build on Windows with Visual Studio
Summary: Right now the path for each lib in whole_archive_link when MSVC
is used as the compiler is not a full path - and it's not even the
correct path when VS is used to build. This patch sets the lib path to a
full path using CMAKE_CFG_INTDIR which means the path will be correct
regardless of whether ninja, make or VS is used and it will always be a
full path.
Reviewers: denis13, jpienaar
Reviewed By: jpienaar
Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen,
antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox,
llvm-commits, asmith
Tags: #llvm
Differential Revision:
The file was modified mlir/CMakeLists.txt
Commit ba181d0063e43fb56938555112ab859f48aee287 by listmail
[X86] Keep cl::opts at top of file [NFC]
The file was modified llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
Commit b675a7628ce6a21b1e4a71c079a67badfb8b073d by eugenis
Merge memtag instructions with adjacent stack slots.
Summary: Detect a run of memory tagging instructions for adjacent stack
frame slots, and replace them with a shorter instruction sequence:
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop
This code needs to run when stack slot offsets are already known, but
before FrameIndex operands in STG instructions are eliminated; that's
the reason for the new hook in PrologueEpilogue.
This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and base address as a FI operand when
possible. This is needed to simplify recognizing an STGloop instruction
as operating on a stack slot post-regalloc.
This improves memtag code size by ~0.25%, and it looks like an
additional ~0.1% is possible by rearranging the stack frame such that
consecutive STG instructions reference adjacent slots (patch pending).
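
The STG + STG to ST2G replacement above can be modelled with a small range-merging sketch. This is a toy model under stated assumptions, not the pass added to AArch64FrameLowering.cpp: `TagStore`, `mergeAdjacent`, and `emitTagStores` are hypothetical names, every range is assumed to be a multiple of the 16-byte tag granule, and the STGloop case is not modelled.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: a tagged stack range, in bytes (multiples of 16).
struct TagStore {
  std::int64_t offset;
  std::int64_t size;
};

// Coalesce ranges that touch each other once stack offsets are known.
std::vector<TagStore> mergeAdjacent(std::vector<TagStore> stores) {
  std::sort(stores.begin(), stores.end(),
            [](const TagStore &a, const TagStore &b) {
              return a.offset < b.offset;
            });
  std::vector<TagStore> merged;
  for (const TagStore &s : stores) {
    if (!merged.empty() &&
        merged.back().offset + merged.back().size == s.offset)
      merged.back().size += s.size;  // extend the previous range
    else
      merged.push_back(s);
  }
  return merged;
}

// Cover a merged range with ST2G (two granules, 32 bytes) where possible,
// falling back to a single STG (one granule, 16 bytes) for a remainder.
std::vector<std::string> emitTagStores(const TagStore &s) {
  std::vector<std::string> out;
  std::int64_t remaining = s.size;
  for (; remaining >= 32; remaining -= 32)
    out.push_back("ST2G");
  if (remaining >= 16)
    out.push_back("STG");
  return out;
}
```

Two adjacent 16-byte slots therefore cost one instruction instead of two, while an isolated slot still needs its own STG.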
Reviewers: pcc, ostannard
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision:
The file was modified llvm/lib/CodeGen/PrologEpilogInserter.cpp
The file was modified llvm/test/CodeGen/AArch64/settag.ll
The file was modified llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
The file was modified llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
The file was modified llvm/lib/Target/AArch64/AArch64FrameLowering.h
The file was added llvm/test/CodeGen/AArch64/settag-merge.mir
The file was modified llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp
The file was modified llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
The file was modified llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
The file was added llvm/test/CodeGen/AArch64/settag-merge.ll
The file was modified llvm/include/llvm/CodeGen/TargetFrameLowering.h
The file was modified llvm/lib/Target/AArch64/
The file was modified llvm/test/CodeGen/AArch64/stack-tagging-unchecked-ld-st.ll
Commit 2823e91d55891e33a7a8b9a4016db4ec9e2765ae by aaron
Add a new AST matcher 'optionally'.
This matcher matches any node and at the same time executes all its
inner matchers to produce any possible result bindings.
This is useful when a user wants certain supplementary information
that's not always present along with the main match result.
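
The behaviour described above (always match, but still run the inner matchers for their bindings) can be illustrated with a toy combinator model. Everything in this sketch is hypothetical and deliberately independent of Clang: `Node`, `Bindings`, `Matcher`, `hasKind`, and this `optionally` are stand-ins, not the LibASTMatchers API, and the way the toy merges bindings into one map is a simplification.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for an AST node and a set of bound results.
struct Node {
  std::string kind;
  std::string name;
};
using Bindings = std::map<std::string, std::string>;
using Matcher = std::function<bool(const Node &, Bindings &)>;

// Matches nodes of the given kind and binds the node name under `id`.
Matcher hasKind(std::string kind, std::string id) {
  return [=](const Node &n, Bindings &b) {
    if (n.kind != kind)
      return false;
    b[id] = n.name;
    return true;
  };
}

// Runs every inner matcher, keeps the bindings of those that succeed,
// and reports a match regardless of whether any inner matcher matched.
Matcher optionally(std::vector<Matcher> inner) {
  return [=](const Node &n, Bindings &b) {
    for (const Matcher &m : inner) {
      Bindings trial = b;  // discard bindings from a failed inner matcher
      if (m(n, trial))
        b = trial;
    }
    return true;
  };
}
```

A caller can then check whether a supplementary binding is present in the result instead of having the whole match fail when it is absent.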
The file was modified clang/docs/LibASTMatchersReference.html
The file was modified clang/lib/ASTMatchers/ASTMatchersInternal.cpp
The file was modified clang/include/clang/ASTMatchers/ASTMatchers.h
The file was modified clang/include/clang/ASTMatchers/ASTMatchersInternal.h
The file was modified clang/lib/ASTMatchers/Dynamic/Registry.cpp
The file was modified clang/unittests/ASTMatchers/ASTMatchersNarrowingTest.cpp