1. [InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322)
2. [ARM] Recognize "double extend" reduction patterns
3. [InstCombine][X86] getNegativeIsTrueBoolVec - use ConstantExpr evaluators. NFCI.
4. [Intrinsics] define semantics for experimental fmax/fmin vector reductions
Commit 36e2e2e12efb6b02ad07f502d61b9a95937edb08 by nikita.ppv
[InstCombine] Fix incorrect SimplifyWithOpReplaced transform (PR47322)

This is a followup to D86834, which partially fixed this issue in
InstSimplify. However, InstCombine repeats the same transform while
dropping poison flags -- which does not cover cases where poison is
introduced in some other way.

The fix here is a bit more comprehensive, because things are quite
entangled, and it's hard to only partially address it without
regressing optimization. There are really two changes here:

* Export the SimplifyWithOpReplaced API from InstSimplify, with an
   added AllowRefinement flag. For replacements inside the TrueVal
   we don't actually care whether refinement occurs or not; the
   replacement is always legal. This part of the transform is now
   done in InstSimplify only. (It should be noted that the current
   AllowRefinement check is not sufficient -- that's an issue we
   need to address separately.)
* Change the InstCombine fold to work by temporarily dropping
   poison generating flags, running the fold and then restoring the
   flags if it didn't work out. This will ensure that the InstCombine
   fold is correct as long as the InstSimplify fold is correct.
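
The second change can be pictured with a small stand-alone sketch. It is not the code from this commit: `FakeInst`, `FoldFn`, and `foldWithoutPoisonFlags` are hypothetical names invented for illustration, and the three boolean flags only stand in for poison-generating flags such as nuw/nsw/exact. It shows the save, drop, try, restore-on-failure shape described above, and assumes C++17.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an IR instruction; NOT the real llvm::Instruction.
struct FakeInst {
  bool nuw = false;
  bool nsw = false;
  bool exact = false;
};

// Hypothetical fold callback: returns an id for the simplified value on
// success, or std::nullopt if nothing could be simplified.
using FoldFn = std::optional<int> (*)(const FakeInst &);

// Drop the poison-generating flags, run the fold, and restore the flags
// only if the fold did not succeed.
std::optional<int> foldWithoutPoisonFlags(FakeInst &inst, FoldFn tryFold) {
  assert(tryFold != nullptr);
  const FakeInst saved = inst;                 // remember the original flags
  inst.nuw = inst.nsw = inst.exact = false;    // temporarily drop them
  std::optional<int> result = tryFold(inst);   // fold sees the flag-free form
  if (!result)
    inst = saved;                              // fold failed: put flags back
  return result;
}
```

On success the flags stay dropped, so the result never relies on a flag that the replacement could have invalidated; on failure the instruction is left exactly as it was.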

Differential Revision:
The file was modified: llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified: llvm/test/Transforms/InstCombine/select.ll
The file was modified: llvm/include/llvm/Analysis/InstructionSimplify.h
The file was modified: llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
Commit c437446d90be17c3fe8a216a90ee442222f2fe9d by
[ARM] Recognize "double extend" reduction patterns

We can sometimes get code that does:
  xe = zext i16 x to i32
  ye = zext i16 y to i32
  m = mul i32 xe, ye
  me = zext i32 m to i64
  r = vecreduce.add(me)
This "double extend" can trip up the reduction identification, but
should give identical results.

This extends the pattern matching to handle them.
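
As a rough illustration of why the two forms agree, the scalar sketch below models the vector reduction with plain loops. The function names are hypothetical and this is not the ARM lowering code. The reasoning is an added observation rather than something stated in the commit: the product of two zero-extended 16-bit values is at most 65535 * 65535, which fits in 32 bits, so widening the 32-bit product to 64 bits gives the same value as multiplying in 64 bits directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// "Double extend": zext i16 -> i32, multiply in i32, zext the product to i64,
// then add-reduce in i64.
std::uint64_t reduceDoubleExtend(const std::vector<std::uint16_t> &xs,
                                 const std::vector<std::uint16_t> &ys) {
  assert(xs.size() == ys.size());
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < xs.size(); ++i) {
    // Cast first so the multiply happens in unsigned 32-bit arithmetic.
    std::uint32_t m = static_cast<std::uint32_t>(xs[i]) *
                      static_cast<std::uint32_t>(ys[i]);
    sum += static_cast<std::uint64_t>(m);
  }
  return sum;
}

// Single extend: zext i16 -> i64, multiply and add-reduce in i64.
std::uint64_t reduceSingleExtend(const std::vector<std::uint16_t> &xs,
                                 const std::vector<std::uint16_t> &ys) {
  assert(xs.size() == ys.size());
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < xs.size(); ++i)
    sum += static_cast<std::uint64_t>(xs[i]) *
           static_cast<std::uint64_t>(ys[i]);
  return sum;
}
```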

Differential Revision:
The file was modified: llvm/lib/Target/ARM/ARMISelLowering.cpp
The file was modified: llvm/test/CodeGen/Thumb2/mve-vecreduce-mlapred.ll
The file was modified: llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll
Commit 50ee0b99ec2902f5cf7a62a5e9b4a4f882b17031 by llvm-dev
[InstCombine][X86] getNegativeIsTrueBoolVec - use ConstantExpr evaluators. NFCI.

Don't do this manually; the ConstantExpr evaluators can do it more tidily for us.
The file was modified: llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
Commit 3a8ea8609b82b7e5401698b7c63df6680e1257a8 by spatel
[Intrinsics] define semantics for experimental fmax/fmin vector reductions

As discussed on llvm-dev:

This is hopefully the final remaining showstopper before we can remove
the 'experimental' from the reduction intrinsics.

No behavior was specified for the FP min/max reductions, so we have a
mess of different interpretations.

There are a few potential options for the semantics of these max/min ops.
I think this is the simplest based on current behavior/implementation:
make the reductions inherit from the existing llvm.maxnum/minnum intrinsics.
These correspond to libm fmax/fmin, and those are similar to the (now
deprecated?) IEEE-754 maxNum/minNum functions (NaNs are treated as missing
data). So the default expansion creates calls to libm functions.

Another option would be to inherit from llvm.maximum/minimum (NaNs propagate),
but most targets just crash in codegen when given those nodes because no
default expansion was ever implemented AFAICT.
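
The difference between the two candidate semantics can be sketched in scalar C++. This is only an illustration, not the expansion code from this patch: `reduceFMaxNum` and `reduceFMaximum` are hypothetical names, the first relies on libm `fmax` (NaN treated as missing data), and the second is a simplified NaN-propagating model that ignores the ordering of signed zeros.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// maxnum-style reduction: std::fmax returns the non-NaN operand when exactly
// one operand is NaN, so NaN elements are effectively skipped.
double reduceFMaxNum(const std::vector<double> &v) {
  assert(!v.empty());
  double acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = std::fmax(acc, v[i]);
  return acc;
}

// maximum-style reduction (simplified): any NaN element makes the result NaN.
// Signed-zero ordering is deliberately not modelled here.
double reduceFMaximum(const std::vector<double> &v) {
  assert(!v.empty());
  double acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (std::isnan(acc) || std::isnan(v[i]))
      acc = std::numeric_limits<double>::quiet_NaN();
    else
      acc = (acc > v[i]) ? acc : v[i];
  }
  return acc;
}
```

For an input such as {1.0, NaN, 3.0}, the first function yields 3.0 while the second yields NaN.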

We could also just assume 'nnan' semantics by default (we are already
assuming 'nsz' semantics in the maxnum/minnum intrinsics), but some targets
(AArch64, PowerPC) support the more defined behavior, so it doesn't make much
sense to not allow a tighter spec. Fast-math-flags (nnan) can be used to
loosen the semantics.

(Note that D67507 was proposed to update the LangRef to acknowledge the more
recent IEEE-754 2019 standard, but that patch seems to have stalled. If we do
update based on the new standard, the reduction instructions can seamlessly
inherit from whatever updates are made to the max/min intrinsics.)

x86 sees a regression here on 'nnan' tests because we have underlying,
longstanding bugs in FMF creation/propagation. Those need to be fixed apart
from this change. The expansion
sequence before this patch may not have been correct.

Differential Revision:
The file was modified: llvm/test/CodeGen/X86/vector-reduce-fmin.ll
The file was modified: llvm/test/CodeGen/Thumb2/mve-vecreduce-loops.ll
The file was modified: llvm/test/CodeGen/Thumb2/mve-vecreduce-fminmax.ll
The file was modified: llvm/test/CodeGen/X86/vector-reduce-fmax-nnan.ll
The file was modified: llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
The file was modified: llvm/include/llvm/CodeGen/BasicTTIImpl.h
The file was modified: llvm/test/CodeGen/X86/vector-reduce-fmin-nnan.ll
The file was modified: llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization-nan.ll
The file was modified: llvm/test/CodeGen/Generic/expand-experimental-reductions.ll
The file was modified: llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
The file was modified: llvm/lib/Target/ARM/ARMTargetTransformInfo.h
The file was modified: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
The file was modified: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
The file was modified: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified: llvm/lib/CodeGen/ExpandReductions.cpp
The file was modified: llvm/test/CodeGen/X86/vector-reduce-fmax.ll
The file was modified: llvm/docs/LangRef.rst