Changes

Summary

  1. [flang] Correct the subscripts used for arguments to character intrinsics
  2. RISCVFixupKinds.h: Don’t duplicate function or class name at the beginning of the comment && fix some comments
  3. [ValueTracking] add FP intrinsics to test for propagatesPoison; NFC
  4. [mlir][sparse] support new kind of scalar in sparse linalg generic op
  5. [CSSPGO] Report zero-count probe in profile instead of dangling probes.
  6. [llvm-objcopy][MachO] Copy LC_LINKER_OPTIMIZATION_HINT
  7. [lld-macho][nfc] Put back shouldOmitFromOutput() asserts
  8. [lld-macho] Handle multiple LC_LINKER_OPTIONs
  9. [lld-macho] Put DATA_IN_CODE immediately after FUNCTION_STARTS
  10. [flang] Don't crash on some bogus expressions
  11. [NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI
  12. Fix verifier crashing on some invalid IR
  13. Use early exist and simplify a condition in Block SuccessorRange (NFC)
  14. [MCA] Anchoring the vtable of CustomBehaviour
  15. [flang] Fix crashes on calls to non-procedures
  16. Add hook for dialect specializing processing blocks post inlining calls
  17. [MLIR] Fix affine parallelize pass.
  18. [MLIR] Make store to load fwd condition less conservative
  19. [ASTMatchers] Fix bug in `hasUnaryOperand`
  20. Add sparse matrix multiplication integration test
  21. [libTooling] Change `access` stencil to recognize use of `operator*`.
  22. [OpenMP] Add Two-level Distributed Barrier
Commit 8ba9ee46e465a56a54f8361703d3af7f4bc98d63 by pklausler
[flang] Correct the subscripts used for arguments to character intrinsics

While chasing down an unrelated bug, I noticed that the
implementations of various character intrinsic functions assumed
that the lower bounds of (some of) their arguments were 1.
This isn't necessarily the case, so I've cleaned them up, tweaked
the unit tests to exercise the fix, and regularized the allocation
pattern used for results to use SetBounds() before Allocate() rather
than the original Descriptor::Allocate() wrapper around
CFI_allocate().

Since there were few other remaining uses of the original
Descriptor::Allocate() wrapper, I also converted them to the
new one and deleted the old one.
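
The lower-bound assumption described above can be illustrated with a small sketch. It is not the flang runtime code: `CharArray`, `len_trim_assuming_one`, and `len_trim` are hypothetical names, and the model only captures the idea that subscripts must start at the array's own lower bound rather than at 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharArray:
    """Hypothetical stand-in for a rank-1 character array descriptor."""
    elements: List[str]
    lower_bound: int = 1  # Fortran default, but an array may use another bound

    def at(self, subscript: int) -> str:
        # Map a Fortran-style subscript onto zero-based storage.
        offset = subscript - self.lower_bound
        if not 0 <= offset < len(self.elements):
            raise IndexError(f"subscript {subscript} out of bounds")
        return self.elements[offset]


def len_trim_assuming_one(arr: CharArray) -> List[int]:
    # Problematic pattern: subscripts always run 1..n.
    count = len(arr.elements)
    return [len(arr.at(i).rstrip(" ")) for i in range(1, count + 1)]


def len_trim(arr: CharArray) -> List[int]:
    # Corrected pattern: subscripts start at the array's own lower bound.
    first = arr.lower_bound
    count = len(arr.elements)
    return [len(arr.at(i).rstrip(" ")) for i in range(first, first + count)]
```

With a lower bound of 1 the two functions agree; with any other lower bound only `len_trim` visits the right elements, and in this model the first variant raises `IndexError`.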

Differential Revision: https://reviews.llvm.org/D104325
The file was modified flang/runtime/transformational.cpp
The file was modified flang/runtime/descriptor.cpp
The file was modified flang/unittests/RuntimeGTest/CharacterTest.cpp
The file was modified flang/runtime/descriptor.h
The file was modified flang/runtime/character.cpp
The file was modified flang/unittests/Evaluate/reshape.cpp
Commit 1a76bff6264abb3f19fa78d74ec78196e2da7d34 by i
RISCVFixupKinds.h: Don’t duplicate function or class name at the beginning of the comment && fix some comments
The file was modified llvm/lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
Commit a993bb08b8348c947a733377655af3de610cf28e by spatel
[ValueTracking] add FP intrinsics to test for propagatesPoison; NFC

I'm not sure what behavior we want if the FP environment is
not default (also not sure if there's a way to enumerate
the full list of intrinsics programmatically), but currently
these are all defaulting to 'false' (doesn't propagate).
The file was modified llvm/unittests/Analysis/ValueTrackingTest.cpp
Commit 619bfe8bd23f76b22f0a53fedafbfc8c97a15f12 by ajcbik
[mlir][sparse] support new kind of scalar in sparse linalg generic op

We have several ways of introducing a scalar invariant value into
linalg generic ops (should we limit this somewhat?). This revision
makes sure we handle all of them correctly in the sparse compiler.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D104335
The file was added mlir/test/Dialect/SparseTensor/sparse_scalars.mlir
The file was modified mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
Commit cef9b96b01b75fedea5e91ece776228f4088ba78 by hoy
[CSSPGO] Report zero-count probe in profile instead of dangling probes.

Previously, dangling samples were represented by INT64_MAX in the sample profile, while probes that never executed were not reported. This was based on the observation that dangling probes made up a smaller portion than zero-count probes. However, with compiler optimizations, dangling probes end up making up a large portion of all probes in general, and reporting them does not make sense from a profile-size point of view. This change flips sample reporting by reporting zero-count probes instead. This enables a dangling probe to be represented by nothing (a missing entry in the profile). This has several benefits:

1. Reducing sample profile size in optimize mode, even when the number of non-executed probes exceeds the number of dangling probes, since INT64_MAX takes more space than 0 to encode.

2. Binary size savings. No need to encode dangling probe anymore, since missing probes are treated as dangling in the profile reader.

3. Reducing compiler work to track dangling probes. However, for probes that are really dead and removed, we still need the compiler to identify them so that they can be reported as zero-count, instead of being mistreated as dangling probes.

4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples.

5. Better readability.

6. Be consistent with non-CS dwarf line number based profile. Zero counts are trusted by the compiler counts inferencer while missing counts will be inferred by the compiler.
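
Points 2, 4, and 6 can be made concrete with a toy model. This is only a sketch of the two representations described above, not llvm-profgen code: the function names are hypothetical, and `merge_with_sentinel` assumes that a dangling copy "wins" a merge, as item 4 describes.

```python
from typing import Dict, Optional

INT64_MAX = 2**63 - 1  # sentinel that used to mark a dangling probe


def merge_with_sentinel(a: int, b: int) -> int:
    # Old scheme (as modeled here): if either copy is dangling, the merged
    # probe is dangling and the real samples on the other copy are lost.
    if a == INT64_MAX or b == INT64_MAX:
        return INT64_MAX
    return a + b


def merge_with_missing(a: Optional[int], b: Optional[int]) -> Optional[int]:
    # New scheme: a dangling copy is simply not reported (None), so the
    # samples collected on the non-dangling copy survive the merge.
    if a is None:
        return b
    if b is None:
        return a
    return a + b


def interpret(profile: Dict[int, int], probe_id: int) -> str:
    # Reader side: a missing entry is treated as dangling (count to be
    # inferred); an explicit zero is trusted as "never executed".
    count = profile.get(probe_id)
    if count is None:
        return "dangling"
    return "zero" if count == 0 else "executed"
```

For a probe duplicated into one live copy with 120 samples and one dangling copy, the sentinel merge yields `INT64_MAX` while the missing-entry merge keeps 120.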

Note that the current patch does not include any work for #3. There will be follow-up changes.

For #1, I've seen that for a large Facebook service, the text profile is reduced by 7%. For the extbinary profile, the size of LBRProfileSection is reduced by 35%.

For #4, I have seen general counts quality for SPEC2017 is improved by 10%.

Reviewed By: wenlei, wlei, wmi

Differential Revision: https://reviews.llvm.org/D104129
The file was modified llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
The file was modified llvm/include/llvm/ProfileData/SampleProf.h
The file was modified llvm/test/tools/llvm-profgen/merge-cold-profile.test
The file was modified llvm/lib/ProfileData/ProfileSummaryBuilder.cpp
The file was modified llvm/lib/ProfileData/SampleProf.cpp
The file was modified llvm/test/tools/llvm-profgen/fname-canonicalization.test
The file was modified llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-inline.prof
The file was modified llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
The file was modified llvm/test/tools/llvm-profgen/inline-cs-dangling-pseudoprobe.test
The file was modified llvm/tools/llvm-profgen/ProfileGenerator.cpp
The file was modified llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test
Commit d619cf5ac5bffa4020f6f391afb23a7a9a5ae568 by i
[llvm-objcopy][MachO] Copy LC_LINKER_OPTIMIZATION_HINT

This fixes `error: unsupported load command (cmd=0x2e)`
The file was modified llvm/tools/llvm-objcopy/MachO/MachOLayoutBuilder.cpp
The file was modified llvm/test/tools/llvm-objcopy/MachO/basic-executable-copy.test
Commit b8bbb9723af3ab5bfa11366480ecf42b45680101 by jezng
[lld-macho][nfc] Put back shouldOmitFromOutput() asserts

I removed them in rG5de7467e982 but @thakis pointed out that
they were useful to keep, so here they are again. I've also converted
the `!isCoalescedWeak()` asserts into `!shouldOmitFromOutput()` asserts,
since the latter check subsumes the former.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104169
The file was modified lld/MachO/UnwindInfoSection.cpp
The file was modified lld/MachO/InputSection.h
The file was modified lld/MachO/MapFile.cpp
The file was modified lld/MachO/InputSection.cpp
Commit eeac6b2beceeb82382354d730b615159af3bc70e by jezng
[lld-macho] Handle multiple LC_LINKER_OPTIONs

We previously only parsed the first one.
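
As a rough illustration of the difference, the sketch below models load commands as `(cmd, payload)` pairs. It is not lld code: the function names and the tuple layout are assumptions made for the example; only the `LC_LINKER_OPTION` command value (0x2D) comes from the Mach-O headers.

```python
from typing import Iterable, List, Tuple

LC_SYMTAB = 0x2
LC_LINKER_OPTION = 0x2D

# Hypothetical simplified load command: (command id, option strings).
LoadCommand = Tuple[int, List[str]]


def first_linker_option(commands: Iterable[LoadCommand]) -> List[str]:
    # Old behavior as described: stop at the first matching command.
    for cmd, payload in commands:
        if cmd == LC_LINKER_OPTION:
            return list(payload)
    return []


def all_linker_options(commands: Iterable[LoadCommand]) -> List[str]:
    # New behavior: collect the options from every matching command.
    options: List[str] = []
    for cmd, payload in commands:
        if cmd == LC_LINKER_OPTION:
            options.extend(payload)
    return options
```

For an object file carrying one command with `-lfoo` and another with `-framework Bar`, only the second function sees both.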

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104352
The file was modified lld/MachO/InputFiles.cpp
The file was modified lld/test/MachO/lc-linker-option.ll
Commit 560636e5497a85d036a39ad5bf599d26828f66b3 by jezng
[lld-macho] Put DATA_IN_CODE immediately after FUNCTION_STARTS

codesign checks for this.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104354
The file was modified lld/test/MachO/linkedit-contiguity.s
The file was modified lld/MachO/OutputSegment.cpp
Commit 3061334e0d887b260b023ca5613359a84b7da8e7 by pklausler
[flang] Don't crash on some bogus expressions

Recover more gracefully from user errors in expressions.

Differential Revision: https://reviews.llvm.org/D104326
The file was modified flang/test/Semantics/select-rank.f90
The file was modified flang/lib/Semantics/resolve-names.cpp
The file was modified flang/lib/Semantics/expression.cpp
Commit 27963ccf07683df951235948622bd2bef33c0c6d by efriedma
[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI

In preparation for D103660.
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
Commit a6559b42cee2a1a34bd75cf78a70c64429fb5cf8 by joker.eph
Fix verifier crashing on some invalid IR

In a region with multiple blocks, the verifier will try to look for
dominance and may get the successor list for blocks, even though a block
may be empty or may not end with a terminator.
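
The kind of guard this calls for can be sketched as follows. The `Op` and `Block` classes and `successors_of` are hypothetical stand-ins, not the MLIR API; the point is only that a successor query on invalid IR should return an empty list instead of assuming a terminator exists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    name: str
    is_terminator: bool = False
    successors: List[str] = field(default_factory=list)


@dataclass
class Block:
    ops: List[Op] = field(default_factory=list)


def successors_of(block: Block) -> List[str]:
    # An empty block has no terminator to inspect.
    if not block.ops:
        return []
    last = block.ops[-1]
    # Invalid IR may end in a non-terminator; report no successors rather
    # than reading successor operands that do not exist.
    if not last.is_terminator:
        return []
    return list(last.successors)
```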

Differential Revision: https://reviews.llvm.org/D104411
The file was modified mlir/lib/IR/Block.cpp
The file was modified mlir/test/IR/invalid.mlir
Commit 066b3207234d098b6bf25d7c55e47c5a7b8dcfc7 by joker.eph
Use early exist and simplify a condition in Block SuccessorRange (NFC)
The file was modified mlir/lib/IR/Block.cpp
Commit c29555342ce18cd4769228db650dbcd817a6e474 by minyihh
[MCA] Anchoring the vtable of CustomBehaviour

Put the dtor of mca::CustomBehaviour into the cpp file to avoid
undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared
library.

Differential Revision: https://reviews.llvm.org/D104401
The file was modified llvm/include/llvm/MCA/CustomBehaviour.h
The file was modified llvm/tools/llvm-mca/lib/AMDGPU/CMakeLists.txt
The file was modified llvm/lib/MCA/CustomBehaviour.cpp
Commit e5813a683a81001d3853cb3d2b1397a11e98c1dd by pklausler
[flang] Fix crashes on calls to non-procedures

When a procedure reference is attempted to an entity that just
isn't a procedure, say so.

Differential Revision: https://reviews.llvm.org/D104329
The file was modified flang/test/Semantics/resolve09.f90
The file was added flang/test/Semantics/call19.f90
The file was modified flang/lib/Semantics/expression.cpp
The file was modified flang/lib/Semantics/resolve-names.cpp
Commit 0e760a0870e61b0a150bdea24532ad054774ade4 by jpienaar
Add hook for dialect specializing processing blocks post inlining calls

This allows for dialects to do different post-processing depending on operations with the inliner (my use case requires different attribute propagation rules depending on call op). This hook runs before the regular processInlinedBlocks method.

Differential Revision: https://reviews.llvm.org/D104399
The file was modified mlir/test/Transforms/inlining.mlir
The file was modified mlir/include/mlir/Transforms/InliningUtils.h
The file was modified mlir/lib/Transforms/Utils/InliningUtils.cpp
The file was modified mlir/test/lib/Dialect/Test/TestDialect.cpp
Commit 51d43bbc4662202d7f694c43b968fb289a56a355 by uday
[MLIR] Fix affine parallelize pass.

To control the number of outer parallel loops, we need to process the
outer loops first and hence pre-order walk fixes the issue.
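
Why the walk order matters can be seen in a toy model. This is a sketch under stated assumptions, not the pass itself: `Loop`, `parallelize`, and `max_nested` are hypothetical names, every loop is assumed to be parallelizable, and the limit is modeled as "convert a loop only if fewer than `max_nested` enclosing loops are already parallel".

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple


@dataclass
class Loop:
    name: str
    children: List["Loop"] = field(default_factory=list)
    parallel: bool = False


Visit = Tuple[Loop, Tuple[Loop, ...]]


def pre_order(loop: Loop, ancestors: Tuple[Loop, ...] = ()) -> Iterator[Visit]:
    yield loop, ancestors  # outer loop first
    for child in loop.children:
        yield from pre_order(child, ancestors + (loop,))


def post_order(loop: Loop, ancestors: Tuple[Loop, ...] = ()) -> Iterator[Visit]:
    for child in loop.children:
        yield from post_order(child, ancestors + (loop,))
    yield loop, ancestors  # outer loop last


def parallelize(root: Loop, max_nested: int,
                walk: Callable[[Loop], Iterator[Visit]]) -> List[str]:
    for loop, ancestors in walk(root):
        enclosing = sum(1 for a in ancestors if a.parallel)
        if enclosing < max_nested:
            loop.parallel = True
    return [loop.name for loop, _ in pre_order(root) if loop.parallel]


def make_nest() -> Loop:
    # for i { for j { for k { ... } } }
    return Loop("i", [Loop("j", [Loop("k")])])
```

With `max_nested=1`, the pre-order walk converts only `i`. The post-order walk visits `k` and `j` before any enclosing loop has been converted, so the limit never takes effect and all three loops end up parallel.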

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D104361
The file was modified mlir/test/Dialect/Affine/parallelize.mlir
The file was modified mlir/lib/Dialect/Affine/Transforms/AffineParallelize.cpp
Commit 54384d172397402a3ad606ef990af488f344eb19 by uday
[MLIR] Make store to load fwd condition less conservative

Make store to load fwd condition for -memref-dataflow-opt less
conservative. Post dominance info is not really needed. Add additional
check for common cases.

Differential Revision: https://reviews.llvm.org/D104174
The file was modified mlir/lib/Dialect/Affine/Transforms/AffineScalarReplacement.cpp
The file was modified mlir/test/Dialect/Affine/scalrep.mlir
Commit 439c9206945aba15d74d5bcaef3bf3f4d1e32b5e by yitzhakm
[ASTMatchers] Fix bug in `hasUnaryOperand`

Currently, `hasUnaryOperand` fails for the overloaded `operator*`. This patch fixes the bug and
adds tests for this case.

Differential Revision: https://reviews.llvm.org/D104389
The file was modified clang/include/clang/ASTMatchers/ASTMatchersInternal.h
The file was modified clang/unittests/ASTMatchers/ASTMatchersTraversalTest.cpp
Commit f9a6d47c3642cb07b5e98e8b08330ccc95b85dd8 by ajcbik
Add sparse matrix multiplication integration test

Adds an integration test for the SPMM (sparse matrix multiplication) kernel, which multiplies a sparse matrix by a dense matrix, resulting in a dense matrix. This is just a simple modification on the existing matrix-vector multiplication kernel.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D104334
The file was added mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
Commit c7ed4fe56e0a6c664c5fb5dedaedb426abe7224d by yitzhakm
[libTooling] Change `access` stencil to recognize use of `operator*`.

Currently, `access` doesn't recognize a dereferenced smart pointer. So,
`access(e, "field")` where `e = *x`, yields:
* `x->field`, for normal-pointer x,
* `(*x).field`, for smart-pointer x.

This patch normalizes handling of smart pointers to match normal pointers, when
the smart pointer type supports `->`.
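
The intended normalization can be sketched on plain strings. The real stencil works on AST nodes, so this is only an analogy: `access_text` and its `supports_arrow` flag are hypothetical, and detecting a dereference by a leading `*` is a simplification.

```python
def access_text(expr: str, member: str, supports_arrow: bool = True) -> str:
    """Build source text for accessing `member` on `expr` (toy model)."""
    if expr.startswith("*"):
        pointee = expr[1:]
        if supports_arrow:
            # `*x` followed by `.field` is written as `x->field`, for raw
            # pointers and for smart pointers that provide `operator->`.
            return f"{pointee}->{member}"
        # Without `->`, keep the dereference and parenthesize it.
        return f"({expr}).{member}"
    return f"{expr}.{member}"
```

Under this model, `access_text("*x", "field")` gives `x->field` regardless of whether `x` is a raw or smart pointer, and `(*x).field` remains only for types without `->`.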

Differential Revision: https://reviews.llvm.org/D104390
The file was modified clang/unittests/Tooling/StencilTest.cpp
The file was modified clang/lib/Tooling/Transformer/Stencil.cpp
Commit 25073a4ecfc9b2e3cb76776185e63bfdb094cd98 by terry.l.wilmarth
[OpenMP] Add Two-level Distributed Barrier

Two-level distributed barrier is a new experimental barrier designed
for Intel hardware that has better performance in some cases than the
default hyper barrier.

This barrier is designed to handle fine granularity parallelism where
barriers are used frequently with little compute and memory access
between barriers.  There is no need to use it for codes with few
barriers and large granularity compute, or memory intensive
applications, as little difference will be seen between this barrier
and the default hyper barrier. This barrier is designed to work
optimally with a fixed number of threads, and has a significant setup
time, so should NOT be used in situations where the number of threads
in a team is varied frequently.

The two-level distributed barrier is off by default -- hyper barrier
is used by default. To use this barrier, you must set all barrier
patterns to use this type, because it will not work with other barrier
patterns.  Thus, to turn it on, the following settings are required:

KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist
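
Because all three patterns have to be set together, it can help to build the environment in one place. The helper names below are hypothetical; the variable names and the `dist,dist` values are the ones listed above.

```python
import os
from typing import Dict, Mapping, Optional

# The three settings that must all select the distributed barrier.
DIST_BARRIER_ENV = {
    "KMP_FORKJOIN_BARRIER_PATTERN": "dist,dist",
    "KMP_PLAIN_BARRIER_PATTERN": "dist,dist",
    "KMP_REDUCTION_BARRIER_PATTERN": "dist,dist",
}


def dist_barrier_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    # Copy the base environment (the current process environment by default)
    # and override all three barrier patterns at once.
    env = dict(os.environ if base is None else base)
    env.update(DIST_BARRIER_ENV)
    return env


def uses_dist_everywhere(env: Mapping[str, str]) -> bool:
    # Mixed settings are not supported, so check all three together.
    return all(env.get(key) == value for key, value in DIST_BARRIER_ENV.items())
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching an OpenMP program.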

Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER,
and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed
barrier.

Differential Revision: https://reviews.llvm.org/D103121
The file was modified openmp/runtime/src/kmp_wait_release.cpp
The file was modified openmp/runtime/test/barrier/omp_barrier.c
The file was modified openmp/runtime/src/kmp.h
The file was modified openmp/runtime/src/kmp_str.cpp
The file was modified openmp/runtime/src/kmp_tasking.cpp
The file was modified openmp/runtime/src/kmp_os.h
The file was modified openmp/runtime/src/z_Linux_util.cpp
The file was modified openmp/runtime/src/z_Windows_NT_util.cpp
The file was modified openmp/runtime/src/kmp_wait_release.h
The file was modified openmp/runtime/src/kmp_atomic.cpp
The file was modified openmp/runtime/src/i18n/en_US.txt
The file was modified openmp/runtime/src/kmp_str.h
The file was added openmp/runtime/src/kmp_barrier.h
The file was modified openmp/runtime/src/kmp_settings.cpp
The file was modified openmp/runtime/src/kmp_global.cpp
The file was modified openmp/runtime/src/kmp_stats.h
The file was modified openmp/runtime/src/kmp_barrier.cpp
The file was modified openmp/runtime/src/kmp_runtime.cpp