Started 15 hr ago
Took 14 min

Success Build clang-r364215-t57584-b57584.tar.gz (Jun 24, 2019 2:05:59 PM)

Issues

No known issues detected

Build Log

Revision: 362564
Changes
  1. Add debuginfo-tests to the list of repositories needed by lldb-cmake-matrix. (detail)
    by Adrian Prantl
Revision: 362564
Changes
  1. AMDGPU/GlobalISel: Select G_TRUNC (detail)
    by arsenm
  2. AMDGPU/GlobalISel: RegBankSelect for amdgcn.class (detail)
    by arsenm
  3. [PowerPC][UpdateTestChecks] powerpc- triple support

    There are quite a few old testcases with powerpc- triple, so we
    should add support for this triple so that we can update them with the script.

    Differential Revision: https://reviews.llvm.org/D63723 (detail)
    by jsji
  4. AMDGPU/GlobalISel: Split VALU s64 G_ZEXT/G_SEXT in RegBankSelect

    Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need
    to extend to the 32-bit half, and then to 64.

    I'm not sure what the line should be between what RegBankSelect
    handles, and what instruction select does, but for now I'm erring on
    the side of RegBankSelect for future post-RBS combines. (detail)
    by arsenm
  5. [llvm-objdump] Match GNU objdump on symbol types shown in disassembly
    output.

    STT_OBJECT and STT_COMMON are dumped as data, not disassembled.

    https://bugs.llvm.org/show_bug.cgi?id=41947

    Differential Revision: https://reviews.llvm.org/D62964 (detail)
    by yuanfang
  6. [AMDGPU] Allow any value in unused src0 field in v_nop

    Summary:
    The LLVM disassembler assumes that the unused src0 operand of v_nop is
    zero. Other tools can put another value in that field, which is still
    valid. This commit fixes the LLVM disassembler to recognize such an
    encoding as v_nop, in the same way as we already do for s_getpc.

    Differential Revision: https://reviews.llvm.org/D63724

    Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f (detail)
    by tpr
  7. [X86] Don't emit a vzext_movl in LowerBuildVectorv16i8/LowerBuildVectorv8i16 if there are no zeroes in the vector we're building.

    In LowerBuildVectorv16i8 we took care to use an any_extend if the first pair is in the lower 16-bits of the vector and no elements are 0. So bits [31:16] will be undefined. But we still emitted a vzext_movl to ensure that bits [127:32] are 0. If we don't need any zeroes we should be consistent and make all of 127:16 undefined.

    In LowerBuildVectorv8i16 we can just delete the vzext_movl code because we only use the scalar_to_vector when there are no zeroes. So the vzext_movl is always unnecessary.

    Found while investigating whether (vzext_movl (scalar_to_vector (loadi32))) patterns are necessary. At least one of the cases where they were necessary was where the loadi32 matched a 32-bit aligned 16-bit extload. It seemed weird that we required vzext_movl for that case.

    Differential Revision: https://reviews.llvm.org/D63700 (detail)
    by ctopper
  8. [X86] Cleanups and safety checks around the isFNEG

    This patch does a few things to start cleaning up the isFNEG function.

    -Remove the Op0/Op1 peekThroughBitcast calls that seem unnecessary. getTargetConstantBitsFromNode has its own peekThroughBitcast inside. And we have a separate peekThroughBitcast on the return value.
    -Add a check of the scalar size after the first peekThroughBitcast to ensure we haven't changed the element size and just did something like f32->i32 or f64->i64.
    -Remove an unnecessary check that Op1's type is floating point after the peekThroughBitcast. We're just going to look for a bit pattern from a constant. We don't care about its type.
    -Add VT checks in several places that consume the return value of isFNEG. Due to the peekThroughBitcasts inside, the type of the return value isn't guaranteed. So it's not safe to use it to build other nodes without ensuring the type matches the type being used to build the node. We might be able to replace these checks with bitcasts instead, but I don't have a test case so a bail out check seemed better for now.

    Differential Revision: https://reviews.llvm.org/D63683 (detail)
    by ctopper
  9. [AArch64] Regenerate vcvt tests. NFCI.

    Prep work for an upcoming patch (detail)
    by rksimon
  10. [AArch64] Regenerate 2velem tests. NFCI.

    Prep work for an upcoming patch (detail)
    by rksimon
  11. [AArch64] Regenerate merge-store tests. NFCI.

    Prep work for an upcoming patch (detail)
    by rksimon
  12. [X86] Regenerate fast fadd reduction tests. NFCI

    Fix whitespace. (detail)
    by rksimon
  13. AMDGPU/GlobalISel: Fix selecting G_IMPLICIT_DEF for s1

    Try to fail for scc, since I don't think that should ever be produced. (detail)
    by arsenm
  14. [bindings/go] Add debug information accessors

    Add debug information accessors, as provided in the following patches:

    https://reviews.llvm.org/D46627 (DILocation)
    https://reviews.llvm.org/D52693 metadata kind
    https://reviews.llvm.org/D60481 get/set debug location on a Value
    https://reviews.llvm.org/D60489 (DIScope)

    The API as proposed in this patch is similar to the current Value API,
    with a single root type and methods that are only valid for certain
    subclasses. I have considered just implementing generic Line() calls
    (that are valid on all DINodes that have a line) but the implementation
    of that got a bit awkward without support from the C API. I've also
    considered creating generic getters like a Metadata.DebugLoc() that
    returns a DebugLoc, but there is a mismatch between the Go DI nodes in
    the LLVM API and the actual DINode class hierarchy, so that's also hard
    to get right (without being confusing or breaking the API).

    Differential Revision: https://reviews.llvm.org/D63056 (detail)
    by aykevl
  15. Hexagon: Rename another copy of Register class

    For some reason clang is happy with the conflict, but MSVC is not. (detail)
    by arsenm
  16. ARC: Fix -Wimplicit-fallthrough (detail)
    by arsenm
  17. GlobalISel: Remove unsigned variant of SrcOp

    Force using Register.

    One downside is the generated register enums require explicit
    conversion. (detail)
    by arsenm
  18. CodeGen: Introduce a class for registers

    Avoids using a plain unsigned for registers throughout codegen.
    Doesn't attempt to change every register use, just something a little
    more than the set needed to build after changing the return type of
    MachineOperand::getReg(). (detail)
    by arsenm
  19. [AMDGPU] Remove unused variable AllSGPRSpilledToVGPRs. NFC

    Summary:
    Removing the unused variable AllSGPRSpilledToVGPRs in
    SIFrameLowering::processFunctionBeforeFrameFinalized
    to avoid
      error: variable 'AllSGPRSpilledToVGPRs' set but not used
      [-Werror=unused-but-set-variable]

    Reviewers: arsenm, nhaehnle

    Reviewed By: nhaehnle

    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63721 (detail)
    by bjope
  20. Hexagon: Rename Register class

    This avoids a naming conflict in a future patch. (detail)
    by arsenm
  21. [InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X

    Prefer the more exact intrinsic to remove a use of the input value
    and possibly make further transforms easier (we will still need
    to match patterns with funnel-shift of wider types as pieces of
    bswap, especially if we want to canonicalize to funnel-shift with
    constant shift amount). Discussed in D46760. (detail)
    by spatel
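
The fold described in "[InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X" can be sanity-checked outside the compiler. The following is a minimal sketch, not LLVM code: `fshl`, `bswap16`, and `check_all_i16` are hypothetical helper names, and `fshl` models the documented funnel-shift-left semantics (concatenate the two operands, shift left, keep the high half).

```python
def fshl(hi, lo, amt, bits):
    """Model of funnel shift left: concatenate hi:lo, shift left by
    amt modulo bits, and keep the most significant `bits` bits."""
    mask = (1 << bits) - 1
    amt %= bits
    wide = ((hi & mask) << bits) | (lo & mask)
    return ((wide << amt) >> bits) & mask


def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)


def check_all_i16():
    """Exhaustively compare both forms over every 16-bit input."""
    return all(fshl(x, x, 8, 16) == bswap16(x) for x in range(1 << 16))
```

Because both operands are the same value, the funnel shift is a rotate, and rotating a 16-bit value by 8 exchanges its two bytes, which is why the byte swap is an exact replacement for this width only.
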
  22. AMDGPU/GlobalISel: Fix RegBankSelect for s1 sext/zext/anyext

    This needs different handling if the source is known to be a valid
    condition or not. Handle turning it into shifts or a select during
    regbankselect. (detail)
    by arsenm
  23. AMDGPU: Fold frame index into MUBUF

    This matters for byval uses outside of the entry block, which appear
    as copies.

    Previously, the only folding done was during selection, which could
    not see the underlying frame index. For any uses outside the entry
    block, the frame index was materialized in the entry block relative to
    the global scratch wave offset.

    This may produce worse code in cases where the offset ends up not
    fitting in the MUBUF offset field. A better heuristic would be helpful
    for extreme frames. (detail)
    by arsenm
  24. [InstCombine] add tests for funnel-shift to bswap; NFC (detail)
    by spatel
  25. AMDGPU: Cleanup checking when spills need emergency slots

    Address fixme, which should no longer be a problem since r363757. (detail)
    by arsenm
  26. [InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts

    trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts.

    Fixes OSS Fuzz #15217 (detail)
    by rksimon
  27. [DAGCombine] visitMUL - allow shift by zero in MulByConstant.

    This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). It's better to let the shift by zero occur and perform any cleanup afterward.

    Fixes OSS Fuzz #15429 (detail)
    by rksimon
  28. [ConstantFolding] Use hasVectorInstrinsicScalarOpd. NFC

    Summary:
    Use the hasVectorInstrinsicScalarOpd helper function
    in ConstantFoldVectorCall.

    Reviewers: rengolin, RKSimon, dblaikie

    Reviewed By: rengolin, RKSimon

    Subscribers: tschuett, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63705 (detail)
    by bjope
  29. [Scalarizer] Add scalarizer support for smul.fix.sat

    Summary:
    Handle smul.fix.sat in the scalarizer. This is done by
    adding smul.fix.sat to the set of "isTriviallyVectorizable"
    intrinsics.

    The addition of smul.fix.sat in isTriviallyVectorizable and
    hasVectorInstrinsicScalarOpd can also be seen as a preparation
    to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding.

    Reviewers: rengolin, RKSimon, dblaikie

    Reviewed By: rengolin

    Subscribers: hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63704 (detail)
    by bjope
  30. [docs][llvm-nm] Add missing options to documentation

    There were several options missing from the documentation. This patch
    adds them as well as improving some wording and separating the Mach-O
    only options into a separate section.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=42234.

    Reviewed by: MaskRay

    Differential Revision: https://reviews.llvm.org/D63655 (detail)
    by jhenderson
  31. [sancov] Avoid unnecessary unique_ptr (detail)
    by maskray
  32. [ARM] Add MVE interleaving load/store family.

    This adds the family of loads and stores with names like VLD20.8 and
    VST42.32, which load and store parts of multiple q-registers in such a
    way that executing both VLD20 and VLD21, or all four of VLD40..VLD43,
    will distribute 2 or 4 vectors' worth of memory data across the lanes
    of the same number of registers but in a transposed order.

    In addition to the Tablegen descriptions of the instructions
    themselves, this patch also adds encode and decode support for the
    QQPR and QQQQPR register classes (representing the range of loaded or
    stored vector registers), and tweaks to the parsing system for lists
    of vector registers to make it return the right format in this case
    (since, unlike NEON, MVE regards q-registers as primitive, and not
    just an alias for two d-registers). (detail)
    by statham
  33. [docs][llvm-nm] Improve symbol code documentation

    The existing symbol code documentation was very incomplete. This patch
    adds the missing codes, and defines them based on the current code
    behaviour.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=42231.

    Reviewed by: rupprecht, mtrent, MaskRay

    Differential Revision: https://reviews.llvm.org/D63327 (detail)
    by jhenderson
  34. [Support] Fix error handling in DataExtractor::get[US]LEB128

    Summary:
    These functions are documented as not modifying the offset argument if
    the extraction fails (just like other DataExtractor functions). However,
    while reviewing D63591 we discovered that this is not the case -- if the
    function reaches the end of the data buffer, it will just return the
    value parsed until that point and set offset to point to the end of the
    buffer.

    This fixes the functions to act as advertised, and adds a regression
    test.

    Reviewers: dblaikie, probinson, bkramer

    Subscribers: kristina, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63645 (detail)
    by labath
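
The contract described in "[Support] Fix error handling in DataExtractor::get[US]LEB128" (leave the offset untouched when extraction fails) can be illustrated with a small decoder. This is a sketch under stated assumptions, not the LLVM implementation: `get_uleb128` is a hypothetical name, it returns a `(value, new_offset)` pair instead of updating an offset argument, and returning 0 as the value on failure is an assumption.

```python
def get_uleb128(data: bytes, offset: int):
    """Decode one ULEB128 value starting at `offset`.

    Returns (value, new_offset) on success. If the buffer ends before
    the terminating byte is seen, returns (0, offset) so the caller's
    offset is left unchanged.
    """
    value = 0
    shift = 0
    pos = offset
    while pos < len(data):
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:  # high bit clear marks the last byte
            return value, pos
        shift += 7
    # Ran off the end of the buffer: report failure, do not advance.
    return 0, offset
```

Working on a local `pos` and only returning it on success is what keeps a truncated encoding from moving the caller's position to the end of the buffer.
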
  35. Follow up of rL363913. NFC.

    Minor reshuffle in AArch64 targetparser unittest, solving a potential problem
    with querying iterators too early. (detail)
    by sjoerdmeijer
  36. [llvm-readobj/llvm-readelf] - Eliminate the elf-groups.x86_64 precompiled binary from the inputs.

    We do not need the elf-groups.x86_64. In one of the tests, it was
    used for no solid reason, and for the second test case we can use
    YAML input with SHT_GROUP sections.

    The patch performs a cleanup of one of the test cases, removes another
    one completely (since during the review it was found that it actually
    duplicates one of the existing tests) and removes the precompiled binary.

    Differential revision: https://reviews.llvm.org/D63647 (detail)
    by grimar
  37. [X86] Turn v16i16->v16i8 truncate+store into an any_extend+truncstore if we have avx512f, but not avx512bw.

    Ideally we'd be able to represent this truncate as an any_extend to
    v16i32 and a truncate, but SelectionDAG doesn't know how to not
    fold those together.

    We have isel patterns to use a vpmovzxwd+vpmovdb for the truncate,
    but we aren't able to simultaneously fold the load and the store
    from the isel pattern. By pulling the truncate into the store we
    can successfully hide it from the DAG combiner. Then we can isel
    pattern match the truncstore and load+any_extend separately. (detail)
    by ctopper
  38. [GN] Generation failure caused by trailing space in file name

    When I executed gn.py gen out/gn I got the following error:

    ERROR at //compiler-rt/lib/builtins/BUILD.gn:162:7: Only source, header, and object files belong in the sources of a static_library. //compiler-rt/lib/builtins/emutls.c  is not one of the valid types.
          "emutls.c ",
          ^----------
    See //compiler-rt/lib/BUILD.gn:3:5: which caused the file to be included.
        "//compiler-rt/lib/builtins",
        ^---------------------------
    It turns out that the latest gn doesn't accept ill-formed file names, and the emutls.c above has a trailing space.
    Removing the trailing space should work.

    Patch By: myhsu
    Differential Revision: https://reviews.llvm.org/D63449 (detail)
    by phosek
  39. Fix typo in comment; NFC (detail)
    by sanjoy
  40. [X86] Fix isel pattern that was looking for a bitcasted load. Remove what appears to be a copy/paste mistake.

    DAG combine should ensure bitcasts of loads don't exist.

    Also remove 3 patterns that are identical to the block above them. (detail)
    by ctopper
  41. [Tests] Autogen and improve test readability (detail)
    by reames
  42. [IndVars] Remove dead instructions after folding trivial loop exit

    In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration).  This extends that to eliminate the dead comparison left around. (detail)
    by reames
  43. SlotIndexes: delete unused functions (detail)
    by maskray
  44. [InstCombine] squash is-power-of-2 that uses ctpop

    This is another intermediate IR step towards solving PR42314:
    https://bugs.llvm.org/show_bug.cgi?id=42314

    We can test if a value is power-of-2-or-0 using ctpop(X) < 2,
    so combining that with a non-zero check of the input is the
    same as testing if exactly 1 bit is set:

    (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1

    Differential Revision: https://reviews.llvm.org/D63660 (detail)
    by spatel
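
The equivalence stated in "[InstCombine] squash is-power-of-2 that uses ctpop" can be checked by brute force. The following is an illustrative sketch only: `ctpop`, `before`, and `after` are hypothetical names, and the check covers 16-bit values rather than proving the fold for every width.

```python
def ctpop(x):
    """Population count: number of set bits in a non-negative integer."""
    return bin(x).count("1")


def before(x):
    """Original pattern: (X != 0) && (ctpop(X) u< 2)."""
    return (x != 0) and (ctpop(x) < 2)


def after(x):
    """Folded pattern: ctpop(X) == 1."""
    return ctpop(x) == 1


def check_all_i16():
    """Compare both patterns over every 16-bit input."""
    return all(before(x) == after(x) for x in range(1 << 16))
```

The only input where `ctpop(X) u< 2` and `ctpop(X) == 1` disagree is zero, and the non-zero check already excludes it, so the two-part test collapses to the single comparison.
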
  45. SlotIndexes: simplify IdxMBBPair operators (detail)
    by maskray
  46. [SelectionDAG] Remove the code that attempts to calculate the alignment for the second half of a split masked load/store.

    The code divides the alignment by 2 if the original alignment is
    equal to the original VT size. But this wouldn't be correct
    if the alignment was larger than the VT size.

    The memory operand object already takes care of calling MinAlign
    on the base alignment and the memory pointer offset. So we don't
    need any special code at all. (detail)
    by ctopper
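
To illustrate why no special-case code is needed in "[SelectionDAG] Remove the code that attempts to calculate the alignment for the second half of a split masked load/store", here is a minimal sketch of a MinAlign-style computation. `min_align` is a hypothetical Python stand-in, and the byte offsets used below are made-up example values.

```python
def min_align(base_align, offset):
    """Largest power of two that divides both arguments, computed as
    the lowest set bit of (base_align | offset)."""
    combined = base_align | offset
    return combined & -combined


# Hypothetical split of a 32-byte access: the second half sits 16 bytes in.
half_offset = 16
align_equal_to_size = min_align(32, half_offset)   # base alignment == access size
align_larger_than_size = min_align(64, half_offset)  # base alignment > access size
```

Deriving the alignment from the base alignment and the offset handles both cases uniformly, including the one where the base alignment is larger than the access size.
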
  47. [X86][SelectionDAG] Cleanup and simplify masked_load/masked_store in tablegen. Use more precise PatFrags for scalar masked load/store.

    Rename masked_load/masked_store to masked_ld/masked_st to discourage
    their direct use. We need to check truncating/extending and
    compressing/expanding before using them. This revealed that
    our scalar masked load/store patterns were misusing these.

    With those out of the way, renamed masked_load_unaligned and
    masked_store_unaligned to remove the "_unaligned". We didn't
    check the alignment anyway so the name was somewhat misleading.

    Make the aligned versions inherit from masked_load/store instead
    of from a separate identical version. Merge the 3 different alignments
    PatFrags into a single version that uses the VT from the SDNode to
    determine the size that the alignment needs to match. (detail)
    by ctopper
  48. [Support] Fix build under Emscripten

    Summary:
    Emscripten's libc doesn't define MNT_LOCAL, thus causing a build
    failure in the fallback path. However, to the best of my knowledge,
    it also doesn't support remote file system mounts, so we may simply
    return `true` here (as we do for e.g. Fuchsia). With this fix, the
    core LLVM libraries build correctly under emscripten (though some
    of the tools and utils do not).

    Reviewers: kripken
    Differential Revision: https://reviews.llvm.org/D63688 (detail)
    by kfischer
  49. Revert [CommandLine] Remove OptionCategory and SubCommand caches from the Option class.

    This reverts r364134 (git commit a5b83bc9e3b8e8945b55068c762bd6c73621a4b0)

    Caused errors in the asan bot, so the GeneralCategory global needs to
    be changed to ManagedStatic.

    Differential Revision: https://reviews.llvm.org/D62105 (detail)
    by dhinton
  50. [X86][SSE] Fold extract_subvector(vselect(x,y,z),0) -> vselect(extract_subvector(x,0),extract_subvector(y,0),extract_subvector(z,0)) (detail)
    by rksimon
  51. Exploit a zero LoopExit count to eliminate loop exits

    This turned out to be surprisingly effective. I was originally doing this just for completeness' sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than its isKnownPredicate reasoning.

    Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.

    Differential Revision: https://reviews.llvm.org/D63618 (detail)
    by reames
  52. [CommandLine] Remove OptionCategory and SubCommand caches from the Option class.

    Summary:
    This change processes `OptionCategory`s and `SubCommand`s as they
    are seen instead of caching them in the Option class and processing
    them later.  Doing so simplifies the work needed to be done by the Global
    parser and significantly reduces the size of the Option class to a mere 64
    bytes.

    Removing  the `OptionCategory` cache saved 24 bytes, and removing
    the `SubCommand` cache saved an additional 48 bytes, for a total of a
    72 byte reduction.

    Reviewers: beanz, zturner, MaskRay, serge-sans-paille

    Reviewed By: serge-sans-paille

    Subscribers: serge-sans-paille, tstellar, zturner, hiraditya, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62105 (detail)
    by dhinton
  53. [NFC] Fix indentation in PPCAsmPrinter.cpp

    After r248261, the indentation switches, inside a namespace definition,
    between indenting and not indenting one level in for that namespace; the
    abomination occurs in the middle of a class definition. Fix that. (detail)
    by hubert.reinterpretcast
  54. [PowerPC][NFC] Move comment to the relevant function

    A comment that applies to a virtual destructor was placed on a class
    constructor. Move the comment to where it belongs. (detail)
    by hubert.reinterpretcast
  55. PDB docs: Delete trailing whitespace, wrap to 80 cols (detail)
    by nico
  56. [NewGVN] Fix copy/paste mistake in cast (detail)
    by nikic
  57. [NewGVN] Remove dead SwitchEdges variable; NFC (detail)
    by nikic
  58. [LFTR] Add tests for PR41998; NFC

    The limit for the pointer case is incorrect. (detail)
    by nikic
  59. AArch64: Add support for reading pc using llvm.read_register.

    This is useful for allowing code to efficiently take an address
    that can be later mapped onto debug info. Currently the hwasan
    pass achieves this by taking the address of the current function:
    http://llvm-cs.pcc.me.uk/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp#921

    but this costs two instructions (plus a GOT entry in PIC code) per function
    with stack variables. This will allow the cost to be reduced to a single
    instruction.

    Differential Revision: https://reviews.llvm.org/D63471 (detail)
    by pcc
  60. [CMake] Delete redundant DEPENDS/LINK_LIBS from LineEditor/XRay

    The link dependencies are already specified in LLVMBuild.txt (detail)
    by maskray
  61. Make GlobalISel depend on SelectionDAG after D63169

    GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp.
    This fixes a link error in -DBUILD_SHARED_LIBS=on builds:

        ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear()
        >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198)
        >>>               lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction()) (detail)
    by maskray
  62. Fix UNSUPPORTED attribute from windows to system-windows. (detail)
    by dyung
  63. [llvm-objdump] Allow --disassemble-functions to take demangled names

    The --disassemble-functions switch takes demangled names when
    --demangle is specified, otherwise the switch takes mangled names.

    https://bugs.llvm.org/show_bug.cgi?id=41908

    Reviewers: jhenderson, grimar, MaskRay, rupprecht

    Differential Revision: https://reviews.llvm.org/D63524 (detail)
    by yuanfang
  64. [llvm-objdump] Move --start-address >= --stop-address check out of the
    -d code.

    Summary:
    Move it into `main` function so the checking is effective for all actions
    user may do with llvm-objdump; notably, -r and -s in addition to existing -d.

    Match GNU behavior.

    Reviewers: jhenderson, grimar, MaskRay, rupprecht

    Subscribers: llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63631 (detail)
    by yuanfang
  65. AArch64: Prefer FP-relative debug locations in HWASANified functions.

    To help produce better diagnostics for stack use-after-return, we'd like
    to be able to determine the addresses of each HWASANified function's local
    variables given a small amount of information recorded on entry to the
    function. Currently we require all HWASANified functions to use frame pointers
    and record (PC, FP) on function entry. This works better than recording SP
    because FP cannot change during the function, unlike SP which can change
    e.g. due to dynamic alloca.

    However, most variables currently end up using SP-relative locations in their
    debug info. This prevents us from recomputing the address of most variables
    because the distance between SP and FP isn't recorded in the debug info. To
    address this, make the AArch64 backend prefer FP-relative debug locations
    when producing debug info for HWASANified functions.

    Differential Revision: https://reviews.llvm.org/D63300 (detail)
    by pcc
  66. gn build: Merge r364046. (detail)
    by pcc
  67. [COFF, ARM64] Fix encoding of debugtrap for Windows

    On Windows ARM64, intrinsic __debugbreak is compiled into brk #0xF000 which is
    mapped to llvm.debugtrap in Clang. Instruction brk #0xF000 is the defined break
    point instruction on ARM64 which is recognized by Windows debugger and
    exception handling code, so llvm.debugtrap should map to it instead of
    redirecting to llvm.trap (brk #1) as the default implementation.

    Differential Revision: https://reviews.llvm.org/D63635 (detail)
    by tomtan
  68. Revert [SLP] Look-ahead operand reordering heuristic.

    This reverts r364084 (git commit 5698921be2d567f6abf925479ac9f5a376d6d74f)

    It caused crashes while compiling a file in Chrome. Reduction
    forthcoming. (detail)
    by rnk
  69. [llvm-lipo] Implement -thin

    Creates thin output file of specified arch_type from the fat input file.

    Patch by Anusha Basana <anushabasana@fb.com>

    Differential Revision: https://reviews.llvm.org/D63341 (detail)
    by smeenai
  70. [ASan] Use dynamic shadow on 32-bit iOS and simulators

    The VM layout on iOS is not stable between releases. On 64-bit iOS and
    its derivatives we use a dynamic shadow offset that enables ASan to
    search for a valid location for the shadow heap on process launch rather
    than hardcode it.

    This commit extends that approach for 32-bit iOS plus derivatives and
    their simulators.

    rdar://50645192
    rdar://51200372
    rdar://51767702

    Reviewed By: delcypher

    Differential Revision: https://reviews.llvm.org/D63586 (detail)
    by yln
  71. [X86] Add test cases for incorrect shrinking of volatile vector loads from 128-bits to 32 or 64 bits. NFC

    This is caused by isel patterns that look for vzmovl+load and
    treat it the same as vzload. (detail)
    by ctopper
  72. AMDGPU: Fix not using s33 for scratch wave offset in kernels

    Fixes missing piece from r363990. (detail)
    by arsenm
  73. [X86] Add DAG combine to turn (vzmovl (insert_subvector undef, X, 0)) into (insert_subvector allzeros, (vzmovl X), 0)

    128/256 bit scalar_to_vectors are canonicalized to (insert_subvector undef, (scalar_to_vector), 0). We have isel patterns that try to match this pattern being used by a vzmovl to use a 128-bit instruction and a subreg_to_reg.

    This patch detects the insert_subvector undef portion of this and pulls it through the vzmovl, creating a narrower vzmovl and an insert_subvector allzeroes. We can then match the insertsubvector into a subreg_to_reg operation by itself. Then we can fall back on existing (vzmovl (scalar_to_vector)) patterns.

    Note, while the scalar_to_vector case is the motivating case I didn't restrict to just that case. I'm also wondering about shrinking any 256/512 vzmovl to an extract_subvector+vzmovl+insert_subvector(allzeros) but I fear that would have bad implications to shuffle combining.

    I also think there is more canonicalization we can do with vzmovl with loads or scalar_to_vector with loads to create vzload.

    Differential Revision: https://reviews.llvm.org/D63512 (detail)
    by ctopper
  74. [X86] Don't mark v64i8/v32i16 ISD::SELECT as custom unless they are legal types.

    We don't have any Custom handling during type legalization. Only
    operation legalization.

    Fixes PR42355 (detail)
    by ctopper
  75. [X86] Add avx512bw command lines to avx512-select.ll

    Prep for fixing PR42355 and ensuring we have coverage of
    ISD::SELECT for v64i8/v32i16 on KNL and SKX configs. (detail)
    by ctopper
  76. [X86] Add a debug print of the node in the default case for unhandled opcodes in ReplaceNodeResults.

    This should be unreachable, but bugs can make it reachable. This
    adds a debug print so we can see the bad node in the output when
    the llvm_unreachable triggers. (detail)
    by ctopper
  77. [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) as shuffle

    Subvector shuffling often ends up as insert/extract subvector. (detail)
    by rksimon
  78. [AArch64][GlobalISel] Implement selection support for the new G_JUMP_TABLE and G_BRJT ops.

    With this we can now fully code generate jump tables, which is important for code size.

    Differential Revision: https://reviews.llvm.org/D63223 (detail)
    by aemerson
  79. [GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks.

    This change makes use of the newly refactored SwitchLoweringUtils code from
    SelectionDAG in order to generate jump tables and range checks where appropriate.

    Much of this code is ported from SDAG with some modifications. We generate
    G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means
    that targets which previously relied on the naive one MBB per case stmt
    translation will now start falling back until they add support for the new opcodes.

    For range checks, we don't generate any previously unused operations. This
    just recognizes contiguous ranges of case values and generates a single block per
    range. Single case value blocks are just a special case of ranges so we get that
    support almost for free.

    There are still some optimizations missing that I haven't ported over, and
    bit-tests are also unimplemented. This patch series is already complex enough.

    Actual arm64 support for selection of jump tables is coming in a later patch.

    Differential Revision: https://reviews.llvm.org/D63169 (detail)
    by aemerson
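
The range-check idea in "[GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks" (recognize contiguous case values and emit one block per range) can be sketched as a simple clustering pass. This is an assumption-laden illustration, not the SwitchLoweringUtils algorithm: `cluster_cases` is a hypothetical name, destinations are plain strings, and the sketch assumes that only consecutive values with the same destination are merged.

```python
def cluster_cases(cases):
    """Group (value, dest) pairs into (low, high, dest) ranges.

    Consecutive case values that branch to the same destination are
    merged into one range; a lone value becomes a range of width one.
    """
    clusters = []
    for value, dest in sorted(cases):
        if clusters and clusters[-1][2] == dest and clusters[-1][1] + 1 == value:
            low = clusters[-1][0]
            clusters[-1] = (low, value, dest)  # extend the current range
        else:
            clusters.append((value, value, dest))  # start a new range
    return clusters
```

A single case value falls out as the degenerate range `(v, v, dest)`, which matches the commit's remark that single case value blocks are just a special case of ranges.
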
  80. [SLP] Look-ahead operand reordering heuristic.

    This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

    Committed on behalf of @vporpo (Vasileios Porpodas)

    Differential Revision: https://reviews.llvm.org/D60897 (detail)
    by rksimon
  81. [NFC] Update shl-sub tests (detail)
    by xbolva00
  82. [InstCombine] add tests for ctpop folds; NFC (detail)
    by spatel
  83. [X86] Use vmovq for v4i64/v4f64/v8i64/v8f64 vzmovl.

    We already use vmovq for v2i64/v2f64 vzmovl. But we were using a
    blendpd+xorpd for v4i64/v4f64/v8i64/v8f64 under opt speed. Or
    movsd+xorpd under optsize.

    I think the blend with 0 or movss/d is only needed for
    vXi32 where we don't have an instruction that can move 32
    bits from one xmm to another while zeroing upper bits.

    movq is no worse than blendpd on any known CPUs. (detail)
    by ctopper
Revision: 362564
Changes
  1. [clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM

    For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for
    loop, but this still checks for no loop unrolling as intended. This is
    committed separately from D63174. (detail)
    by leonardchan
  2. [clang][NewPM] Remove exception handling before loading pgo sample profile data

    This patch ensures that SimplifyCFGPass comes before SampleProfileLoaderPass
    on PGO runs in the new PM and fixes clang/test/CodeGen/pgo-sample.c.

    Differential Revision: https://reviews.llvm.org/D63626 (detail)
    by leonardchan
  3. [analyzer] print() JSONify: ProgramPoint revision

    Summary: Now we also print out the filename with its path.

    Reviewers: NoQ

    Reviewed By: NoQ

    Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin,
                 mikhail.ramalho, Szelethus, donat.nagy, dkrupp, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63438 (detail)
    by charusso
  4. [analyzer] Fix JSON dumps for ExplodedNodes

    Summary:
    - Now we can see the `has_report` property in `trim-egraph` mode.
    - This patch also removes the trailing comma after each node.

    Reviewers: NoQ

    Reviewed By: NoQ

    Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin,
                 mikhail.ramalho, Szelethus, donat.nagy, dkrupp, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63436 (detail)
    by charusso
  5. [OPENMP]Relax the test checks to pacify 32bit buildbots, NFC. (detail)
    by abataev
  6. [CUDA][HIP] Don't set comdat attribute for CUDA device stub functions.

    Differential Revision: https://reviews.llvm.org/D63277 (detail)
    by kpyzhov
  7. [OpenCL] Restore ATOMIC_VAR_INIT

    We accidentally lost the ATOMIC_VAR_INIT and ATOMIC_FLAG_INIT macros
    in r363794.

    Also put the `memory_order` typedef back inside a `>= CL2.0` guard. (detail)
    by svenvh
  8. [OpenCL] Remove more duplicates from opencl-c.h

    Identified the duplicate declarations using

      sort lib/Headers/opencl-c.h | uniq -c | grep '      2' (detail)
    by svenvh
  9. PR42362: Fix auto deduction of template parameter packs from
    type-dependent argument packs.

    We need to strip off the PackExpansionExpr to get the real (dependent)
    type rather than an opaque DependentTy. (detail)
    by rsmith
  10. Fix test for 32-bit targets. (detail)
    by rsmith
  11. Revert "builtins: relax __iso_volatile_{load,store}32"

    This reverts commit SVN r364137.  This seems to cause problems with
    casting in C. (detail)
    by Saleem Abdulrasool
  12. MSVC visualizers for type aliases

    For example, the following TypeAliasTemplateDecl now displays in the autos window as
    template<class T> using type_identity_t = type_identity<T>::type; (detail)
    by mps
  13. Fix TBAA representation for zero-sized fields and unnamed bit-fields.

    Unnamed bit-fields should not be represented in the TBAA metadata
    because they do not represent storage fields (they only affect layout).

    Zero-sized fields should not be represented in the TBAA metadata
    because by definition they have no associated storage (so we will never
    emit a load or store through them), and they might not appear in
    declaration order within the struct layout.

    Fixes a verifier failure when emitting a TBAA-enabled load through a
    class type containing a zero-sized field. (detail)
    by rsmith
  14. Remove reliance on toCharUnitsFromBits rounding down. (detail)
    by rsmith
  15. Natural MSVC visualization of constructors

    E.g., Allow MSVC to visualize a CXXConstructorDecl like
    Constructor { Y(type_identity_t<T>)} (detail)
    by mps
  16. builtins: relax __iso_volatile_{load,store}32

    This is reduced from MSVC's MSVCPRT 14.21.27702 atomic header.  Because
    Windows is an LLP64 environment, `long`, `long int`, and `int` are all
    synonymous.  Change the signature for `__iso_volatile_load32` and
    `__iso_volatile_store32` to accept a `long int` instead.  This allows
    an implicit cast of `int` to `long int` while also permitting `long`
    to be accepted. (detail)
    by Saleem Abdulrasool
  17. [X86] Don't use _MM_FROUND_CUR_DIRECTION in the intrinsics tests.

    _MM_FROUND_CUR_DIRECTION is the behavior of the intrinsics that
    don't take a rounding mode argument. So a better test
    is using _MM_FROUND_NO_EXC with the SAE only intrinsics and
    an explicit rounding mode with the intrinsics that support
    embedded rounding mode. (detail)
    by ctopper
  18. AMDGPU: Fix target builtins for gfx10

    This wasn't setting some of the features from older generations. (detail)
    by arsenm
  19. [ODRHash] Skip some typedef types.

    In some cases, a typedef only strips away a keyword for a type, keeping the
    same name as the root record type.  This causes some confusion when the type
    is defined in one module but only forward declared in another.  Skipping the
    typedef and going straight to the record will avoid this issue.

    typedef struct S {} S;
    S* s;  // S is TypedefType here

    struct S;
    S* s;  // S is RecordType here (detail)
    by rtrieu
  20. Remove binary finally accidentally committed in r364109 (detail)
    by erichkeane
  21. Ensure Target Features always_inline error happens in C++ cases.

    A handful of C++ cases as reported in PR42352 didn't actually give an
    error when always_inlining with a different target feature list. This
    resulted in broken IR. (detail)
    by erichkeane
  22. Fix has_attribute.cpp test on Windows after r364102 (detail)
    by rnk
  23. clang-format a block; NFC

    The indentation of the return here was off, and confusing as a result.
    Cleaned up a bit extra while I was in the area. (detail)
    by George Burgess IV
  24. PR42301: Abort cleanly if we encounter a huge source file rather than
    crashing.

    Ideally we wouldn't care about the size of a file so long as it fits in
    memory, but in practice we have lots of hardcoded assumptions that
    unsigned can be used to index files, string literals, and so on. (detail)
    by rsmith
  25. Fix __has_cpp_attribute expansion to produce trailing L and (where
    necessary) leading whitespace.

    Simplify unit test and extend to cover no_unique_address attribute. (detail)
    by rsmith
  26. Devirtualize destructor of final class.

    Summary:
    Take advantage of the final keyword to devirtualize destructor calls.

    Fix https://bugs.llvm.org/show_bug.cgi?id=21368

    Reviewers: rsmith

    Reviewed By: rsmith

    Subscribers: davidxl, Prazek, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63161 (detail)
    by yamauchi
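
A minimal sketch of the situation this change targets, assuming hypothetical types `Base` and `Leaf` that are not taken from the patch. When the static type of the deleted pointer is a `final` class, no further-derived destructor can exist, so a compiler may call the destructor directly rather than through the vtable. The observable behavior is the same either way; only the generated call changes.

```cpp
#include <cassert>

// Hypothetical types for illustration only; they do not come from the patch.
struct Base {
    virtual ~Base() { ++destroyed; }
    static int destroyed;  // counts Base destructor runs
};
int Base::destroyed = 0;

struct Leaf final : Base {
    ~Leaf() override { ++leaf_destroyed; }
    static int leaf_destroyed;  // counts Leaf destructor runs
};
int Leaf::leaf_destroyed = 0;

// Static type is Leaf, and Leaf is final, so the dynamic type must also be
// Leaf. A compiler is free to emit a direct call to Leaf::~Leaf here.
void destroy_leaf(Leaf* p) { delete p; }

// Base is not final, so the dynamic type is unknown at this call site and
// the destructor call generally has to stay virtual.
void destroy_base(Base* p) { delete p; }
```

Whether a given compiler actually emits the direct call depends on its version and optimization settings; inspecting the emitted IR or assembly is the way to confirm it.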
  27. Revert [test][Driver] Fix Clang :: Driver/cl-response-file.c

    This reverts r363985 (git commit d5f16d6cfccc4b0b13b6c01d16c673886d53e695)

    This test can't use printf on Windows because the path contains
    backslashes which must not be interpreted as escapes by printf. (detail)
    by rnk
  28. [clang-scan-deps] print the dependencies to stdout
    and remove the need to use -MD options in the CDB

    Differential Revision: https://reviews.llvm.org/D63579 (detail)
    by arphaman
  29. Fix ARM buildbot. (detail)
    by rsmith
  30. [OPENMP]Fix PR42068: Vla type is not captured.

    If the variably modified type is declared outside of the captured region
    and then used in the cast expression along with array subscript
    expression, the type is not captured and it leads to the compiler crash. (detail)
    by abataev
  31. Ensure that top-level QualType objects also have a "kind" field when dumping the AST to JSON. (detail)
    by aaronballman
Revision: 362564
Changes
  1. [clangd] Improve SelectionTree string representation (detail)
    by sammccall
  2. [NFC] Marking test added in r363975 as unsupported on Windows.

    This test references a path that does not exist on Windows causing
    it to emit different output from what was expected leading to a
    failure when run on Windows. (detail)
    by dyung
  3. [clang-tidy] misc-unused-parameters: don't comment out parameter name for C code

    Summary: The fixit `int square(int /*num*/)` yields `error: parameter name omitted` for C code. Enable it only for C++ code.

    Reviewers: klimek, ilya-biryukov, lebedev.ri, aaron.ballman

    Subscribers: xazax.hun, cfe-commits

    Tags: #clang

    Differential Revision: https://reviews.llvm.org/D63088 (detail)
    by mgehre
  4. Quote path to Python executable in case it has spaces

    These days Python 3 is typically installed into C:/Program Files, so
    cope with that.

    Similar to r364077 in compiler-rt. (detail)
    by rnk
Revision: 362564
Changes
  1. [ASan] Use dynamic shadow on 32-bit iOS and simulators

    The VM layout on iOS is not stable between releases. On 64-bit iOS and
    its derivatives we use a dynamic shadow offset that enables ASan to
    search for a valid location for the shadow heap on process launch rather
    than hardcode it.

    This commit extends that approach for 32-bit iOS plus derivatives and
    their simulators.

    rdar://50645192
    rdar://51200372
    rdar://51767702

    Reviewed By: delcypher

    Differential Revision: https://reviews.llvm.org/D63586 (detail)
    by yln
  2. [asan] Quote the path to the Python exe in case it has spaces

    These days, Python 3 installs itself into Program Files, so it often has
    spaces. At first, I resisted this, and I reinstalled it globally into
    C:/Python37, similar to the location used for Python 2.7. But then I
    updated VS 2019, and it uninstalled my copy of Python and installed a
    new one inside "C:/Program Files (x86)/Microsoft Visual Studio/". At
    this point, I gave up and switched to using its built-in version of
    Python. However, now these tests fail, and have to be made aware of the
    possibility of spaces in paths. :( (detail)
    by rnk
Revision: 362564
Changes
  1. [libcxx] [test] Read files as bytestrings to fix py3 encoding issues

    Use binary mode to read test files in the libcxx LibcxxTestFormat class.
    This ensures that tests are read correctly independently of encoding,
    and therefore fixes the UnicodeDecodeError raised when a file is opened
    in a Python 3 that defaults to pure ASCII encoding.

    Technically this could be also fixed via conditionally appending
    encoding argument when opening the file in Python 3.  However, since
    the code in question only searches for fixed ASCII substrings reading
    it in binary mode is simpler and more universal.

    Differential Revision: https://reviews.llvm.org/D63346 (detail)
    by mgorny
  2. Use C++11 implementation of unique_ptr in C++03. (detail)
    by ericwf
  3. Apply new meta-programming traits throughout the library.

    The new meta-programming primitives are lower cost than the old versions. This patch removes those old versions and switches libc++ to use the new ones. (detail)
    by ericwf
  4. Disable test by default (detail)
    by ericwf
  5. Add super fast _IsSame trait for internal use.

    Clang provides __is_same that doesn't produce any instantiations
    and just returns a bool. It's a lot faster than using std::is_same.

    I'll follow up with a patch to actually start using it. (detail)
    by ericwf
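
A rough sketch of the idea, not the libc++ implementation: the names `IsSameSketch` and `SKETCH_HAS_IS_SAME_BUILTIN` are made up for illustration. It assumes the compiler exposes the `__is_same` builtin and falls back to `std::is_same` when it does not.

```cpp
#include <type_traits>

// Detect the builtin where the compiler lets us ask; otherwise fall back.
#if defined(__has_builtin)
#  if __has_builtin(__is_same)
#    define SKETCH_HAS_IS_SAME_BUILTIN 1
#  endif
#endif

#ifdef SKETCH_HAS_IS_SAME_BUILTIN
// The builtin yields a bool directly, so no std::is_same specialization
// is instantiated for each (T, U) pair.
template <class T, class U>
using IsSameSketch = std::integral_constant<bool, __is_same(T, U)>;
#else
// Portable fallback: instantiates std::is_same<T, U> for each pair.
template <class T, class U>
using IsSameSketch = std::is_same<T, U>;
#endif
```

Either branch gives the same answers; the builtin path only reduces the number of class template instantiations the compiler has to perform.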
  6. Add noexcept throughout <atomic>

    The CMake CheckLibcxxAtomic module was always failing to compile
    the example, even when libatomic wasn't needed. This happened
    because the check doesn't link a C++ runtime library to provide
    std::terminate, which is required for exception support.

    The check is still really broken, but <atomic> is better! (detail)
    by ericwf
  7. Fix placement of -Wno-ignored-attributes (detail)
    by ericwf
  8. Disable -Wignored-attributes for now (detail)
    by ericwf
  9. Add new style meta-programming primitives.

    Using class templates instead of alias templates causes a lot of
    instantiations. As part of the move away from C++03, we want to
    improve the efficiency of our meta-programming.

    This patch lays the groundwork by introducing new _If, _EnableIf,
    _And, _Or, and _IsValidExpansion (detect member). Future patches
    will replace the existing implementations after verifying their
    compile-time differences. (detail)
    by ericwf
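
To illustrate why alias templates can be cheaper, here is a sketch of a conditional selector in that style. The names `IfImplSketch` and `IfSketch` are hypothetical and the code is not copied from libc++. A class-template conditional such as `std::conditional<C, T, U>` is instantiated once per distinct `(C, T, U)` triple, whereas this form only ever instantiates two classes, one for `true` and one for `false`.

```cpp
#include <type_traits>

// Only two specializations of this class ever exist, no matter how many
// different type pairs are selected between.
template <bool>
struct IfImplSketch;

template <>
struct IfImplSketch<true> {
    template <class Then, class Else>
    using Select = Then;  // condition true: pick the first type
};

template <>
struct IfImplSketch<false> {
    template <class Then, class Else>
    using Select = Else;  // condition false: pick the second type
};

// The alias expands in place and does not create a new class per use.
template <bool Cond, class Then, class Else>
using IfSketch = typename IfImplSketch<Cond>::template Select<Then, Else>;
```

A use such as `IfSketch<(sizeof(long) > 4), long, long long>` names the chosen type directly, with no `::type` member access.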
  10. Implement P0340R3: Make 'underlying_type' SFINAE-friendly. Reviewed as https://reviews.llvm.org/D63574 (detail)
    by marshall

Started by upstream project relay-lnt-ctmark build number 8606
originally caused by:

This run spent:

  • 7 sec waiting;
  • 14 min build duration;
  • 14 min total from scheduled to completion.