Started 2 yr 4 mo ago
Took 49 min on green-dragon-09

Success Build #291 (Aug 7, 2017 12:59:20 PM)

  • : 310285
  • : 310282
  • : 310276
  • : 303903
  • : 310125
  • : 310157
  • : 310096
  1. Removing an unused variable that was missed with the refactoring in r310272; NFC.
    by aaronballman
  2. [AMDGPU] Add pseudo "old" source to all DPP instructions

    All instructions with the DPP modifier may not write to certain lanes of
    the output if bound_ctrl=1 is set or any bits in bank_mask or row_mask
    aren't set, so the destination register may be both defined and modified.
    The right way to handle this is to add a constraint that the destination
    register is the same as one of the inputs. We could tie the destination
    to the first source, but that would be too restrictive for some use-cases
    where we want the destination to be some other value before the
    instruction executes. Instead, add a fake "old" source and tie it to the
    destination. Effectively, the "old" source defines what value unwritten
    lanes will get. We'll expose this functionality to users with a new
    intrinsic later.

    Also, we want to use DPP instructions for computing derivatives, which
    means we need to set WQM for them. We also need to enable the entire
    wavefront when using DPP intrinsics to implement nonuniform subgroup
    reductions, since otherwise we'll get incorrect results in some cases.
    To accommodate this, add a new operand to all DPP instructions which will
    be interpreted by the SI WQM pass. This will be exposed with a new
    intrinsic later. We'll also add support for Whole Wavefront Mode later.

    I also fixed to overwrite the source and fixed up
    the test. However, I could also keep the old behavior (where lanes that
    aren't written are undefined) if people want it.

    Reviewers: tstellar, arsenm

    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

    Differential Revision: (detail)
    by cwabbott
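
The role of the tied "old" source described in change 2 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the LLVM implementation: the function name `dpp_mov`, the 64-lane wavefront, the 16-lane rows, and the 4-lane banks are choices made for this model, and it deliberately ignores the lane permutation and bound_ctrl handling of a real DPP instruction. It shows only that lanes disabled by row_mask or bank_mask keep the value supplied through "old".

```python
# Simplified model of a DPP move with a tied "old" source.
# Assumptions (not taken from the commit): 16 lanes per row, 4 lanes per
# bank, and no lane permutation or bound_ctrl handling.
ROW_SIZE = 16
BANK_SIZE = 4


def dpp_mov(old, src, row_mask, bank_mask):
    """Return dst where dst[lane] = src[lane] if the lane is enabled by
    both masks, and old[lane] otherwise."""
    assert len(old) == len(src)
    dst = []
    for lane, (old_val, src_val) in enumerate(zip(old, src)):
        row = lane // ROW_SIZE                  # which row the lane is in
        bank = (lane % ROW_SIZE) // BANK_SIZE   # which bank within the row
        enabled = ((row_mask >> row) & 1) and ((bank_mask >> bank) & 1)
        # Unwritten lanes take their value from the "old" source.
        dst.append(src_val if enabled else old_val)
    return dst


old = [-1] * 64          # value every lane holds before the instruction
src = list(range(64))    # value the instruction would write
# Enable only row 0, banks 0 and 1 (lanes 0 through 7 in this model).
dst = dpp_mov(old, src, row_mask=0b0001, bank_mask=0b0011)
```

In this model, tying the destination to the first source would correspond to always passing `old=src`, which is the restriction the commit message avoids by introducing a separate "old" operand.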

Started by timer

This run spent:

  • 48 min waiting;
  • 49 min build duration;
  • 1 hr 37 min total from scheduled to completion.
Test Result (no failures)