FailedChanges

Summary

  1. [HIP] Fix visibility for 'extern' device variables. (details)
  2. [X86/Atomics] (Semantically) revert G246098, switch back to the old (details)
Commit 0a220de9e9ca3e6786df6c03fd37668815805c62 by michael.hliao
[HIP] Fix visibility for 'extern' device variables.
Summary:
- Fix a bug which misses the change for a variable to be set with
target-specific attributes.
Reviewers: yaxunl
Subscribers: jvesely, nhaehnle, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63020
The file was modifiedclang/lib/CodeGen/CodeGenModule.cpp
The file was modifiedclang/test/CodeGenCUDA/amdgpu-visibility.cu
Commit 027aa27d95c165cb4afa2c0b43b22b729d989755 by listmail
[X86/Atomics] (Semantically) revert G246098, switch back to the old
atomic example
When writing an email for a follow up proposal, I realized one of the
diffs in the committed change was incorrect.  Digging into it revealed
that the fix is complicated enough to require some thought, so reverting
in the meantime.
The problem is visible in this diff (from the revert):
; X64-SSE-LABEL: store_fp128:
; X64-SSE:       # %bb.0:
-; X64-SSE-NEXT:    movaps %xmm0, (%rdi)
+; X64-SSE-NEXT:    subq $24, %rsp
+; X64-SSE-NEXT:    .cfi_def_cfa_offset 32
+; X64-SSE-NEXT:    movaps %xmm0, (%rsp)
+; X64-SSE-NEXT:    movq (%rsp), %rsi
+; X64-SSE-NEXT:    movq {{[0-9]+}}(%rsp), %rdx
+; X64-SSE-NEXT:    callq __sync_lock_test_and_set_16
+; X64-SSE-NEXT:    addq $24, %rsp
+; X64-SSE-NEXT:    .cfi_def_cfa_offset 8
; X64-SSE-NEXT:    retq
  store atomic fp128 %v, fp128* %fptr unordered, align 16
  ret void
The problem here is three fold: 1) x86-64 doesn't guarantee atomicity of
anything larger than 8 bytes.  Some platforms observably break this
guarantee, others don't, but the codegen isn't considering this, so it's
wrong on at least some platforms. 2) When I started to track down the
problem, I discovered that DAGCombiner had stripped the atomicity off
the store entirely.  This comes down to idiomatic usage of DAG.getStore
passing all MMO components separately as opposed to just passing the
MMO. 3) On x86 (not -64), there are cases where 8 byte atomiciy is
supported, but only for floating point operations.  This would seem to
imply that operation typing matters for correctness, and DAGCombine
happily folds away bitcasts.  I'm not 100% sure there's a problem here,
but I'm not entirely sure there isn't either.
I plan on returning to each issue in turn;  sorry for the churn here.
The file was modifiedllvm/test/CodeGen/X86/atomic-non-integer-fp128.ll
The file was modifiedllvm/test/CodeGen/X86/combineIncDecVector-crash.ll
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
The file was modifiedllvm/test/CodeGen/X86/atomic-non-integer.ll