Channel: AMD Developer Forums: Message List - Global synchronization inside the kernel

Re: Global synchronization inside the kernel

Hi drallan,

 

Thanks for the great example code! And congrats on your compiler!

 

But how can it fail on something as simple as this? (The result is a deadlock at ds_gws_barrier :S)

 

The AMD disassembler tells me that my ds_gws encodings are correct. I restrict the whole kernel to the first local lane. The workgroup size is 64 and there are only 2 workgroups, yet it goes into an infinite loop :S

 

Is there something in the CAL note section that has to be set to enable it?

I've found something called IMM_GWS_BASE, commented as "immediate UINT with GWS resource base offset". It's in an _E_SC_USER_DATA_CLASS structure. Is that the key? (Right now I don't fiddle with it because I always ask the current OpenCL to make me a fresh skeleton kernel.)
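To make the intent of the kernel below explicit, here is a minimal host-side toy model in Python of the counting behaviour I am expecting: the counter is loaded with (number of waves - 1), and every wave that reaches the barrier blocks until all of them have arrived. This is only a sketch of the intended semantics as I describe them in this post, not a model of the real GWS hardware; `ToyGwsBarrier` and `run_waves` are hypothetical names, and the explicit "wait until init has happened" step stands in for the long s_sleep delay in the kernel.

```python
import threading


class ToyGwsBarrier:
    """Toy counting barrier. NOT a model of real GWS hardware."""

    def __init__(self):
        self._cond = threading.Condition()
        self._initial = None   # value loaded by init(), i.e. waves - 1
        self._arrived = 0      # waves that reached the barrier this round
        self._generation = 0   # bumped each time the barrier releases

    def init(self, value):
        # Counterpart of "ds_gws_init" in this toy model.
        with self._cond:
            self._initial = value
            self._arrived = 0
            self._cond.notify_all()

    def barrier(self, timeout=5.0):
        # Counterpart of "ds_gws_barrier"; returns True if released,
        # False if the timeout expired (a hang on a real device).
        with self._cond:
            # Assumption: arrivals before init simply wait for init.
            if not self._cond.wait_for(
                    lambda: self._initial is not None, timeout):
                return False
            gen = self._generation
            self._arrived += 1
            if self._arrived == self._initial + 1:
                # Last wave in: release everyone and reset for reuse.
                self._arrived = 0
                self._generation += 1
                self._cond.notify_all()
                return True
            return self._cond.wait_for(
                lambda: self._generation != gen, timeout)


def run_waves(num_waves, init_value, timeout=5.0):
    """Run num_waves threads; wave 0 does the init, all hit the barrier."""
    gws = ToyGwsBarrier()
    released = [False] * num_waves

    def wave(gid):
        if gid == 0:
            gws.init(init_value)
        released[gid] = gws.barrier(timeout)

    threads = [threading.Thread(target=wave, args=(g,))
               for g in range(num_waves)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return released


if __name__ == "__main__":
    print(run_waves(2, 1))               # count matches the 2 waves
    print(run_waves(2, 2, timeout=0.2))  # count too large: nobody released
```

In this toy model the barrier only releases when the loaded count matches the number of arriving waves, which is why I load 1 for my 2 waves. Whether the real hardware needs additional setup before it behaves like this is exactly my question.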

 

----------------------------------------------------------------------------------------

var dev:=cl.devices[1], kernel:=dev.NewKernel(asm_isa(
isa79xx
  numVgprs 256  numSgprs 104
  numThreadPerGroup 64             //workgroupsize=64
  oclBuffers 0,0

  s_mov_b64 exec,1                 //restrict to first local id
  s_cmpk_eq_i32 s2,0               //gid=0?
  s_cbranch_scc0 @skip
    v_mov_b32 v10,1                //I load 1 because there are 2 waves in total
    ds_gws_init v10 offset0:1 gds
    s_waitcnt lgkmcnt(0)
  @skip:
  __for__(i:=0 to 999, s_sleep 7)  //very long dummy code

  ds_gws_barrier v0 offset0:1 gds  //v0 is only a dummy 0
s_endpgm
));

writeln(kernel.ISACode);

with kernel.run(64*2 {2 waves}) do begin
  waitfor; writeln('elapsed: '&format('%.3f',elapsedtime_sec*1000)&' ms'); free; end;
kernel.free;

 

---------------------------------------------------------------------------------------

 

ShaderType = IL_SHADER_COMPUTE
TargetChip = t;
------------- SC_SRCSHADER Dump ------------------
SC_SHADERSTATE: u32NumIntVSConst = 0
SC_SHADERSTATE: u32NumIntPSConst = 0
SC_SHADERSTATE: u32NumIntGSConst = 0
SC_SHADERSTATE: u32NumBoolVSConst = 0
SC_SHADERSTATE: u32NumBoolPSConst = 0
SC_SHADERSTATE: u32NumBoolGSConst = 0
SC_SHADERSTATE: u32NumFloatVSConst = 0
SC_SHADERSTATE: u32NumFloatPSConst = 0
SC_SHADERSTATE: u32NumFloatGSConst = 0
fConstantsAvailable = 0
iConstantsAvailable = 0
bConstantsAvailable = 0
u32SCOptions[0] = 0x01A00000 SCOption_IGNORE_SAMPLE_L_BUG SCOption_FLOAT_DO_NOT_DIST SCOption_FLOAT_DO_NOT_REASSOC
u32SCOptions[1] = 0x00000000
u32SCOptions[2] = 0x20800001 SCOption_R800_UAV_NONARRAY_FIXUP SCOption_R1000_BYTE_SHORT_WRITE_WORKAROUND_BUG317611 SCOption_R1000_READLANE_SMRD_WORKAROUND_BUG343479
u32SCOptions[3] = 0x00000010 SCOption_R1000_BARRIER_WORKAROUND_BUG405404

; -------- Disassembly --------------------
shader main
asic(SI_ASIC)
type(CS)
  s_mov_b64     exec, 1             // 00000000: BEFE0481
  s_cmpk_eq_i32  s2, 0x0000         // 00000004: B1820000
  s_cbranch_scc0  label_0007        // 00000008: BF840004
    v_mov_b32     v10, 1              // 0000000C: 7E140281
    ds_gws_init   v10 offset:1 gds    // 00000010: D8660001 0000000A
    s_waitcnt     lgkmcnt(0)          // 00000018: BF8C007F
label_0007:

  [tons of] s_sleep       0x0007    // 00000FA0: BF8E0007

  ds_gws_barrier  v0 offset:1 gds   // 00000FBC: D8760001 00000000
s_endpgm                          // 00000FC4: BF810000
end

; ----------------- CS Data ------------------------
codeLenInByte        = 4040; Bytes
userElementCount     = 0;
extUserElementCount  = 0;
NumVgprs             = 256;
NumSgprs             = 104;
FloatMode            = 192;
IeeeMode             = 0;
ScratchSize          = 0;
  texResourceUsage[0]     = 0x00000000;
  texResourceUsage[1]     = 0x00000000
    ... all zeroes
fetch4ResourceUsage[7]  = 0x00000000
texSamplerUsage         = 0x00000000;
constBufUsage           = 0x00000000;
COMPUTE_PGM_RSRC2       = 0x00000084
COMPUTE_PGM_RSRC2:USER_SGPR      = 2
COMPUTE_PGM_RSRC2:TGID_X_EN      = 1
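
As a cross-check of the dump above, here is a small Python sketch that pulls the fields back out of the two DS instruction words and out of COMPUTE_PGM_RSRC2. The bit positions and the opcode numbers (25 for ds_gws_init, 29 for ds_gws_barrier) are what I read from the Southern Islands ISA document, so please treat them as my assumption rather than as a reference; the function names are hypothetical and only the fields relevant to this post are decoded.

```python
# Assumed SI DS layout for the first dword (from my reading of the ISA doc):
#   [31:26] encoding (0b110110), [25:18] op, [17] gds,
#   [15:8] offset1, [7:0] offset0
DS_ENCODING = 0b110110
DS_OPS = {25: "ds_gws_init", 29: "ds_gws_barrier"}  # only the two used here


def decode_ds_word0(word):
    """Split the first dword of a DS instruction into its fields."""
    op = (word >> 18) & 0xFF
    return {
        "is_ds": ((word >> 26) & 0x3F) == DS_ENCODING,
        "op": op,
        "name": DS_OPS.get(op, "unknown"),
        "gds": (word >> 17) & 1,
        "offset1": (word >> 8) & 0xFF,
        "offset0": word & 0xFF,
    }


def decode_compute_pgm_rsrc2(value):
    """Decode a few low fields of COMPUTE_PGM_RSRC2 (assumed layout)."""
    return {
        "SCRATCH_EN": value & 1,
        "USER_SGPR": (value >> 1) & 0x1F,
        "TGID_X_EN": (value >> 7) & 1,
    }


if __name__ == "__main__":
    for w in (0xD8660001, 0xD8760001):
        print(hex(w), decode_ds_word0(w))
    print(decode_compute_pgm_rsrc2(0x00000084))
```

With these assumptions both words decode as GDS operations with offset0 = 1, and 0x84 gives USER_SGPR = 2 and TGID_X_EN = 1, which matches the two COMPUTE_PGM_RSRC2 lines in the dump and is why I test s2 for the group id. So the encodings look fine to me, and the deadlock must come from somewhere else.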

