Re: Global synchronization inside the kernel
Yep, but that would give you only 10% occupancy, which could be slow. But if it's just a test that doesn't care about performance, then it doesn't matter.
Re: Global synchronization inside the kernel
Please note that these small numbers of waves are for the smallest GCN chip, which has only 10 CUs, not 32. With 40 threads it is possible to utilize all the 640 streams, but without any latency hiding...
Re: Global synchronization inside the kernel
With 60 threads (thanks for ds_gws_barrier) it was possible to put 6 waves into every CU, and this better tolerates the 'fat' instruction stream I'm planning to give them. Thanks for the data. Agree,...
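The wave counts quoted in these posts can be checked with a little arithmetic. This is only a sketch: it assumes the usual GCN layout (4 SIMDs per CU, 16 lanes per SIMD, at most 10 resident wavefronts per SIMD), none of which is stated in the thread, and the function names are made up for illustration.

```python
# Assumed GCN figures (not stated in the thread): 4 SIMDs per CU,
# 16 lanes per SIMD, at most 10 resident wavefronts per SIMD.
SIMDS_PER_CU = 4
LANES_PER_SIMD = 16
MAX_WAVES_PER_SIMD = 10


def lanes(num_cus):
    """Total number of ALU lanes ('streams') on a chip with num_cus CUs."""
    return num_cus * SIMDS_PER_CU * LANES_PER_SIMD


def waves_per_cu(total_waves, num_cus):
    """Wavefronts per CU when total_waves are spread evenly."""
    return total_waves / num_cus


def occupancy(total_waves, num_cus):
    """Fraction of the chip's wavefront slots that are in use."""
    max_waves = num_cus * SIMDS_PER_CU * MAX_WAVES_PER_SIMD
    return total_waves / max_waves


if __name__ == "__main__":
    print(lanes(10))             # lanes on a 10-CU chip
    print(waves_per_cu(40, 10))  # one wave per SIMD
    print(occupancy(40, 10))     # fraction of wave slots used by 40 waves
    print(waves_per_cu(60, 10))  # the 60-wave case from the post
```

Under these assumptions, 40 wavefronts on 10 CUs is exactly one wave per SIMD, which uses every lane but leaves no second wave on a SIMD to hide latency.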
Re: Global synchronization inside the kernel
vmiura wrote: "gcnc. What's that and where can we get it?"
Hi vmiura, the best answer is that it's my attempt at building a GCN hardware-specific C compiler/assembler that can run in AMD's OpenCL...
Re: Global synchronization inside the kernel
This new Jive platform finally works under IE10, but I for one cannot edit my messages, because it keeps importing my very first post of the topic, and I fear editing it, because I think it will edit...
Re: Global synchronization inside the kernel
Very inspirational post! How good it is to have arithmetic expressions and local functions with inline asm. Makes me wanna throw away macros and start to make something out of my Pascal parser. Now at...
Re: Re: Global synchronization inside the kernel
Oops, I made a mistake: I forgot to use GLC while checking the synchronization with the UAV. So 8 wavefronts / CU is possible with GWS, and beyond this it is a crash.
w/CU 4 5 6 7 8
MAD 29...
Re: Global synchronization inside the kernel
Wow, thanks for MAC, now I'm at 960 GFLOPS with 230 kHz synch. I do convolution most of the time, so that's the proper instruction. (Gotta memorize that mad = mad+mac+madak+madmk. Even in my Mandelbrot...
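To put the 960 GFLOPS and 230 kHz figures in context, they can be compared against a theoretical peak. This is a rough sketch under stated assumptions: 640 lanes (the 10-CU chip mentioned earlier in the thread), 2 FLOPs per lane per cycle for a multiply-add, and a 1000 MHz engine clock. The clock is an assumption and is not given in the post; the function names are illustrative.

```python
def peak_gflops(lanes, clock_mhz, flops_per_lane_per_cycle=2):
    """Theoretical peak throughput in GFLOPS (a multiply-add counts as 2 FLOPs)."""
    return lanes * flops_per_lane_per_cycle * clock_mhz / 1000.0


def cycles_between_syncs(clock_mhz, sync_khz):
    """Clock cycles available between two consecutive synchronizations."""
    return clock_mhz * 1000.0 / sync_khz


if __name__ == "__main__":
    # Assumed values: 640 lanes, 1000 MHz engine clock (not stated in the post).
    peak = peak_gflops(640, 1000)
    print(peak)                             # theoretical peak in GFLOPS
    print(960 / peak)                       # fraction of peak reached
    print(cycles_between_syncs(1000, 230))  # cycles per sync interval
```

With those assumed numbers the reported rate would be about three quarters of peak, with a few thousand cycles of work between synchronizations. A different clock changes both results proportionally.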
Re: Global synchronization inside the kernel
Here's how a 10-CU HD7770 'instrument' sounds in real time: https://soundcloud.com/realhet/gcn-piano-moonlight-mvt3-by (performed by vs120 on prog.hu). And I don't even use the synch yet, all strings are...