Topic: Why is there no PyCUDA port of POCLBM?

newbie
Activity: 28
Merit: 0
June 07, 2011, 05:16:30 AM
#7
There's something wrong if we have a card that can perform side by side with a 6950 (GTX 570)

The 570 is more similar to a 6970 than a 6950, just pointing that out :) Apart from that, as the wiki said, AMD cards are better because of their strong integer-crunching performance and higher number of shaders.
legendary
Activity: 2618
Merit: 1007
June 07, 2011, 05:00:52 AM
#6
There's something wrong if we have a card that can perform side by side with a 6950 (GTX 570) in 3D applications of parallel processing/SP calculation but only 1/4 as fast in decoding bitcoin blocks.
Bitcoin mining is an integer algorithm; neither DP nor SP floating-point power is needed there.
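
To make the "integer algorithm" point concrete: mining is repeated SHA-256, and the SHA-256 round functions use only 32-bit rotates, AND/XOR/NOT, and additions modulo 2^32. The sketch below writes those building blocks in plain Python. It is only an illustration of the operation mix, not code from poclbm or any miner, and the helper names are my own.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value inside 32 bits


def rotr(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(a):
    """SHA-256 Sigma0: three rotates combined with XOR."""
    return rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)


def big_sigma1(e):
    """SHA-256 Sigma1: three rotates combined with XOR."""
    return rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)


def ch(e, f, g):
    """SHA-256 Ch ("choose"): bits of f where e is 1, bits of g where e is 0."""
    return ((e & f) ^ (~e & g)) & MASK32


def maj(a, b, c):
    """SHA-256 Maj ("majority"): each output bit is the majority of the inputs."""
    return (a & b) ^ (a & c) ^ (b & c)


def add32(*words):
    """Addition modulo 2**32, the only arithmetic SHA-256 needs."""
    return sum(words) & MASK32
```

There is not a single floating-point operation in there, so SP/DP benchmark numbers say little about hashing speed. What matters is how cheaply a card can do rotates and bitwise logic.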

You can try to do the port yourself, though, as poclbm is open source. Good luck!
newbie
Activity: 54
Merit: 0
June 07, 2011, 03:43:26 AM
#5
I don't see any reason why the CUDA architecture, using the full parallel processing capabilities of each CUDA core, should be any slower than ATI cards, but hopefully someone here with a better understanding can figure things out and explain them.

Please read the Wiki:
https://en.bitcoin.it/wiki/Why_a_GPU_mines_faster_than_a_CPU#Why_are_AMD_GPUs_faster_than_Nvidia_GPUs?
full member
Activity: 219
Merit: 120
June 07, 2011, 02:30:10 AM
#4
I think the reason nobody has developed a pure CUDA kernel for Phoenix (or a CUDA port of poclbm) is that Nvidia cards are very poor miners compared to similarly priced ATI cards. As a result, the vast majority of Nvidia cards used for mining were likely bought for gaming first.

We were planning on making a CUDA kernel for Phoenix, but we didn't get very far before shifting focus to other areas. (BFI_INT implementation)
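
For anyone wondering what BFI_INT buys: as commonly described, it is an AMD bitfield-insert instruction that acts as a bitwise select, and the SHA-256 Ch function is exactly such a select, so one instruction can stand in for several. The Python below only checks that equivalence. `bitselect` is a hypothetical stand-in that emulates the select semantics; it is not AMD ISA and not the actual Phoenix kernel code.

```python
import random

MASK32 = 0xFFFFFFFF


def ch_reference(e, f, g):
    """SHA-256 Ch written the textbook way: AND, NOT, AND, XOR."""
    return ((e & f) ^ (~e & g)) & MASK32


def bitselect(mask, a, b):
    """Hypothetical emulation of a bitwise select:
    take bits of a where mask is 1, bits of b where mask is 0."""
    return ((mask & a) | (~mask & b)) & MASK32


def ch_three_ops(e, f, g):
    """A common software rewrite of Ch that needs no NOT: XOR, AND, XOR."""
    return (g ^ (e & (f ^ g))) & MASK32


def check_equivalence(trials=10000, seed=0):
    """Compare all three forms on random 32-bit inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        e, f, g = (rng.getrandbits(32) for _ in range(3))
        if not (ch_reference(e, f, g) == bitselect(e, f, g) == ch_three_ops(e, f, g)):
            return False
    return True
```

If `check_equivalence()` returns True, every Ch in the inner loop can be expressed as a single select, which is where the saving comes from on hardware that has such an instruction.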
legendary
Activity: 1484
Merit: 1005
June 07, 2011, 01:35:22 AM
#3
Yeah, but it's pretty clear that an nVidia card can do as many SP vector calculations as an ATI card when it comes to 3D applications... The performance we're seeing right now amounts to maybe 1 active thread per CUDA core, whereas we should be seeing a much greater speedup. Is it just because the DP performance of the ATI cards is so much faster? That's the one area where ATI really outstrips nVidia, but there must be some kind of fix for this.

There's something wrong if we have a card that can perform side by side with a 6950 (GTX 570) in 3D applications of parallel processing/SP calculation but only 1/4 as fast in decoding bitcoin blocks.
sr. member
Activity: 418
Merit: 250
June 07, 2011, 01:30:11 AM
#2
You can run phoenix poclbm/phatk on NVidia cards.

On my GTX570, rpcminer-cuda.exe gets 113 MH/s while phoenix poclbm gets 1-2 MH/s more, but makes the machine unusable (the desktop becomes unresponsive).

On my 8800GT, rpcminer-cuda.exe gets 24 MH/s while phoenix [unknown kernel] gets 31 MH/s
legendary
Activity: 1484
Merit: 1005
June 07, 2011, 01:15:55 AM
#1
Is my question.  Anyone using POCLBM with CUDA-enabled nVidia cards has probably noticed that their cards are only using a fraction of their possible output in terms of heat/electricity.  It's reasonable to think then that all of our transistors are not being used effectively or at all.

In particular, I am pretty sure the OpenCL implementation in POCLBM utilizes the GPU very poorly in terms of blocks/threads: http://llpanorama.wordpress.com/2008/06/11/threads-and-blocks-and-grids-oh-my/
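
For anyone unfamiliar with the blocks/threads terminology in that link: each GPU thread computes a global index from its block ID and its thread ID within the block, and a miner would typically use that index to pick which nonce to test. Here is the index arithmetic in plain Python. The function names are hypothetical, and this is only my assumption of a typical layout, not how POCLBM actually schedules its work.

```python
def global_index(block_idx, block_dim, thread_idx):
    """Global thread index, the same arithmetic a CUDA kernel does with
    blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def nonces_for_launch(base_nonce, grid_dim, block_dim):
    """List the nonce each thread would test in one kernel launch,
    assuming one nonce per thread and a 32-bit nonce that wraps around."""
    return [
        (base_nonce + global_index(b, block_dim, t)) & 0xFFFFFFFF
        for b in range(grid_dim)
        for t in range(block_dim)
    ]
```

For example, `nonces_for_launch(100, 4, 8)` covers nonces 100 through 131, one per thread. Tuning the grid and block sizes changes how many threads are in flight per launch, which is the knob I suspect is set badly for nVidia hardware.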

The PyCUDA documentation is here: http://documen.tician.de/pycuda/index.html#contents

I don't see any reason why the CUDA architecture, using the full parallel processing capabilities of each CUDA core, should be any slower than ATI cards, but hopefully someone here with a better understanding can figure things out and explain them.