
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 38. (Read 3426930 times)

sr. member
Activity: 462
Merit: 250
...
All I can say is that I'm out of ideas how to push it any further Smiley

It's very nice as is, stable as hell and I'm finding blocks pretty regularly. It's been months since I made any money with my GPUs  Smiley  So thanks again!
full member
Activity: 137
Merit: 100

Not much unless somebody (don't look at me) comes up with a significantly faster way to calculate sha256 hashes.

...blahblahblah...
Is it specific to spreadcoin ?
not sure I understand about which sha256 you are talking about... but yes your table is seriously fucked up  Grin

If it is the sha256 of the merkleroot, it is performed only on accepted hash so it couldn't be parallelized.

If it is something specific to spreadcoin, you may want to look at sha256 implementation in xcn (m7) it performs some sha256 calculation on very long hash... however yours should be faster because you know the length of the hash...


Spreadcoin specific, described in http://spreadcoin.net/files/SpreadCoin-WhitePaper.pdf on page 2 starting at "hashWholeBlock is a SHA-256 hash of the block data arranged as follows."

I think the sha256 implementation from your M7 was my first attempt at it, after spending a couple of days trying to figure out where the problem was I found out the sha256 code was giving incorrect results with the sha256 test vectors so I yanked one from ... Catia's version of the M7 miner. I do believe it's very strongly based on the same sha256 code (might even be your code originally, god knows) but I did get correct results from it so I went with it Smiley

Just to set the scale right, you describe the roughly 300 bytes that you need to process with sha256 to get the final hash as "very long hash" and with spreadcoin it is four hundred THOUSAND bytes, meaning 6250 calls to the transformation function "sha2_round_body" ... and one more for the padding. Which should explain why it's slow Smiley
it is very long compared to the usual hash...
The implementation is rather specific to the algo; to get it to work you would have to update the part which runs on the last round (however, the routine used to calculate one round should work...).

All in all I can't say the sha256 code is exactly slow though. The run times in my "table" (lol) are for 1048576 nonces, which means 1048576 / 64 = 16384 sha256 hashes of 400000 bytes each. And 16384*400000 bytes in 246 milliseconds is 24.8 gigabytes/second processed, unless I have a brainfart in my math. Sounds like a very nice number compared to http://www.extremetech.com/gaming/176785-nvidias-new-maxwell-powered-gtx-750-ti-is-hyper-efficient-quiet-a-serious-threat-to-amd/3 which quotes 15.16 GB/s for 750 Ti. It's just that the amount of data to process is huge compared to the usual "ok, now we hash this 512-bit value from the previous hash" scenario. All I can say is that I'm out of ideas how to push it any further Smiley
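For anyone who wants to double check the math above, here is a minimal sketch of the same arithmetic. It only uses numbers from this thread (1048576 nonces, 64 nonces per whole block hash, 400000 bytes per hash, 246.11 ms from the timing table) plus the standard 64-byte sha256 block size; the variable names are made up for illustration. It also shows that 24.8 is the figure in binary gigabytes (GiB/s); in decimal gigabytes the same throughput comes out a bit higher.

```python
# Sanity check of the whole-block sha256 throughput quoted in the post.
NONCES = 1_048_576        # nonces per timed run (from the post)
NONCES_PER_HASH = 64      # one whole-block hash is shared by 64 nonces
BYTES_PER_HASH = 400_000  # bytes fed to sha256 per whole-block hash
ELAPSED_S = 0.24611       # 246.11 ms, "sha256 whole block" in the table
SHA256_BLOCK_BYTES = 64   # sha256 consumes 64 bytes per compression call

hashes = NONCES // NONCES_PER_HASH
# 6250 data blocks plus one extra block for the padding, as described.
compress_calls = BYTES_PER_HASH // SHA256_BLOCK_BYTES + 1
total_bytes = hashes * BYTES_PER_HASH

gib_per_s = total_bytes / ELAPSED_S / 2**30   # binary gigabytes
gb_per_s = total_bytes / ELAPSED_S / 1e9      # decimal gigabytes
nonce_rate_mhs = NONCES / ELAPSED_S / 1e6     # effective MH/s of this stage

print(f"whole-block hashes per run : {hashes}")
print(f"compression calls per hash : {compress_calls}")
print(f"throughput                 : {gib_per_s:.1f} GiB/s ({gb_per_s:.1f} GB/s)")
print(f"effective nonce rate       : {nonce_rate_mhs:.2f} MH/s")
```

The effective nonce rate of this stage alone lands in the 4-5 MH/s range mentioned earlier in the thread.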

edit: maybe one way to increase the speed a bit (not a lot though) would be to change the way the hashes are packed (the xcn way gives a slight speed improvement over the standard way), especially as you are working on very long hashes

I assume you're referring to the coalesced reads of the input. Won't work here: the block data is the same for every thread, with the exception of the first two blocks that contain the changing part (nonce + minersignature), so it's just stored as a single 200000 byte block (that gets hashed over twice for a total of 400000 bytes processed) and the changing parts are replaced by the sha256 kernel for the first two blocks.
legendary
Activity: 1400
Merit: 1050

Not much unless somebody (don't look at me) comes up with a significantly faster way to calculate sha256 hashes.

...blahblahblah...
Is it specific to spreadcoin ?
not sure I understand about which sha256 you are talking about... but yes your table is seriously fucked up  Grin

If it is the sha256 of the merkleroot, it is performed only on accepted hash so it couldn't be parallelized.

If it is something specific to spreadcoin, you may want to look at sha256 implementation in xcn (m7) it performs some sha256 calculation on very long hash... however yours should be faster because you know the length of the hash...


Spreadcoin specific, described in http://spreadcoin.net/files/SpreadCoin-WhitePaper.pdf on page 2 starting at "hashWholeBlock is a SHA-256 hash of the block data arranged as follows."

I think the sha256 implementation from your M7 was my first attempt at it, after spending a couple of days trying to figure out where the problem was I found out the sha256 code was giving incorrect results with the sha256 test vectors so I yanked one from ... Catia's version of the M7 miner. I do believe it's very strongly based on the same sha256 code (might even be your code originally, god knows) but I did get correct results from it so I went with it Smiley

Just to set the scale right, you describe the roughly 300 bytes that you need to process with sha256 to get the final hash as "very long hash" and with spreadcoin it is four hundred THOUSAND bytes, meaning 6250 calls to the transformation function "sha2_round_body" ... and one more for the padding. Which should explain why it's slow Smiley
it is very long compared to the usual hash...
The implementation is rather specific to the algo; to get it to work you would have to update the part which runs on the last round (however, the routine used to calculate one round should work...).

edit: maybe one way to increase the speed a bit (not a lot though) would be to change the way the hashes are packed (the xcn way gives a slight speed improvement over the standard way), especially as you are working on very long hashes
full member
Activity: 137
Merit: 100

Not much unless somebody (don't look at me) comes up with a significantly faster way to calculate sha256 hashes.

...blahblahblah...
Is it specific to spreadcoin ?
not sure I understand about which sha256 you are talking about... but yes your table is seriously fucked up  Grin

If it is the sha256 of the merkleroot, it is performed only on accepted hash so it couldn't be parallelized.

If it is something specific to spreadcoin, you may want to look at sha256 implementation in xcn (m7) it performs some sha256 calculation on very long hash... however yours should be faster because you know the length of the hash...


Spreadcoin specific, described in http://spreadcoin.net/files/SpreadCoin-WhitePaper.pdf on page 2 starting at "hashWholeBlock is a SHA-256 hash of the block data arranged as follows."

I think the sha256 implementation from your M7 was my first attempt at it, after spending a couple of days trying to figure out where the problem was I found out the sha256 code was giving incorrect results with the sha256 test vectors so I yanked one from ... Catia's version of the M7 miner. I do believe it's very strongly based on the same sha256 code (might even be your code originally, god knows) but I did get correct results from it so I went with it Smiley

Just to set the scale right, you describe the roughly 300 bytes that you need to process with sha256 to get the final hash as "very long hash" and with spreadcoin it is four hundred THOUSAND bytes, meaning 6250 calls to the transformation function "sha2_round_body" ... and one more for the padding. Which should explain why it's slow Smiley
legendary
Activity: 1400
Merit: 1050
...
Don't go crazy now, donating everything you mine with the diff going up Tongue

Yeah, difficulty is really rising  Smiley Hope they can get on decent exchange soon.

Btw, any way you can get some more hash out of your miner? I saw that they now get 2.1 MH/s out of a 290X. Well, we still win on both watts and price, but wanted to ask  Smiley

Not much unless somebody (don't look at me) comes up with a significantly faster way to calculate sha256 hashes.

There's probably some kH/s to gain by taking a look at the X11 core parts that sp_ & crew have been improving, but as long as the huge honking whole block sha256 calculation is there at its current state, improving the core X11 part doesn't do much in the grand scheme of things.

There are basically three things that slow it down compared to regular X11. First is calculating the miner signature and that part already works effectively in the gigahashes/second range so making it faster wouldn't have much of an effect. Second is calculating the whole block hash, in essence sha256 over 400000 bytes of data for every 64 nonces being tested and that is where it gets slow, effectively around 4-5 MH/s. And the third and last slowdown is having to run the Blake compression function over two blocks where a single block is enough for regular X11, so Blake runs at roughly half of its usual 100+ MH/s speed in regular X11.

Removing the whole block hash calculation gets you 2.8 MH/s, compared to about 2.9 MH/s for the X11 part only. It's pretty easy to see that less than absolutely stellar improvements in the other parts won't make much difference in the total hash rate unless the whole block hash gets faster.

Edit: Tables are fun, gotta have tables. Timings for the separate parts that make up the SpreadX11 hash:

Part of hash            time (ms)
sha256 for signature         0.35
signature                    1.09
sha256 whole block         246.11
blake                       20.57
bmw                         11.57
groestl                     68.84
skein                        9.84
jh                          19.72
keccak                       6.74
luffa                       17.83
cubehash                    26.00
shavite                     43.72
simd                        66.74
echo                        78.91

Edit 2: Goddddamn that's one seriously fucked up way to render a simple table...


Is it specific to spreadcoin ?
not sure I understand about which sha256 you are talking about... but yes your table is seriously fucked up  Grin

If it is the sha256 of the merkleroot, it is performed only on accepted hash so it couldn't be parallelized.

If it is something specific to spreadcoin, you may want to look at sha256 implementation in xcn (m7) it performs some sha256 calculation on very long hash... however yours should be faster because you know the length of the hash...
full member
Activity: 137
Merit: 100
...
Don't go crazy now, donating everything you mine with the diff going up Tongue

Yeah, difficulty is really rising  Smiley Hope they can get on decent exchange soon.

Btw, any way you can get some more hash out of your miner? I saw that they now get 2.1 MH/s out of a 290X. Well, we still win on both watts and price, but wanted to ask  Smiley

Not much unless somebody (don't look at me) comes up with a significantly faster way to calculate sha256 hashes.

There's probably some kH/s to gain by taking a look at the X11 core parts that sp_ & crew have been improving, but as long as the huge honking whole block sha256 calculation is there at its current state, improving the core X11 part doesn't do much in the grand scheme of things.

There are basically three things that slow it down compared to regular X11. First is calculating the miner signature and that part already works effectively in the gigahashes/second range so making it faster wouldn't have much of an effect. Second is calculating the whole block hash, in essence sha256 over 400000 bytes of data for every 64 nonces being tested and that is where it gets slow, effectively around 4-5 MH/s. And the third and last slowdown is having to run the Blake compression function over two blocks where a single block is enough for regular X11, so Blake runs at roughly half of its usual 100+ MH/s speed in regular X11.

Removing the whole block hash calculation gets you 2.8 MH/s, compared to about 2.9 MH/s for the X11 part only. It's pretty easy to see that less than absolutely stellar improvements in the other parts won't make much difference in the total hash rate unless the whole block hash gets faster.

Edit: Tables are fun, gotta have tables. Timings for the separate parts that make up the SpreadX11 hash:

Part of hash            time (ms)
sha256 for signature         0.35
signature                    1.09
sha256 whole block         246.11
blake                       20.57
bmw                         11.57
groestl                     68.84
skein                        9.84
jh                          19.72
keccak                       6.74
luffa                       17.83
cubehash                    26.00
shavite                     43.72
simd                        66.74
echo                        78.91

Edit 2: Goddddamn that's one seriously fucked up way to render a simple table...
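
The timings in the table above can be totalled to see how much the whole block hash weighs on the final rate. This is only a rough sketch: the per-part times are copied from the table, while the run length of 1048576 nonces is an assumption taken from the author's follow-up reply in this thread, and the helper name `rate_mhs` is made up for illustration.

```python
# Per-part timings in milliseconds, copied from the table in the post.
timings_ms = {
    "sha256 for signature": 0.35,
    "signature": 1.09,
    "sha256 whole block": 246.11,
    "blake": 20.57,
    "bmw": 11.57,
    "groestl": 68.84,
    "skein": 9.84,
    "jh": 19.72,
    "keccak": 6.74,
    "luffa": 17.83,
    "cubehash": 26.00,
    "shavite": 43.72,
    "simd": 66.74,
    "echo": 78.91,
}

# Assumption: the timings cover 1048576 nonces (stated in a later reply).
NONCES = 1_048_576


def rate_mhs(elapsed_ms):
    """Convert a run time in ms into MH/s for NONCES nonces."""
    return NONCES / (elapsed_ms / 1000.0) / 1e6


total_ms = sum(timings_ms.values())
whole_block_ms = timings_ms["sha256 whole block"]
without_whole_block_ms = total_ms - whole_block_ms

print(f"total time            : {total_ms:.2f} ms")
print(f"whole block share     : {100 * whole_block_ms / total_ms:.1f} %")
print(f"rate with whole block : {rate_mhs(total_ms):.2f} MH/s")
print(f"rate without it       : {rate_mhs(without_whole_block_ms):.2f} MH/s")
```

Summing the parts like this ignores any overhead between kernels, so treat the resulting rates as ballpark figures next to the ones quoted in the post.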

sr. member
Activity: 462
Merit: 250
Ööööhhh... Recently....? Monero?  Wink Perhaps neoscrypt.


Disregarding market cap, SpreadCoin is a bit interesting.

But really no, I'm afraid
market cap is clearly the main problem in Christian's question  Grin
There are quite a few nvidia coins but market cap is something which should be forgotten at the moment.

Well I'm just eager to get more hash on SpreadCoin  Smiley I'm a selfish bastard, but I like that coin and hope they get on a proper exchange  Grin And I think it's one of the few profitable coins atm.
legendary
Activity: 1400
Merit: 1050
Ööööhhh... Recently....? Monero?  Wink Perhaps neoscrypt.


Disregarding market cap, SpreadCoin is a bit interesting.

But really no, I'm afraid
market cap is clearly the main problem in Christian's question  Grin
There are quite a few nvidia coins but market cap is something which should be forgotten at the moment.
sr. member
Activity: 462
Merit: 250
Ööööhhh... Recently....? Monero?  Wink Perhaps neoscrypt.


Disregarding market cap, SpreadCoin is a bit interesting.

But really no, I'm afraid
legendary
Activity: 1400
Merit: 1000
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?


It's been a while. I had to "RTFM" for the algos.

It seems like there have been no coins with those algos, besides scrypt coins.

Almost all the new altcoins seem to be ccminer.

Now if we can get cudaminer on neoscrypt to compete with AMD cards maybe we can reboot the cudaminer. I'll dedicate 3 750ti's to mine for a week straight for you on that.
sr. member
Activity: 329
Merit: 250
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?

monero, but you already know that Smiley

uh, successful? Not so sure. It seems like it's going to fizzle out.  Also, that launch isn't "recent" by any means. Wink

Christian

It'll come back. Today a web wallet was released and the price rose significantly, and when(?) the GUI wallet is released, marketing will start too...
I know it's an incredibly slow process, but are there any truly anonymous currency alternatives? I haven't seen any recent coin that has any reason to exist apart from coinshield, which should (hopefully) absorb all these useless and pointless coins...
hero member
Activity: 756
Merit: 502
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?

monero, but you already know that Smiley

uh, successful? Not so sure. It seems like it's going to fizzle out.  Also, that launch isn't "recent" by any means. Wink

Christian
sr. member
Activity: 329
Merit: 250
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?

monero, but you already know that Smiley
legendary
Activity: 1400
Merit: 1050
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?

I think you are asking too much  Grin
hero member
Activity: 756
Merit: 502
Can anyone name some CUDA mineable altcoins that have had a successful launch recently and actually achieved a decent market capitalization?
sr. member
Activity: 462
Merit: 250
...
Don't go crazy now, donating everything you mine with the diff going up Tongue

Yeah, difficulty is really rising  Smiley Hope they can get on decent exchange soon.

Btw, any way you can get some more hash out of your miner? I saw that they now get 2.1 MH/s out of a 290X. Well, we still win on both watts and price, but wanted to ask  Smiley
full member
Activity: 137
Merit: 100
Collective "thanks m8s" post for the SPR donations, cheers Smiley

Don't go crazy now, donating everything you mine with the diff going up Tongue
sr. member
Activity: 423
Merit: 250
What engine/clock are you running on the 270's to get those figures?

With my 270x Toxic's and djm34's miner, I was getting the same power figures but 2.6 Mh/s running 960/1175 core/mem @ 1050mv.

He runs his own personal, private kernels and talks about them as if they're readily available to everyone. No one else is getting those numbers.

That still isn't more efficient than maxwell either.
sr. member
Activity: 462
Merit: 250
www.dashpay.io
My gainward is pulling 40W and producing 2700 KHASH (modded miner). (Standard clocks)

67,5KHASH per watt.

http://www.gainward.com/main/vgapro.php?id=926&lang=en


40W for the whole system?

you can't beat maxwell deal with it  Grin

Close enough. My system uses 55W. 165W - 55W = 110W for 2x270X.

Meaning, each 270X is doing 3300 - 3400kh/s X11 for 55W each.

What engine/clock are you running on the 270's to get those figures?

With my 270x Toxic's and djm34's miner, I was getting the same power figures but 2.6 Mh/s running 960/1175 core/mem @ 1050mv.
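
The per-watt figures being thrown around above can be compared with a few lines of arithmetic. This is just a sketch using the numbers quoted in these posts; reading "My system uses 55W" as the draw of everything except the two cards is my interpretation, and all names here are made up for illustration.

```python
def khs_per_watt(khs, watts):
    """Hash rate per watt, in kH/s per W."""
    return khs / watts


# Gainward card: 2700 kH/s at 40 W, as quoted.
gainward = khs_per_watt(2700, 40)

# 270X rig: 165 W at the wall, 55 W assumed to be the rest of the
# system, leaving the remainder split across two cards.
WALL_W = 165
REST_OF_SYSTEM_W = 55  # interpretation of "My system uses 55W"
CARDS = 2
per_card_w = (WALL_W - REST_OF_SYSTEM_W) / CARDS

r270x_low = khs_per_watt(3300, per_card_w)
r270x_high = khs_per_watt(3400, per_card_w)

print(f"gainward : {gainward:.1f} kH/s per W")
print(f"270X     : {r270x_low:.1f} to {r270x_high:.1f} kH/s per W at {per_card_w:.0f} W each")
```

Note that the two sets of numbers are measured differently (card-only claim versus wall power minus the rest of the system), so the comparison is only indicative.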
sr. member
Activity: 462
Merit: 250
300-400 MH/s nethashrate on SpreadCoin, tsiv gives us an nvidia advantage and only 3 guys are donating to him - come on guys, we could do better!

SPR: SfSEcVQGhbXvPQ2hkTj3vxSd9PEZA12efa
BTC: 1QD25HSCF8EAxUTYj2XsXZNGBi7RvQ21p8

edit. send some more bucks out too

Thanks, sent some more  Smiley