Topic: [ANN] SpreadCoin | Decentralize Everything (decentralized blockexplorer coming) - page 223. (Read 790391 times)

newbie
Activity: 44
Merit: 0
Can't connect to spr.suprnova.cc
full member
Activity: 202
Merit: 100
Bittrex:

This market is in danger of de-listing due to low trade volume and lack of user interest. It may be removed on December 4th unless the average daily trade volume for the last 7 days exceeds 0.2 BTC
legendary
Activity: 2870
Merit: 1091
--- ChainWorks Industries ---
ambercoin road door to door promotion
is very good innovation
 Wink  Wink  Wink

hey - door to door isn't a bad thing ...

avon - amway - encyclopaedia britannica - just to name a few ...

why not ... i mean i won't - but why not Tongue ...

#crysx
newbie
Activity: 49
Merit: 0
ambercoin road door to door promotion
is very good innovation
 Wink  Wink  Wink
hero member
Activity: 646
Merit: 501
Ni dieu ni maître
Has there been any progress made on funding the project?
legendary
Activity: 1456
Merit: 1000
Probably more, but I don't want to look that deeply into it. Tongue

You've done enough. Thanks for your assessment.
I will take over from here.

 Wink

He's an artist at work  Wink
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
Probably more, but I don't want to look that deeply into it. Tongue

You've done enough. Thanks for your assessment.
I will take over from here.

 Wink
member
Activity: 81
Merit: 1002
It was only the wind.
Looks like I oopsed, they're not hashing the same thing over 3k times, they're hashing an INSANE amount of data. What the fuck is all of this...?
member
Activity: 81
Merit: 1002
It was only the wind.
If you know C, you should already be able to see how ridiculous most of the code is. Echo is really bad.

I'm fighting with priorities on a daily basis, but I'll take a look at it soon.

You talking about both the AMD and NVidia version here?
They were created by different people as far as I know.


Only looked at the OpenCL - not the CUDA one.
legendary
Activity: 2870
Merit: 1091
--- ChainWorks Industries ---
Can someone shed some light on this:

Code:
    uint64_t signature8[5];
    signature8[0] = psign[0];
    signature8[1] = psign[8];
    signature8[2] = psign[16];
    signature8[3] = psign[24];
    signature8[4] = psign[32];

    uint64_t signature[4];
    signature[0] = (DEC64LEng(psign +  0) >> 8) | (signature8[1] << 56);
    signature[1] = (DEC64LEng(psign +  8) >> 8) | (signature8[2] << 56);
    signature[2] = (DEC64LEng(psign + 16) >> 8) | (signature8[3] << 56);
    signature[3] = (DEC64LEng(psign + 24) >> 8) | (signature8[4] << 56);

    signature8[1] = signature[0] >> 56;
    signature8[2] = signature[1] >> 56;
    signature8[3] = signature[2] >> 56;
    signature8[4] = signature[3] >> 56;

    signbe[0] = SWAP8((signature[0] << 8) | signature8[0]);
    signbe[1] = SWAP8((signature[1] << 8) | signature8[1]);
    signbe[2] = SWAP8((signature[2] << 8) | signature8[2]);
    signbe[3] = SWAP8((signature[3] << 8) | signature8[3]);
    signbe[4] = (signature8[4] << 56) | 0x80000000000000;

Just... what is it even doing?

Got it. Fun fact, that can be replaced by this:

Code:
for(int i = 0; i < 4; ++i) signbe[i] = SWAP8(((ulong *)psign)[i]);

signbe[4] = (((ulong)psign[32]) << 56) | 0x80000000000000;

Whoever wrote that needs to stay away from the alcohol...
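
For anyone who wants to convince themselves that the two versions above really match, here is a minimal host-side sketch in plain C. The helpers `dec64le` and `swap8` are hypothetical stand-ins for the kernel's `DEC64LEng` and `SWAP8` (assumed to be a little-endian 64-bit load and a 64-bit byte swap), and `psign` is assumed to point at 33 or more bytes. The short version here uses byte-wise loads rather than the `(ulong *)` cast, so it does not depend on host endianness or alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for DEC64LEng (assumption): read 8 bytes as little-endian. */
static uint64_t dec64le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];
    return v;
}

/* Stand-in for SWAP8 (assumption): reverse the bytes of a 64-bit value. */
static uint64_t swap8(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* The long version, transcribed step by step in the same order. */
static void signbe_long(const uint8_t *psign, uint64_t signbe[5])
{
    uint64_t signature8[5], signature[4];
    for (int i = 0; i < 5; ++i)
        signature8[i] = psign[8 * i];
    for (int i = 0; i < 4; ++i)
        signature[i] = (dec64le(psign + 8 * i) >> 8) | (signature8[i + 1] << 56);
    for (int i = 0; i < 4; ++i)
        signature8[i + 1] = signature[i] >> 56;
    for (int i = 0; i < 4; ++i)
        signbe[i] = swap8((signature[i] << 8) | signature8[i]);
    signbe[4] = (signature8[4] << 56) | 0x80000000000000ULL;
}

/* The short replacement: byte-swap four 64-bit words, then pad. */
static void signbe_short(const uint8_t *psign, uint64_t signbe[5])
{
    for (int i = 0; i < 4; ++i)
        signbe[i] = swap8(dec64le(psign + 8 * i));
    signbe[4] = ((uint64_t)psign[32] << 56) | 0x80000000000000ULL;
}

/* Runs both versions on an arbitrary 33-byte pattern; returns 1 if they agree. */
static int signbe_versions_agree(void)
{
    uint8_t psign[33];
    uint64_t a[5], b[5];
    for (int i = 0; i < 33; ++i)
        psign[i] = (uint8_t)(i * 37 + 11);
    signbe_long(psign, a);
    signbe_short(psign, b);
    for (int i = 0; i < 5; ++i)
        assert(a[i] == b[i]);
    return 1;
}
```

Calling `signbe_versions_agree()` from a `main` is enough to run the check. The long version shifts each word right by 8, patches in the next byte, then shifts left by 8 and patches the first byte back in, which ends up at the original little-endian word.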

hahaha ...

you will find the main drive for the creation of most crypto IS alcohol ... not innovation Wink ...

just kidding ...

that's awesome stuff, wolf ...

#crysx
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
SHA256d, at least the code itself, probably isn't going to get too much faster. HOWEVER, it can be improved, I think, by improving the structure of the kernel. It's partially unrolled, possibly wasting space. There's a tradeoff in rolling it up - I'll have to branch, or use conditional moves - but I'm pretty sure it'll be WELL worth it to decrease register usage and shrink code size.
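
To illustrate the tradeoff described above, here is a rough sketch of a fully rolled SHA-256 compression function in plain C. It is not the miner's kernel; the function names are made up for this example. The message schedule lives in a 16-word circular buffer instead of a 64-word array, and the price is the `if (t >= 16)` branch inside the loop.

```c
#include <stdint.h>

static const uint32_t K[64] = {
    0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
    0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
    0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
    0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
    0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
    0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
    0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
    0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
};

static uint32_t rotr(uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

/* One compression call, all 64 rounds in a single rolled loop. */
static void sha256_compress_rolled(uint32_t state[8], const uint32_t block[16])
{
    uint32_t w[16], v[8];
    for (int i = 0; i < 16; ++i) w[i] = block[i];
    for (int i = 0; i < 8; ++i) v[i] = state[i];

    for (int t = 0; t < 64; ++t) {
        if (t >= 16) {
            /* Extend the schedule in place: w[t & 15] still holds W[t-16]. */
            uint32_t w15 = w[(t + 1) & 15];   /* W[t-15] */
            uint32_t w2  = w[(t + 14) & 15];  /* W[t-2]  */
            uint32_t s0 = rotr(w15, 7) ^ rotr(w15, 18) ^ (w15 >> 3);
            uint32_t s1 = rotr(w2, 17) ^ rotr(w2, 19) ^ (w2 >> 10);
            w[t & 15] += s0 + w[(t + 9) & 15] + s1;  /* + W[t-7] */
        }
        uint32_t a = v[0], b = v[1], c = v[2], e = v[4], f = v[5], g = v[6];
        uint32_t t1 = v[7] + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
                    + ((e & f) ^ (~e & g)) + K[t] + w[t & 15];
        uint32_t t2 = (rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
                    + ((a & b) ^ (a & c) ^ (b & c));
        for (int i = 7; i > 0; --i) v[i] = v[i - 1];  /* rotate working variables */
        v[4] += t1;       /* e = d + t1 */
        v[0] = t1 + t2;   /* a = t1 + t2 */
    }
    for (int i = 0; i < 8; ++i) state[i] += v[i];
}

/* Hash one already-padded 64-byte block (as 16 big-endian words). */
static void sha256_single_block(const uint32_t block[16], uint32_t out[8])
{
    static const uint32_t H0[8] = {
        0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,
        0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19
    };
    for (int i = 0; i < 8; ++i) out[i] = H0[i];
    sha256_compress_rolled(out, block);
}
```

Whether this form actually wins on a GPU depends on how the compiler handles the branch and the indexed `w[]` accesses, so it would need profiling rather than being taken on faith.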

Thanks again for providing this insight.

PS: No hamsters in my collection that I recall, but I *do* have a cute mouse: https://ottrbutt.com/tmp/3121bcd0f67852c01ae4a582bd4ab24e.jpg
It doesn't work for mr. spread if she has no "cheek pouches"... thanks but no thanks.  Tongue
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
Not quite - you need to take into consideration that the massive number of SHA256 hashes take MOST of the time. So the REMAINING code can probably be doubled in speed, and I'm not sure what I can do with the signature yet.

Right now, signature2 is not looking good - https://ottrbutt.com/tmp/spreadx11-sig2-analysis.png -- it's bigger than the code cache by a lot (code cache on GCN is 32KiB), and uses enough registers to limit it to one wave in flight. Since the kernel also uses some memory, it probably would benefit from more waves in flight.
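
As a back-of-the-envelope illustration of why register usage caps the waves in flight, here is a simplified occupancy model in C. It assumes the commonly cited GCN figures of 256 vector registers per SIMD lane, allocation in groups of 4, and at most 10 waves per SIMD. It ignores scalar register and local memory limits, and `gcn_waves_per_simd` is a made-up name for this sketch, not a vendor API.

```c
#include <stdint.h>

/* Simplified model (assumption): 256 VGPRs per lane, allocated in
   groups of 4, hard cap of 10 waves per SIMD. Ignores SGPR/LDS limits. */
static unsigned gcn_waves_per_simd(unsigned vgprs_per_workitem)
{
    unsigned alloc = (vgprs_per_workitem + 3u) & ~3u;  /* round up to 4 */
    if (alloc == 0)
        return 10;
    unsigned waves = 256u / alloc;
    return waves > 10 ? 10 : waves;
}
```

Under this model, anything above 128 vector registers per work-item leaves room for only one wave per SIMD, which is consistent with the situation described for signature2.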

Interesting,
so SHA256d is THE problem, although SPH is the most obvious thing that can be improved.

I need to look into this CodeXL thing to analyze kernels.

PS: mr. spread asks if you have any NSFW pics of naked hamster girls.  Cheesy
member
Activity: 81
Merit: 1002
It was only the wind.

No problem. The OpenCL is still slow, the current algorithm can be implemented a lot better, even if not changed.

Well, I don't think the current algorithm needs to be changed at the core,
except to get rid of inefficiencies or to make the "pool-prevention"-mechanism stronger (which is certainly necessary).

If you have any ideas, let us know!



More splitting of the kernels, replacing most of the X11 code. I think the AMD compiler will derp out if some kernels are separated, but more can be split without error.

One of these days, I'll need to dive into GPU/ OpenCL programming too... it giggity-geeks the hell out of me already, lol!  Grin

If you know C, you should already be able to see how ridiculous most of the code is. Echo is really bad.
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
It's not really the SHA256d that's bad, it's the REST of X11.

You mean the way the whole SPH library has been ported to OpenCL, right?

If you wanna call it that. It hasn't been ported so much as copypasted, and on top of this, SPH is a BAD library to use for any kind of speed-critical application. Its main purpose is to be portable across a wide range of CPUs, not perform well.

I understand now, thanks.
This means that the efficiency can probably be increased tenfold, I would guess... wow!
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
It's not really the SHA256d that's bad, it's the REST of X11.

You mean the way the whole SPH library has been ported to OpenCL, right?
member
Activity: 81
Merit: 1002
It was only the wind.

No problem. The OpenCL is still slow, the current algorithm can be implemented a lot better, even if not changed.

Well, I don't think the current algorithm needs to be changed at the core,
except to get rid of inefficiencies or to make the "pool-prevention"-mechanism stronger (which is certainly necessary).

If you have any ideas, let us know!



More splitting of the kernels, replacing most of the X11 code. I think the AMD compiler will derp out if some kernels are separated, but more can be split without error.
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
I think it's a good idea, on paper. Tx's are given priority, the rest is there to make mining equal. With near full blocks, mining should be easy.

Perhaps we should cut the block sizes down to 1 tx  Grin

1 tx blocks, moving up to 1 MB blocks by next year. Maybe we should just organise hard forks every 6 months.

 Cheesy

One thing is for sure.
If we keep the algo as it is, and increase the block size to, say, 2 Megabytes (10x), this will also make the padding / hashWholeBlock calculation 10x heavier on your GPU.
 Shocked
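
To put rough numbers on that: SHA-256 works through its input in 64-byte chunks, so the work grows linearly with the padded block size. A minimal sketch in C, where `sha256_compress_calls` is a made-up helper and the byte counts used with it are assumptions, not values read from the SpreadCoin source:

```c
#include <stdint.h>

/* Number of 64-byte compression calls SHA-256 needs for len bytes:
   the message gets one 0x80 byte and an 8-byte length field appended,
   then is rounded up to a multiple of 64 bytes. */
static uint64_t sha256_compress_calls(uint64_t len)
{
    return (len + 1 + 8 + 63) / 64;
}
```

Assuming the padded block is 200000 bytes, this gives 3126 compression calls per pass, and 31251 for 2000000 bytes. The second SHA-256 of a double hash only adds one more call, since it runs over the 32-byte digest.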

I wonder if this algo can be reduced in complexity while still maintaining the same results.
(But we can also introduce new complexity if it helps make everything MORE anti pool and pro solo-mining)

After all, a solo-miner is always also a full node. Bingo!
legendary
Activity: 1484
Merit: 1007
spreadcoin.info
What I don't get is why the AMD miner is so much worse than nvidia; and it has problems too.

I can't really judge it (yet), but it's probably badly implemented.

It's largely SPH code... really, really bad. About the same as the original darkcoin-mod.

EDIT: Idea! What's the padding made out of? Maybe I can shortcut the memory usage!

What I want to find out is how many times we actually need to calculate those double-SHA256 hashes for the whole 200 KBytes.
Maybe we can skip / hold a few iterations.
member
Activity: 81
Merit: 1002
It was only the wind.

If it was an error, it wouldn't get the right hash. It's correct.

I meant to say "bug", not error, of course.


Not a bug - if I change/remove it, wrong hashes.

well, if - as you say - it causes a horrible inefficiency, then it's a bug in my book.

Hey, thanks for taking the time to look into this, BTW.

No problem. The OpenCL is still slow, the current algorithm can be implemented a lot better, even if not changed.
legendary
Activity: 1456
Merit: 1000
That looks like he's standing in front of an ASIC with vents. Has Mr. Spread already created an ASIC for SpreadX11?

Yep!
It's a hamster dung powered wooden ASIC alright.
He created it himself, just with the stuff that was lying around...  Huh

Dude... He's ripping you off, stealing your electricity.

Quote
// Fill rest of the buffer to ensure that there is no incentive to mine small blocks without transactions.

So, that's clear now... the padding is to make mining equal. It just needs a better implementation, but still with the padding, I guess.