4 hashes parallel on SSE2 CPUs for 0.3.6 - page 2.

aceat64

full member

Activity: 307

Merit: 102

I just tried again with the latest version from your github. I'm still seeing a drop in performance compared to the vanilla source.

My system went from ~7100 to ~4200.

This particular system has dual Intel Xeon Quad-Core CPUs (E5335) @ 2.00GHz.

tcatm

sr. member

Activity: 337

Merit: 285

Quote from: impossible7 on August 07, 2010, 05:51:07 PM

I can confirm that the patch now works just fine. I just generated my first 50 BTC with it. And since this patch doubles the speed I think it's only fair if I donated half of that to tcatm.

That's nice to hear. Thanks for the donation and thanks to everyone else who donated!

impossible7

newbie

Activity: 18

Merit: 0

I can confirm that the patch now works just fine. I just generated my first 50 BTC with it. And since this patch doubles the speed I think it's only fair if I donated half of that to tcatm.

satoshi

founder

Activity: 364

Merit: 7553

Quote from: impossible7 on August 06, 2010, 06:37:20 AM

CRITICAL_BLOCK is a macro that contains a for loop. The assertion failure indicates that break has been called inside the body of the loop. The only break statement in this block is in line 2762. In the original source file, there is no break statement in this critical block. I think you must remove lines 2759-2762. There is nothing like that in the original main.cpp.

Sorry about that. CRITICAL_BLOCK isn't perfect. You have to be careful not to break or continue out of it. There's an assert that catches and warns about break. I can be criticized for using it, but the syntax would be so much more bloated and error prone without it.

Is there a chance the SSE2 code is slow on Intel because of some quirk that could be worked around? For instance, if something works but is slow if it's not aligned, or thrashing the cache, or one type of instruction that's really slow? I'm not sure how available it is, but I think Intel used to have a profiler for profiling on a per instruction level. I guess if tcatm doesn't have a system with the slow processor to test with, there's not much hope. But it would be really nice if this was working on most CPUs.
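To make the alignment question concrete, here is a minimal sketch of one SHA-256 style operation (rotate right by 7) applied to four 32-bit words at once with SSE2 intrinsics. It is not code from tcatm's patch: the function names are hypothetical, and whether alignment is what slows the patch on Intel is not established in this thread. It only shows where the aligned load `_mm_load_si128` and the unaligned load `_mm_loadu_si128` differ.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics; x86 and x86-64 only

// Rotate each of the four 32-bit lanes right by N bits (0 < N < 32).
// SSE2 has no rotate instruction, so it is built from two shifts and an OR.
template <int N>
inline __m128i rotr_lanes(__m128i x)
{
    return _mm_or_si128(_mm_srli_epi32(x, N), _mm_slli_epi32(x, 32 - N));
}

// Scalar reference, useful for cross-checking the vector path (0 < n < 32).
inline std::uint32_t rotr_scalar(std::uint32_t x, int n)
{
    return (x >> n) | (x << (32 - n));
}

// Load four 32-bit words. The aligned form requires a 16-byte aligned
// address; the unaligned form accepts any address.
inline __m128i load_four_words(const std::uint32_t* p)
{
    const bool aligned = reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
    return aligned ? _mm_load_si128(reinterpret_cast<const __m128i*>(p))
                   : _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
}

// Hypothetical helper: rotates four independent words right by 7 bits.
void rotr7_four_words(const std::uint32_t* in, std::uint32_t* out)
{
    const __m128i v = load_four_words(in);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), rotr_lanes<7>(v));
}
```

Comparing the output of `rotr7_four_words` against `rotr_scalar` for each word is the same kind of cross-check the miner's assert performs on the full hash.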

tcatm

sr. member

Activity: 337

Merit: 285

Thanks! That probably got mixed up when I patched the file with an older diff. I fixed the git: http://github.com/tcatm/bitcoin-cruncher

impossible7

newbie

Activity: 18

Merit: 0

CRITICAL_BLOCK is a macro that contains a for loop. The assertion failure indicates that break has been called inside the body of the loop. The only break statement in this block is in line 2762. In the original source file, there is no break statement in this critical block. I think you must remove lines 2759-2762. There is nothing like that in the original main.cpp.
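The pattern described here can be sketched as follows. This is an illustration only, not the CRITICAL_BLOCK definition from main.cpp: the macro name `LOCKED_BLOCK`, the `first_pass` flag, and the use of `std::mutex` are all assumptions made for the example. The outer loop runs once and asserts in its increment step that the inner loop ended normally; the inner loop holds the lock for one pass over the body.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical macro, modeled on the description in this thread.
// Normal exit: the inner loop's increment clears first_pass, so the
// assert in the outer loop's increment passes.
// `break` in the body: the inner increment is skipped, first_pass is
// still true, and the assert fires.
#define LOCKED_BLOCK(mtx)                                              \
    for (bool first_pass = true; first_pass;                           \
         assert(!first_pass && "break caught by LOCKED_BLOCK!"),       \
         first_pass = false)                                           \
        for (std::lock_guard<std::mutex> guard(mtx); first_pass;       \
             first_pass = false)

std::mutex g_counter_mutex;  // illustrative shared state
int g_counter = 0;

// Increments the counter while holding the lock and returns the new value.
int increment_counter()
{
    int result = 0;
    LOCKED_BLOCK(g_counter_mutex)
    {
        result = ++g_counter;
        // Do not `break` or `continue` here. A `break` trips the assert;
        // a `continue` silently skips the rest of the body.
    }
    return result;
}
```

The appeal of the macro is that the lock is released on every exit path from the braces, at the cost of loop keywords behaving unexpectedly inside the body.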

tcatm

sr. member

Activity: 337

Merit: 285

Oh, that's a part in the code my patch doesn't touch. You could try to remove line 2741 (CRITICAL_BLOCK(cs_main)).

impossible7

newbie

Activity: 18

Merit: 0

Here's the output on stderr:

Code:

bitcoind: main.cpp:2741: void BitcoinMiner(): Assertion `("break caught by CRITICAL_BLOCK!", !fcriticalblockonce)' failed.

knightmb

sr. member

Activity: 308

Merit: 258

Quote from: tcatm on August 05, 2010, 10:38:53 PM

Is the khash/s you get worth it anymore at 352 difficulty? I'm only getting a block once a week now. If you want to keep the block chain working, you should use the original client. If you want to gain lots of bitcoins you should use a GPU.

When the difficulty changed, the first machine in my group to generate a block was the slowest one running the stock client (800 MHz E-machine) and it hadn't generated anything in over a week itself. So I guess every little bit helps, and that's why so many are interested in getting this to work on their machines.

tcatm

sr. member

Activity: 337

Merit: 285

Yes, test is single threaded. Is there any output on stderr? From the coredumps I can tell that there must be some output.

The problem seems to be hard to debug, though. Is the khash/s you get worth it anymore at 352 difficulty? I'm only getting a block once a week now. If you want to keep the block chain working, you should use the original client. If you want to gain lots of bitcoins you should use a GPU.
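The "once a week" figure can be sanity-checked with a little arithmetic. The sketch below assumes the usual approximation that one unit of difficulty corresponds to about 2^32 hashes on average; the function names are hypothetical, and the 2204 khash/s input is simply the single-threaded test result reported elsewhere in this thread.

```cpp
#include <cassert>

// Expected seconds to find one block at a given difficulty and hash rate.
// Assumption: each unit of difficulty costs about 2^32 hashes on average.
double expected_seconds_per_block(double difficulty, double hashes_per_second)
{
    const double hashes_per_difficulty_unit = 4294967296.0;  // 2^32
    return difficulty * hashes_per_difficulty_unit / hashes_per_second;
}

// Same estimate, with the hash rate in khash/s and the result in days.
double expected_days_per_block(double difficulty, double khash_per_second)
{
    const double seconds_per_day = 86400.0;
    return expected_seconds_per_block(difficulty, khash_per_second * 1000.0)
           / seconds_per_day;
}
```

Calling `expected_days_per_block(352.0, 2204.0)` gives a little under eight days, which is in line with one block per week. This is an average only; individual blocks can arrive much sooner or later.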

impossible7

newbie

Activity: 18

Merit: 0

The test program does not fail

Code:

$ ./test ../blocks.txt 
SHA256 test started
70293
found solutions = 70293
total hashes = 139463136
total time = 63250 ms
average speed: 2204 khash/s

Does the test program run on a single thread?

Finally, I have the same problem with both -march=amdfam10 and -march=native. The CPU is an Opteron 2374.

tcatm

sr. member

Activity: 337

Merit: 285

That's not a real crash this time. It's an assert that fails in the miner. Most likely assert(hash == pblock->GetHash());. Can you run the test program (explained in https://bitcointalksearch.org/topic/m.7096)? If it fails, can you change -march=amdfam10 back to -march=native in makefile.unix, rm cryptopp/obj/*.o and recompile everything? What CPU are you running it on?

impossible7

newbie

Activity: 18

Merit: 0

Ok I tried again, this time with no extra patches. I just cloned your git tree and compiled it. It still crashes. Here's the stack backtrace:

Code:

#0  0x00007ffff710b1b5 in raise () from /lib/libc.so.6
#1  0x00007ffff710c5e0 in abort () from /lib/libc.so.6
#2  0x00007ffff71042d1 in __assert_fail () from /lib/libc.so.6
#3  0x00000000004628de in BitcoinMiner () at main.cpp:2741
#4  0x0000000000462d70 in ThreadBitcoinMiner (parg=0x391e) at main.cpp:2518
#5  0x00007ffff6ec3894 in start_thread () from /lib/libpthread.so.0
#6  0x00007ffff71aa07d in clone () from /lib/libc.so.6

I have also uploaded ~~here~~ (link removed) the sources with the object files, as well as a core dump and the binary.

impossible7

newbie

Activity: 18

Merit: 0

I had 5 machines running today and when I checked back 10 hours later, 4 of them had crashed in the same way as before (i.e. right after they had generated a new block but before they broadcast it).

I created a tarball containing the coredump, backtrace, binary and the sources I used to compile it including the compiled object files. You can get it from ~~here~~ (link removed). Hope that helps.

EDIT: Ok, I just noticed that both your patch and the patch for the getkhps rpc (from http://www.alloscomp.com/bitcoin/) modify the function BitcoinMiner in main.cpp (which is where the segfault occurs), so this must be the reason for the segfaults. I will try to test it without the getkhps patch.

tcatm

sr. member

Activity: 337

Merit: 285

Oh, that's easy: Get two nodes on a separate network and connect them using -connect=other_nodes_ip and a separate/empty datadir.

vess

full member

Activity: 141

Merit: 100

Just a comment that this would be easier to test if difficulty were set to '1' in the client.

tcatm

sr. member

Activity: 337

Merit: 285

GDB can also generate coredumps with the command generate-core-file. It might be useful to reconstruct the cause of the segfault. Please note that the coredump might include your wallet, so it's probably a good idea to run bitcoind on a separate datadir.

impossible7

newbie

Activity: 18

Merit: 0

I modified bitcoind so that it doesn't fork to the background and now I can debug it with gdb. Next time it crashes gdb will give me a backtrace.

tcatm

sr. member

Activity: 337

Merit: 285

Did your kernel write a coredump and if so can you mail me the binary + coredump to [email protected]?
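Whether the kernel writes a core file depends partly on the process's core-size resource limit, which is often zero by default. As a rough sketch, a program on a POSIX system could raise its own soft limit to the hard limit at startup. The function name `enable_core_dumps` is hypothetical and nothing like it is claimed to exist in bitcoind; other system settings can still prevent a dump from being written.

```cpp
#include <cassert>
#include <sys/resource.h>  // getrlimit, setrlimit (POSIX)

// Raises the soft core-file size limit to the hard limit for this process.
// Returns true if both system calls succeed.
bool enable_core_dumps()
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return false;
    limit.rlim_cur = limit.rlim_max;  // soft limit may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

If the hard limit itself is zero, the call still succeeds but no core file will be produced, so the limit is worth checking before waiting for the next crash.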

impossible7

newbie

Activity: 18

Merit: 0

It did segfault according to dmesg:

Code:

bitcoind[2469]: segfault at 0 ip 00007fe92c5b3f32 sp 00007fff15e5f6b0 error 4 in libc-2.11.2.so[7fe92c57e000+150000]

I don't have a stack trace.
