Butterfly Labs - Bitforce Single and Mini Rig Box - page 80.

Energizer

sr. member

Activity: 273

Merit: 250

Quote from: Inspector 2211 on February 27, 2012, 04:04:11 PM

Quote from: zefir on February 27, 2012, 02:03:43 PM

Quote from: Epoch on February 27, 2012, 01:26:48 PM

Quote from: jddebug on February 26, 2012, 09:32:58 PM

I got an email back from Sonny the other day. He's saying late this week for my singles.

I run Mac. Wondering if cgminer will compile for me and support the singles?

@jddebug, when did you place that order? I'm guessing November or December? Would give us an idea of how far along BFL is with their backlog. Undecided

Not jddebug, but hopefully this helps clarify: I placed my order at the end of January and received wire transfer information the same day. Payment from Europe took about one week and Sonny confirmed receipt immediately, giving me a projected delivery date of mid-March.

I also requested the updated API specification and received it the same day. That's why I never got impatient with them. OTOH, those were the times when 90% of forum members bashed BFL as scammers, and the number of orders might have been low.

I placed my order in early January and paid via Paypal, and as of today I have received: Nothing. Nada. Zilch.
In other words, this is now the 8th week [of waiting], seven weeks have passed.

Same here! It is almost 9 weeks now and I've received NOTHING! I could accept such a delay as long as it brought an overall improvement to performance/cooling. But what is making me so angry is that Sonny started ignoring my emails 2 weeks ago. He used to answer my emails immediately in the first month, but now it takes him at least a week to respond!

Inspector 2211

sr. member

Activity: 448

Merit: 250

Quote from: zefir on February 27, 2012, 02:03:43 PM

Quote from: Epoch on February 27, 2012, 01:26:48 PM

Quote from: jddebug on February 26, 2012, 09:32:58 PM

I got an email back from Sonny the other day. He's saying late this week for my singles.

I run Mac. Wondering if cgminer will compile for me and support the singles?

@jddebug, when did you place that order? I'm guessing November or December? Would give us an idea of how far along BFL is with their backlog. Undecided

Not jddebug, but hopefully this helps clarify: I placed my order at the end of January and received wire transfer information the same day. Payment from Europe took about one week and Sonny confirmed receipt immediately, giving me a projected delivery date of mid-March.

I also requested the updated API specification and received it the same day. That's why I never got impatient with them. OTOH, those were the times when 90% of forum members bashed BFL as scammers, and the number of orders might have been low.

I placed my order in early January and paid via Paypal, and as of today I have received: Nothing. Nada. Zilch.
In other words, this is now the 8th week [of waiting], seven weeks have passed.

zefir

donator

Activity: 919

Merit: 1000

Quote from: Epoch on February 27, 2012, 01:26:48 PM

Quote from: jddebug on February 26, 2012, 09:32:58 PM

I got an email back from Sonny the other day. He's saying late this week for my singles.

I run Mac. Wondering if cgminer will compile for me and support the singles?

@jddebug, when did you place that order? I'm guessing November or December? Would give us an idea of how far along BFL is with their backlog. Undecided

Not jddebug, but hopefully this helps clarify: I placed my order at the end of January and received wire transfer information the same day. Payment from Europe took about one week and Sonny confirmed receipt immediately, giving me a projected delivery date of mid-March.

I also requested the updated API specification and received it the same day. That's why I never got impatient with them. OTOH, those were the times when 90% of forum members bashed BFL as scammers, and the number of orders might have been low.

zefir

donator

Activity: 919

Merit: 1000

Quote from: pieppiep on February 27, 2012, 12:39:55 PM

Quote from: zefir on February 27, 2012, 12:12:29 PM

One thing to consider is the overhead for queuing work and checking results: assuming the serial communication runs at 115.2 kbps 8N1, sending a job request takes around 4 ms, or about 5 ms if a starting nonce and length were added. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 s for the whole nonce range. Splitting the work into e.g. 64 chunks to get latency down to 80 ms will cost you about 350 ms for communication. That's 7% idle time in total for your BFL... Might still be worth considering as a means to prevent chips from running hot ;)

Unless the input/output is buffered before sent to the sha256 engine.
In that case it doesn't matter how often you send new work.

As folks said above: we need our BitForces to turn this speculation into a solid discussion. If BFL assumed that issuing a new work request while the engine is still busy invalidates current work, they might stop it as soon as they receive the 'ZDX' job request command.

Or they already knew about P2Pool, the check-for-results command supports incremental reporting, and everything is fine. We'll see soon.

cablepair

hero member

Activity: 896

Merit: 1000

Buy this account on March-2019. New Owner here!!

Edit:

Just so you know, BFL just emailed me. They apologized for the delay and said they are concentrating on getting orders out and will process mine this week, so now I can be happy and stop bitchin! ;)

Thanks sonny @ bfl !

Epoch

legendary

Activity: 922

Merit: 1003

Quote from: jddebug on February 26, 2012, 09:32:58 PM

I got an email back from Sonny the other day. He's saying late this week for my singles.

I run Mac. Wondering if cgminer will compile for me and support the singles?

@jddebug, when did you place that order? I'm guessing November or December? Would give us an idea of how far along BFL is with their backlog. Undecided

jamesg

vip

Activity: 1358

Merit: 1000

AKA: gigavps

Quote from: cablepair on February 27, 2012, 01:01:54 PM

Quote from: jamesg on February 27, 2012, 12:23:11 PM

I have been watching this thread and want to thank everyone who has participated lately. It's great to see discussion about if/how to improve the BFL code and firmware without all of the BFL bashing going on.

Gigavps: any tips on how I can actually order their products, bro? I placed an order four days ago, sent multiple emails and made calls, and I can't get anyone from BFL to even talk to me. I know they must be busy, but damn... Is it that hard for them to set up an order and take my money?

I don't know what to tell you. I have yet to meet the guys from BFL in person and placed my orders months before now.

I am sure they are swamped with all of the orders flying.

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: zefir on February 27, 2012, 12:12:29 PM

One thing to consider is the overhead for queuing work and checking results: assuming the serial communication runs at 115.2 kbps 8N1, sending a job request takes around 4 ms, or about 5 ms if a starting nonce and length were added. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 s for the whole nonce range. Splitting the work into e.g. 64 chunks to get latency down to 80 ms will cost you about 350 ms for communication. That's 7% idle time in total for your BFL... Might still be worth considering as a means to prevent chips from running hot ;)

Good point but even with that limitation there has to be an optimal value.

Instead of 64 chunks, say it processed 2^29 values per "chunk", making 8 chunks per nonce range and (assuming your math is correct) ~48 ms idle time every 5 seconds, or 1%. If that prevents more than 1% of shares from being lost, then you are coming out ahead, right?

There are other ways to improve miner-board communication (buffering next workload, having a "halt & return command" to force a smaller execution window, etc). Without knowing the capabilities of the board it is hard to know what could be done.

Who knows, maybe my interpretation of the docs is wrong. Looking at Luke's code, though, it looks like he interpreted it the same way.

I don't know if giga or Inaba has p2pool set up, but it would be interesting to compare a normal non-merged mining pool (avg 600 sec LP interval), a merged mining pool (~250 sec LP interval), and p2pool (~10 sec LP interval).
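The trade-off described in this post can be put into numbers. What follows is only a rough sketch under stated assumptions, not anything from BFL or cgminer: the function names are invented for illustration, and the model takes the pessimistic reading from this thread that results are reported only at the end of a chunk, so roughly one whole chunk of work is lost per long poll.

```c
#include <assert.h>

/* Expected fraction of hashing lost around long polls (LP).
 * Assumption (pessimistic reading from this thread): results are only
 * reported at the end of a chunk, so one whole chunk is lost per LP. */
double stale_loss(double chunk_s, double lp_interval_s)
{
	assert(lp_interval_s > 0.0);
	return chunk_s / lp_interval_s;
}

/* Fraction of time spent talking to the board instead of hashing.
 * Assumption: the board is idle while the serial link is in use. */
double comm_loss(unsigned chunks, double per_chunk_ms, double range_s)
{
	assert(range_s > 0.0);
	return (chunks * per_chunk_ms / 1000.0) / range_s;
}

/* Combined loss when a nonce range lasting range_s seconds is split into
 * `chunks` equal pieces and long polls arrive every lp_interval_s seconds. */
double total_loss(unsigned chunks, double per_chunk_ms,
		  double range_s, double lp_interval_s)
{
	assert(chunks > 0);
	return stale_loss(range_s / chunks, lp_interval_s)
	     + comm_loss(chunks, per_chunk_ms, range_s);
}
```

With the figures used in this thread (5 s per full range, about 6 ms of link time per chunk), `total_loss(8, 6.0, 5.0, lp)` comes out lower than `total_loss(1, 6.0, 5.0, lp)` for the ~10 s and ~250 s LP intervals, while at ~600 s the two are nearly level and the unchunked case is marginally ahead. That conclusion holds only under the model above and needs real hardware to confirm.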

cablepair

hero member

Activity: 896

Merit: 1000

Buy this account on March-2019. New Owner here!!

Quote from: jamesg on February 27, 2012, 12:23:11 PM

I have been watching this thread and want to thank everyone who has participated lately. It's great to see discussion about if/how to improve the BFL code and firmware without all of the BFL bashing going on.

Gigavps: any tips on how I can actually order their products, bro? I placed an order four days ago, sent multiple emails and made calls, and I can't get anyone from BFL to even talk to me. I know they must be busy, but damn... Is it that hard for them to set up an order and take my money?

pieppiep

hero member

Activity: 1596

Merit: 502

Quote from: zefir on February 27, 2012, 12:12:29 PM

One thing to consider is the overhead for queuing work and checking results: assuming the serial communication runs at 115.2 kbps 8N1, sending a job request takes around 4 ms, or about 5 ms if a starting nonce and length were added. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 s for the whole nonce range. Splitting the work into e.g. 64 chunks to get latency down to 80 ms will cost you about 350 ms for communication. That's 7% idle time in total for your BFL... Might still be worth considering as a means to prevent chips from running hot ;)

Unless the input/output is buffered before sent to the sha256 engine.
In that case it doesn't matter how often you send new work.

jamesg

vip

Activity: 1358

Merit: 1000

AKA: gigavps

I have been watching this thread and want to thank everyone who has participated lately. It's great to see discussion about if/how to improve the BFL code and firmware without all of the BFL bashing going on.

zefir

donator

Activity: 919

Merit: 1000

One thing to consider is the overhead for queuing work and checking results: assuming the serial communication runs at 115.2 kbps 8N1, sending a job request takes around 4 ms, or about 5 ms if a starting nonce and length were added. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 s for the whole nonce range. Splitting the work into e.g. 64 chunks to get latency down to 80 ms will cost you about 350 ms for communication. That's 7% idle time in total for your BFL... Might still be worth considering as a means to prevent chips from running hot ;)
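The estimate in this post can be reproduced with a little arithmetic. This is a minimal sketch, not BFL or cgminer code: the function names are made up for illustration, the message sizes are whatever the caller passes in, and it assumes the device hashes nothing while the serial link is busy.

```c
#include <assert.h>

/* Milliseconds needed to move `bytes` over a serial link using 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit = 10 bits on the wire per byte. */
double serial_ms(unsigned bytes, unsigned baud)
{
	assert(baud > 0);
	return bytes * 10.0 * 1000.0 / baud;
}

/* Milliseconds to hash the full 2^32 nonce range at `mhs` megahashes/s. */
double full_range_ms(double mhs)
{
	assert(mhs > 0.0);
	return 4294967296.0 / (mhs * 1e6) * 1000.0;
}

/* Fraction of wall-clock time lost to communication when the nonce range is
 * split into `chunks` pieces, each costing `per_chunk_ms` of link time.
 * Assumption: the device sits idle while a job is sent or a result is read. */
double comm_overhead(unsigned chunks, double per_chunk_ms, double mhs)
{
	double comm_ms = chunks * per_chunk_ms;
	return comm_ms / (full_range_ms(mhs) + comm_ms);
}
```

For instance, `serial_ms(46, 115200)` is roughly 4 ms, so a 4 ms job request corresponds to about 46 bytes on the wire at this line rate. `comm_overhead(64, 5.5, 800.0)` lands at roughly 6%, in the same ballpark as the 7% above, which divides 350 ms by a rounded 5 s.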

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: rjk on February 27, 2012, 09:32:12 AM

Also, it appears according to the spec (at least how I read it) that nonces are returned as found, not at the end of a cycle. However this would need testing. The polling interval is 10ms, so if it continues working after finding nonces, there would be no issue with p2pool, as long as those nonces were gathered DURING the cycle, and not at the end. Again, needs testing.

Given only four statuses:
BUSY
IDLE
NONCE-FOUND
NO-NONCE

If the status changes to "NONCE-FOUND" when it finds a nonce, how do you know when it is finished w/ the entire nonce range?

I would have imagined that if the device returned nonces as they are found, it would have statuses like:
IDLE
BUSY-NO-NONCE
BUSY-NONCE-FOUND
FINISHED-NO-NONCE
FINISHED-NONCE-FOUND

Looking at the code it would appear it waits for NONCE-FOUND or NO-NONCE and then loads new data which would indicate that the status "BUSY" is used until batch completes.
https://github.com/ckolivas/cgminer/blob/master/bitforce.c

Code:

	while (1) {
		BFwrite(fdDev, "ZFX", 3);
		BFgets(pdevbuf, sizeof(pdevbuf), fdDev);
		if (unlikely(!pdevbuf[0])) {
			applog(LOG_ERR, "Error reading from BitForce (ZFX)");
			return 0;
		}
		if (pdevbuf[0] != 'B')
		    break;
		usleep(10000);
		i += 10;
	}
	applog(LOG_DEBUG, "BitForce waited %dms until %s\n", i, pdevbuf);
	work->blk.nonce = 0xffffffff;
	if (pdevbuf[2] == '-')
		return 0xffffffff;
	else if (strncasecmp(pdevbuf, "NONCE-FOUND", 11)) {
		applog(LOG_ERR, "BitForce result reports: %s", pdevbuf);
		return 0;
	}

	pnoncebuf = &pdevbuf[12];

	while (1) {
		hex2bin((void*)&nonce, pnoncebuf, 4);
#ifndef __BIG_ENDIAN__
		nonce = swab32(nonce);
#endif

		submit_nonce(thr, work, nonce);
		if (pnoncebuf[8] != ',')
			break;
		pnoncebuf += 9;
	}

	return 0xffffffff;

I agree, though: if the status changes to "NONCE-FOUND" as soon as a nonce is found, then there are no issues w/ p2pool (or other shorter LP intervals).
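To make the nonce-extraction loop above easier to follow in isolation, here is a standalone sketch of the same idea using only the C standard library. `parse_nonces` is a hypothetical helper, not a cgminer function. It reads each 8-character field as an ordinary hexadecimal number, which appears to be the same value the quoted loop ends up with after `hex2bin` plus `swab32` on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a status line such as "NONCE-FOUND:1234ABCD,2468EFAB" into `out`.
 * Returns the number of nonces stored (at most `max`), or -1 if the line
 * is not a NONCE-FOUND response. Hypothetical helper for illustration. */
int parse_nonces(const char *line, uint32_t *out, int max)
{
	static const char prefix[] = "NONCE-FOUND:";
	const size_t plen = sizeof(prefix) - 1;
	const char *p;
	int n = 0;

	assert(line != NULL && out != NULL);
	if (strncmp(line, prefix, plen) != 0)
		return -1;

	p = line + plen;
	while (n < max) {
		char hex[9];

		if (strlen(p) < 8)	/* truncated field: stop parsing */
			break;
		memcpy(hex, p, 8);
		hex[8] = '\0';
		out[n++] = (uint32_t)strtoul(hex, NULL, 16);
		if (p[8] != ',')	/* the last nonce has no trailing comma */
			break;
		p += 9;			/* skip 8 hex digits plus the comma */
	}
	return n;
}
```

Fed the example line from the BFL spec, `NONCE-FOUND:1234ABCD,2468EFAB,1111BBBB`, this returns 3 nonces. Note that it says nothing about when the device sends such a line, which is the open question in this thread.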

rjk

sr. member

Activity: 448

Merit: 250

1ngldh

Quote from: DeathAndTaxes on February 27, 2012, 09:22:01 AM

Since each chip is ~400 MH/s it is ~10 seconds to complete a "batch" and thus you would lose a lot of potential work with p2pool. I wonder if BFL can field update the firmware to provide either a fixed smaller nonce range or a user-variable nonce range (2 values added to the start job command: starting-nonce = uint32 value, and nonce-range where # of hashes = 2^(nonce-range)).

Remember according to Inaba's tests, it seems to be single threaded - it got 500% efficiency and took about 50 seconds to complete a block header from what I recall. So apparently the nonce is already being split between the 2 FPGAs.

Also, it appears according to the spec (at least how I read it) that nonces are returned as found, not at the end of a cycle. However this would need testing. The polling interval is 10ms, so if it continues working after finding nonces, there would be no issue with p2pool, as long as those nonces were gathered DURING the cycle, and not at the end. Again, needs testing.
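For reference, the two extra job parameters proposed in the quoted post (a starting nonce plus an exponent for the number of hashes) could look something like the sketch below. To be clear, this is purely hypothetical: nothing in the published BFL interface accepts such values, and the struct and function names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical job window, NOT part of the published BFL interface. */
struct nonce_window {
	uint32_t start;		/* first nonce to test */
	unsigned range_exp;	/* number of hashes = 2^range_exp, 0..32 */
};

/* Number of hashes the window covers. */
uint64_t window_hashes(struct nonce_window w)
{
	assert(w.range_exp <= 32);
	return (uint64_t)1 << w.range_exp;
}

/* Returns 1 if the window stays inside the 32-bit nonce space, else 0. */
int window_valid(struct nonce_window w)
{
	return (uint64_t)w.start + window_hashes(w) <= ((uint64_t)1 << 32);
}

/* Seconds one window keeps a device busy at `mhs` megahashes per second. */
double window_seconds(struct nonce_window w, double mhs)
{
	assert(mhs > 0.0);
	return (double)window_hashes(w) / (mhs * 1e6);
}
```

At the ~400 MH/s per chip mentioned in the quote, a full window (`range_exp` of 32) takes a little under 11 seconds, consistent with the "~10 seconds" figure, and each step down in `range_exp` halves that.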

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: rjk on February 27, 2012, 09:19:07 AM

Quote from: DeathAndTaxes on February 27, 2012, 09:17:16 AM

Quote from: rjk on February 27, 2012, 09:13:20 AM

Quote from: DeathAndTaxes on February 27, 2012, 09:10:40 AM

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL singles can't be interrupted once they begin a cycle.

They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is whether BFL returns shares as found or waits until the end of the nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency. If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.

https://bitcointalksearch.org/topic/m.692304

So the wording is vague, but it looks like it only returns nonces at the end of the block interval.

Code:

3) Checking for results
-------------------------------
After the sent job, driver must keep asking the device for status (10ms is preferred polling interval).
Status command is 'ZFX'. Once sent, the unit may respond with one of the predefined responses:
   
    * BUSY
    (Device is busy processing a block. You can still issue a job, but the previous process will be 
     discarded and new process will start)

    * IDLE
    (Device is not processing anything)

    * NONCE-FOUND:<8 Hexadecimal characters defining the found Nonce>, <8 Hexa decimal of the second found nonce>...
    Note: The last nonce will not be terminated by a comma. The byte-ordering is Little-Endian
    Example: NONCE-FOUND:1234ABCD,2468EFAB,1111BBBB   (3 valid nonces are discovered in this process)

    * NO-NONCE
    Processing has been finished, no valid nonce was detected.

It doesn't look like you can provide a nonce start value or nonce range, so I am once again assuming (the spec is very vaguely worded) that it works only on the full 2^32 nonce range.

This would indicate some valid work is being "left behind".
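The reading of the spec given in this post can be written down as a small classifier. This is a sketch of that interpretation only: the enum and function names are invented, exact-case matching is an assumption, and whether BUSY really persists until the whole range is done is exactly what still needs testing on real hardware.

```c
#include <assert.h>
#include <string.h>

/* The four documented 'ZFX' responses, plus a catch-all. */
enum bf_status { BF_BUSY, BF_IDLE, BF_NONCE_FOUND, BF_NO_NONCE, BF_UNKNOWN };

/* Map a raw response line onto one of the documented statuses.
 * Assumption: responses start with the exact upper-case keyword. */
enum bf_status classify_status(const char *line)
{
	assert(line != NULL);
	if (strncmp(line, "BUSY", 4) == 0)
		return BF_BUSY;
	if (strncmp(line, "IDLE", 4) == 0)
		return BF_IDLE;
	if (strncmp(line, "NONCE-FOUND", 11) == 0)
		return BF_NONCE_FOUND;
	if (strncmp(line, "NO-NONCE", 8) == 0)
		return BF_NO_NONCE;
	return BF_UNKNOWN;
}

/* Under the "results only at the end" reading, a job is over exactly when
 * one of the two result statuses shows up; BUSY means keep polling. */
int job_finished(enum bf_status s)
{
	return s == BF_NONCE_FOUND || s == BF_NO_NONCE;
}
```

If the device instead reported nonces while still hashing, `job_finished` would be wrong for `BF_NONCE_FOUND`, and no combination of these four statuses could tell a driver that the range is complete.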

rjk

sr. member

Activity: 448

Merit: 250

1ngldh

Quote from: DeathAndTaxes on February 27, 2012, 09:17:16 AM

Quote from: rjk on February 27, 2012, 09:13:20 AM

Quote from: DeathAndTaxes on February 27, 2012, 09:10:40 AM

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL singles can't be interrupted once they begin a cycle.

They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is whether BFL returns shares as found or waits until the end of the nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency. If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.

https://bitcointalksearch.org/topic/m.692304

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: rjk on February 27, 2012, 09:13:20 AM

Quote from: DeathAndTaxes on February 27, 2012, 09:10:40 AM

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL singles can't be interrupted once they begin a cycle.

They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is whether BFL returns shares as found or waits until the end of the nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency. If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.

rjk

sr. member

Activity: 448

Merit: 250

1ngldh

Quote from: DeathAndTaxes on February 27, 2012, 09:10:40 AM

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL singles can't be interrupted once they begin a cycle.

They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. ~~Sorry I don't have a link just at the moment.~~

EDIT: I found it:

Quote from: BFL on January 14, 2012, 07:05:09 PM

* BUSY
(Device is busy processing a block. You can still issue a job, but the previous process will be
discarded and new process will start)

DeathAndTaxes

donator

Activity: 1218

Merit: 1080

Gerald Davis

Quote from: makomk on February 27, 2012, 05:23:39 AM

Looks like cgminer will do actually. There are basically two choices. You can carry on working on every work unit to completion even though you get a longpoll, which is what cgminer does, or you can submit a new work unit to the single and it'll discard any results it found so far for the old work unit and start working on the new one.

I don't think you can do that. You certainly can't do that with a GPU. Once it starts working it is asynchronous. It simply works until completed. This is why setting cgminer to an insane intensity like 17 is foolish, as you have now loaded the GPU down with 4 billion hashes. It will take 10 to 20 or more seconds to complete, and if a longpoll occurs cgminer can only wait as the GPU hashes away on worthless data.

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL singles can't be interrupted once they begin a cycle.

Quote

Either way you lose out, it's just a question of whether you lose by throwing away shares or lose by working on work units that are stale.

Well, the former is much better than the latter. Once an LP occurs, any work completed but not submitted is worthless. That doesn't change, but what matters is how much MORE worthless work will be completed. If you can interrupt the Single, you waste 0 further hashes on worthless work; if you can't, then on average you will lose an entire "batch workload" on each LP. I can't tell from the data BFL provides whether BFL Singles work on a full 2^32 in one batch or use a smaller fixed batch size. The optimal solution would be a firmware which allows an "intensity-like value" so the miner can give the hardware a starting nonce and a # of nonces to perform.

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: makomk on February 27, 2012, 05:23:39 AM

...
Looks like cgminer will do actually. There are basically two choices. You can carry on working on every work unit to completion even though you get a longpoll, which is what cgminer does, or you can submit a new work unit to the single and it'll discard any results it found so far for the old work unit and start working on the new one. Either way you lose out, it's just a question of whether you lose by throwing away shares or lose by working on work units that are stale.

Hmmm, that suggests something rather unexpected.
I guess I'll have to verify you're correct next time I see Luke-jr in IRC ... coz that is a waste as you suggest
I was pretty sure that the BFL can handle nonce ranges but I'm not sure now :P

If the rest below seems off-topic - it actually isn't coz the code for both BFL and Icarus is very similar.
The Icarus code is a butchered copy of the BFL code but redesigned to handle the way Icarus works
(and that will be enhanced more once I get my 2 Icarus in the next 4 or 5 days)

With the current firmware in the Icarus, it has the problem of only ever returning 1 nonce.
Now that would seem bad ... but looking at what you said above, it actually isn't all that bad after all ...

For an LP, when you overwrite the current work, you know you only have a very small chance of throwing anything away
(no share nonces exist before the current point in the full nonce range otherwise it would have returned one already)
It could be working on a share nonce at the time you overwrote it - but that's quite unlikely - and the chance of that would be 1 in (2^32 divided by however many nonces could be checked in the amount of time it would take to overwrite the current work)
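The "1 in (2^32 divided by ...)" estimate above is easy to evaluate. A minimal sketch, assuming each nonce has roughly a 1 in 2^32 chance of being a difficulty-1 share; the function names are invented, and the hash rate and overwrite time in the usage note are placeholder figures, not measured Icarus or BFL values.

```c
#include <assert.h>

/* Nonces hashed during the time it takes to overwrite the current work. */
double nonces_during(double mhs, double overwrite_ms)
{
	assert(mhs > 0.0 && overwrite_ms >= 0.0);
	return mhs * 1e6 * overwrite_ms / 1000.0;
}

/* Chance that one of those nonces was a share that gets thrown away.
 * Assumption: each nonce is a difficulty-1 share with probability ~2^-32. */
double p_share_lost(double mhs, double overwrite_ms)
{
	return nonces_during(mhs, overwrite_ms) / 4294967296.0;
}
```

As an illustration with made-up inputs, `p_share_lost(380.0, 5.0)` (380 MH/s, 5 ms to overwrite) is about 0.00044, i.e. somewhere around 1 in 2,260 overwrites.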

The other case (that would seem bad but isn't really either) is when you get a nonce reply from Icarus: it has stopped work and you start another work, and thus have only a very small (but different) chance of throwing anything away if both FPGAs happen to find an answer at the same time
The result of this is of course you on average halve your efficiency - but that doesn't really mean anything worthy of concern

Nonce ranges would reduce the impact on BFL (if they exist)
That just adds the extra overhead of starting each nonce range and a smaller nonce range is the maximum wasted processing time (if the pool doesn't accept stale shares)

... oh and lastly, in cgminer you can also manually enable stale share submission with the --submit-stale option if the pool doesn't pass that info to cgminer

Topic: Butterfly Labs - Bitforce Single and Mini Rig Box - page 80. (Read 186974 times)