
Topic: Looking for FPGA cgminer testers. (Read 3796 times)

legendary
Activity: 1795
Merit: 1208
This is not OK.
June 29, 2012, 01:41:45 AM
#25
Made a Windows binary with OpenCL and all FPGAs.

Works for me; hopefully it works elsewhere too ;)

https://github.com/downloads/pshep/cgminer/cgminer-2.4.3-win32.zip
member
Activity: 112
Merit: 10
June 27, 2012, 10:18:08 PM
#24
I have 4 Icarus on Win7 32-bit.

No idea how to compile it, so if you can run me up a win32 executable I can test it with Icarus hardware.

kind regards
member
Activity: 243
Merit: 10
June 26, 2012, 12:01:49 AM
#23
I've got an X6500 and an Asus E450 ITX board on order. After they come in and I have time to find the right Linux OS to use for my FPGA controller, I'll try your version.
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 25, 2012, 08:20:38 PM
#22
Alright Kano, you win. Trivial to add, so I've done it...

If either the user or the pool is set to not accept stales, the work reset is checked; otherwise it carries on with the job and submits the stales.
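
A minimal sketch of that decision, written in Python rather than cgminer's C. The function and parameter names are made up for illustration and are not cgminer identifiers: user_no_stales stands for the user passing --no-submit-stale, and pool_no_stales for the pool not wanting stale shares.

```python
def abort_on_work_restart(user_no_stales: bool, pool_no_stales: bool) -> bool:
    """Return True if the BFL job should be dropped when new work arrives.

    Hypothetical helper for illustration only, not a cgminer function.
    """
    # If either side refuses stales, finishing the old job gains nothing,
    # so the work reset is honoured and new work starts straight away.
    return user_no_stales or pool_no_stales


if __name__ == "__main__":
    for user in (False, True):
        for pool in (False, True):
            if abort_on_work_restart(user, pool):
                action = "abort, start new work"
            else:
                action = "finish job, submit stales"
            print(f"user_no_stales={user!s:5} pool_no_stales={pool!s:5} -> {action}")
```

Only the case where neither the user nor the pool objects to stales lets the BFL finish its nonce range.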
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
June 25, 2012, 07:36:58 PM
#21
Hmm - you seem to have completely missed the point of my last post.

Yes aborting work should be an option.

However, it must only do that when told to - i.e. if the user specified --no-submit-stale (and the getwork doesn't say to submit stales)

If you abort work on a pool that does accept stales, then you are reducing the number of accepted shares with BFL devices.

You're giving an example based on some pool that doesn't accept stales - which of course would mean you should use --no-submit-stale and then your new BFL code should abort work.
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 25, 2012, 07:28:08 PM
#20
It still does submit stales though. My logs still show plenty being submitted, just not the 2-3% I was getting. Now it's 0.4%.
I see your point, stick to what the command has set... but if you do that, one way or the other you lose performance.

Disable stales:
No stales are submitted potentially losing valid shares
Re-start work quickly and get on with new block

Enable stales:
Potentially gain some shares through submitting stales
Potentially lose up to 5s of work on each device

It's marginal either way, but why not take the best of both worlds and maximise performance as far as we can, at the sacrifice of not strictly doing what the user has asked? How many users do you think give a crap about whether stales are submitted or not over getting the best performance? My U has gone up about 0.5 from 69.2 to 69.7. Fred's has gone up 0.8.

If BFL worked the way every other device worked and returned a result as soon as it found it, of course we'd do it your way. We've got to handle what we're dealt the best we can, eh?
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
June 25, 2012, 06:58:04 PM
#19
...
Well, that's the thing, either way the work is lost, no? It's a matter of minimizing work lost. We can submit shares which may/may not be accepted, or we can restart work which we know will be accepted. As you say, this is only really a problem on P2Pool, which is a problem anyway, so what's lost?
Well, no it's not an actual problem as such.
How the code must work is quite straightforward - as I said:
If --no-submit-stale is set and the getwork didn't say to submit stale, then yes abort.
i.e. the choice is the user's with using "--no-submit-stale" or the pool's by saying to submit stale in the getwork
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 25, 2012, 02:12:20 PM
#18
Actually it'd be interesting to get real performance data from someone with BFLs mining on P2Pool with the existing 2.4.3 and with my version. I'm quite curious :)
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 25, 2012, 12:48:19 PM
#17
Not sure if you are doing this so:
As I have mentioned to luke-jr, I'll mention to you also :)
The BFL abort should only be done if --no-submit-stale is enabled and the getwork said to not submit stale.
(i.e. you need to somehow check those two before aborting the work)

Reasons:
1) If you abort work on a pool that allows stale shares, then when you abort you may be throwing away shares (since BFL doesn't tell you what shares you have worked out already when you abort the work) - so on such a pool (or a getwork that says to submit stale) you'd never abort the work
2) On p2pool only ~1 in every ~60 LPs represents a real BTC LP - for all other LPs, if the stale work is a full difficulty block, it is a valid payable BTC block - and p2pool will send it to the bitcoind ... and throwing away blocks is bad :)

Of course no one in their right mind would mine on p2pool with a BFL since either you throw away blocks or you throw away shares - you must do one or the other with a BFL on p2pool.

Well, that's the thing, either way the work is lost, no? It's a matter of minimizing work lost. We can submit shares which may/may not be accepted, or we can restart work which we know will be accepted. As you say, this is only really a problem on P2Pool, which is a problem anyway, so what's lost?
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
June 25, 2012, 12:02:34 PM
#16
Not sure if you are doing this so:
As I have mentioned to luke-jr, I'll mention to you also :)
The BFL abort should only be done if --no-submit-stale is enabled and the getwork said to not submit stale.
(i.e. you need to somehow check those two before aborting the work)

Reasons:
1) If you abort work on a pool that allows stale shares, then when you abort you may be throwing away shares (since BFL doesn't tell you what shares you have worked out already when you abort the work) - so on such a pool (or a getwork that says to submit stale) you'd never abort the work
2) On p2pool only ~1 in every ~60 LPs represents a real BTC LP - for all other LPs, if the stale work is a full difficulty block, it is a valid payable BTC block - and p2pool will send it to the bitcoind ... and throwing away blocks is bad :)

Of course no one in their right mind would mine on p2pool with a BFL since either you throw away blocks or you throw away shares - you must do one or the other with a BFL on p2pool.
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 25, 2012, 11:20:27 AM
#15
Thanks for that Kano, I'm just about ready to submit a pull request now. Sorting out one more thing...

Looks good there, Fred :)
sr. member
Activity: 349
Merit: 250
June 25, 2012, 10:26:16 AM
#14
Some numbers from testing BFL rev2 x16 running the 800 MH/s firmware.

I disrupted the results by mistakenly disconnecting the power on one unit and not noticing, so the results might be a teeny bit better for the pshep changes to cgminer.

Running under Ubuntu 12.04 64-bit

CGMiner  Reject  Accepted  Util  MH/s  Getwork  Remote  Local  Discard  Found  HW Err  Network  Uptime hh:mm:ss  Rej %  Eff
std  8249700825174.831251492038870085391020044866:48:43  1.16%  761%
pshep  588367158175.68125753988810235339036850021334:49:58  0.16%  920%



legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
June 23, 2012, 10:47:21 AM
#13
Oh - you have a thread.
You didn't mention that :)

Yeah been running for a while - but it's 1:45am
I'll leave it running overnight anyway.

If you show up shortly in #cgminer I'll give you the link to see my rig (or in the morning when I wake up)
But yeah it's mining and showing the same av MH/s mine does.
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 19, 2012, 10:44:35 PM
#12
Great :)

Just need someone who has an FPGA other than BFL to test it...
full member
Activity: 206
Merit: 100
Mostly Harmless...
June 19, 2012, 08:44:14 PM
#11
Just wanted to update, it's been 24 hours and it's working like a champ, a solid 8.5gh with no issues.
sr. member
Activity: 252
Merit: 250
Inactive
June 19, 2012, 11:39:42 AM
#10
I guess there's no hope for scan-serial to work for BFL's in Windows, eh?

Well I don't know... an ugly way might be just to try and open each port in turn. But then where do you stop... 8? 16? 100? That might take a while...
Can they be done in parallel? Ufasoft is able to do it, somehow.

Of course.  A separate pool of "test" threads (say 10-20) could signal the main "scan" thread when they are done so that a new port can be assigned to be scanned.  The upper limit can be read from the OS.  The "scan" thread would assign "untested" ports to worker "test" threads.

The BFL I/O code should be non-blocking, overlapped IO so that scanning can be stopped if needed.

When/if this is accepted, I'll look into it. I also want to have it scan ports during operation so you can yank out and replace/add devices while it's running.

Not sure if this helps, but the COM identifiers reserved for use are located here.  The key keeps a record of sorts of everything that has ever been assigned - offline devices, too.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\COM Name Arbiter

The problem with this method is that unplugging/replugging devices without a reboot results in incrementing COM IDs.
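
For what it's worth, the parallel scan discussed above could look roughly like this. It is a Python sketch under stated assumptions, not code from cgminer or Ufasoft: scan_ports and the probe callable are hypothetical names, and probe is assumed to open the port, ask the device to identify itself with a short timeout, and return True on a valid reply.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def scan_ports(candidates: Iterable[str],
               probe: Callable[[str], bool],
               max_workers: int = 16) -> List[str]:
    """Probe every candidate port on a small pool of worker threads.

    Hypothetical helper. `probe` does the real serial I/O; any exception
    it raises (port missing, port busy, timeout) counts as "no device".
    """
    def safe_probe(port: str) -> bool:
        try:
            return probe(port)
        except Exception:
            return False

    ports = list(candidates)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands untested ports to idle workers and returns the
        # results in input order, so they line up with `ports`.
        results = list(pool.map(safe_probe, ports))
    return [port for port, ok in zip(ports, results) if ok]


if __name__ == "__main__":
    # Stand-in probe for demonstration: pretend devices sit on COM3 and COM7.
    fake_devices = {"COM3", "COM7"}
    found = scan_ports((f"COM{n}" for n in range(1, 33)),
                       lambda port: port in fake_devices)
    print(found)
```

With 16 workers and a probe timeout of about a second, 100 candidate ports would take on the order of seven seconds instead of well over a minute. The executor plays the role of the "scan" thread here, which keeps the sketch short.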

legendary
Activity: 1795
Merit: 1208
This is not OK.
June 19, 2012, 10:46:48 AM
#9
I guess there's no hope for scan-serial to work for BFL's in Windows, eh?

Well I don't know... an ugly way might be just to try and open each port in turn. But then where do you stop... 8? 16? 100? That might take a while...
Can they be done in parallel? Ufasoft is able to do it, somehow.

Of course.  A separate pool of "test" threads (say 10-20) could signal the main "scan" thread when they are done so that a new port can be assigned to be scanned.  The upper limit can be read from the OS.  The "scan" thread would assign "untested" ports to worker "test" threads.

The BFL I/O code should be non-blocking, overlapped IO so that scanning can be stopped if needed.

When/if this is accepted, I'll look into it. I also want to have it scan ports during operation so you can yank out and replace/add devices while it's running.
full member
Activity: 206
Merit: 100
Mostly Harmless...
June 19, 2012, 01:01:38 AM
#8
I tried it out on OSX, seems to work pretty well.

With the stock cgminer, I've been noticing a decline in hashrate over the course of about 4 hours (from 8.5gh down to 7.8), so I've been restarting it when I notice it dropping.  I've been running your version for the last five hours, and it's still up at 8.5gh.  Thanks a ton, I'll keep you updated.

I should add that after about 4 hours, one or more of the BFLs will drop below a U of 10 (normally around 8, sometimes down to 6).  This behavior started after I moved my rig back, so it might be how I laid everything out; I was thinking it was probably noise across all the USB cables.  Right now, they are all at or above 11.8, with one at 11.3, much much better.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
June 18, 2012, 07:35:45 PM
#7
I guess there's no hope for scan-serial to work for BFL's in Windows, eh?

Well I don't know... an ugly way might be just to try and open each port in turn. But then where do you stop... 8? 16? 100? That might take a while...
Can they be done in parallel? Ufasoft is able to do it, somehow.
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 18, 2012, 07:32:35 PM
#6
I guess there's no hope for scan-serial to work for BFL's in Windows, eh?

Well I don't know... an ugly way might be just to try and open each port in turn. But then where do you stop... 8? 16? 100? That might take a while...
sr. member
Activity: 252
Merit: 250
Inactive
June 18, 2012, 07:08:55 PM
#5



I guess there's no hope for scan-serial to work for BFL's in Windows, eh?
sr. member
Activity: 252
Merit: 250
Inactive
June 18, 2012, 07:07:06 PM
#4



Really appreciated, P_Shep
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 18, 2012, 07:02:50 PM
#3
I tried actually, just to see if my compiler would!

It wouldn't.

Actually I think the compiler would, but the libraries are all wrong for your kind of processor, so it got nowhere.
legendary
Activity: 1064
Merit: 1000
June 18, 2012, 06:55:00 PM
#2
if you compile for the tplink mips toy I can test =)
legendary
Activity: 1795
Merit: 1208
This is not OK.
June 18, 2012, 02:16:07 PM
#1
I've been working on many MANY improvements for BFL support in cgminer.
I think I'm about ready to submit a pull request into cgminer, but I still need to test how it operates with other FPGAs.
Essentially it should have no effect on Icarus, Ztex and ModMiner, but since I don't have any of those, I can't test them, so I'm hoping some kind volunteers could compile my fork here and run it for a while:
https://github.com/pshep/cgminer

The most significant change for FPGAs is the inclusion of 'SICK' and 'DEAD' processing, which was previously reserved only for OpenCL devices. For Icarus, Ztex and ModMiner, this should tell you if they are sick or dead (for BFL it'll attempt to re-init the device).
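
Roughly, the SICK/DEAD check boils down to something like the sketch below. It is in Python rather than cgminer's C, and the names are made up for illustration. The 60s and 10 minute thresholds are the ones I give for BFL further down; what happens exactly on the boundary is a guess.

```python
import time
from typing import Optional

SICK_AFTER_S = 60        # no response for 60 s -> SICK
DEAD_AFTER_S = 10 * 60   # no response for 10 min -> DEAD


def device_health(last_response: float, now: Optional[float] = None) -> str:
    """Classify a device by how long ago it last responded.

    Hypothetical helper, not a cgminer function. Both timestamps are
    seconds taken from the same monotonic clock.
    """
    if now is None:
        now = time.monotonic()
    silent_for = now - last_response
    # Check the larger threshold first so a long silence is not
    # reported as merely SICK.
    if silent_for >= DEAD_AFTER_S:
        return "DEAD"
    if silent_for >= SICK_AFTER_S:
        return "SICK"
    return "OK"


if __name__ == "__main__":
    for silence in (5, 61, 601):
        print(f"{silence:>4}s silent -> {device_health(0.0, now=float(silence))}")
```

A watchdog would call this once per device on every pass and report the state, or for BFL attempt a re-init, whenever the answer is not "OK".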

For BFL devices, my changes do the following:
- Timeout to restart work if it's taking too long
   A nonce range should take just over 5s. Any longer and device is throttling.
   Fixes issue where BFL appears to stall in cgminer
- Count throttling as a zero hash error
- Temp taken in watchdog thread
  Now a disabled device will still return a current temperature, rather than the last value before disabling.
- Work restart on new work
  This was causing a very high stale rate for me...
  Previously on a work restart, cgminer would allow the BFL to continue with the stale block. Now this is checked, and while any nonces found in that time will be wasted, the work is discarded and new work is started immediately, rather than after the 5s the BFL takes to return results.
- Timing adjustments
   The 'wait for results' was hard-set to 4500ms before polling at 10ms intervals. With the variation between systems and new firmwares of differing hash rates, a hard-set timer could be either inefficient (starts polling way before necessary) or wasteful (starts polling way after the result is ready). The auto-adjustment will find the correct wait time to minimize both polling and the delay in retrieving results.
- Device re-initialization
   When a device is disabled (for whatever reason - user or by cgminer) then re-enabled, the device is re-initialized, rather than assuming communications are still working.
- Sick / Dead monitoring
  As with OpenCL devices, BFL devices are checked for being sick (60s with no response) or dead (10 mins with no response), and cgminer will try to re-initialize them.
- Improved logging
   Most logs now include the device in question, i.e.: "BFL0: took longer than 15s"
- Device start offset
  Delays the start of each device by a random time between 0 and 100ms so that they don't all make calls at exactly the same time.
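
To give an idea of the timing adjustment, here is one plausible feedback rule, sketched in Python. It is not the code in my fork: the rule itself, the function names and the target of two polls per job are assumptions made for illustration. Only the 4500ms starting wait and the 10ms poll interval come from the description above.

```python
POLL_INTERVAL_MS = 10     # polling interval from the description above
INITIAL_WAIT_MS = 4500    # the old hard-set wait, used as the starting point


def adjust_wait(wait_ms: int, polls_needed: int,
                target_polls: int = 2, min_wait_ms: int = 0) -> int:
    """Return the wait to use before polling for the next job's result.

    Hypothetical rule. `polls_needed` is how many polls the last job
    took before its result was ready (1 = ready on the first poll).
    """
    # More polls than the target: polling started too early, wait longer.
    # Fewer polls than the target: the result may have been sitting
    # there already, so shorten the wait by one interval.
    error = polls_needed - target_polls
    return max(min_wait_ms, wait_ms + error * POLL_INTERVAL_MS)


def simulate(true_duration_ms: int, jobs: int = 100) -> int:
    """Run the rule against a device that always takes true_duration_ms."""
    wait = INITIAL_WAIT_MS
    for _ in range(jobs):
        remaining = max(0, true_duration_ms - wait)
        # First poll happens right after the wait; each further poll
        # covers another POLL_INTERVAL_MS (ceiling division).
        polls = 1 + -(-remaining // POLL_INTERVAL_MS)
        wait = adjust_wait(wait, polls)
    return wait


if __name__ == "__main__":
    for duration in (4000, 5100):
        print(f"device takes {duration} ms -> wait settles at {simulate(duration)} ms")
```

The target of two polls keeps the wait just short of the real job time, so the rule can tell "ready on time" apart from "ready long ago", at the cost of one extra poll per job.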

If you can help, I'll be much appreciative.

Thanks :)

