Topic: (OLD) BFGMiner: modular FPGA/GPU, GBT, Stratum, RPC, Avalon/Lnx/OpnWrt/PPA/W64 - page 31.

legendary
Activity: 2576
Merit: 1186
Looks like CGMiner broke other things in 2.5.0, killing performance on BFL Singles and declaring ModMiners sick. If you use either of these FPGAs or encounter other problems (please report them!), feel free to use 2.4.4 until I get them resolved:
legendary
Activity: 2576
Merit: 1186
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
Lol. It still discards them :)

Re-check your code there Luke ;)
Care to elaborate? I did another look over the code and seem to have properly handled every case that discards work.

OK, since you still haven't found it yet (I'm looking at git btw):

restart_cond
I don't see any case where that actually aborts the job, but you're right about it being a bug (which I suppose would make it start polling too early). Thanks.
legendary
Activity: 1795
Merit: 1208
This is not OK.
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
Lol. It still discards them :)

Re-check your code there Luke ;)
Care to elaborate? I did another look over the code and seem to have properly handled every case that discards work.

OK, since you still haven't found it yet (I'm looking at git btw):

restart_cond
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
NEW VERSION 2.5.0, JULY 7 2012
...
Better Puppet Master integration is also still in the works.
...
Other than adding --api-puppet-master (which of course "I" would never add to cgminer), what is missing from --api-groups?
legendary
Activity: 2576
Merit: 1186
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
Lol. It still discards them :)

Re-check your code there Luke ;)
Care to elaborate? I did another look over the code and seem to have properly handled every case that discards work.
legendary
Activity: 2576
Merit: 1186
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
It's a feature, not a bug, and it's been in the firmware of the Single right from the beginning. There's no reason to not use it, and you can't guarantee that a nonce was valid, especially after a longpoll.
It's a bug. There is nothing to gain (except false advertising of lower stales), and plenty to lose (valid shares discarded).
Fix your pool then, because a longpoll is only supposed to be sent when the current work has been invalidated. If you accept previous work as valid after a longpoll, you risk orphan blocks. ::)
No, that is not correct. Longpolls can be sent for a variety of reasons, not all of which result in orphans.

Edit: And pools that don't longpoll outside of new blocks are either producing orphan-shares more often, or are harmful to the Bitcoin network.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
It's a feature, not a bug, and it's been in the firmware of the Single right from the beginning. There's no reason to not use it, and you can't guarantee that a nonce was valid, especially after a longpoll.
It's a bug. There is nothing to gain (except false advertising of lower stales), and plenty to lose (valid shares discarded).
Fix your pool then, because a longpoll is only supposed to be sent when the current work has been invalidated. If you accept previous work as valid after a longpoll, you risk orphan blocks. ::)
legendary
Activity: 2576
Merit: 1186
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
It's a feature, not a bug, and it's been in the firmware of the Single right from the beginning. There's no reason to not use it, and you can't guarantee that a nonce was valid, especially after a longpoll.
It's a bug. There is nothing to gain (except false advertising of lower stales), and plenty to lose (valid shares discarded).
legendary
Activity: 1795
Merit: 1208
This is not OK.
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
It's a feature, not a bug, and it's been in the firmware of the Single right from the beginning. There's no reason to not use it, and you can't guarantee that a nonce was valid, especially after a longpoll.

Lol. It still discards them :)

Re-check your code there Luke ;)
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales.
It's a feature, not a bug, and it's been in the firmware of the Single right from the beginning. There's no reason to not use it, and you can't guarantee that a nonce was valid, especially after a longpoll.
legendary
Activity: 2576
Merit: 1186
NEW VERSION 2.5.0, JULY 7 2012

Working on a big update to improve Mini Rig (and FPGA performance in general), but it's not quite done yet, so it'll need to wait for 2.6.0. Better Puppet Master integration is also still in the works.

In the meantime, I've incorporated most of the changes from CGMiner 2.5.0, except for Con's "workaround" which discards valid shares from BFL devices in order to avoid stales. On average, the shares discarded are about the same as the stales avoided, so the effective difference here is that CGMiner lies about them while BFGMiner reports them honestly. I'll publish some Utility comparison statistics between the two in a few days regardless of which performs better.

Human readable changelog:
  • Partial support for use of BFL Mini Rigs on p2pool through use of the --bfl-range option, provided you have a new enough minirig that supports the nonce-range feature. By default this feature is disabled because it costs ~1% in hashrate, but given the massive loss of hashes you would otherwise have mining on p2pool, this is worth using. Other miners should leave it disabled.
  • Huge update to other BitFORCE devices. I've merged all of p_shep's changes into this code (thanks!). These can re-enable devices, time out gracefully, mark them sick/overheated and so on.
  • Fixed the dynamic GPU intensity behaviour which was getting stuck at -10 on faster GPUs.
  • Updated API code with lots of changes under the hood courtesy of Kano, and updated miner.php.
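
To make the --bfl-range idea above concrete: the full changelog mentions splitting nonces into 1/5 chunks when nonce range is supported. Here is a minimal sketch in C of how the 32-bit nonce space could be carved into equal slices. The names `nonce_chunk` and `split_nonce_range` are made up for illustration and are not taken from the BitFORCE driver; how the real driver hands chunks to the device is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: an inclusive range of 32-bit nonces. */
struct nonce_chunk {
	uint32_t first;
	uint32_t last;
};

/* Return chunk `index` (0-based) out of `chunks` equal slices of the
 * full 32-bit nonce space. The last chunk absorbs any remainder, so the
 * slices cover 0x00000000..0xffffffff without gaps or overlap. */
struct nonce_chunk split_nonce_range(unsigned int index, unsigned int chunks)
{
	struct nonce_chunk c;
	uint64_t total = (uint64_t)1 << 32;
	uint64_t size;

	assert(chunks > 0 && index < chunks);
	size = total / chunks;

	c.first = (uint32_t)(size * index);
	if (index == chunks - 1)
		c.last = UINT32_MAX;
	else
		c.last = (uint32_t)(size * (index + 1) - 1);
	return c;
}
```

With `chunks` set to 5, each work item sent to the device would cover roughly a fifth of the nonce space, which is why an abort after a longpoll would throw away less work.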

Full changelog
  • Fix BitFORCE driver to not silently discard valid shares (bug introduced by CGMiner merges)
  • Fix --benchmark not working since the dynamic addition of pools and pool stats.
  • Make disabling BFL nonce range support a warning since it has to be explicitly enabled on the command line now.
  • miner.php allow renaming table headers
  • Make bitforce nonce range support a command line option --bfl-range since enabling it decreases hashrate by 1%.
  • Add sanity checking to make sure we don't make sleep_ms less than 0 in bitforce.
  • The fastest minirig devices need a significantly smaller starting sleep time.
  • Use a much shorter initial sleep time to account for faster devices and nonce range working, and increase it if nonce range fails to work.
  • Use nmsleep instead of usleep in bitforce.
  • Provide a ms based sleep function that uses nanosleep to avoid the inaccuracy of usleep on SMP systems.
  • delay_time_ms is always set so need not be initialised in bitforce.
  • Increase bitforce timeout to 10 seconds.
  • Add more hysteresis and poll ~5 times to allow for timer delays in bitforce devices.
  • miner.php allow alternating line colours (off by default)
  • Display the actual duration of wait when it is greater than the cutoff.
  • Set nonce to maximum once we determine nonce range support is broken.
  • Initial wait time is always known so no need to zero it beforehand in bitforce.
  • No point counting wait time until the work is actually sent to bitforce devices.
  • Use string comparison functions instead of explicit comparisons.
  • Account for wait_ms time when nonce_range is in use on BFL.
  • Split nonces up into 1/5 chunks when nonce range is supported.
  • limit clear buffer iterations.
  • Add fd check to clear buffer.
  • miner.php remove incorrect 'DATE' error message
  • miner.php allow summary header in custom pages
  • Disable nonce range support in BFL when broken support is detected.
  • Restart_wait is only called with a ms value so incorporate that into the function.
  • Only try to adjust dev width when curses is built in.
  • miner.php define custom sum fields as a simple array
  • Fix off-by-one error in nonce increment in bfl.
  • Use BE when setting nonce in bitforce nonce range work.
  • Enable nonce range in the normal init sequence for bfl.
  • Queue extra work at 2/3 differently depending on whether we're using nonce range or not.
  • Initially enable nonce range support on bfl, splitting nonces up into 3/4 size, and only disable it if it fails on work submit.
  • Attempt to detect nonce range support in BFL by sending work requiring its support.
  • Limit retrying on busy for up to BITFORCE_TIMEOUT_MS
  • Attempt to initialise while bitforce device returns BUSY.
  • Extend length of string that can be passed to BFL devices.
  • Fix signedness warning.
  • Adjust device width column to be consistent.
  • Use cgpu-> not gpus[] in watchdog thread.
  • Add api stats (sleep time)
  • Timing tweaks: added long and short timeouts (short for detecting throttling, long to give up totally); reset sleep time when device is re-initialised; still check results after timeout; back up a larger time if result on first poll.
  • Add API Notify counter 'Comms Error'
  • Style police on api.c
  • Do all logging outside of the bitforce mutex locking to avoid deadlocks.
  • Remove applog call from bfwrite to prevent grabbing nested mutexes.
  • Bitforce style changes.
  • Minor style changes.
  • Remove needless roundl define.
  • Made JSON error message verbose.
  • Fine-tune timing adjustment. Also remove old work_restart timing.
  • Check for gpu return times of >= 0, not just 0, to fix intensity dropping to -10.
  • Restart is zeroed in the mining thread so no need to do it inside the bitforce code.
  • More improvements to comms. BFL returns nothing when throttling, so this should not be considered an error. Instead repeat with a longer delay.
  • Polling every 10ms there's not much point checking the pthread_cond_timedwait as it just adds overhead. Simply check the value of work_restart in the bfl main polling loop.
  • Use a pthread conditional that is broadcast whenever work restarts are required. Create a generic wait function waiting a specified time on that conditional that returns if the condition is met or a specified time passed to it has elapsed. Use this to do smarter polling in bitforce to abort work, queue more work, and check for results to minimise time spent working needlessly.
  • Add busy time to wait time.
  • api.c put version up to 1.14
  • Add tiny delay after writing to BFL. Change BFL errors to something more human readable. Send work busy re-tries after 10ms delay.
  • Fix race condition in thread creation that could under some conditions crash BFGMiner at startup
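
The changelog above replaces usleep with a millisecond sleep built on nanosleep, citing usleep's inaccuracy on SMP systems. A rough sketch of what such a helper can look like follows. The names `ms_to_timespec` and `sleep_ms` are assumptions for illustration; this is not the actual nmsleep code from the miner.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Convert a millisecond count into the timespec nanosleep() expects. */
struct timespec ms_to_timespec(unsigned int ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (long)(ms % 1000) * 1000000L;
	return ts;
}

/* Hypothetical helper: sleep for `ms` milliseconds. If a signal
 * interrupts the sleep, resume with whatever time is still left. */
void sleep_ms(unsigned int ms)
{
	struct timespec want = ms_to_timespec(ms);
	struct timespec left;

	while (nanosleep(&want, &left) == -1 && errno == EINTR)
		want = left;
}
```

Restarting with the remaining time rather than the full interval keeps the total sleep close to what was asked for even when signals arrive, which matters when polling a device every few milliseconds.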
member
Activity: 112
Merit: 10
NEW VERSION - 2.4.4, JULY 1 2012
I fixed a crash in the Icarus driver; thanks to wildemagic for reporting it!

Happy to help.  Also happy to report that it's mining stably now.  Will report on performance after a decent time spread.

[update] : Performance seems the same as CGMiner 2.4.3 and 2.4.4.

kind regards
legendary
Activity: 2576
Merit: 1186
NEW VERSION - 2.4.4, JULY 1 2012

Mostly the same changes as CGMiner this time. Additionally, I fixed a crash in the Icarus driver; thanks to wildemagic for reporting it!

Still waiting on BFL to finish the revised Mini Rig protocol update code.

Improved compatibility with P4man's Puppet Master is planned for a future release too.

Human readable changelog:
  • Massive overhaul of the nrolltime mechanism, which should now cause a huge rise in efficiency on pools that support it. This allows much lower getwork bandwidth for much higher hashrates.
  • Support for the expire= feature. This works in concert with nrolltime when pools support it to allow more local generation of work.
  • Support for the x-mining-hashrate feature. I'm sure some pool somewhere cares about this, even though I'm not convinced, but it was trivial to add.
  • Better damping of GPU temperature changes should cause much less overshoot when temps rise or fall outside the optimal range in autofan mode.
  • Reinstated the application restart should ADL fail. Disabling it did not fix the crashes for those who had cgminer crash after 1 week of uptime on Windows when their ATI driver failed, and it removed the benefit of fixing the problem for those who had simply lost their fanspeed.
  • API groups features - this is squarely aimed at grouping privileges for remote access for services like P4man's hopping puppetmaster service.
  • Support for unlimited devices
  • Support for unlimited pools
  • Massive fix for the "dynamic" feature for GPUs. Somehow in the many device abstractions it had gotten broken and wasn't really doing what it was intended. It should be much more dynamic now.
  • FPGA fixes.
  • Fixes for builds on other platforms.
  • Lots of other things under the hood.
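
For the expire= support above, the full changelog says work should not be rolled right up to the expiry: the cutoff is 2/3 of the time between the scantime and the expiry, and rolling is capped at under 2 hours. A minimal sketch of one reading of that rule is below. The function name `work_reuse_cutoff` and the exact cap value are assumptions, and the fallback to scantime mirrors the changelog note about pools that support rolltime but not expire=.

```c
#include <assert.h>

/* Assumed cap, based on the changelog's "under 2 hours" note. */
#define MAX_ROLL_SECONDS 7199

/* Hypothetical helper: how many seconds a work item may keep being
 * rolled and reused. One reading of the changelog: stop two thirds of
 * the way between the scantime and the pool's advertised expire=. */
int work_reuse_cutoff(int scantime, int expiry)
{
	int cutoff;

	assert(scantime >= 0);
	/* No usable expire= from the pool: fall back to the scantime. */
	if (expiry <= scantime)
		return scantime;

	cutoff = scantime + (expiry - scantime) * 2 / 3;
	if (cutoff > MAX_ROLL_SECONDS)
		cutoff = MAX_ROLL_SECONDS;
	return cutoff;
}
```

For example, with a 60 second scantime and expire=120, this sketch stops reusing work at 100 seconds, leaving a margin before the pool would reject shares as stale.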

Full changelog
  • Fix builds on non gnu platforms.
  • api.c ensure old mode is always available when not using --api-groups + quit() on param errors
  • Implement rudimentary X-Mining-Hashrate support.
  • Detect large swings in temperature when below the target temperature range and change fan by amounts dependent on the value of tdiff.
  • Adjust the fanspeed by the magnitude of the temperature difference when in the optimal range.
  • Revert "Restarting cgminer from within after ADL has been corrupted only leads to a crash. Display a warning only and disable fanspeed monitoring."
  • api.c fix json already closed
  • implement and document API option --api-groups
  • Put an upper bound of under 2 hours on how far work can be rolled into the future, as bitcoind will deem it invalid beyond that.
  • define API option --api-groups
  • api.c allow unwell devices to be enabled so they can be cured
  • miner.php - fix/enable autorefresh for custom pages
  • miner.php allow custom summary pages - new 'Mobile' summary
  • Work around pools that advertise very low expire= time inappropriately as this leads to many false positives for stale shares detected.
  • Only show ztex board count if any exist.
  • There is no need for work to be a union in struct workio_cmd
  • fpgautils.c include a debug message for all unknown open errors
  • Don't keep rolling work right up to the expire= cut off. Use 2/3 of the time between the scantime and the expiry as cutoff for reusing work.
  • Log a specific error when serial opens fail due to lack of user permissions
  • Increase GPU timing resolution to microsecond and add sanity check to ensure times are positive.
  • Opencl code may start executing before the clfinish order is given to it so get the start timing used for dynamic intensity from before the kernel is queued.
  • fpgautils.c - set BAUD rate according to termio spec
  • fpgautils.c - linux ordering back to the correct way
  • miner.php remove unneeded '.'s
  • miner.php add auto refresh options
  • miner.php add 'restart' next to 'quit'
  • miner.php make fontname/size configurable with myminer.php
  • Make the pools array a dynamically allocated array to allow unlimited pools to be added.
  • Make the devices array a dynamically allocated array of pointers to allow unlimited devices.
  • Dynamic intensity for GPUs should be calculated on a per device basis. Clean up the code to only calculate it if required as well.
  • Bugfix: Provide alternative to JSON_ENCODE_ANY for Jansson 1.x
  • Use a queueing bool set under control_lock to prevent multiple calls to queue_request racing.
  • Use the work clone flag to determine if we should subtract it from the total queued variable and provide a subtract queued function to prevent looping over locked code.
  • Don't decrement staged extras count from longpoll work.
  • Count longpoll's contribution to the queue.
  • Increase queued count before pushing message.
  • Test we have enough work queued for pools with and without rolltime capability.
  • As work is sorted by age, we can discard the oldest work at regular intervals to keep only 1 of the newest work items per mining thread.
  • Roll work again after duplicating it to prevent duplicates on return to the clone function.
  • Abstract out work cloning and clone $mining_threads copies whenever a rollable work item is found and return a clone instead.
  • api.c display Pool Av in json
  • Take into account average getwork delay as a marker of pool communications when considering work stale.
  • Work out a rolling average getwork delay stored in pool_stats.
  • Getwork delay in stats should include retries for each getwork call.
  • Walk through the thread list instead of searching for them when disabling threads for dynamic mode.
  • Extend nrolltime to support the expiry= parameter. Do this by turning the rolltime bool into an integer set to the expiry time. If the pool supports rolltime but not expiry= then set the expiry time to the standard scantime.
  • When disabling fanspeed monitoring on adl failure, remove any twin GPU association. This could have been leading to hangs on machines with dual GPU cards when ADL failed.
  • modminer: Don't delay 2nd+ FPGAs during work restart
  • Disable OpenCL code when not available.
  • Fix openwrt crashing on regeneratehash() by making check_solve a noop.
  • FPGA - allow device detect override without an open failure
  • Fix sign warning.
  • Bugfix: icarus: properly store/restore info and work end times across longpoll restarts
  • Enable modminer for release builds
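
Several entries above deal with a rolling average getwork delay stored in the pool stats and its use when deciding whether work is stale. The changelog does not give the formula, so the following is only a sketch under stated assumptions: an exponential moving average, a made-up `pool_delay_stats` struct, and a staleness check that subtracts the average delay from the allowed lifetime.

```c
#include <assert.h>

/* Hypothetical per-pool statistics; field names are illustrative. */
struct pool_delay_stats {
	double avg_ms;      /* rolling average getwork delay */
	unsigned int count; /* samples seen so far */
};

/* Fold one getwork round trip (including retries) into the average.
 * Assumption: an exponential moving average with weight 1/window. */
void record_getwork_delay(struct pool_delay_stats *s, double delay_ms,
			  unsigned int window)
{
	assert(s && window > 0);
	if (s->count == 0)
		s->avg_ms = delay_ms;
	else
		s->avg_ms += (delay_ms - s->avg_ms) / window;
	s->count++;
}

/* Assumed rule: work counts as stale once its age plus the average
 * delay reaches the expiry, so work fetched over a slow link is
 * retired earlier. Returns 1 if stale, 0 otherwise. */
int work_is_stale(const struct pool_delay_stats *s, double age_ms,
		  double expiry_ms)
{
	return age_ms + s->avg_ms >= expiry_ms;
}
```

A moving average is used here because it needs no sample history; whether the miner uses this or a different smoothing scheme is not stated in the thread.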
member
Activity: 112
Merit: 10
Please try with -D -T longer...

Tried it for much longer than it usually takes to crash.

As I previously reported: standard settings, crash within a few minutes
-D: instant crash, doesn't even start mining
-D -T: mines fine, no crash within 15 mins

kind regards
legendary
Activity: 2576
Merit: 1186
A quick test with -D -T option enabled worked fine for a while, no crash.

-D option crashes instantly

Normal operation crashes after a few mins.

Seems to be a display issue; perhaps it doesn't like the ncurses formatting.

kind regards
Nothing ncurses-related has changed between CGMiner and BFGMiner. Please try with -D -T longer...
member
Activity: 112
Merit: 10
A quick test with -D -T option enabled worked fine for a while, no crash.

-D option crashes instantly

Normal operation crashes after a few mins.

Seems to be a display issue; perhaps it doesn't like the ncurses formatting.

kind regards
legendary
Activity: 2576
Merit: 1186
Here is what Kano had to say about the whole icarus code issue which is one of the issues that caused the fork in the first place:
https://bitcointalksearch.org/topic/m.997424
Reposting Kano's lies is not necessary.

Yeah, I read through kano's comments before. I thought I would just give things a try; guess this is another confirmation that BFG+icarus is not working in win32.
Can you try running in a normal command prompt with -D -T added, and let me know what it prints before crashing?
member
Activity: 112
Merit: 10
Yeah, I read through kano's comments before. I thought I would just give things a try; guess this is another confirmation that BFG+icarus is not working in win32.

kind regards
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks for your reply Luke.

I've used CGMiner pretty successfully so far, but I have some small embedded systems I want to test the Icarus on, and OpenCL or SDK support isn't possible.

I like the idea of better Icarus support, so I'll give it a try.

[update] : Switched to BFGMiner 2.4.3 from CGMiner 2.4.3 with the same switches; BFG stops working after 1-5 mins, with Windows saying the program has stopped responding.
This is on my win7x32 system with an E350. I'll try on my Intel P3 when I get it set up, but so far it's a bit disappointing that it didn't work out of the box.
I have used the last 3 versions of CGMiner with no problems.

[update2] : Used administrator access privs to run the exe, seems to have fixed the issue.

[update3] : Still crashing. Switching back to CGMiner until I can try this on my Intel + no-OpenCL platform.

kind regards
Here is what Kano had to say about the whole icarus code issue which is one of the issues that caused the fork in the first place:
https://bitcointalksearch.org/topic/m.997424
member
Activity: 112
Merit: 10
Thanks for your reply Luke.

I've used CGMiner pretty successfully so far, but I have some small embedded systems I want to test the Icarus on, and OpenCL or SDK support isn't possible.

I like the idea of better Icarus support, so I'll give it a try.

[update] : Switched to BFGMiner 2.4.3 from CGMiner 2.4.3 with the same switches; BFG stops working after 1-5 mins, with Windows saying the program has stopped responding.
This is on my win7x32 system with an E350. I'll try on my Intel P3 when I get it set up, but so far it's a bit disappointing that it didn't work out of the box.
I have used the last 3 versions of CGMiner with no problems.

[update2] : Used administrator access privs to run the exe, seems to have fixed the issue.

[update3] : Still crashing. Switching back to CGMiner until I can try this on my Intel + no-OpenCL platform.

kind regards