Topic: Optimal Firmware/Hardware design for mining with cgminer (Read 6352 times)

hero member
Activity: 676
Merit: 501
How about the biggest underclocked miner...

Max gh per watt

Max watt approx 1500
legendary
Activity: 1274
Merit: 1000
It's too bad U3 development didn't follow any of this, they had potential to be fun usb miners, but they turned out to be metallic piles of crap.
legendary
Activity: 1288
Merit: 1004
Great job Kano!
I do not know how I missed this thread for so long.

I think you should bump it at least once per month to keep it in developers' minds.

A couple things I would suggest would be this.
If they are putting out new hardware, they should stop using such outdated versions of cgminer or bfgminer as the release version, e.g. the ZeusMiners running 3.1.1 at launch instead of a newer version.  I know they cannot always use the latest, as updates come out regularly, but they should at least keep it as up to date as possible.

The 2nd thing is to get units to ckolivas, Luke-Jr, Kano and Nate Woolls as soon as possible, so that support for their products can be set for all the major mining software packages as soon as possible on the main chain.  I know this is a thread for just cgminer, but this problem affects all the mining software packages.

Again, great job Kano. I look forward to devices working better because you have this thread going.
Thanks for all your great work.
newbie
Activity: 56
Merit: 0
The current ASIC mining industry really is not environmentally friendly, and I hope someone can come along to change all that. Kano may have the ability to solve it all.
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
Moved here to hardware with some updates.
Will probably add an update regarding drivers soon also.
erk
hero member
Activity: 826
Merit: 500
Thanks Kano,
this thread probably should be in the custom hardware forum where the board designers would see it.

hero member
Activity: 532
Merit: 500
It looks like we have all of these implemented already in the Minion ASIC.

legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
- reserved -
legendary
Activity: 4634
Merit: 1851
Linux since 1997 RedHat 4
2021-08 Minor update to note since I wrote this years ago.
Anywhere that I mention LP below, you should read that as a 'block change', where stratum also sends through a flag saying to cancel all work and start afresh with the new work - since of course any previous work is now 'stale' and should be discarded.
In stratum this is: params[8] = Clean Jobs - true or false
This however doesn't mean that any nonces already generated should be discarded: as you'll notice in cgminer, the option to do that (always off by default) was removed, since it's a very bad idea to throw away possible blocks ... :)
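
As a rough illustration of the note above, here is a minimal Python sketch of reading the Clean Jobs flag out of a stratum mining.notify message. The function name is_block_change and every field value in the sample message are made up for illustration; the only thing taken from the note is that the flag sits at params[8].

```python
import json

def is_block_change(line: str) -> bool:
    """Return True if a mining.notify line has Clean Jobs (params[8]) set."""
    msg = json.loads(line)
    if msg.get("method") != "mining.notify":
        return False
    params = msg.get("params", [])
    # params[8] = Clean Jobs: true means all earlier work is now stale
    return len(params) > 8 and params[8] is True

def sample_notify(clean_jobs: bool) -> str:
    # Placeholder field values, not real chain data.
    return json.dumps({
        "id": None,
        "method": "mining.notify",
        "params": ["job1", "00" * 32, "coinb1", "coinb2", [],
                   "20000000", "1a0ffff0", "5f5e1000", clean_jobs],
    })

print(is_block_change(sample_notify(True)))
print(is_block_change(sample_notify(False)))
```

When the flag is true the miner drops its queued work and starts on the new job, but per the note above, nonces that were already found would still be passed on rather than thrown away.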

---------------

I've been asked this (quite) a few times and no doubt I don't give the same reply each time if I forget something - I should just have a link to point people to, that I update over time :)

... Some hardware manufacturers completely ignore the first 4 points - I wonder if they don't expect their hardware to be around for very long? ...

This is version 0.1 ... expect it to be updated as required ... or as I remember things I've forgotten to add :P
0.2 already added in blue
0.3 update in green
0.3 update 2014-Jun-15 brown
 strike out of the multiple end points idea since no one has implemented it like that, so there is no way to compare its usefulness,
 reordered the first 2 - and extra highlight in red on one rather important heading :)

In relation to comments I've made about this:
Indeed people will think up better ideas over time ... and those ideas I will add to this first post if needed.
I certainly have no delusions of grandeur, things change over time, what is best today may not be best tomorrow.

Here's a list of things that Hardware companies or DIYs should do in their USB/Firmware design.
N.B. most of this applies to all mining devices, not just USB devices - queues, resets, IDs, temperature management, asynchronous I/O ...


Make sure it has a well defined iProduct and/or iManufacturer code so cgminer can tell straight away if it is a device of interest, rather than not knowing until it has actually done I/O to the device.
USB chips are used by many different devices, cgminer can simply ignore a device based on iProduct and/or iManufacturer - as it already does for some devices.
Indeed this has shown up where devices have exactly the same USB information as other devices - so it is necessary to attempt to mine before even knowing if it is the right device.
The iProduct/iManufacturer is even easy to change on some of these devices, so only a small amount of foresight (reading this and acting on it) would be necessary
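
To make the point concrete, here is a small sketch of the kind of check distinct strings allow. It is not cgminer's actual detection code; the UsbInfo type, the IDs and the strings are all hypothetical. The idea is only that two devices sharing the same USB chip (same vendor/product ID) can be told apart before any mining I/O is attempted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UsbInfo:
    vendor_id: int
    product_id: int
    manufacturer: str   # string behind iManufacturer
    product: str        # string behind iProduct

# Hypothetical list of (manufacturer, product) strings the miner cares about.
KNOWN_MINERS = {("ExampleCorp", "ExampleMiner v1")}

def is_of_interest(dev: UsbInfo) -> bool:
    """Decide from the descriptor strings alone, without talking to the device."""
    return (dev.manufacturer, dev.product) in KNOWN_MINERS

# Two devices built on the same (made-up) USB bridge chip.
devices = [
    UsbInfo(0x1234, 0x0001, "ExampleCorp", "ExampleMiner v1"),
    UsbInfo(0x1234, 0x0001, "Generic", "USB to UART Bridge"),
]

for dev in devices:
    print(dev.product, "->", "claim" if is_of_interest(dev) else "ignore")
```

If both devices had left the chip's default strings in place, the only way to tell them apart would be to try mining on each one.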


Have a firmware version string available!
This clears up any issues about code handling changes to the firmware ... as long as you change the version string when the firmware changes.
Even better to have something like the BFL GetInfo that returns the version string and other relevant device configuration information, that cgminer would get when it initialises the device.
If you are making multiple devices that cgminer has to treat differently, then the 'GetInfo' idea is a must.

Each device you produce should have a different iSerial - i.e. something that will identify each device differently when a person has more than one of them.

If you are considering improving the firmware in the future (who isn't?), then being able to upload the firmware via USB is pretty much mandatory. That makes it possible for people who bought the devices early to update their firmware and not be left behind for being early adopters.

Most important for mining: they need to have an input and output queue,
i.e. be able to send more than one work item, with the device queuing up those items and moving on to the next one each time it completes one.
This removes all USB latency for any work other than when there is an LP.
The size of the input queue should allow it to run for at least 100ms, preferably longer.
So basically what the cgminer code would do:
(as I already do in the BFL SC) is have 2 threads, one feeding work into the device at the rate required to keep the queue from getting empty, and another to get the output queued results.
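
A minimal sketch of that two-thread arrangement, in Python rather than cgminer's C, with a fake in-memory device standing in for real USB I/O. All names here (FakeDevice, feeder, collector, min_queue_depth) are invented for the example, and the queue sizing assumes each work item is one full 2^32 nonce range, which the post doesn't state.

```python
import math
import queue
import threading

NONCES_PER_WORK = 2 ** 32   # assumption: one full 32-bit nonce range per work item

def min_queue_depth(hashrate_hs: float, run_ms: int = 100) -> int:
    """Work items needed so the device can run for run_ms without a refill."""
    nonces_needed = hashrate_hs * run_ms / 1000
    return max(1, math.ceil(nonces_needed / NONCES_PER_WORK))

class FakeDevice:
    """Stand-in for a device with an input queue and an output queue."""
    def __init__(self, depth: int):
        self.inq = queue.Queue(maxsize=depth)
        self.outq = queue.Queue()

    def run(self):
        # Device side: move on to the next queued item as each one completes.
        while True:
            work = self.inq.get()
            if work is None:            # sentinel: no more work
                self.outq.put(None)
                return
            self.outq.put(("done", work))

def feeder(dev: FakeDevice, works):
    for w in works:
        dev.inq.put(w)                  # blocks only while the input queue is full
    dev.inq.put(None)

def collector(dev: FakeDevice, results: list):
    while True:
        r = dev.outq.get()              # waits for results, independent of feeding
        if r is None:
            return
        results.append(r)

def mine(works, depth: int = 4):
    dev = FakeDevice(depth)
    results = []
    threads = [
        threading.Thread(target=dev.run),
        threading.Thread(target=feeder, args=(dev, works)),
        threading.Thread(target=collector, args=(dev, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(min_queue_depth(10e12))
print(mine(["w1", "w2", "w3"]))
```

The feeder never waits on results and the collector never waits on sends, so the device only goes idle if the input queue is too shallow for its hash rate.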

With an input work queue, it is also ideal to be able to send multiple work items at once.
If the device has multiple processors that each require their own work item, then when all processors are idle, or when an LP occurs, cgminer's priority is to get work to all processors as quickly as possible. Being able to send at least as many work items as processors, in a single USB transfer, ensures minimum time wasted for all processors.
This of course also requires being able to send multiple work items to multiple processors in the same USB transfer.
Also, of course, if the device is able to process many work items in a short time frame, then it is ideal to be able to send many work items per USB transfer.


The process of sending and receiving work should include an identifier, so that the work replies only need to return that identifier, not the whole work item, to identify the work item the results are from
e.g. maybe 4 characters a-zA-Z0-9, so >14 million identifiers before it recycles - bigger if your device hashes at more than 10TH/s - at 10TH/s, 4 characters = ~105 minutes of work
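
The numbers above can be checked with a few lines of Python. This assumes each work item covers one full 2^32 nonce range (that is what makes the ~105 minutes come out); make_id is just one hypothetical way to turn a counter into such an identifier.

```python
import string

ALPHABET = string.ascii_letters + string.digits   # a-z, A-Z, 0-9: 62 symbols
NONCES_PER_WORK = 2 ** 32                          # assumption: full nonce range per work item

def id_space(length: int) -> int:
    """Number of distinct identifiers of the given length."""
    return len(ALPHABET) ** length

def minutes_until_recycle(length: int, hashrate_hs: float) -> float:
    """How long the identifiers last before they have to wrap around."""
    works_per_second = hashrate_hs / NONCES_PER_WORK
    return id_space(length) / works_per_second / 60

def make_id(counter: int, length: int = 4) -> str:
    """Encode a counter as a fixed-width base-62 identifier, wrapping on overflow."""
    counter %= id_space(length)
    chars = []
    for _ in range(length):
        counter, digit = divmod(counter, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

print(id_space(4))                                  # 62**4 identifiers
print(round(minutes_until_recycle(4, 10e12), 1))    # minutes at 10TH/s
print(make_id(0), make_id(1), make_id(62))
```

A device hashing faster than 10TH/s, or one that splits work into smaller ranges, uses up identifiers proportionally faster, which is why the identifier would need to be longer.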

Results should be put in the output queue when they are found, not waiting for the nonce range to complete.
Thus, you also need to send a 'finished' work item result since ~1/3 of work has no nonces
However, in the further future when a nonce range takes less than 1ms, it won't really be very relevant if the results aren't queued in the output queue until the nonce range is completed.
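
For anyone wondering where the ~1/3 comes from, here is a back-of-envelope check. It assumes a work item is one full 2^32 nonce range and that the device reports nonces at difficulty 1; neither is stated above, so treat it as a sanity check rather than a derivation of the exact figure.

```python
import math

DIFF1_TARGET = 0xFFFF << 208        # Bitcoin difficulty-1 target
NONCES_PER_WORK = 2 ** 32           # assumption: full nonce range per work item

# Chance that a single hash is at or below the difficulty-1 target.
p_hit = (DIFF1_TARGET + 1) / 2 ** 256

# Chance that none of the 2^32 hashes in a work item hits: (1 - p)^N.
p_empty = math.exp(NONCES_PER_WORK * math.log1p(-p_hit))

print(round(NONCES_PER_WORK * p_hit, 4))   # expected nonces per work item, close to 1
print(round(p_empty, 3))                   # fraction of work items with no nonce
```

With about one expected nonce per work item, the fraction with none is about e^-1, a bit over a third. Without a 'finished' reply, cgminer couldn't tell those work items apart from ones the device never processed.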

The device needs 3 types of reset:
1) to reset the work on an LP: clear the input queue, abort current work and accept new work to start on - preferably all in one command
2) to stop work due to e.g. overheating: clear the input queue and stop work
This of course also needs reliable temperature sensors
3) to clear the input queue, queue new work, but finish current work - like the LP but where the current work is allowed to complete ... e.g. a difficulty change being received by cgminer, providing new work at the new difficulty
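
The three resets can be modelled in a few lines. This is only a toy model of the device-side state, with names (Reset, DeviceModel) invented for the example; real firmware would do this in its command handler.

```python
from collections import deque
from enum import Enum

class Reset(Enum):
    BLOCK_CHANGE = 1   # 1) clear queue, abort current work, start on new work
    STOP = 2           # 2) clear queue and stop work (e.g. overheating)
    FLUSH_QUEUE = 3    # 3) clear queue, queue new work, let current work finish

class DeviceModel:
    def __init__(self, current=None, queued=()):
        self.current = current          # work item being hashed right now
        self.queue = deque(queued)      # input queue

    def reset(self, kind: Reset, new_work=()):
        self.queue.clear()              # all three resets clear the input queue
        if kind in (Reset.BLOCK_CHANGE, Reset.STOP):
            self.current = None         # abort whatever is running
        if kind is Reset.STOP:
            return                      # stay idle until told otherwise
        self.queue.extend(new_work)     # new work arrives in the same command
        if self.current is None and self.queue:
            self.current = self.queue.popleft()

dev = DeviceModel(current="old", queued=["q1", "q2"])
dev.reset(Reset.FLUSH_QUEUE, ["n1", "n2"])
print(dev.current, list(dev.queue))     # current work survives, queue replaced
dev.reset(Reset.BLOCK_CHANGE, ["b1", "b2"])
print(dev.current, list(dev.queue))     # current work aborted, new work started
dev.reset(Reset.STOP)
print(dev.current, list(dev.queue))     # idle
```

Passing the new work in the same call as the reset mirrors the 'preferably all in one command' point: the device never sits idle between the abort and the first new work item.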

The combination of the above 2 - nonce answers immediately and being able to abort work - resolves any issues with short LP times e.g. p2pool

Regarding overheating ... the hardware should only shut down mining at some critical self destruct level - like GPUs do.
cgminer should control this below that critical level at the user's chosen temperature constraints.
Again (as above) this of course also needs reliable temperature sensors - and the hardware developer to provide details of how to interpret the results of the sensors


It's optimal to have asynchronous processing - send something to be done but not expect the reply to be the next thing returned.
This can easily be achieved with two things: 1) each command that is expecting a reply should send an identifier with it and 2) the replies would be placed in the output queue with the identifier that was sent.
Thus only one thread is dealing with reading replies, and waiting for status replies doesn't get in the way of getting results
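
A small sketch of the identifier idea on the host side: each command that expects a reply gets an identifier, and whichever thread reads the output queue hands each reply to whoever is waiting on that identifier. ReplyRouter and its methods are names made up for this example, not anything in cgminer.

```python
class ReplyRouter:
    """Match replies to commands by identifier, in whatever order they arrive."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}              # identifier -> callback awaiting the reply

    def send(self, command: str, callback):
        ident = self._next_id
        self._next_id += 1
        self._pending[ident] = callback
        return (ident, command)         # what would be written to the device

    def on_reply(self, ident: int, payload) -> bool:
        """Called by the single reader thread for every item in the output queue."""
        callback = self._pending.pop(ident, None)
        if callback is None:
            return False                # unknown or duplicate identifier
        callback(payload)
        return True

router = ReplyRouter()
log = []
status_id, _ = router.send("status", lambda p: log.append(("status", p)))
temp_id, _ = router.send("temperature", lambda p: log.append(("temp", p)))

# Replies come back out of order, interleaved with whatever else the device sends.
router.on_reply(temp_id, 61)
router.on_reply(status_id, "ok")
print(log)
```

Because nothing assumes the next thing read is the answer to the last thing sent, a slow status reply can't hold up nonce results.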

One thing I have thought might be good, but I'm not sure of the MCU implementation issues, would be to have a USB chip with at least 2 pairs of end points, the 2nd one being purely for results, so the results thread simply waits on replies rather than polling for them

Related to this, of course you should be able to send status requests while it is mining, but that really only becomes an issue when you don't have a queue, so it shouldn't be relevant (i.e. it should always be implemented to allow sending status requests while mining)

As the devices get faster, also allow difficulty to be defined in the work sent, so it will only return nonces at the requested difficulty.
HOWEVER, you'd want to be sure there's no loss of normal or error information - e.g. if a nonce is passed to the MCU difficulty checker, and there is a problem with it, it shouldn't hide that fact, but rather report that fact to the miner - a hardware error
This is important so that cgminer can modify the performance of the device - if the device has such options - to maximise mining.
If hardware errors are high and affecting the performance, it may be that the overall performance settings need to be lowered a bit by cgminer to increase the mining performance
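
One way to picture the MCU difficulty checker described above. This is a sketch under my own assumptions, not any real firmware: the hash of the returned nonce is taken as an integer, anything above the difficulty-1 target is treated as a hardware error (the chip claimed a nonce that isn't one), and the requested target is simply the difficulty-1 target divided by the difficulty.

```python
from collections import Counter

DIFF1_TARGET = 0xFFFF << 208        # Bitcoin difficulty-1 target

def classify(hash_value: int, difficulty: int) -> str:
    """Decide what the MCU does with a nonce the chip has returned."""
    if hash_value > DIFF1_TARGET:
        return "hw_error"           # must be reported, never silently dropped
    if hash_value <= DIFF1_TARGET // difficulty:
        return "share"              # meets the requested difficulty: send to miner
    return "below_difficulty"       # valid nonce, just not worth sending

# Made-up hash values for illustration.
hashes = [DIFF1_TARGET // 100, DIFF1_TARGET, DIFF1_TARGET + 1, DIFF1_TARGET // 64]
stats = Counter(classify(h, 64) for h in hashes)
print(dict(stats))
```

Only the 'share' and 'hw_error' cases go back over USB; the 'below_difficulty' count could still be included in a status reply so cgminer keeps the full picture when deciding whether to lower the device's performance settings.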

Of course related to what I mentioned before, you need to send information saying the nonce ranges are complete so that information isn't lost about how much work was done.
Maybe as devices get faster (in the future) this complete message could be one reply with a list of work items that have completed



More to come as discussions provide it (or as I find things I've missed)