Topic: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support - page 5. (Read 81318 times)

newbie
Activity: 26
Merit: 0
I added code to trim my supply and the good news is that it looks like the board is stable at 35GH/s at 0.975V. To get any faster than that I need to get over the max 1.050V my supply is able to put out (to do that I need to disassemble the cooling and change a sense resistor - not terrible but it is a hassle).

A single chip (the board has four) almost works at 1.050V/40GH/s. If I run all four then the supply under load drops to about 1.030V which doesn't work too well.

Need to validate this on more than one board of course :)
hero member
Activity: 490
Merit: 500
This is also what I see. You get to crazy amounts of power usage and heat over 33GH... I don't think it is safe to run it over that with normal cooling. Even if you use copper coolers I don't think the gain is enough to be worth the extra cost of power and price... When you get to about the 950 MHz range, the voltage necessary to run the chips starts going up like crazy... From 0.9V at 950 to over 1V at 1050...

EDIT: Looking at your signature. I think those Hex16B are about as good as an 8-chip A1 board... If we look at just performance... Maybe even better

Quote
6x Hex16B (Bitfury) 45GH boards 270GH/414W total (1.53W/GH)

Has anyone gotten close to the promised numbers so far?

Yeah the Hex8A1 is in the 1.5-1.6W/GH range when running at 8x33GH=264GH and runs a hell of a lot noisier fans than 6x Hex16B do (1800 v 4900rpm).
I opt to run them at 880MHz(4x220)*/910mV = 27GH chip = 1.25W/GH which is just about coolable with 3x Quiet F9 92mm 1800rpm fans.
I'd rather distribute the heat over more units and not have a headache! Bitfury2 now shows another 25% GH improvement too, but it looks like its time as king will be short-lived given the new 40nm Avalon 3, which promises 0.775W/GH. Still waiting to see real-world results on that one.
* The Bitmine clock is 4x the Technobit clock.
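For anyone wanting to redo the W/GH arithmetic above, here is a minimal Python sketch. The board figures are the ones quoted in the post; the helper names are made up for illustration, 1.55 W/GH is simply the midpoint of the quoted 1.5-1.6 range, and the underclocked total assumes all eight chips reach 27GH.

```python
def watts_per_gh(watts: float, gh: float) -> float:
    """Efficiency in W/GH (lower is better)."""
    return watts / gh


def board_watts(chips: int, gh_per_chip: float, w_per_gh: float) -> float:
    """Total draw implied by a per-chip hash rate and an efficiency figure."""
    return chips * gh_per_chip * w_per_gh


hex16b = watts_per_gh(414, 270)        # 6x Hex16B from the signature
hex8a1_33 = board_watts(8, 33, 1.55)   # assumed midpoint of 1.5-1.6 W/GH
hex8a1_27 = board_watts(8, 27, 1.25)   # underclocked to 880MHz / 910mV

print(f"6x Hex16B: {hex16b:.2f} W/GH")
print(f"Hex8A1 @ 8x33GH: about {hex8a1_33:.0f} W")
print(f"Hex8A1 @ 8x27GH: about {hex8a1_27:.0f} W")
```

On these assumptions the underclocked Hex8A1 gives up roughly 48GH in exchange for a much lower total draw.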
hero member
Activity: 826
Merit: 1000
This is also what I see. You get to crazy amounts of power usage and heat over 33GH... I don't think it is safe to run it over that with normal cooling. Even if you use copper coolers I don't think the gain is enough to be worth the extra cost of power and price... When you get to about the 950 MHz range, the voltage necessary to run the chips starts going up like crazy... From 0.9V at 950 to over 1V at 1050...

EDIT: Looking at your signature. I think those Hex16B are about as good as an 8-chip A1 board... If we look at just performance... Maybe even better

Quote
6x Hex16B (Bitfury) 45GH boards 270GH/414W total (1.53W/GH)

Has anyone gotten close to the promised numbers so far? Then I'd know whether it is my board (and the HEX8 too) or just the chip...
hero member
Activity: 490
Merit: 500
Did Bitmine change chips specs?

Wasn't Turbo 40GH? Now it is 33GH... But still the same price... But I do agree, I don't see a way to get them to 40GH with normal cooling...

Well that's sad to see… I've still been hopeful that I could get 40 to work someday… It actually does kinda work, I just notice that when I over clock the chip I start dropping nonces (when running a known nonce test case)…

EDIT2: And from what I see you can't get 33GH out at 0.85V, more like 0.985 to 1 V. Also 1 W/GH in Turbo mode is fantasy... More like 1.3 to 1.5 W/GH... Hard to say how much is lost in the chip power supply...

I haven't played with increasing the core voltage beyond 0.85V - still on the list. If 1V really is necessary, that's going to be a decent chunk of power indeed!

Has anyone been able to get 33GH/s or higher to run?

The current pricing really does need to be adjusted. Several competitors are all coming online very soon at well under $2/GH. I'd really prefer to stick with coin craft as I have a working design and am very happy with the support I've seen (zefir plus free samples). But the current pricing level is going to make it very hard to be profitable for long...

My experience with the Technobit Pre-production A1 chips is that they don't fire up until 850mV (Turbo) and then you'll be lucky to see 25GH (Normal).
They're in a top-30%/70%-bottom package, so you'll need a heatsink on both sides, and the power/temps rise quickly, so you'll need a lot of cooling to get anywhere near 33GH. You need to raise the voltage as you raise the clock or you'll get hardware errors, but 33GH is doable at 1040MHz(4x260)*/1000mV. Marto even managed to push it to 35GH at 1100MHz(4x275)*/1050mV.
* The Bitmine clock is 4x the Technobit clock.
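The clock/GH pairs above line up with a simple rule of thumb, sketched below in Python. Treat it as an assumption rather than a datasheet formula: it takes the 32-cores-per-chip figure mentioned elsewhere in this thread and assumes every core returns one hash per Bitmine clock cycle, so it gives an ideal upper bound (dead cores and hardware errors will lower it). The function name is made up for illustration.

```python
CORES_PER_CHIP = 32        # figure quoted elsewhere in this thread
TECHNOBIT_TO_BITMINE = 4   # "The Bitmine clock is 4x the Technobit clock"


def ideal_gh(technobit_mhz: float, cores: int = CORES_PER_CHIP) -> float:
    """Ideal GH/s, assuming one hash per core per Bitmine clock cycle."""
    bitmine_mhz = technobit_mhz * TECHNOBIT_TO_BITMINE
    return bitmine_mhz * cores / 1000.0


# Technobit clock (MHz) and core voltage (mV) pairs from the post above.
for tb_mhz, mv in [(260, 1000), (275, 1050)]:
    print(f"{tb_mhz * 4} MHz @ {mv} mV -> {ideal_gh(tb_mhz):.1f} GH/s ideal")
```

Under these assumptions 1040MHz comes out just over 33GH and 1100MHz just over 35GH, which matches the figures reported in the post.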
newbie
Activity: 26
Merit: 0
Did Bitmine change chips specs?

Wasn't Turbo 40GH? Now it is 33GH... But still the same price... But I do agree, I don't see a way to get them to 40GH with normal cooling...

Well that's sad to see… I've still been hopeful that I could get 40 to work someday… It actually does kinda work, I just notice that when I over clock the chip I start dropping nonces (when running a known nonce test case)…

EDIT2: And from what I see you can't get 33GH out at 0.85V, more like 0.985 to 1 V. Also 1 W/GH in Turbo mode is fantasy... More like 1.3 to 1.5 W/GH... Hard to say how much is lost in the chip power supply...

I haven't played with increasing the core voltage beyond 0.85V - still on the list. If 1V really is necessary, that's going to be a decent chunk of power indeed!

Has anyone been able to get 33GH/s or higher to run?

The current pricing really does need to be adjusted. Several competitors are all coming online very soon at well under $2/GH. I'd really prefer to stick with coin craft as I have a working design and am very happy with the support I've seen (zefir plus free samples). But the current pricing level is going to make it very hard to be profitable for long...
hero member
Activity: 504
Merit: 500
One always null? (Alignment gap to detect breaks in registers, for error-checking, or an unregistered bit for future-use?)

Does it use error-checking-bits? (Like a check-sum or always high/low, or tick flip-flop bit.)

Might be used to differentiate between send/receive data, on the same stream. send=0 receive=1 dictating pass-along, or use, by the chip or the chip-reader. (Thus, addressable only by the hardware for internal use only.)
sr. member
Activity: 335
Merit: 250
Quote
Hope this does not demotivate you, but getting this issue stable was the hardest part with the boards at Bitmine.

Good Luck

Oh, Things are getting more interesting. Thanks.

I have configured the chip at 250MHz system clock and 16MHz reference clock.
pre_div=2, post_div=4, fb_div=125: 0x88 0x7d 0x21 0x84 0x00 0x00

So the hashing speed is supposed to be 250MHz * 32 cores = 8000 MH/s

Am I correct?
So maybe I need to reconfigure at a lower system clock to make it work stably?


I think the naming for post and pre divider were reversed in an earlier version of documentation / software, but basically yes, this should set the sys_clock to 250MHz.

If unsure, you always can double-check by scoping the inter-chip SPI clock, which is sys_clk/64, so at 250MHz you should get ~3.9MHz.

Note: take care to configure your host SPI clock below the inter-chip SPI clock, i.e. in this case configure your RPi SPI interface below 3.9MHz.


I have checked the SPI. It shows 16MHz :D. So I'm running the chip at a 1024MHz clock (16MHz x 64).
There is one thing I don't understand in the datasheet.
Here is the register:

https://www.dropbox.com/s/45w9ta19af7y1fl/A1_registry.JPG

It is 48 bits but the numbering goes from 0 to 46. Is that correct?
I will program the dividers
legendary
Activity: 1610
Merit: 1000
Did Bitmine change chips specs?

Wasn't Turbo 40GH? Now it is 33GH... But still the same price... But I do agree, I don't see a way to get them to 40GH with normal cooling...

EDIT: Yes they did... And forgot to change it on one page...

http://bitmine.ch/?page_id=863
Quote
Technical specifications* of the CoinCraft A1 ASIC:

Developed on 28nm HPP process from Global Foundries
Custom IC package with power bars for low voltage, high current feeding
Configurable in daisy chain mode for distributed work with up to 253 ASICs.
Standard SPI interface
Hashing power of 25 GH/s in nominal and up to 40 GH/s in Turbo mode
Power usage of 0.35 W/GH in low power, 0.6 W/GH in nominal and 1 W/GH in Turbo mode
Supply voltage of 0.65V in low power, 0.765 V in nominal and 0.85 V in Turbo mode
Mass production available starting from the second week of December 2013

http://bitmine.ch/?product=coincraft-ai-asic
Quote
Product Description

Developed on 28nm HPP process from Global Foundries
Custom IC package with power bars for low voltage, high current feeding
Configurable in daisy chain mode for distributed work with up to 253 ASICs.
Standard SPI interface
Hashing power* of 20 GH/s in low power, 25 GH/s in nominal and up to 33 GH/s in Turbo mode
Power usage* of 0.35 W/GH in low power, 0.6 W/GH in nominal and 1 W/GH in Turbo mode
Supply voltage* of 0.65V in low power, 0.765 V in nominal and 0.85 V in Turbo mode
Mass production available starting from the second week of December 2013

So they changed it down by about 20%...

EDIT2: And from what I see you can't get 33GH out at 0.85V, more like 0.985 to 1 V. Also 1 W/GH in Turbo mode is fantasy... More like 1.3 to 1.5 W/GH... Hard to say how much is lost in the chip power supply...
Add to that 10% dead cores on average. Take into account chip price and difficulty. In reality that means around 25% less hashing power + 25% increased power draw without a price reduction? Sorry to say it, but CoinCraft will soon be history if the price is not reduced significantly.
I love my craft boards for sure. I would like to have more but at that price it seems a direct loss to me.
Seems that zefir's great efforts will be wasted once again :(
hero member
Activity: 826
Merit: 1000
Did Bitmine change chips specs?

Wasn't Turbo 40GH? Now it is 33GH... But still the same price... But I do agree, I don't see a way to get them to 40GH with normal cooling...

EDIT: Yes they did... And forgot to change it on one page...

http://bitmine.ch/?page_id=863
Quote
Technical specifications* of the CoinCraft A1 ASIC:

Developed on 28nm HPP process from Global Foundries
Custom IC package with power bars for low voltage, high current feeding
Configurable in daisy chain mode for distributed work with up to 253 ASICs.
Standard SPI interface
Hashing power of 25 GH/s in nominal and up to 40 GH/s in Turbo mode
Power usage of 0.35 W/GH in low power, 0.6 W/GH in nominal and 1 W/GH in Turbo mode
Supply voltage of 0.65V in low power, 0.765 V in nominal and 0.85 V in Turbo mode
Mass production available starting from the second week of December 2013

http://bitmine.ch/?product=coincraft-ai-asic
Quote
Product Description

Developed on 28nm HPP process from Global Foundries
Custom IC package with power bars for low voltage, high current feeding
Configurable in daisy chain mode for distributed work with up to 253 ASICs.
Standard SPI interface
Hashing power* of 20 GH/s in low power, 25 GH/s in nominal and up to 33 GH/s in Turbo mode
Power usage* of 0.35 W/GH in low power, 0.6 W/GH in nominal and 1 W/GH in Turbo mode
Supply voltage* of 0.65V in low power, 0.765 V in nominal and 0.85 V in Turbo mode
Mass production available starting from the second week of December 2013

So they changed it down by about 20%...

EDIT2: And from what I see you can't get 33GH out at 0.85V, more like 0.985 to 1 V. Also 1 W/GH in Turbo mode is fantasy... More like 1.3 to 1.5 W/GH... Hard to say how much is lost in the chip power supply...
hero member
Activity: 504
Merit: 500
Are you guys setting-up the chips so they fire-up all at once, or in time-delayed succession? (To allow power to "restore", before reaching the amperage limits of the voltage-regulators and reference-voltages.) Turning-on all the lights in your house at once, and the AC and the fridge, with the coffee-pot, and the hair-dryer, would trip the main breaker. Even though the total running amperage is not high enough to trip the breaker, the in-rush of start-up amperage draw would trip the breaker. It is like a 150-lb man standing on a scale, then jumping on it. When he jumps, he weighs up to 10x his weight, in momentum, from acceleration. Electricity does a similar thing, as the brown-out from the dip, causes the VR's to kick-in high-mode, delivering 10x more power that it "detected" it needed, but once delivered, that starting resistance is no longer there, and the in-rush is too much amperage. (Or it does a full black-out, as the reference voltage falls out of spec, since all the available power was just absorbed by that temporary short-circuit of all the devices turning on at once.)

I realize the chips auto-run once started, pulling massive power as they run free... However, could that also be addressed by starting them in low-power mode, in addition to firing-up in sequence, then ramping them up to normal mode in sequence also?

One nice feature would be an AUTO-ADJUST feature. Tuning each chip to remain only powerful enough to operate within X-errors. (Which would really be a nice safety feature in the event of chip-decay from overheating. In the event of poor cooling or dead fans or unavoidable restrictions like dust build-up.) Every returned error would count as 1/2 of a potential speed reduction, which would decay after a few minutes. If two errors hit, within that time, the unit drops down a level. Thus, less errors, or none. Which it would attempt to auto-adjust back up, after a few more minutes... Just in-case it only needed a mini-break, or the errors were just a chance-occurrence. (1/3 for the more daring, and 1/4 for those trying to ride the upper-limit... 1/5, 1/6, 1/7, 1/8 for those who may have better management of heat, but purposely want to get that extra few hashes.)
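The auto-adjust idea above could be sketched roughly as follows in Python. This is only an illustration of the proposal, not code from any existing miner driver: the class name, the level values and the timing defaults are all made up, and `errors_to_drop=2` corresponds to the "1/2" setting described in the post (use 3 for "1/3", and so on). Time is passed in explicitly so the logic can be tried without real hardware.

```python
class AutoTuner:
    """Step a chip's speed level down on clustered errors, back up when quiet."""

    def __init__(self, levels, errors_to_drop=2, window_s=180.0, recover_s=300.0):
        self.levels = list(levels)        # hypothetical clock settings, slowest first
        self.idx = len(self.levels) - 1   # start at the fastest level
        self.errors_to_drop = errors_to_drop
        self.window_s = window_s          # how long an error keeps counting
        self.recover_s = recover_s        # quiet time before trying to step up
        self.errors = []                  # timestamps of errors still counting
        self.last_event = 0.0             # time of last error or level change

    @property
    def level(self):
        return self.levels[self.idx]

    def on_error(self, now):
        # Forget errors that have decayed out of the window.
        self.errors = [t for t in self.errors if now - t < self.window_s]
        self.errors.append(now)
        self.last_event = now
        if len(self.errors) >= self.errors_to_drop and self.idx > 0:
            self.idx -= 1                 # drop one level
            self.errors.clear()

    def tick(self, now):
        # After a quiet period, try one level faster again.
        if now - self.last_event >= self.recover_s and self.idx < len(self.levels) - 1:
            self.idx += 1
            self.last_event = now
            self.errors.clear()


tuner = AutoTuner(levels=[800, 850, 900, 950])
tuner.on_error(0)     # first error: half-way to a drop
tuner.on_error(60)    # second error inside the window: drop one level
print(tuner.level)
tuner.tick(400)       # 340s without errors: step back up
print(tuner.level)
```

A real implementation would also need to decide what counts as an "error" (hardware errors, missing nonces in a known-nonce test, temperature) and would have to reprogram the PLL on each level change.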

P.S. Due to the speed of the chips... You might also want to hard-code a minimum hash-rate "Share limit". Not even attempting to broadcast diff-16 or lower... For instance... Because it would just flood the output-stream with too many collisions that would appear to be errors, only because they are not arriving completely through the data-lines. Most pools now have auto-adjusting hash-rate diff-levels, and also compensate for returned diff-levels, based on the difficulty actually returned. I am not sure how much diff-16 as a minimum would save you, but diff-128 to diff-256 would be what most suggest for a THs miner, and most auto-adjust to diff-512 anyways. (Diff-16 from a 1THs miner would surely saturate most internet connections with a stream of data that would end-up being buffered anyways.)
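To put numbers on the share-difficulty suggestion, here is a short Python sketch. It uses the usual approximation that a share at difficulty D takes about D x 2^32 hashes on average; the function name is made up for illustration and the 1 TH/s figure is the one from the post.

```python
def shares_per_second(hashrate_hs: float, share_diff: float) -> float:
    """Expected share submissions per second at a given share difficulty."""
    return hashrate_hs / (share_diff * 2**32)


for diff in (16, 128, 256, 512):
    rate = shares_per_second(1e12, diff)   # 1 TH/s miner
    print(f"diff-{diff}: {rate:.2f} shares/s")
```

On this approximation a 1 TH/s miner submits somewhere around 14 to 15 shares per second at diff-16 and fewer than one per second at diff-256 and above.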
donator
Activity: 919
Merit: 1000
Quote
Hope this does not demotivate you, but getting this issue stable was the hardest part with the boards at Bitmine.

Good Luck

Oh, Things are getting more interesting. Thanks.

I have configured the chip at 250MHz system clock and 16MHz reference clock.
pre_div=2, post_div=4, fb_div=125: 0x88 0x7d 0x21 0x84 0x00 0x00

So the hashing speed is supposed to be 250MHz * 32 cores = 8000 MH/s

Am I correct?
So maybe I need to reconfigure at a lower system clock to make it work stably?


I think the naming for post and pre divider were reversed in an earlier version of documentation / software, but basically yes, this should set the sys_clock to 250MHz.

If unsure, you always can double-check by scoping the inter-chip SPI clock, which is sys_clk/64, so at 250MHz you should get ~3.9MHz.

Note: take care to configure your host SPI clock below the inter-chip SPI clock, i.e. in this case configure your RPi SPI interface below 3.9MHz.
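The clock check described above can be written out as a small Python sketch. The sys_clk/64 relation and the 32-core figure come from the posts; the plain divider formula in `pll_sys_clk_mhz` is an assumption chosen only because it reproduces the 250MHz figure from the quoted settings, so check it against the datasheet (especially given the remark that pre and post divider naming was once reversed). All function names are made up for illustration.

```python
def pll_sys_clk_mhz(ref_mhz: float, pre_div: int, fb_div: int, post_div: int) -> float:
    """ASSUMED divider formula; the datasheet is authoritative."""
    return ref_mhz * fb_div / (pre_div * post_div)


def interchip_spi_mhz(sys_clk_mhz: float) -> float:
    """Inter-chip SPI clock, stated in the post as sys_clk/64."""
    return sys_clk_mhz / 64


def sys_clk_from_spi_mhz(spi_mhz: float) -> float:
    """Inverse check: infer sys_clk from a scoped inter-chip SPI clock."""
    return spi_mhz * 64


sys_clk = pll_sys_clk_mhz(16, pre_div=2, fb_div=125, post_div=4)
spi = interchip_spi_mhz(sys_clk)
print(f"sys_clk = {sys_clk:.0f} MHz, inter-chip SPI = {spi:.2f} MHz")
print(f"ideal hash rate = {sys_clk * 32:.0f} MH/s (32 cores)")
print(f"a scoped 16 MHz SPI clock would imply sys_clk = {sys_clk_from_spi_mhz(16):.0f} MHz")
```

Whatever the real PLL formula turns out to be, the scoped inter-chip SPI clock is the number to trust, and the host SPI clock has to stay below it.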
sr. member
Activity: 335
Merit: 250
Quote
Hope this does not demotivate you, but getting this issue stable was the hardest part with the boards at Bitmine.

Good Luck

Oh, Things are getting more interesting. Thanks.

I have configured the chip at 250MHz system clock and 16MHz reference clock.
pre_div=2, post_div=4, fb_div=125: 0x88 0x7d 0x21 0x84 0x00 0x00

So the hashing speed is supposed to be 250MHz * 32 cores = 8000 MH/s

Am I correct?
So maybe I need to reconfigure at a lower system clock to make it work stably?
donator
Activity: 919
Merit: 1000
I'm using the raspi driver with one chip.
The chip starts hashing OK but after a while (30s) it stops.
I have examined the log and found that the problem is in the scanwork function.

We see that the register values cannot be read.

Well, that's the standard problem with unstable power blocks: once you feed the chip with work, it starts hashing and with that consumes serious power. When that happens and your DCDC is not capable of providing the required power, the chip resets itself.

You need to scope your supply voltage and the reset line; then you should be able to notice significant voltage drops below critical values (800mV supply, 1.6V reset).

Once the chip resets itself, it becomes inaccessible and you need to re-issue a HW reset and send the initialisation command sequence again.


Hope this does not demotivate you, but getting this issue stable was the hardest part with the boards at Bitmine.


Good Luck
hero member
Activity: 924
Merit: 1000
Our 2 chip test A1 Wasps will be housed in the Hive pictured below. Testing of the A1 Wasp board will start on Tuesday.

Hive P0rn. BP-1. Rev0.


Bichnellski, do you have cgminer fork for wasps or any board - e.g. 2-chip DIY-board developed by Zefir ?
And what is the progress on this ?

We are not basing our Wasp firmware / software on cgminer; we used bfgminer / bifury as our starting point.

Review our thread for updates and progress on the Wasp & Hive Project.
sr. member
Activity: 335
Merit: 250
I'm using the raspi driver with one chip.
The chip starts hashing OK but after a while (30s) it stops.
I have examined the log and found that the problem is in the scanwork function.

Here is the log:
Code:
[2014-03-16 11:36:49] Processing command 0x0800

 [2014-03-16 11:36:49] TX: 2 bytes:08 00
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:08 00
 [2014-03-16 11:36:49] Output queue empty
 [2014-03-16 11:36:49] Processing command 0x0a01

 [2014-03-16 11:36:49] TX: 2 bytes:0A 01
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:0A 01
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] Failure: missing ACK for cmd 0x1a
 [2014-03-16 11:36:49] Failed to read reg from chip 0
 [2014-03-16 11:36:49] Processing command 0x0a01

 [2014-03-16 11:36:49] TX: 2 bytes:0A 01
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] RX: 2 bytes:0A 01
 [2014-03-16 11:36:49] RX: 2 bytes:00 00
 [2014-03-16 11:36:49] Failure: missing ACK for cmd 0x1a
 [2014-03-16 11:36:49] Disabling chip 1

We see that the register values cannot be read.
member
Activity: 101
Merit: 10
Our 2 chip test A1 Wasps will be housed in the Hive pictured below. Testing of the A1 Wasp board will start on Tuesday.

Hive P0rn. BP-1. Rev0.






Bichnellski, do you have cgminer fork for wasps or any board - e.g. 2-chip DIY-board developed by Zefir ?
And what is the progress on this ?
hero member
Activity: 924
Merit: 1000
Our 2 chip test A1 Wasps will be housed in the Hive pictured below. Testing of the A1 Wasp board will start on Tuesday.

Hive P0rn. BP-1. Rev0.




newbie
Activity: 30
Merit: 0
Got my PCB made this week. Anyone else working on a reference build? Would love to compare notes :)

http://i.imgur.com/J34PiOI.jpg

I am selling 2 extra boards that I had made at cost ($25). I am in the US, quick shipping.
donator
Activity: 919
Merit: 1000
I had same problem in 1-chip Test.

I just changed additional recv size. :-)
(at [register read] & [result read] )

Can 2-chip cgminer fork with RX buffer patch be uploaded to Bitmine Github ?

That needs to be forked from someone actually having that kind of hardware (I don't). The driver that is in cgminer upstream repository is specifically for the CoinCraft Desk modules, every derivative will need to have its dedicated driver.
member
Activity: 101
Merit: 10
Clarification: SPI Processing shorter Chip Chains

I have been approached by users having difficulty operating their designs with the official or reference cgminer driver.

Here is the SPI trace provided:
Code:
TX:  8 bytes: 09 00 88 A6 21 84 00 00      //spi_send_command
RX:  8 bytes: 00 00 00 00 09 00 88 A6      //spi_send_command
RX:  2 bytes: 21 84               //spi_poll_result
RX:  2 bytes: 00 00               //spi_poll_result
RX:  2 bytes: 00 00               //spi_poll_result

To understand why polling for the ACK (i.e. reading 0x09 00 from the chain) fails in this case, consider how the current implementation accesses the chain:
1) write a command to the first A1, which with every next dummy write is shifted to the next A1
2) to get an ACK, shift the data through the chain until you receive the expected return values

The reference implementation writes the command in a first step and polls for the result in a second. This works with longer chains, but in this case (with 1 or 2 chip chains) it fails because the response is already being returned while the command is still being written.

To circumvent the issue, one needs to search for the ACK already in the RX buffer of the command write step. In the case above, the correct processing would therefore be:
1) write 8 bytes for command 09
2) in the RX buffer search for the ACK (here at position 5)
3) determine how many more words need to be polled for the full ACK (here 2 more words) and do the poll


Unfortunately I have no single chip chain available to test and implement this, but it obviously is no rocket science and should be easily done.
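The three processing steps above can be sketched in Python as follows. This is not the cgminer driver code, just an illustration of the search: `find_ack` is a made-up name, and the sketch assumes the ACK echoes the first two command bytes, sits on a 16-bit word boundary, and that the full response is as long as the command that was written (all three hold for the trace shown).

```python
def find_ack(rx: bytes, ack: bytes, resp_len: int):
    """Locate `ack` on a 16-bit word boundary in the command-write RX buffer.

    Returns (byte_offset, words_still_to_poll), or None if the ACK has not
    arrived yet and the usual polling loop is still needed.
    """
    for off in range(0, len(rx) - len(ack) + 1, 2):
        if rx[off:off + len(ack)] == ack:
            received = len(rx) - off           # response bytes already in hand
            missing = max(0, resp_len - received)
            return off, (missing + 1) // 2     # round up to whole 16-bit words
    return None


# TX/RX bytes of the command-write step from the trace above.
tx = bytes.fromhex("09 00 88 A6 21 84 00 00")
rx = bytes.fromhex("00 00 00 00 09 00 88 A6")
print(find_ack(rx, ack=tx[:2], resp_len=len(tx)))
```

For the trace above this finds the ACK at byte offset 4 (position 5 when counting from 1) with 2 more words left to poll. A production version would also need to guard against register payload bytes that happen to look like the ACK.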

I had same problem in 1-chip Test.

I just changed additional recv size. :-)
(at [register read] & [result read] )

Can 2-chip cgminer fork with RX buffer patch be uploaded to Bitmine Github ?