
Topic: Cairnsmore1 - Quad XC6SLX150 Board - page 19. (Read 286370 times)

legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
September 13, 2012, 07:47:20 PM
...
2012-09-13 20:18:25.275   [200]   0017-p3 hv-175:    Got H-not-zero share 7416e19f
...
If you were wondering about that message: it usually means a HW error share.

I'm not sure what mpbm does with HW error shares, but on any device a HW error is simply detected by H being non-zero.

H is the first 32 bits of the block hash (in the order you normally see it), and as you normally see it, it is always all zeros.
As soon as it is non-zero for any given share (nonce) found by the device, you know the device got it wrong and it's a hardware error, usually due to overheating or other such issues.
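
For anyone who wants to see that check spelled out, here is a minimal Python sketch. It is not how mpbm or any particular miner implements it; the function names are made up for illustration, and the only data used is the publicly known Bitcoin genesis block header as a test vector. It hashes an 80-byte header twice with SHA-256, reverses the bytes into the usual display order, and looks at the first 32 bits (8 hex digits).

```python
import hashlib
import struct


def display_hash(header: bytes) -> str:
    """Double SHA-256 of an 80-byte block header, in the hex form you normally see."""
    if len(header) != 80:
        raise ValueError("a Bitcoin block header is exactly 80 bytes")
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    # Block hashes are conventionally displayed byte-reversed.
    return digest[::-1].hex()


def h_is_zero(header_without_nonce: bytes, nonce: int) -> bool:
    """True if the first 32 bits of the displayed hash are all zero."""
    header = header_without_nonce + struct.pack("<I", nonce)
    return display_hash(header)[:8] == "00000000"


# Test vector: the first 76 bytes of the Bitcoin genesis block header.
GENESIS_PREFIX = (
    struct.pack("<I", 1)                 # version
    + bytes(32)                          # previous block hash (all zeros)
    + bytes.fromhex(                     # merkle root, stored byte-reversed
        "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
    )[::-1]
    + struct.pack("<I", 1231006505)      # timestamp
    + struct.pack("<I", 0x1D00FFFF)      # compact target ("bits")
)
GENESIS_NONCE = 2083236893

if __name__ == "__main__":
    # The real nonce should pass the check; a neighbouring nonce almost surely fails.
    print("genesis nonce, H zero:", h_is_zero(GENESIS_PREFIX, GENESIS_NONCE))
    print("nonce + 1, H zero:   ", h_is_zero(GENESIS_PREFIX, GENESIS_NONCE + 1))
```

A nonce reported by the FPGA that fails this test is what the log calls an "H-not-zero share".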
newbie
Activity: 37
Merit: 0
September 13, 2012, 06:55:38 PM
Are the dipswitch settings correct on the bitstream upgrade section? 'sw6 dip2 off' is contrary to this: https://bitcointalksearch.org/topic/m.1073047 which shows 'sw6 dip1 off'

It looks correct as per the settings here for the 1.5 controller:

http://www.enterpoint.co.uk/cairnsmore/cairnsmore1_support_materials.html

'3' (a.k.a SW1-3) is programming enabled, '6'  (a.k.a SW6-2) is SPI programming enabled

I'm referring to the section for upgrading the bitstream though, not updating the controller. Is the SPI switch still relevant at this point in the procedure?
And is SW6-1 the baud rate? I ask because I'm getting better results with that set to 'off' (115200) for mining.

The guide is excellent btw :)



Yes, you're right. I got those transposed somehow.

I've corrected them to leave SW6-1 off (115200 baud) all the time (there's no reason to change it when flashing)
For controller programming, setting SW1-3 & SW6-2 off is only necessary when reflashing the controller post 1.3 so that's a good catch-all
I've corrected the FPGA programming settings as per Lethos's diagram


Well spotted, cheers for that :) - I'm not in a position to do a full dummy run of this, as I've got a single board happily hashing away and don't want to interrupt it.

sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 06:04:25 PM
... I just find it odd; what could have caused it to just start happening?
...
'Maybe' a small increase in ambient temperature?

Are you using the makomk bitstream and are any of your CM1 FPGAs generating over 0.5% Invalids?

For me, USB related problems went away when I stopped connecting a USB cable to a board which is generating 2.25% invalids. (By daisy chaining the boards and making the one with about 0.2% invalids the Master.) With this approach I've been operating four days continuously whereas before the reconfiguration I had two failures within 24 hrs.


I'm not using the makomk bitstream; I've used the hashvoodoo one. I'm not sure it even makes a difference: before I put things in place to prevent power being a problem, it never mined. For reference, I get approximately 0.01% invalids on his bitstream; they're pretty rare.



Must have missed that (or don't remember). I just find it odd; what could have caused it to just start happening?
At least I know why now, though I don't know for sure whether a new powered hub will help.

Lethos,

I don't think a powered hub can help you here since it is an OS/driver problem as far as I know.

spiccioli

I'm back to using Ubuntu to prove it wasn't a software problem, using the same method I used in Debian.
I've always been cheap when it came to hubs; I've never seen there be much of a difference between them. Quite different to how I am with PSUs (always high end). So this time I'm getting a really good one. It doesn't bother me if I'm wrong: I don't have a good 7-port USB hub, so even if it ends up not fixing the problem permanently, it will find a use.
full member
Activity: 562
Merit: 100
September 13, 2012, 05:48:05 PM
Are the dipswitch settings correct on the bitstream upgrade section? 'sw6 dip2 off' is contrary to this: https://bitcointalksearch.org/topic/m.1073047 which shows 'sw6 dip1 off'

It looks correct as per the settings here for the 1.5 controller:

http://www.enterpoint.co.uk/cairnsmore/cairnsmore1_support_materials.html

'3' (a.k.a SW1-3) is programming enabled, '6'  (a.k.a SW6-2) is SPI programming enabled

I'm referring to the section for upgrading the bitstream though, not updating the controller. Is the SPI switch still relevant at this point in the procedure?
And is SW6-1 the baud rate? I ask because I'm getting better results with that set to 'off' (115200) for mining.

The guide is excellent btw :)

newbie
Activity: 37
Merit: 0
September 13, 2012, 04:46:05 PM
I've posted CM1 quickstart guide draft 2 at the same URL:

http://btc.steveme.mailforce.net/CM1%20quickstart%20guide.html

If there's anything you'd like me to add or any errors to fix please let me know.


Once it's finalized I'll post it on the bitcoin.it wiki (hopefully)

Are the dipswitch settings correct on the bitstream upgrade section? 'sw6 dip2 off' is contrary to this: https://bitcointalksearch.org/topic/m.1073047 which shows 'sw6 dip1 off'

It looks correct as per the settings here for the 1.5 controller:

http://www.enterpoint.co.uk/cairnsmore/cairnsmore1_support_materials.html

'3' (a.k.a SW1-3) is programming enabled, '6'  (a.k.a SW6-2) is SPI programming enabled
full member
Activity: 562
Merit: 100
September 13, 2012, 04:33:12 PM
I've posted CM1 quickstart guide draft 2 at the same URL:

http://btc.steveme.mailforce.net/CM1%20quickstart%20guide.html

If there's anything you'd like me to add or any errors to fix please let me know.


Once it's finalized I'll post it on the bitcoin.it wiki (hopefully)

Are the dipswitch settings correct on the bitstream upgrade section? 'sw6 dip2 off' is contrary to this: https://bitcointalksearch.org/topic/m.1073047 which shows 'sw6 dip1 off'
newbie
Activity: 37
Merit: 0
September 13, 2012, 04:11:04 PM
I've posted CM1 quickstart guide draft 2 at the same URL:

http://btc.steveme.mailforce.net/CM1%20quickstart%20guide.html

If there's anything you'd like me to add or any errors to fix please let me know.


Once it's finalized I'll post it on the bitcoin.it wiki (hopefully)
hm
member
Activity: 107
Merit: 10
September 13, 2012, 12:48:41 PM
Stats are OK:


#0017 is being flashed right now (hashvoodoo_release_08_16_2012).

I've got a bunch of old serial ports with flat cables and 10-pin connectors lying around.
I would like to use them to build some daisy-chaining cables for my rig.
A possible problem is that they only have 9 lines instead of 10.
Can I still use them? Is there a pinout description available for J12 and J13?

ps. Another thing... I just had all four of my boards connected to power; the USB cables were connected to the boards but not to the computer. When I touched at least two of the USB connectors repeatedly, I got electric shocks every time. When touching one USB connector alone, I didn't get shocked.
Also, I have an unpowered USB hub with an LED for each port. When I had all the boards connected, I disconnected the hub from the PC and the LEDs were still on, though not as bright as when connected to the PC.
Is this normal?

This is the hub. Half of the boards couldn't start mining using this cheap Chinese product; I recommend not using it:


pps. Board #0017 is running... not good, but better than before. There are mpbm log messages like this:
Code:
2012-09-13 20:18:25.216 [250] 0017-p1 hv-175: 50btc accepted share 88297200 (difficulty 1.01576)
2012-09-13 20:18:25.275 [200] 0017-p3 hv-175: Got H-not-zero share 7416e19f
2012-09-13 20:18:28.091 [200] 0017-p2 hv-175: Got H-not-zero share fca2654d
2012-09-13 20:18:28.091 [200] 0017-p2 hv-175: Detected overload condition!
2012-09-13 20:18:29.963 [200] 0017-p3 hv-175: Got H-not-zero share a58ccc30
2012-09-13 20:18:29.964 [200] 0017-p3 hv-175: Detected overload condition!
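
If you want a per-FPGA tally out of a log like this, something along these lines might do. This is only a rough sketch: the line layout is inferred purely from the six lines quoted above (timestamp, a bracketed number, a worker ID such as `0017-p3`, a tag ending in a colon, then the message), and `LINE_RE`, `SAMPLE_LOG` and `summarize` are made-up names, not anything from mpbm itself.

```python
import re
from collections import Counter, defaultdict

# Assumed layout: date, time, [number], worker id, tag ending in ':', message.
LINE_RE = re.compile(
    r"^\S+\s+\S+\s+\[\d+\]\s+(?P<worker>\S+)\s+\S+:\s+(?P<msg>.*)$"
)

# The log lines quoted in the post above.
SAMPLE_LOG = """\
2012-09-13 20:18:25.216 [250] 0017-p1 hv-175: 50btc accepted share 88297200 (difficulty 1.01576)
2012-09-13 20:18:25.275 [200] 0017-p3 hv-175: Got H-not-zero share 7416e19f
2012-09-13 20:18:28.091 [200] 0017-p2 hv-175: Got H-not-zero share fca2654d
2012-09-13 20:18:28.091 [200] 0017-p2 hv-175: Detected overload condition!
2012-09-13 20:18:29.963 [200] 0017-p3 hv-175: Got H-not-zero share a58ccc30
2012-09-13 20:18:29.964 [200] 0017-p3 hv-175: Detected overload condition!
"""


def summarize(lines):
    """Count accepted shares, H-not-zero shares and overload events per worker."""
    stats = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a log line
        worker, msg = match["worker"], match["msg"]
        if "Got H-not-zero share" in msg:
            stats[worker]["hw_error"] += 1
        elif "accepted share" in msg:
            stats[worker]["accepted"] += 1
        elif "Detected overload condition" in msg:
            stats[worker]["overload"] += 1
    return stats


if __name__ == "__main__":
    for worker, counts in sorted(summarize(SAMPLE_LOG.splitlines()).items()):
        print(worker, dict(counts))
```

On a longer log, the worker with the most `hw_error` entries is the FPGA to look at first.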
sr. member
Activity: 476
Merit: 250
September 13, 2012, 12:16:08 PM
... I just find it odd; what could have caused it to just start happening?
...
'Maybe' a small increase in ambient temperature?

Are you using the makomk bitstream and are any of your CM1 FPGAs generating over 0.5% Invalids?

For me, USB related problems went away when I stopped connecting a USB cable to a board which is generating 2.25% invalids. (By daisy chaining the boards and making the one with about 0.2% invalids the Master.) With this approach I've been operating four days continuously whereas before the reconfiguration I had two failures within 24 hrs.
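
To make the "which board should be the Master" choice concrete, here is a small Python sketch. It assumes the invalid percentage is invalid shares divided by all submitted shares (accepted plus invalid); the board names and share counts are invented purely to reproduce the 2.25% and 0.2% figures mentioned above, and the 0.5% threshold is the one from the question.

```python
def invalid_pct(accepted: int, invalid: int) -> float:
    """Invalid shares as a percentage of all submitted shares (assumed definition)."""
    total = accepted + invalid
    if total == 0:
        return 0.0
    return 100.0 * invalid / total


def pick_master(boards: dict) -> str:
    """Return the board with the lowest invalid percentage."""
    return min(boards, key=lambda name: invalid_pct(*boards[name]))


def over_threshold(boards: dict, threshold_pct: float = 0.5) -> list:
    """Return the boards whose invalid percentage exceeds the threshold."""
    return sorted(
        name for name, counts in boards.items()
        if invalid_pct(*counts) > threshold_pct
    )


# Hypothetical boards: name -> (accepted shares, invalid shares).
BOARDS = {
    "cm1-a": (9775, 225),  # 2.25% invalids
    "cm1-b": (9980, 20),   # 0.2% invalids
}

if __name__ == "__main__":
    print("suggested Master:", pick_master(BOARDS))
    print("over 0.5% invalids:", over_threshold(BOARDS))
```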
legendary
Activity: 1378
Merit: 1003
nec sine labore
September 13, 2012, 12:05:06 PM

Must have missed that (or don't remember). I just find it odd; what could have caused it to just start happening?
At least I know why now, though I don't know for sure whether a new powered hub will help.

Lethos,

I don't think a powered hub can help you here since it is an OS/driver problem as far as I know.

spiccioli
hero member
Activity: 481
Merit: 502
September 13, 2012, 09:43:29 AM
Just in case anyone is interested, BFGMiner isn't working with my Cairnsmore1s, but CGMiner is working fine :)
sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 09:07:52 AM

However, the reason for the entire system slowly crashing, while still kind of working, was that the USB stick was becoming undetectable, either by becoming unmounted or ejected, as soon as any of the CM1s was connected via USB. I have used a wireless mouse (USB Bluetooth) and a USB thumb drive since the beginning, and it has always been that way, but this most recent problem only affects the thumb drive. I even tested whether it temporarily affected the mouse, and it did not.

Hi Lethos,

This is a problem which surfaced in the past (see old thread messages), where attaching a USB device sometimes makes a different one disappear. If I'm not wrong, this problem was on Windows back then.

Anyway, good to know it happens on Linux as well.

spiccioli


Must have missed that (or don't remember). I just find it odd; what could have caused it to just start happening?
At least I know why now, though I don't know for sure whether a new powered hub will help.
legendary
Activity: 1378
Merit: 1003
nec sine labore
September 13, 2012, 08:57:33 AM

However, the reason for the entire system slowly crashing, while still kind of working, was that the USB stick was becoming undetectable, either by becoming unmounted or ejected, as soon as any of the CM1s was connected via USB. I have used a wireless mouse (USB Bluetooth) and a USB thumb drive since the beginning, and it has always been that way, but this most recent problem only affects the thumb drive. I even tested whether it temporarily affected the mouse, and it did not.

Hi Lethos,

This is a problem which surfaced in the past (see old thread messages), where attaching a USB device sometimes makes a different one disappear. If I'm not wrong, this problem was on Windows back then.

Anyway, good to know it happens on Linux as well.

spiccioli
sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 08:56:21 AM
Ah USB problems.
Caused by casper (and /cow)
Read here if you are interested in the details of the problem and a suggested fixed set of scripts:
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2012-May/013639.html

I believe that to be a different but similar problem. Not ruling it out of course.
The side effect is that the USB persistent filesystem regularly gets corrupted.
Without fixing the dismount on shutdown, and allowing fsck on boot, you can expect all sorts of large and small weird problems due to random file corruption...

True, it probably would, but the system doesn't (yet) suffer any bad issues from these resets. It operates fine now and had done so until the USB triggered it to drop the thumb drive. For now I have managed to avoid that occurring again.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
September 13, 2012, 08:44:18 AM
Ah USB problems.
Caused by casper (and /cow)
Read here if you are interested in the details of the problem and a suggested fixed set of scripts:
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2012-May/013639.html

I believe that to be a different but similar problem. Not ruling it out of course.
The side effect is that the USB persistent filesystem regularly gets corrupted.
Without fixing the dismount on shutdown, and allowing fsck on boot, you can expect all sorts of large and small weird problems due to random file corruption...
sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 08:36:22 AM
Ah USB problems.
Caused by casper (and /cow)
Read here if you are interested in the details of the problem and a suggested fixed set of scripts:
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2012-May/013639.html

I believe that to be a different but similar problem. Not ruling it out of course.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
September 13, 2012, 08:28:07 AM
Ah USB problems.
Caused by casper (and /cow)
Read here if you are interested in the details of the problem and a suggested fixed set of scripts:
https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2012-May/013639.html
sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 08:15:32 AM
I'm looking forward to trying your next release. I hope you'll succeed in raising the clock while maintaining mining stability.

Yes, I just finished the smartxplorer run, and unfortunately it encountered some problems :(

I'm tweaking some more settings now and re-running it. The new release works beautifully... But at its current clock speed it's only about "equal" to the 175 release, so I wanted to push the clock higher before "releasing" it. I know this thing can get up to 180MHz+ while meeting timing for a Spartan6 LX150 -2 speed grade chip. And that would mean, with overclock, well over 200MHz. But I've got to find the "magic" combination of settings that will coax the bloody Xilinx toolchain to produce a working bitstream...

I'll post another update once I have it.

I'll be happy to test your next one, Glasswalker. :)
sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
September 13, 2012, 08:15:02 AM
I think I may have found a temporary fix to my problem ...
Congratulations.

Just in case someone else someday runs into the same or equivalent problem, and while it is still fresh in your mind, could you give a very short description of what you uncovered, please?

--

If you want to wait a few days for validation, that makes sense. However, I do suggest jotting notes down now. It is amazing how fast little details can fade from memory.

It's probably best to wait a while to really confirm what was causing it, since I don't know for sure whether my temporary fix or its more permanent solution will work. So I don't know for sure that the following is true.

However, the reason for the entire system slowly crashing, while still kind of working, was that the USB stick was becoming undetectable, either by becoming unmounted or ejected, as soon as any of the CM1s was connected via USB. I have used a wireless mouse (USB Bluetooth) and a USB thumb drive since the beginning, and it has always been that way, but this most recent problem only affects the thumb drive. I even tested whether it temporarily affected the mouse, and it did not.
Only programs which were entirely in memory continued to work; anything currently being accessed from the stick disappeared. This led to a slow but eventual system crash as more and more programs realised, some quicker than others, that they could not access their files.

Debian has more system-level and diagnostic programs installed by default that work entirely in memory, which made it easy to find out how and what occurred and to get it working again. Btw, I don't recommend ever yanking out a USB stick mid-use; it's more obvious now how many orphaned files it has. While it didn't cause any permanent damage, I'd bet it would eventually.

While I've made only minor hardware changes since I got the system (replacing a USB hub), nothing has changed recently, and those changes did not coincide with the timing of this problem; it was a software change that allowed me to uncover it.
I'm getting a new powered USB hub. I've gone through two since I got the system; it also worked fine without them for a while, connecting directly to the USB ports on the board, and was actually most stable that way, surprisingly.
Hence I'm uncertain whether it will actually fix it, but I figured that if the thumb drive could drop like that, the only thing that could make it do so is a lack of power to maintain it. According to what little log info I have, it wasn't a software call or command that issued it.
sr. member
Activity: 407
Merit: 250
September 13, 2012, 08:06:05 AM
I'm looking forward to trying your next release. I hope you'll succeed in raising the clock while maintaining mining stability.

Yes, I just finished the smartxplorer run, and unfortunately it encountered some problems :(

I'm tweaking some more settings now and re-running it. The new release works beautifully... But at its current clock speed it's only about "equal" to the 175 release, so I wanted to push the clock higher before "releasing" it. I know this thing can get up to 180MHz+ while meeting timing for a Spartan6 LX150 -2 speed grade chip. And that would mean, with overclock, well over 200MHz. But I've got to find the "magic" combination of settings that will coax the bloody Xilinx toolchain to produce a working bitstream...

I'll post another update once I have it.