Cairnsmore1 - Quad XC6SLX150 Board - page 71.

Isokivi

hero member

Activity: 910

Merit: 1000

Items flashing here available at btctrinkets.com

Quote from: kano on July 06, 2012, 03:52:53 AM

shares are 2^32 hashes

13154 shares in 14 hrs is 1121MH/s

However, 14 hours is not long enough to ensure the number of shares is exactly the same as the hash rate.
Maybe give it a few days at least if you want to use the number to be close to correct.
(also you need to measure the time accurately as I'm sure it wasn't exactly 14hrs, and even 1 minute is 0.12%)

My intention was to have a look at rough numbers at this point, thank you for the math. I'm not that interested in getting an accurate performance figure while the boards are at best crippled to less than half of what they should eventually be able to provide.

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

shares are 2^32 hashes

13154 shares in 14 hrs is 1121MH/s

However, 14 hours is not long enough to ensure the number of shares is exactly the same as the hash rate.
Maybe give it a few days at least if you want to use the number to be close to correct.
(also you need to measure the time accurately as I'm sure it wasn't exactly 14hrs, and even 1 minute is 0.12%)
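The arithmetic above can be reproduced with a short sketch. It assumes difficulty-1 shares (2^32 hashes each, as stated in the post) and, as a further assumption not made in the post, that share arrivals follow a Poisson process, so the relative scatter of the count is roughly 1/sqrt(N). The function and variable names are illustrative only.

```python
import math

HASHES_PER_SHARE = 2 ** 32  # one difficulty-1 share, per the post above


def hashrate_mhs(shares: int, seconds: float) -> float:
    """Average hash rate in MH/s implied by a share count over a time span."""
    return shares * HASHES_PER_SHARE / seconds / 1e6


shares = 13154
seconds = 14 * 3600  # nominal 14 hours

rate = hashrate_mhs(shares, seconds)

# Sensitivity to timing: what if the run was really one minute shorter?
rate_minus_1min = hashrate_mhs(shares, seconds - 60)
timing_error_pct = (rate_minus_1min / rate - 1) * 100

# Assumption: Poisson arrivals, so one standard deviation is sqrt(N) shares.
luck_error_pct = 100 / math.sqrt(shares)

print(f"average rate:          {rate:.0f} MH/s")
print(f"1 minute timing error: {timing_error_pct:.2f} %")
print(f"1-sigma share scatter: {luck_error_pct:.2f} %")
```

Under these assumptions the random scatter in the share count (a bit under 1% at one standard deviation) is larger than the error from mistiming the run by a minute, which fits the advice to let it run for a few days.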

Isokivi

hero member

Activity: 910

Merit: 1000

Quote from: Isokivi on July 05, 2012, 01:58:49 PM

Quote from: yohan on July 05, 2012, 01:38:25 PM

Are you running the correct version of CGminer?

When you flashed the units, did you have the dip switches in the correct position for flashing and then afterwards set them to the running position for the twin bitstream?

I thought I was, somewhere along the line I had for some reason started the wrong one... I'm running the correct one now, I'll report in when I have enough data, I think around 1000 shares per core should be definitive?

And yes I have been playing with the dip switches in accordance with your pictures.

Ok after 14 hours (a bit less tbh, but it's like 10min) of running 3 boards in the correct cgminer, at a regular pool, with the twin_test bitstream:

The pool side shows my current hashing speed to be 1146 MH/s (this is supposedly a 10 or so minute average); I have accumulated 13151 valid shares and 3 invalids.
Cgminer shows: all cores running and hashing at roughly equal speeds (the slow core was caused by the incorrect miner version), 13300 shares, 3 rejected shares.

Could someone smarter than me do the math on 13154 shares in 14 hours: what's the average hashing speed?

yohan

sr. member

Activity: 462

Merit: 251

Quote from: zefir on July 06, 2012, 12:29:14 AM

Quote from: ebereon on July 05, 2012, 06:11:38 PM

I can not believe that the differences are so great with 50 boards. Enterpoint stated that the problem is the controller firmware but why these differences?

Most probably it is a controller issue, or it could even be on SW side and solvable with cgminer updates.

Since boards were tested in shipping-test mode at Enterpoint before delivery, I should re-program my batch and repeat the test. But since this will eat up one additional weekend, I prefer to wait for a better bitstream (hoping for the next week).

Quote from: Keninishna on July 05, 2012, 07:46:57 PM

Crap I hope my 23 boards work when they arrive. On the other hand I'm sure enterpoint will take back the non working boards and send you new ones.

I guess only those 3 boards that fail to get detected might need to be RMA'd, while all the others should be fixable with new FW. No doubt defunct units will get replaced, as has already happened in this thread.

Can you send as much information as possible to the bitcoin support email so that everyone relevant gets to see the problem? Outside what's already been mentioned on the forum, and thanks to all that have already forwarded into our support, we have not had any reports like this come in. So as yet we don't have a big statistical base to work on. We also haven't had a faulty unit arrive back to us yet to analyse. One, I think, is on its way. So bear with us for a few days whilst we get enough information to give us a clue where to look. Probably the first thing to do is to try and create a model setup as similar to Zefir's as we can. We do think there is more than one aspect here, and elements of software and firmware are probably the main place to look. However, we can't rule out a hardware failure, so that also needs to be part of the analysis. From what we have seen here on the line, USB failures are very rare and all down to real, identifiable things like solder shorts.

zefir

donator

Activity: 919

Merit: 1000

Quote from: ebereon on July 05, 2012, 06:11:38 PM

I can not believe that the differences are so great with 50 boards. Enterpoint stated that the problem is the controller firmware but why these differences?

Most probably it is a controller issue, or it could even be on SW side and solvable with cgminer updates.

Since boards were tested in shipping-test mode at Enterpoint before delivery, I should re-program my batch and repeat the test. But since this will eat up one additional weekend, I prefer to wait for a better bitstream (hoping for the next week).

Quote from: Keninishna on July 05, 2012, 07:46:57 PM

Crap I hope my 23 boards work when they arrive. On the other hand I'm sure enterpoint will take back the non working boards and send you new ones.

I guess only those 3 boards that fail to get detected might need to be RMA'd, while all the others should be fixable with new FW. No doubt defunct units will get replaced, as has already happened in this thread.

Keninishna

hero member

Activity: 556

Merit: 500

Quote from: zefir on July 05, 2012, 05:27:48 PM

Update: initial statistical evaluation

Having an almost statistically significant number of units at hand, and having fought for almost two weeks to get the most out of them, I have collected some numbers that might be interesting for you.

....
Thanks and good night.

Crap I hope my 23 boards work when they arrive. On the other hand I'm sure enterpoint will take back the non working boards and send you new ones.

ebereon

sr. member

Activity: 397

Merit: 500

Well done, thanks for sharing that information zefir.

I can not believe that the differences are so great with 50 boards. Enterpoint stated that the problem is the controller firmware but why these differences?

zefir

donator

Activity: 919

Merit: 1000

Update: initial statistical evaluation

Having an almost statistically significant number of units at hand, and having fought for almost two weeks to get the most out of them, I have collected some numbers that might be interesting for you.

Setup details:

50 boards, serial numbers 55-100 and 126-129
all boards SPI programmed to twin_test.bit
all DIP switches set as documented for twin test mode, plus SW6-1 off for 115kBaud
powered 7-port USB2.0 hubs
1 x Enermax Revo 1.5kW + 2 x CoolerMaster 1.2kW gold PSUs
programming with xc3sprog natively under Linux
mining with native cgminer-2.4.4 for Icarus under Linux

What made testing hard for me was the non-deterministic failure patterns, i.e. one unit might fail today and work tomorrow or vice versa. After successively narrowing down the problematic units, I classified my batch as follows:

1. OK
Around 23 boards work almost stably, meaning after 24h there are only 2-5 FPGAs that have stopped operating, while the other ~40 settle down to a utility of ~2.5/min. This corresponds to ~360 MH/s per unit; the pool is reporting around 8 GH/s for the whole array. Having to power-cycle the whole setup every 24h is far from maintenance-free, but I already get 20% of the potential total hashing power the Quads would deliver if operated as proposed.

2. UNSTABLE
Unstable units are those that

start to mine at full speed but stop after some hours
mine with a significantly lower hashing rate: U < 1/min
start to mine with only one FPGA
or an arbitrary combination of the named issues

Those devices are the most problematic ones, since you need several rounds to identify an unstable unit. From my batch 15 belong to this category.

3. FAILING
I classify as failing those units that are detected but do not mine, including

those that fail the Icarus detection (i.e. fail to find the golden nonce)
those that are detected as Icarus but do not generate any valid shares

My batch has 9 of those units, which are easy to identify, since the failure is reproducible, i.e. no matter if you reprogram them before you start or not, they always fail to generate valid shares.

4. BROKEN
Those are the units that either

are not detected as ttyUSB serial ports
get disconnected immediately after detection (Linux reports bus errors)
fail to initialize communication (Linux reports errors setting com parameters or flow control)
just freeze the system until you pull the plug

I have 3 of those boards that I assume need to be sent back to Enterpoint for repair, since the problem seems to be located at the FTDI side.

Bottom line: your chances to set up your Cairnsmore1 as Icarus and mine continuously are less than 45%. Your odds of having an unstable device are 30%, and that device will trick and cheat you until you start to disbelieve in your skills. 18% will find their board not generating valid hashes in Icarus mode (maybe only working in shipping-test mode, not tested). And finally 6% might be the unlucky ones that buy new USB hubs and/or switch between Linux and Windows just to see that the board is not detected and/or makes the system hang.

This is a rough estimate based on my batch, YMMV. The only advice I can give: do not kick the board too hard and don't get frustrated about your skills - chances are 50%+ that yours is not the OK one.
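As a cross-check of the figures in this post, here is a minimal sketch that recomputes the category percentages and converts the quoted utility into a hash rate. It assumes difficulty-1 shares (2^32 hashes each) and two hashing FPGAs per board under the twin bitstream; the latter is inferred from the ~2.5/min and ~360 MH/s figures rather than stated outright. The 800 MH/s per-board target is the figure hoped for later in the post. All names are illustrative.

```python
HASHES_PER_SHARE = 2 ** 32  # assumption: difficulty-1 shares

# Board counts per category, as reported for the 50-board batch
batch = {"OK": 23, "UNSTABLE": 15, "FAILING": 9, "BROKEN": 3}
total = sum(batch.values())
pct = {name: 100 * count / total for name, count in batch.items()}


def utility_to_mhs(shares_per_min: float) -> float:
    """Convert cgminer utility (accepted shares per minute) to MH/s."""
    return shares_per_min * HASHES_PER_SHARE / 60 / 1e6


FPGAS_PER_BOARD = 2          # assumption: twin bitstream drives two FPGAs
per_fpga = utility_to_mhs(2.5)
per_board = FPGAS_PER_BOARD * per_fpga

# Throughput of the working boards versus the hoped-for full potential
array_ghs = batch["OK"] * per_board / 1000
potential_ghs = total * 800 / 1000   # 50 boards at the 800 MH/s target
fraction_pct = 100 * array_ghs / potential_ghs

for name, value in pct.items():
    print(f"{name:9s} {value:4.0f} %")
print(f"per board: {per_board:.0f} MH/s, array: {array_ghs:.1f} GH/s, "
      f"{fraction_pct:.0f} % of potential")
```

With "around 23" taken as exactly 23, the OK share works out to 46%, and the per-board and array figures land close to the ~360 MH/s and ~8 GH/s quoted above.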

Yohan, your team did a great job, the HW is really top-notch (Swiss made fans, do I need to say more?). But now, priority 1 is to unfold the potential of this piece of HW as soon as possible. While I am glad to see the PDB, stacking kits and up/down links and realize how they will significantly improve the mess I currently have in my setup - more important than that is to see the Quads operating continuously somewhere around 800MHps. I therefore hope that your own bitstream is progressing well and that you will also consider supporting EldenTyrell in developing a CM1 TriCore bitstream.

Thanks and good night.

misternoodle

member

Activity: 108

Merit: 10

Quote from: yohan on July 05, 2012, 02:53:50 PM

Quote from: misternoodle on July 05, 2012, 02:35:01 PM

I've got my board working for the most part; the weird issue I'm having is that once each core has hashed about 1000-1100 shares, they will both display "OFF" in CGMiner 2.4.1 standard build. Windows doesn't see the Cairnsmore device anymore and I usually have to power cycle the unit and then restart the computer for it to redetect. Any idea what might be wrong?

Windows 7 x64

There are a few reports like that and we will try and investigate them. We have a couple of theories to look at but we will have to pull personnel off the new stuff to do that, so it's a compromise as to when we do that looking. We think there is a good chance a lot of this type of problem will simply disappear with the next major releases of bitstreams and associated items.

I think there was a post earlier in the thread that mentioned a CGminer issue a bit like this but I am struggling to find that again.

Thanks Yohan, I'll just continue to power cycle it when it happens in the meantime.

Isokivi

hero member

Activity: 910

Merit: 1000

Quote from: misternoodle on July 05, 2012, 02:35:01 PM

I've got my board working for the most part; the weird issue I'm having is that once each core has hashed about 1000-1100 shares, they will both display "OFF" in CGMiner 2.4.1 standard build. Windows doesn't see the Cairnsmore device anymore and I usually have to power cycle the unit and then restart the computer for it to redetect. Any idea what might be wrong?

Windows 7 x64

I've had this happen once, with 1 of 3 boards. Got no leads as to what it's related to.

yohan

sr. member

Activity: 462

Merit: 251

Quote from: punin on July 05, 2012, 03:22:07 PM

Quote from: yohan on July 05, 2012, 07:58:12 AM

If I have made the correct association you should have had an email on the 23rd of June. I will forward it again to you.

Thank you! Amazingly enough, I couldn't find that mail in either my inbox or junk box. I might have deleted it by accident.

Email isn't a perfect system.

punin

hero member

Activity: 560

Merit: 500

Quote from: yohan on July 05, 2012, 07:58:12 AM

If I have made the correct association you should have had an email on the 23rd of June. I will forward it again to you.

Thank you! Amazingly enough, I couldn't find that mail in either my inbox or junk box. I might have deleted it by accident.

yohan

sr. member

Activity: 462

Merit: 251

Quote from: misternoodle on July 05, 2012, 02:35:01 PM

I've got my board working for the most part; the weird issue I'm having is that once each core has hashed about 1000-1100 shares, they will both display "OFF" in CGMiner 2.4.1 standard build. Windows doesn't see the Cairnsmore device anymore and I usually have to power cycle the unit and then restart the computer for it to redetect. Any idea what might be wrong?

Windows 7 x64

There are a few reports like that and we will try and investigate them. We have a couple of theories to look at but we will have to pull personnel off the new stuff to do that, so it's a compromise as to when we do that looking. We think there is a good chance a lot of this type of problem will simply disappear with the next major releases of bitstreams and associated items.

I think there was a post earlier in the thread that mentioned a CGminer issue a bit like this but I am struggling to find that again.

misternoodle

member

Activity: 108

Merit: 10

I've got my board working for the most part; the weird issue I'm having is that once each core has hashed about 1000-1100 shares, they will both display "OFF" in CGMiner 2.4.1 standard build. Windows doesn't see the Cairnsmore device anymore and I usually have to power cycle the unit and then restart the computer for it to redetect. Any idea what might be wrong?

Windows 7 x64

Isokivi

hero member

Activity: 910

Merit: 1000

Quote from: yohan on July 05, 2012, 01:38:25 PM

Are you running the correct version of CGminer?

When you flashed the units, did you have the dip switches in the correct position for flashing and then afterwards set them to the running position for the twin bitstream?

I thought I was, somewhere along the line I had for some reason started the wrong one... I'm running the correct one now, I'll report in when I have enough data, I think around 1000 shares per core should be definitive?

And yes I have been playing with the dip switches in accordance with your pictures.
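On the question of whether about 1000 shares per core is enough, a rough sketch can put a number on it. It assumes share arrivals are Poisson and uses the normal approximation, so the 95% band on the measured rate is about 1.96/sqrt(N). This is an illustration under those assumptions, not something stated in the thread, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal-approximation multiplier for a 95% band


def relative_error(shares: int, z: float = Z_95) -> float:
    """Approximate relative half-width of the band on the measured rate."""
    return z / math.sqrt(shares)


def shares_needed(target_rel_error: float, z: float = Z_95) -> int:
    """Shares required for the band to shrink to target_rel_error."""
    return math.ceil((z / target_rel_error) ** 2)


err_1000 = relative_error(1000)
need_1pct = shares_needed(0.01)

print(f"1000 shares: about +/- {err_1000 * 100:.1f} % at 95%")
print(f"shares for +/- 1 %: roughly {need_1pct}")
```

By this estimate 1000 shares pins a core's rate down to within several percent, which is plenty to spot a core running at a tenth of the speed of the others, but not enough to compare cores that differ by only a percent or two.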

yohan

sr. member

Activity: 462

Merit: 251

Quote from: Isokivi on July 05, 2012, 01:31:32 PM

62-0108, 62-0158 and 62-0159. Windows 7. But I believe I have hardware issues ruled out, because plugging in only the 2 boards that function yields the same problem.

Are you running the correct version of CGminer?

When you flashed the units, did you have the dip switches in the correct position for flashing and then afterwards set them to the running position for the twin bitstream?

Isokivi

hero member

Activity: 910

Merit: 1000

62-0108, 62-0158 and 62-0159. Windows 7. But I believe I have hardware issues ruled out, because plugging in only the 2 boards that function yields the same problem.

yohan

sr. member

Activity: 462

Merit: 251

Quote from: Isokivi

Reflashing does not appear to have worked, and neither has moving the board to its own USB3 port. I'm starting to wonder if I have a defective one; it's pushing one tenth of the shares of the other cores atm. When tested individually at a regular pool after the re-flash it kept yammering about the pool not providing work fast enough (my GPU miners report nothing of the sort at the same time at the same pool). On p2pool it appeared to work for some reason, so I'm guessing the rapid longpolls have something to do with this. The board reported in at 365MH/s on the 10 minute mark at p2pool.

I am now going to double check this by pointing all 3 boards to p2pool in the same worker... and if that reveals nothing then I'm probably going to see how they behave in individual cgminer instances. Does anyone have experience of running multiple cgminer copies on the same system?

What is your board serial number and what environment are you working in?

Isokivi

hero member

Activity: 910

Merit: 1000

This is the weirdest thing: when starting a cgminer copy with only two boards I'm having the same issue. Again it's the last core that's slacking off, and the core that has been troublesome so far is not in there.
[edit]
I'd like to hear from user(s) running multiple boards on Windows 7: do you have the same issue in cgminer, where the last core is significantly slower than the others?

Isokivi

hero member

Activity: 910

Merit: 1000

Quote from: Hpman on July 05, 2012, 12:07:27 PM

Under Ubuntu I'm using different instances of cgminer, one for GPU and one for the Cairnsmore, without problems.

Hpman

Thank you, proceeding to that.. the issue had nothing to do with the pool, same problems persist on P2pool. So I guess it could have something to do with getwork protocols.
