
Topic: Cairnsmore1 - Quad XC6SLX150 Board - page 42. (Read 286370 times)

sr. member
Activity: 476
Merit: 250
Keep it Simple. Every Bit Matters.
August 02, 2012, 11:41:04 AM
what are DCMs?

DCM - Digital Clock Manager... it is a function block for manipulating clock signals. It's supposed to keep the clocks stable; it's a big oh dear if it doesn't.
hero member
Activity: 686
Merit: 564
August 02, 2012, 11:39:53 AM
makomk,

what are DCMs?

spiccioli.

They're the things that generate the clocks for the rest of the FPGA. No working clock means that the FPGA is almost literally frozen.
legendary
Activity: 1379
Merit: 1003
nec sine labore
August 02, 2012, 11:33:12 AM
If I'm correct it must be the FPGA pair 0/1. On 3 of my boards FPGA1 freezes after some hours with 200, 190, 180, 170 and 160. So I flashed the 150Mh bitstream; after 1 day and 12 hours it froze again on 2 boards, while the 3rd is still running at 4.89 U/m. I will now test the 140Mh on FPGA1 on the 2 problematic boards. Just try it.
It's probably the DCMs losing their lock, had a chat with Glasswalker and he confirmed it's a problem on the Cairnsmore1 boards. I may have a workaround out in a couple of hours if the Xilinx gods smile upon us all. A proper fix will be trickier.

makomk,

what are DCMs?

spiccioli.
hero member
Activity: 686
Merit: 564
August 02, 2012, 11:16:13 AM
If I'm correct it must be the FPGA pair 0/1. On 3 of my boards FPGA1 freezes after some hours with 200, 190, 180, 170 and 160. So I flashed the 150Mh bitstream; after 1 day and 12 hours it froze again on 2 boards, while the 3rd is still running at 4.89 U/m. I will now test the 140Mh on FPGA1 on the 2 problematic boards. Just try it.
It's probably the DCMs losing their lock, had a chat with Glasswalker and he confirmed it's a problem on the Cairnsmore1 boards. I may have a workaround out in a couple of hours if the Xilinx gods smile upon us all. A proper fix will be trickier.
sr. member
Activity: 397
Merit: 500
August 02, 2012, 11:07:12 AM
ICA15 seems really slow, don't know why.
If I'm correct it must be the FPGA pair 0/1. On 3 of my boards FPGA1 freezes after some hours with 200, 190, 180, 170 and 160. So I flashed the 150Mh bitstream; after 1 day and 12 hours it froze again on 2 boards, while the 3rd is still running at 4.89 U/m. I will now test the 140Mh on FPGA1 on the 2 problematic boards. Just try it.

All the others are around 4.2 - 5.1 so not so fast, they're in a room that today is at 30 degrees C though.

There are several messages of share below target, so I've compiled cgminer with zefir's patch and I'll run it to be able to see what boards have higher invalids and flash them with slower bitstream.

spiccioli.

I had similar problems with ABCpool, U was between 4.5 and 5.1. On Ozcoin I have 5.1 - 5.4 (5.6 with 200Mh) except the 3 problematic boards, which are at ~4.89 (200/150Mh).
I have also optimized every FPGA to its best bitstream. With MPBM I have the information about invalids in %, and when an invalid share is logged I can see which FPGA of the pair produced it.

Like:
Code:
2012-08-02 17:33:40.694	[200]	CM30 SN#62-415: 	Got K-not-zero share 78d8dad1
                                ^COM30                                             ^0-7 = FPGA0; 8-9 and a-f = FPGA1

The second-to-last hex digit tells which FPGA it is. (Info from TheSeven from IRC :) ... thanks TheSeven!)
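
The rule from the log annotation above (0-7 = FPGA0; 8-9 and a-f = FPGA1) can be written as a tiny helper. This is only a sketch in Python: the name `fpga_from_share` is made up for illustration and is not part of MPBM, and it assumes the share is printed as 8 hex digits exactly as in the log line above.

```python
def fpga_from_share(share_hex: str) -> int:
    """Return 0 or 1 for the FPGA of a pair that produced a share.

    Assumes the 8-hex-digit form shown in the MPBM log, e.g. "78d8dad1".
    Hypothetical helper, not MPBM code.
    """
    share_hex = share_hex.strip().lower()
    if len(share_hex) != 8:
        raise ValueError("expected 8 hex digits, got %r" % share_hex)
    digit = int(share_hex[-2], 16)  # second-to-last hex digit
    return 0 if digit <= 7 else 1   # 0-7 = FPGA0, 8-f = FPGA1


line = "2012-08-02 17:33:40.694 [200] CM30 SN#62-415: Got K-not-zero share 78d8dad1"
share = line.split()[-1]            # last token of the log line is the share
print("share %s came from FPGA%d" % (share, fpga_from_share(share)))
```

Counting the results per COM port over a few hours should show whether the invalids come mostly from one FPGA of the pair.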

I hope it helps a bit.

eb
legendary
Activity: 1379
Merit: 1003
nec sine labore
August 02, 2012, 10:41:56 AM
zefir,

I see that your patch changes

Code:
submit_nonce(thr, work, nonce);

with

Code:
submit_work_sync(thr, work);

what's the difference?

spiccioli

ps. linux 32bit cgminer 2.6.1 with zefir patch available at

http://p2pool.soon.it/cgminer/cgminer-2.6.1-zefir

legendary
Activity: 1379
Merit: 1003
nec sine labore
August 02, 2012, 10:19:45 AM
Ok,

twenty hours since I've flashed the 190MH/s bitstream from makomk into my boards, this is how it is going

Code:
 cgminer version 2.6.1 - Started: [2012-08-01 22:39:42]
--------------------------------------------------------------------------------
 (5s):7729.5 (avg):7294.7 Mh/s | Q:179757  A:100170  R:680  HW:0  E:56%  U:90.6/m
 TQ: 20  ST: 20  SS: 4  DW: 1556  NB: 120  LW: 0  GF: 1730  RF: 7
 Connected to http://pool.abcpool.co with LP as user ....
 Block: 00000352b4eaf99deb56ff1c28d9eff2...  Started: [16:59:32]
--------------------------------------------------------------------------------
 [P]ool management [S]ettings [D]isplay options [Q]uit
 ICA  0:                | 379.6/366.8Mh/s | A:5473 R:37 HW:0 U:4.95/m
 ICA  1:                | 379.8/367.1Mh/s | A:5475 R:46 HW:0 U:4.95/m
 ICA  2:                | 379.7/366.6Mh/s | A:5473 R:34 HW:0 U:4.95/m
 ICA  3:                | 379.9/366.9Mh/s | A:5274 R:41 HW:0 U:4.77/m
 ICA  4:                | 379.6/367.4Mh/s | A:5524 R:32 HW:0 U:4.99/m
 ICA  5:                | 379.7/366.8Mh/s | A:5701 R:32 HW:0 U:5.15/m
 ICA  6:                | 379.7/367.2Mh/s | A:5469 R:41 HW:0 U:4.94/m
 ICA  7:                | 379.6/366.6Mh/s | A:5596 R:39 HW:0 U:5.06/m
 ICA  8:                | 379.8/366.6Mh/s | A:5096 R:35 HW:0 U:4.61/m
 ICA  9:                | 379.7/366.9Mh/s | A:5602 R:37 HW:0 U:5.06/m
 ICA 10:                | 379.8/366.2Mh/s | A:2842 R:15 HW:0 U:2.57/m
 ICA 11:                | 379.9/366.0Mh/s | A:2452 R:16 HW:0 U:2.22/m
 ICA 12:                | 369.8/346.2Mh/s | A:4721 R:27 HW:0 U:4.27/m
 ICA 13:                | 348.2/346.0Mh/s | A:4728 R:41 HW:0 U:4.27/m
 ICA 14:                | 379.7/366.9Mh/s | A:5589 R:40 HW:0 U:5.05/m
 ICA 15:                | 379.7/367.4Mh/s | A:3090 R:22 HW:0 U:2.79/m
 ICA 16:                | 379.7/366.9Mh/s | A:5324 R:36 HW:0 U:4.81/m
 ICA 17:                | 379.7/366.9Mh/s | A:5604 R:31 HW:0 U:5.07/m
 ICA 18:                | 379.8/366.3Mh/s | A:5546 R:42 HW:0 U:5.01/m
 ICA 19:                | 379.9/366.9Mh/s | A:5592 R:36 HW:0 U:5.06/m
--------------------------------------------------------------------------------

 [2012-08-02 17:05:45] Accepted e21cf766.be4d2d09 ICA 7 pool 0
 [2012-08-02 17:05:45] ICA7                | (5s):379.6 (avg):366.7 Mh/s | A:5596 R:39 HW:0 U:5.1/m
 [2012-08-02 17:05:46] Accepted dbda8c4a.97ed61c3 ICA 3 pool 0
 [2012-08-02 17:05:46] ICA3                | (5s):379.9 (avg):366.9 Mh/s | A:5274 R:41 HW:0 U:4.8/m
 [2012-08-02 17:05:46] Accepted 9ab35a4f.ebe198ce ICA 9 pool 0
 [2012-08-02 17:05:46] ICA9                | (5s):379.7 (avg):366.9 Mh/s | A:5602 R:37 HW:0 U:5.1/m
 [2012-08-02 17:05:47] Accepted 42b4d56c.66bf4796 ICA 16 pool 0
 [2012-08-02 17:05:47] ICA16                | (5s):379.7 (avg):366.9 Mh/s | A:5324 R:36 HW:0 U:4.8/m
 [2012-08-02 17:05:47] Accepted 9055797a.8d8bc0b4 ICA 14 pool 0
 [2012-08-02 17:05:47] ICA14                | (5s):379.7 (avg):366.9 Mh/s | A:5589 R:40 HW:0 U:5.1/m
 [2012-08-02 17:05:47] Accepted fa65eee7.0a6c0c1b ICA 15 pool 0
 [2012-08-02 17:05:47] ICA15                | (5s):379.7 (avg):367.4 Mh/s | A:3090 R:22 HW:0 U:2.8/m
 [2012-08-02 17:05:48] Accepted 6a127a6c.c8cbb664 ICA 10 pool 0
 [2012-08-02 17:05:48] ICA10                | (5s):379.8 (avg):366.2 Mh/s | A:2842 R:15 HW:0 U:2.6/m
 [2012-08-02 17:05:48] Accepted cbc97141.930954ea ICA 0 pool 0
 [2012-08-02 17:05:48] ICA0                | (5s):379.6 (avg):366.8 Mh/s | A:5473 R:37 HW:0 U:4.9/m
 [2012-08-02 17:05:49] Accepted dcec8a8e.5c86ef90 ICA 1 pool 0
 [2012-08-02 17:05:49] ICA1                | (5s):379.8 (avg):367.2 Mh/s | A:5474 R:46 HW:0 U:4.9/m
 [2012-08-02 17:05:50] Accepted 0235b45d.21f40441 ICA 1 pool 0
 [2012-08-02 17:05:50] ICA1                | (5s):379.8 (avg):367.2 Mh/s | A:5475 R:46 HW:0 U:5.0/m
 [2012-08-02 17:05:50] (5s):7729.5 (avg):7294.7 Mh/s | Q:179757  A:100170  R:680  HW:0  E:56%  U:90.6/m
 [2012-08-02 17:05:50] Accepted 067ff243.88842ab6 ICA 5 pool 0
 [2012-08-02 17:05:50] ICA5                | (5s):379.7 (avg):366.8 Mh/s | A:5701 R:32 HW:0 U:5.2/m

ICA10/11 is board serial n. 8, which is still using twin_test.bit

ICA15 seems really slow, don't know why.

All the others are around 4.2 - 5.1 so not so fast, they're in a room that today is at 30 degrees C though.

There are several messages of share below target, so I've compiled cgminer with zefir's patch and I'll run it to be able to see what boards have higher invalids and flash them with slower bitstream.

spiccioli.
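
To judge whether a U: value is "slow", it helps to compare it with what the hash rate should give. A difficulty-1 share takes 2^32 hashes on average, so the expected U is MH/s * 1e6 * 60 / 2^32. Below is a rough Python sketch that parses device lines like the ones in the status screen above and flags devices well under that figure. The function names and the 0.8 threshold are my own assumptions, it assumes the pool hands out difficulty-1 work, and short runs will scatter a lot due to luck.

```python
import re

HASHES_PER_DIFF1_SHARE = 2 ** 32  # average hashes needed per difficulty-1 share


def expected_utility(avg_mhs: float) -> float:
    """Expected difficulty-1 shares per minute at avg_mhs MH/s."""
    return avg_mhs * 1e6 * 60 / HASHES_PER_DIFF1_SHARE


# Matches per-device lines of the cgminer status screen pasted above.
DEVICE_RE = re.compile(
    r"ICA\s*(?P<dev>\d+):\s*\|\s*(?P<cur>[\d.]+)/(?P<avg>[\d.]+)Mh/s\s*\|"
    r"\s*A:(?P<acc>\d+)\s+R:(?P<rej>\d+)\s+HW:(?P<hw>\d+)\s+U:(?P<u>[\d.]+)/m"
)


def slow_devices(screen: str, threshold: float = 0.8):
    """Yield (device, U, expected U) where U < threshold * expected U."""
    for m in DEVICE_RE.finditer(screen):
        u = float(m.group("u"))
        exp = expected_utility(float(m.group("avg")))
        if u < threshold * exp:
            yield int(m.group("dev")), u, round(exp, 2)


sample = """
 ICA  0:                | 379.6/366.8Mh/s | A:5473 R:37 HW:0 U:4.95/m
 ICA 10:                | 379.8/366.2Mh/s | A:2842 R:15 HW:0 U:2.57/m
 ICA 15:                | 379.7/367.4Mh/s | A:3090 R:22 HW:0 U:2.79/m
"""
for dev, u, exp in slow_devices(sample):
    print("ICA %d: U=%.2f/m, expected about %.2f/m" % (dev, u, exp))
```

With the three sample lines, ICA 0 passes while ICA 10 and ICA 15 are flagged, which matches what is visible by eye in the full screen.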
sr. member
Activity: 397
Merit: 500
August 02, 2012, 05:30:13 AM
I get the same error as Ebereon I have 62-0013. The xc3sprog -c cm1 -p 0 twin_test.bit works, but loading to SPI Flash with xc3sprog -c cm1 -p 0 -I xc6lx150.bit twin_test.bit gives me the error below.

JEDEC: ff ff 0xff 0xff
unknown JEDEC manufacturer: ff
ISF Bitfile probably not loaded

Also looking forward to programming these in Linux.

Thanks!


trying to flash 2 * FPGA boards and get this same problem on my Win 7 64 bit PC...any help on solving?

change "-I xc6lx150.bit" to
"-Ixc6lx150.bit" <- no space between -I and xc.....
hero member
Activity: 810
Merit: 1000
August 02, 2012, 05:23:07 AM
I get the same error as Ebereon I have 62-0013. The xc3sprog -c cm1 -p 0 twin_test.bit works, but loading to SPI Flash with xc3sprog -c cm1 -p 0 -I xc6lx150.bit twin_test.bit gives me the error below.

JEDEC: ff ff 0xff 0xff
unknown JEDEC manufacturer: ff
ISF Bitfile probably not loaded

Also looking forward to programming these in Linux.

Thanks!


trying to flash 2 * FPGA boards and get this same problem on my Win 7 64 bit PC...any help on solving?
legendary
Activity: 1379
Merit: 1003
nec sine labore
August 02, 2012, 02:26:08 AM
I've written an Icarus change for cgminer that will support 3 new options:
baud rate (115200 or 57600), work divisor (1, 2, 4 or 8) and number of FPGA
This should even work with the old setup with only 1 of 2 FPGA working :)
https://github.com/ckolivas/cgminer/pull/283
Anyone interested come visit #cgminer as usual ... tomorrow ...

kano,

I'd like to ask one thing: the HW: value gives the number of invalid hashes that have been returned by a GPU, can this check be enabled for FPGAs as well?

MPBM has a column in its web-page interface which tells how many invalid shares have been returned, can cgminer test returned shares to see if they're valid?

spiccioli

Hi spiccioli,

that bothered me too, since the information about invalids is missing completely in the Icarus stats.

What you can do with the original cgminer is enable verbose output. Do this either by adding --verbose to cgminer command line parameters or enable it interactively:
a) press D for display menu
b) press V for verbose
c) press to get back to main menu

Invalid shares will be displayed with 'Share below target'. That helps to see if a board / bitstream combination is unstable.

To track invalids long term, I use the following patch:
Code:
diff --git a/driver-icarus.c b/driver-icarus.c
index 5f2c78a..82d06f3 100644
--- a/driver-icarus.c
+++ b/driver-icarus.c
@@ -563,7 +566,12 @@ static int64_t icarus_scanhash(struct thr_info *thr, struct work *work,
        nonce = swab32(nonce);
 #endif
 
-       submit_nonce(thr, work, nonce);
+       if (!test_nonce(work, nonce)) {
+               applog(LOG_INFO, "%s%d: Share below target", icarus->api->name, icarus->device_id);
+               thr->cgpu->hw_errors++;
+               return 0;
+       }
+       submit_work_sync(thr, work);
 
        hash_count = (nonce & 0x7fffffff);
        if (hash_count++ == 0x7fffffff)

It will add those invalid shares to the HW counter displayed in the stats, quite useful to see how well a board did overnight with a given bitstream.

Hi zefir,

thanks for the info and patch, I'm seeing several 'Share below target' messages now!

This also explains why a few FPGAs have a lower U: than others.

spiccioli
sr. member
Activity: 327
Merit: 250
August 01, 2012, 09:54:33 PM
0   ICA   0   Y   Alive   0.00°C   379.81   379.73   14,494   22   0   U5.33/m   
1   ICA   1   Y   Alive   0.00°C   379.81   379.62   14,317   33   0   U5.27/m   

SUMMARY   1day 21h 17m 27s

So far so good with makomk's 190 bitstream, getting to that 48h Mark.

newbie
Activity: 48
Merit: 0
August 01, 2012, 08:41:27 PM
Here's a little write-up on my efforts last night with CM1 serial 62-0023.

Fired it up for the first time (been sitting gathering dust waiting for some worthwhile bitstreams etc), used my gaming pc (windows) to update controller to 1.3

Connected to the mining PC (gentoo), compiled xc3sprog with some help from arne and others in the IRC channel, and programmed makomk's 190 bitstream.

Configured MPBM, and mucked around with baud rates and dip settings till it was all working, again with more help from ebereon and others (thanks) on IRC. Note with controller 1.3, SW6 #1 dip controls the 50/100mhz clock, which also sets 115200/57600 baud. This needs to be reset in the miner worker if changed. At one point the dip switch looked like it was off but hadn't properly "clicked" into position somehow. It took me a while to realise; a click on and off (and power cycling the board) fixed it. You can see quite clearly that the red flashing light next to the controller flashes at half the speed in 50mhz mode.

Results for 190mhz: 100% invalids.

Reprogrammed with 150 makomk bitstream:
Results for 75mhz (hadn't quite figured out the dip setting problem above at this stage): 30% invalids

Reprogrammed with 140 makomk bitstream:
Results for 140mhz: 80% invalids
Results for 70mhz: ~25% invalids

That's how it stands currently: at 70mhz I get 280MH/s for the whole board, with 25% of that being invalid.

Well, it's no wonder Enterpoint had difficulties initially and limited the shipping test bitstream to 50mhz.

I've sent an email to Enterpoint asking about the possible capacitor fix that applies to the first 50 boards, or complete RMA.
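
The numbers in this write-up can be turned into useful hash rates with a bit of arithmetic. A small Python sketch follows; it assumes 4 FPGAs per board and one hash per clock per FPGA, which is what the 70mhz = 280MH/s figure implies, and the function names are just for illustration.

```python
def board_rate_mhs(clock_mhz, fpga_count=4, hashes_per_clock=1):
    """Raw board hash rate in MH/s (assumes one hash per clock per FPGA)."""
    return clock_mhz * fpga_count * hashes_per_clock


def valid_rate_mhs(raw_mhs, invalid_fraction):
    """Hash rate that is left after discarding the invalid results."""
    return raw_mhs * (1.0 - invalid_fraction)


# (effective clock in MHz, fraction of invalids) as reported in the post
results = [(190, 1.00), (75, 0.30), (140, 0.80), (70, 0.25)]
for clock, invalid in results:
    raw = board_rate_mhs(clock)
    print("%3d MHz: raw %d MH/s, valid about %.0f MH/s"
          % (clock, raw, valid_rate_mhs(raw, invalid)))
```

On those assumptions the slower clock wins clearly on this board: the 140mhz run throws away most of its raw rate, while 70mhz and 75mhz keep roughly three quarters of theirs.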
hero member
Activity: 556
Merit: 500
August 01, 2012, 06:31:14 PM
This is a makeshift guide for Windows users for faster, permanent flashing. I have not yet tested it, but apparently Slipbye has had success with it. It also gets us out of the virtual machine (for good?)

[21:13] http://sourceforge.net/projects/libusb-win32/files/libusb-win32-releases/1.2.6.0/libusb-win32-devel-filter-1.2.6.0.exe/download
[21:14] install that, make sure the board to be flashed is plugged, start the libusb filter wizard
[21:14] select the USB composite device which shows an ID of 0403 8350
[21:14] install the filter driver for that
[21:14] download this: https://xc3sprog.svn.sourceforge.net/svnroot/xc3sprog/trunk/xc3sprog.exe
[21:14] create a new file called cablelist.txt in the same directory
[21:15] put this line inside that file:
[21:15] cm1 ftdi 20000000 0x0403:0x8350:
[21:15] open a command prompt in the directory where the files are and run these commands:
[21:15] set CABLEDB=cablelist.txt
[21:15] xc3sprog -c cm1
[21:15] it should detect the fpgas
At this point you want to copy the .bit files you'll be using to the same folder as xc3sprog is in.
[21:16] if that worked, you can go ahead with flashing like usual
[21:16] xc3sprog -c cm1 -p 0 -Ixc6lx150.bit file_to_be_flashed.bit
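
For anyone flashing many boards, the last step can be wrapped in a script. The Python sketch below only builds the argument list, using nothing but the flags that appear in this thread; `flash_command` is a made-up helper name. It keeps -I and the file name together as one argument, since a space between them is what produced the JEDEC error reported elsewhere in the thread.

```python
def flash_command(bitfile, isf_bitfile="xc6lx150.bit", cable="cm1", position=0):
    """Build the xc3sprog argument list for flashing the SPI flash.

    Only flags that appear in this thread are used. The -I option and its
    file name are ONE argument, with no space between them.
    """
    return ["xc3sprog", "-c", cable, "-p", str(position),
            "-I" + isf_bitfile, bitfile]


cmd = flash_command("file_to_be_flashed.bit")
print(" ".join(cmd))

# To actually run it (needs xc3sprog in the current directory or on PATH,
# CABLEDB set as described above, and the board attached):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The `position` parameter is the value passed to -p, which is 0 in every example posted here; I have not checked which other values make sense for the CM1.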

Enjoy your 1-3 hours of spare time per day :)

GLORIOUS, thank you sir! I was hating spending so much time flashing my boards.
sr. member
Activity: 397
Merit: 500
August 01, 2012, 05:32:28 PM
Can I pay for my order in BTC? If not, what are the options?

Here is some information about this board -> http://www.enterpoint.co.uk/cairnsmore/cairnsmore1.html

Payment options are also shown on it, just scroll down a bit.

eb
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
August 01, 2012, 05:31:39 PM
No, the problem has been there since luke-jr first wrote the bitforce code (xiangfu copied the original bitforce code and then we rewrote ... most of it :P)
I suspect bitforce has the same problem - I'll check later.

It should be done in cgminer.c in my opinion and I'll discuss with ckolivas if he can think of a reason why it shouldn't
(i.e. yes all the drivers will need changing)
sr. member
Activity: 308
Merit: 250
August 01, 2012, 05:30:20 PM
Can I pay for my order in BTC? If not, what are the options?
donator
Activity: 919
Merit: 1000
August 01, 2012, 05:27:16 PM
Yep - looks like I'm wrong.
I thought the hash check was automatically called inside test_nonce()
(well there is certainly no reason I can think of why it shouldn't for all devices :P)
I'll chase up getting that fixed ...

Edit: oh it is called in there, just the HW counter isn't incremented generically, the driver code has to ...
Well ... don't I look stupid :D
Meh, I guess I shouldn't have assumed the original code (back when I started changing it) did that properly.
I was not sure if it would be semantically right to increment the hw_error counter in submit_nonce() generally without adding side effects. Just searching the code for places that modify it shows that some drivers do it privately, so adding it to the generic function might count events twice.

Anyhow, it was meant to be a temporary hack helping us to track down the problems with CM1. If you think it is correct to use it for Icarus in general (which IMHO it is, since a wrong share is a HW error), I can submit it as a pull request, or I'm also fine if you just add it to one of your staging patch sets.

Thanks, zefir
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
August 01, 2012, 05:15:12 PM
Yep - looks like I'm wrong.
I thought the hash check was automatically called inside test_nonce()
(well there is certainly no reason I can think of why it shouldn't for all devices :P)
I'll chase up getting that fixed ...

Edit: oh it is called in there, just the HW counter isn't incremented generically, the driver code has to ...
Well ... don't I look stupid :D
Meh, I guess I shouldn't have assumed the original code (back when I started changing it) did that properly.
donator
Activity: 919
Merit: 1000
August 01, 2012, 04:47:31 PM
I've written an Icarus change for cgminer that will support 3 new options:
baud rate (115200 or 57600), work divisor (1, 2, 4 or 8) and number of FPGA
This should even work with the old setup with only 1 of 2 FPGA working :)
https://github.com/ckolivas/cgminer/pull/283
Anyone interested come visit #cgminer as usual ... tomorrow ...

kano,

I'd like to ask one thing: the HW: value gives the number of invalid hashes that have been returned by a GPU, can this check be enabled for FPGAs as well?

MPBM has a column in its web-page interface which tells how many invalid shares have been returned, can cgminer test returned shares to see if they're valid?

spiccioli

Hi spiccioli,

that bothered me too, since the information about invalids is missing completely in the Icarus stats.

What you can do with the original cgminer is enable verbose output. Do this either by adding --verbose to cgminer command line parameters or enable it interactively:
a) press D for display menu
b) press V for verbose
c) press to get back to main menu

Invalid shares will be displayed with 'Share below target'. That helps to see if a board / bitstream combination is unstable.

To track invalids long term, I use the following patch:
Code:
diff --git a/driver-icarus.c b/driver-icarus.c
index 5f2c78a..82d06f3 100644
--- a/driver-icarus.c
+++ b/driver-icarus.c
@@ -563,7 +566,12 @@ static int64_t icarus_scanhash(struct thr_info *thr, struct work *work,
        nonce = swab32(nonce);
 #endif
 
-       submit_nonce(thr, work, nonce);
+       if (!test_nonce(work, nonce)) {
+               applog(LOG_INFO, "%s%d: Share below target", icarus->api->name, icarus->device_id);
+               thr->cgpu->hw_errors++;
+               return 0;
+       }
+       submit_work_sync(thr, work);
 
        hash_count = (nonce & 0x7fffffff);
        if (hash_count++ == 0x7fffffff)

It will add those invalid shares to the HW counter displayed in the stats, quite useful to see how well a board did overnight with a given bitstream.
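
For anyone wondering what such a check boils down to: re-hash the 80-byte block header with the returned nonce and see whether the result is good enough to be a share. The Python sketch below is NOT cgminer's test_nonce(), just an illustration of the idea for difficulty-1 shares (top 32 bits of the hash must be zero). The function names are made up, and the Bitcoin genesis block header is used as sample input only because its nonce is known to be valid.

```python
import hashlib
import struct


def block_hash(header76: bytes, nonce: int) -> bytes:
    """Double SHA-256 of the 80-byte header (76 bytes + little-endian nonce)."""
    if len(header76) != 76:
        raise ValueError("header without nonce must be 76 bytes")
    header = header76 + struct.pack("<I", nonce)
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def is_diff1_share(header76: bytes, nonce: int) -> bool:
    """True if the top 32 bits of the hash are zero (a difficulty-1 share)."""
    # The digest is read as a little-endian number, so its most
    # significant 32 bits are the LAST four bytes of the digest.
    return block_hash(header76, nonce)[-4:] == b"\x00\x00\x00\x00"


# Sample input: Bitcoin genesis block header, serialized as on the wire.
genesis76 = (
    bytes.fromhex("01000000")                              # version
    + bytes(32)                                            # previous block hash
    + bytes.fromhex("3ba3edfd7a7b12b27ac72c3e67768f61"
                    "7fc81bc3888a51323a9fb8aa4b1e5e4a")    # merkle root
    + bytes.fromhex("29ab5f49")                            # timestamp
    + bytes.fromhex("ffff001d")                            # bits
)
GENESIS_NONCE = 2083236893

if is_diff1_share(genesis76, GENESIS_NONCE):
    print("valid share, submit it")
else:
    print("invalid nonce, count it as a hardware error")
```

A nonce that fails this test never was a share, so counting it under HW: rather than sending it to the pool is the same decision the patch makes.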
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
August 01, 2012, 03:54:45 PM
I've written an Icarus change for cgminer that will support 3 new options:
baud rate (115200 or 57600), work divisor (1, 2, 4 or 8) and number of FPGA
This should even work with the old setup with only 1 of 2 FPGA working :)
https://github.com/ckolivas/cgminer/pull/283
Anyone interested come visit #cgminer as usual ... tomorrow ...

kano,

I'd like to ask one thing: the HW: value gives the number of invalid hashes that have been returned by a GPU, can this check be enabled for FPGAs as well?

MPBM has a column in its web-page interface which tells how many invalid shares have been returned, can cgminer test returned shares to see if they're valid?

spiccioli
It is enabled.
I don't filter anything out of the return data.
If I understand the Icarus bitstream source correctly (though I'm not sure), the only invalids I should get are if the share value is something like less than 256
Thus you should find ... on average ... around one HW: per 8 to 10 blocks you find.
I've actually never had any HW: in the months I've been mining on 2 Icarus boards (I just searched all my logs)
In cgminer a HW: is when the device returns a share, but it's not actually a share according to a re-hash of it.
All shares are checked that way in cgminer (i.e. Icarus also)

Having seen the recent problems with BFL - I suspect it's either a case of the Serial/USB driver queues data to avoid overwriting, or discards corrupt data before it gets back.
However, MPBM accesses the device via USB so I'm not sure exactly why it shows regular HW errors.

For anyone curious:
... and if you did actually add my pull request to the compile, the new option (as per the changed FPGA-README) is --icarus-options
Normally people would just specify --icarus-options 57600 to change the baud rate for all boards to 57600 instead of 115200
It also has a 'work_division' value which would normally be the default 2 (meaning the board divides the work in half for the FPGAs - you can specify 1, 2, 4 or even 8 :) )
And lastly 'fpga_count' which would normally be the same as work_division (but defaults to 2), however the earlier bitstreams had the issue where only one of 2 FPGA were hashing and thus setting that to 1 (instead of 2) will give back the correct MH/s
i.e. as per the example in FPGA-README --icarus-options 57600:2:1 would match a CM1 that hashes at standard icarus speed but only uses 1 of the 2 FPGA - and thus should display the MH/s correctly (and I/O at 57600 instead of 115200)
Of course, anyone specifically using --icarus-options please say so, so I can be sure if it actually is working correctly :)
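
To make the baud:work_division:fpga_count format concrete, here is how such a value could be split up and checked. This is a Python sketch of the rules described above, not the parser from the pull request; `parse_icarus_options` is a made-up name, and letting a missing fpga_count fall back to work_division is my reading of the description.

```python
VALID_BAUDS = (115200, 57600)
VALID_DIVISORS = (1, 2, 4, 8)


def parse_icarus_options(value=""):
    """Parse 'baud:work_division:fpga_count'; missing fields use defaults."""
    fields = value.split(":") if value else []
    if len(fields) > 3:
        raise ValueError("too many fields in %r" % value)
    fields += [""] * (3 - len(fields))           # pad missing trailing fields

    baud = int(fields[0]) if fields[0] else 115200
    work_division = int(fields[1]) if fields[1] else 2
    # Assumption: an unspecified fpga_count follows work_division.
    fpga_count = int(fields[2]) if fields[2] else work_division

    if baud not in VALID_BAUDS:
        raise ValueError("baud must be 115200 or 57600")
    if work_division not in VALID_DIVISORS:
        raise ValueError("work_division must be 1, 2, 4 or 8")
    if fpga_count < 1:
        raise ValueError("fpga_count must be at least 1")
    return {"baud": baud, "work_division": work_division,
            "fpga_count": fpga_count}


print(parse_icarus_options("57600:2:1"))   # the FPGA-README example above
print(parse_icarus_options("57600"))       # only change the baud rate
```

The first call is the "standard icarus speed but only 1 of the 2 FPGA hashing" case, the second is the plain baud rate change.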