Topic: Avalon ASIC users thread - page 116. (Read 438596 times)

hero member
Activity: 546
Merit: 500
July 17, 2013, 02:37:43 PM
So I attempted to upgrade the power supply on my 3-board Avalon...

It had a 750W Corsair in it already, so I don't think it actually needed to be upgraded unless I decided to add a 4th board. I was getting a lot of invalid shares, though, so I figured a bigger power supply might help.

I bought an 850W OCZ ZW power supply and installed it, and it didn't work! I'm not sure what was wrong. I could connect to the Avalon through the web interface, log in, and edit the configuration, but the fans never spun up and it never started hashing.

I thought maybe I had broken it but I put the old power supply back in and it was fine.

I thought maybe I was missing a cable or something but I guess I just had a defective supply?

I plugged in:

The very wide main connector
The 8 pin connector that said "CPU" on it
The two red cables (GPU) that plug into the side of the supply.

The supply also had an additional 8-pin connector with a different layout from the CPU power connector; I believe it would fit into another socket across from the CPU connector. It wasn't plugged in with the old power supply, so I didn't plug it in for the new one. Not sure what it's for. Power supplies were always a mystery to me.

I wish there was a schematic or something that explained it. I haven't been able to find one for the Avalon miner.

PS Damn the Chinese engineer who designed the screws on that case!
sr. member
Activity: 342
Merit: 250
July 17, 2013, 01:40:24 PM
is there any module being obstructed by the power / data cables going from it to fpga? I noticed my top 2 modules had this issue right behind the fans that blew on them, those cables were blocking like 20% of the airflow... moving them to the side of the modules freed up this blocked air ... causing drop in temps for the lower module & probably the middle one.

Opened it up, checked it, and it looked good. Is it possible they might have forgotten some thermal compound somewhere and the sensors aren't reading correctly?

In the cgminer stats log, does any value for the match_work_count stand out?


When the AC is on and keeping the temps under 41, the match_work values are pretty similar. Once I pull the AC off and the temp climbs above 41, there is a big difference in values from one to another.
hero member
Activity: 546
Merit: 500
July 17, 2013, 01:26:27 PM

I get a REALLY large number of invalid shares with my Avalon.

I'm not sure what's going on. It's not consistent.

One day I'll be mining and I'll get 99+% valid shares and the next day or two it will only be 88% valid shares.

I've tried mining on Ozcoin and 50btc and it seems to be bad on both.

I don't think it is my network because my jalapenos only get <1% invalid shares.

I'm using the dynamically adjusting frequency and it usually settles at about 352 MHz or so and 82 GH/s on average. Of course, according to the pool, I'm only getting 70 GH/s or so due to all the invalid shares.

I have an AC feeding directly into the intake on the Avalon and no temperature gets above 48 or 49 or so (very rarely 50).

What could be wrong?
This is normal on stratum. Every time the block changes, you get invalids. Some days burn more blocks than others.

It seems like I have way more than other Avalon users, at least from what I've seen on Ozcoin. Most of them have only a couple percent invalid shares at most, while I will have 12 or 13% invalid shares. Right now I'm standing at about 94% efficiency on average, that's with 400,000 invalid shares in the less than a week I've been mining there.


That is definitely far more than you should be getting. There is a good chance you're submitting heaps of duplicates which may also be a different form of instability that the auto mode can't check for. Try setting a lower maximum speed if you're using auto mode because clearly you're not doing 82GH of useful work.

I also have 11% of rejected shares while HW is ~1.3%
Max temp is 48 C, ambient 28 C.
My command line is: --quiet --avalon-auto --avalon-freq 282-375 --avalon-cutoff 60

Last night it somehow stopped producing rejected shares and stayed that way until today, when I had a power blackout. After resuming, it is still generating 11% rejected shares while working at 352 MHz (autotuned).

It's odd in that it will run for a day or two and get 99.5% efficiency (about 0.5% rejected shares), and then all of a sudden (it might coincide with a new round) it will start getting 10+% rejected shares for a day. I tried hard rebooting it when that happened, and I'm hoping that might fix it.

I'm also going to play around with different clocks and see if that helps. It usually settles at about 352 with max temps at around 49. It's strange that it can run like that for a day before the invalid shares start, though. I've determined it is not likely to be my network, as my 2 jalapenos only get <0.5% invalid shares.
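
As a rough sanity check on those numbers, here is a minimal Python sketch. It assumes the pool's estimate is simply the local hashrate scaled by the fraction of shares it accepts, which ignores luck and hardware errors; the function name `effective_hashrate` is made up for illustration.

```python
def effective_hashrate(local_ghs: float, rejected_fraction: float) -> float:
    """Estimate the hashrate a pool credits after discarding rejected shares."""
    if not 0.0 <= rejected_fraction <= 1.0:
        raise ValueError("rejected_fraction must be between 0 and 1")
    return local_ghs * (1.0 - rejected_fraction)


# 82 GH/s locally, at the reject rates mentioned in this thread.
for rejected in (0.005, 0.12, 0.13):
    credited = effective_hashrate(82.0, rejected)
    print(f"{rejected:.1%} rejected -> {credited:.1f} GH/s credited")
```

Under that assumption, 12 to 13% rejects on 82 GH/s leaves a little over 70 GH/s, which is in line with what the pool reports.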
legendary
Activity: 1246
Merit: 1002
July 17, 2013, 01:11:49 PM
is there any module being obstructed by the power / data cables going from it to fpga? I noticed my top 2 modules had this issue right behind the fans that blew on them, those cables were blocking like 20% of the airflow... moving them to the side of the modules freed up this blocked air ... causing drop in temps for the lower module & probably the middle one.

Opened it up, checked it, and it looked good. Is it possible they might have forgotten some thermal compound somewhere and the sensors aren't reading correctly?

In the cgminer stats log, does any value for the match_work_count stand out?
sr. member
Activity: 342
Merit: 250
July 17, 2013, 12:44:03 PM
is there any module being obstructed by the power / data cables going from it to fpga? I noticed my top 2 modules had this issue right behind the fans that blew on them, those cables were blocking like 20% of the airflow... moving them to the side of the modules freed up this blocked air ... causing drop in temps for the lower module & probably the middle one.

Opened it up, checked it, and it looked good. Is it possible they might have forgotten some thermal compound somewhere and the sensors aren't reading correctly?
legendary
Activity: 2450
Merit: 1002
July 17, 2013, 09:55:48 AM
is there any module being obstructed by the power / data cables going from it to fpga? I noticed my top 2 modules had this issue right behind the fans that blew on them, those cables were blocking like 20% of the airflow... moving them to the side of the modules freed up this blocked air ... causing drop in temps for the lower module & probably the middle one.
sr. member
Activity: 342
Merit: 250
July 17, 2013, 09:52:51 AM
Yesterday my avalon started showing a super high number of hw.  Almost 75% of the accepted diff1a shares.
For instance shares 100000 hw 75000. Any ideas on what to check?

Found out that running at any temps higher than 41 starts giving HW higher than 70%. I'm using the current firmware from cgminer and using Ethernet, not WiFi. Is anyone able to help, or does anyone have ideas?

What clocks / temps?

Pretty much at any clock. I have tried 282, and once it climbs over 40-41 it starts with high errors. I currently have an air conditioner by it, and it is running stable at a 300 clock with temps at 40 and HW errors less than 2%. If I move the air conditioner, the HW starts climbing fast.
legendary
Activity: 2450
Merit: 1002
July 17, 2013, 09:46:52 AM
Yesterday my avalon started showing a super high number of hw.  Almost 75% of the accepted diff1a shares.
For instance shares 100000 hw 75000. Any ideas on what to check?

Found out that running at any temps higher than 41 starts giving HW higher than 70%. I'm using the current firmware from cgminer and using Ethernet, not WiFi. Is anyone able to help, or does anyone have ideas?

What clocks / temps?
sr. member
Activity: 342
Merit: 250
July 17, 2013, 08:38:13 AM
Yesterday my avalon started showing a super high number of hw.  Almost 75% of the accepted diff1a shares.
For instance shares 100000 hw 75000. Any ideas on what to check?

Found out that running at any temps higher than 41 starts giving HW higher than 70%. I'm using the current firmware from cgminer and using Ethernet, not WiFi. Is anyone able to help, or does anyone have ideas?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
July 17, 2013, 02:13:34 AM
That is definitely far more than you should be getting. There is a good chance you're submitting heaps of duplicates which may also be a different form of instability that the auto mode can't check for. Try setting a lower maximum speed if you're using auto mode because clearly you're not doing 82GH of useful work.

I also have 11% of rejected shares while HW is ~1.3%
Max temp is 48 C, ambient 28 C.
My command line is: --quiet --avalon-auto --avalon-freq 282-375 --avalon-cutoff 60

Last night it somehow stopped producing rejected shares and stayed that way until today, when I had a power blackout. After resuming, it is still generating 11% rejected shares while working at 352 MHz (autotuned).
Like I said, try setting a lower maximum.
sr. member
Activity: 277
Merit: 254
July 17, 2013, 12:51:24 AM
today I have experienced a "bigger problem" for the second time since I've had it

the machine was unresponsive on the web interface, so I thought it had crashed completely and would need a physical restart, which is very bad for me

however, I tried to connect over SSH and somehow it worked, but extremely slowly; it took me a couple of minutes, but finally I was able to type the reboot command and reboot the machine, which it did, and everything is OK again

so, if someone has this totally unresponsive machine, try SSH and be very very patient, it may come up

funny thing is that it did not stop hashing until I rebooted it - i.e. even in this strange unresponsive state, it was doing its job, just the web interface could not be reached and SSH only with great difficulty

since the operation with the whole machine was impossibly slow I could not analyze any logs before reboot, so I have no idea what happened
Out of mem - free?
WiFi disabled yes/no
dmesg?
just a few basic commands will tell you what went wrong. Make sure you post the output next time ;)

thanks, will try next time
wifi disabled -> yes
legendary
Activity: 966
Merit: 1000
July 16, 2013, 07:30:45 PM
I can watch MHS5s slowly drop to 0, and eventually cgminer-monitor will restart cgminer. Why is cgminer stopping hashing?
Common reasons:
Wifi kernel problem
Overdoing the overclocking
Pool failure and the cgminer-monitor watchdog is trigger happy and kills cgminer when all it's doing is waiting for a pool to come back online.
FPGA failure in the avalon.

If the fans stop running entirely, does that indicate FPGA failure?

My Batch 1 unit does this sometimes.  I now have a script that queries the API, and if it has a zero hashrate for too long, it will call the connected Web Power Switch to cycle the power on the unit, which always brings it back up.  (I do have to have it leave the power off for a full 30 seconds.)
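
A watchdog along those lines could be sketched roughly as below. This is not the poster's actual script: it assumes cgminer's API is listening on its default TCP port 4028 and answers a JSON `summary` request with an `MHS 5s` field (the field name may differ between cgminer versions), and the `switch_off` / `switch_on` callables are hypothetical stand-ins for whatever interface the power switch offers.

```python
import json
import socket
import time
from typing import Callable, Optional


def query_mhs_5s(host: str, port: int = 4028, timeout: float = 10.0) -> float:
    """Return 'MHS 5s' from cgminer's API summary (assumed field name)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": "summary"}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # cgminer terminates its reply with a NUL byte, so strip it before parsing.
    reply = b"".join(chunks).rstrip(b"\x00").decode()
    return float(json.loads(reply)["SUMMARY"][0]["MHS 5s"])


class ZeroHashWatchdog:
    """Tracks how long the reported hashrate has been stuck at zero."""

    def __init__(self, max_zero_seconds: float) -> None:
        self.max_zero_seconds = max_zero_seconds
        self.zero_since: Optional[float] = None

    def update(self, mhs: float, now: float) -> bool:
        """Record one sample; return True once zero has lasted too long."""
        if mhs > 0:
            self.zero_since = None
            return False
        if self.zero_since is None:
            self.zero_since = now
        return now - self.zero_since >= self.max_zero_seconds


def power_cycle(
    switch_off: Callable[[], None],
    switch_on: Callable[[], None],
    off_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Cut power, wait the full off period, then restore power."""
    switch_off()
    sleep(off_seconds)
    switch_on()


def run(
    host: str,
    switch_off: Callable[[], None],
    switch_on: Callable[[], None],
    max_zero_seconds: float = 300.0,
    poll_seconds: float = 30.0,
) -> None:
    """Poll forever; power-cycle when hashing has stopped for too long."""
    watchdog = ZeroHashWatchdog(max_zero_seconds)
    while True:
        try:
            mhs = query_mhs_5s(host)
        except (OSError, ValueError, KeyError, IndexError):
            mhs = 0.0  # treat an unreachable or garbled API as "not hashing"
        if watchdog.update(mhs, time.monotonic()):
            power_cycle(switch_off, switch_on)
            watchdog.zero_since = None  # give the unit time to boot again
        time.sleep(poll_seconds)
```

The 300-second threshold and 30-second polling interval are arbitrary placeholders; only the 30-second off period comes from the post. Keeping the timing logic in `ZeroHashWatchdog` separate from the network call makes it easy to try out without a miner attached.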

Well, today power-cycling the unit did not bring it back to life.  I found it stuck repeatedly cycling.  The blue LED on the TP-Link would light, and the red LED on the front would come on, but the fans never spun up.

I opened it up, and out came the stock Antec EarthWatts 650 power supply, and in went a new Corsair HX850.  It was a tight fit.  I had to dismount the PDU board, and reinstall it after the PSU went in.

I'm happy to report that it's once again hashing away.  --avalon-auto has now settled on 360MHz and it's been running for over an hour now, whereas before I had to limit it to 345 to keep things stable.

I'm glad I thought ahead to buy that PSU as a spare for exactly this kind of situation.  I guess I should now buy another spare.

http://www.newegg.com/Product/Product.aspx?Item=N82E16817139011
sr. member
Activity: 315
Merit: 250
Official sponsor of Microsoft Corp.
July 16, 2013, 05:32:28 PM

I get a REALLY large number of invalid shares with my Avalon.

I'm not sure what's going on. It's not consistent.

One day I'll be mining and I'll get 99+% valid shares and the next day or two it will only be 88% valid shares.

I've tried mining on Ozcoin and 50btc and it seems to be bad on both.

I don't think it is my network because my jalapenos only get <1% invalid shares.

I'm using the dynamically adjusting frequency and it usually settles at about 352 MHz or so and 82 GH/s on average. Of course, according to the pool, I'm only getting 70 GH/s or so due to all the invalid shares.

I have an AC feeding directly into the intake on the Avalon and no temperature gets above 48 or 49 or so (very rarely 50).

What could be wrong?
This is normal on stratum. Every time the block changes, you get invalids. Some days burn more blocks than others.

It seems like I have way more than other Avalon users, at least from what I've seen on Ozcoin. Most of them have only a couple percent invalid shares at most, while I will have 12 or 13% invalid shares. Right now I'm standing at about 94% efficiency on average, that's with 400,000 invalid shares in the less than a week I've been mining there.


That is definitely far more than you should be getting. There is a good chance you're submitting heaps of duplicates which may also be a different form of instability that the auto mode can't check for. Try setting a lower maximum speed if you're using auto mode because clearly you're not doing 82GH of useful work.

I also have 11% of rejected shares while HW is ~1.3%
Max temp is 48 C, ambient 28 C.
My command line is: --quiet --avalon-auto --avalon-freq 282-375 --avalon-cutoff 60

Last night it somehow stopped producing rejected shares and stayed that way until today, when I had a power blackout. After resuming, it is still generating 11% rejected shares while working at 352 MHz (autotuned).
legendary
Activity: 2450
Merit: 1002
July 16, 2013, 09:43:22 AM
In case you have multiple pools in the list that it switches between, you will need to add up the Diff1Shares values, then do the calculation.
Also, I find setting up miner.php an easier method of looking at the data =)
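
For reference, that calculation can be sketched as below. It assumes the convention suggested in this thread, HW / (Diff1Shares + HW), with Diff1Shares summed over all pools first; the function names are illustrative only, the counts are the example figures from this thread, and the 60,000 / 40,000 split between two pools is invented.

```python
from typing import Iterable


def hw_error_rate(diff1_shares: int, hw_errors: int) -> float:
    """Fraction of all returned results that were hardware errors."""
    total = diff1_shares + hw_errors
    return hw_errors / total if total else 0.0


def combined_hw_error_rate(per_pool_diff1: Iterable[int], hw_errors: int) -> float:
    """Sum Diff1Shares over every pool before computing the rate."""
    return hw_error_rate(sum(per_pool_diff1), hw_errors)


# "shares 100000 hw 75000" from the thread.
print(f"single pool: {hw_error_rate(100_000, 75_000):.1%}")
# Same totals, hypothetically split across two pools.
print(f"two pools:   {combined_hw_error_rate([60_000, 40_000], 75_000):.1%}")
```

With HW in the denominator, 75,000 errors against 100,000 diff1 shares works out to roughly 43% rather than 75%, so it matters which ratio is being quoted when comparing units.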
legendary
Activity: 1764
Merit: 1002
July 16, 2013, 09:17:08 AM
I had a maximum temp of 51C last night, and my HW values were almost as high as my accepted.  The past week, it has stayed below 50 during routine operation.

Are you talking about "Accepted" vs "HW" or "Diff1Shares" vs "HW"?

"Accepted"  I don't see a number labeled "Diff1Shares" in the status panel.
 

it's there.  look harder.  you should be using that as part of your denominator.
legendary
Activity: 1246
Merit: 1002
July 16, 2013, 08:48:58 AM
I had a maximum temp of 51C last night, and my HW values were almost as high as my accepted.  The past week, it has stayed below 50 during routine operation.

Are you talking about "Accepted" vs "HW" or "Diff1Shares" vs "HW"?

"Accepted"  I don't see a number labeled "Diff1Shares" in the status panel.
 
legendary
Activity: 1112
Merit: 1000
July 16, 2013, 08:42:22 AM
I had a maximum temp of 51C last night, and my HW values were almost as high as my accepted.  The past week, it has stayed below 50 during routine operation.

Are you talking about "Accepted" vs "HW" or "Diff1Shares" vs "HW"?
legendary
Activity: 1246
Merit: 1002
July 16, 2013, 08:40:05 AM
Yesterday my avalon started showing a super high number of hw.  Almost 75% of the accepted diff1a shares.
For instance shares 100000 hw 75000. Any ideas on what to check?

I had a maximum temp of 51C last night, and my HW values were almost as high as my accepted.  The past week, it has stayed below 50 during routine operation.

It has been stable around 343-347 frequency.  I had --avalon-auto set to 285-375.  I just changed it to 285-350.

My output is conveniently shown at http://eligius.st/~wizkid057/newstats/userstats.php/18bLcVkviErQi75zB8X39jZXxHNpSZggdC

At the start of 16 Jul I moved the unit from a warm upstairs room, where the fans fairly consistently ran at 3800, to the basement, where they have run closer to 2200.  The dip is from the time it took to move the machine.  The stability and speed both seemed to improve, but during the night the basement started warming up from its initial 69°F.

I also put a filter in front of the fans, and the fan speed seems to be a little unstable now.  Without the filter, their speed stays pretty constant.
sr. member
Activity: 342
Merit: 250
July 16, 2013, 07:37:23 AM
Yesterday my avalon started showing a super high number of hw.  Almost 75% of the accepted diff1a shares.
For instance shares 100000 hw 75000. Any ideas on what to check?
legendary
Activity: 1610
Merit: 1000
July 16, 2013, 02:54:08 AM
today I have experienced a "bigger problem" for the second time since I've had it

the machine was unresponsive on the web interface, so I thought it had crashed completely and would need a physical restart, which is very bad for me

however, I tried to connect over SSH and somehow it worked, but extremely slowly; it took me a couple of minutes, but finally I was able to type the reboot command and reboot the machine, which it did, and everything is OK again

so, if someone has this totally unresponsive machine, try SSH and be very very patient, it may come up

funny thing is that it did not stop hashing until I rebooted it - i.e. even in this strange unresponsive state, it was doing its job, just the web interface could not be reached and SSH only with great difficulty

since the operation with the whole machine was impossibly slow I could not analyze any logs before reboot, so I have no idea what happened
Out of mem - free?
WiFi disabled yes/no
dmesg?
just a few basic commands will tell you what went wrong. Make sure you post the output next time ;)