
Topic: Antminer S1 Hashrate and Hw Errors (Read 4410 times)

sr. member
Activity: 240
Merit: 250
August 02, 2014, 01:19:27 PM
#10
I have also noticed 1 unit with excessive HW errors.

I had 1 unit new and purchased 2 units used.

One of the used units is showing high HW errors:

for comparison:

original new unit:



first used unit:



second used unit:



I found the calculator and it shows the error rate as:

http://www.coincadence.com/antminer-s1-hardware-error/

unit 1 = 0.00013477952 %
unit 2 = 0.64%
unit 3 = 0.00010218645 %

Maybe less than 1% is fine ??

I do notice that when all 3 units are reset at the same time, the first used unit seems to close in faster on a higher best share.
(which I assume if you found a block would be close to the difficulty level ??)
The original unit found a block on bitminter, but before I could look at the stats for analysis had a power bump so lost them.

Thanks,


legendary
Activity: 4256
Merit: 8551
'The right to privacy matters'
July 15, 2014, 10:36:09 PM
#9
I have an S1 set up at freq 400 temps running dead on 44.

I have a HW error rate of about 2.86% which I found to be alarming.  So I slowed it back down to freq 375 and the error rate basically went away.  

Then I got to thinking.

What difference does the HW error rate make?  Does that mean I'm actually losing shares in my pool?  Because unless you're running solo, that's all you care about.  So I did a little test and this is what I came up with:

Freq   GH/s (from miner)   GH/s (from pool)   Wall Watts   W/GH (miner)   W/GH (pool)
400    199.53              204.84             444          2.23           2.17
375    191.52              179.88             414          2.16           2.30

Reading across each row: the S1 freq, the speed from the miner status, the speed as reported by my pool, the watts the machine is pulling from the wall, watts divided by the speed from the miner status, and, finally, watts divided by the speed as reported by my pool.

So my pool thinks I'm running faster than the S1 thinks it's running with a bunch of hardware errors at freq 400.

But when I slow things down to eliminate the HW errors, my pool thinks I'm dogging even though the S1 thinks I'm doing OK.

If anyone has insight to this, I'm interested.

In the meantime I'm taking the 3% HW hit and also a larger share in my pool.

Okay, you need the settings for 393 and 387. Give me a minute.

https://bitcointalksearch.org/topic/an-attempt-to-create-a-more-complete-freq-value-chart-for-the-ant-miner-chip-589429

Try 387 and 393. Any HW under 2% is okay. Look for the accepts at 387, 393, and 400, and look for the errors.

Pool readings can be all over the place, but they may even out after 1 or 2 days.
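
To illustrate why pool readings jump around: a pool can only estimate your speed from the shares you submit, and shares arrive at random. Here is a rough Python sketch of that idea. The share difficulty of 64 and the 200 GH/s figure are assumed values for illustration, not settings from any pool in this thread, and the function names are made up. The only fixed fact used is that a difficulty-1 share takes about 2^32 hashes on average.

```python
import math

# Average number of hashes needed to find one difficulty-1 share.
HASHES_PER_DIFF1_SHARE = 2 ** 32


def expected_shares(ghs: float, share_difficulty: float, seconds: float) -> float:
    """Expected number of shares a miner submits in the given time window."""
    return ghs * 1e9 * seconds / (share_difficulty * HASHES_PER_DIFF1_SHARE)


def relative_noise(n_shares: float) -> float:
    """Approximate 1-sigma relative error of a random (Poisson) share count."""
    return 1.0 / math.sqrt(n_shares)


# Assumed values: a 200 GH/s miner on a pool handing out difficulty-64 shares.
for hours in (1, 24, 48):
    n = expected_shares(ghs=200.0, share_difficulty=64, seconds=hours * 3600)
    print(f"{hours:>2} h window: ~{n:.0f} shares, noise about {relative_noise(n):.1%}")
```

The longer the window the pool averages over, the smaller the noise, which fits the advice to wait a day or two before comparing frequencies. Real pools may use shorter windows or different share difficulties, so treat the numbers as ballpark only.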
newbie
Activity: 3
Merit: 0
July 15, 2014, 07:01:19 PM
#8
I have an S1 set up at freq 400 temps running dead on 44.

I have a HW error rate of about 2.86% which I found to be alarming.  So I slowed it back down to freq 375 and the error rate basically went away. 

Then I got to thinking.

What difference does the HW error rate make?  Does that mean I'm actually losing shares in my pool?  Because unless you're running solo, that's all you care about.  So I did a little test and this is what I came up with:

Freq   GH/s (from miner)   GH/s (from pool)   Wall Watts   W/GH (miner)   W/GH (pool)
400    199.53              204.84             444          2.23           2.17
375    191.52              179.88             414          2.16           2.30

Reading across each row: the S1 freq, the speed from the miner status, the speed as reported by my pool, the watts the machine is pulling from the wall, watts divided by the speed from the miner status, and, finally, watts divided by the speed as reported by my pool.
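
For anyone who wants to repeat the comparison on their own unit, the W/GH columns are just wall watts divided by each speed. A small Python sketch, assuming the rows in the table split as freq, miner GH/s, pool GH/s, wall watts (the function name is only for illustration):

```python
def watts_per_gh(wall_watts: float, ghs: float) -> float:
    """Efficiency figure: watts drawn at the wall per GH/s of hashrate."""
    return wall_watts / ghs


# (freq MHz, GH/s from miner, GH/s from pool, wall watts), as read from the table.
rows = [
    (400, 199.53, 204.84, 444),
    (375, 191.52, 179.88, 414),
]

for freq, miner_ghs, pool_ghs, watts in rows:
    by_miner = watts_per_gh(watts, miner_ghs)
    by_pool = watts_per_gh(watts, pool_ghs)
    print(f"{freq} MHz: {by_miner:.2f} W/GH (miner), {by_pool:.2f} W/GH (pool)")
```

Lower is better. Which column to trust depends on how long the pool figure was averaged over.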

So my pool thinks I'm running faster than the S1 thinks it's running with a bunch of hardware errors at freq 400.

But when I slow things down to eliminate the HW errors, my pool thinks I'm dogging even though the S1 thinks I'm doing OK.

If anyone has insight to this, I'm interested.

In the meantime I'm taking the 3% HW hit and also a larger share in my pool.
newbie
Activity: 56
Merit: 0
May 08, 2014, 07:15:04 AM
#7
In my personal experience, a hardware error is purely a physical problem; with the S1, about all you can do is deal with the heat. Here is a set of parameters that I was told runs better. I tested it, and performance did indeed increase by 1%. Do not ask me why, I do not know.
         option 'freq_value' '5f05' # 393M
         option 'chip_freq' '393.75'
         option 'timeout' '36'
hero member
Activity: 728
Merit: 500
cryptoshark
May 08, 2014, 05:41:02 AM
#6
I've got a few Antminers.
One of them, when overclocked, has a huge number of errors and can't hash for an hour without a reboot.
On stock settings it has hardware errors too.

It depends on the quality of the chips. If you draw bad chips, they can give hardware errors or completely stop the Antminer from hashing!
newbie
Activity: 4
Merit: 0
May 07, 2014, 01:03:15 PM
#5
Here is my update since the original post.

I bought a new Western Digital N900 dual-band router which arrived at my door an hour ago.  It should be a lot better than my 5 year old Linksys WRT120N.  I went with the Western Digital N900 because it comes with 7 gigabit Ethernet ports, compared to the standard 4 (something you might want to consider if you're a fellow miner).  I configured the new router and continued to run all 4 units through the switch, which is plugged into 1 router port.  HW errors popped up almost instantly on all four units at the same rate as the old router.  So just for fun, I tried plugging unit 4 directly into the router.  It had the same HW error % as it did through the switch.  So my old router/networking doesn't seem to be the problem.

          - I'd also like to note that both the green and orange Ethernet port lights blink intermittently on units 1, 2, and 4.  The orange light on unit 3 stays solid.  All three of my buddy's S1's also had solid orange.
          - Unit 3 is still slower with a higher HW % than unit 4, even though the orange light on unit 4 blinks.

The temperatures on the units normally run between 43-47, but I've seen them as high as 49 on warmer days.  Is that an acceptable temperature range?  (I know that video cards can run a lot hotter than that).  They are directly next to a window with great air circulation, so I still don't believe that overheating is the issue here.  Ambient room temperature is 65-75 degrees depending on the outside temperature.  I have the units set up in the guest bedroom on the dresser and just keep the window open.  Haha :)

I also increased the timeout from 35ms to 36ms and saw no difference.  I guess I'll try 37 and 38 to see if that helps, but I don't think that's going to do anything.

So basically after all of the methodical troubleshooting, I'm back to number 5 on my list from the original post.
          
          "5) I know the S1's are rated at 180-205 gh/s depending on the frequency with a +/- 5% rating. Are units 1 and 2 just slower?

           -192 and 196 gh are within that -5%.  But I doubt that's the issue considering I'm getting lots of hw error."

I guess all S1's are not created equally and it's the luck of the draw.  My buddy must have been really lucky because he's up to 9 units and only one of them gets HW errors like all 4 of mine.  I'm not a BTC veteran by any means, but I hope this troubleshooting process helps other S1 owners and newer miners out there.  Here is some other useful information for everyone:

1)  You can run three overclocked S1 units at 400 MHz off of one Corsair atx1200i platinum Power Supply.  My buddy uses this setup and has three PSU's for nine S1's.  His units run better than mine...

2)  Slush's Pool has significantly outperformed BTC Guild over the 3 week period in which I've been mining.  I have two S1's on each pool and my total rewards are as follows.
          - .59 BTC from Slush's Pool
          - .49 BTC from BTC Guild

Once again, I hope this thread is helpful.  I'd still appreciate any help or possible solutions to my problems.  I'll post another update in the future to let you know if cleaning the heatsinks helped any of my S1 issues.  If I somehow helped you in any way and would like to tip, or if you're just a mining kingpin baller making 100's of BTC a day and want to throw some at me; Here is my wallet address:

1HyBKPMAQKag3vGyHaa9TwkEzu5CCFWXAC

Thanks!


          




full member
Activity: 130
Merit: 100
April 25, 2014, 12:43:31 AM
#4
Use the recommended setting first at 375 MHz.
See this document for the configuration.
https://www.bitmaintech.com/userfiles/download/AntMiner%20S1%20FAQ.pdf

I think your high HW error has to do with the timeout in ms.  Basically play around with values from 35 to 38, e.g. 35, 36, 37, etc.

Clean out the heatsinks with a vacuum cleaner or canned air.  Most of the S1s I've read about are not new. 
What is your ambient temperature?  How is the air flow in the room?

Borrow a new router.  My old Belkin router gave me all sorts of random trouble.  Bought a new one, problem solved.  A dual band wireless TP link router is about $55 on Amazon.

On my units, I get average 195 GH/s at 375 MHz.  Depending on the machine, you may be pushing it too hard and already past the point of diminishing returns with higher freq.

Good luck.
sr. member
Activity: 448
Merit: 250
April 24, 2014, 01:01:47 PM
#3
They are all different. It's like a box of chocolates; you never know what you are going to get. Some S1's are wonderful and give you 0 HW and others give you many more HW.

I have 5 S1's and 2 of them are wonderful and 3 are not so. You just have to find the sweet spot and it's all good.
sr. member
Activity: 314
Merit: 250
April 23, 2014, 10:38:30 PM
#2
I am having similar problems with my recently purchased S1. Roughly the same HW errors as above, and I am unable to overclock to 400MHz even with an extra fan.
newbie
Activity: 4
Merit: 0
April 21, 2014, 11:31:21 PM
#1
So I bought 4 Antminer S1's and have been running them for about a week.  They are all version 1.5 with a date of 2013/12/16. All 4 are plugged into one switch, which is plugged into 1 router port. Each pair is connected to a brand new Corsair RM1000 80 Plus Gold PSU. The first thing that I did was overclock them to 400 MHz and let them run for a full day. I got multiple ASIC chip errors on 3 out of the 4 units.  So I then reflashed them with the newest .bin file, which claimed to have bug fixes for ASIC chip errors and HW errors.  That seemed to work because now I have all 00000000 on all four units. I let them run for another day and this is what I found:

Unit 1   196 gh avg   hw errors 5%
Unit 2   188 gh avg   hw errors 9%
Unit 3   200 gh avg   hw errors 2%
Unit 4   200 gh avg   hw errors 2%

I'm using the formula Hw / (DiffA + DiffR + Hw) * 100 to get the percentage.
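
As a quick way to check the percentage yourself, here is a small Python sketch of that formula, reading it as HW divided by the sum of DiffA, DiffR, and HW. The counter values in the example are made up for illustration and are not taken from any of the four units.

```python
def hw_error_percent(hw: int, diff_a: float, diff_r: float) -> float:
    """HW errors as a percentage of accepted + rejected difficulty + HW errors."""
    total = diff_a + diff_r + hw
    if total == 0:
        # Freshly restarted miner: nothing counted yet, so report 0 instead of dividing by zero.
        return 0.0
    return hw / total * 100.0


# Hypothetical counters from a miner status page.
print(round(hw_error_percent(hw=5_000, diff_a=94_000, diff_r=1_000), 2))  # 5.0
```

Other calculators may define the rate a little differently, so a figure from this formula is not necessarily comparable with one from another tool.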

This HW error rate seems to be unacceptable.  My buddy is running 3 S1's at 400 MHz; all are at or over 200 gh avg and have next to 0 HW errors, like 30 after running them for 9 days, as opposed to my 50,000 - 500,000 hw depending on the machine.  So I have a lot of questions for any experienced people out there, but I want to first tell you what I did.

So unit 2 was giving me the most problems (hw error).  So what I did was back it down to 393 MHz, and I got a 191 gh avg with 4% hw error.  Then after another day, I backed it down to 387 MHz and got roughly a 191 gh avg with 3% hw error.  Then I repeated the process again and ran it at 381 MHz and got a 192 gh avg with about 2% hw error.  If it's only going to run at 192 gh, I might as well run it at 381 MHz as opposed to any of the higher frequencies.  I also lowered unit 1 to 393 MHz and I'm still getting a 196 gh avg with about 2.5% hw error.  So now for all the fun questions...
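
The step-down process above boils down to picking the frequency with the best average hashrate and, on a tie, the lower error rate. A rough Python sketch using the unit 2 numbers from this post (the `Run` type and `best_run` name are just for illustration, and the tie-break rule is my own assumption, not anything official):

```python
from typing import NamedTuple


class Run(NamedTuple):
    freq_mhz: int
    avg_ghs: float
    hw_error_pct: float


def best_run(runs: list[Run]) -> Run:
    """Highest average GH/s wins; ties go to the lower HW error rate, then the lower freq."""
    return max(runs, key=lambda r: (r.avg_ghs, -r.hw_error_pct, -r.freq_mhz))


# Unit 2, roughly one day at each setting.
unit2 = [
    Run(400, 188, 9),
    Run(393, 191, 4),
    Run(387, 191, 3),
    Run(381, 192, 2),
]

print(best_run(unit2).freq_mhz)  # 381
```

One day per setting is a short sample, so small differences like 191 vs 192 gh may just be noise.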

1) What does the hex frequency actually do? I mean the first line when changing the frequency: '0781' or '5f81', etc.

     -I was told that some people keep that the same for frequencies between 350mhz and 400mhz.  Can a bad combo be causing my problems? I used this combo on all 4 units at first.
     #option 'freq_value' '4f81' #205GH
     #option 'chip_freq' '400'
     #option 'timeout' '35'

2) I have an older router which is about 4-5 years old.  I know that it randomly resets at times.  I'm sure that this can lead to hw errors.  But wouldn't it affect all 4 machines equally, not just units 1 and 2? 

      -I tried plugging unit 2 directly into the router instead of the switch and still got a 188 gh avg at 400mhz, so I've ruled out a bad switch.

3) Units 1 and 2 are setup to run on Slush's Pool while units 3 and 4 are setup on BTC Guild.  Does a pool affect hw error?

     -I'm assuming that it is just a coincidence that units 1 and 2 are running slower. The pool probably isn't doing it, because my buddy's S1 on Slush's Pool has low hw errors (.01%) and a good hashrate (200+).

4) Units 1 and 2 are hooked up to their own brand new PSU and units 3 and 4 are on their own brand new PSU.  Both came from newegg and seem to be running just fine.  Is the PSU for units 1 and 2 worse than the other one? Can this be the problem?

     -I doubt that the PSU is the problem.  All my cables are connected properly.

5) I know the S1's are rated at 180-205 gh/s depending on the frequency with a +/- 5% rating. Are units 1 and 2 just slower?

     -192 and 196 gh are within that -5%.  But I doubt that's the issue considering I'm getting lots of hw error.
     -My temperatures are all between 43-47, so I don't think that overheating is the issue.

6) When logging into the antminer interface, auto refresh is on. The only screen that doesn't seem to auto refresh is the miner status screen. How can I fix this?

     -This is just a pain, because I have to click the tab each time that I want to see the new numbers.  The auto refresh at the top left is not present next to the load numbers (only on the miner status tab!). So I can't even change it to on/off. (all 4 units have this prob)

I'm basically new to the whole btc mining game and would appreciate any advice from seasoned veterans out there.  As you can see I've done my homework and know a good amount about computers, but I need your help on this one.  Thanks!