
Topic: Swedish ASIC miner company kncminer.com - page 93. (Read 3049514 times)

legendary
Activity: 1610
Merit: 1003
"Yobit pump alert software" Link in my signature!
I've got this firmware running on multiple full sets of Titans and it's flawless.

Vegas
legendary
Activity: 2450
Merit: 1002
So, I've got a question for y'all about the status page.
What is the "Temperature" column? On my Titan it's substantially lower than the DCDC temp average.
Is it the ASIC temp or some sort of ambient temp?
I don't see how it would be the ASIC itself, since mine's so cool and I would expect the chips to run quite hot.

If it's the actual ASIC temp, I *may* be able to code in a temperature protection for the chip. But if it's not, then I wouldn't be able to.
legendary
Activity: 2450
Merit: 1002
GenTarkin -- looks like after one soft reset fail it is going straight to hard reset not the loop.

btw how can I delete this log and start over?

[2015-08-14 08:39:44] Die 5-1 requires restart
Attempting softreset of ASIC# 5 DIE# 1
{ "asic_6_voltage": { "die2": "-0.0366" }, "asic_6_frequency": { "die2": "150" } }
STATUS=S,When=1439559594,Code=92,Msg=PGA 0 set OK: Die setup Ok; asic 5 die 1 cmd RECONFIGURE,Description=bfgminer 5.2.0|
[2015-08-14 08:40:24] Die 5-1 restarted
Manually disabled die detected, skipping dead die detection. ASIC# 4, DIE# 2
Manually disabled die detected, skipping dead die detection. ASIC# 4, DIE# 4
Moving on with dead die test, no manual disabled die found
[2015-08-14 08:45:38] Die 5-1 requires restart
Attempting softreset of ASIC# 5 DIE# 1
KnC: Frequency change FAILED! { "asic_6_voltage": { "die2": "-0.0366" }, "asic_6_frequency": { "die2": "150" } }
Soft reset failed, initiatng hard reset
Stopping bfgminer.
Power cycling ASIC# 5
INFO: Attempt to power down dc/dc
INFO: Attempt to power UP dc/dc
Starting bfgminer.

[2015-08-14 08:46:45] Die 5-1 restarted
Manually disabled die detected, skipping dead die detection. ASIC# 4, DIE# 2
Manually disabled die detected, skipping dead die detection. ASIC# 4, DIE# 4
Moving on with dead die test, no manual disabled die found
[

That's intentional: if a soft reset issued with the waas -s command fails, it will immediately perform a hard reset. The reason is that, from watching my Titan, in every case where waas -s failed the die could never be brought back unless a hard reset was performed. It's working as I designed it =) ... Remember, the waas -s command should work even if there are pool issues; therefore, like I said, a failing waas -s needs a hard reset.
KnC: Frequency change FAILED! { "asic_6_voltage": { "die2": "-0.0366" }, "asic_6_frequency": { "die2": "150" } }
^ means waas -s failed.

I guess, if it really is problematic for your rig, I could have it skip the waas -s failed/success checking and simply rely on current readings from the DCDCs to measure whether soft resets worked on the die. But that could potentially lose time, because it runs the loop for 5 minutes. Can you confirm whether just soft-reset spamming this die actually brings it back 100% of the time? If so, I'll upload a mod for ya that disables waas -s success/fail checking.
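A minimal sketch of the escalation rule described above, written in Python purely for illustration (the real monitoring script may be structured quite differently). The names `Action`, `soft_reset_failed` and `action_after_soft_reset` are hypothetical, and treating the "Frequency change FAILED" text from the log as the only failure signal is an assumption:

```python
from enum import Enum


class Action(Enum):
    NONE = "none"              # soft reset worked, nothing more to do
    HARD_RESET = "hard_reset"  # power cycle the ASIC and restart bfgminer


# Marker text copied from the log excerpt in this thread. Assuming it is
# the only sign of a failed `waas -s` is a simplification of this sketch.
FAILURE_MARKER = "Frequency change FAILED"


def soft_reset_failed(waas_output: str) -> bool:
    """Return True if the captured `waas -s` output contains the failure marker."""
    return FAILURE_MARKER in waas_output


def action_after_soft_reset(waas_output: str) -> Action:
    """A failed soft reset escalates straight to a hard reset."""
    return Action.HARD_RESET if soft_reset_failed(waas_output) else Action.NONE
```

Feeding it the two outputs from the log above, the first soft reset (no marker) would need no further action, while the second (with the marker) would go straight to a hard reset.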


Also, regarding logs, I just looked at the log directory, and monitordcdc.log is not getting large for me. So you shouldn't have to worry about space consumption (of course, this depends on how badly your dies misbehave in a given timeframe =P). I think the logs get cycled out too, though =).
If the log file is too large to 'cat', just use tail to return the last n lines of it, like:
tail -n 25 /var/log/monitordcdc.log
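If you would rather see a summary than scroll through the tail, something like the following Python sketch could count how often each die asked for a restart. It assumes the log lines look exactly like the "Die 5-1 requires restart" lines quoted in this thread; `count_restart_requests` is a hypothetical helper, not part of the firmware:

```python
import re
from collections import Counter

# Matches lines such as: [2015-08-14 08:39:44] Die 5-1 requires restart
RESTART_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] Die (?P<asic>\d+)-(?P<die>\d+) requires restart"
)


def count_restart_requests(lines):
    """Count 'requires restart' entries per (asic, die) pair."""
    counts = Counter()
    for line in lines:
        match = RESTART_RE.match(line)
        if match:
            counts[(int(match["asic"]), int(match["die"]))] += 1
    return counts
```

On the miner itself this could be fed with `open("/var/log/monitordcdc.log")`, since a file object iterates line by line.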
legendary
Activity: 1610
Merit: 1003
"Yobit pump alert software" Link in my signature!
It's your power supply. These units are VERY finicky. If you want NO power supply issues, use 2x EVGA 850 W (for batch 1) and 3x EVGA 850 W for batch 2. As a computer repair shop owner I stock about 15 different kinds of power supplies, so I was fortunate enough to be in a position to test as many power supplies as I wanted. This is the ONLY perfect config. Put the PCI-E cables in an X config when you plug them into your power supply. If you are low on funds, try any other power supply, maybe with just 1 cube by itself. These cubes do not turn off because of heat; they will usually burn themselves up first. The only exception MIGHT be if the heatsink is not even touching the processor in the middle. But I've left the heatsink by itself, without any bracket, just sitting on top of the processor for hours and it doesn't go above 50C. Don't try this, though. I DID have the VRMs covered. It's your PSU. Send your donation to Tarkin, not me.

Vegas
newbie
Activity: 2
Merit: 0
If GenTarkin or anyone else could provide input, I would appreciate it.

I have two Titans that were working solidly: one was at 330 and the other at 320, with 1 die dead on the first and 2 dies dead on the second. After shipment, there are 4 dead dies, 1 half-working die @ 200 MHz, and 2-3 dies that work when reset but slowly gain in amperage from the standard area around 38-42 (300-325) all the way up to around 52 (and over 100C), at which point they cut off and everything goes to 0.00. I am unsure how this occurred, as everything was working fine prior to shipping. I thought at first it was the different brand of PSUs being used, but they are excellent brands/models and should not be causing any issues.

Could the dies slowly gaining amperage be caused by heatsinks getting knocked loose in shipping, or do you suspect something else?

I will be able to test your fw on the units tomorrow and provide results. In the meantime I would like to brainstorm possible resolutions.

After suffering a similar issue with a Titan I bought recently, I contacted KnC and told them the symptoms, and they asked me to check the heatsink. There is a rivet at the back of the cube which holds a screw in place; on mine it had been pushed out, leaving the heatsink loose at the back. I just used a slightly longer screw with a washer and nut on the outside of the case, and the dies then worked as they should.

It might also be worth refreshing the CPU paste; mine had dried up from the heat build-up with the heatsink separated from the chip.
legendary
Activity: 2450
Merit: 1002
Hi GenTarkin, can you please list all the changes from the original KNCMINER 2.0 img to your latest image? Please list them here or, preferably, on the GitHub.

Thanks
Vegas

Well, I haven't made a new release out of my latest changes, including the smarter soft/hard reset. But if you go through all the releases on my GitHub, each has notes for the changes made since the previous release. At some point, when the current test is confirmed working, I'll make a new release and post change notes with it =)
sr. member
Activity: 342
Merit: 250
Sounds like the thermal grease under the heatsink may need to be replaced.
legendary
Activity: 2450
Merit: 1002
My firmware is not designed to fix anything related to actually damaged dies. You can give it a shot, but please flag any dies that misbehave as OFF or set them to the speed they actually run at.

It sounds like the one that's climbing in temp may need its HSF reseated; that is, of course, only if it's the dies that are overheating.
legendary
Activity: 1610
Merit: 1003
"Yobit pump alert software" Link in my signature!
I have downloaded the latest image and can confirm it works excellently!

Vegas
sr. member
Activity: 342
Merit: 250
Here's the screenshot; trying your latest commit now.



Wait, this is while bfgminer says it's configured successfully?! ... 'cause according to the screenshot it's not configured successfully at all, it's not hashing ... that's why it got hard reset.

Yeah, it's an odd one. It is hashing -- that die never displays -- if I turn off all the good dies and let that die run, I get approx. 10 MH/s on that bad die.

BTW, that latest commit fixed it; the soft resets are random, as needed -- before, they occurred every 37 secs or so while in that loop.

I bet what's going on is that one of the DCDCs for that die is completely shot and the other is barely limping along. The shot one would of course fail the threshold test. Now I have the script doing an AND comparison of the threshold results for both DCDCs in the pair for the die. If both fail the threshold test, then it schedules the hard reset.
I think dies that need a hard reset will have DCDCs that aren't putting out any current anyway, so that's good: they would be scheduled for a hard reset as desired.
Dies such as yours still have a bit of current through at least one DCDC, so we wouldn't want a hard reset on them.
Miners who have completely dead dies should set them to OFF in the webgui; that way my script won't constantly try to do hard resets.

Also, keep in mind this will still schedule a hard reset if there has been no work from the pool in over 6 mins or so (once it enters the die reset loop).

Hopefully this will work as expected for needed hard resets; that's what I'm waiting to test next on my rig haha!
Also, eventually I will scale back the log output during that loop, once we confirm stuff works, so the log doesn't get huge.
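To make the AND comparison concrete, here is a rough Python sketch of the decision described above. It is not the actual script: the function names, the 1.0 A default threshold, and the exact 6-minute cutoff are assumptions picked for illustration only.

```python
from typing import Tuple


def fails_threshold(current_amps: float, threshold_amps: float) -> bool:
    """A DCDC fails the test when its output current is below the threshold."""
    return current_amps < threshold_amps


def die_needs_hard_reset(
    dcdc_currents: Tuple[float, float],
    threshold_amps: float = 1.0,         # hypothetical value, not from the firmware
    minutes_without_work: float = 0.0,
    no_work_limit_minutes: float = 6.0,  # "over 6 mins or so" per the post above
) -> bool:
    """Schedule a hard reset only if BOTH DCDCs of the die's pair fail the
    threshold test, or if the pool has sent no work for too long."""
    both_fail = all(fails_threshold(c, threshold_amps) for c in dcdc_currents)
    pool_stalled = minutes_without_work > no_work_limit_minutes
    return both_fail or pool_stalled
```

With this rule, a die where one DCDC is dead but the other still carries some current is left alone, which matches the limping-die case discussed above.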


This thing is running smoothly ... no hard resets today at all, but I haven't needed any either; I usually just need 1 or 2 every few days.
legendary
Activity: 2450
Merit: 1002
https://github.com/GenTarkin/Titan/releases/tag/v.93

New release published!!

---details---

Login through SSH & webgui is now: admin/admin (should be, anyway LOL! Hope I updated it correctly) =P Test it, people! =)

If a soft die reset fails, initiate the hard reset sequence (power off cube, restart bfgminer).
Instead of setting dies with overheating DCDCs to OFF, now scales down 25 MHz each check until the DCDCs are under the temp threshold. If it goes to 100, then sets the die to OFF.
Added more temps to the temp threshold setting, including all numbers between 90 & 95.

*Default DCDC temp monitoring settings are: ENABLED / 90C

---details---
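The frequency scale-down in these notes can be pictured with a small Python sketch. This is only an illustration under assumptions: `next_frequency` is a hypothetical name, "under the threshold" is read as strictly below it, and "goes to 100" is read as the lowered frequency reaching 100 MHz or less.

```python
from typing import Optional

STEP_MHZ = 25   # scale-down step per check, from the release notes
MIN_MHZ = 100   # reaching this turns the die OFF, from the release notes


def next_frequency(
    freq_mhz: int,
    hottest_dcdc_c: float,
    threshold_c: float = 90.0,  # default threshold from the release notes
) -> Optional[int]:
    """Return the die frequency to use after this check, or None for OFF."""
    if hottest_dcdc_c < threshold_c:
        return freq_mhz  # DCDCs are under the threshold, leave the die alone
    lowered = freq_mhz - STEP_MHZ
    # Assumption: once the lowered value hits 100 MHz the die is switched OFF.
    return None if lowered <= MIN_MHZ else lowered
```

For example, a die at 300 MHz with a 95C DCDC would drop to 275 on the next check and keep stepping down on later checks until the temperature falls under the threshold or the die is set to OFF.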


Please test, everyone; I'll fix issues as they arise. Please, please donate =) It helps fuel my motivation to continue improving stuff =)

Instead of burning the new img file, I tried doing a git pull:

cd knc-asic
git stash save --keep-index                (didn't need this line on all rigs)
git pull
cd
./update-webgui.sh

Seems to work, but did I miss anything?




If I wanted to update my Pi to your firmware without flashing the full image with win32diskimager, what would my commands be?

git pull
./update-webgui.sh

That's it? How do I specify your git repository? I'm a little confused, as TXSteve added that 'didn't need this line on all rigs' comment, so I'm not sure whether I need that git stash line or not.

I need to upgrade a remote box to test with this fw but don't have physical access. I'm assuming this will work?

Yeah, you'll want to do a full flash to a new SD card at least once; then you can do git pulls =)
newbie
Activity: 14
Merit: 1
I burned the img file for the 1st upgrade; after that you can do a git pull (there may be another way).
If the git pull gives an error, then run the git stash line first:

git stash save --keep-index
git pull
./update-webgui.sh

reboot

Ahh, that makes sense. Thanks for the confirmation; looks like I won't be able to remote upgrade this.