Efudd Z-Series Fuddware 2.3 -Z11/Z11e/Z11j/Z9/Mini - page 27.

efudd

member

Activity: 504

Merit: 51

Quote from: Pizzi_h on November 29, 2018, 03:37:24 PM

...snip...

I reflashed the firmware and now I can push it higher than 656 MHz. Temps are now 49-50 C, fans at 2000 rpm.

OK. Well, if it starts to drop when it is too cold outside, I will have to split the intake air a bit.

Thanks for that info.

Another question...

I also tried the Biggie firmware earlier. At the same MHz it could spike to 17+ ksols; the average was around 14.5, I think.
I got higher spikes with the "Biggie" firmware than with the Mini.

The average is better on the Mini firmware, though.

Good job! I will buy the license when you start bringing in new ones.

The spikes are gonna be completely random for what it is worth. Your miner could get really lucky on calculations for a few seconds and jump to 2x what you would otherwise expect, but the average is where the truth really sits.

I honestly am not sure I am going to sell new licenses and instead stick with the dev supported model. It actually is cheaper for users that way to be honest... it'll take 3-6 months of runtime or more for me to make up what the license fee was at 3%. It's just a lot easier on me to not manage individual licenses.

Jason

Pizzi_h

newbie

Activity: 9

Merit: 0

Quote from: efudd on November 29, 2018, 03:27:35 PM

This is a very good question. First on the CRC error -- that is going to happen some and is only a problem if it is constant. It happens on even the stock firmwares depending on machines, temps, frequencies, and phase of the moon.

Temperatures will play into how far you can push these, but there is not a clear formula for that. What's really interesting is I have a customer with a large install (1000+ machines) who has observed that there is a point where the machines get too cold and slow down! I'm unsure of the exact details on the temperatures, just the observation that was shared with me.

So yes, temperature plays a role both when going up and when going down.

The summers here are very hot -- my miners I had to constantly tune even through the day to get maximum out of them; they always ran best at night.

I hope this helps some.

Jason

I reflashed the firmware and now I can push it higher than 656 MHz. Temps are now 49-50 C, fans at 2000 rpm.

OK. Well, if it starts to drop when it is too cold outside, I will have to split the intake air a bit.

Thanks for that info.

Another question...

I also tried the Biggie firmware earlier. At the same MHz it could spike to 17+ ksols; the average was around 14.5, I think.
I got higher spikes with the "Biggie" firmware than with the Mini.

The average is better on the Mini firmware, though.

Good job! I will buy the license when you start bringing in new ones.

efudd

member

Activity: 504

Merit: 51

Quote from: j.weber on November 29, 2018, 03:22:55 PM

Definitely once a day. Just a quick question, is there a way I can set the frequency for the different hashboards via PuTTY / the JSON?

Yessir, bitmain-freq1, bitmain-freq2, bitmain-freq3 are the 3 variables for that.

The only caveat is if you set the frequencies via that method the web interface will not get updated to reflect it until you go into the web interface and "save frequencies".
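
For anyone doing this over SSH, here is a minimal sketch of changing one board's frequency in the config file. It assumes the three keys live in /config/cgminer.conf (the path that appears in the cgminer command line posted elsewhere in this thread) and that each value is stored as a quoted string. Neither assumption is confirmed here, so check your own file first. The helper name set_board_freq is made up for illustration and is not part of the firmware.

```shell
#!/bin/sh
# Hypothetical helper, not part of the firmware: set one hashboard's frequency
# in a cgminer.conf-style JSON file.
# Assumption: the value is a quoted string, e.g.  "bitmain-freq1" : "650",
# Usage: set_board_freq <conf-file> <board 1-3> <mhz>
set_board_freq() {
   conf="$1"; board="$2"; mhz="$3"
   case "$board" in
      1|2|3) ;;
      *) echo "board must be 1, 2 or 3" >&2; return 1 ;;
   esac
   case "$mhz" in
      ''|*[!0-9]*) echo "frequency must be a whole number of MHz" >&2; return 1 ;;
   esac
   tmp="$conf.tmp.$$"
   # Rewrite only the value of the matching key; every other line passes through.
   sed "s/\(\"bitmain-freq$board\"[[:space:]]*:[[:space:]]*\"\)[^\"]*\"/\1$mhz\"/" "$conf" > "$tmp" \
      && mv "$tmp" "$conf"
}
```

Something like `set_board_freq /config/cgminer.conf 2 650` would then change only the second board. cgminer presumably has to be restarted before the new value takes effect, and the web interface will keep showing the old numbers until the frequencies are saved there.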

Jason

efudd

member

Activity: 504

Merit: 51

Quote from: Pizzi_h on November 29, 2018, 03:09:16 PM

Thoughts.

Z9 Mini
I have my miners hosted outside, or on direct outside air. We have been having around -15 C, and the miner worked great; I got it stable at 681 MHz since release.
2 fans: front 1800 rpm, rear 1640 rpm. Chip temps around 28-30 C, hash average 14.9 ksols.

Today the weather changed drastically to +2 C and all 3 boards showed xxxx.

It seems that it was the bm1740_verify_nonce_integrality CRC error. I rebooted, but it only took about 7 minutes before I got the same error again.
Since then I have only been able to maintain 656 MHz.

As soon as I go above that, I lose one board.

Is it possible that the colder the chips can be kept, the higher the MHz we can maintain? I never tried above 681 MHz.

This is a very good question. First on the CRC error -- that is going to happen some and is only a problem if it is constant. It happens on even the stock firmwares depending on machines, temps, frequencies, and phase of the moon.

Temperatures will play into how far you can push these, but there is not a clear formula for that. What's really interesting is I have a customer with a large install (1000+ machines) who has observed that there is a point where the machines get too cold and slow down! I'm unsure of the exact details on the temperatures, just the observation that was shared with me.

So yes, temperature plays a role both when going up and when going down.

The summers here are very hot -- my miners I had to constantly tune even through the day to get maximum out of them; they always ran best at night.

I hope this helps some.

Jason

j.weber

newbie

Activity: 5

Merit: 0

Definitely once a day. Just a quick question, is there a way I can set the frequency for the different hashboards via PuTTY / the JSON?

Pizzi_h

newbie

Activity: 9

Merit: 0

Thoughts.

Z9 Mini
I have my miners hosted outside, or on direct outside air. We have been having around -15 C, and the miner worked great; I got it stable at 681 MHz since release.
2 fans: front 1800 rpm, rear 1640 rpm. Chip temps around 28-30 C, hash average 14.9 ksols.

Today the weather changed drastically to +2 C and all 3 boards showed xxxx.

It seems that it was the bm1740_verify_nonce_integrality CRC error. I rebooted, but it only took about 7 minutes before I got the same error again.
Since then I have only been able to maintain 656 MHz.

As soon as I go above that, I lose one board.

Is it possible that the colder the chips can be kept, the higher the MHz we can maintain? I never tried above 681 MHz.

efudd

member

Activity: 504

Merit: 51

Folk,

I wanted to get some feedback on dev-fees: Once per day, or split up throughout the day? I've had feedback from both, but am leaning towards once-per-day.

I personally think that the once-per-day has the least impact since it greatly reduces the swapping/moving things around.

Can you please share your viewpoint on this and your reasoning?

Thank you,

Jason

efudd

member

Activity: 504

Merit: 51

Quote from: Marchcat2008 on November 29, 2018, 01:19:09 AM

@efudd Please read PM. Thanks.

@Marchcat2008 - Responded. At this point in time, I am not selling new licenses. The developer supported version will remain available. If this changes in the future, I will update the original post in this thread.

Thank you,

Jason

Marchcat2008

newbie

Activity: 17

Merit: 0

@efudd Please read PM. Thanks.

efudd

member

Activity: 504

Merit: 51

Quote from: waterman on November 28, 2018, 09:29:23 PM

Excellent, good job dude! I want that PS4 :D

Install now! Supplies are Limited!

I was gonna return it and then thought... wait a second!

Best of luck to you. It'll go to someone!

(I can't help but type this while reading it in a "Saul Goodman" voice).

Jason

waterman

full member

Activity: 192

Merit: 119

Excellent, good job dude! I want that PS4 :D

efudd

member

Activity: 504

Merit: 51

Folk,

For the month of December, I will be running a contest for users of the Z9 and Z9 Mini version 2.1 or later firmware. There will be one automatic entry per day per miner. On 12/24, a random machine will be selected. The Summary page on the version 2.1 firmware will be automatically updated to let you know who the winner is. Details will be posted in the thread here and on the Equihash discord.

I've created a thread to discuss this at https://bitcointalksearch.org/topic/december-z9z9mini-firmware-users-ps4spiderman-bundle-giveaway-thread-5077347

You will find the Summary page on your miner updating with this information automatically over the next 24 hours.

I have put details in the original post, but will copy here also.

Thank you,

Jason

efudd

member

Activity: 504

Merit: 51

Quote from: badbart on November 28, 2018, 06:56:37 PM

I installed 2.1 and I don't have an option to upload a license file.

The system page says:
Efudd's Z9 Series Firmware v2.1
No dev-fee until 12/01/2018!

But under upgrade no option to upload a license file.

P.S. My Z9 is running faster now than with your old firmware, with no clock changes.

Refresh your browser cache on the upgrade page. Shift-f5 or cmd-f5 if you are on a Mac. Once uploaded, the license will tell you if applied. The summary page will update on the next poll or restart with your license status. That page may be cached as well.. same thing.

In the next release (currently being tested), I have fixed the page cache issues.

-Jason

badbart

member

Activity: 449

Merit: 24

I installed 2.1 and I don't have an option to upload a license file.

The system page says:
Efudd's Z9 Series Firmware v2.1
No dev-fee until 12/01/2018!

But under upgrade no option to upload a license file.

P.S. My Z9 is running faster now than with your old firmware, with no clock changes.

efudd

member

Activity: 504

Merit: 51

Quote from: efudd on November 28, 2018, 02:49:59 PM

Quote from: xkosx on November 28, 2018, 12:16:21 PM

... snip ...
Kernel log,monitor log, screenshot System,Miner Status and img fan emu. Password in PM
Link https://dropmefiles.com/3iSv8

Thank you for those details. I think I might know what is going on -- can you check your PM and email me at the email address I provided?

Once we can get confirmation, I should be able to get this fixed reasonably quickly.
....snip...
Ok, this is going to take a little longer. This is Roskomnadzor. I am looking for a workaround.

xkosx - by the time you wake up the issue should be resolved. I have migrated primary services to something not blocked by Roskomnadzor. In fact, I can already see Russian installations coming online.

Thank you,

Jason

efudd

member

Activity: 504

Merit: 51

Quote from: chipless on November 28, 2018, 05:37:23 PM

...snip...
This should fix the problem the majority of the time if not completely. Doing a double check gives cgminer 120 seconds and a double check before it starts a new instance of the miner.

place this in the monitorcg file

#!/bin/sh
#set -x
check_inter="60s"

# Succeeds (exit status 0) if a cgminer process is running.
cgminer_running() {
   ps | grep cgminer | grep -v 'grep cgminer' > /dev/null
}

# Second check: wait one more interval and restart only if cgminer is still missing.
chk_again() {
   sleep $check_inter
   #date
   if ! cgminer_running ; then
      /etc/init.d/cgminer.sh restart
   fi
}

while true; do
   sleep $check_inter
   #date
   if ! cgminer_running ; then
      chk_again
   fi
done

There also needs to be some checking added for the ASIC status. If some number of ASICs report an x for status, then cgminer restarts or the system reboots. This can help when overclocking and a board fails out of the blue. The miner will at least restart rather than stay dropped out, losing speed.

Please clean up your quotes instead of re-quoting everyone else before you each time.

That will not fix the race. You can add "chk_again" as many times as you want, it does not remove it.

Jason

chipless

jr. member

Activity: 559

Merit: 4

Quote from: efudd on November 28, 2018, 04:40:56 PM

Folk,

I'd like to share a lesson learned that may be useful to others. I have a Z9mini here that I just checked and it has a hash rate of 0. Immediately I went to the logs to see what was going on and found the following:

Code:

Nov 28 21:33:52 (none) local0.err cgminer[23558]: bm1740_verify_nonce_integrality CRC error. cal-crc=374c, chip-crc=60bf
Nov 28 21:33:52 (none) local0.warn cgminer[23558]: receive a error nonce. total = 8908
Nov 28 21:33:52 (none) local0.err cgminer[23546]: bm1740_verify_nonce_integrality CRC error. cal-crc=ac2d, chip-crc=3f77
Nov 28 21:33:52 (none) local0.warn cgminer[23546]: receive a error nonce. total = 8705

The key here is that these are happening constantly, every second; the count is up to 8000+. (If these are infrequent, every few minutes to hours, they can be ignored.) What's going on? To figure that out, let's take a look at the process list ("System" -> "Monitor")... and what do I find?

Code:

23546 23545 root S < 225m 98% 50% /usr/bin/cgminer --version-file=/usr/bin/compile_time --config=/config/cgminer.conf -T --syslog
23558 23557 root S < 257m 111% 40% /usr/bin/cgminer --version-file=/usr/bin/compile_time --config=/config/cgminer.conf -T --syslog

Two copies of cgminer running! How could that happen? The answer is in this little program right here:

Code:

1012 1 root S 2152 1% 0% {monitorcg} /bin/sh /sbin/monitorcg

This is a factory process that tries to be a "watchdog" for cgminer and restart it if it is not running. From the factory it ran every 20 seconds, but I modified it to sleep for 60 seconds to try to limit the possibility of this race condition.

What happens is if you change frequency or pool configuration, cgminer is stopped and restarted. While that stop/start is occurring, monitorcg has a chance to see cgminer is not running and start one itself. End result: two cgminers stepping on each other.

I may end up removing /sbin/monitorcg from the firmware as I've attempted to fix this particular race a myriad of ways... but when two separate processes (web interface actions and monitorcg) are both touching the same resource ("cgminer"), there is not any good way to prevent them from stepping on each other unless they are talking to each other constantly to achieve what is called "quorum".

What's the lesson here? Many times the errors that you may see are a function of this particular race condition... and if you have two cgminer processes running, the fix is to kill/restart them. The simplest way to do that is just to go to the frequency page and click submit. That will terminate both cgminers and hopefully restart one before monitorcg tries to help. A guaranteed way to fix it is to reboot, but I am not a fan of unnecessary reboots.

Hopefully this bit of information will be useful to someone. I've been meaning to write posts like this explaining various scenarios for a while.

Thank you,

Jason

This should fix the problem the majority of the time if not completely. Doing a double check gives cgminer 120 seconds and a double check before it starts a new instance of the miner.

place this in the monitorcg file

#!/bin/sh
#set -x
check_inter="60s"

# Succeeds (exit status 0) if a cgminer process is running.
cgminer_running() {
   ps | grep cgminer | grep -v 'grep cgminer' > /dev/null
}

# Second check: wait one more interval and restart only if cgminer is still missing.
chk_again() {
   sleep $check_inter
   #date
   if ! cgminer_running ; then
      /etc/init.d/cgminer.sh restart
   fi
}

while true; do
   sleep $check_inter
   #date
   if ! cgminer_running ; then
      chk_again
   fi
done

There also needs to be some checking added for the ASIC status. If some number of ASICs report an x for status, then cgminer restarts or the system reboots. This can help when overclocking and a board fails out of the blue. The miner will at least restart rather than stay dropped out, losing speed.

efudd

member

Activity: 504

Merit: 51

Folk,

I'd like to share a lesson learned that may be useful to others. I have a Z9mini here that I just checked and it has a hash rate of 0. Immediately I went to the logs to see what was going on and found the following:

Code:

Nov 28 21:33:52 (none) local0.err cgminer[23558]: bm1740_verify_nonce_integrality CRC error. cal-crc=374c, chip-crc=60bf
Nov 28 21:33:52 (none) local0.warn cgminer[23558]: receive a error nonce. total = 8908
Nov 28 21:33:52 (none) local0.err cgminer[23546]: bm1740_verify_nonce_integrality CRC error. cal-crc=ac2d, chip-crc=3f77
Nov 28 21:33:52 (none) local0.warn cgminer[23546]: receive a error nonce. total = 8705

The key here is that these are happening constantly, every second; the count is up to 8000+. (If these are infrequent, every few minutes to hours, they can be ignored.) What's going on? To figure that out, let's take a look at the process list ("System" -> "Monitor")... and what do I find?

Code:

23546 23545 root S < 225m 98% 50% /usr/bin/cgminer --version-file=/usr/bin/compile_time --config=/config/cgminer.conf -T --syslog
23558 23557 root S < 257m 111% 40% /usr/bin/cgminer --version-file=/usr/bin/compile_time --config=/config/cgminer.conf -T --syslog

Two copies of cgminer running! How could that happen? The answer is in this little program right here:

Code:

1012 1 root S 2152 1% 0% {monitorcg} /bin/sh /sbin/monitorcg

This is a factory process that tries to be a "watchdog" for cgminer and restart it if it is not running. From the factory it ran every 20 seconds, but I modified it to sleep for 60 seconds to try to limit the possibility of this race condition.

What happens is if you change frequency or pool configuration, cgminer is stopped and restarted. While that stop/start is occurring, monitorcg has a chance to see cgminer is not running and start one itself. End result: two cgminers stepping on each other.

I may end up removing /sbin/monitorcg from the firmware as I've attempted to fix this particular race a myriad of ways... but when two separate processes (web interface actions and monitorcg) are both touching the same resource ("cgminer"), there is not any good way to prevent them from stepping on each other unless they are talking to each other constantly to achieve what is called "quorum".

What's the lesson here? Many times the errors that you may see are a function of this particular race condition... and if you have two cgminer processes running, the fix is to kill/restart them. The simplest way to do that is just to go to the frequency page and click submit. That will terminate both cgminers and hopefully restart one before monitorcg tries to help. A guaranteed way to fix it is to reboot, but I am not a fan of unnecessary reboots.
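
To check for this condition from a shell rather than the web interface, something like the following sketch could work. The function names are made up for illustration and are not shipped with the firmware. The sketch assumes the process list and syslog lines look like the excerpts above, and both helpers read from stdin so they can be fed either live output or a saved copy.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of the firmware).

# Count cgminer processes in `ps` output read from stdin.
# Assumption: the binary shows up as /usr/bin/cgminer, as in the listing above.
# The [/] bracket keeps this grep from matching its own command line in `ps`.
count_cgminer() {
   grep -c '[/]usr/bin/cgminer'
}

# Tally CRC error lines per cgminer PID in syslog-style input read from stdin.
# Prints "PID COUNT" pairs; errors from two different PIDs hint at duplicate miners.
crc_errors_by_pid() {
   grep 'bm1740_verify_nonce_integrality CRC error' \
      | sed 's/.*cgminer\[\([0-9]*\)\].*/\1/' \
      | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Running `ps | count_cgminer` and seeing anything above 1 would point to the duplicate-miner case described here. The log location is not given in this thread, so pipe in whichever log output you have.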

Hopefully this bit of information will be useful to someone. I've been meaning to write posts like this explaining various scenarios for a while.

Thank you,

Jason

efudd

member

Activity: 504

Merit: 51

Quote from: chipless on November 28, 2018, 03:28:24 PM

...snip...

Netstat on the machine doesn't lie about how many times you are trying to connect. You're in and out of dev mode, and the callbacks to your server are telling the miner what mode to mine in. More or less, the firmware is dependent on the ability to reach your servers, and if it can't, it stops mining and the user loses some shares. I ran for hours and it would only mine if it connected to your server at bootup; it never went into a bypass mode. Not a smear, just the facts about your release.

Thank you for your feedback and I'm sorry the firmware did not meet your needs. Please let me know if I can assist in some manner in the future.

Jason

chipless

jr. member

Activity: 559

Merit: 4

Quote from: efudd on November 28, 2018, 10:15:57 AM

Quote from: chipless on November 28, 2018, 09:39:56 AM

Never said I didn't know what I was doing and never used your work or others besides the factory firmware. As far as the lockout problem your firmware will have the same issue at some point unless you found the cause. By your statements you have had the issue or have helped already. Your fix was just do a recovery but you don't know the cause of the issue as far as I know your explanation on my post was unsure. I forget but in the end you still had to tell them to do a recovery.

I pointed out that I had already included the recovery conditions as a worst case, not that I had seen the issue. I have had to use the recovery exactly once when I bricked my own machine on the very first image I ever made. I was attempting to be polite. Users won't have that issue with my releases unless it is flat out concocted.

Quote from: chipless on November 28, 2018, 09:39:56 AM

You also need to inform your customers that their miner on the current 2.1 version may stop mining if it can't reach your API callback server us-api1.fudd.net or your dev-fee pool zec-bj.ss.poolin.com, or they may lose shares on a callback or dev-fee connection error. I did some testing from bootup for 6 hours, and any time the callback couldn't reach your server or the dev server it gave the error I posted to you in an earlier post, in which the kernel log stated that about 100 shares were lost. It stopped mining at different times and eventually would start again. Callbacks also seem to happen more than once every 280 minutes.

To sum it up: if your server is down or unreachable, the miner may stop mining on all servers for a short time, and the owner loses shares because of it.

If the authorization server is down, the inability to start is a purposeful design decision. What you are missing there is how that is cached and handled in various failure scenarios and the fact there is regionalization on the API server you have not yet found. You've also not found the downtime mode that exists such that maintenance server-side can occur without downtime to miners. Nor have you found the retry mechanisms to ensure continuity in mining. Instead, you've created fake scenarios without understanding what you are doing.

As far as the pool being down, no, you have done something wrong. cgminer will simply find another pool if a pool is unreachable.

As far as the callback goes, you are also simply wrong there, too unless you were repeatedly restarting cgminer and purposefully breaking the network.

As far as 280 minutes go, you are also simply wrong there as well. In fact, 280 minutes is not even configured at the moment. Paid user callbacks are once a day. Dev user callbacks are once every 2 hours simply for testing purposes (more traffic) and will drop back to once a day by 12/01.

As far as your analysis about lost shares, there is a piece you are missing there as well; shares there are not defined as "successful shares waiting to be submitted to the server", but rather the current getwork queue, many of which will be discarded locally since they do not meet the targets.

Quote from: chipless on November 28, 2018, 09:39:56 AM

Enjoy the input

I did! Nice smear attempt, though.

Jason

Netstat on the machine doesn't lie about how many times you are trying to connect. You're in and out of dev mode, and the callbacks to your server are telling the miner what mode to mine in. More or less, the firmware is dependent on the ability to reach your servers, and if it can't, it stops mining and the user loses some shares. I ran for hours and it would only mine if it connected to your server at bootup; it never went into a bypass mode. Not a smear, just the facts about your release.
