
Topic: Gateless Gate Sharp 1.3.8: 30Mh/s (Ethash) on RX 480! - page 105. (Read 214410 times)

newbie
Activity: 13
Merit: 0
1.1.8 doesn't detect my cards at all on startup,
1.1.7 crashes on startup, 1.1.6 starts and runs, but only 8 cards are available.
Please fix this so I can use your miner.  :)

A log file, please!

Hey Zawawa, here's the log:

2017-12-30 12:59:19.2235 [1] Gateless Gate Sharp 1.1.8 alpha started.
2017-12-30 12:59:19.9210 [1] Exception in InitializeDevices(): Method not found: 'Byte[] Cloo.ComputeDevice.get_BoardNameAMD()'.   at GatelessGateSharp.MainForm.InitializeDevices()
   at GatelessGateSharp.MainForm.MainForm_Load(Object sender, EventArgs e)
2017-12-30 12:59:22.9576 [1] Exception: Object reference not set to an instance of an object.   at GatelessGateSharp.MainForm.LoadDatabase()

You just need to upgrade the Cloo.dll to the latest version. It is in the distribution package.


Which distribution package is this? I downloaded the MSI from GitHub. Do you mean the .NET Framework?
full member
Activity: 729
Merit: 114
I hope everyone would eventually see that paying the DEVFEE is actually a good thing for themselves.
That way, I can continue development without worrying about money, and my wife would stop bugging me about how much money I am making every day. lol
Right now, I have to spend a considerable chunk of my time on my own small mining farm just to support my family.
If I can live off the DEVFEE, I don't have to maintain the farm, I can just focus on GGS, and the miner gets faster and more stable with more features.
It is just as simple as that.

Totally agree.  The devfee should be there; it keeps the dev motivated to improve the miner.  How about having a static 0.5% devfee plus a user-configurable 0.5-1%?
sr. member
Activity: 588
Merit: 251
I'm curious why you picked the ethash-new kernel from sgminer-gm.  I'm still using most of the original ethminer kernel, and get within 1% of the original sgminer-gm ethash kernel.
With fglrx, amdgpu-pro 16 (and amdgpu-pro 17 when using -legacy) on GCN3, the keccak loop compiles to 306 instructions and needs 80 VGPRs.  16 of the instructions are scalar for the loop and Keccak_f1600_RC lookup, and 30 of the vector instructions are just v_mov_b32 to shuffle around registers at the end of each round.  My calculations by hand were that each loop would take 268 instructions minimum if it could be done without any move instructions.

Besides skipping most of the chi function in the final round which ethash-new already does, some of the theta function in the first round could be skipped as well.  Together those optimizations can save 3-4% of the instructions.  Aside from Wolf's GCN assembler kernel, I'm not aware of anyone doing a highly optimized ethash kernel.  He shared some of the code with me, and even he wasn't able to get the code below 250 instructions per loop.  The major difference for his kernel was that it only needed 64 VGPRs, and therefore permitted 4 waves instead of 3.
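The 80 vs. 64 VGPR point can be checked with a rough occupancy calculation. This is only a sketch: the figures of 256 VGPRs per SIMD lane, a 10-wave cap per SIMD, and allocation in groups of 4 registers are assumptions about GCN that are not stated in the post, and the function name is made up for illustration.

```python
# Assumptions (not from the post): a GCN SIMD offers 256 VGPRs per lane,
# holds at most 10 waves, and allocates VGPRs in groups of 4.
VGPRS_PER_LANE = 256
MAX_WAVES_PER_SIMD = 10
VGPR_GRANULARITY = 4


def waves_per_simd(vgprs_used: int) -> int:
    """Return how many wavefronts fit on one SIMD for a given VGPR count."""
    # Round the request up to the allocation granularity.
    allocated = -(-vgprs_used // VGPR_GRANULARITY) * VGPR_GRANULARITY
    return min(MAX_WAVES_PER_SIMD, VGPRS_PER_LANE // allocated)


if __name__ == "__main__":
    for vgprs in (80, 64):
        print(f"{vgprs} VGPRs -> {waves_per_simd(vgprs)} waves per SIMD")
```

Under those assumptions, 80 VGPRs gives 3 waves and 64 VGPRs gives 4, which matches the numbers quoted for the two kernels.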

However, as you know, with a plain ethash kernel, the keccak function isn't the bottleneck, it's memory latency/bandwidth.  But since you are now writing dual miner kernels, the fewer cycles taken in the keccak function, the more available for the secondary algorithm.  Getting any significant optimizations out of the keccak function is very difficult, which is no surprise since it wouldn't have been chosen for sha-3 if there were easy shortcuts.  I'm still looking at ways to optimize the 2nd keccak specific to the fact it only needs 64 bits output (or perhaps even only 32).  That may or may not bear fruit, but I'm more confident in another idea I have.  Since I enjoy thinking more than I do coding, by sharing it with you it is more likely to get implemented sooner.

The ethash inner loop that reads 64 * 128-byte DAG items is the primary bottleneck, as I believe each read has a latency of around 100us.  With memory controller channel conflicts and GDDR5 bank conflicts, some reads could have a latency of 150us.  When one wave is stalled waiting for memory, the 2nd and 3rd wave can run, but I suspect there is still room to improve SIMD occupancy.  While one way is by reducing VGPR usage to get 4 waves in flight, another way would be to interleave 2 ethash iterations.  The 64 bytes of state output from the first keccak hash can be saved in LDS, then a 2nd nonce can be run through the keccak hash while the 64 DAG reads are going on for the first nonce.  Since there are only 24 rounds to the keccak hash, the rounds would need to be divided in two, for a total of 48 half-rounds interleaved between the first 49 DAG lookups.  LDS usage would be 64 bytes * 64 threads * 4 SIMDs = 16KB per wave on each CU, or 48KB with 3 waves.  The regular ethash kernel uses 6KB of LDS for the shared DAG reading, so the total would still be under the 64KB max.
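The LDS budget in the paragraph above can be spelled out as plain arithmetic. This is just a back-of-the-envelope sketch using the figures from the post; the constant and function names are made up for illustration, and it assumes the 6KB used for shared DAG reading does not grow with the number of waves.

```python
# All figures below are taken from the post; names are illustrative only.
STATE_BYTES = 64          # keccak state saved per thread
THREADS_PER_WAVE = 64
SIMDS_PER_CU = 4
DAG_SHARE_LDS = 6 * 1024  # LDS the regular kernel uses for shared DAG reads
LDS_MAX = 64 * 1024       # LDS available per CU
KECCAK_ROUNDS = 24


def lds_budget(waves_per_simd: int) -> tuple[int, int, int]:
    """Return (bytes per wave slot across the CU, saved-state bytes, total bytes)."""
    per_wave = STATE_BYTES * THREADS_PER_WAVE * SIMDS_PER_CU
    saved_state = per_wave * waves_per_simd
    return per_wave, saved_state, saved_state + DAG_SHARE_LDS


if __name__ == "__main__":
    per_wave, saved_state, total = lds_budget(3)
    print(per_wave // 1024, "KB per wave on each CU")
    print(saved_state // 1024, "KB of saved state with 3 waves")
    print(total // 1024, "KB total, limit is", LDS_MAX // 1024, "KB")
    print(KECCAK_ROUNDS * 2, "half-rounds to interleave with the DAG lookups")
```

With 3 waves this comes to 48KB of saved state plus 6KB, i.e. 54KB, which leaves about 10KB of headroom under the 64KB limit.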

I even have another optimization idea, but it can't be implemented in OpenCL.  An assembler kernel could use the DS ALU concurrently with the VALU.  Unfortunately the DS ALU has no shift instruction, so that will still have to be done by the VALU.  All or most of the XOR operations could be done using DS_XOR instructions.  With 32 LDS banks, only 32 of the 64 threads on a CU will be able to concurrently execute a DS instruction, but that wouldn't be a concern with a dual mining kernel where 2 out of the 4 SIMD units per CU are working on the secondary algorithm.
sr. member
Activity: 588
Merit: 251
AMD's Blockchain Driver seems incompatible with Windows 10 Version 1709.
Grrrrr......

Ready to switch to Linux yet? :-)

See, I have been using Linux before it reached Version 1.0, and I would never do server-side stuff on Windows.
The thing is, I am essentially an effort-minimalist, and I am actually quite happy with the Cathedral and choose it over the Bazaar anytime when it comes to desktop PC's.
I had pretty interesting conversations with top engineers at NiceHash regarding this subject.
Our consensus was that consumers just want a neat all-in-one package that just works without any tweaking.
Philosophically, this attitude is in direct contrast to Un*x's design philosophy as described by Kernighan and Ritchie, which encourages combined use of small tools.
What I am trying to achieve with GGS is to cater to the needs of the former, and that is why I started developing the GUI version first.
(I will eventually make a command-line version.)
We will see how that goes.

I agree that the Windoze crowd want something as close to idiot proof as possible.  I also agree with others that the vast majority (my guess would be 80+%) of serious miners use Linux.  For a gamer with a couple of GPUs that wants to make a few extra $ mining, GGS looks great.  And as others have pointed out, that crowd has the least amount of clue and so is less likely to disable a dev fee.

If it wasn't obvious, I'm not requesting that you make a Linux version.  I just like to have a little fun at the expense of those that go into the quagmire of Windows software development.
sr. member
Activity: 1484
Merit: 253
I hope everyone would eventually see that paying the DEVFEE is actually a good thing for themselves.
That way, I can continue development without worrying about money, and my wife would stop bugging me about how much money I am making every day. lol
Right now, I have to spend a considerable chunk of my time on my own small mining farm just to support my family.
If I can live off the DEVFEE, I don't have to maintain the farm, I can just focus on GGS, and the miner gets faster and more stable with more features.
It is just as simple as that.
Yes, you're absolutely right. Disabling the devfee is a matter of conscience for every miner. But everyone must understand that disabling the devfee slows down development of the miner.
sr. member
Activity: 728
Merit: 304
Miner Developer
I hope everyone would eventually see that paying the DEVFEE is actually a good thing for themselves.
That way, I can continue development without worrying about money, and my wife would stop bugging me about how much money I am making every day. lol
Right now, I have to spend a considerable chunk of my time on my own small mining farm just to support my family.
If I can live off the DEVFEE, I don't have to maintain the farm, I can just focus on GGS, and the miner gets faster and more stable with more features.
It is just as simple as that.
sr. member
Activity: 1484
Merit: 253
AMD's Blockchain Driver seems incompatible with Windows 10 Version 1709.
Grrrrr......

Ready to switch to Linux yet? :-)

See, I have been using Linux before it reached Version 1.0, and I would never do server-side stuff on Windows.
The thing is, I am essentially an effort-minimalist, and I am actually quite happy with the Cathedral and choose it over the Bazaar anytime when it comes to desktop PC's.
I had pretty interesting conversations with top engineers at NiceHash regarding this subject.
Our consensus was that consumers just want a neat all-in-one package that just works without any tweaking.
Philosophically, this attitude is in direct contrast to Un*x's design philosophy as described by Kernighan and Ritchie, which encourages combined use of small tools.
What I am trying to achieve with GGS is to cater to the needs of the former, and that is why I started developing the GUI version first.
(I will eventually make a command-line version.)
We will see how that goes.

While I agree with your arguments concerning consumer friendliness, I would advise not to underestimate the "dev-fee thing"...

Consumers usually run one or two rigs (maybe a few more, but rarely). Miners usually dedicate their time and work in making money, running several rigs etc. These guys bring in the dev-fees(!). You should consider if you are interested in consumers with relatively low hashing power running Windows or in the more professionals (Linux all over the place) that have the much bigger hashing power. More hashing power = more dev-fees.

Just looking at the repeating driver drama on Windows, the number of CPUs supported per rig etc, should make you think harder about a Linux version ;-)

- just my 2 cents (and a +1 Linux vote)

Cheers
Andy
None of this matters while the project is open source. As you say, the "more professional" miners know how to avoid paying the dev...
newbie
Activity: 48
Merit: 0
AMD's Blockchain Driver seems incompatible with Windows 10 Version 1709.
Grrrrr......

Ready to switch to Linux yet? :-)

See, I have been using Linux before it reached Version 1.0, and I would never do server-side stuff on Windows.
The thing is, I am essentially an effort-minimalist, and I am actually quite happy with the Cathedral and choose it over the Bazaar anytime when it comes to desktop PC's.
I had pretty interesting conversations with top engineers at NiceHash regarding this subject.
Our consensus was that consumers just want a neat all-in-one package that just works without any tweaking.
Philosophically, this attitude is in direct contrast to Un*x's design philosophy as described by Kernighan and Ritchie, which encourages combined use of small tools.
What I am trying to achieve with GGS is to cater to the needs of the former, and that is why I started developing the GUI version first.
(I will eventually make a command-line version.)
We will see how that goes.

While I agree with your arguments concerning consumer friendliness, I would advise not to underestimate the "dev-fee thing"...

Consumers usually run one or two rigs (maybe a few more, but rarely). Miners usually dedicate their time and work in making money, running several rigs etc. These guys bring in the dev-fees(!). You should consider if you are interested in consumers with relatively low hashing power running Windows or in the more professionals (Linux all over the place) that have the much bigger hashing power. More hashing power = more dev-fees.

Just looking at the repeating driver drama on Windows, the number of CPUs supported per rig etc, should make you think harder about a Linux version ;-)

- just my 2 cents (and a +1 Linux vote)

Cheers
Andy
sr. member
Activity: 728
Merit: 304
Miner Developer
It seems that the NeoScrypt algo consumes a lot of power... I don't remember: did gg have the same power consumption as GGS on NeoScrypt?
I noticed that GGS uses high intensity values like 256 or more. In gg it was low numbers like 14-16. Does that mean GGS uses a different NeoScrypt kernel?

The kernel is essentially the same. The intensity is calculated differently.
My NeoScrypt kernel really pushes GCN to its limits.
I would imagine I could rewrite it in the GCN assembly to make it more power efficient.
I am planning to do optimizations for all the algorithms in one big batch once the miner is relatively bug-free and has essential features.
sr. member
Activity: 728
Merit: 304
Miner Developer
AMD's Blockchain Driver seems incompatible with Windows 10 Version 1709.
Grrrrr......

Ready to switch to Linux yet? :-)

See, I have been using Linux before it reached Version 1.0, and I would never do server-side stuff on Windows.
The thing is, I am essentially an effort-minimalist, and I am actually quite happy with the Cathedral and choose it over the Bazaar anytime when it comes to desktop PC's.
I had pretty interesting conversations with top engineers at NiceHash regarding this subject.
Our consensus was that consumers just want a neat all-in-one package that just works without any tweaking.
Philosophically, this attitude is in direct contrast to Un*x's design philosophy as described by Kernighan and Ritchie, which encourages combined use of small tools.
What I am trying to achieve with GGS is to cater to the needs of the former, and that is why I started developing the GUI version first.
(I will eventually make a command-line version.)
We will see how that goes.
sr. member
Activity: 1484
Merit: 253
It seems that the NeoScrypt algo consumes a lot of power... I don't remember: did gg have the same power consumption as GGS on NeoScrypt?
I noticed that GGS uses high intensity values like 256 or more. In gg it was low numbers like 14-16. Does that mean GGS uses a different NeoScrypt kernel?
sr. member
Activity: 728
Merit: 304
Miner Developer
1.1.8 doesn't detect my cards at all on startup,
1.1.7 crashes on startup, 1.1.6 starts and runs, but only 8 cards are available.
Please fix this so I can use your miner.  :)

A log file, please!

Hey Zawawa, here's the log:

2017-12-30 12:59:19.2235 [1] Gateless Gate Sharp 1.1.8 alpha started.
2017-12-30 12:59:19.9210 [1] Exception in InitializeDevices(): Method not found: 'Byte[] Cloo.ComputeDevice.get_BoardNameAMD()'.   at GatelessGateSharp.MainForm.InitializeDevices()
   at GatelessGateSharp.MainForm.MainForm_Load(Object sender, EventArgs e)
2017-12-30 12:59:22.9576 [1] Exception: Object reference not set to an instance of an object.   at GatelessGateSharp.MainForm.LoadDatabase()

You just need to upgrade the Cloo.dll to the latest version. It is in the distribution package.
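For anyone who wants to spot this kind of problem in their own log, here is a minimal sketch that pulls the missing method out of a "Method not found" line. The helper names are made up for illustration, and guessing the DLL from the first part of the qualified name is only a heuristic, since a namespace does not have to match its assembly name.

```python
import re

# Matches the quoted signature in a .NET "Method not found" message.
METHOD_NOT_FOUND = re.compile(r"Method not found: '(?P<signature>[^']+)'")

# Log lines copied from the post above.
SAMPLE_LOG = (
    "2017-12-30 12:59:19.2235 [1] Gateless Gate Sharp 1.1.8 alpha started.\n"
    "2017-12-30 12:59:19.9210 [1] Exception in InitializeDevices(): "
    "Method not found: 'Byte[] Cloo.ComputeDevice.get_BoardNameAMD()'.   "
    "at GatelessGateSharp.MainForm.InitializeDevices()\n"
)


def missing_methods(log_text: str) -> list[str]:
    """Return every method signature reported as 'Method not found'."""
    return [m.group("signature") for m in METHOD_NOT_FOUND.finditer(log_text)]


def likely_library(signature: str) -> str:
    """Heuristic: take the first component of the qualified method name."""
    qualified_name = signature.split()[-1]
    return qualified_name.split(".")[0]


if __name__ == "__main__":
    for signature in missing_methods(SAMPLE_LOG):
        print(f"{signature}: check that {likely_library(signature)}.dll is up to date")
```

A missing method like this usually means the executable was built against a newer copy of a library than the one sitting next to it, which fits the advice to replace Cloo.dll with the one from the package.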
sr. member
Activity: 588
Merit: 251
AMD's Blockchain Driver seems incompatible with Windows 10 Version 1709.
Grrrrr......

Ready to switch to Linux yet? :-)
newbie
Activity: 13
Merit: 0
1.1.8 doesn't detect my cards at all on startup,
1.1.7 crashes on startup, 1.1.6 starts and runs, but only 8 cards are available.
Please fix this so I can use your miner.  :)

A log file, please!

Hey Zawawa, here's the log:

2017-12-30 12:59:19.2235 [1] Gateless Gate Sharp 1.1.8 alpha started.
2017-12-30 12:59:19.9210 [1] Exception in InitializeDevices(): Method not found: 'Byte[] Cloo.ComputeDevice.get_BoardNameAMD()'.   at GatelessGateSharp.MainForm.InitializeDevices()
   at GatelessGateSharp.MainForm.MainForm_Load(Object sender, EventArgs e)
2017-12-30 12:59:22.9576 [1] Exception: Object reference not set to an instance of an object.   at GatelessGateSharp.MainForm.LoadDatabase()
sr. member
Activity: 728
Merit: 304
Miner Developer
1.1.8 doesn't detect my cards at all on startup,
1.1.7 crashes on startup, 1.1.6 starts and runs, but only 8 cards are available.
Please fix this so I can use your miner.  :)

A log file, please!
newbie
Activity: 13
Merit: 0
I'm running 12 cards with the Adrenalin 17.12.2 drivers, and 1.1.8 doesn't detect my cards at all on startup,
1.1.7 crashes on startup, 1.1.6 starts and runs, but only 8 cards are available.
Please fix this so I can use your miner.  :)
newbie
Activity: 126
Merit: 0
The surest way to solve this order problem is to run a diagnostic test at launch.
I really don't want to take this route, though... It seems too hackish.

What if an interactive dialog at startup asked us for the correct order? Or text boxes where the user can type the desired order, or drag-and-drop boxes for the sensor groups / device names? After all, it is just a display problem. Just simple ideas, sorry.
(I don't know whether such a solution would conflict with the on-the-fly settings, though.)
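The "let the user type the order" idea could look roughly like the sketch below. It is only an illustration of the suggestion, not anything GGS does: the function name and the device labels are hypothetical, and the order is assumed to be a list of indices into the detected device list.

```python
def reorder_devices(detected: list[str], user_order: list[int]) -> list[str]:
    """Rearrange detected devices according to a user-supplied order.

    user_order[i] is the index (in detection order) of the device that
    should be shown in slot i.
    """
    # Reject anything that is not a full permutation of the detected devices.
    if sorted(user_order) != list(range(len(detected))):
        raise ValueError("order must list every detected device exactly once")
    return [detected[i] for i in user_order]


if __name__ == "__main__":
    detected = ["RX 480 #A", "RX 580 #B", "RX 480 #C"]  # hypothetical labels
    print(reorder_devices(detected, [2, 0, 1]))
```

Validating the input as a full permutation means a typo cannot silently hide or duplicate a card in the display.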
sr. member
Activity: 1484
Merit: 253
The surest way to solve this order problem is to run a diagnostic test at launch.
I really don't want to take this route, though... It seems too hackish.
How would one run a diagnostic test?
sr. member
Activity: 1484
Merit: 253
You enabled phimem loading? Why? It isn't needed at this moment. Try disabling it.

Thanks, good to know. But I think (tested) the GPU order problem is not related to this.
UnclWish, I know this is not the right place, but I saw your other post about searching for a Sapphire RX 480 (not +) Micron BIOS. Did you find anything useful (other than the Doktor83 BIOS or the 1650-->2000 strap trick)?


No. I didn't find anything else...
sr. member
Activity: 728
Merit: 304
Miner Developer
The surest way to solve this order problem is to run a diagnostic test at launch.
I really don't want to take this route, though... It seems too hackish.