Topic: Gateless Gate Sharp 1.3.8: 30Mh/s (Ethash) on RX 480! - page 181. (Read 214410 times)

hero member
Activity: 789
Merit: 501
vcruntime140.dll is reported missing on both Win7 and Win10.
I tried reinstalling the VC Redist and downloading the missing lib,

but it is still not working... any tips on how to run this on Windows?
legendary
Activity: 1901
Merit: 1024
As far as I know, both RX 4xx and GTX 1070 have "just" 2MB of L2 cache
newbie
Activity: 34
Merit: 0
Each RX 480 CU hosts four texture units, 16KB of L1 cache, a 64KB local data share, and register space for the vector and scalar units. AMD says it made a number of tweaks to improve the CU’s efficiency, including the addition of native FP16 (and Int16) support, tuned cache access and better instruction prefetching. Altogether, the changes purportedly yield up to 15% more performance per CU than the Radeon R9 290’s Hawaii GPU, which is based on a second-gen GCN architecture.
sr. member
Activity: 588
Merit: 251
Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/

I am not a developer and I do not pretend to be one, but I do not need advanced tools to see that my GPU is not using all of its memory bandwidth; I can compare the MCU on ETH and ZEC. Just because you, as a developer, cannot prefetch a table into cache for later use does not mean that others cannot do it. It is clear, though, that Zcash is not memory bound.

So tell me, wise one, how can any developer get >40MB of data to fit into the L2 cache on something like a Rx 480?  Do you even know how big the cache is?

sr. member
Activity: 728
Merit: 304
Miner Developer
As an experiment, I merged some of the Equihash rounds with global syncs with atomics, and I got pretty interesting results: the speed on GTX 1060 didn't change much, whereas the performance on RX 480 took a huge toll. IIRC, somebody mentioned on the AMD forum that the GLC bit improves the performance of global synchronizations with atomics. I will look into this feature as nerdralph suggested.
sr. member
Activity: 449
Merit: 251
Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/

I am not a developer and I do not pretend to be one, but I do not need advanced tools to see that my GPU is not using all of its memory bandwidth; I can compare the MCU on ETH and ZEC. Just because you, as a developer, cannot prefetch into cache for later use does not mean that others cannot do it. It is clear, though, that Zcash is not memory bound.
Yeah, it appears memory is a big factor for GPUs, but compute obviously plays a big role too.  The 1070 gets like 360-400 with the same 8Gbps RAM; it may be able to do some operations in a more memory-efficient manner, not sure, but the ~50% compute advantage is obviously huge.  Memory controller usage is not necessarily bandwidth usage, though it is generally a good ballpark.  Equihash is a bit complicated on this matter due to how it needs to access memory.

Here is a good read:
http://www.openwall.com/articles/Zcash-Equihash-Analysis
sr. member
Activity: 728
Merit: 304
Miner Developer
Genesis Mining just merged your miner into their miner

We did, and we added due credit. Zawawa, we're eagerly watching you and we absolutely love your work. We're huge fans. Keep it up!

Oh, you are most welcome. Thank you for keeping the sgminer project alive and relevant. More good stuff is on the way as my brain switched gears to the GCN assembly mode ;)
sr. member
Activity: 449
Merit: 251
9 rounds take ~1.2GB of external memory bandwidth when you use GDS for the row counters.  The Rx 480 has 256GB/s of theoretical bandwidth, but even with copied straps, you won't get much more than 200GB/s.  200/1.2 = 166.7 iterations per second, * 1.87 sols/i =~ 312 sols/s.  You'll probably have to experiment with SLC and/or GLC memory IO in order to get over 275 sols.

Still, with 260 sol/s on the RX 480 I see only 60% memory bandwidth usage on Claymore, which means that if we count only memory, 430 sol/s is possible (just fetch memory into cache in a second thread, so the cores do not need to wait).

Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/

You really need to stop bashing people with your posts.  It's all you really do these days.  Bash bash bash, repeatedly state theoretical numbers (which you change a lot), tell everyone they suck at coding, or computing in general, make yourself sound amazing, code nothing.  You contributed to SA to get it to like 45 Sols, since then it's all theory and bashing. Zawawa is developing for the community, not for ego, bragging, or bashing. This is a pretty clean thread, let's keep it that way.
sr. member
Activity: 273
Merit: 250
BD People Are Legend
Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/

I am not a developer and I do not pretend to be one, but I do not need advanced tools to see that my GPU is not using all of its memory bandwidth; I can compare the MCU on ETH and ZEC. Just because you, as a developer, cannot prefetch a table into cache for later use does not mean that others cannot do it. It is clear, though, that Zcash is not memory bound.
sr. member
Activity: 728
Merit: 304
Miner Developer
Genesis Mining just merged your miner into their miner

Yeah, I just saw the update. It feels good to see the fruits of my effort being integrated into a collective work, you know. I am certainly not as radical as RMS, but I have a renewed appreciation for his vision of free software now.
full member
Activity: 199
Merit: 108
Look, I'm really not that interesting. Promise.
Genesis Mining just merged your miner into their miner

We did, and we added due credit. Zawawa, we're eagerly watching you and we absolutely love your work. We're huge fans. Keep it up!
full member
Activity: 254
Merit: 100
Genesis Mining just merged your miner into their miner
legendary
Activity: 2294
Merit: 1182
Now the money is free, and so the people will be
Good show!!!  Getting almost 360 on a Fury, almost there!  (win)  Man, I love sgminer.  Why did the world have to become pay to play?
sr. member
Activity: 588
Merit: 251
9 rounds take ~1.2GB of external memory bandwidth when you use GDS for the row counters.  The Rx 480 has 256GB/s of theoretical bandwidth, but even with copied straps, you won't get much more than 200GB/s.  200/1.2 = 166.7 iterations per second, * 1.87 sols/i =~ 312 sols/s.  You'll probably have to experiment with SLC and/or GLC memory IO in order to get over 275 sols.

Still, with 260 sol/s on the RX 480 I see only 60% memory bandwidth usage on Claymore, which means that if we count only memory, 430 sol/s is possible (just fetch memory into cache in a second thread, so the cores do not need to wait).

Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/
hero member
Activity: 906
Merit: 507
If anyone wants it, here is the 1% dev donation config file I have ready for my RX 470 rig
zcash.conf
Code:
{
  "pools": [
    {
      "poolname": "Personal Pool",
      "quota": "99;Pool",
      "user": "USER",
      "pass": "z"
    },
    {
      "poolname": "Dev Pool",
      "quota": "1;stratum+tcp://us1-zcash.flypool.org:3333",
      "user": "t1NwUDeSKu4BxkD58mtEYKDjzw5toiLfmCu.CM420",
      "pass": "z"
    }
  ],
  "load-balance": true,
  "profiles": [
    {
      "name": "zcash",
      "algorithm": "equihash",
      "api-listen": true,
      "api-allow": "W:127.0.0.1/24,W:192.168.1.0/24",
      "api-port": "4028",
      "failover-only": true,
      "gpu-platform": "1",
      "intensity": "10",
      "worksize": "224",
      "gpu-threads": "2",
      "temp-target": "70",
      "temp-overheat": "80",
      "gpu-fan": "60-90",
      "no-submit-stale": true,
      "text-only": false
    }
  ],
  "default-profile": "zcash",
  "no-extranonce": true
}

Gateless bat file

Code:
@echo off
@setx GPU_FORCE_64BIT_PTR 0
@setx GPU_MAX_HEAP_SIZE 100
@setx GPU_USE_SYNC_OBJECTS 1
@setx GPU_MAX_ALLOC_PERCENT 100
@setx GPU_SINGLE_ALLOC_PERCENT 100
gatelessgate.exe -c zcash.conf
pause
Thanks I will also be using it
legendary
Activity: 1274
Merit: 1000
sr. member
Activity: 450
Merit: 255
If anyone wants it, here is the 1% dev donation config file I have ready for my RX 470 rig
zcash.conf
Code:
{
  "pools": [
    {
      "poolname": "Personal Pool",
      "quota": "99;Pool",
      "user": "USER",
      "pass": "z"
    },
    {
      "poolname": "Dev Pool",
      "quota": "1;stratum+tcp://us1-zcash.flypool.org:3333",
      "user": "t1NwUDeSKu4BxkD58mtEYKDjzw5toiLfmCu.CM420",
      "pass": "z"
    }
  ],
  "load-balance": true,
  "profiles": [
    {
      "name": "zcash",
      "algorithm": "equihash",
      "api-listen": true,
      "api-allow": "W:127.0.0.1/24,W:192.168.1.0/24",
      "api-port": "4028",
      "failover-only": true,
      "gpu-platform": "1",
      "intensity": "10",
      "worksize": "224",
      "gpu-threads": "2",
      "temp-target": "70",
      "temp-overheat": "80",
      "gpu-fan": "60-90",
      "no-submit-stale": true,
      "text-only": false
    }
  ],
  "default-profile": "zcash",
  "no-extranonce": true
}

Gateless bat file

Code:
@echo off
@setx GPU_FORCE_64BIT_PTR 0
@setx GPU_MAX_HEAP_SIZE 100
@setx GPU_USE_SYNC_OBJECTS 1
@setx GPU_MAX_ALLOC_PERCENT 100
@setx GPU_SINGLE_ALLOC_PERCENT 100
gatelessgate.exe -c zcash.conf
pause
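
For anyone adapting the config above, here is a minimal Python sketch that checks how the two quota entries split the work. It assumes each "quota" value has the form "WEIGHT;URL" and that the weights are relative shares used when "load-balance" is on; that reading is an assumption based on the 99/1 split in the post, not something verified against the miner source. The helper name quota_shares and the trimmed CONFIG_TEXT are made up for illustration.

```python
import json

# Trimmed copy of the "pools" part of the zcash.conf posted above.
CONFIG_TEXT = """
{
  "pools": [
    {"poolname": "Personal Pool", "quota": "99;Pool",
     "user": "USER", "pass": "z"},
    {"poolname": "Dev Pool",
     "quota": "1;stratum+tcp://us1-zcash.flypool.org:3333",
     "user": "t1NwUDeSKu4BxkD58mtEYKDjzw5toiLfmCu.CM420", "pass": "z"}
  ],
  "load-balance": true
}
"""


def quota_shares(config):
    """Return {poolname: fraction of work}.

    Assumption: every quota string is 'WEIGHT;URL' and the integer
    before the first semicolon is a relative weight.
    """
    weights = {}
    for pool in config["pools"]:
        weight_text, _, _url = pool["quota"].partition(";")
        weights[pool["poolname"]] = int(weight_text)
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


shares = quota_shares(json.loads(CONFIG_TEXT))
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

With the values from the post this reports a 99% / 1% split. Changing the two weights to anything else (for example 98 and 2) changes the donation share without touching the rest of the profile.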
sr. member
Activity: 273
Merit: 250
BD People Are Legend
9 rounds take ~1.2GB of external memory bandwidth when you use GDS for the row counters.  The Rx 480 has 256GB/s of theoretical bandwidth, but even with copied straps, you won't get much more than 200GB/s.  200/1.2 = 166.7 iterations per second, * 1.87 sols/i =~ 312 sols/s.  You'll probably have to experiment with SLC and/or GLC memory IO in order to get over 275 sols.

Still, with 260 sol/s on the RX 480 I see only 60% memory bandwidth usage on Claymore, which means that if we count only memory, 430 sol/s is possible (just fetch memory into cache in a second thread, so the cores do not need to wait).
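
The two estimates in this exchange can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the figures quoted in the posts (200GB/s achievable, ~1.2GB per iteration, 1.87 sols per iteration, 260 sol/s at 60% reported usage); nothing here is measured, and the function names are made up for illustration.

```python
# Figures quoted in the thread; none of them are measured here.
EFFECTIVE_BANDWIDTH_GB_S = 200.0  # achievable on an RX 480 per the post (256 theoretical)
TRAFFIC_PER_ITERATION_GB = 1.2    # ~1.2GB moved over the 9 Equihash rounds
SOLS_PER_ITERATION = 1.87         # average solutions per iteration


def bandwidth_bound_sols(bandwidth_gb_s, traffic_gb, sols_per_iteration):
    """Ceiling on sol/s if memory traffic were the only limit."""
    iterations_per_second = bandwidth_gb_s / traffic_gb
    return iterations_per_second * sols_per_iteration


def extrapolate_from_usage(observed_sols, reported_usage):
    """Naive linear scaling of an observed rate to 100% reported usage."""
    return observed_sols / reported_usage


ceiling = bandwidth_bound_sols(
    EFFECTIVE_BANDWIDTH_GB_S, TRAFFIC_PER_ITERATION_GB, SOLS_PER_ITERATION
)
naive = extrapolate_from_usage(260.0, 0.60)

print(f"bandwidth-bound ceiling: {ceiling:.0f} sol/s")
print(f"linear extrapolation from 60% usage: {naive:.0f} sol/s")
```

The first figure comes out at about 312 sol/s and the second at about 433 sol/s, which the post rounds to 430. The gap between the two is the whole disagreement: the extrapolation treats the reported usage percentage as actual bandwidth usage, and as noted elsewhere in the thread, memory controller usage is not necessarily bandwidth usage.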
sr. member
Activity: 450
Merit: 255
The day we beat CM with this miner, I will set both my 7870 and 270x cards to mine for you for 30 days, and I will set up a 1% dev mining donation using the pool manager on all my rigs.

Now I am getting the following with CM fee deducted
5 RX 470's Modded
CM 1101h/s >> Gateless 1006h/s
sr. member
Activity: 728
Merit: 304
Miner Developer
So Sir. Claymore seems to have given up at 260 sol/s on RX 480.
My experiments are going well, so this target is totally doable, methinks.

And if Optiminer is still taking a 10% dev fee then 265/.9 = 294 sols/s gross speed.


That seems like a real challenge for me.
I must admit that there is a little sadistic guilty pleasure in cutting down somebody else's big money tree, but I'm only a human...

Well he's so damn greedy and his dev fee so ridiculous you only need to reach 266 before everyone dumps his miner ;)
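
The fee arithmetic behind the 294 and 266 figures above, as a small Python sketch. It assumes the 10% fee simply scales the gross rate down, which is how the 265/.9 calculation in the quote treats it; the function names are made up for illustration.

```python
def gross_for_net(net_sols, fee_fraction):
    """Gross speed a miner must run at to deliver net_sols after its fee."""
    return net_sols / (1.0 - fee_fraction)


def net_after_fee(gross_sols, fee_fraction):
    """What the user actually keeps from a given gross speed."""
    return gross_sols * (1.0 - fee_fraction)


gross = gross_for_net(265.0, 0.10)
kept = net_after_fee(gross, 0.10)

print(f"gross needed for 265 net at a 10% fee: {gross:.0f} sol/s")
print(f"a fee-free miner at 266 sol/s beats that net rate: {266.0 > kept}")
```

So 294 sol/s is the gross number a fee-free miner would have to match to beat the paid miner at its raw speed, while 266 sol/s is enough to beat what the user actually keeps.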

I just can't wait until both him and claymore learn their lessons the hard way. You don't get to rip off and screw over the people in this OPEN SOURCE community, and bite the hand that feeds you and not get bitten in the ass eventually.

zawawa here is 0.1 BTC for about the only noble open source miner dev efforts I have seen on here in a while...good to see there are still people on here that get what this whole movement is about :D https://btc.blockr.io/tx/info/e8a2a8f60c50b285e18ebb381bf1be57f629924449ffce10a12f5c9005937c74

Thanks! I will get a blender for my wife with this. This poor girl has no idea what I'm working on right now, yet she has to put up with me quite a bit while I am so focused on this project...

As for other devs, I don't have anything personal against them as I like skilled coders and it is very interesting to see what they do.
I just feel sorry for Mr. Satoshi Nakamoto for all the commercialization and privatization of his radically socialist ideas.