Topic: Gateless Gate Sharp 1.3.8: 30Mh/s (Ethash) on RX 480! - page 156. (Read 214458 times)

sr. member
Activity: 728
Merit: 304
Miner Developer
Code:
int amdgpu_device_initialize(int fd,
     uint32_t *major_version,
     uint32_t *minor_version,
     amdgpu_device_handle *device_handle)

int amdgpu_query_gds_info(amdgpu_device_handle dev,
struct amdgpu_gds_resource_info *gds_info)

Very nice, very nice.
sr. member
Activity: 728
Merit: 304
Miner Developer
I was just able to directly access the GPU through libdrm, so I'm getting pretty close...
Thank God I read optiminer's README. I wouldn't have thought of doing all this otherwise.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
All this useless work and so little speed improvement. I think you miss the difference between a slut and a slot. You should spend more time with your slut than with your slots. :D
sr. member
Activity: 728
Merit: 304
Miner Developer
If optiminer directly initializes the GDS, it could probably do it for both Linux and Windows.

It looks like it is possible, on Linux, to send PM4 packets from the user space through libdrm, though, whereas that functionality is not exposed on Windows:

https://github.com/Lucretia/libdrm-amdgpu/search?utf8=%E2%9C%93&q=gds

I will try libdrm and see what I can do with it.
sr. member
Activity: 588
Merit: 251
Interesting idea. When I read that optiminer changes PCI modes, I figured that was for newbies using cheap risers who don't know how to set the PCIe bus speed to Gen1 in the BIOS. If optiminer directly initializes the GDS, it could probably do it for both Linux and Windows.

I think the two most likely possibilities are:
1) There is some way to access more than 16KB of GDS from one kernel.  Perhaps the driver initializes GDS differently using the CL2.0 ABI vs using CL1.2.
2) Optiminer executes 4 instances of the kernel, each using 16KB. If the original SA data structures are kept, the 16KB has to be split into 8KB for source (previous round) and 8KB for dest (current round). However, 8K * 256 (8-bit counters) = 2 million, which doesn't allow for overflow. It could use an overflow table in GDDR, or maybe a form of wave sync with two kernels reading row counters from GDS and two kernels writing.
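
To make the arithmetic in option 2 concrete, here is a minimal C sketch. It assumes one byte per row counter and an even source/dest split of a 16KB partition, as described above. The constant and function names are made up for illustration and are not taken from SA or optiminer.

```c
#include <stdint.h>

/* Assumed layout: one 16 KB GDS partition per kernel instance,
   split evenly into a source table and a destination table. */
enum { GDS_PARTITION_BYTES = 16 * 1024, TABLES_PER_PARTITION = 2 };

/* Number of 8-bit row counters that fit in one table
   (one byte per counter, so bytes == counters). */
uint32_t counters_per_table(void)
{
    return GDS_PARTITION_BYTES / TABLES_PER_PARTITION;
}

/* Upper bound on the items one table can account for, taking
   256 per 8-bit counter as in the post. */
uint64_t max_tracked_items(void)
{
    return (uint64_t)counters_per_table() * 256u;
}
```

That comes to 8192 counters and 2,097,152 (2^21) trackable items per table, which is the "2 million" figure, with no headroom for rows that fill unevenly.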

sr. member
Activity: 728
Merit: 304
Miner Developer
Code:
PCI configuration space access from user space is possible via sysfs.
This is done through a "config"-attribute provided with each PCI
device sysfs-representation.
http://developer.amd.com/wordpress/media/2012/10/pci%20-%20pci%20express%20configuration%20space%20access.pdf

Interesting!

Code:
$ lspci | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
06:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
$ hexdump /sys/bus/pci/devices/0000:01:00.0/config
0000000 1002 67df 0407 2010 00c7 0300 0010 0080
0000010 000c e000 0000 0000 000c f000 0000 0000
0000020 e001 0000 0000 f7e0 0000 0000 1682 9480
0000030 0000 f7e4 0048 0000 0000 0000 010b 0000
0000040
$ hexdump /sys/bus/pci/devices/0000:06:00.0/config
0000000 1002 67df 0407 0010 00c7 0300 0010 0080
0000010 000c c000 0000 0000 000c d000 0000 0000
0000020 d001 0000 0000 f7c0 0000 0000 1682 9480
0000030 0000 f7c4 0048 0000 0000 0000 010a 0000
0000040

Looks good to me. All I need to do is to send GDS_INIT packets to them, no?
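
For anyone who wants to decode those dumps programmatically, here is a rough C sketch that pulls the IDs out of the first 16 bytes of the sysfs config file. The offsets follow the standard PCI configuration header. The struct and function names are hypothetical. Note that this only reads configuration space to identify the device; it does not submit any packets to the GPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the fields of interest. */
struct pci_ids {
    uint16_t vendor;      /* offset 0x00 */
    uint16_t device;      /* offset 0x02 */
    uint8_t  revision;    /* offset 0x08 */
    uint8_t  class_code;  /* offset 0x0b, base class */
};

/* PCI configuration space is little-endian. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the start of a config-space header. Returns 0 on success,
   -1 if the arguments are invalid or the buffer is too short. */
int parse_pci_header(const uint8_t *cfg, size_t len, struct pci_ids *out)
{
    if (cfg == NULL || out == NULL || len < 16)
        return -1;
    out->vendor     = le16(cfg + 0x00);
    out->device     = le16(cfg + 0x02);
    out->revision   = cfg[0x08];
    out->class_code = cfg[0x0b];
    return 0;
}

/* Read the header from a sysfs "config" attribute, e.g.
   /sys/bus/pci/devices/0000:01:00.0/config */
int read_pci_ids(const char *config_path, struct pci_ids *out)
{
    uint8_t cfg[64];
    FILE *f = fopen(config_path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(cfg, 1, sizeof cfg, f);
    fclose(f);
    return parse_pci_header(cfg, n, out);
}
```

Fed the first line of the 01:00.0 dump, this should report vendor 0x1002, device 0x67df, revision 0xc7 and base class 0x03 (display controller), matching the lspci output. hexdump prints 16-bit words in host byte order, which is why "1002 67df" corresponds to the bytes 02 10 df 67 in the file.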
sr. member
Activity: 728
Merit: 304
Miner Developer
XMR should work. Wolf's kernel is just great. I wouldn't recommend ETH, though.
In any case, you have to try it yourself. Good luck!
legendary
Activity: 2380
Merit: 1085
Money often costs too much.
On stock: AMD Radeon HD 7850, somewhat dated.
Only the 7990 is mentioned in the ANN, so could I still use my card? What numbers can I expect?
I am considering XMR and also ETH as long-term investments, not so much interested in the others. I would mine & hodl.
I would have to get away from the fglrx drivers first, which takes some time investment.
sr. member
Activity: 728
Merit: 304
Miner Developer
Nope, it didn't work.
I was just glancing through the change logs of optiminer and noticed something pretty interesting, though.

Code:
- [1.7.0] New --pci-modes (0-3). Try if you see GPU freezes.

It seems that optiminer directly accesses the GPU through the PCI bus.
Maybe it sends a GDS_INIT PM4 packet to the GPU so that it can access the entire GDS.
What do you think, nerdralph?
sr. member
Activity: 728
Merit: 304
Miner Developer
I've been running a ton of experiments on the GDS on RX 480 and getting some pretty weird results.
It seems like there is no straightforward way to access the entire 64KB.
Smaller values for gds_segment_byte_size may work, though. Let's see...
legendary
Activity: 1050
Merit: 1294
Huh?
Someone wants a new pair of shoes... :)
sr. member
Activity: 728
Merit: 304
Miner Developer
Thank you! I mentioned your donation to my wife, and her reply was, "I would love monthly donations!" I think she's crazy...
sr. member
Activity: 728
Merit: 304
Miner Developer
I wonder if you could get the same speed on Linux 4.10 with AMDGPU-Pro 16.60.
Let me try Optiminer on my Linux box...

Edit: I got 290 S/s, so the driver is not the problem. There has got to be a better way to access GDS, then...
sr. member
Activity: 588
Merit: 251
I get 268 sols with Optiminer on my RX 470 on Linux 4.8 with AMDGPU-Pro 16.40 (1900 memory clock). I think it would be pretty hard to get that kind of performance using just 8KB of the GDS.
full member
Activity: 305
Merit: 148
Theranos Coin - IoT + micro-blood arrays = Moon!
I appreciate your work.  Just sent you a small thank you.
sr. member
Activity: 728
Merit: 304
Miner Developer
It turned out that only 8KB out of 64KB GDS is available with RX 480 and AMDGPU-PRO.
GDS does not seem to work if m0 is greater than 0x1fff.
I suppose I could just let the first CPU thread use GDS for row counters and keep everything else the same.
It wouldn't make any sense to do all these experiments entirely in the GCN assembly as it consumes way too much time and I am hardly making any money with this miner...

Edit: All the GDS-related parameters in the HSA binary seem to be ignored. Oh well.
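
As a quick sanity check on those numbers, here is a small C sketch. It assumes m0 acts as an inclusive byte limit on GDS accesses, which is one reading of the observation above and not something confirmed by AMD documentation. All names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Figures from the post: 64 KB of GDS in total, and accesses that
   stop working once m0 exceeds 0x1fff. */
enum { GDS_TOTAL_BYTES = 64 * 1024, M0_MAX_WORKING = 0x1fff };

/* Bytes reachable if m0 is an inclusive upper bound (assumption). */
uint32_t usable_gds_bytes(void)
{
    return (uint32_t)M0_MAX_WORKING + 1u;
}

/* True for m0 values in the range reported as working. */
bool m0_in_working_range(uint32_t m0)
{
    return m0 <= (uint32_t)M0_MAX_WORKING;
}
```

Under that assumption, 0x1fff + 1 = 8192 bytes, which is the 8KB figure and one eighth of the 64KB total.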
sr. member
Activity: 728
Merit: 304
Miner Developer
Not yet, not yet. I just confirmed that GDS is accessible through inline assembly on Linux, however.
We will see...
full member
Activity: 254
Merit: 100
Did it get better than Cm?
sr. member
Activity: 728
Merit: 304
Miner Developer
The current overhaul of my small mining farm is going well, and new Linux rigs are pretty stable.
I had to work on my farm as I was seriously running out of money, but I am also making some progress with GG.
I ported the new GCN compiler to Linux, so I can finally run some experiments with GDS on RX 480.
newbie
Activity: 27
Merit: 0
All that's left is to get rid of Windows and everything will be fine :)

Three little birds are sitting and knitting pastitsio