
Topic: [BBR] Boolberry GPU Miner Discussion - AMD & Nvidia - OpenCL & CUDA - page 15. (Read 82970 times)

member
Activity: 81
Merit: 1002
It was only the wind.
You mean ROUND? Or reordering the "for" loops? Anyway, it's hard to find any logic in it. The GPU spends 99% of its time in MIX, so other changes shouldn't make any difference, but they do. In another miner I wrote some time ago, I got a boost by accidentally switching on debug code... OpenCL compilers are still very far from perfect.
What gives it the boost on Nvidia? That change in the MIX macro? The Keccak optimizations couldn't help that much, because all the Keccak code makes up just a few percent of the overall load.
Merged all updates from the main branch. Added a binary release too. Update if you use the pool miner. Included Wolf's Nvidia-optimized kernel. I cannot measure its performance on Nvidia, but it gives about +2% on AMD cards too, so it's the default kernel now.
https://github.com/mbkuperman/boolberry-opencl


Yeah, the small percentage bump on AMD is probably from the shortcuts.

Nah, it's what I did to the loop.

Agreed - they aren't perfect at all. The loop re-ordering made ptxas see reason though, I think.
newbie
Activity: 4
Merit: 0
When the editor opens change "thread_delay" to 2000, save and exit the editor.

This is the most important part of the whole guide. I lost over 2 hours trying to get 2 GPUs working together. I tried countless combinations in the json file, but none worked. When I reset everything to the defaults and changed the delay to 2000, both cards started without any problem.

For me 2000 was not enough; some cards didn't work. Try 4000, that got all my cards working together... I also have a question: does anyone know how the electricity cost of mining this on ATI cards compares with scrypt or X11?
hero member
Activity: 938
Merit: 1001
Good News Everybody!


The Boolberry team has pledged an 1100 BBR bounty to Wolf0 to make the stratum version of the OpenCL GPU miner.


Please contribute to the bounty to show Wolf0 at:
1DMRXkaBU6kThGHvvbx52bCSw7rUkcVsiDPjh8dq92Kj9sYRauGs7Dk2JbAvNVouvZGTUHMM3bx7hBW m6i6ZMKQDE5XkYfR.

member
Activity: 81
Merit: 1002
It was only the wind.
What gives it the boost on Nvidia? That change in the MIX macro? The Keccak optimizations couldn't help that much, because all the Keccak code makes up just a few percent of the overall load.
Merged all updates from the main branch. Added a binary release too. Update if you use the pool miner. Included Wolf's Nvidia-optimized kernel. I cannot measure its performance on Nvidia, but it gives about +2% on AMD cards too, so it's the default kernel now.
https://github.com/mbkuperman/boolberry-opencl


Yeah, the small percentage bump on AMD is probably from the shortcuts.

Nah, it's what I did to the loop.
full member
Activity: 414
Merit: 101
When the editor opens change "thread_delay" to 2000, save and exit the editor.

This is the most important part of the whole guide. I lost over 2 hours trying to get 2 GPUs working together. I tried countless combinations in the json file, but none worked. When I reset everything to the defaults and changed the delay to 2000, both cards started without any problem.
full member
Activity: 209
Merit: 100
This tutorial covers only the BBR AMD GPU miner setup. No sensitive data (e.g. the wallet keys file) should end up on the USB, so there is no need to encrypt the file system.

Ingredients:
- GPU mining rig. The rig doesn't need an HDD.
- 8GB USB thumb drive
- A lot of patience: installing on a USB 2.0 thumb drive is painfully slow, but in the end the miner will work just fine

Tutorial itself:
Download amd64 Ubuntu 14.04.1 from http://www.ubuntu.com/download/desktop
Download Universal USB Installer http://www.pendrivelinux.com/downloads/Universal-USB-Installer/Universal-USB-Installer-1.9.5.5.exe
Follow the instructions from http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-windows to create a bootable USB with Ubuntu. Make sure you put a 4GB persistence file on it. I have no clue how to make the persistence file larger than 4GB from Windows, but 4GB will be enough.
From GNU/Linux you can create a larger persistence like this:
- Make bootable USB with persistence of any size
- Mount USB and delete casper-rw file and resize partition with gparted
- Create second partition labeled "casper-rw" on USB, and format it as ext3

On both Windows and GNU/Linux, open the boot/grub/grub.cfg file from the first partition on the USB and make sure the following menuentry has the "persistent" option:
Code:
menuentry "Try Ubuntu without installing" {
set gfxpayload=keep
linux /casper/vmlinuz.efi  file=/cdrom/preseed/ubuntu.seed boot=casper quiet splash persistent --
initrd /casper/initrd.lz
}
If it doesn't, add the "persistent" keyword and save the file.
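
The check above can also be done from a terminal. This is only a rough sketch: the function name `missing_persistent` and the mount path in the comment are made up for illustration, so adjust them to your system.

```shell
# Hypothetical helper (not part of the original tutorial): print every casper
# boot line in a grub.cfg that lacks the "persistent" keyword.
missing_persistent() {
    # $1 = path to grub.cfg
    grep 'boot=casper' "$1" | grep -vw 'persistent'
}

# Example (the mount point is an assumption, use wherever your USB is mounted):
#   missing_persistent /media/$USER/USBSTICK/boot/grub/grub.cfg
# No output means every casper entry already has "persistent".
```

Any line it prints is a menu entry that still needs the keyword added by hand.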

Now restart the computer and boot it from the USB, then choose "Try Ubuntu without installing" from the grub menu. If you are using GNU/Linux for the first time, you can make sure you won't mess up anything on your HDD by e.g. pulling the power cable out of the HDD before booting from USB (at this point you want to have only one GPU card plugged in!).

Once Ubuntu boots, test whether persistence is working by opening a terminal (press Ctrl+Alt+t) and issuing the following two commands (without the $ char):
Code:
$ touch fileOnPersistance
$ sudo reboot
After reboot open terminal and:
Code:
$ ls
If you see that "fileOnPersistance" is there you are good to go.

Then:
Code:
$ rm fileOnPersistance
$ gedit initialUpdate.sh &
When gedit opens copy the following in it:
Code:
#!/bin/bash

sudo add-apt-repository "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main universe multiverse"
sudo apt-get update
sudo apt-get -y install git-core build-essential cmake libboost1.55-all-dev g++ automake indicator-multiload htop autoconf vim gufw

indicator-multiload &
sudo gufw &

sudo apt-get upgrade
Save and exit. Then:
Code:
$ chmod +x initialUpdate.sh
$ ./initialUpdate.sh

At one point the firewall (ufw) GUI will appear; just press ON and close it. You may then notice that indicator-multiload has appeared in the tray; you may wish to check its Preferences.

Also, you can use this time to download AMD-APP-SDK-v2.9-lnx64.tgz from
http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/

After a few hours, when the initialUpdate.sh script finishes, issue the command:
Code:
$ sudo reboot
if you got a new kernel. If not sure, just reboot.

When it comes back to life type:
Code:
$ sudo apt-get install fglrx-updates
to install AMD drivers. If you wish to install newest drivers see
http://wiki.cchtml.com/index.php/Ubuntu_Trusty_Installation_Guide
but it will work fine with "fglrx-updates" drivers. Either way, you'll have more than enough time to read the Ubuntu_Trusty_Installation_Guide while drivers install.

When the driver is installed, type:
Code:
$ sudo halt
and plug in all your AMD GPUs, then power on the computer. When it boots, press Ctrl+Alt+F1 and type in the following:
Code:
$ sudo amdconfig --initial -f --adapter=all
$ sudo reboot
When it boots, you should have nice and shiny display that uses the AMD drivers. Now, in terminal:
Code:
$ cd && mkdir SDK && mv ~/Downloads/AMD-APP-SDK-v2.9-lnx64.tgz SDK && cd SDK
$ tar xvzf AMD-APP-SDK-v2.9-lnx64.tgz
$ sudo ./Install-AMD-APP.sh
$ sudo reboot
to reboot it for the final time. When it comes back:
Code:
$ env | grep AMDAPPSDKROOT
should return
Code:
AMDAPPSDKROOT=/opt/AMDAPP
If not, then:
Code:
$ export AMDAPPSDKROOT=/opt/AMDAPP
$ printf '\nexport AMDAPPSDKROOT=/opt/AMDAPP\n' >> ~/.bashrc

Now, and at last:
Code:
$ git clone https://github.com/mbkuperman/boolberry-opencl && cd boolberry-opencl && make -j 4 
$ cd build/release/src
The last line takes you to the directory where boolbd is, if everything went well. Now:
Code:
$ cp ../../../src/cl/*.cl .
$ ./boolbd --start-mining=YOUR_BBR_ADDRESS --mining-threads=N
where YOUR_BBR_ADDRESS is your BBR address and N is the number of GPUs. This will download the blockchain, but it will probably not mine with all your cards. When the blockchain is downloaded, type "exit" and hit Enter in the daemon window.
Now, in terminal:
Code:
$ cd ~/.boolb
$ gedit miner_conf.json &
$ cd -
When the editor opens change "thread_delay" to 2000, save and exit the editor. Now (you can get command from history by pressing up arrow on the keyboard):
Code:
$ ./boolbd --start-mining=YOUR_BBR_ADDRESS --mining-threads=N
should mine with all your N GPUs.
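
If you would rather not open an editor, the "thread_delay" change can be sketched with sed. This assumes the key is stored as `"thread_delay": <integer>` on a single line of miner_conf.json, which I have not verified against the miner's actual file format; the function name `set_thread_delay` is made up for illustration.

```shell
# Sketch: set "thread_delay" in a miner_conf.json-style file without an editor.
# Assumption: the key is written as "thread_delay": <integer> on one line.
set_thread_delay() {
    # $1 = config file, $2 = new delay value
    cp "$1" "$1.bak"    # keep a backup of the original file
    # Rewrite the number after the key, reading the backup and writing the original.
    sed -E "s/(\"thread_delay\"[[:space:]]*:[[:space:]]*)[0-9]+/\\1$2/" "$1.bak" > "$1"
}

# Example, using the path from the tutorial:
#   set_thread_delay ~/.boolb/miner_conf.json 2000
```

The original file is kept next to it with a .bak suffix, so a wrong value is easy to undo. If the key appears on several lines, every one of them is changed.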

This is the first version of the tutorial. When my memory of how painfully slow this installation on a USB 2.0 thumb drive is fades away, I'll try to do it once more to streamline the process and find errors in the tutorial (or I'll just buy a USB 3.0 thumb drive). Anyone who tries this for the first time, please report any problems you encounter. Also, veteran AMD miners, please check for errors; this is my first time installing a GPU miner. Tnx.



member
Activity: 81
Merit: 1002
It was only the wind.
Merged all updates from the main branch. Added a binary release too. Update if you use the pool miner. Included Wolf's Nvidia-optimized kernel. I cannot measure its performance on Nvidia, but it gives about +2% on AMD cards too, so it's the default kernel now.
https://github.com/mbkuperman/boolberry-opencl


Yeah, the small percentage bump on AMD is probably from the shortcuts.
full member
Activity: 414
Merit: 101
Can someone answer the following questions for me?

I solo-mined for a few days, and then:

1. I stopped the daemon and the wallet.
Did I lose all the shares I had done before or not?

If I start again, do I get a new block or continue the process from where I stopped?


2. My computer crashed (so there was no clean exit from the daemon).
Did I lose all the shares?

Second question Smiley
I have two computers: the first has one GPU and the second has two GPUs (of course I am mining to the same address).

Do they mine the same block or not (since they use the same address)?
What if one computer/GPU crashes? Will I lose all shares or only the shares from the crashed GPU? (And if I lose any shares while they mine the same block, will I fail to get it?)

Thanks for reply.
full member
Activity: 414
Merit: 101
I am willing to try Wolf's CUDA miner, but it is not compiled.... (I know the GT 640 is a "small card") but I am interested in trying that card: I assume it will give a higher hash rate than the CPU while consuming less power.
So please release an exe file....

Second, after reading and re-reading this thread from top to bottom, I managed to get two cards working at the same time on solo mining Smiley

Hash rate with lowered voltage and a slightly OC'd GPU is

hr: 1763224, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763208, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763193, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763177, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763161, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763146, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763130, efficiency: 105% (shares 332/333, blocks 0/0)
hr: 1763202, efficiency: 104% (shares 332/333, blocks 0/0)
hr: 1763187, efficiency: 104% (shares 332/333, blocks 0/0)
hr: 1763171, efficiency: 104% (shares 332/333, blocks 0/0)
2014-Aug-10 15:25:20.880570 [miner 0]Share found 1005573449, <93812e1
hr: 1763243, efficiency: 104% (shares 333/334, blocks 0/0)
hr: 1763227, efficiency: 104% (shares 333/334, blocks 0/0)
hr: 1763212, efficiency: 104% (shares 333/334, blocks 0/0)
hr: 1763196, efficiency: 104% (shares 333/334, blocks 0/0)
hr: 1763267, efficiency: 104% (shares 333/334, blocks 0/0)

So let's see how long it takes to find a block. Then I will try on a pool Smiley

Thanks for this miner!
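
To get one number out of output like the log above, the "hr:" values can be averaged with awk. A minimal sketch, assuming the lines look exactly like the ones pasted in this post; the function name `mean_hashrate` is made up for illustration.

```shell
# Sketch: average the "hr:" values from miner output read on stdin.
# Assumes lines of the form "hr: <number>, efficiency: ..." as pasted above.
mean_hashrate() {
    awk '/^hr: / { gsub(",", "", $2); sum += $2; n++ }
         END { if (n) printf "%.0f\n", sum / n }'
}

# Two lines taken from the log above; other lines (e.g. "Share found") are ignored.
printf 'hr: 1763224, efficiency: 105%% (shares 332/333, blocks 0/0)\nhr: 1763208, efficiency: 105%% (shares 332/333, blocks 0/0)\n' | mean_hashrate
# prints 1763216
```

In practice you would pipe a saved log file into it rather than pasting lines by hand.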
full member
Activity: 414
Merit: 101
hr: 826189, efficiency: 67% (shares 8/8, blocks 0/0) on 280 ( not X) at 975/1450....

little playing with speed and voltage ...

hr: 862915, efficiency: 100% (shares 13/13, blocks 0/0)


Over 8 shares? Pick a random number, that's your short term efficiency.

Let it run for 6h or 12h with each config and THEN post results.

hr: 875492, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875492, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875492, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875491, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875491, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875491, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875491, efficiency: 102% (shares 1790/1792, blocks 0/0)
hr: 875491, efficiency: 102% (shares 1790/1792, blocks 0/0)
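
A back-of-the-envelope way to see why 8 shares says nothing: if share finds are treated as a Poisson process (a standard model for mining, my assumption rather than something stated in this thread), the relative noise on a count of n shares is roughly 1/sqrt(n). The function name `share_noise_pct` is made up for illustration.

```shell
# Sketch: approximate relative standard deviation (in percent) of a share
# count, under the assumption that share finds follow a Poisson process.
share_noise_pct() {
    # $1 = number of shares found so far
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 / sqrt(n) }'
}

share_noise_pct 8       # prints 35.4
share_noise_pct 1790    # prints 2.4
```

So after 8 shares the figure can easily be off by a third either way, while after about 1790 shares it has settled to within a few percent.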
member
Activity: 81
Merit: 1002
It was only the wind.
So close to 800kh/s on a stock 750Ti. Did some more improvements which bumped hashrate, but it doesn't quite get me there (nsfw as always): https://ottrbutt.com/tmp/bbrcudaminer6.png
hero member
Activity: 938
Merit: 1001
yepp, I just used your example and changed the address. I'll install CUDA 6 and try again.

update: after re-compiling with cuda6.0, I get ~3900khash for 5 x 750Ti

Nice numbers!
full member
Activity: 414
Merit: 101
hr: 826189, efficiency: 67% (shares 8/8, blocks 0/0) on 280 ( not X) at 975/1450....

little playing with speed and voltage ...

hr: 862915, efficiency: 100% (shares 13/13, blocks 0/0)
legendary
Activity: 914
Merit: 1001
yepp, I just used your example and changed the address. I'll install CUDA 6 and try again.

update: after re-compiling with cuda6.0, I get ~3900khash for 5 x 750Ti
member
Activity: 81
Merit: 1002
It was only the wind.
Hey Wolf0, do the CL patches hash better for pools? Or are they still hashing 50% or less compared to solo?

Nothing in the CL will help the stales and crap caused by HTTP mining. Simpleminer is garbage.

Will you give a try to implement it in the cpu multiminer?

I did, but it's REALLY dull work. I gave clintar the half-finished code, but he hasn't been able to figure it out yet...

Oh. So, the scratchpad part of BBR really is an annoying issue with implementing a stratum version of a gpu miner for it?

Oh, no, see the screenshot: https://ottrbutt.com/tmp/bbrcudaminer5.png

Stratum. It's just OpenCL that's the major pain.

Oh ok.

Has anyone tried to use Mantle, or is that exclusive to game coding only?

I'm no expert on Mantle, but as far as I can tell, it's pretty useless for compute.

From what I read about it, it's just another way to access the core technology for gaming. Probably only API from what I see.

Yeah, CUDA is a lot more fun, anyway.

CUDA is coded better than anything that AMD does, just that OpenCL is more universal and AMD compute core works with it better.

I'm in the process of migrating from AMD GPU's to Nvidia GTX 750ti's. Best bang for the buck via power/hash ratio.

Agreed. I think when high end Maxwell comes out, AMD will be out of the mining game. It won't be a fight; it'll be an execution.
sr. member
Activity: 363
Merit: 250
7850 - 550kh/s (@1000/1345)
member
Activity: 87
Merit: 11
Using "--mining-threads 5". The result is:

   2014-Aug-09 13:26:55.940664 OpenCL simpleminer should be used 1 instance per GPU
! Changing to 1 mining thread.

try this recommendation by mbk himself

Start multiple instances having different --device parameters (--device 0, --device 1, etc).

Ok. That was the trick. Working now with 4 instances / 4 GPUs  .   THANKS  Smiley
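
For anyone who doesn't want to type one command per card, the "one instance per --device" advice can be sketched as a loop. This is a dry run that only prints the command lines; the flags, pool address and scratchpad URL are the ones quoted in this thread, while the function name `print_miner_cmds` and the log file names are made up for illustration. I have not checked how --device combines with the pool flags beyond what is said here.

```shell
# Sketch: print one simpleminer command line per GPU (dry run, nothing is started).
# Flags, pool address and scratchpad URL are copied from posts in this thread;
# the minerN.log redirection is an illustrative assumption.
print_miner_cmds() {
    # $1 = number of GPUs, $2 = wallet address
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "./simpleminer --pool-addr bbr.cncoin.farm:1111 --login $2 --pass x" \
             "--mining-threads 1 --device $i" \
             "--remote_scratchpad http://bbr.cncoin.farm/scratchpad.bin > miner$i.log 2>&1 &"
        i=$((i + 1))
    done
}

print_miner_cmds 4 YOUR_WALLET_ADDRESS
```

Check the printed lines, then paste them into a terminal (or pipe the output to sh) to start the instances in the background, each writing to its own log file.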
member
Activity: 81
Merit: 1002
It was only the wind.
Hey Wolf0, do the CL patches hash better for pools? Or are they still hashing 50% or less compared to solo?

Nothing in the CL will help the stales and crap caused by HTTP mining. Simpleminer is garbage.

Will you give a try to implement it in the cpu multiminer?

I did, but it's REALLY dull work. I gave clintar the half-finished code, but he hasn't been able to figure it out yet...

Oh. So, the scratchpad part of BBR really is an annoying issue with implementing a stratum version of a gpu miner for it?

Oh, no, see the screenshot: https://ottrbutt.com/tmp/bbrcudaminer5.png

Stratum. It's just OpenCL that's the major pain.

Oh ok.

Has anyone tried to use Mantle, or is that exclusive to game coding only?

I'm no expert on Mantle, but as far as I can tell, it's pretty useless for compute.

From what I read about it, it's just another way to access the core technology for gaming. Probably only API from what I see.

Yeah, CUDA is a lot more fun, anyway.
sr. member
Activity: 342
Merit: 250
Using "--mining-threads 5". The result is:

   2014-Aug-09 13:26:55.940664 OpenCL simpleminer should be used 1 instance per GPU
! Changing to 1 mining thread.

try this recommendation by mbk himself

Start multiple instances having different --device parameters (--device 0, --device 1, etc).
newbie
Activity: 24
Merit: 0
if you are using the latest simpleminer by mbkuperman you should have the argument --remote_scratchpad arg

so try...
./simpleminer --pool-addr bbr.cncoin.farm:1111 --login YOUR_WALLET_ADDRESS --pass x --mining-threads 1 --remote_scratchpad http://bbr.cncoin.farm/scratchpad.bin

If I have 5 GPUs per rig,
what can I do?

Running 5 windows is slow.

try --mining-threads 5


Using "--mining-threads 5". The result is:

   2014-Aug-09 13:26:55.940664 OpenCL simpleminer should be used 1 instance per GPU
! Changing to 1 mining thread.
