Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 74. (Read 3426989 times)

sr. member
Activity: 350
Merit: 250
For anyone wondering, cudamining.co.uk has been down for around 36 hours now. I have had 6 machines fail in the last 2 days, so I haven't been able to fix the server yet.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
1. " bit hacking it "
2. or write a new compiler.

I think I will go for option 2
hero member
Activity: 672
Merit: 500
Banned: For Your Protection

I was a little disappointed that the Nascar endorsed car didn't do better for Doge Coin.

...now if BTC would sponsor one. and not a "cheap" one, that might do something for the market as well.
legendary
Activity: 1400
Merit: 1000
http://espn.go.com/college-football/story/_/id/11622587/georgia-tech-yellow-jackets-first-program-use-bitcoin-stadium-concessions

I think this is an awesome move by Georgia Tech University.

I feel this will be huge for Bitcoin's momentum going forward, for several reasons. It puts Bitcoin in front of what is probably one of the biggest age groups using and mining it. These kids will also "grow up" with Bitcoin in use, so using it will feel natural to them.

If more universities follow this path, I think Bitcoin will become more mainstream here in the U.S. and will no longer be seen as some type of shady dealing.

The Bitcoin Bowl game will get the name out there too.

I was a little disappointed that the Nascar endorsed car didn't do better for Doge Coin.
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Tested some more. The dual-hash macro below pushes the speed to 420 MHASH on Blakecoin (+20%). I tried to do 3 and 4 hashes in parallel within each CUDA core, but the compiler started to spill registers and it slowed down.

Blake is the first X in X11, X13 and X15. I think X11 will run around 150 KHASH faster with this (recoded to 64-bit). This means around 2700 KHASH at base clocks and 3000+ KHASH overclocked on the 750 Ti.

Code:
#define GS2(a,b,c,d,x) { \
const uint32_t idx1 = c_sigma[i][x]; \
const uint32_t idx2 = c_sigma[i][x+1]; \
\
v[a] += (m[idx1] ^ c_u256[idx2]) + v[b]; \
v1[a] += (m1[idx1] ^ c_u256[idx2]) + v1[b]; \
v[d] = swab32_16(v[d] ^ v[a]); \
v1[d] = swab32_16(v1[d] ^ v1[a]); \
v[c] += v[d]; \
v1[c] += v1[d]; \
v[b] = SPH_ROTR32(v[b] ^ v[c], 12); \
v1[b] = SPH_ROTR32(v1[b] ^ v1[c], 12); \
\
v[a] += (m[idx2] ^ c_u256[idx1]) + v[b]; \
v1[a] += (m1[idx2] ^ c_u256[idx1]) + v1[b]; \
v[d] = SPH_ROTR32(v[d] ^ v[a], 8); \
v1[d] = SPH_ROTR32(v1[d] ^ v1[a], 8); \
v[c] += v[d]; \
v1[c] += v1[d]; \
v[b] = SPH_ROTR32(v[b] ^ v[c], 7); \
v1[b] = SPH_ROTR32(v1[b] ^ v1[c], 7); \
}
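
The interleaving idea in the macro above can be checked on the host. The following is a minimal sketch in plain C, not the ccminer source: it assumes `swab32_16` is a rotate by 16 bits, assumes `c_u256` holds the standard BLAKE-256 constants, and passes the two sigma indices in directly instead of reading `c_sigma[i][x]`. The names `g_single`, `g_dual` and `dual_matches_single` are made up for illustration. It only shows that interleaving the statements of two independent states gives the same results as two separate calls; any speed-up depends on the GPU and the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match c_u256: the 16 BLAKE-256 constants. */
static const uint32_t c_u256[16] = {
    0x243F6A88u, 0x85A308D3u, 0x13198A2Eu, 0x03707344u,
    0xA4093822u, 0x299F31D0u, 0x082EFA98u, 0xEC4E6C89u,
    0x452821E6u, 0x38D01377u, 0xBE5466CFu, 0x34E90C6Cu,
    0xC0AC29B7u, 0xC97C50DDu, 0x3F84D5B5u, 0xB5470917u
};

/* Rotate right; n is always 16, 12, 8 or 7 here, so no shift by 32. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32u - n));
}

/* Reference: one G step on a single state. */
void g_single(uint32_t v[16], const uint32_t m[16],
              int a, int b, int c, int d, int idx1, int idx2)
{
    v[a] += (m[idx1] ^ c_u256[idx2]) + v[b];
    v[d] = rotr32(v[d] ^ v[a], 16);
    v[c] += v[d];
    v[b] = rotr32(v[b] ^ v[c], 12);
    v[a] += (m[idx2] ^ c_u256[idx1]) + v[b];
    v[d] = rotr32(v[d] ^ v[a], 8);
    v[c] += v[d];
    v[b] = rotr32(v[b] ^ v[c], 7);
}

/* Same G step on two independent states with the statements interleaved,
   so neighbouring statements never depend on each other. */
void g_dual(uint32_t v[16], uint32_t v1[16],
            const uint32_t m[16], const uint32_t m1[16],
            int a, int b, int c, int d, int idx1, int idx2)
{
    assert(v != v1); /* the two states must be distinct buffers */
    v[a]  += (m[idx1]  ^ c_u256[idx2]) + v[b];
    v1[a] += (m1[idx1] ^ c_u256[idx2]) + v1[b];
    v[d]  = rotr32(v[d]  ^ v[a], 16);
    v1[d] = rotr32(v1[d] ^ v1[a], 16);
    v[c]  += v[d];
    v1[c] += v1[d];
    v[b]  = rotr32(v[b]  ^ v[c], 12);
    v1[b] = rotr32(v1[b] ^ v1[c], 12);
    v[a]  += (m[idx2]  ^ c_u256[idx1]) + v[b];
    v1[a] += (m1[idx2] ^ c_u256[idx1]) + v1[b];
    v[d]  = rotr32(v[d]  ^ v[a], 8);
    v1[d] = rotr32(v1[d] ^ v1[a], 8);
    v[c]  += v[d];
    v1[c] += v1[d];
    v[b]  = rotr32(v[b]  ^ v[c], 7);
    v1[b] = rotr32(v1[b] ^ v1[c], 7);
}

/* Returns 1 if the interleaved version matches two separate calls. */
int dual_matches_single(void)
{
    uint32_t m[16], m1[16], va[16], vb[16], wa[16], wb[16];
    for (int k = 0; k < 16; k++) {
        m[k]  = 0x01010101u * (uint32_t)k;   /* arbitrary test message */
        m1[k] = 0xDEADBEEFu ^ (uint32_t)k;   /* a second, different one */
        va[k] = wa[k] = c_u256[k];
        vb[k] = wb[k] = ~c_u256[k];
    }
    g_single(va, m, 0, 4, 8, 12, 0, 1);
    g_single(vb, m1, 0, 4, 8, 12, 0, 1);
    g_dual(wa, wb, m, m1, 0, 4, 8, 12, 0, 1);
    for (int k = 0; k < 16; k++)
        if (va[k] != wa[k] || vb[k] != wb[k])
            return 0;
    return 1;
}
```

Calling `dual_matches_single()` compares both variants on arbitrary test data. Going to three or four interleaved states follows the same pattern, at the cost of more live registers.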

hero member
Activity: 644
Merit: 500
Does someone have a hashrate for the GTX 980 on x11?
How does it compare to the others?

http://cryptomining-blog.com/3503-crypto-mining-performance-of-the-new-nvidia-geforce-gtx-980/
It's better to buy loads of 750ti's though, for the price a 980 goes. But to reach ROI, even with 750ti's, you'll need some fairy dust and magic too Cheesy
hero member
Activity: 965
Merit: 515
Does someone have a hashrate for the GTX 980 on x11?
How does it compare to the others?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
If you unroll the loops and write the constants directly in the code statements like this:
 
  v1^=v2^0x99999999UL;

The compiler will not fetch the constant from constant memory but include it in the instruction cache. If the constant is used many times, it will load the constant into a register and use the register. The problem is that when you have many constants, the register space fills up. If I can somehow find a way to force the compiler to encode the constant directly into the instruction, it will improve superscalar throughput, latency and register usage. With fewer registers used, you can have more superscalar pipelines.

Each Maxwell instruction is 8 bytes in size, and many of the instructions support one immediate operand within the encoding without increasing the size or slowing down.
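
To make the two source forms concrete, here is a small host-side sketch in plain C. The table `k_table`, its values and the mixing step are invented for illustration and do not come from any real hash. The first function reads its constants through an indexed table; the second is the unrolled form with the constants written as literals. Both compute the same value. Whether the literals end up as immediate operands in the GPU instructions is the compiler's decision, so the generated code would have to be inspected (for example with `cuobjdump -sass`) to know.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical round constants, chosen only for illustration. */
static const uint32_t k_table[4] = {
    0x99999999u, 0x243F6A88u, 0x85A308D3u, 0x13198A2Eu
};

/* Table form: every constant is reached through an index. */
uint32_t mix_table(uint32_t v)
{
    for (int i = 0; i < 4; i++)
        v = (v ^ k_table[i]) + (v >> 3);
    return v;
}

/* Unrolled form: every constant is a literal in its own statement. */
uint32_t mix_literal(uint32_t v)
{
    v = (v ^ 0x99999999u) + (v >> 3);
    v = (v ^ 0x243F6A88u) + (v >> 3);
    v = (v ^ 0x85A308D3u) + (v >> 3);
    v = (v ^ 0x13198A2Eu) + (v >> 3);
    return v;
}

/* Checks that both forms agree on a range of inputs. */
void check_forms_agree(void)
{
    for (uint32_t x = 0; x < 1000u; x++)
        assert(mix_table(x * 2654435761u) == mix_literal(x * 2654435761u));
}
```

The unrolled form trades source size for the chance that each constant is folded into its instruction; the results are identical either way.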

legendary
Activity: 1400
Merit: 1050
Is there a way to declare an array so that its contents are stored in registers (rather than in global memory)?
sp_
legendary
Activity: 2954
Merit: 1087
Team Black developer
Advertising says that Maxwell can do more work per core per cycle. I compiled some code for compute 5.0 and noticed that after the return statement in the assembly code there were NOP operations, 5 of them actually. This means that the Maxwell core is probably superscalar with 4 pipes.
I did a quick test with Blakecoin. Instead of computing 1 hash per run, I did 2 in parallel, superscalar style. In my test I changed the GS(a,b,c,d,x) macro to calculate 2 hashes instead of one by interleaving the instructions. The speed went from 350 to over 400 MHASH (750 Ti base clock), a 15% gain. I expected more, so I disassembled again and noticed that the CUDA compiler doesn't like to inline constants directly inside the assembly statements. Instead it fetches them from the constant buffer/shared memory or puts the values in registers. Registers are OK, but when doing more than one hash superscalar I need the register count to stay low.

eor r1,r2,[0][3]
eor r3,r4,[3][3]
eor r5,r6,[4][10]
eor r7,r8,[10][5]

4 cycles

eor r1,r2,0xsomenumber
eor r3,r4,0xsomenumber
eor r5,r6,0xsomenumber
eor r7,r8,0xsomenumber

1 cycle
legendary
Activity: 1400
Merit: 1050
djm are you optimizing ccminer for 980?

Optimizing ccminer for 9xx specifically at this point would be an extremely inefficient use of time - it can be cleaned up and optimized more generally, making it faster on Kepler (compute 3.5) to Maxwell (compute 5.2) and everything in between (probably 3.0 Kepler and Fermi, too, but nobody cares about them anymore.) Once that shit is done, then you start doing improvements targeted at specific chips in an attempt to squeeze out that last bit of performance.

Yes and no... the launch parameters that are optimal for the 750 Ti are different from those for the 980 or the 780 Ti.
So for the moment I have been playing with them a bit. For example, on xcn I got an easy additional 1 MH just by adjusting them (mostly on the multiplication algo).
But for the moment I haven't tried to play with the code itself at that level.
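
Tuning launch parameters per card, as described above, amounts to a small search. The sketch below is a generic illustration in plain C, not ccminer code: `launch_cfg`, `measure_fn`, `pick_best` and `fake_measure` are hypothetical names, and `fake_measure` is a made-up stand-in for a real benchmark run (it simply peaks at 256 threads). In practice the callback would launch the kernel with the given configuration and return the measured hashrate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical launch configuration: grid size and threads per block. */
typedef struct { int blocks; int threads; } launch_cfg;

/* Hypothetical benchmark callback: returns a hashrate for one config. */
typedef double (*measure_fn)(launch_cfg cfg);

/* Try every candidate and keep the one with the highest measured rate. */
launch_cfg pick_best(const launch_cfg *cands, size_t n, measure_fn measure)
{
    assert(cands != NULL && n > 0);
    launch_cfg best = cands[0];
    double best_rate = measure(best);
    for (size_t i = 1; i < n; i++) {
        double rate = measure(cands[i]);
        if (rate > best_rate) {
            best_rate = rate;
            best = cands[i];
        }
    }
    return best;
}

/* Stand-in for a real benchmark: an invented curve peaking at 256 threads. */
double fake_measure(launch_cfg cfg)
{
    int d = cfg.threads - 256;
    if (d < 0)
        d = -d;
    return 1000.0 - (double)d;
}

/* Small demo: returns the thread count chosen from three candidates. */
int demo_best_threads(void)
{
    const launch_cfg cands[] = { {64, 128}, {64, 256}, {64, 512} };
    return pick_best(cands, sizeof cands / sizeof cands[0], fake_measure).threads;
}
```

Because the best configuration differs between a 750 Ti, a 780 Ti and a 980, a sweep like this would be run once per card model rather than hard-coding one set of values.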

My main problem is getting my 980 to run at the P0 performance level instead of P2, where it sits at the moment... I think it is an EVGA problem though...
So I need a way to get ccminer detected as a full 3D application...
legendary
Activity: 3248
Merit: 1070
djm are you optimizing ccminer for 980?
sr. member
Activity: 350
Merit: 250
So here is a question for you all: has anyone here done much work with KVM on Ubuntu using GPU passthrough? I'm looking to consolidate 2 machines into 1 monster machine, running 2 OSes in virtual machines with a physical GPU given to each one.
sr. member
Activity: 462
Merit: 250
www.dashpay.io
thanks Ignition75, i ordered a switch and light diode for 3 usd that will do it too. for 3 usd...not bad  Smiley

Even better...  I always meant to order those for my farm but never got around to it....
legendary
Activity: 3164
Merit: 1003
ITS ALIVE   Roll Eyes  thanks djm and bigjme   Cheesy

You can setup in your Bios that in case of a power loss the computer should turn on so you can turn it on the next time only by switching off and then on your power supply, so no need to short it to turn it on ever again.

Yes but if you're like me and you want to see how far you can OC your hardware and your PC freezes often, it's a pain in the ass to wait 30 seconds for all the charge to leave your Mobo so the AC Power On setting can be invoked.

Phillips head screwdrivers are good for shorting the pins  Wink
thanks Ignition75, i ordered a switch and light diode for 3 usd that will do it too. for 3 usd...not bad  Smiley
sr. member
Activity: 462
Merit: 250
www.dashpay.io
ITS ALIVE   Roll Eyes  thanks djm and bigjme   Cheesy

You can setup in your Bios that in case of a power loss the computer should turn on so you can turn it on the next time only by switching off and then on your power supply, so no need to short it to turn it on ever again.

Yes but if you're like me and you want to see how far you can OC your hardware and your PC freezes often, it's a pain in the ass to wait 30 seconds for all the charge to leave your Mobo so the AC Power On setting can be invoked.

Phillips head screwdrivers are good for shorting the pins  Wink
legendary
Activity: 3164
Merit: 1003
ITS ALIVE   Roll Eyes  thanks djm and bigjme   Cheesy

You can setup in your Bios that in case of a power loss the computer should turn on so you can turn it on the next time only by switching off and then on your power supply, so no need to short it to turn it on ever again.
thank you, yes i heard about that, thats great isn't it and im going to do that   Smiley  right now i have to get windows installed. have to use the disc drive from my other computer.  Tongue
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
ITS ALIVE   Roll Eyes  thanks djm and bigjme   Cheesy

You can setup in your Bios that in case of a power loss the computer should turn on so you can turn it on the next time only by switching off and then on your power supply, so no need to short it to turn it on ever again.
legendary
Activity: 3164
Merit: 1003
@ djm34 yes on page 21  Smiley
legendary
Activity: 1400
Merit: 1050
anyone, i need to jump start my first asrock 81, i dont have the power switch yet, is it PWRBTN  and  RESET that you quick jump?  thank you

Hold the board with both your hand. Jump 3 times in 1 sec. That should quick jump your board.  Grin
Tongue

 Grin

http://www.asrock.com/mb/Intel/H81%20Pro%20BTC/index.us.asp?cat=Manual

download  please and its page 21  top
hu ? Are you the one asking the question and wanting to do the manip ?

what he means is that he has the motherboard basically by itself with no power button atall. he wants to jump the power button pins to get it to start (mimic power button press)

I understood that... but if he has the manual, well it will be written how to wire it...

looking into my manual it says 6 and 8 (standard I/O connectivity design)

     ° 9
6 ° ° 7
8 ° ° 5
4 ° ° 3
2 ° ° 1

just need to short 6 and 8

ITS ALIVE   Roll Eyes  thanks djm and bigjme   Cheesy
you welcome, btw it was page 21 of your manual  Grin