I managed to get some yescrypt running, and here is what I got on this card (7750, 850/1200). Wattages are purely indicative, as my wattage meter is very cheap, and they cover the whole system, including a ~20W monitor and a ~3W UPS. Wattage and temps were taken only with v2.
Command line is
--no-submit-stale --kernel yescrypt -o server -u user -p password -w 4 --rawintensity X
--rawIntensity | h/s (v1) | h/s (v2) | W | C |
16 | 18 | 30 | 58-106 | 44 |
32 | 31 | 56 | 56-105 | 48 |
64 | 54 | 91 | 60-105 | 50 |
128 | 95 | 152 | 62-109 | 53 |
144 | | 105 | 61-107 | 49 |
I didn't go higher, as higher intensities would sporadically crash the driver. In all cases the system becomes oddly unresponsive.
Increasing the worksize results in lower hashrate.
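To put the table's numbers side by side, here is a rough sketch that computes the v2/v1 speedup and h/s per watt for each row. It assumes the upper wattage reading is the draw under load, which is my reading of the W column and not something the meter confirms. Since the wattage covers the whole system (monitor and UPS included), the per-watt figures are only good for comparing rows against each other. The 144 row is skipped because it has no v1 figure.

```python
# Figures copied from the table: (rawintensity, v1 h/s, v2 h/s, (W low, W high)).
# The 144 row is omitted because it has no v1 figure.
ROWS = [
    (16, 18, 30, (58, 106)),
    (32, 31, 56, (56, 105)),
    (64, 54, 91, (60, 105)),
    (128, 95, 152, (62, 109)),
]


def summarize(rows):
    """Return (rawintensity, v2/v1 speedup, v2 h/s per watt) for each row."""
    out = []
    for raw, v1, v2, (_w_low, w_high) in rows:
        speedup = v2 / v1
        # Assumption: the upper wattage reading is whole-system draw under load.
        hs_per_watt = v2 / w_high
        out.append((raw, round(speedup, 2), round(hs_per_watt, 2)))
    return out


for raw, speedup, hpw in summarize(ROWS):
    print(f"rawintensity {raw:>3}: v2/v1 = {speedup:.2f}x, {hpw:.2f} h/s per W")
```

Under that assumption v2 comes out roughly 1.6x to 1.8x faster than v1 on every row, and h/s per watt keeps climbing with rawintensity because the wattage barely moves.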
I had high hopes for the "nvidia" multi-phase kernel. In 5 minutes of looking I couldn't spot any reason for it to be nv-only, but the fact is that it cuts performance in half for me.
For the more technically inclined
The yescrypt kernel is huge: over 10 times bigger than the suggested size. It also spills registers heavily, an operation which has traditionally favored nvidia hardware.
In my tests, when hashing 128 items, each one reads on average 14MiB and writes 1.92MiB. If memory serves, the reference figure should be around 2MiB, so the GPU is being hammered hard. The kernel also consumes all on-chip registers, so the GPU is currently running with all latency-hiding capabilities disabled. The fact that it's still remotely comparable to a CPU in this worst-case scenario is nothing short of amazing.
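For a sense of scale, the per-hash figures can be turned into total memory traffic. This is a back-of-the-envelope sketch only: it assumes the "128 items" correspond to the rawintensity 128 row of the table and that the 152 h/s (v2) figure applies to the same run. Neither is stated outright.

```python
READ_MIB_PER_HASH = 14.0    # average read per item, from the post
WRITE_MIB_PER_HASH = 1.92   # average write per item, from the post
BATCH = 128                 # items hashed at once (assumed: rawintensity 128)
HASHRATE_V2 = 152           # h/s at rawintensity 128 (assumed to be the same run)


def traffic_per_hash_mib():
    """Total memory traffic (read + write) for a single hash, in MiB."""
    return READ_MIB_PER_HASH + WRITE_MIB_PER_HASH


def traffic_per_batch_mib(batch=BATCH):
    """Total memory traffic for one batch of hashes, in MiB."""
    return traffic_per_hash_mib() * batch


def sustained_mib_per_s(hashrate=HASHRATE_V2):
    """Memory traffic per second at the given hashrate, in MiB/s."""
    return traffic_per_hash_mib() * hashrate


print(f"per hash : {traffic_per_hash_mib():.2f} MiB")
print(f"per batch: {traffic_per_batch_mib():.2f} MiB")
print(f"sustained: {sustained_mib_per_s() / 1024:.2f} GiB/s")
```

That works out to roughly 16MiB of traffic per hash and about 2GiB moved per batch of 128.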
The multi-phase kernel looks great at a glance, but for some reason it just does not add up. Kernel "search2" alone takes more time than the whole monolithic "search" kernel.
On average, only 1 clock out of 3 is spent doing useful work for me, and on that useful clock only about 6% of GPU power is effectively used.
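If the two figures are taken to compound (an assumption about how they combine, not something the profiler reported as a single number), the overall utilization is a one-line calculation:

```python
from fractions import Fraction

useful_clock_share = Fraction(1, 3)       # "1 clock out of 3", from the post
usage_on_useful_clock = Fraction(6, 100)  # "about 6%", from the post

# Assumption: the two figures multiply to give overall effective utilization.
overall = useful_clock_share * usage_on_useful_clock

print(f"overall effective utilization: {float(overall):.1%}")
```

So on this reading, only about 2% of the GPU is doing useful hashing work at any given moment.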
I think BSTY is going to be a great contributor in providing a playground for a real hashing scheme (rather than a mish-mash); every advancement here will be very interesting.