
Topic: A custom designed FPGA miner for LTC? (Read 5793 times)

full member
Activity: 140
Merit: 100
"Don't worry. My career died after Batman, too."
August 03, 2013, 04:47:53 AM
#73
second WindMaster, SCAM!!!! Anyone with technical knowledge can see this. Remember people: check things out before donating.

After a month and a half? Just curious.
hero member
Activity: 798
Merit: 1000
‘Try to be nice’
August 03, 2013, 03:45:44 AM
#72
Just from an economic standpoint, I believe that with regard to scrypt one will see ASICs before FPGAs, in this iteration of the cycle.


Here is why I'm correct:


1. Because SHA256 was a "first of a first".

2. The ASIC market is generally much better developed at this point in the cycle.

3. The whole environment, the whole crypto market, is more developed.

4. The high difficulty of SHA256 will drive the market's "desire" and push ASIC makers to build it.

5. They (or some of them) will do what the market wants.
legendary
Activity: 1176
Merit: 1280
May Bitcoin be touched by his Noodly Appendage
June 14, 2013, 05:29:51 AM
#71
If I have completely misunderstood hashing over a lifetime of programming, then I really have some long hard thinking to do.
Quote from: Nova!
Oh wait, RND meant 'round' and not 'random'?
Lol
sr. member
Activity: 406
Merit: 250
Great. Now GPU miners will be completely obsoleted. Way to go. Tongue
sr. member
Activity: 347
Merit: 250
If you still feel that way, you have to strive to discard that feeling from your mind as quickly as possible, because that way is wrong. SHA-256 is neither the problem nor the hard part in LTC mining.

Agreed.  Wishing I had a proper profiler about now though Smiley

Let me save you some time and say the SHA256 overhead is around 0.1% of the overall processing time involved in scrypt, +/- a bit.  That's why mtrlt is saying not to bother with that, since there's an upper bound of about 0.1% you could gain even if you could optimize it down to zero overhead to calculate SHA256 hashes.
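For a rough sense of where a figure like that comes from: Litecoin's scrypt runs with N=1024, r=1, p=1, so the memory-hard ROMix phase does on the order of four thousand Salsa20/8 block mixes over a 128 KiB scratchpad, while the two PBKDF2-HMAC-SHA256 passes at the start and end only need a few dozen SHA-256 compressions.  A back-of-the-envelope sketch in C (the SHA-256 compression count is an assumed ballpark, not a measurement, and real memory traffic pushes SHA-256's share even lower):

Code:
/* Back-of-the-envelope only: rough operation counts for one scrypt hash with
 * Litecoin's parameters (N=1024, r=1, p=1).  The SHA-256 compression count is
 * an assumed ballpark figure, not taken from any profiler. */
#include <stdio.h>

int main(void)
{
    const int N = 1024, r = 1;
    int salsa_calls = 2 * N * (2 * r);   /* ROMix: 2N BlockMix calls, 2r Salsa20/8 each */
    int sha_compressions = 32;           /* assumed: both PBKDF2-HMAC-SHA256 passes     */
    int scratchpad_kib = N * 128 * r / 1024;

    printf("Salsa20/8 core calls : %d\n", salsa_calls);
    printf("SHA-256 compressions : ~%d (assumed)\n", sha_compressions);
    printf("Scratchpad           : %d KiB\n", scratchpad_kib);
    printf("SHA-256 share if per-call costs were comparable: ~%.1f%%\n",
           100.0 * sha_compressions / (salsa_calls + sha_compressions));
    return 0;
}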
hero member
Activity: 714
Merit: 510
I have found what I believe is a shortcut in scrypt that, if implemented correctly in hardware, could dramatically speed up the hashrate.
I believe it should work, and I know how I would implement it if I had the resources to acquire the FPGA and tools I need.

To show good faith I will elaborate on the algo and how the shortcut would work.
This is really oversimplified, but you are free to take this idea and roll with it.

scrypt, the algo used by LTC, and in fact all hashing algos, are composed of 2 predominant steps:
#1 Generate a random list
#2 Hash across it.

To generate consistent results, the random algo is actually a deterministic pseudo-random one, and its setup is determined by a seed.
We will call this the prng.

The other step is hashing, which is pretty well understood: you take a value from list a and replace it with a value from list b.
When you are done iterating, you now have a hash.

scrypt differs mostly because it uses an entirely new list so frequently.
The setup and teardown of this list requires quite a bit of CPU time, and a lot of time is wasted on the memory bus performing storage & retrieval operations.
It cannot be done concurrently because the list itself changes frequently.

The shortcut is to have a multicore setup and a ton of on-die RAM:
a dedicated prng core which does the setup and teardown for the second core.

The secondary core is the hashing core.  It would tell the prng core to set up a new list.
Then it would retrieve position x off the list from the shared memory space.
Other than that, it would also perform all the normal hashing functions in a dedicated memory space.

I believe the total I need to make this work is about $12k USD: the FPGA I'm targeting right now is $10k, and a license for the dev tools will be about $2k.
If I can find a less expensive option then I will go for that, but there aren't that many FPGAs that meet the requirements right now.
The particular target FPGA also has a direct path to ASIC from the manufacturer.

If you're willing to donate to the effort, I will keep you in the loop with full disclosure, including build instructions and a copy of the sources and the firmware.
I haven't decided on a license for this if it works, but you will at least have a right to personal use.
Perhaps if enough people are interested in production-level manufacturing we could go a different route.  I'm not particularly interested in making this something I do for the rest of my life, but the contrarian in me is very excited by the potential here.

The LTC donation address is below.
LKfKkRMvMf2stQMNzQdKCvaf2YueAv1QSa

You can also donate BTC to the key in my sig.
There is no maximum, but if you do decide to donate, please send at least 0.5 LTC or the equivalent in BTC.
Then post just the address you donated from, and I'll PM you here with a Bitmessage key to join the group.

Thanks in advance!


Go on Cryptostocks.com and list there. There is another group offering FPGA shares. You should do the same and try to pull an ASICMiner-type deal.
full member
Activity: 140
Merit: 101
If you still feel that way, you have to strive to discard that feeling from your mind as quickly as possible, because that way is wrong. SHA-256 is neither the problem nor the hard part in LTC mining.

Agreed.  Wishing I had a proper profiler about now though Smiley
member
Activity: 104
Merit: 10
Close but reverse.
I had read the whitepaper and got what I thought was a good understanding of the way scrypt was supposed to work.
Then I looked at the code and I saw RND being called repeatedly.  Of course, to my mind it all made perfect sense at that point.  You were clearly re-seeding a random number generator (rounds of SHA256 had dropped out of my head).  The reference to SHA256 in the function name didn't really register, but the couple of times I did see it, my internal explanation was along the lines of "ok, so he took the framework from the SHA256 algo and modified it to the scrypt algo".  At this point SHA256 rounds in scrypt had completely fallen out of my head.

Anyways, yeah, I saw RND, realized in my infinite wisdom that you would need a custom seedable random number generator, tracked down the RND at the top, and said to myself "ok, that could in fact work as a way of generating a random list".  It never crossed my mind to compare it to SHA256.  I get the concept of rounds, but at this point my mind saw that as being in a loop somewhere.  I didn't need to account for it just then.

Whitepapers can leave one very confused; I, for one, have never read the scrypt whitepaper. I've just taken a cursory glance at it and decided I can't possibly understand its complex language within a reasonable time. Looking at code is far more productive. And yes, the SHA-256 rounds can be in a loop, but usually they're unrolled for speed.

Quote
Once I saw the section though, that's where I realized that this could be optimized, and probably should be optimized, by a custom core with only the logic to perform this function.  Putting it in the stack unrolled or as a function call would likely be much slower than a call out to a logic unit optimized for the specific task.  However, the reads and writes in memory would be problematic, hence the idea of sharing on-die memory.  I still hadn't quite worked out the nuts and bolts of how it would fit together.

I still feel that way.  I'm still studying it, but I do still feel that way.
If you still feel that way, you have to strive to discard that feeling from your mind as quickly as possible, because that way is wrong. SHA-256 is neither the problem nor the hard part in LTC mining.
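For anyone following along at home, this is roughly the shape those round macros take: one SHA-256 round written as a #define so all 64 rounds can be unrolled into straight-line code.  This is a generic sketch of the standard round function, not the exact defines from the miner source being discussed, and "RND" is short for round, not random:

Code:
#include <stdint.h>

/* One SHA-256 round as a macro (generic sketch).  "k" stands for K[i] + W[i]
 * of round i. */
#define ROTR(x, n)   (((x) >> (n)) | ((x) << (32 - (n))))
#define Ch(x, y, z)  (((x) & (y)) ^ (~(x) & (z)))
#define Maj(x, y, z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define S0(x)        (ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22))
#define S1(x)        (ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25))

#define RND(a, b, c, d, e, f, g, h, k)                          \
    do {                                                        \
        uint32_t t = (h) + S1(e) + Ch((e), (f), (g)) + (k);     \
        (d) += t;                                               \
        (h)  = t + S0(a) + Maj((a), (b), (c));                  \
    } while (0)

/* Unrolled use: the register "rotation" between rounds is done by permuting
 * the macro arguments, so the compiler emits 64 copies of straight-line code. */
void sha256_first_rounds(uint32_t s[8], const uint32_t kw[64])
{
    RND(s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], kw[0]);
    RND(s[7], s[0], s[1], s[2], s[3], s[4], s[5], s[6], kw[1]);
    /* ... and so on for the remaining 62 rounds ... */
}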
full member
Activity: 140
Merit: 101
Silly kitchen psychology coming right up..

I don't think your problem is your age. Then again, I don't know your age, and I'm not old enough to know the effects of aging for sure. (I'm 24.) I think your problem is overconfidence. When you generate a hypothesis out of thin air (which is a valid way of generating hypotheses, and without any information indeed the only way), you assume it must be close to the truth, instead of finding out whether it's even related. Example: You assumed RND meant random, because that was probably the first thing to pop into your head. (Incidentally, this is how I deduced you don't know much about hash algorithms in general, since all hash algorithms I know of have rounds, and if I see RND being used in a hash algorithm, I'm naturally going to assume it's referring to rounds.) Then you decided that this is obviously the place where scrypt does its memory-hard magic, and is thus the bottleneck of the algorithm, without even looking at surrounding code to see how it was being used. Am I even close to the truth? Just curious.


Close but reverse.
I had read the whitepaper and got what I thought was a good understanding of the way scrypt was supposed to work.
Then I looked at the code and I saw RND being called repeatedly.  Of course, to my mind it all made perfect sense at that point.  You were clearly re-seeding a random number generator (rounds of SHA256 had dropped out of my head).  The reference to SHA256 in the function name didn't really register, but the couple of times I did see it, my internal explanation was along the lines of "ok, so he took the framework from the SHA256 algo and modified it to the scrypt algo".  At this point SHA256 rounds in scrypt had completely fallen out of my head.

Anyways, yeah, I saw RND, realized in my infinite wisdom that you would need a custom seedable random number generator, tracked down the RND at the top, and said to myself "ok, that could in fact work as a way of generating a random list".  It never crossed my mind to compare it to SHA256.  I get the concept of rounds, but at this point my mind saw that as being in a loop somewhere.  I didn't need to account for it just then.

Once I saw the section though, that's where I realized that this could be optimized, and probably should be optimized, by a custom core with only the logic to perform this function.  Putting it in the stack unrolled or as a function call would likely be much slower than a call out to a logic unit optimized for the specific task.  However, the reads and writes in memory would be problematic, hence the idea of sharing on-die memory.  I still hadn't quite worked out the nuts and bolts of how it would fit together.

I still feel that way.  I'm still studying it, but I do still feel that way.

 
sr. member
Activity: 350
Merit: 250
OP reminds me of a younger me trying to do a digital scope, without any knowledge of FPGAs, as my final thesis. It sampled up to 20 kHz from a 130 MHz clock. The interface looked killer though, and the guys judging it knew even less about FPGAs than I did, so everything went better than expected. God, that was an awful implementation  Grin
member
Activity: 104
Merit: 10
Silly kitchen psychology coming right up..

I don't think your problem is your age. Then again, I don't know your age, and I'm not old enough to know the effects of aging for sure. (I'm 24.) I think your problem is overconfidence. When you generate a hypothesis out of thin air (which is a valid way of generating hypotheses, and without any information indeed the only way), you assume it must be close to the truth, instead of finding out whether it's even related. Example: You assumed RND meant random, because that was probably the first thing to pop into your head. (Incidentally, this is how I deduced you don't know much about hash algorithms in general, since all hash algorithms I know of have rounds, and if I see RND being used in a hash algorithm, I'm naturally going to assume it's referring to rounds.) Then you decided that this is obviously the place where scrypt does its memory-hard magic, and is thus the bottleneck of the algorithm, without even looking at surrounding code to see how it was being used. Am I even close to the truth? Just curious.
full member
Activity: 140
Merit: 101
I like you more and more all the time, Nova.  You seem an upstanding guy.  Don't let these naysayers get you down.

Thanks, they're not getting me down.  I don't let others control my opinion of myself.  He had a fundamentally valid point.  I've worked as a coder, as a team leader, as a programming manager, and as a project lead in the real world.  I love to think to myself that my mind is as clear and as sharp as it used to be.  However, if a programmer came to me with that level of mistake, it would have been a borderline HR issue in my book.

Coming clean and saying "oops, I screwed up" is one thing, but examining the fundamentals of how and why a mistake occurs is actually more important.
I post-mortem everything, whether it succeeded or failed, because the only thing that matters to me is the knowledge gained.  I view failure as a fundamental and necessary part of the learning process.  However, the same failure twice just calls into question one's own competency.

10 years ago I made a similar mistake and it cost me a company.  Literally, a company I founded failed because I looked at a vast chunk of uncommented code, thought I understood it, modified it, and a year later the company was gone.  I examined that whole scenario over and over again trying to find the root cause, and realized the cause had only been me.  I looked at something, believed I understood what it was doing, and arrogantly thought I could make it better, faster, stronger.

From that time on I had always been careful to consult with the original developer to divine intent if intent was in any way unclear, and just in general to exercise much more caution before believing that I understood.  In this case I did it all over again, and this time it wasn't a vast chunk of cryptic code with no explanation, it was a small chunk with an entire whitepaper backing it.  I did in fact read the whitepaper.  I'm still not sure what the missing thought process was here.  I freely admit this mistake was the root of this debacle.

It still catches my eye as not optimal.  That's not to say it's suboptimal in any way, and I concede that it's probably optimal for a GPU, but it doesn't feel right for what I'm trying to accomplish, so I'm struggling with a new challenge.

Which honestly is exactly where I like to be.  I just need to be more careful next time.  Smiley
hero member
Activity: 924
Merit: 501
I like you more and more all the time, Nova.  You seem an upstanding guy.  Don't let these naysayers get you down.
full member
Activity: 140
Merit: 101
They are the rounds of SHA-256... not a random number generator. I am now completely sure you are completely ignorant. You also talk about #defines like they are a completely new concept to you. Are you even a programmer?
Defines are not a new concept.  My line of thinking was that, yes, having that section of code unroll was precisely the problem.  Yes, flat code can be nice when it executes in a single stack, but if you call out to a separate device with a single instruction, it's faster in most cases than executing a bunch of things on the stack.

This is not my first FPGA project, but it is my second.  My first being one I can't go into depth about, but the gloss of it was: "Here is an OpenCL FPGA; we already use OpenCL on GPU farms in our datacenter.  We are adopting the technology so that we can port our software to hardware and yield better performance for lower cost.  Here's a manual, here's a devboard, we're going golfing. Yours truly, management."

That project worked well; most of what we had ported well.  It was almost a straight-across compile in most cases.  When I left that job I was able to take my devboard, which I played with, and decided to try it at mining LTC.

The rest has been explained here.

As for your comment about whether I'm actually a programmer or not: yes, I am, but this experience is making me wonder if maybe I've started to age out.  Now that I look, it's pretty obvious where I made my mistake: I didn't check a fundamental assumption, I just assumed I knew.  There is no excuse.  That leaves me looking like a fool, and I plan to leave this thread up and check it every time before I post something "I just know will work". Smiley

Thanks for the info.
full member
Activity: 140
Merit: 101
Now we're looking at the code.
This wasn't the exact file I had, but maybe something has changed, I don't know; it's close enough in the places that matter anyways.

Anyways, for the crux of my argument, take lines 124 through 142, which consist of the bulk of the random number generator.
These are currently implemented as defines.

That isn't a random number generator.  You're looking at macros for the SHA256 rounds.  Just stop, and go read the scrypt whitepaper.  Immediately, if not sooner..  I'm not trying to be rude, it's just that an immediate read of the scrypt whitepaper will be better use of your time at this point.

*embarrassed*
That's actually a good catch, and frankly something I had not seen before but should have.
It would have been a costly mistake to proceed on that, and while I can now see it's Round, not Rand, for RND, I openly admit I did not see that before, and this was a critical-thinking error.

Thank you both for showing me my mistake.
It completely breaks my premise.

There is no need to raise further funds for this.  These two have shown me a critical flaw in the plan, which would have been to isolate the RND function off onto its own core.
I'm really glad the original author came on and explained this, because it wasn't clear to me, and in the absence of anything resembling a comment in the source code, I find this information extremely valuable.

So here is what we've learned.

#1 I need to review more closely to make sure the code is doing what I think it's doing when working with someone else's code, especially when that code has no comments and all I have to go on is a whitepaper.

#2 I did in fact see the RND define as a hand-written pseudo-random generator.  It is not one, and had my brain been in any way functional I should have caught that before making an announcement of this type.  In my mind it's still a candidate for optimization, but that's probably just me being stubborn.

#3 Not that it's advisable but...
 
There are several dev boards and OpenCL FPGAs out there on the market.
It should in theory be possible to modify the OpenCL miner to run on one of these FPGAs.
The Altera Stratix V is at the high end of these, and I did burn one out trying.

That should be enough of a start; however, I do plan to keep chasing this dog until it barks.  I'm no longer asking for contributions of any kind, as I've gotten the information I needed.  I really appreciate everyone's patience on this.

Anyone who wishes a refund is entitled to it.
sr. member
Activity: 347
Merit: 250
Anyways, it's not like SHA-256 is the difficult part in calculating LTC hashes.

I think if Nova wants to get started in FPGA development, a more realistic idea would be for him to experiment with making an FPGA implementation of a Bitcoin miner instead.  He's already caught up on the details of SHA256 rounds, after all.  Smiley

Nova, consider experimenting with Bitcoin instead.  It'll be a lot easier, since SHA256D is almost the only thing you'll need to figure out (+/- some miscellaneous fiddly details and communications), and you can happily instance copies of your core all over the place until you run out of logic area.  Figuring out scrypt is much harder than that, but because it uses SHA256, you'll need to come to grips with that hashing algorithm first anyway.
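As a sketch of how little there is to the Bitcoin side, here is roughly the outer loop a SHA256D miner implements: double SHA-256 over the 80-byte block header while sweeping the 32-bit nonce, comparing the result against the target as a little-endian 256-bit number.  sha256d() below is an assumed placeholder for a double-SHA-256 routine, not a real library call; on an FPGA, this loop body is what gets turned into a pipelined core and replicated across the logic area:

Code:
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed helper, supplied elsewhere: out = SHA256(SHA256(in, len)). */
void sha256d(uint8_t out[32], const uint8_t *in, size_t len);

/* Compare two 32-byte values as little-endian 256-bit integers. */
static int cmp_le256(const uint8_t a[32], const uint8_t b[32])
{
    for (int i = 31; i >= 0; i--)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

bool scan_nonces(uint8_t header[80], const uint8_t target[32], uint32_t *found)
{
    uint8_t hash[32];
    for (uint32_t nonce = 0; ; nonce++) {
        memcpy(&header[76], &nonce, sizeof nonce); /* nonce lives in bytes 76..79
                                                      (little-endian host assumed) */
        sha256d(hash, header, 80);
        if (cmp_le256(hash, target) <= 0) {        /* hash <= target: share found */
            *found = nonce;
            return true;
        }
        if (nonce == 0xffffffffu)                  /* nonce space exhausted */
            return false;
    }
}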
member
Activity: 104
Merit: 10
Now we're looking at the code.
This wasn't the exact file I had, but maybe something has changed, I don't know; it's close enough in the places that matter anyways.

Anyways, for the crux of my argument, take lines 124 through 142, which consist of the bulk of the random number generator.
These are currently implemented as defines.

That isn't a random number generator.  You're looking at macros for the SHA256 rounds.  Just stop, and go read the scrypt whitepaper.  Immediately, if not sooner..  I'm not trying to be rude, it's just that an immediate read of the scrypt whitepaper will be better use of your time at this point.
Well, strictly speaking scrypt is using SHA256 as a random number generator...

In a way, yes. But Nova's proposal of doing single SHA-256 rounds in a separate core is bogus. I am not very knowledgeable on FPGAs, but I'd assume you'd at least unroll it, which would nullify what Nova is trying to accomplish in the first place.

Anyways, it's not like SHA-256 is the difficult part in calculating LTC hashes.
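For contrast, the part of scrypt that actually is hard is the ROMix step, sketched below for r=1 (Litecoin's setting), where each block B is 32 32-bit words.  blockmix_salsa8() is an assumed helper and the index derivation is simplified; the point is the access pattern, which is N sequential writes into a 128 KiB scratchpad followed by N data-dependent reads back out of it:

Code:
#include <stdint.h>
#include <string.h>

#define SCRYPT_N 1024                      /* Litecoin's scrypt N parameter */

/* Assumed helper: one BlockMix step (two Salsa20/8 core calls for r = 1). */
void blockmix_salsa8(uint32_t B[32]);

void romix(uint32_t B[32], uint32_t V[SCRYPT_N][32])
{
    /* Phase 1: fill the scratchpad sequentially. */
    for (uint32_t i = 0; i < SCRYPT_N; i++) {
        memcpy(V[i], B, 32 * sizeof(uint32_t));
        blockmix_salsa8(B);
    }
    /* Phase 2: read indices depend on the data itself, so they can't be
       precomputed and the whole scratchpad has to stay available. */
    for (uint32_t i = 0; i < SCRYPT_N; i++) {
        uint32_t j = B[16] & (SCRYPT_N - 1);   /* simplified Integerify(B) mod N */
        for (int k = 0; k < 32; k++)
            B[k] ^= V[j][k];
        blockmix_salsa8(B);
    }
}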
legendary
Activity: 2576
Merit: 1186
Now we're looking at the code.
This wasn't the exact file I had, but maybe something has changed, I don't know; it's close enough in the places that matter anyways.

Anyways, for the crux of my argument, take lines 124 through 142, which consist of the bulk of the random number generator.
These are currently implemented as defines.

That isn't a random number generator.  You're looking at macros for the SHA256 rounds.  Just stop, and go read the scrypt whitepaper.  Immediately, if not sooner..  I'm not trying to be rude, it's just that an immediate read of the scrypt whitepaper will be better use of your time at this point.
Well, strictly speaking scrypt is using SHA256 as a random number generator...
sr. member
Activity: 347
Merit: 250
Now we're looking at the code.
This wasn't the exact file I had, but maybe something has changed, I don't know; it's close enough in the places that matter anyways.

Anyways, for the crux of my argument, take lines 124 through 142, which consist of the bulk of the random number generator.
These are currently implemented as defines.

That isn't a random number generator.  You're looking at macros for the SHA256 rounds.  Just stop, and go read the scrypt whitepaper.  Immediately, if not sooner..  I'm not trying to be rude, it's just that an immediate read of the scrypt whitepaper will be better use of your time at this point.
member
Activity: 104
Merit: 10
Nova: You have blatantly misunderstood how hash functions work, and specifically how scrypt works. I agree with WindMaster: there is no way you have made, or will make, a scrypt FPGA. I advise everyone to not send Nova money.

Ok, elaborate.  Please explain in layman's terms how a cryptographic hash function in general works first.  Then also explain in layman's terms how scrypt differs from the SHA-256 of Bitcoin.

Why should I explain it to you in layman's terms? I can only assume that you are indeed not a technical person. Besides, I have already proven that I know what I'm talking about (by, for example, writing a BTC miner from scratch, and the first ever open-source GPU miner for LTC, which no one has been able to improve significantly, at least not publicly). You haven't. The only reason you can't explain your ideas technically is that you don't know what you're talking about.

Anyways, for the crux of my argument, take lines 124 through 142, which consist of the bulk of the random number generator.
These are currently implemented as defines.

Defines are a sort of macro; they're going to be put into the final output as the code they represent.

Ask yourself what happens if you just have that section of code isolated as its own separate core.
Then modify the code to call into that core rather than keep repeating that section over and over again.

They are the rounds of SHA-256... not a random number generator. I am now completely sure you are completely ignorant. You also talk about #defines like they are a completely new concept to you. Are you even a programmer?