Topic: BlockBurner LLC - Crucible FPGA Scrypt Miner - Announcement Aug-19 - page 12. (Read 42404 times)

hero member
Activity: 798
Merit: 1000
www.DonateMedia.org
That seems a fairly glaring oversight given their stated goal.

Yes, it does.
I noticed, in a thread from 2012, someone calling out current Scrypt implementations as light/fake/tuned down. Botnet safe, as it were. Is this still the general feeling on it? Is this the variant used in LTC? And is there a significant problem with running the full version of the algorithm? Honestly, I would prefer my CPU mining to occur mostly unnoticed in the background.

With the advent of specialized devices, simple botnets will become useless overall: the returns will be so low that even a million infected PCs, each putting a CPU's worth of hashpower into the network, will still be eclipsed by ASICs and other more powerful hardware. I would say the hacking game is on the way out for BTC, but Litecoin and other alts are still susceptible to this, as they are still primarily CPU/GPU based starting out. A USB-type device could be secured against intruders more easily and would overpower them in raw computing power. Eventually botnets will be out of business on Bitcoin, but the problem will just spread to the altcoins until altcoin hardware is ready to grow up.

It seems Litecoin is ready, and so are you.


Presently I am just getting to know the dev team and organizing a proper home for us. This right here is the start of a brand new industry, by community, for community, as Satoshi intended :)
sr. member
Activity: 406
Merit: 250
Anyone know the hash rate with the proposed setup?
full member
Activity: 168
Merit: 100
That seems a fairly glaring oversight given their stated goal.

Yes, it does.
I noticed, in a thread from 2012, someone calling out current Scrypt implementations as light/fake/tuned down. Botnet safe, as it were. Is this still the general feeling on it? Is this the variant used in LTC? And is there a significant problem with running the full version of the algorithm? Honestly, I would prefer my CPU mining to occur mostly unnoticed in the background.
full member
Activity: 196
Merit: 100


1. Do you think the market and community is ready for FPGA Litecoin?

Yes!

2. Is there definite interest in FPGA Litecoin machines? Would you buy one if the price was reasonable? What is reasonable?

Yes!  It would have to have a significant price performance advantage over the nextgen AMD GPUs coming in the fall.

3. Would you pre-order one to support first round funding for prototyping and first wave production?

Yes, well really it depends - no BFL situations.  2 month lead time max.

If specs are right I would order a few, I know many others would also.
legendary
Activity: 1484
Merit: 1026
In Cryptocoins I Trust
Interesting. Thanks for the detailed post, especially the last part where you share your results.

This is good news as far as I am concerned, the whole point of Litecoin using Scrypt was so it would be difficult for specialised hardware to have a massive performance edge.

Hi mr_random!

I think it's a good thing. Although Litecoin was made so that it could only be mined on CPUs, that has been proven wrong and now GPUs can mine it much more efficiently. This allows the evil bloodsuckers that are bot nets to feed off our Litecoin ecosystem, just like they did on Bitcoin for years.

Specialized hardware will make strides to push bot netters off the network and on to some other alt coin. Once they can't compete profitably, they'll be outta here. I don't think FPGAs will completely accomplish this goal, but ASICs will, just like they have done/will do with Bitcoin.

Also, custom mining hardware for a crypto currency is somewhat proof that you've actually "made it" as a currency, and you'll most likely be around for a while.
phk
newbie
Activity: 28
Merit: 0
That seems a fairly glaring oversight given their stated goal.

Yes, it does.
full member
Activity: 168
Merit: 100

This is good news as far as I am concerned, the whole point of Litecoin using Scrypt was so it would be difficult for specialised hardware to have a massive performance edge.

SCRYPT doesn't attempt to make it difficult for special hardware to have a performance edge.  Instead, it tries to make it more expensive.  Depending on RAM requires die area which translates directly to unit cost.

In the case of an ASIC, this is something to be considered.  The 128KB used by litecoin translates to 1 million bits of SRAM, which might multiply the ASIC unit cost x2/x4/x10?.

In the case of an FPGA, you already bought the RAM.  In the case of the popular Spartan 6 LX150 used in bitcoin mining, you already bought over 4 million bits worth.

So, whoever picked the 128KB size for litecoin didn't go out of their way to make it that hard for "special" hardware already in circulation.

So all they needed to do is ramp up the 128k number to make it cost-prohibitive on FPGAs for the next few years?  That seems a fairly glaring oversight given their stated goal.
phk
newbie
Activity: 28
Merit: 0

This is good news as far as I am concerned, the whole point of Litecoin using Scrypt was so it would be difficult for specialised hardware to have a massive performance edge.

SCRYPT doesn't attempt to make it difficult for special hardware to have a performance edge.  Instead, it tries to make it more expensive.  Depending on RAM requires die area which translates directly to unit cost.

In the case of an ASIC, this is something to be considered.  The 128KB used by litecoin translates to 1 million bits of SRAM, which might multiply the ASIC unit cost x2/x4/x10?.

In the case of an FPGA, you already bought the RAM.  In the case of the popular Spartan 6 LX150 used in bitcoin mining, you already bought over 4 million bits worth.

So, whoever picked the 128KB size for litecoin didn't go out of their way to make it that hard for "special" hardware already in circulation.
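
The memory figures above can be checked with a quick back-of-the-envelope sketch. The 128 KB scratchpad size comes from the post; the LX150 figure of 268 block RAMs at 18 Kb each is an assumption taken from the Xilinx Spartan-6 data sheet numbers, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the memory figures discussed above.
SCRATCHPAD_BYTES = 128 * 1024            # 128 KB scratchpad (from the post)
scratchpad_bits = SCRATCHPAD_BYTES * 8   # the "1 million bits" of SRAM

# Assumption: 268 block RAMs of 18 Kb each on the Spartan-6 LX150.
LX150_BLOCKS = 268
BLOCK_BITS = 18 * 1024
lx150_bits = LX150_BLOCKS * BLOCK_BITS   # the "over 4 million bits"

# How many whole scratchpads fit, counting raw bits only.
raw_fit = lx150_bits // scratchpad_bits

print(scratchpad_bits, lx150_bits, raw_fit)
```

This counts raw bits only; how many scratchpads actually fit depends on how the blocks are configured and on how much logic is left for the arithmetic.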

phk
newbie
Activity: 28
Merit: 0
Also: Not to be the guy asking the stupid questions here, but what is stopping bulk purchases of GPU chips (with clocking/memory specifically designed for mining) from AMD? With, say, 25 undervolted and fine-tuned 7850 chips on a fairly simple board that would plug in via USB and be recognized as a multi-CrossFire system, I can see that being a $$ while solution. Or maybe I am just dreaming...

You need to have product volume in order to get the chip vendor's attention.

With FPGAs, you can go to any number of distributors and buy them one at a time if you like (with discounts at various volumes).

legendary
Activity: 1344
Merit: 1001
I have my own Scrypt code for Xilinx FPGA and a pluggable rack system that takes 10 boards; I had to mux them as 8+2 hot spares (yep, sometimes they drop in & out of service randomly).

Unfortunately...
Performance is shite... in comparison to high-end CPUs or GPUs.
Who knows if I can get an improvement, but it is going to be very hard to beat the GPU on throughput vs. cost.

Interesting. Thanks for the detailed post, especially the last part where you share your results.

This is good news as far as I am concerned, the whole point of Litecoin using Scrypt was so it would be difficult for specialised hardware to have a massive performance edge.
full member
Activity: 147
Merit: 100
I am very interested in this development.  Great to hear that an official dev team has been set up.  Looking forward to additional updates and opportunities to get involved (probably mostly in the form of funding or pre-order).  Good luck!
full member
Activity: 224
Merit: 100
Also: Not to be the guy asking the stupid questions here, but what is stopping bulk purchases of GPU chips (with clocking/memory specifically designed for mining) from AMD? With, say, 25 undervolted and fine-tuned 7850 chips on a fairly simple board that would plug in via USB and be recognized as a multi-CrossFire system, I can see that being a $$ while solution. Or maybe I am just dreaming...
legendary
Activity: 1484
Merit: 1005
Yes, an x1024 memory might be constructed with (32) blocks configured for x32 width.   Is there something you didn't understand about that?  (this is just repeating what I said earlier?)

No, that makes sense, before I was confused because I thought you were implying that a 9 KB memory block could have a 1024-bit width.
full member
Activity: 224
Merit: 100
Count me in for seed/prototype financing/purchase. No pain, no gain! (Once the dev team, ballpark cost etc. are presented.)
legendary
Activity: 1484
Merit: 1026
In Cryptocoins I Trust
Good luck, I will be following closely.
phk
newbie
Activity: 28
Merit: 0

I'm sorry, but I still don't follow.  (b) Table 4 that you cited shows a maximum width of 32 bits for a 9 KB block.  With 18 KB data blocks, the maximum width is 64 bits (plus error-check bits).

You can get 32-bit writes in parallel on 32 separate 9 KB blocks, which is sort of like a 1024-bit interface (I guess; a 1024-bit interface really implies that you're writing 1024 bits a cycle through the same memory interface...).

Yes, an x1024 memory might be constructed with (32) blocks configured for x32 width.   Is there something you didn't understand about that?  (this is just repeating what I said earlier?)

sr. member
Activity: 399
Merit: 250
:'( Seriously, some noobie statements in this discussion.

1. FPGA is NOT, I repeat NOT, a single product from a single manufacturer, and as such there are NO hard and fast rules on what you can and cannot do with BRAMs or memory generation; even the Xilinx product range has a different 'flavour' across product lines.

So statements such as "you don't know what you are talking about" only show you up for the noob you are; if you were THAT WELL researched you would know this.

Talking about 'RAM' as a single entity is also a misnomer, because generally there are multiple ways to 'construct' RAM, which is after all just a flipflop.

If you are "lucky" the FPGA may have BRAM blocks where the internal resources and routing are all optimized for you, and you just 'hook it up'.
If you are not one of God's chosen people then you have to construct the 'RAM' from normal logic, with all the shitty routing and interconnection that implies.

2. Memory access speeds have little to do with it; ultimately it comes down to internal logic chains. No matter how FAST your memory is,
if your shitty VHDL/Verilog is so badly written that it takes 20 ns to execute a clocked routine, then you may as well just be using pen & paper as a scratch pad; ultimately it bottlenecks somewhere.
Xilinx allows their internal BRAM to be operated at 'up to' 600 MHz on some of the V5/V6, but unless you can get the rest of your relevant logic up to that speed, it does not really matter how fast it is.
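
To put rough numbers on that bottleneck, here is a small sketch using the two figures from the point above: a 20 ns clocked path and BRAM rated at up to 600 MHz. The helper name `fmax_mhz` is made up for illustration, and real timing closure involves far more than this one division.

```python
def fmax_mhz(critical_path_ns: float) -> float:
    """Highest clock (in MHz) that a path with this delay can run at."""
    return 1000.0 / critical_path_ns

bram_limit_mhz = 600.0             # "up to" figure quoted for some V5/V6 parts
logic_limit_mhz = fmax_mhz(20.0)   # the 20 ns clocked routine

# The whole design can only be clocked as fast as its slowest element.
design_clock_mhz = min(bram_limit_mhz, logic_limit_mhz)

print(logic_limit_mhz, design_clock_mhz)
```

Under these assumptions the fast BRAM buys nothing: the design is pinned to whatever the slow logic path allows.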

As regards Scrypt, I had actually contacted some members who claim to be interested in Technical co-op, but it came to naught....
People are only interested if they think you have an edge.

I have my own Scrypt code for Xilinx FPGA and a pluggable rack system that takes 10 boards; I had to mux them as 8+2 hot spares (yep, sometimes they drop in & out of service randomly).

It's a nice size, about 70 cm x 20 cm x 35 cm, which allows for cooling and leaves room to slide PCBs along to get the JTAG into each board, with per-board high-speed 17 CFM MAGLEV fans (none of those shitty fans with the oil pool and stupid split washer under a label).
The only oversight is where TF do I put the PSUs... (I'd banked on an ATX actually being able to supply the 3V3 rail, but they all lie about the capability.)

Unfortunately...
Performance is shite... in comparison to high-end CPUs or GPUs.
Who knows if I can get an improvement, but it is going to be very hard to beat the GPU on throughput vs. cost.
full member
Activity: 137
Merit: 100

1. Do you think the market and community is ready for FPGA Litecoin?
Yes!


2. Is there definite interest in FPGA Litecoin machines? Would you buy one if the price was reasonable? What is reasonable?
I am very interested, will be buying if the price is right.


3. Would you pre-order one to support first round funding for prototyping and first wave production?
Would need to see a complete working prototype before placing a pre order, or investing.
legendary
Activity: 1484
Merit: 1005
The total RAM per block is 18KB. Each block has a 72-bit width. I don't really know where you're pulling your numbers from. Even if you calculate in parallel, 128/18 = 8 block RAM units required, with 72-bit widths each --> not 1024 bit width either.

I think you are misinformed about what is and is not possible.

You can construct whatever width you like by putting multiple units in parallel.  This is commonly done, and is a general feature of FPGAs, not unique to Xilinx.

The vendors put them into small blocks like that to improve the granularity / flexibility for the designer.  As a result, you effectively lose capacity (bits) when your chosen configuration doesn't map efficiently to the underlying memory organization.

Artix-7 is even better, but limiting the discussion to Spartan 6 which many people have already bought, here is some documentation:

See page two of this:
(a) http://www.xilinx.com/support/documentation/ip_documentation/blk_mem_gen_ds512.pdf

See page nine of this:
(b) http://www.xilinx.com/support/documentation/user_guides/ug383.pdf

See page two of this:
(c) http://www.xilinx.com/support/documentation/data_sheets/ds160.pdf


To get a x1024 memory using (a), you can see from (b) that one possibility might be (32) instances of (x32) width.
As far as the capability of the LX150 part commonly used on existing bitcoin mining boards, you will see in (c) that this device has a total of (268) such blocks.
So accommodating the 128KB scratchpad in SCRYPT could be done with (64) blocks configured for (x32) width and (32) units in parallel.   The LX150 could possibly hold (4) such memories, but I think you run out of gates for SCRYPT arithmetic well before that.
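
The block counts above can be reproduced with a short sketch. It rests on assumptions not spelled out in the post: that the scratchpad is organised as 1024 entries of 1024 bits (128 bytes) each, and that an 18 Kb block in its x32 configuration holds 512 data words, with the remaining bits being parity. The variable names are illustrative only.

```python
import math

WORD_BITS = 1024             # assumed scratchpad entry width (128 bytes)
DEPTH = 1024                 # assumed number of entries (128 KB / 128 B)
BLOCK_WIDTH = 32             # data width per block in the x32 configuration
BLOCK_DATA_BITS = 16 * 1024  # assumed data bits per 18 Kb block (rest is parity)
LX150_BLOCKS = 268           # block count quoted from (c)

block_depth = BLOCK_DATA_BITS // BLOCK_WIDTH         # words held by one block
blocks_wide = WORD_BITS // BLOCK_WIDTH               # blocks in parallel for x1024
banks_deep = math.ceil(DEPTH / block_depth)          # banks needed to reach full depth
blocks_per_scratchpad = blocks_wide * banks_deep
scratchpads = LX150_BLOCKS // blocks_per_scratchpad  # whole scratchpads on one LX150

print(blocks_wide, banks_deep, blocks_per_scratchpad, scratchpads)
```

With those assumptions the sketch lands on the same (32) parallel units, (64) blocks and (4) memories quoted above. It says nothing about whether enough logic remains for the SCRYPT arithmetic.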



I'm sorry, but I still don't follow.  (b) Table 4 that you cited shows a maximum width of 32 bits for a 9 KB block.  With 18 KB data blocks, the maximum width is 64 bits (plus error-check bits).

You can get 32-bit writes in parallel on 32 separate 9 KB blocks, which is sort of like a 1024-bit interface (I guess; a 1024-bit interface really implies that you're writing 1024 bits a cycle through the same memory interface...).  I think a direct implementation like this won't achieve a very good speed, though (less than 10 KH/s on most of these chips).

The better implementation would just run in the allocated memory and remake the LUT as needed I would think.  See the kernel for cgminer and reaper, and use of the "lookup gap" function, which more or less does this.
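
For anyone unfamiliar with the "lookup gap" idea mentioned above, here is a minimal Python sketch of the time-memory trade-off. It is not the cgminer or reaper kernel: `mix` is a SHA-256 stand-in for scrypt's real Salsa20/8 BlockMix, `integerify` is simplified, and the names `romix_full` and `romix_gap` are made up for illustration. The point is only that storing every `gap`-th scratchpad entry and recomputing the rest gives the same result with roughly 1/`gap` of the memory.

```python
import hashlib

def mix(block: bytes) -> bytes:
    # Stand-in for scrypt's BlockMix; NOT the real Salsa20/8 core.
    return hashlib.sha256(block).digest()

def integerify(block: bytes, n: int) -> int:
    # Simplified: pick a scratchpad index from the first 4 bytes.
    return int.from_bytes(block[:4], "little") % n

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def romix_full(seed: bytes, n: int) -> bytes:
    """Reference version: keeps all n scratchpad entries."""
    x = mix(seed)
    v = []
    for _ in range(n):
        v.append(x)
        x = mix(x)
    for _ in range(n):
        j = integerify(x, n)
        x = mix(xor(x, v[j]))
    return x

def romix_gap(seed: bytes, n: int, gap: int) -> bytes:
    """Lookup-gap version: keeps every gap-th entry, recomputes the rest."""
    x = mix(seed)
    v = []
    for i in range(n):
        if i % gap == 0:
            v.append(x)          # only every gap-th entry is stored
        x = mix(x)
    for _ in range(n):
        j = integerify(x, n)
        entry = v[j // gap]      # nearest stored entry at or below index j
        for _ in range(j % gap):
            entry = mix(entry)   # walk forward to rebuild entry j
        x = mix(xor(x, entry))
    return x

same = romix_full(b"demo", 1024) == romix_gap(b"demo", 1024, 4)
print(same)
```

The price is extra mixing work on every lookup (up to `gap - 1` additional `mix` calls), so whether the trade pays off depends on whether memory or arithmetic is the scarcer resource on the chip.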
sr. member
Activity: 247
Merit: 250
Sweet :D I'm interested