Only if it's full below ~90nm, which means several million dollars in mask costs, a 6-9 month
mfg period, and a pretty good chance that the first chip is useless, unless you know WTF you are
doing
This is a gross overestimate. The folded, not-unrolled design is basically a two 32-bit-wide shift registers with some multi-input adders in a feedback and an adder-comparator on the output. The remaining logic is all standard cells: PLL, ROM and I/O.
SHA-256 is pretty much self-testing: there are no unreachable states and all every state is observable. Any internal fault will eventually show up on the outputs.
The lower level physical/analog design is what fucks over most amateur ASIC designs.
And again dual SHA-256 is a dream assignment for the beginners. It is small and it takes only about 64 or 128 simulated clocks to verify the entire custom circuitry. From my past experience with SPICE and BSIM4 I would venture to guess that I could simulate one clock cycle of an entire SHA-256 round on my Core2Duo laptop in one day.
The additional benefit for the team is that they have design closure achieved from the moment they pass automated DRC verification. All timing and power targets are soft, they have absolutely no interoperability requirements and any hard targets for timing closure or power closure. All they have to do is pick an internal clock generation cell with variable multiplier.
I would guess that the chances of a "zero yield" first spin are atypically low for this design.