[meta] Rust in Bitcoin reference implementation

ataraxiaceleste

newbie

Activity: 49

Merit: 0

Old post but GOLD. Very nice insight by the legendary gmaxwell. Landed here while exploring functional programming and how it's slowly gaining popularity for building products in the blockchain and cryptocurrency space.

Carlton Banks

legendary

Activity: 3430

Merit: 3083

Quote from: gmaxwell on October 13, 2019, 01:39:28 PM

I don't believe a second, compatible implementation of Bitcoin will ever be a good idea. So much of the design depends on all nodes getting exactly identical results in lockstep that a second implementation would be a menace to the network.

right

gmaxwell is quoting satoshi more or less verbatim above, I should've known better already!

gonna take some time to think about the responses upthread

gmaxwell

staff

Activity: 4326

Merit: 8951

I feel like the decision at blockstream to use rust for some things was, on balance, an error or maybe it only broke even. (And I say this as the person who was most personally responsible for it.) On one hand it did appear to allow building things that were not RCE vectors faster with fewer people and with less review, on the other hand it apparently resulted in there being significantly less review and development resources put into these efforts. Lots of time was lost due to managing toolchain insanity (which has gotten better in some respects, but worse in others). Lots of soft-testing got missed simply because toolchain friction meant that people weren't set up to build rust programs so they just didn't try them out unless their job required them to do so, when they otherwise would have.

The elements sidechain lost its one external signer because they couldn't be bothered to set up rust, and we had repeated problems, tied to rust, with employee-operated signers upgrading. Now, it's certainly possible that engineers at blockstream were particularly bad at handling this... and certain specific actions could have been taken to correct things-- e.g. mandatory two weeks of rust training for every engineer in the company even if they weren't expected to use rust in their job. But they were all drawn from the bitcoin community, so I don't think it speaks that well for how the bitcoin community would handle it, particularly because some of the corrective actions blockstream could have taken but didn't, like mandating exposure, aren't available for open development.

Even as I write that-- Matt has been asking me for the last week to try to receive his bitcoin header LoRa radio broadcast-- but the tool he wrote for it is written in rust, and the friction of installing a rust toolchain on the laptop that I can easily carry over to my antenna has kept me from trying it out, although I have the hardware and am interested.

Part of what that thread is proposing includes working around some of the worst insanity in the Rust ecosystem, e.g. the rust package manager, cargo, (and package ecosystem) is very much built around invisibly downloading and running huge graphs of dubious mystery-meat code, just like the situation in javascript and ruby which has been of late resulting in many security problems. The proposal there is to just not use cargo, which strikes me as a pretty much necessary idea. The downside, of course, is more local effort needed to 'swim against the stream' because the rest of the rust universe uses cargo pretty heavily.

I think there is a strong argument that the inherent safety of rust lets you spend more time on avoiding logic errors instead of making sure your program doesn't crash, but since it also reduces the population of developers and testers it can still be a loss overall. The kind of bugs we've experienced in Bitcoin are not the kind that rust structurally prevents, but instead are the kind that are still possible in rust. In fact, I can't recall a single instance of a bug shipped in Bitcoin Core that would have been structurally prevented in rust. The closest I can think of is the openssl heartbleed bug which was in a third-party library, written in another language, which would have still existed if Bitcoin used rust and just called that library. I wouldn't be too surprised to find I was mistaken and there were one or two I'm forgetting ... but in any case, it would be an extreme minority. There is still an argument that there is time being spent avoiding those bugs that could be redirected and development could occur faster when people didn't need to worry about that class of bug, but those benefits still have to overcome the baseline reduction in development.

Using it for things like auxiliary tools-- things which are essentially freestanding, developed independently, and likely would not share a lot of collective review to begin with-- might be a win on balance. There have long been parts of the software that are effectively isolated from some populations of developer-- e.g. a number of people who work on Bitcoin hardly review anything in the Qt GUI, so there are already components with their own developers. From the thread it sounds like Wladimir favours this sort of direction.

Unfortunately, there is a bit of a chicken-and-egg where primary usage depends on a level of ambient competence which is hard to get without primary usage. It may be that auxiliary usage helps build that.

Quote from: achow101 on October 13, 2019, 10:37:32 AM

The other point of contention is whether rust will actually reduce the number of major bugs in Core. C++ already does things that let us not have to worry about some memory things, so it isn't as bad as c where it is very easy to forget to free a pointer. But we still can and do get segfaults due to null pointer dereferences so rust would certainly help there. But if you look at a lot of the other bugs that have been in Core, most of them have been logic errors. Rust would not help with those, and it could potentially make them worse as fewer people know rust.

At the end of the day, I'm personally +0 on rust. I mostly don't care, but would not be opposed to having rust in Core. It would be nice to have better compile time memory protection, but I don't think that's a super big issue that really needs to be fixed.

Pretty much my view, leaning to -0 on mandatory consensus parts. It would be much more negative if I didn't like rust so much conceptually, and more positive if more of the developers whose contributions I consider critical were rust experts.

Quote from: Carlton Banks on October 13, 2019, 12:20:47 PM

So, why not alter the C++ and Rust implementations to allow them to share a block database? Either one could fall over, and we would hope that the other wouldn't fail in the same way (or for a different reason at the same time :D). Isn't that a more sensible way to approach this?

I don't believe a second, compatible implementation of Bitcoin will ever be a good idea. So much of the design depends on all nodes getting exactly identical results in lockstep that a second implementation would be a menace to the network.

Quote

it reminds me slightly of the memory managed concept in general; the people that promoted that stuff very quietly concede (or haughtily change the subject...) that it's not the magic it was sold as. The reality with Java and C# was that you actually did eventually need to understand the computer science behind memory allocation/deallocation, as the byte code compiler would make mistakes, or the "garbage collection" module in the runtime would destroy variables before they've even been used etc. And so the unhelpful response was "hey guys, but Java can still do pointers though!", which naturally gave cause to the more sensible people to wonder why they were going through all the trouble of using Java to begin with.

I don't know the details of how Rust handles memory allocation, and clearly there are accomplished developers (who seemingly know Rust well) who find the overall proposition convincing.

Rust is essentially the opposite of the managed memory in Java and C#. In those languages the idea is that the programmer is expected to ignore (and perhaps have no understanding of) memory management. It turns out that memory management is critical to the computer's operation and ignoring (and especially not understanding) it is toxic to writing software with finite resource usage or decent runtime latency and performance.

Instead, Rust treats memory essentially just like C++ does but then enforces with compile time code analysis, runtime boundary checking, and stylistic norms that you've actually handled the memory safely. You're also forced to write code that this analysis works successfully on, even when something else might be okay but it confuses the analysis. So instead of ignoring memory, you're required to pay the same attention to it you do in C++ and the correctness of your handling is enforced by the compiler.
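A minimal sketch of the two kinds of enforcement described above, compile-time ownership analysis and runtime boundary checking. The function names (`checked_get`, `consume`) and the values are made up for illustration and are not taken from any Bitcoin code.

```rust
/// Returns the element at `index`, or `None` when the index is out of range.
/// `slice::get` performs the bounds check at runtime.
fn checked_get(values: &[u32], index: usize) -> Option<u32> {
    values.get(index).copied()
}

/// Takes ownership of the buffer (hypothetical helper).
/// The memory is released when `buffer` goes out of scope at the end of
/// this function, without any garbage collector being involved.
fn consume(buffer: Vec<u8>) -> usize {
    buffer.len()
}

fn main() {
    let heights = vec![100, 200, 300];

    // In range: the value is returned.
    assert_eq!(checked_get(&heights, 1), Some(200));
    // Out of range: no read past the end of the allocation, just `None`.
    assert_eq!(checked_get(&heights, 7), None);
    // Plain indexing (`heights[7]`) would also be bounds checked, but it
    // would panic instead of returning `None`.

    let buffer = vec![0u8; 32];
    let len = consume(buffer);
    assert_eq!(len, 32);
    // Using `buffer` here would be rejected at compile time, because
    // ownership moved into `consume`:
    // println!("{}", buffer.len()); // error: borrow of moved value
}
```

The commented-out line at the end is the interesting part: the use-after-free that C++ would accept is a compile error here, which is the "correctness of your handling is enforced by the compiler" point.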

Rust also makes a number of other morally similar decisions that make it very different from java. For example, in Java exception handling is a common source of bugs because there is this unknowable implicit return type of any function you call... which maybe you could handle but usually you don't handle because you never even knew it was there. Many C++ codebases largely eschew exceptions or only use them in very specific confined ways for this reason, but it's hard to do so because the various libraries expect you to use them. Rust eschews exceptions entirely except for the special case of panic which kills a whole thread.
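To illustrate the point about exceptions, here is a small sketch in which every failure a function can produce is spelled out in its return type. The `HeightError` type, the `MAX_HEIGHT` limit and `parse_height` are hypothetical names chosen for the example.

```rust
/// Every way `parse_height` can fail is listed here, in the open.
#[derive(Debug, PartialEq)]
enum HeightError {
    NotANumber,
    TooLarge(u64),
}

/// Hypothetical upper limit, chosen only for illustration.
const MAX_HEIGHT: u64 = 10_000_000;

/// Parses a block height from text. The error is part of the signature,
/// so a caller cannot be surprised by a hidden exception path.
fn parse_height(text: &str) -> Result<u64, HeightError> {
    // `?` returns early with the error; nothing is thrown past the caller.
    let value = text
        .trim()
        .parse::<u64>()
        .map_err(|_| HeightError::NotANumber)?;
    if value > MAX_HEIGHT {
        return Err(HeightError::TooLarge(value));
    }
    Ok(value)
}

fn main() {
    // The caller has to deal with both outcomes to get at the value.
    match parse_height("600000") {
        Ok(height) => println!("height {}", height),
        Err(error) => println!("rejected: {:?}", error),
    }
}
```

Ignoring the returned `Result` altogether draws a compiler warning, and getting the value out requires handling (or explicitly unwrapping) the error case.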

You could even imagine getting the same benefits of rust in C++ with some sufficiently smart static analysis tools plus a set of replacement types that add the boundary checking. But the problem is that you'd have to constrain yourself to the subset of the language that the static analysis supports, only use the boundary checked types, etc. You'd effectively be using another language. Rust is morally pretty close to what that language would be, though the syntax has stylistic differences that probably result in more learning curve than an imaginary safe-C++-subset would have.

It's perhaps worth mentioning that some elements of the rust safety-- e.g. runtime checking of bounds-- still result in effective program crashes ('panics', technically your program could handle it by restarting the thread, if it makes sense to do so) but just guarantees that it actually crashes rather than runs on in some corrupted zombie state potentially executing hostile code. Obviously, crashing is also no good but this is less of a problem than it might sound because idiomatic rust code uses things like iterators and coding error states with sum-types (where the type checker forces you to handle all the cases) and such that make it so you are seldom doing anything which could produce a panic. The thing is, the same is true in modern C++ which is why in Bitcoin we've seen relatively few memory safety errors. In rust these norms are a lot stronger and the compiler helps assure that you don't screw it up. E.g. iterators prevent the most common causes of out-of-bounds accesses, but in C++ you have the problem of iterator invalidation when you write complicated code that mutates via iterators. In rust the compiler just guarantees you don't do that, including sometimes not letting you write perfectly valid code simply because it cannot prove it safe. Fortunately the cases where rust won't let you do a reasonable thing are rare, so the cost in having to work around the language isn't very great on average.
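A rough sketch of the idioms mentioned above: iterators instead of manual indexing, a sum type whose cases must all be handled, and the compiler rejecting mutation of a container while it is being iterated. The names (`Lookup`, `find_first_above`, `describe`, `drop_zeros`) are invented for the example.

```rust
/// A sum type: a lookup either found a value or it did not.
#[derive(Debug, PartialEq)]
enum Lookup {
    Found(u32),
    Missing,
}

/// Iterator-based search: there is no index that could run out of bounds.
fn find_first_above(values: &[u32], threshold: u32) -> Lookup {
    match values.iter().copied().find(|v| *v > threshold) {
        Some(v) => Lookup::Found(v),
        None => Lookup::Missing,
    }
}

/// The `match` must cover every variant; deleting an arm is a compile error.
fn describe(result: &Lookup) -> String {
    match result {
        Lookup::Found(v) => format!("found {}", v),
        Lookup::Missing => String::from("nothing found"),
    }
}

/// Removes zero entries in place.
fn drop_zeros(values: &mut Vec<u32>) {
    // The C++-style pattern of mutating while iterating is rejected by the
    // borrow checker, so iterator invalidation cannot happen:
    // for v in values.iter() {
    //     if *v == 0 { values.push(1); } // error: cannot borrow `*values` as mutable
    // }
    values.retain(|v| *v != 0);
}

fn main() {
    let mut data = vec![0, 3, 0, 8];
    drop_zeros(&mut data);
    println!("{}", describe(&find_first_above(&data, 5)));
}
```

None of these functions contains an operation that can panic on bad input, which is what "seldom doing anything which could produce a panic" looks like in practice.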

Unfortunately the art of computer language design and programming best practices are still essentially pre-scientific. Everywhere in programming there are taboos and rituals people recommend for making better software. There has been very little rigorous study characterizing the benefits of different approaches, so much of what people do when they try to make better software is pretty much superstition and the advocacy of it essentially religion-- which does a lot to explain the fervor that goes into it. That said, even primitive man knew that water was wet. It's pretty much unthinkable to me that the future would conclude that the areas that rust improves weren't worth improving, ... although I do suspect that much of the rust advocacy overplays the benefits.

It's conceivable to me that the effort competent developers spend avoiding and dealing with memory safety in C++ is, on a long term average, the same as the amount of effort consumed by satisfying the rust type system and borrow checker and working around cases rust won't allow. If so, the difference in a hypothetical world that invented and used rust instead of C++ would largely be reducing memory safety issues from rare to exceptionally rare (not zero due to unsafe blocks and compiler errors). This would be a worthy difference, but we're not in that world and it's less clear how to value that difference against the transition costs. Rust advocates would have you believe that it's significantly less effort and that they can be more productive as a result, and that might be true too but I haven't seen that much evidence of it, and as Carlton notes-- that sort of claim is common, made by java, c#, go ... lisp... haskell. And experience hasn't really supported those sorts of claims. There clearly are things some languages do better than others, but it seems that no efforts so far have really lived up to their advocates' claims of revolutionizing software development-- at least not to a level where they aren't dwarfed by the differences in productivity among individual developers. I think that I'd rather take Pieter writing in perl or assembly than I would take the vast majority of rust developers in rust.

Carlton Banks

legendary

Activity: 3430

Merit: 3083

Quote from: achow101 on October 13, 2019, 10:37:32 AM

Quote from: Carlton Banks on October 13, 2019, 05:31:26 AM

however, one plan is not to use a Rust compiler that is bootstrapped from a trustworthy source (Canonical's Rust compiler). Call me nuts if you so choose, but that seems like a very cavalier decision to make with software that should be putting security first. You can say "trusting Canonical is subjective", in which case, it should be ruled out altogether in such a critical piece of software as Bitcoin

IIRC the plan is/was to bootstrap rustc ourselves via guix. Although right now Bitcoin Core trusts Canonical for deterministic builds (Gitian uses Ubuntu), the plan is to move to guix for purely deterministic builds on all platforms (guix builds all dependencies deterministically). However, because we are currently trusting Canonical anyways, I think it was decided that it is okay to use rustc from Canonical until we get guix working.

sure, but this is only true of those who deploy the gitian builds (although it would not come as a surprise to learn that the majority do)

and even if (hypothetically) no-one was compiling the Bitcoin source code themselves, it still doesn't make sense to pile more trust into Canonical. Operating systems are a huge project, and so with dozens of contributors, there's a real possibility that the people handling autotools or gcc are trustworthy, while the rust person is crooked as hell (or incompetent, or inexperienced....)

So the proposition would be a real trade-off: hypothetical, unproven reliability gains versus investing more trust in a Linux implementation that is (arguably) already a little questionable on ethics (specifically Canonical made deals with commercial partners to bundle data grabbing plugins with their Firefox package, I wouldn't be shocked to hear of further such poor faith)

Quote from: achow101 on October 13, 2019, 10:37:32 AM

My impression was that it would be failover, then in parallel, and then possibly, the main implementation. So at some point, both the rust and c++ implementations would be used to cross-check against each other. But then again, I haven't followed this conversation too closely.

a complete rust implementation already exists, that can be done now (presumably it is). I'm not sure that really makes the case to put Rust into the C++ implementation; if anything, promoting the complete Rust implementation in some kind of tandem failover configuration with the C++ implementation makes a lot more sense to me.

So, why not alter the C++ and Rust implementations to allow them to share a block database? Either one could fall over, and we would hope that the other wouldn't fail in the same way (or for a different reason at the same time :D). Isn't that a more sensible way to approach this?

Quote from: achow101 on October 13, 2019, 10:37:32 AM

The main issue I have with rust in Bitcoin Core is just the fact that there will be far fewer reviewers. I personally would have to learn rust.

this is of course a chicken and egg problem, time resolves it. but considering this is Bitcoin, moving as slowly as possible in that direction seems like the prudent option.

Quote from: achow101 on October 13, 2019, 10:37:32 AM

The other point of contention is whether rust will actually reduce the number of major bugs in Core. C++ already does things that let us not have to worry about some memory things, so it isn't as bad as c where it is very easy to forget to free a pointer. But we still can and do get segfaults due to null pointer dereferences so rust would certainly help there. But if you look at a lot of the other bugs that have been in Core, most of them have been logic errors. Rust would not help with those, and it could potentially make them worse as fewer people know rust.

At the end of the day, I'm personally +0 on rust. I mostly don't care, but would not be opposed to having rust in Core. It would be nice to have better compile time memory protection, but I don't think that's a super big issue that really needs to be fixed.

it reminds me slightly of the memory managed concept in general; the people that promoted that stuff very quietly concede (or haughtily change the subject...) that it's not the magic it was sold as. The reality with Java and C# was that you actually did eventually need to understand the computer science behind memory allocation/deallocation, as the byte code compiler would make mistakes, or the "garbage collection" module in the runtime would destroy variables before they've even been used etc. And so the unhelpful response was "hey guys, but Java can still do pointers though!", which naturally gave cause to the more sensible people to wonder why they were going through all the trouble of using Java to begin with.

I don't know the details of how Rust handles memory allocation, and clearly there are accomplished developers (who seemingly know Rust well) who find the overall proposition convincing. But I really wonder how much time this would really save (or how many network-wide catastrophes could be avoided) versus how much time may be lost building up a dependency on Rust, only for everyone to change their mind in 3 years subsequent to the real-world practicalities becoming more apparent.

Dangling pointers causing segfaults are highly likely to manifest either frequently enough that they are quickly spotted, or sufficiently infrequently that they can be either safely ignored in the short-term or completely undiscovered. Is it really worth the huge effort to move to less well-known or understood programming languages just to solve that "problem"? In Bitcoin? :/

achow101

staff

Activity: 3458

Merit: 6793

Just writing some code

Quote from: Carlton Banks on October 13, 2019, 05:31:26 AM

however, one plan is not to use a Rust compiler that is bootstrapped from a trustworthy source (Canonical's Rust compiler). Call me nuts if you so choose, but that seems like a very cavalier decision to make with software that should be putting security first. You can say "trusting Canonical is subjective", in which case, it should be ruled out altogether in such a critical piece of software as Bitcoin

IIRC the plan is/was to bootstrap rustc ourselves via guix. Although right now Bitcoin Core trusts Canonical for deterministic builds (Gitian uses Ubuntu), the plan is to move to guix for purely deterministic builds on all platforms (guix builds all dependencies deterministically). However, because we are currently trusting Canonical anyways, I think it was decided that it is okay to use rustc from Canonical until we get guix working.

Quote from: Carlton Banks on October 13, 2019, 05:31:26 AM

The rationale for putting the specific Rust code into the Bitcoin codebase is sound; if headers fetching code fails in some unknown circumstance, maybe only the C++ implementation of the headers fetching code will fail, and the Rust headers fetching will continue to function without incident. Hell of a supposition to make, but it's somewhat reasonable, as there is some acceptance that Rust can be written in such a way that certain types of bug are less likely (but not impossible)

But this would make it too easy to say "let's just re-write the main implementation in Rust, piece by piece! After all, failover Rust code is working great so far!"

My impression was that it would be failover, then in parallel, and then possibly, the main implementation. So at some point, both the rust and c++ implementations would be used to cross-check against each other. But then again, I haven't followed this conversation too closely.

The main issue I have with rust in Bitcoin Core is just the fact that there will be far fewer reviewers. I personally would have to learn rust.

The other point of contention is whether rust will actually reduce the number of major bugs in Core. C++ already does things that let us not have to worry about some memory things, so it isn't as bad as c where it is very easy to forget to free a pointer. But we still can and do get segfaults due to null pointer dereferences so rust would certainly help there. But if you look at a lot of the other bugs that have been in Core, most of them have been logic errors. Rust would not help with those, and it could potentially make them worse as fewer people know rust.
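For the null pointer case specifically, a small sketch of how safe Rust avoids the problem: references cannot be null, and "might be absent" is expressed as an `Option` that has to be unpacked before use. The `Peer` struct and both functions are hypothetical and only serve the example.

```rust
use std::collections::HashMap;

/// Hypothetical record, used only for illustration.
struct Peer {
    address: String,
}

/// A lookup that may find nothing returns `Option`, not a nullable pointer.
fn peer_address(peers: &HashMap<u32, Peer>, id: u32) -> Option<&str> {
    peers.get(&id).map(|peer| peer.address.as_str())
}

/// The absent case has to be handled before the value can be used.
fn address_or_unknown(peers: &HashMap<u32, Peer>, id: u32) -> String {
    match peer_address(peers, id) {
        Some(address) => address.to_string(),
        None => String::from("unknown"),
    }
}

fn main() {
    let mut peers = HashMap::new();
    peers.insert(1, Peer { address: String::from("10.0.0.1:8333") });

    println!("{}", address_or_unknown(&peers, 1));
    println!("{}", address_or_unknown(&peers, 2));
}
```

This only covers the memory safety side; a logic error, such as looking up the wrong `id`, compiles just as happily in Rust as it does in C++.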

At the end of the day, I'm personally +0 on rust. I mostly don't care, but would not be opposed to having rust in Core. It would be nice to have better compile time memory protection, but I don't think that's a super big issue that really needs to be fixed.

Carlton Banks

legendary

Activity: 3430

Merit: 3083

Rust is a newer programming language, and has a reasonably good reputation:

liked for being more difficult to make mistakes with than C/C++ (although not impossible, of course)
liked for being sufficiently expressive, yet relatively simple compared to C/C++
disliked because compiling the actual Rust compiler was until recently not possible (we have GNU's guix system to thank for that)

So, there's been a Rust implementation of Bitcoin around for a while, and some good programmers actually preferred to spend their time re-implementing the existing Bitcoin client in Rust than to work on the reference C++ implementation. Which is a good plan in and of itself, as a proof of concept if nothing else.

There are now plans to implement Rust code into the main Bitcoin implementation: https://github.com/bitcoin/bitcoin/issues/17090

speaking as a non-expert on either language, but as a Bitcoin user who does understand something about coding and computer science, I'm unenthusiastic about this.

I understand the rationale:

Rust code has lower review burden
The plan is to duplicate non-consensus C++ code, to add redundancy to network consistency/reliability
Rust is proven these days, compiler can be bootstrapped

that's all ok.

however, one plan is not to use a Rust compiler that is bootstrapped from a trustworthy source (Canonical's Rust compiler). Call me nuts if you so choose, but that seems like a very cavalier decision to make with software that should be putting security first. You can say "trusting Canonical is subjective", in which case, it should be ruled out altogether in such a critical piece of software as Bitcoin

my other concern would be the "thin end of the wedge" argument.

The rationale for putting the specific Rust code into the Bitcoin codebase is sound; if headers fetching code fails in some unknown circumstance, maybe only the C++ implementation of the headers fetching code will fail, and the Rust headers fetching will continue to function without incident. Hell of a supposition to make, but it's somewhat reasonable, as there is some acceptance that Rust can be written in such a way that certain types of bug are less likely (but not impossible)

But this would make it too easy to say "let's just re-write the main implementation in Rust, piece by piece! After all, failover Rust code is working great so far!"

In the end, such a radical change (I'm sure that statement will be a point of contention, but it's essentially self-proving by virtue of the fact that this is a controversial change of direction) was always bound to be divisive.

The developers who are keen to send Bitcoin in this direction should realise that there's no real practical difference between doing something divisive for good reasons, or doing something divisive to cause problems in the project.

usual thread rules, + 1 extra:

no trolling
no trolls
discussing personalities or intentions or politics of those debating is not allowed, and will be removed, e.g. "Bjarne Stroustrup says x about Rust" is disallowed, everything must be provable in its own right, here in the thread, written by you

the technical merits and the consequences to how the Bitcoin project is managed are the focus of this thread
