Topic: Implementing a full node from scratch for educational purposes. Advice/thoughts? (Read 325 times)

newbie
Activity: 15
Merit: 0
This sounds very interesting. Please keep us updated and share your final result once you're done! I wish you all the best.
legendary
Activity: 2898
Merit: 1823
OP, talk to some of the Decred core developers. They have an implementation written from scratch in Go. Contributing there might be a better learning path before starting from scratch yourself: https://github.com/btcsuite/btcd
legendary
Activity: 2053
Merit: 1356
aka tonikt
Start from a block chain parser - one that processes block chain data stored on disk to produce the UTXO database.
Then make sure that your consensus validation functions properly handle all the test vectors (for scripts and transactions) from Bitcoin Core.
https://github.com/bitcoin/bitcoin/tree/master/src/test/data
Then you can start adding the network/p2p part, which is going to be even harder as debugging it will be a bitch.

Excellent advice, thanks! So you suggest parsing the files that bitcoind generates in the data directory?
Sure.
Although you may prefer to first convert them into files that are more convenient to use.
You can also download raw block data from some popular block explorers.
Just start with e.g. the first 100000 blocks and see how it goes.

Also keep in mind that a full node will need to be able to undo blocks.
You're basically building a huge state machine - its state is kept in the UTXO database.
We are talking about a database with tens of millions of records, so you need a fast and capable key-value database (like LevelDB, which is the most commonly used).
Each new block changes the content of this database. To undo a block means to restore the UTXO database to the state it was in before that block was applied to it.
At the beginning don't worry about undoing blocks, just keep in mind that you will have to add this functionality at some point.
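
A minimal Go sketch of the apply/undo idea described above. Everything in it is an assumption made for illustration: the type names (`UTXOSet`, `UndoData`, ...) are hypothetical, an in-memory map stands in for a real key-value store such as LevelDB, and transactions are reduced to the outputs they spend and create (no scripts, no signatures, no coinbase rules). The point is only that each block produces a log of changes, and replaying that log backwards restores the previous state.

```go
package main

import "fmt"

// OutPoint identifies one transaction output (hypothetical simplified type).
type OutPoint struct {
	TxID  string // hex txid kept as a string for brevity
	Index uint32
}

// TxOut is reduced to the amount in satoshis; a real one also carries a script.
type TxOut struct {
	Value int64
}

// Tx is a simplified transaction: what it spends and what it creates.
type Tx struct {
	ID      string
	Inputs  []OutPoint
	Outputs []TxOut
}

// UTXOSet is the state of the state machine.
type UTXOSet map[OutPoint]TxOut

// change is one logged modification of the UTXO set.
type change struct {
	op      OutPoint
	out     TxOut
	created bool // true: output was added, false: output was spent
}

// UndoData is the ordered change log of one block.
type UndoData []change

// ApplyBlock spends the inputs and adds the outputs of every transaction.
// If an input is missing, all partial changes are rolled back.
func (u UTXOSet) ApplyBlock(txs []Tx) (UndoData, error) {
	var undo UndoData
	for _, tx := range txs {
		for _, in := range tx.Inputs {
			out, ok := u[in]
			if !ok {
				u.UndoBlock(undo)
				return nil, fmt.Errorf("tx %s spends missing output %s:%d", tx.ID, in.TxID, in.Index)
			}
			delete(u, in)
			undo = append(undo, change{op: in, out: out})
		}
		for i, out := range tx.Outputs {
			op := OutPoint{TxID: tx.ID, Index: uint32(i)}
			u[op] = out
			undo = append(undo, change{op: op, out: out, created: true})
		}
	}
	return undo, nil
}

// UndoBlock replays the change log backwards, so outputs that were created
// and spent inside the same block are handled correctly.
func (u UTXOSet) UndoBlock(undo UndoData) {
	for i := len(undo) - 1; i >= 0; i-- {
		c := undo[i]
		if c.created {
			delete(u, c.op)
		} else {
			u[c.op] = c.out
		}
	}
}

func main() {
	utxo := UTXOSet{OutPoint{TxID: "aa", Index: 0}: TxOut{Value: 50}}
	block := []Tx{{
		ID:      "bb",
		Inputs:  []OutPoint{{TxID: "aa", Index: 0}},
		Outputs: []TxOut{{Value: 30}, {Value: 20}},
	}}
	undo, err := utxo.ApplyBlock(block)
	if err != nil {
		fmt.Println("apply failed:", err)
		return
	}
	fmt.Println("unspent outputs after apply:", len(utxo))
	utxo.UndoBlock(undo)
	fmt.Println("unspent outputs after undo:", len(utxo))
}
```

In a real node the undo data would be written to disk next to the block, so that a reorganisation can roll back several blocks in a row.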
legendary
Activity: 3416
Merit: 1912
The Concierge of Crypto
I'd set up a regular full node running the reference client software, so you can at least connect your from-scratch experimental node to it. That would save some headaches too: you know the reference node is working while you figure out your own version.
newbie
Activity: 7
Merit: 4
Start from a block chain parser - one that processes block chain data stored on disk to produce the UTXO database.
Then make sure that your consensus validation functions properly handle all the test vectors (for scripts and transactions) from Bitcoin Core.
https://github.com/bitcoin/bitcoin/tree/master/src/test/data
Then you can start adding the network/p2p part, which is going to be even harder as debugging it will be a bitch.

Excellent advice, thanks! So you suggest parsing the files that bitcoind generates in the data directory?
legendary
Activity: 2053
Merit: 1356
aka tonikt
So I'd like to ask: what do you generally think about this idea? Do you have any tips you could share? Or do you know any other good resources that could help?

Many have tried - only a few actually made it.
I wish you all the best.

Start from a block chain parser - one that processes block chain data stored on disk to produce the UTXO database.
Then make sure that your consensus validation functions properly handle all the test vectors (for scripts and transactions) from Bitcoin Core.
https://github.com/bitcoin/bitcoin/tree/master/src/test/data
Then you can start adding the network/p2p part, which is going to be even harder as debugging it will be a bitch.
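
As a possible first step for such a parser, here is a small Go sketch that decodes an 80-byte block header and computes its hash. The names `BlockHeader`, `ParseHeader`, `BlockHash` and `genesisHeader` are hypothetical helpers, not taken from any existing codebase. The input used is the serialized mainnet genesis block header; reading the block files from disk and parsing transactions are left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// BlockHeader holds the six fields of a serialized 80-byte header.
type BlockHeader struct {
	Version    int32
	PrevBlock  [32]byte // stored in wire (internal) byte order
	MerkleRoot [32]byte // stored in wire (internal) byte order
	Timestamp  uint32
	Bits       uint32
	Nonce      uint32
}

// ParseHeader decodes a raw header; all integers are little-endian on the wire.
func ParseHeader(raw []byte) (BlockHeader, error) {
	var h BlockHeader
	if len(raw) != 80 {
		return h, errors.New("block header must be exactly 80 bytes")
	}
	h.Version = int32(binary.LittleEndian.Uint32(raw[0:4]))
	copy(h.PrevBlock[:], raw[4:36])
	copy(h.MerkleRoot[:], raw[36:68])
	h.Timestamp = binary.LittleEndian.Uint32(raw[68:72])
	h.Bits = binary.LittleEndian.Uint32(raw[72:76])
	h.Nonce = binary.LittleEndian.Uint32(raw[76:80])
	return h, nil
}

// BlockHash returns the double-SHA256 of the header, byte-reversed,
// which is how block hashes are conventionally displayed.
func BlockHash(raw []byte) string {
	first := sha256.Sum256(raw)
	second := sha256.Sum256(first[:])
	for i, j := 0, len(second)-1; i < j; i, j = i+1, j-1 {
		second[i], second[j] = second[j], second[i]
	}
	return hex.EncodeToString(second[:])
}

// genesisHeader returns the serialized mainnet genesis block header.
func genesisHeader() []byte {
	s := "01000000" + // version 1
		strings.Repeat("00", 32) + // no previous block
		"3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a" + // merkle root
		"29ab5f49" + // timestamp
		"ffff001d" + // bits
		"1dac2b7c" // nonce
	raw, err := hex.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return raw
}

func main() {
	raw := genesisHeader()
	h, err := ParseHeader(raw)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("version=%d time=%d bits=%08x nonce=%d\n", h.Version, h.Timestamp, h.Bits, h.Nonce)
	fmt.Println("hash:", BlockHash(raw))
}
```

Comparing the printed hash with what a running bitcoind reports for block 0 is a cheap way to check that the byte order is handled correctly.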

Don't trust any documentation.
At many points you will have to study the Bitcoin Core source code and/or the running client itself to find out how things have to work.
So make sure you have a running Bitcoin Core, one that you built from source yourself, which you can debug to see how it does stuff.
Also, ask people - there are plenty of people who know how things work and they can save you a lot of time. Don't mind if they aren't very nice to you sometimes - they can still help you :)
legendary
Activity: 2870
Merit: 7490
Crypto Swap Exchange
Don't forget that Bitcoin development focuses on backward compatibility, so there are many versions of outputs (P2PK, P2PKH, P2WSH, etc.), scripting (newer opcodes, opcodes which used to be valid, etc.) and node features (network, filter, bloom, etc.).

For example, you need to strip the witness (signature) part of a SegWit transaction when sending a transaction or block to an old node.
kzv
legendary
Activity: 1722
Merit: 1285
OpenTrade - Open Source Cryptocurrency Exchange
Hello,

I'm a software engineer who wants to get started with Bitcoin development. So far I've read the whitepaper and the recent Schnorr & Taproot proposals. That said, I've never actually used a full node, apart from one bitcoind instance that I've set up on a Raspberry Pi (but never used as a wallet). Generally, I'd say I'm familiar with the basics (how it all comes together).

I thought of implementing a (non-mining) full node from scratch as a way to learn the ins and outs of the protocol, the network and the Script language.

I've gathered some resources that I think could be useful:

- https://bitcoin.org/en/developer-guide
- https://en.bitcoin.it/wiki/Main_Page
- https://github.com/jonatack/bitcoin-development/
- btcd codebase (because I'm experienced with Go)

So I'd like to ask: what do you generally think about this idea? Do you have any tips you could share? Or do you know any other good resources that could help?

Thanks in advance


First of all: bitcoin is not one but several different technologies.
The base Bitcoin technology is P2P network programming. This part was not a revolution by Satoshi Nakamoto. You can read about what P2P is, its history and how P2P protocols work if you google "napster", "kazaa", "gnutella".
You can read the documentation for the Bitcoin P2P network protocol here: https://en.bitcoin.it/wiki/Protocol_documentation
legendary
Activity: 3472
Merit: 10611
Well, if you want to "implement" it yourself, then in my opinion looking at existing source code is not going to be constructive, because once you start looking at source code that is all you do from then on: you end up "copying code". If you want to understand the technical aspects, implement something and gain some experience in the process, then look only at the documentation; if you get stuck somewhere and can't find the answer online, then look at the source code. And if there is a part you don't want to implement yourself (like SHA256), use a library that is already compiled instead of copying code.
newbie
Activity: 7
Merit: 4

Quote
To clarify: I'm not looking for implementation details (e.g. programming paradigms or which programming language to choose) but "meta" advice like "be careful with endianness when writing/reading messages from the wire".

Endianness is pretty easy to find in the code though. Perhaps you could've specified what you're asking in the OP - we can't read your mind from text...



Excuse me. I guess I'm looking for advice specific to Bitcoin development (the protocol, the cryptography, the network), as opposed to generic programming advice. An example would be "don't try to roll your own crypto library, use library X" vs. "don't use OOP".

As for the language, I'm going to go with Go.

Thanks!
copper member
Activity: 2856
Merit: 3071
https://bit.ly/387FXHi lightning theory
@pooya, I just mean don't go overboard with the classes you're making. The only classes you really need are the ones for structures that the code actually requires... I've been doing a lot of modelling recently, so maybe it's just me splitting everything up and then splitting it down further, but, as I say, don't go overboard with what you're doing. Refine things down to a logical design in order for OOP to work (I'll try and fix up that original post in a bit).

Thanks for the advice, everyone! It's encouraging.

To clarify: I'm not looking for implementation details (e.g. programming paradigms or which programming language to choose) but "meta" advice like "be careful with endianness when writing/reading messages from the wire".

Endianness is pretty easy to find in the code though. Perhaps you could've specified what you're asking in the OP - we can't read your mind from text... For a project like this, language dependency might be something you want to look at soon. Encoding large integers is much easier in some languages than in others, but the languages where it's easier tend to be a lot slower...

newbie
Activity: 7
Merit: 4
Thanks for the advice, everyone! It's encouraging.

To clarify: I'm not looking for implementation details (e.g. programming paradigms or which programming language to choose) but "meta" advice like "be careful with endianness when writing/reading messages from the wire".
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
2 is pretty reasonable too. Split things up into functions and not objects. Classes are still a thing but structs not so much. The only structs you'll need at a fundamental level are the packet frames, merkle root trees and keys.

I honestly don't get why OOP would be something to avoid! There is nothing wrong with it, and in fact it is one of the best software design approaches that you can use.
OOP has two faces: a decent original one, encapsulation, and an ugly, over-used one, inheritance. As far as Bitcoin development is concerned, the latter is irrelevant. Inheritance is adored mostly by people who care too much about reusability: corporates, big fat project architects, ...

Satoshi did a very good job not relying on inheritance and bitcoin code is robust and elegant because of this.
legendary
Activity: 3472
Merit: 10611
2 is pretty reasonable too. Split things up into functions and not objects. Classes are still a thing but structs not so much. The only structs you'll need at a fundamental level are the packet frames, merkle root trees and keys.

I honestly don't get why OOP would be something to avoid! There is nothing wrong with it, and in fact it is one of the best software design approaches that you can use.
As for classes versus structs, if you are talking about C# then I think you may need to revisit their differences; what you said is a bit odd. Structs are pretty useful when you want a value-type object. In many places, such as EC points, you should use them; for certain data types, such as the locktime field in a transaction, you can use them; and there are a lot more cases.
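
Since the OP plans to use Go rather than C#, here is a rough Go analogue of the value-type idea for the locktime field: a small named type with methods instead of a bare integer. `LockTime` and its methods are hypothetical names for this sketch. The one protocol fact it relies on is that locktime values below 500000000 are interpreted as block heights and values at or above it as Unix timestamps.

```go
package main

import (
	"fmt"
	"time"
)

// LockTime wraps the 32-bit nLockTime field of a transaction.
// It is a plain value type: copying it copies the number.
type LockTime uint32

// lockTimeThreshold separates block heights from Unix timestamps.
const lockTimeThreshold = 500000000

// IsBlockHeight reports whether the value is interpreted as a block height.
func (lt LockTime) IsBlockHeight() bool {
	return lt < lockTimeThreshold
}

// String renders the value according to its interpretation.
func (lt LockTime) String() string {
	if lt.IsBlockHeight() {
		return fmt.Sprintf("block height %d", uint32(lt))
	}
	return "time " + time.Unix(int64(lt), 0).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(LockTime(650000))
	fmt.Println(LockTime(1600000000))
}
```

Keeping the interpretation rule inside the type means the rest of the code never compares against the magic threshold directly.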
copper member
Activity: 2856
Merit: 3071
https://bit.ly/387FXHi lightning theory
I was looking at doing this in C# over the summer (until my laptop broke). I need to pick it back up again though...

You'd have to pick a C-based language if you want anything that'll run quickly (Go, C#, Java, C or C++).

1- Read the bitcoin core code carefully.

2- Don't follow the conventional software development ideas that focus too much on software reuse, i.e. forget about OOP.

1.b. As a suggestion I was given: don't copy the code exactly, as that might carry over security flaws, or add new ones if you're changing the language. Some compilers handle things differently.

For 2, OOP can be useful if you know what you're doing and don't break everything down too much. If you get down to a basic set of functions, then inheritance is also probably the way to go, at least for a lot of the hashing functions, since the hashing algorithms are used for a few different tasks.
legendary
Activity: 1456
Merit: 1175
Always remember the cause!
Hi, welcome to this site :)
Good spirit, I appreciate it. Two pieces of advice:

1- Read the bitcoin core code carefully.

2- Don't follow the conventional software development ideas that focus too much on software reuse, i.e. forget about OOP.
newbie
Activity: 7
Merit: 4
Hello,

I'm a software engineer who wants to get started with Bitcoin development. So far I've read the whitepaper and the recent Schnorr & Taproot proposals. That said, I've never actually used a full node, apart from one bitcoind instance that I've set up on a Raspberry Pi (but never used as a wallet). Generally, I'd say I'm familiar with the basics (how it all comes together).

I thought of implementing a (non-mining) full node from scratch as a way to learn the ins and outs of the protocol, the network and the Script language.

I've gathered some resources that I think could be useful:

- https://bitcoin.org/en/developer-guide
- https://en.bitcoin.it/wiki/Main_Page
- https://github.com/jonatack/bitcoin-development/
- btcd codebase (because I'm experienced with Go)

So I'd like to ask: what do you generally think about this idea? Do you have any tips you could share? Or do you know any other good resources that could help?

Thanks in advance