Topic: [BETA] Bitcoin blockchain torrent - page 4. (Read 57771 times)

legendary
Activity: 1596
Merit: 1100
October 14, 2012, 02:23:36 PM
#45
If the results of this experiment are positive, what are the chances we could modify the client so that it downloads the initial blockchain using this method instead of the current system? Is that a long term goal of this?

Not at this time.  It is possible that somebody might create a bitcoin-firstrun.exe application, separate from bitcoin, that torrents the blockchain.

For most users the initial blockchain download will be a one-time event, so there is little interest in directly adding bittorrent code to the bitcoin client.

The import-bootstrap.dat feature was added so that bitcoin does not have to care about the source of the data.  As long as you have the file, it will import it.  Maybe you downloaded the file via torrent... or maybe HTTP.  The point is, from the bitcoin client's perspective, it is agnostic to the download method.

sr. member
Activity: 462
Merit: 250
October 14, 2012, 02:15:15 PM
#44
If the results of this experiment are positive, what are the chances we could modify the client so that it downloads the initial blockchain using this method instead of the current system? Is that a long term goal of this?
legendary
Activity: 1596
Merit: 1100
October 14, 2012, 02:10:43 PM
#43
Version 0.7.1, which just entered testing, includes a new feature:  If the file "bootstrap.dat" is found in the bitcoin data directory, it will validate and import all blockchain data found in that file.

So what is the difference vs this procedure? No need to use -loadblock and faster download, given there are many fast seeders?

The difference between 0.7 and 0.7.1 is that 0.7.1 automatically runs "-loadblock=bootstrap.dat" at startup.

The torrent will probably be a faster download...  but if you have an ultrafast network peer, the regular download will be just as fast.  This torrent just adds an option for users; it is not the New Official Recommended Means for getting the blockchain.  As the OP emphasizes, this is an experiment.

legendary
Activity: 980
Merit: 1008
October 13, 2012, 08:52:51 PM
#42

Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


How do I use that script?

Should some of these variables reference my .bitcoin directory somehow?

The .bitcoin directory is for a different app.

pynode is a full bitcoin client, separate from bitcoind.  The script mkbootstrap.py requires access to the pynode database, after you have downloaded all the blocks.

Sadly you do need to be a bit of a programmer to generate a bootstrap.dat file.


I think I've got pynode synchronized with the network now. The blocks.dat file is 3.5 GB after letting it connect to a local instance of bitcoin-qt and waiting a bit. mkbootstrap.py is running now.

I guess loading the config file should be separated out into a module that can be loaded by mkbootstrap.py. But I'm not sure how to do that elegantly (I'd just create a function that returns the settings dict given the path to the config file).

EDIT: Looks like it succeeded. I'm now seeding the bootstrap.dat file as well.
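
For what it's worth, the "function that returns the settings dict" idea could look roughly like the sketch below. It assumes the config file is plain key=value lines with # comments, which is my assumption for illustration and not something confirmed about pynode here. The names load_settings and load_settings_from_path are hypothetical.

```python
def load_settings(config_file):
    """Parse key=value lines from an open text file into a dict.

    Assumption: one 'key=value' pair per line, '#' starts a comment line.
    This layout is illustrative, not a documented pynode format.
    """
    settings = {}
    for line in config_file:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        key, sep, value = line.partition('=')
        if not sep:
            continue  # skip lines that have no '=' at all
        settings[key.strip()] = value.strip()
    return settings


def load_settings_from_path(path):
    """Convenience wrapper: open the file at 'path' and parse it."""
    with open(path) as fh:
        return load_settings(fh)
```

Both the main program and mkbootstrap.py could then import this one module instead of each carrying its own parsing loop.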
hero member
Activity: 602
Merit: 508
Firstbits: 1waspoza
October 13, 2012, 06:37:19 PM
#41
Seeding 24h a day on my server, which sits on a 100 Mbps link.

legendary
Activity: 1596
Merit: 1100
October 13, 2012, 06:08:37 PM
#40

Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


How do I use that script?

Should some of these variables reference my .bitcoin directory somehow?

The .bitcoin directory is for a different app.

pynode is a full bitcoin client, separate from bitcoind.  The script mkbootstrap.py requires access to the pynode database, after you have downloaded all the blocks.

Sadly you do need to be a bit of a programmer to generate a bootstrap.dat file.

legendary
Activity: 980
Merit: 1008
October 13, 2012, 05:52:51 PM
#39

Quote
I wonder if it could be generated from their existing blockchain (is it basically the same file?) which will presumably be pretty up-to-date.

Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

Each time the blockchain torrent is updated, seeders may run that script to guarantee they have 100% of bootstrap.dat immediately.  A nice, decentralized solution :)


How do I use that script?

Should some of these variables reference my .bitcoin directory somehow?

Code:
NET_SETTINGS = {
    'mainnet' : {
        'log' : '/spare/tmp/mkbootstrap.log',
        'db' : '/spare/tmp/chaindb'
    },
    'testnet3' : {
        'log' : '/spare/tmp/mkbootstraptest.log',
        'db' : '/spare/tmp/chaintest'
    }
}

I get:

Code:
Traceback (most recent call last):
  File "mkbootstrap.py", line 36, in <module>
    log = Log.Log(SETTINGS['log'])
  File "/home/rune/Programming/pynode/Log.py", line 15, in __init__
    self.fh = open(filename, 'a+', 0)
IOError: [Errno 2] No such file or directory: '/spare/tmp/mkbootstrap.log'
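
A note on that traceback: opening a file in append mode creates the file if it is missing, so Errno 2 here means the directory /spare/tmp/ itself does not exist on this machine. Those paths are presumably specific to the script author's setup, so the 'log' and 'db' entries need to point somewhere that exists locally. A minimal sketch of a more forgiving way to open the log follows, written for current Python 3 (the script in the traceback is Python 2). The helper name open_log is hypothetical and not part of pynode.

```python
import os


def open_log(filename):
    """Open a log file for appending, creating its parent directory first.

    Hypothetical helper for illustration; not part of pynode's Log module.
    """
    parent = os.path.dirname(filename)
    if parent:
        # exist_ok=True makes this a no-op when the directory is already there
        os.makedirs(parent, exist_ok=True)
    return open(filename, 'a+')
```

The simpler fix is of course to edit NET_SETTINGS so that 'log' and 'db' live under a directory you already have.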
legendary
Activity: 1596
Merit: 1100
October 13, 2012, 10:35:04 AM
#38
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?

The best place to watch is probably this thread, though I am open to other suggestions.

legendary
Activity: 1221
Merit: 1025
e-ducat.fr
October 13, 2012, 05:23:24 AM
#37
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
+1
and many thanks to jgarzik for this useful development
full member
Activity: 196
Merit: 100
October 13, 2012, 04:37:20 AM
#36
Great idea, I am seeding with my 2Mb. Just a suggestion: I would love to be notified when a new torrent is created so I won't be seeding an obsolete one. Could someone create a mailing list for that purpose?
legendary
Activity: 2576
Merit: 2267
1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
October 12, 2012, 08:33:50 PM
#35
Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:

magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80


Good deal. Unless a torrent is marked as private, the DHT will kick in if the trackers ever stop working (though I'm not sure if it's possible to nobble them?)
foo
sr. member
Activity: 409
Merit: 250
October 12, 2012, 05:35:04 PM
#34
Good idea, except the trackerless part, IMHO. Here's a magnet link to the same torrent, with 4 public trackers added:

magnet:?xt=urn:btih:0bb0521942f586ed96203c6f4d136324756f8a9a&dn=bootstrap.dat&tr=udp://tracker.openbittorrent.com:80&tr=udp://tracker.publicbt.com:80&tr=udp://tracker.ccc.de:80&tr=udp://tracker.istole.it:80
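
For anyone who wants to check what is in that link before pasting it into a client, here is a small sketch that pulls out the info-hash, the display name and the tracker list using only the Python standard library. The function name parse_magnet is mine; it handles just the xt, dn and tr parameters seen in the link above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_magnet(uri):
    """Return (info_hash_hex, display_name, trackers) from a magnet URI.

    Illustrative helper: only the xt, dn and tr parameters are handled.
    """
    parts = urlsplit(uri)
    if parts.scheme != 'magnet':
        raise ValueError('not a magnet URI')
    params = parse_qs(parts.query)
    xt = params.get('xt', [''])[0]
    prefix = 'urn:btih:'
    if not xt.startswith(prefix):
        raise ValueError('no BitTorrent info-hash in xt')
    info_hash = xt[len(prefix):].lower()
    name = params.get('dn', [None])[0]
    trackers = params.get('tr', [])  # one entry per tr= parameter
    return info_hash, name, trackers
```

The info-hash is the same with or without trackers, which is why this link and the trackerless torrent refer to the same data.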
legendary
Activity: 1120
Merit: 1164
October 12, 2012, 02:52:00 PM
#33
Does anyone know if bittorrent can share streams between multiple versions of the same file?

It depends on your definition of "share"... locally or remotely?

Remotely

A single torrent is simply a hash-of-hashes.  Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.

Your client may be modified to share streams which are multiple versions of the same file.  Would probably only need some small mods to existing clients, and would not break the network protocol.  So from that perspective: "yes"

Remote clients will see different hashes, and assume that each stream is separate and independent of each other.  So from that perspective: "no"

Hmm... that's pretty much what I expected. Anyway I thought about it some more, and I think I have a way for my application to even deal with divergent versions of the file, really divergent trees, which bittorrent *definitely* doesn't support. It'd be a very nice feature, so at that point I might as well just bite the bullet and hack bittorrent as required. (or invent Yet Another Peer-to-Peer Network)
kjj
legendary
Activity: 1302
Merit: 1026
October 12, 2012, 02:47:59 PM
#32

Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.

Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.

Don't think that will work.  Most bittorrent clients check the file size, etc.  It would be super cool if some torrent client would be willing to serve matching chunks out of a file, even if the overall file is wrong.  But none do that I'm aware of.

These are specially cleaned block files.  These are sequential, have no orphans, and no inter-block garbage.  For virtually everyone on the planet, the first N bytes of their actual block files won't match these.  jgarzik has already published his script, and I hope to publish mine soon.  You can use them to recreate the file from your block database without having to download it.
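
To give an idea of what "sequential, no inter-block garbage" means in practice, here is a rough reader sketch. It assumes each record in bootstrap.dat is laid out like a record in the client's own block files: 4 bytes of network magic, a 4-byte little-endian length, then the raw block. That layout is my understanding and is not spelled out in this thread, so treat it as an assumption. The name iter_blocks is hypothetical.

```python
import struct

# Assumption: main network magic bytes as used in the client's block files.
MAINNET_MAGIC = bytes.fromhex('f9beb4d9')


def iter_blocks(stream, magic=MAINNET_MAGIC):
    """Yield raw serialized blocks from a bootstrap.dat-style binary stream.

    Raises ValueError on anything that is not magic + length + block,
    which is what a "clean" file should never contain.
    """
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8 or header[:4] != magic:
            raise ValueError('bad record header')
        (size,) = struct.unpack('<I', header[4:])
        block = stream.read(size)
        if len(block) != size:
            raise ValueError('truncated block')
        yield block
```

A file with padding or junk between records would make this reader stop with an error at the first bad header.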
legendary
Activity: 1596
Merit: 1100
October 12, 2012, 02:45:10 PM
#31
Does anyone know if bittorrent can share streams between multiple versions of the same file?

It depends on your definition of "share"... locally or remotely?

A single torrent is simply a hash-of-hashes.  Each stream is a different torrent, with different hashes, even if torrent A is a strict subset of torrent B.

Your client may be modified to share streams which are multiple versions of the same file.  Would probably only need some small mods to existing clients, and would not break the network protocol.  So from that perspective: "yes"

Remote clients will see different hashes, and assume that each stream is separate and independent of each other.  So from that perspective: "no"
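
To make the hash-of-hashes point concrete, here is a rough sketch. It builds a single-file info dictionary with the usual keys (length, name, piece length, pieces) and takes the SHA-1 of its bencoded form, which is how the torrent's identifying hash is derived. The helper names bencode and info_hash and the toy data are mine, for illustration only; real .torrent files carry more fields.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, bytes and dicts with bytes keys."""
    if isinstance(value, int):
        return b'i' + str(value).encode() + b'e'
    if isinstance(value, bytes):
        return str(len(value)).encode() + b':' + value
    if isinstance(value, dict):
        out = b'd'
        for key in sorted(value):  # keys must be in sorted order
            out += bencode(key) + bencode(value[key])
        return out + b'e'
    raise TypeError('unsupported type')


def info_hash(data, name, piece_length):
    """Return (info-hash hex, concatenated piece hashes) for one file."""
    pieces = b''.join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length))
    info = {b'length': len(data), b'name': name,
            b'piece length': piece_length, b'pieces': pieces}
    return hashlib.sha1(bencode(info)).hexdigest(), pieces


# Toy data: the new file is the old file plus appended bytes.
old = bytes(range(256)) * 4
new = old + b'x' * 300
old_hash, old_pieces = info_hash(old, b'bootstrap.dat', 256)
new_hash, new_pieces = info_hash(new, b'bootstrap.dat', 256)
```

Here new_pieces starts with old_pieces, yet old_hash and new_hash are unrelated, so remote peers have no way to tell that the two swarms hold overlapping data.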

legendary
Activity: 2576
Merit: 2267
1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
October 12, 2012, 02:38:51 PM
#30

Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.

Yes. I was talking about generating the file without having to download it. No point having torrents download the missing pieces when you already have them sitting in the blockchain on your hard-drive.
legendary
Activity: 2576
Merit: 2267
1RichyTrEwPYjZSeAYxeiFBNnKC9UjC5k
October 12, 2012, 02:26:04 PM
#29

Now currently the script has 193,000 hardcoded

Ah, this is what I was talking about.
kjj
legendary
Activity: 1302
Merit: 1026
October 12, 2012, 01:38:48 PM
#28
Yes, this is open data and an open file format.  Seeders may independently generate a byte-for-byte identical bootstrap.dat by running https://github.com/jgarzik/pynode/blob/master/mkbootstrap.py

I also have a PHP (!) script that parses the block chain and makes a clean sequential bootstrap.dat file.  It is ugly and slow, but I wanted to have an independent verification.  Our two scripts came up with identical files.

And by ugly, I mean embarrassingly ugly, like I'd be ashamed to let anyone see it.  If I have time this weekend, I'll clean it up and post it.
legendary
Activity: 1120
Merit: 1164
October 12, 2012, 01:36:59 PM
#27
This is special to our use case:  bootstrap.dat is essentially an append-only file.  Blocks are simply concatenated onto the end.

Today's torrent at height 193000 is 2,491,771,562 bytes in size.

The next torrent, a few months from now, will have the same first 2,491,771,562 bytes.

Thus, to bittorrent, the next torrent will simply appear to be a truncated / not fully downloaded bootstrap.dat.  Bittorrent is built to fill in the missing pieces of a file, so that is what it does here :)

Does anyone know if bittorrent can share streams between multiple versions of the same file?

I mean, let's suppose we publish the torrent for the first x bytes, add y bytes to the file, then publish another torrent for the new version. Will people downloading the new, longer torrent be able to request blocks from people running clients that have only downloaded the shorter torrent? There does exist a Bittorrent streaming protocol, TS Engine, but as far as I can tell it's purely block based and doesn't efficiently handle the case where every client needs the whole stream, right from the beginning. I know internally bittorrent can identify blocks that it already has using a merkle tree system, but the tree can only have one tip. (1)

It's not a very important optimization for bitcoin, just publishing up to the latest checkpoint is fine for us even if old seeds aren't useful anymore, but I have an application where torrenting a file that is continuously being extended would be useful.

(1) Ironically the data I want to distribute via bittorrent in this fashion is a forest of merkle trees, exactly the sort of data structure that you could use to implement a continuously-appended-to torrent...
kjj
legendary
Activity: 1302
Merit: 1026
October 12, 2012, 01:19:10 PM
#26
Excellent. Though presumably it will be a slightly different length than whatever is torrented (which will be taken care of by the recheck-data option).

Why would a byte for byte copy be a "slightly different length"?

How does it know the length of the torrented file? (Note that it is "identical", not a copy) Though from what jgarzik says in the post above there is some kind of checkpointing that it either knows or gets fed into it?

Torrent works by breaking files up into pieces and hashing each piece.  The parts that you already have will have the same hash as the hashes in the seed, with the exception of the final piece.  Modern torrent clients will fetch that partial piece using the missing byte range, and then verify it with the hash.  And naturally, they will also grab all of the new pieces.
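
A quick sketch of that piece-matching behaviour, using SHA-1 piece hashes on toy data. The function names and the tiny piece size are mine, chosen only to keep the example short; a real torrent uses whatever piece size its creator picked.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Split data into fixed-size pieces and return the SHA-1 of each."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]


def reusable_pieces(old_hashes, new_hashes):
    """Count the leading pieces whose hashes match in both versions."""
    count = 0
    for old, new in zip(old_hashes, new_hashes):
        if old != new:
            break
        count += 1
    return count


# Old file: 1000 bytes, i.e. 3 full 256-byte pieces plus a 232-byte tail.
old = bytes(i % 251 for i in range(1000))
# New file: the same 1000 bytes with 500 more appended.
new = old + bytes(500)

old_h = piece_hashes(old, 256)
new_h = piece_hashes(new, 256)
kept = reusable_pieces(old_h, new_h)
```

With these numbers the first three pieces verify as-is, the fourth fails its hash check because the old tail was only a partial piece, and pieces four through six are what the client has to complete or fetch.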