
Topic: HTTP bootstrapping ? (Read 6539 times)

pj
newbie
Activity: 24
Merit: 0
December 28, 2010, 10:34:28 AM
#26
Code:
    ...
   sed 's/\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)\.\([0-9][0-9]*\)/\1:\2/g' |
    ...

omg your regex is ugly.

Code:
    sed -r 's/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\.([0-9]+)/\1:\2/g' |

And I'm pretty sure there is better.

But sed -r is not available in every sed implementation.  I wanted something portable.  Granted, I have no example of a sed that presently lacks it.  I just know that extended regular expression support was added (to sed and everything else) sometime after such commands first existed, so I studiously avoid extended regular expressions where portability to unknown runtime environments is desired.

If one uses sed -r, then I suppose a better (well, shorter anyway) expression would be:
Code:
sed -r 's/(([0-9]+\.){3}[0-9]+)\.([0-9]+)/\1:\3/g'
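
For anyone who wants to try the portable expression against the BSD sample lines posted earlier in the thread, here is a minimal sketch. The function name bsd_to_colon is made up for illustration; the sed expression itself is the basic-regex one from the script, so it should not depend on sed -r.

```shell
#!/bin/sh
# bsd_to_colon: hypothetical helper name, not from the thread.
# Rewrites BSD-style "a.b.c.d.port" as "a.b.c.d:port" on stdin,
# using only POSIX basic regular expressions (no sed -r).
bsd_to_colon() {
    sed 's/\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)\.\([0-9][0-9]*\)/\1:\2/g'
}

# One line of the BSD netstat sample from this thread.
printf '%s\n' 'tcp4       0    116  192.168.1.2.8333       80.217.82.59.45167     ESTABLISHED' |
    bsd_to_colon
```

Lines that already use a colon (GNU/Linux netstat) contain no run of five dot-separated numbers, so they should pass through unchanged.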
legendary
Activity: 1288
Merit: 1080
December 28, 2010, 10:14:55 AM
#25
Ok - thanks for the BSD sample output.

How about this code then:

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.
#
# GNU Linux netstat separates port numbers from IP addrs using colon ':',
# whereas BSD netstat separates them using a period '.'.  The sed line
# below converts the BSD '.' to a ':', to make it easier for awk to
# split off the port.

netstat -an |
    sed 's/\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)\.\([0-9][0-9]*\)/\1:\2/g' |
    awk -v date="$(date)" '
        $6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1] ; n++ }
        END { print "# " date " : " n " bitcoin clients seen." }
    '

omg your regex is ugly.

Code:
    sed -r 's/([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\.([0-9]+)/\1:\2/g' |

And I'm pretty sure there is better.
pj
newbie
Activity: 24
Merit: 0
December 28, 2010, 09:53:55 AM
#24
Ok - thanks for the BSD sample output.

How about this code then:

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.
#
# GNU Linux netstat separates port numbers from IP addrs using colon ':',
# whereas BSD netstat separates them using a period '.'.  The sed line
# below converts the BSD '.' to a ':', to make it easier for awk to
# split off the port.

netstat -an |
    sed 's/\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)\.\([0-9][0-9]*\)/\1:\2/g' |
    awk -v date="$(date)" '
        $6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1] ; n++ }
        END { print "# " date " : " n " bitcoin clients seen." }
    '
newbie
Activity: 12
Merit: 0
December 28, 2010, 09:35:43 AM
#23

Just to be a little bit picky, the netstat output is slightly different between BSD-like Unix and GNU/Linux.
The port is separated by a dot on BSD-like Unix. So maybe the pattern matching /:8333/ could be
revised to also cover the other output... but besides that, this is just fine.
Well, darn.

Could you provide a sample few lines of output from a BSD netstat -an?

It would take more than changing the /:8333/ pattern to fix this, if I understand your description
correctly.  There is also the awk split on the ":" which would have to be fixed as well.  This is all
doable with a little bit of regular expression hacking (something I do easily.)  But I should see the
exact BSD netstat -an output first, to be sure I understand it correctly.


I think this could do the trick in awk for the matching:  && (/:8333/ || /\.8333/)
and for the split, one if block to split on the colon and another if block to split on the dot. It
will start to be unreadable for a one-liner ;-)

FYI, here is the output:

Code:
tcp4       0    116  192.168.1.2.8333       80.217.82.59.45167     ESTABLISHED
tcp4       0      0  192.168.1.2.8333       68.103.101.19.29297    ESTABLISHED
tcp4       0      0  192.168.1.2.8333       79.184.79.110.1191     ESTABLISHED
tcp4       0      0  192.168.1.2.8333       61.94.216.38.10100     ESTABLISHED
tcp4       0      0  192.168.1.2.8333       113.22.164.48.10020    ESTABLISHED
tcp4       0      0  192.168.1.2.8333       85.232.113.117.8597    ESTABLISHED
tcp4       0      0  192.168.1.2.8333       46.109.12.201.3328     ESTABLISHED
tcp4       0      0  192.168.1.2.8333       62.103.58.117.3230     ESTABLISHED
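
The awk-only idea described above (accept either separator, then cut the port off the foreign address) can be sketched roughly as follows. This is not the script that was eventually posted; the function name peers_8333 and the choice to strip the port with sub() instead of two if blocks are assumptions made for illustration.

```shell
#!/bin/sh
# peers_8333: hypothetical helper name, not from the thread.
# Reads "netstat -an" style lines on stdin and prints the foreign IP of
# every ESTABLISHED connection whose local or foreign port is 8333,
# accepting either ':' (GNU/Linux) or '.' (BSD) in front of the port.
peers_8333() {
    awk '
        $6 == "ESTABLISHED" && ($4 ~ /[.:]8333$/ || $5 ~ /[.:]8333$/) {
            addr = $5
            sub(/[.:][0-9]+$/, "", addr)   # drop the trailing port
            print addr
            n++
        }
        END { print "# " (n + 0) " bitcoin clients seen." }
    '
}

# Two of the BSD sample lines from this post.
printf '%s\n' \
    'tcp4       0    116  192.168.1.2.8333       80.217.82.59.45167     ESTABLISHED' \
    'tcp4       0      0  192.168.1.2.8333       68.103.101.19.29297    ESTABLISHED' |
    peers_8333
```

On a live system the input would come from netstat -an instead of printf.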

pj
newbie
Activity: 24
Merit: 0
December 28, 2010, 09:15:24 AM
#22

Just to be a little bit picky, the netstat output is slightly different between BSD-like Unix and GNU/Linux.
The port is separated by a dot on BSD-like Unix. So maybe the pattern matching /:8333/ could be
revised to also cover the other output... but besides that, this is just fine.
Well, darn.

Could you provide a sample few lines of output from a BSD netstat -an?

It would take more than changing the /:8333/ pattern to fix this, if I understand your description
correctly.  There is also the awk split on the ":" which would have to be fixed as well.  This is all
doable with a little bit of regular expression hacking (something I do easily.)  But I should see the
exact BSD netstat -an output first, to be sure I understand it correctly.
newbie
Activity: 12
Merit: 0
December 28, 2010, 09:06:45 AM
#21
Let's just make the final count with awk too...

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

netstat -an |
awk -v date="$(date)" '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1] ; n++ }
END { print "# " date " : " n " bitcoin clients seen." }'



Just to be a little bit picky, the netstat output is slightly different between BSD-like Unix and GNU/Linux.
The port is separated by a dot on BSD-like Unix. So maybe the pattern matching /:8333/ could be
revised to also cover the other output... but besides that, this is just fine.


pj
newbie
Activity: 24
Merit: 0
December 28, 2010, 07:20:37 AM
#20
Let's just make the final count with awk too...
Duh!  Excellent.  Thanks.
legendary
Activity: 1288
Merit: 1080
December 27, 2010, 10:03:22 AM
#19
Let's just make the final count with awk too...

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

netstat -an |
awk -v date="$(date)" '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1] ; n++ }
END { print "# " date " : " n " bitcoin clients seen." }'

pj
newbie
Activity: 24
Merit: 0
December 27, 2010, 08:43:27 AM
#18
I worry a bit that what might have been my most important question above could have gotten
lost in the code refinement discussion.

So I'll ask it again:
Quote
Is this correct, that you want both IP addresses coming from remote port 8333 and coming
into local port 8333?  Or do you just want IP addresses coming into local port 8333?  If the
latter, change the key line above to look for /:8333/ only in field $4, the local address.

My hunch is that we just want IP addresses coming into our port 8333.  Whether or not a connection
is coming from port 8333 on some other system means nothing to us, as best as I can figure.
pj
newbie
Activity: 24
Merit: 0
December 27, 2010, 07:57:50 AM
#17
Code:
t="$(mktemp -t bitcoin)"
That doesn't work so well -- no XXX's in the mktemp -t template.

And I think you really do want the trap - otherwise your /tmp directory
will get filled up with these dang files.

Yes -- newlines separating each piped command are better (though
I prefer to indent all but the first one) -- I was being lazy and just
typing as I do at the command prompt.

Yes -- mktemp or the more recent tempfile are probably better.
I was just being lazy again, and doing it as I have done it for 30
years, long before those commands existed.  Sorry.  The main
problem with my old fashioned method, and even with mktemp,
is a security issue -- a hacker can get you to write to a file that
they have set up, via a symlink that you thought was your file.
The main problem with mktemp and tempfile is that not all
systems have them (though you have to be on a fairly old,
odd, or barebones system not to have them.)

You can find more discussion of the temp file issue at:
  http://www.linuxsecurity.com/content/view/115462/151/
  Safely Creating Temporary Files in Shell Scripts

So ... all this suggests the following:

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

t="$(tempfile -p bitcoin)"
trap 'rm -f $t; trap 0; exit' 0 1 2 3 15

netstat -an |
  awk '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1]}' |
  tee "$t"

echo "# $(date) $(wc -l < $t) Bitcoin clients seen."
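
For systems without tempfile, a variant using mktemp with an explicit XXXXXX template (the point raised at the top of this post) might look like the sketch below. The function name tee_and_count is invented here, and it assumes a mktemp that accepts a full path template, which GNU coreutils and the BSDs do but POSIX does not promise.

```shell
#!/bin/sh
# tee_and_count: hypothetical helper name, not from the thread.
# Copies stdin to stdout, then prints a dated count line.
# The body runs in a subshell "( ... )" so the traps stay local to it.
# Assumes mktemp accepts a path template ending in XXXXXX.
tee_and_count() (
    t=$(mktemp "${TMPDIR:-/tmp}/bitcoin.XXXXXX") || exit 1
    trap 'rm -f "$t"' EXIT              # remove the temp file on any exit
    trap 'exit 1' HUP INT QUIT TERM     # turn signals into a normal exit
    tee "$t"
    echo "# $(date) $(wc -l < "$t") Bitcoin clients seen."
)

# Stand-in input; on a live system this would be the netstat | awk pipeline.
printf '%s\n' '1.2.3.4' '5.6.7.8' | tee_and_count
```

Because mktemp creates the file itself with a name it picked, the symlink trick described above should be much harder to pull off than with a predictable name like /tmp/bitcoin.$$.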
legendary
Activity: 1288
Merit: 1080
December 27, 2010, 04:16:08 AM
#16
The result is
Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

t=/tmp/bitcoin.$$
trap 'rm -f $t; trap 0; exit' 0 1 2 3 15

netstat -an | awk '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1]}' | tee $t
echo "# $(date) $(wc -l < $t) Bitcoin clients seen."

Didn't know about the trap command.  I doubt we need it though.

Being a bit anal :

- The standard way to create a temp file is to use the mktemp command.
- You can end lines after |.  This makes the code clearer.

Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

t="$(mktemp -t bitcoin)"

netstat -an |
awk '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1]}' |
tee "$t"

echo "# $(date) $(wc -l < $t) Bitcoin clients seen."
pj
newbie
Activity: 24
Merit: 0
December 27, 2010, 03:54:39 AM
#15
Quote
It could be cleaned a bit I think :

netstat -an |
awk '/8333/ && /ESTA/ { print $5 }' |
sed 's/:8333//' |
tee >(echo "# $(date) $(wc -l) Bitcoin clients seen.")


Good ideas ... I'd like to tweak this a tad more.

The ">(process)" construct is not recognized by classic Bourne shells and similar.
So far as I know, the redirection to a file, such as to /tmp/bitcoin in the original post,
is required for some such shells.

One more command, the sed, can be removed by using a little more awk.

That (necessary for some shells) tmp file /tmp/bitcoin should be made unique and self-removing.

The naked search for "8333" would pick up ports 18333, 28333 ... 58333 as well.  Prefix with a
colon ':' to avoid that.

The result is
Code:
#!/bin/sh
# Display foreign IP addresses coming from port 8333 --or-- connected to local port 8333.
# Append line at end with date and count of addresses displayed.

t=/tmp/bitcoin.$$
trap 'rm -f $t; trap 0; exit' 0 1 2 3 15

netstat -an | awk '$6 == "ESTABLISHED" && /:8333/ { split($5, a, ":"); print a[1]}' | tee $t
echo "# $(date) $(wc -l < $t) Bitcoin clients seen."

Is this correct, that you want both IP addresses coming from remote port 8333 and coming
into local port 8333?  Or do you just want IP addresses coming into local port 8333?  If the
latter, change the key line above to look for /:8333/ only in field $4, the local address.
Code:
netstat -an | awk '$6 == "ESTABLISHED" && $4 ~ /:8333/ { split($5, a, ":"); print a[1]}' | tee $t
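
To make the port-matching point concrete, here is a small sketch run against made-up GNU/Linux style lines. The function name inbound_8333 and the sample addresses are invented, and anchoring the pattern with "$" is an extra tightening of this sketch, not something the scripts above do.

```shell
#!/bin/sh
# inbound_8333: hypothetical helper name, not from the thread.
# Prints the foreign IP of ESTABLISHED connections into local port 8333.
# Handles only the colon-separated (GNU/Linux, IPv4) netstat format.
# The "$" anchor is an addition of this sketch; it keeps the match at
# the very end of the local address field.
inbound_8333() {
    awk '$6 == "ESTABLISHED" && $4 ~ /:8333$/ { split($5, a, ":"); print a[1] }'
}

# Made-up sample: only the first line is an inbound connection to 8333.
printf '%s\n' \
    'tcp        0      0 10.0.0.5:8333           1.2.3.4:50000           ESTABLISHED' \
    'tcp        0      0 10.0.0.5:18333          5.6.7.8:50001           ESTABLISHED' \
    'tcp        0      0 10.0.0.5:40000          9.9.9.9:8333            ESTABLISHED' |
    inbound_8333
```

A bare /8333/ would have matched all three sample lines, and /:8333/ applied to the whole line would still have matched the third one, which is an outgoing connection.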
administrator
Activity: 5222
Merit: 13032
December 26, 2010, 08:59:32 PM
#14
So they can force you to spend your own money from a bitcoin client in their favor.

They could double-spend transactions to you, but they couldn't redirect your transactions from one Bitcoin address to another one. There are much easier ways to "surround" someone if you control the ISP.

Seednode bootstrapping is used in Tor, I2P, GNUnet, and Freenet. Just removing IRC and using the already-implemented seednode system will work fine.

Reading the code of the Bitcoin client a bit, the client uses a simple trick to learn its remote IP via
the IRC server (https://github.com/bitcoin/bitcoin/blob/master/irc.cpp#L333).

That's just one method of finding your external IP. There are also two HTTP external IP services.
newbie
Activity: 12
Merit: 0
December 26, 2010, 04:19:37 PM
#13
bitcoin maintains a database of P2P addresses.  Obtaining addresses via netstat is rather sub-optimal, when you could use bitcointools to extract addresses directly from the bitcoin database.

As to the larger point...

HTTP and DNS bootstrapping should be pursued.  Much more efficient than IRC.

Right, that's why I was pursuing that approach.

By the way, I ran a test with bitcointools to dump the addresses out of the database:

Code:
python2.7 dbdump.py --datadir ~/.bitcoin/ --address

...155.6:36128 (lastseen: Sat Dec 18 21:09:42 2010)
68.52.60.203:36128 (lastseen: Sun Dec 26 15:28:48 2010)
68.53.17.115:36128 (lastseen: Thu Dec 16 18:56:57 2010)
68.56.241.235:36128 (lastseen: Sun Dec 26 17:32:34 2010)
68.62.250.145:36128 (lastseen: Sun Dec 26 15:39:33 2010)
....


Even if the netstat approach is suboptimal, it has one advantage over relying on the addr dump
from the database. Addresses taken from established TCP sessions belong to Bitcoin clients that are really active,
whereas with the database you are guessing from the lastseen information, and it already contains a lot of dead
addresses. Picking the appropriate time delta can be tricky, unless there is already something in the database
structure that lists just the active ones. On the other hand, the Berkeley database should only be accessed by one
process at a time, so you need to shut down the client that is currently using it before dumping.
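
As a rough illustration of the time-delta filtering, the sketch below keeps only the addresses whose lastseen stamp is recent enough. It is not part of bitcointools; the function name fresh_addrs is invented, the input format is assumed to match the dump shown above, and it relies on GNU date (date -d), which BSD date does not support.

```shell
#!/bin/sh
# fresh_addrs: hypothetical helper name, not part of bitcointools.
# Usage: fresh_addrs MAX_AGE_SECONDS NOW_EPOCH < dbdump-output
# Prints the ip:port of every line whose lastseen stamp is at most
# MAX_AGE_SECONDS older than NOW_EPOCH.
# Assumes GNU date ("date -d"). The stamps carry no time zone, so they
# are interpreted in the local zone (set TZ to change that).
fresh_addrs() {
    max_age=$1
    now=$2
    while read -r addr rest; do
        # Only accept lines shaped like: ip:port (lastseen: Sun Dec 26 15:28:48 2010)
        case $rest in
            "(lastseen: "*")") ;;
            *) continue ;;
        esac
        stamp=${rest#"(lastseen: "}
        stamp=${stamp%")"}
        seen=$(date -d "$stamp" +%s 2>/dev/null) || continue
        if [ $((now - seen)) -le "$max_age" ]; then
            printf '%s\n' "$addr"
        fi
    done
    return 0
}

# Example (commented out because it needs a real dump file):
#   fresh_addrs 86400 "$(date +%s)" < dump.txt
```

Truncated lines such as the "...." markers in the dump above do not match the expected shape and are skipped.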

Maybe another appropriate way would be to read the addr messages passing over the TCP sessions (using pcap),
extract the addresses, and publish that stream to the HTTP/DNS directory.

What's the most appropriate technique to get the currently active Bitcoin addresses?
legendary
Activity: 1470
Merit: 1006
Bringing Legendary Har® to you since 1952
December 26, 2010, 04:14:38 PM
#12
This is a security hole.

Not necessarily.

You can place blockchain bootstraps in compressed *.zip or *.tar.gz files, and hardcode just multiple (RMD160, SHA1, SHA256 + filesize) hashes of the backups into the mainstream client.

OR, updated hashes of blockchain bootstraps can be available for download from the main bitcoin server over https, in which case it will be impossible to fake them (but this is a centralized solution, so probably not very good).
Possibilities are endless.
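
A rough sketch of the hardcoded-hash check, done in shell rather than inside the client: it compares only the file size and the SHA-256 digest (not RMD160 or SHA1). The function name verify_bootstrap is invented for illustration, and it assumes sha256sum from GNU coreutils is installed.

```shell
#!/bin/sh
# verify_bootstrap: hypothetical helper name, not from any Bitcoin client.
# Usage: verify_bootstrap FILE EXPECTED_SHA256 EXPECTED_SIZE_BYTES
# Returns 0 only if both the size and the SHA-256 digest match.
# Assumes sha256sum (GNU coreutils) is available.
verify_bootstrap() {
    file=$1
    want_sum=$2
    want_size=$3
    [ -f "$file" ] || return 1
    # Cheap check first: the byte count (tr strips padding some wc versions add).
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" = "$want_size" ] || return 1
    # Read via stdin so the output is always "<digest>  -".
    sum=$(sha256sum < "$file" | awk '{ print $1 }')
    [ "$sum" = "$want_sum" ]
}

# Example (commented out; the file name and values are placeholders):
#   verify_bootstrap bootstrap.tar.gz "$EXPECTED_SHA256" "$EXPECTED_SIZE" || echo "do not use this file"
```

The expected values would have to come from somewhere the user already trusts, which is the centralization trade-off mentioned above.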
sr. member
Activity: 350
Merit: 252
probiwon.com
December 26, 2010, 02:54:24 PM
#11
Only a very few fallback nodes are persistent over time, and compiled (hardcoded) into the bitcoin client itself.

This is a security hole.

A state can block outgoing port 8333 for all hosts except the hardcoded addresses. At the hardcoded addresses it can set up fake bitcoin nodes. These nodes will then give you the addresses of dummy nodes to create the illusion of a bitcoin network.

So they can force you to spend your own money from a bitcoin client in their favor.

It would be better, in the case of network problems, to ask the user to specify the address for the bootstrap. This address can be obtained from reliable sources, verified by the user.
legendary
Activity: 1596
Merit: 1100
December 26, 2010, 02:37:04 PM
#10
Only a very few fallback nodes are persistent over time, and compiled (hardcoded) into the bitcoin client itself.

https://en.bitcoin.it/wiki/Fallback_Nodes is a viable method of bootstrapping.  We'll call that "forum bootstrapping" or "wiki bootstrapping", where one must manually search for a list of nodes, in order to bootstrap onto the network.

I think DNS bootstrapping would be the most efficient:  a simple DNS lookup to bootstrap.bitcoin.org would work like this:
  • Community members post their nameserver (NS) records for bootstrap.bitcoin.org on the forum.  Presumably this list does not change often.
  • Each member runs a DNS server, independently of anyone else, that retrieves addresses from bitcoin's addr.dat database, randomly selects "fresh" P2P nodes, and stores these in A records or SRV records.
  • When bootstrapping, the bitcoin client performs a standard DNS lookup for bootstrap.bitcoin.org

That would be very, very fast.  Much faster than IRC.  This is similar to how BitTorrent DHT bootstrapping occurs.

The only issue is trust (rogue DNS servers), but this issue also exists with the IRC server, which is a Single Point of Failure (SPOF) for both trust and general reliability.
legendary
Activity: 1288
Merit: 1080
December 26, 2010, 01:30:49 PM
#9
But I don't know the exact elliptic curves used by Bitcoin. You can list the ones supported
by OpenSSL with:
Code:
openssl ecparam -list_curves

Is there a table of the EC properties used by Bitcoin somewhere? I suppose the easiest is
to read the source code...

Indeed, you have to look at the source code.  I've just checked, and the EC curve used is secp256k1, which is in the list given by openssl.

I think a scripted implementation is feasible.

legendary
Activity: 860
Merit: 1026
December 26, 2010, 11:19:25 AM
#8
I'm not sure how up-to-date they are, but why not add these IPs to your (or a separate) list:
https://bitcointalksearch.org/topic/post-your-static-ip-59
(make sure to read these two posts about fallback nodes:
https://bitcointalksearch.org/topic/m.14646,
https://bitcointalksearch.org/topic/m.31133)
legendary
Activity: 1470
Merit: 1006
Bringing Legendary Har® to you since 1952
December 26, 2010, 11:12:19 AM
#7
bitcoin maintains a database of P2P addresses.  Obtaining addresses via netstat is rather sub-optimal, when you could use bitcointools to extract addresses directly from the bitcoin database.

As to the larger point...

HTTP and DNS bootstrapping should be pursued.  Much more efficient than IRC.

Oh my, dat is soo awsum. +10 to dis idea.
Can we has dis idea implemented in mainstream client, plz ?

I mean how could anybody refuse this soft fluffy little lolcat ?
