Author

Topic: Explanation of the bitcoin address management. (Read 1529 times)

legendary
Activity: 1190
Merit: 1004
Sorry, I hit the wrong tag by mistake. Here is the code with a switch statement and a new function: "CBAddressManagerTakeAddress"

Code:
uint8_t CBAddressManagerGetBucketIndex(CBAddressManager * self,CBNetworkAddress * addr){
uint64_t group = CBAddressManagerGetGroup(self, addr);
CBRandomSeed(self->rndGenForBucketIndexes, group + self->secret); // Add the group with the secure secret generated during initialisation.
return CBSecureRandomInteger(self->rndGenForBucketIndexes) % CB_BUCKET_NUM;
}
uint64_t CBAddressManagerGetGroup(CBAddressManager * self,CBNetworkAddress * addr){
uint8_t start = 0;
int8_t bits = 16;
uint64_t group;
switch (addr->type) {
case CB_IP_I2P:
case CB_IP_TOR:
group = addr->type;
start = 6;
bits = 4;
break;
case CB_IP_SITT:
case CB_IP_RFC6052:
group = CB_IP_IPv4;
start = 12;
break;
case CB_IP_6TO4:
group = CB_IP_IPv4;
start = 2;
break;
case CB_IP_TEREDO:
return CB_IP_IPv4 | ((CBByteArrayGetByte(addr->ip, 12) ^ 0xFF) << 8) | ((CBByteArrayGetByte(addr->ip, 13) ^ 0xFF) << 16);
case CB_IP_HENET:
group = CB_IP_IPv6;
bits = 36;
break;
case CB_IP_IPv6:
group = CB_IP_IPv6;
bits = 32;
break;
case CB_IP_IPv4:
group = CB_IP_IPv4;
start = 12;
break;
default:
group = addr->type;
bits = 0;
break;
}
uint8_t x = 8;
for (; bits >= 8; bits -= 8, x += 8, start++)
group |= CBByteArrayGetByte(addr->ip, start) << x;
if (bits > 0)
group |= (CBByteArrayGetByte(addr->ip, start) | ((1 << bits) - 1)) << x;
return group;
}
bool CBAddressManagerSetup(CBAddressManager * self){
// Allocate buckets
self->buckets = malloc(sizeof(*self->buckets) * CB_BUCKET_NUM);
// Create mutexes
if(CBNewMutex(&self->addressMutex)){
if(CBNewMutex(&self->nodesMutex)){
// Create random number generators.
self->rndGen = CBNewSecureRandomGenerator();
if (self->rndGen) {
CBSecureRandomSeed(self->rndGen); // Cryptographically secure.
self->secret = CBSecureRandomInteger(self->rndGen);
self->rndGenForBucketIndexes = CBNewSecureRandomGenerator();
if (self->rndGenForBucketIndexes) {
return true;
}
CBFreeSecureRandomGenerator(self->rndGen);
}
CBFreeMutex(self->addressMutex);
CBFreeMutex(self->nodesMutex);
}else{
CBGetMessage(self)->events->onErrorReceived(CB_ERROR_NETWORK_COMMUNICATOR_MUTEX_CREATE_FAIL,"The CBAddressManager 'nodesMutex' could not be created.");
}
CBFreeMutex(self->addressMutex);
}else{
CBGetMessage(self)->events->onErrorReceived(CB_ERROR_NETWORK_COMMUNICATOR_MUTEX_CREATE_FAIL,"The CBAddressManager 'addressMutex' could not be created.");
}
return false;
}
void CBAddressManagerTakeAddress(CBAddressManager * self,CBNetworkAddress * addr){
// Find the bucket for this address.
CBBucket * bucket = self->buckets + CBAddressManagerGetBucketIndex(self, addr);
CBMutexLock(self->addressMutex); // Use mutex lock when modifying the addresses
// Find insersion point for address
uint16_t insert = 0;
for (; insert < bucket->addrNum; insert++) {
if (bucket->addresses[insert]->score > addr->score)
// Insert here
break;
}
if (bucket->addrNum == self->maxAddressesInBucket) {
// A lot of addresses stored, remove random address but with bias to a low scoring address.
uint16_t remove = (CBSecureRandomInteger(self->rndGen) % bucket->addrNum);
remove *= remove;
remove /= bucket->addrNum;
// Release address
CBReleaseObject(&bucket->addresses[remove]);
if (insert < remove)
// Insersion happens below removal. Move memory up to overwrite removed address and make space to insert.
memmove(bucket->addresses + insert + 1, bucket->addresses + insert, remove-insert);
else if (insert > remove){
// Insersion happens above removal. Move memory down to overwrite removed address and make space to insert.
memmove(bucket->addresses + remove, bucket->addresses + remove + 1, insert-remove);
insert--; // Move insert down since we moved memory down.
}
}else{
bucket->addrNum++;
// Move memory up to allow insertion of address.
memmove(bucket->addresses + insert + 1, bucket->addresses + insert, bucket->addrNum - insert - 1);
}
bucket->addresses[insert] = addr;
CBMutexUnlock(self->addressMutex); // Now other threads can access the addresses.
}
legendary
Activity: 1288
Merit: 1227
Away on an extended break
Well here is how I'm generating the bucket indexes (untested).

Code:
uint8_t CBAddressManagerGetBucketIndex(CBAddressManager * self,CBNetworkAddress * addr){
uint64_t group = CBAddressManagerGetGroup(self, addr);
CBRandomSeed(self->rndGen, group + self->secret); // Add the group with the secure secret generated during initialisation.
return CBSecureRandomInteger(self->rndGen) % CB_BUCKET_NUM;
}
uint64_t CBAddressManagerGetGroup(CBAddressManager * self,CBNetworkAddress * addr){
uint8_t start = 0;
int8_t bits = 16;
uint64_t group;
if (addr->type & CB_IP_I2P) {
group = CB_IP_I2P;
}else if (addr->type & CB_IP_TOR){
group = CB_IP_TOR;
start = 6;
bits = 4;
}else if (addr->type & CB_IP_IPv6){
if ((addr->type & CB_IP_SITT) || (addr->type & CB_IP_RFC6052)) {
group = CB_IP_IPv4;
start = 12;
}else if (addr->type & CB_IP_6TO4){
start = 2;
group = CB_IP_IPv4;
}else if (addr->type & CB_IP_TEREDO){
return CB_IP_IPv4 | ((CBByteArrayGetByte(addr->ip, 12) ^ 0xFF) << 8) | ((CBByteArrayGetByte(addr->ip, 13) ^ 0xFF) << 16);
}else if (addr->type & CB_IP_HENET){
bits = 36;
group = CB_IP_IPv6;
}else{
bits = 32;
group = CB_IP_IPv6;
}
}else if (addr->type & CB_IP_IPv4){
group = CB_IP_IPv4;
start = 12;
}else if (addr->type & CB_IP_IPv4_LOCAL){
group = CB_IP_IPv4;
bits = 0;
}else{
group = CB_IP_IPv6;
bits = 0;
}
uint8_t x = 8;
for (; bits >= 8; bits -= 8, x += 8, start++)
group |= CBByteArrayGetByte(addr->ip, start) << x;
if (bits > 0)
group |= (CBByteArrayGetByte(addr->ip, start) | ((1 << bits) - 1)) << x;
return group;
}
bool CBAddressManagerSetup(CBAddressManager * self){
// ... Other stuff here ...
self->rndGen = CBNewSecureRandomGenerator();
if (!self->rndGen) {
// ...
return false;
}
CBSecureRandomSeed(self->rndGen); // Cryptographically secure for the secret.
self->secret = CBSecureRandomInteger(self->rndGen);
}


I've taken the rare chance to edit an post above  Wink
You need to use code tags instead of the quote tags to prevent triggering the smilies.
legendary
Activity: 1596
Merit: 1100
Use a 'switch' statement rather than a super-long if tree Smiley
legendary
Activity: 1190
Merit: 1004
Well here is how I'm generating the bucket indexes (untested).

Code:
uint8_t CBAddressManagerGetBucketIndex(CBAddressManager * self,CBNetworkAddress * addr){
uint64_t group = CBAddressManagerGetGroup(self, addr);
CBRandomSeed(self->rndGen, group + self->secret); // Add the group with the secure secret generated during initialisation.
return CBSecureRandomInteger(self->rndGen) % CB_BUCKET_NUM;
}
uint64_t CBAddressManagerGetGroup(CBAddressManager * self,CBNetworkAddress * addr){
uint8_t start = 0;
int8_t bits = 16;
uint64_t group;
if (addr->type & CB_IP_I2P) {
group = CB_IP_I2P;
}else if (addr->type & CB_IP_TOR){
group = CB_IP_TOR;
start = 6;
bits = 4;
}else if (addr->type & CB_IP_IPv6){
if ((addr->type & CB_IP_SITT) || (addr->type & CB_IP_RFC6052)) {
group = CB_IP_IPv4;
start = 12;
}else if (addr->type & CB_IP_6TO4){
start = 2;
group = CB_IP_IPv4;
}else if (addr->type & CB_IP_TEREDO){
return CB_IP_IPv4 | ((CBByteArrayGetByte(addr->ip, 12) ^ 0xFF) << 8) | ((CBByteArrayGetByte(addr->ip, 13) ^ 0xFF) << 16);
}else if (addr->type & CB_IP_HENET){
bits = 36;
group = CB_IP_IPv6;
}else{
bits = 32;
group = CB_IP_IPv6;
}
}else if (addr->type & CB_IP_IPv4){
group = CB_IP_IPv4;
start = 12;
}else if (addr->type & CB_IP_IPv4_LOCAL){
group = CB_IP_IPv4;
bits = 0;
}else{
group = CB_IP_IPv6;
bits = 0;
}
uint8_t x = 8;
for (; bits >= 8; bits -= 8, x += 8, start++)
group |= CBByteArrayGetByte(addr->ip, start) << x;
if (bits > 0)
group |= (CBByteArrayGetByte(addr->ip, start) | ((1 << bits) - 1)) << x;
return group;
}
bool CBAddressManagerSetup(CBAddressManager * self){
// ... Other stuff here ...
self->rndGen = CBNewSecureRandomGenerator();
if (!self->rndGen) {
// ...
return false;
}
CBSecureRandomSeed(self->rndGen); // Cryptographically secure for the secret.
self->secret = CBSecureRandomInteger(self->rndGen);
}
legendary
Activity: 1072
Merit: 1181
Right, addrman is not primarily intended to optimize quick connections. Its first priority is preventing sybil attacks, and limited disk/memory usage.
legendary
Activity: 1190
Merit: 1004
Ok thanks. I do understand what it is trying to do and I pretty understand why it does the different things.

The way I thought about doing it was to place addresses into buckets (new or tried) using a more suitable CSPRNG rather than a double SHA-256 hash. Then addresses will be ordered by a score in each bucket so that a random selection can be made with a bias to addresses with a better score. When an address needs to be removed, a random address is removed with bias to addresses with a low score. The score will be set like with the original client, related to the time last seen but will be reduced when connections are unsuccessful.
legendary
Activity: 1072
Merit: 1181
The comments Gregory is referring to are here: https://github.com/bitcoin/bitcoin/blob/master/src/addrman.h#L91.

The "new" buckets are for addresses to which no succesful connections has been made yet, the "tried" ones for addresses that are.

The idea is that you want both, so that there is always a chance for connecting to an old and known functional node, but still always keep trying new ones as well. For "new" buckets, the bucket is derived from the origin of the node's knowledge (where you heard it from), so that an attacker needs many distincts IP ranges to poison your entire table. For the "tried" bcukets, the bucket number is based on the address itself, for the same reason.

Over-complex? Maybe...
staff
Activity: 4284
Merit: 8808
There is a very long and detailed 'comment' in the code that documents the design and much of the rationale behind it.

Everything it does it does in order to bound the maximum store for addresses (so an attacker can't flood you to run you out of disk), while preserving the most useful addresses, and while making it difficult for an attacker to replace all your addresses with nodes he controls.
legendary
Activity: 1190
Merit: 1004
OK thanks, though I'm just about done for today I think. I'll come back to this tomorrow.
hero member
Activity: 560
Merit: 501
I think addrman is sipa's branchild; check on IRC if you're low on time or he misses this.
legendary
Activity: 1190
Merit: 1004
I've been looking at the addrman.cpp code. From what I've read elsewhere it is supposed to place groups of IPs in separate buckets and connect to IPs from a selection of buckets with some randomisation.

The problem is, it's hard getting my head around the code. Can anyone explain what the code is doing? I can see by the GetGroup function how it separates addresses though the rest of the code is quite complex.

Would it be adequate for my library to separate addresses much as the GetGroup function and initiate connections from random different groups with bias to "newer"/"better" addresses?

It seems to me that the addrman.cpp code adds groups in a random bucket and some groups may be put into the same bucket. Why would different groups be put into the same bucket?

And can anyone explain why there is the "new" buckets and "tried" buckets?

The whole thing seems overcomplex to me but if anyone can explain simply what it is doing and why the things are useful, then I'll take it on board.

Thank you.
Jump to: