That sounds reasonable :-)
Any idea what checksum algorithm this is? And is the header (the 4 start bytes and the message type) also included in the checksum and the message size?
From my research, the header is _not_ counted in the size. It can't be covered by the checksum either: the checksum field lives inside the header, so you would have to know the checksum in order to calculate it.
Here's the relevant code. Some comments added. I don't know what exactly the Hash() function does yet.
// Set the size
unsigned int nSize = vSend.size() - nMessageStart; // vSend is the message buffer, of type CDataStream (look in serialize.h)
// essentially, this says "total bytes in the send buffer" - "starting position of this message's data chunk"
// so nSize covers only the data, and nMessageStart - nHeaderStart = sizeof(header) (see the assert further down).
memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nMessageSize), &nSize, sizeof(nSize)); // writes nSize into the right spot in the message.
// Set the checksum
if (vSend.GetVersion() >= 209)
{
    uint256 hash = Hash(vSend.begin() + nMessageStart, vSend.end()); // Take a hash of the message data
    unsigned int nChecksum = 0;
    memcpy(&nChecksum, &hash, sizeof(nChecksum)); // Keep only the first 4 bytes of the hash
    assert(nMessageStart - nHeaderStart >= offsetof(CMessageHeader, nChecksum) + sizeof(nChecksum));
    memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nChecksum), &nChecksum, sizeof(nChecksum)); // Put that hash in the right spot.
}
It's from net.h:703-716.
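For anyone who wants to check a checksum from a script, here is a minimal Python sketch. It rests on one assumption that the snippet above does not confirm: that Hash() is SHA-256 applied twice, which is how the Bitcoin source is usually read. The function name payload_checksum is made up for this example. It mirrors the two memcpy calls by keeping the first 4 bytes of the digest exactly as they would sit in the header.

```python
import hashlib


def payload_checksum(payload: bytes) -> bytes:
    """Return the 4 checksum bytes as they would appear in the header.

    Assumption: Hash() is SHA-256(SHA-256(data)). The header field is
    filled by copying the first 4 bytes of that digest verbatim.
    """
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4]


if __name__ == "__main__":
    # Only the data chunk is hashed, never the header itself.
    print(payload_checksum(b"").hex())
```

Comparing raw bytes rather than an integer avoids having to think about endianness when checking the field.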
The problem I'm having is parsing the version command's data. Version _should_ be made of these things serialized, as per net.h:580:
PushMessage("version", VERSION, nLocalServices, nTime, addrYou, addrMe,
nLocalHostNonce, string(pszSubVer), nBestHeight);
However, VERSION = 304 according to my client, and here's the hex dump of the data I'm getting:
eswanson@eswanson-laptop:~$ python read.py
"version" 87b (checksum: 304)
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x9F 0x92 0x4F 0x4C 0x00 0x00 0x00 0x00
0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0xFF 0xFF 0x7F 0x00 0x00 0x01
0xB5 0x2C 0x01 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0xFF 0xFF 0x41 0x1B
0xC2 0xBA 0x20 0x8D 0xDF 0x51 0xFA 0x2A
0x26 0x52 0x6C 0x67 0x02 0x2E 0x30 0x44
0x14 0x01 0x00
You know, I just realized: the "checksum" my script prints is 304, which is exactly the value of VERSION... So my header parsing code must be doing something wrong. The header _is_ supposed to be 24 bytes, right?
--EDIT--
Alright, I've been playing with the "version" message the client sends automatically when you connect to it. First of all, as much as I respect Satoshi, I really don't like this format. Something like protocol buffers would have been much neater and more efficient.
Secondly, I suspect that this message doesn't have a checksum, because the sender can't be sure that the client it is talking to has a version >= 209 until it gets a verack or version message back. I'll play with that next, but I think the checksum is omitted initially to deal with older clients.
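If that suspicion is right, the header on a "version" message is 20 bytes (4 start bytes, 12 bytes of command, 4 bytes of size) and not 24. The Python sketch below shows what happens when the same bytes are read both ways. The function parse_header and the sample buffer are made up for illustration, and the start bytes used are an assumption (any 4 bytes would behave the same here).

```python
import struct


def parse_header(buf: bytes, has_checksum: bool):
    """Return (start_bytes, command, size, checksum_or_None, header_len)."""
    # "<" means little-endian with no padding: 4 + 12 + 4 = 20 bytes.
    start, command, size = struct.unpack_from("<4s12sI", buf, 0)
    command = command.rstrip(b"\x00").decode("ascii")
    if has_checksum:
        (checksum,) = struct.unpack_from("<I", buf, 20)
        return start, command, size, checksum, 24
    return start, command, size, None, 20


# Hypothetical sample: a 20-byte header followed by an 87-byte payload
# whose first field is VERSION = 304.
START = b"\xf9\xbe\xb4\xd9"  # assumed start bytes
sample = (START + b"version".ljust(12, b"\x00") + struct.pack("<I", 87)
          + struct.pack("<i", 304) + bytes(83))

if __name__ == "__main__":
    print(parse_header(sample, has_checksum=True))   # "checksum" is really VERSION
    print(parse_header(sample, has_checksum=False))  # payload starts at offset 20
```

Read as a 24-byte header, the "checksum" comes out as 304 and only 83 bytes of data are left, which matches what read.py printed above.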
Here's my analysis of the components of the data section of this message:
VERSION - int (4b)
nLocalServices - uint64 (8b)
nTime - int64 (8b)
addrYou - struct (26b) - more below
addrMe - struct (26b) - more below
nLocalHostNonce - uint64 (8b)
pszSubVer - variable length string
nBestHeight - int (4b)
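The layout above can be checked against the hex dump with a short Python sketch. Two assumptions are baked in: the 4 bytes that read.py reported as "checksum: 304" are really the VERSION field and belong at the front of the data, and the string length prefix is a single byte (which holds for this dump but may not hold for long strings). The name parse_version is made up for this example, and the two addresses are left as raw 26-byte blobs.

```python
import struct

PAYLOAD = bytes.fromhex(
    "30010000"          # assumed: the bytes misread as "checksum: 304"
    "0100000000000000"
    "9f924f4c00000000"
    "0100000000000000"
    "0000000000000000"
    "0000ffff7f000001"
    "b52c010000000000"
    "0000000000000000"
    "00000000ffff411b"
    "c2ba208ddf51fa2a"
    "26526c67022e3044"
    "140100"
)


def parse_version(payload: bytes):
    """Return (fields, bytes_consumed) for a "version" data section."""
    # VERSION (int32), nLocalServices (uint64), nTime (int64), little-endian.
    version, services, timestamp = struct.unpack_from("<iQq", payload, 0)
    pos = 20
    addr_you = payload[pos:pos + 26]
    pos += 26
    addr_me = payload[pos:pos + 26]
    pos += 26
    (nonce,) = struct.unpack_from("<Q", payload, pos)
    pos += 8
    strlen = payload[pos]  # assumes a one-byte length prefix
    pos += 1
    sub_ver = payload[pos:pos + strlen].decode("ascii")
    pos += strlen
    (best_height,) = struct.unpack_from("<i", payload, pos)
    pos += 4
    fields = {
        "version": version, "services": services, "time": timestamp,
        "addr_you": addr_you, "addr_me": addr_me, "nonce": nonce,
        "sub_ver": sub_ver, "best_height": best_height,
    }
    return fields, pos


if __name__ == "__main__":
    fields, used = parse_version(PAYLOAD)
    print(fields, used, len(PAYLOAD))
```

With the 4 bytes put back, every field lands on a boundary and the parse consumes exactly the 87 bytes the header announced.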
Now for the address struct, here's the IMPLEMENT_SERIALIZE block:
IMPLEMENT_SERIALIZE
(
    if (nType & SER_DISK)
    {
        READWRITE(nVersion);
        READWRITE(nTime);
    }
    READWRITE(nServices);
    READWRITE(FLATDATA(pchReserved)); // for IPv6
    READWRITE(ip);
    READWRITE(port);
)
from net.h:222-233.
From what I can tell, this means that the over-the-wire format (where SER_DISK is not set, so nVersion and nTime are skipped) is:
nServices - uint64 (8b)
pchReserved - (12b)
ip - uint (4b)
port - unsigned short (2b)
Interestingly, the IP and port are in big-endian (network byte order), while everything in the main struct is in little-endian (host byte order on x86).
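The mixed byte order is easy to get wrong, so here is a small Python sketch of the 26-byte address layout, fed with the two address blobs from the hex dump earlier in this post. The function name parse_address is made up for this example. One observation, hedged: the reserved bytes in the dump are ten zeros followed by FF FF, which looks like the usual IPv4-mapped IPv6 prefix, but the code only passes them through.

```python
import socket
import struct

# The two address blobs taken from the hex dump (addrYou, then addrMe).
ADDR_YOU = bytes.fromhex("0100000000000000" + "00" * 10 + "ffff" + "7f000001" + "b52c")
ADDR_ME = bytes.fromhex("0100000000000000" + "00" * 10 + "ffff" + "411bc2ba" + "208d")


def parse_address(raw: bytes):
    """Return (services, reserved, ip, port) from a 26-byte wire address."""
    if len(raw) != 26:
        raise ValueError("expected 26 bytes, got %d" % len(raw))
    (services,) = struct.unpack_from("<Q", raw, 0)  # little-endian
    reserved = raw[8:20]                            # 12 bytes, kept as-is
    ip = socket.inet_ntoa(raw[20:24])               # network byte order
    (port,) = struct.unpack_from(">H", raw, 24)     # big-endian
    return services, reserved, ip, port


if __name__ == "__main__":
    print(parse_address(ADDR_YOU))
    print(parse_address(ADDR_ME))
```

Reading the port of the second address as big-endian gives 8333, which is a good sign that the byte order guess is right; read as little-endian it would be 36128.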