
Topic: The variable length integer looks a bit weird on the bitcoin wiki (Read 608 times)

legendary
Activity: 1042
Merit: 2805
Bitcoin and C♯ Enthusiast
I think it is clear enough. Variable-length data is not a Bitcoin-specific thing. For example, we have it in the BER and DER encodings (note that DER is used for signature encoding), and the format follows the same pattern: an initial value (usually the first bit) tells us how to read the data that follows. For example, in a DER length field, if the first bit is 0 then the first byte is the value itself. If it is 1, the remaining 7 bits of the first byte tell you how many bytes to read next, and what you read afterwards is the value:
DER: 0x32 = 00110010 (first bit is 0) so the value is 0x32 or 50
DER: 0x81dc = 10000001 11011100 (first bit is 1, so the remaining 7 bits of the first byte give the length, which is 1 byte, and the next byte is the value) so the value is 0b11011100 or 220
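
To make the two DER cases above concrete, here is a minimal Python sketch of reading such a length field. The function name read_der_length is made up for illustration, and it only handles the definite short and long forms described above, so treat it as a sketch rather than a full DER parser.

```python
def read_der_length(data: bytes) -> tuple:
    """Return (value, bytes_consumed) for a DER length field at the start of data.

    Hypothetical helper for illustration only.
    """
    first = data[0]
    if first & 0x80 == 0:
        # First bit is 0: the byte itself is the value (0..127).
        return first, 1
    # First bit is 1: the low 7 bits say how many bytes follow.
    count = first & 0x7F
    if count == 0 or len(data) < 1 + count:
        raise ValueError("unsupported or truncated length field")
    # The follow-up bytes hold the value, most significant byte first.
    return int.from_bytes(data[1:1 + count], "big"), 1 + count


print(read_der_length(bytes.fromhex("32")))    # (50, 1)
print(read_der_length(bytes.fromhex("81dc")))  # (220, 2)
```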

The same holds for Compact Int, but instead of the first bit we use the first byte and a value comparison. The rule is that if the first byte (the first octet) is smaller than 0xfd (253), then the first byte is the value itself. Otherwise the first byte indicates the length and the bytes that follow are the actual value, stored in little-endian order: 0xfd is followed by 2 bytes, 0xfe by 4, and 0xff by 8.
0xff5f20054ebbe71200:
first byte is 0xff, so it indicates the length to read, which is 8 bytes. We take the following 8 bytes, 5f-20-05-4e-bb-e7-12-00, read them in little-endian order as 0x0012e7bb4e05205f, and the value is 5321341234651231
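
The Compact Int rule can be sketched the same way. This is a rough Python decoder, not code from any particular library: the name read_compact_int is hypothetical, and it relies on Bitcoin serializing these integers in little-endian byte order. It also does not reject non-minimal encodings, which a stricter parser might want to do.

```python
def read_compact_int(data: bytes) -> tuple:
    """Return (value, bytes_consumed) for a Compact Int at the start of data.

    Hypothetical helper for illustration only.
    """
    first = data[0]
    if first < 0xFD:
        # Below 0xfd the first byte is the value itself.
        return first, 1
    # Otherwise the first byte selects how many bytes follow.
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    payload = data[1:1 + width]
    if len(payload) != width:
        raise ValueError("truncated Compact Int")
    # The follow-up bytes are the value, least significant byte first.
    return int.from_bytes(payload, "little"), 1 + width


print(read_compact_int(bytes.fromhex("32")))                  # (50, 1)
print(read_compact_int(bytes.fromhex("fddc00")))              # (220, 3)
print(read_compact_int(bytes.fromhex("fe40420f00")))          # (1000000, 5)
print(read_compact_int(bytes.fromhex("ff0000000001000000")))  # (4294967296, 9)
```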

Here is a preview of an upcoming tool I am working on. It shows different encodings with their corresponding byte arrays and integer values.
copper member
Activity: 2856
Merit: 3071
https://bit.ly/387FXHi lightning theory
From this page there's a table: https://en.bitcoin.it/wiki/Protocol_documentation#Message_types

Value            Storage length   Format
< 0xFD           1                uint8_t
<= 0xFFFF        3                0xFD followed by the length as uint16_t
<= 0xFFFFFFFF    5                0xFE followed by the length as uint32_t
-                9                0xFF followed by the length as uint64_t

There isn't really any description of what the first character is supposed to represent. And why are the second, third and fourth needed? Is this just in case the voltage gets misread?
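
One way to read the table is as an encoder: the first byte is either the value itself or a marker (0xFD, 0xFE, 0xFF) announcing how many bytes follow, so the extra rows exist to fit larger numbers rather than to provide redundancy. Below is a minimal Python sketch under that reading. The name write_compact_int is hypothetical, and it assumes the 0xFE marker carries 4 bytes (matching the storage length of 5) and that all multi-byte values are little-endian.

```python
def write_compact_int(value: int) -> bytes:
    """Encode value using the smallest row of the table that can hold it.

    Hypothetical helper for illustration only.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < 0xFD:
        # Row 1: a single byte that is the value itself.
        return value.to_bytes(1, "little")
    if value <= 0xFFFF:
        # Row 2: marker 0xFD plus 2 bytes, 3 bytes in total.
        return b"\xfd" + value.to_bytes(2, "little")
    if value <= 0xFFFFFFFF:
        # Row 3: marker 0xFE plus 4 bytes, 5 bytes in total.
        return b"\xfe" + value.to_bytes(4, "little")
    if value <= 0xFFFFFFFFFFFFFFFF:
        # Row 4: marker 0xFF plus 8 bytes, 9 bytes in total.
        return b"\xff" + value.to_bytes(8, "little")
    raise ValueError("value does not fit in 8 bytes")


print(write_compact_int(252).hex())      # fc
print(write_compact_int(253).hex())      # fdfd00
print(write_compact_int(65536).hex())    # fe00000100
print(write_compact_int(2 ** 32).hex())  # ff0000000001000000
```

Note that 253 already needs the 3-byte form, because a lone 0xFD byte is reserved as a marker.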