Topic: Question about base58 encoding (Read 1121 times)

staff
Activity: 4326
Merit: 8951
May 18, 2014, 02:41:51 AM
#6
FWIW, I think the current bignum-free code we have in the reference client is now moderately clearer than what's in the wiki. https://github.com/bitcoin/bitcoin/blob/master/src/base58.cpp#L66
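
For readers who don't want to dig through the C++, here is a rough Python sketch of what a bignum-free encoder can look like. It is not a transcription of base58.cpp; the function name `base58_encode_no_bignum` and the buffer-size estimate are illustrative assumptions. The idea is to keep the result as an array of base58 digits and, for each input byte, do "digits = digits * 256 + byte" with schoolbook carries.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode_no_bignum(data: bytes) -> str:
    """Base58-encode bytes without big-integer arithmetic (illustrative sketch)."""
    # Count the leading zero bytes; each becomes one '1' in the output.
    zeros = 0
    while zeros < len(data) and data[zeros] == 0:
        zeros += 1

    # log(256)/log(58) is about 1.366, so 138/100 (rounded up) is a safe
    # upper bound on the number of base58 digits needed.
    size = (len(data) - zeros) * 138 // 100 + 1
    b58 = [0] * size  # big-endian base58 digits, all small integers

    for byte in data[zeros:]:
        carry = byte
        # b58 = b58 * 256 + byte, processed from the least significant digit.
        for i in range(size - 1, -1, -1):
            carry += 256 * b58[i]
            b58[i] = carry % 58
            carry //= 58

    # Skip the unused (zero) digits at the front of the buffer.
    start = 0
    while start < size and b58[start] == 0:
        start += 1

    return "1" * zeros + "".join(ALPHABET[d] for d in b58[start:])

print(base58_encode_no_bignum(b"\x00\x01"))  # one '1' for the zero byte, then '2'
```

Every intermediate value stays below 256 * 57 + 255 + 57, so fixed-width integers are enough, which is the point of avoiding a bignum library.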
donator
Activity: 1218
Merit: 1080
Gerald Davis
May 17, 2014, 05:06:16 PM
#5
It's the way it's done, but whether it was the most reasonable way to do it, I'd have argued about it. :)

Exactly.  It is another "Satoshism".  It makes things more complex for no obvious benefit or reason.  Each one by itself isn't that hard to work through, but collectively they add up to reduce the transparency of the code.

This does mean that some Bitcoin addresses are shorter; however, each additional character shorter is progressively less likely.  The vast majority (well over 99%) of version 0 addresses are 33 or 34 characters long.  I can't see that being a good reason for all the complexity.  It actually makes things slightly more confusing because you can't say "All Bitcoin addresses are X characters long"; you can't even say 33 or 34 characters, because in theory they can be much shorter (version 0x00 + payload/hash of all zeroes + checksum comes out at 27 characters).  No idea why Satoshi didn't just do something like prepend the version byte (where a valid version byte is any value other than 0x00) and then convert to base58.  No need for any weird leading 0x00 = "1" check.  By using a version byte of 0x01 or greater, the leading zeroes of the hash would be preserved.
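
To make the length point concrete, here is a small Python sketch (Python rather than Mathematica, purely for convenience). The helper name `base58check` is made up for this example, and it starts from a 20-byte hash rather than a public key, so no RIPEMD-160 is needed. It builds version byte + hash + 4-byte double-SHA256 checksum, applies the leading-"1" rule, and prints the resulting lengths for the two extreme hashes.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + checksum as a Base58Check string (sketch)."""
    body = bytes([version]) + payload
    # Checksum: first 4 bytes of SHA-256 applied twice to version + payload.
    checksum = hashlib.sha256(hashlib.sha256(body).digest()).digest()[:4]
    data = body + checksum

    # Leading zero bytes are lost in the integer conversion, so count them first.
    n_zeros = len(data) - len(data.lstrip(b"\x00"))
    x = int.from_bytes(data, "big")

    digits = ""
    while x > 0:
        x, rem = divmod(x, 58)
        digits = ALPHABET[rem] + digits

    # One '1' per leading zero byte, then the base58 digits of the rest.
    return "1" * n_zeros + digits

# Shortest and longest cases for a version 0 address: all-zero and all-0xFF hash.
for hash160 in (bytes(20), b"\xff" * 20):
    addr = base58check(0x00, hash160)
    print(len(addr), addr)
```

The all-zero hash gives 21 leading "1" characters (version byte plus 20 hash bytes) followed by the encoded checksum, while the all-0xFF hash gives the full 34 characters.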

It isn't ever going to change now but it does make you scratch your head and it makes it harder to understand the code because there is no obvious benefit (which helps when trying to figure out what a chunk of code is doing).
legendary
Activity: 2058
Merit: 1416
aka tonikt
May 17, 2014, 04:45:29 PM
#4
It's the way it's done, but whether it was the most reasonable way to do it, I'd have argued about it. :)

Anyway, glad I could help.
And when reading the wiki, remember that it's not the ultimate spec; we can fix it as we discover parts that need improving, as you just did.
legendary
Activity: 1162
Merit: 1007
May 17, 2014, 04:39:56 PM
#3
Thanks piotr!  If the 0x01 is not really supposed to be there, then it all makes sense to me.  I still find it odd to use a base58 digit to represent a byte, but I guess that's probably actually the most reasonable way to have done it.  
legendary
Activity: 2058
Merit: 1416
aka tonikt
May 17, 2014, 03:54:47 PM
#2
You add 0x01 at the input, while you should be adding '1' at the output.

Yeah, there is a weird thing in bitcoin's base58 encoder.
At the end of all this math, for every byte of value 0 at the beginning of the input data, you add one character '1' to the output string.
So if there was one zero byte, you add one character. If there were two, you add two, and so on...
That's probably what you're missing.
It's actually not that weird if you consider that these extra 1s preserve the byte length of the original data, though once you know that version 0 addresses have a fixed length and a 4-byte checksum, you also realize that it's in fact pretty useless.
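
As a rough illustration of the rule above, here is a minimal Python sketch (Python rather than Mathematica, just so it is easy to run; the name `base58_encode` is only for this example). It converts the bytes to one big integer, produces the base58 digits, and then prepends one '1' per leading zero byte.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Base58-encode bytes using Python's arbitrary-precision integers (sketch)."""
    # Count the zero bytes at the start of the input.
    n_zeros = len(data) - len(data.lstrip(b"\x00"))

    # Interpreting the bytes as a number silently drops those leading zeros.
    x = int.from_bytes(data, "big")

    digits = ""
    while x > 0:
        x, rem = divmod(x, 58)
        digits = ALPHABET[rem] + digits

    # Put one '1' back for every leading zero byte.
    return "1" * n_zeros + digits

print(base58_encode(b"\x01"))      # 2
print(base58_encode(b"\x00\x01"))  # 12
```

Without the last step, `b"\x01"` and `b"\x00\x01"` would both encode to "2", because they are the same integer. The extra '1' is what lets a decoder recover the original byte length.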

edit:
The wiki is a bit misleading with the 0x01, since most addresses have 0x00, but what it says is still all true.

edit2:
ftfy: https://en.bitcoin.it/wiki/Base58Check_encoding#Base58_symbol_chart
legendary
Activity: 1162
Merit: 1007
May 17, 2014, 02:43:57 PM
#1
I'm slowly working my way through the technical details of bitcoin by writing custom Mathematica code (just for learning purposes).  Last night I wrote my own code to convert an ECDSA public key into a bitcoin address.  I (think I) understand everything except for the leading "1"s in the bitcoin address.  Here's my Mathematica function for computing the bitcoin address from a public key:

Code:
bitcoinaddress[publicKey_Integer] := Module[{h1, h2, addr},
   h1 = ripemd160[sha256[publicKey, 65], 32];
   h2 = sha256[sha256[h1, 21], 32];
   addr = 2^32 h1 + Quotient[h2, 2^224];
   Return[ToBase58[addr]]];

(*Note that the second argument in the hash functions is the number of message bytes to assume since Mathematica is using arbitrary-precision integers*)

And here's the flawed code I'm using for ToBase58[].  The flaw is at the end when I prepend the "1" in order to get the "right answer," but I realize this is a hack and will not always work if there are leading zeros in the pub-key hash:

Code:
ToBase58[y_Integer] :=
 Module[{codestring = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz", outputString = "", x = y},
 
    While[x > 0, outputString = outputString <> StringTake[codestring, {Mod[x, 58] + 1}]; x = Quotient[x, 58]];

    Return["1" <> StringReverse[outputString]]];  (*prepending the "1" like this is wrong*)



On the bitcoin wiki, it says:




What I don't understand are these things in red:

   address_byte_string (consisting of 0x01 + hash + 4-byte_check_code)

If I prepend 0x01 prior to converting to base58 I get the wrong address, whereas if I ignore this byte I get the correct address.

   repeat(number_of_leading_zero_bytes_in_hash)

^^This seems strange and arbitrary to me.  I add an extra "1" for every leading byte that is 0?



NOTE TO READERS: Do not use this code for anything important.  Use a real bitcoin library.  