Anyone here with math background who can help with Reed-Solomon codes problem?It's needed for the New Address Format:
https://forums.nxtcrypto.org/viewtopic.php?f=17&t=524Original ricot's idea was to use RS codes to auto-correct typos to get superficial, but cool advantage over bitcoin addresses.
But it seems there is a problem. We know that with parity = 4 the algorithm can reliably detect up to 4 errors or correct up to 2.
What isn't mentioned is that those choices seem to be
exclusive. You can either detect 4 errors for sure and get general 1/million collision probability (we use 20-bit redundancy) OR you can try to correct errors.
In almost quantum-physics-style weirdness, if you try to correct errors,
you cannot use the previous check anymore!And error correction fails miserably if there are more than 2 errors: it cannot detect that there are more. So it corrects into an incorrect address and there seems to be no way to verify this.
To put it in layman’s terms: you make more than 2 typos and you will almost certainly lose your money. This is unacceptable.
So can anyone either confirm this behavior or tell me what I am missing?
Sorry about the slow reply, haven't checked the thread this weekend. The nxtcrypto forum is down for me, and I haven't caught up with this thread yet, so I'm sorry again if it's already been answered.
It's not either-or, you can get both error-detection and error-correction. Yes, error-correction is based on the assumption that there are 2 or fewer errors, because there is always exactly one valid code within Hamming distance 2 (i.e. differing by 2 or fewer characters). But we don't know how many errors the user makes in input, so we shouldn't do like Microsoft Clippy and autocorrect, we should
reject the input (since we know it's wrong) and possibly
suggest the nearest valid code (which is what we think, but not guaranteed to be what the user intended.)
(The above is general coding theory, so applies to other codes like linear codes as well, not just Reed-Solomon)
Personally, I would implement it as a library with the following variable types (or classes) and functions:
a) baseAddress, which holds 65 bits corresponding to the 64 bits of a Nxt address, plus a pad bit (to reach 13 x 5 bit characters)
b) RSAddress, which holds a 17 char string corresponding to a possible RS-encoded address (this excludes prefixes)
1) toRSAddress(baseAddress) that returns a RSAddress, and
2) toBaseAddress(RSAddress) that either:
i) returns the baseAddress corresponding to the one and only valid Reed-Solomon address within Hamming distance 2 of RSAddress, if the pad bit indicates this is a valid baseAddress or
ii) throws an exception, if the pad bit indicates this is invalid.
(If the language I'm implementing it in doesn't support exceptions, I'd add another boolean function isValidAddress(RSAddress) )
Here's one way library users (e.g. client developers) can use it:
if ( toRSAddress(toBaseAddress(input)) == input ) {
... } //input is (almost certainly) the intended address
else {
... } //input is wrong. Up to the client dev whether they want to suggest the nearest valid neighbour, or ask the end-user to retype, or whatever.
EDIT: Personally, I'd keep this library simple and leave it to client devs how they want to handle prefixes, or other embellishments to the Reed-Solomon encoded addresses. But if the community can come to a consensus on a default way to present NXT addresses, might be a good idea for the library to handle this as well.
Hope this helps. I'm a hobbyist coder, I don't work with other professional coders so I'm not sure how coders communicate with each other about stuff like this. Let me know if there's anything you want me to expand on.