
Topic: Satoshi Compact Varint - CVARINT (Read 93 times)

newbie
Activity: 20
Merit: 1
March 03, 2024, 08:58:06 AM
#5
Thx for replying, sorry I missed your reply as I have been away.

I have mastered VARINT now and get it, but in the past there was more to this, especially a printed ASCII variant translation of blockchain data.

I am now drilling down into understanding FDF%F%F% etc., found in Satoshi ASCII text scripts of blockchain data from 2007-2010.

Once I can resolve "F%" then I may be closer to understanding my project.

Thx again
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
February 01, 2024, 04:55:30 AM
#4
Is there a CVarint type used somewhere in the Bitcoin code, or is this something you made? I am aware of a class called "CVarint" in the Bitcoin Core codebase, but that is just an implementation of varint. I've never heard of a compact varint type, though.
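
For reference, here is a minimal Python sketch of the VARINT scheme as I read it from Bitcoin Core's serialize.h: base-128, most significant group first, the high bit set on every byte except the last, and 1 subtracted at each continuation so that every value has exactly one encoding. The names write_varint and read_varint are mine for illustration, not Core's.

```python
def write_varint(n: int) -> bytes:
    """Encode a non-negative integer, most significant 7-bit group first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    groups = []
    while True:
        # The first group produced is the least significant one; it is
        # written last and is the only byte without the 0x80 flag.
        groups.append((n & 0x7F) | (0x80 if groups else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1
    return bytes(reversed(groups))


def read_varint(data: bytes) -> int:
    """Decode one varint from the start of data."""
    n = 0
    for byte in data:
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:
            n += 1  # undo the -1 applied by the encoder
        else:
            return n
    raise ValueError("truncated varint")


for value in (0, 127, 128, 255, 16383, 16384, 65535):
    encoded = write_varint(value)
    assert read_varint(encoded) == value
    print(value, encoded.hex())
```

Note that any value above 127 produces at least one byte with the high bit set, so the output is binary data rather than ASCII text.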

From googling just now, there is a compact varint type outside of Bitcoin though (https://github.com/protocolbuffers/protobuf/issues/4376), that serializes arbitrary-sized integers but uses more space. Are you referring to that by any chance?
newbie
Activity: 20
Merit: 1
January 31, 2024, 12:19:54 PM
#3
Some merit there. I did that already, but I have DM'ed you the string to test; see if you can work it out?
hero member
Activity: 1194
Merit: 573
OGRaccoon
January 31, 2024, 11:24:40 AM
#2
You could try encoding the input.

Code:
input_string = "".encode('utf-8')

The ord function returns the Unicode code point for the given character, so if you're outside the ASCII range the value might not fit in a single byte.

What you can do is encode the string as UTF-8 before processing it.

Keep in mind that if you modify the encode/decode functions to accept Unicode code points, you may need to change the scheme.
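
A small sketch of the difference, using arbitrary sample characters (not the string from the DM): ord() works on code points, while encoding first yields values that each fit in one byte.

```python
text = "A\u00e9\u20ac"  # 'A', e-acute, euro sign (arbitrary sample characters)

# ord() gives Unicode code points, which can exceed 255
code_points = [ord(c) for c in text]
print(code_points)  # [65, 233, 8364]

# Encoding first gives a sequence of bytes, each in the range 0..255
utf8_bytes = text.encode("utf-8")
print(list(utf8_bytes))  # [65, 195, 169, 226, 130, 172]

# The round trip is lossless
assert utf8_bytes.decode("utf-8") == text
```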
newbie
Activity: 20
Merit: 1
January 31, 2024, 06:34:27 AM
#1
Hello, my transformation needs to be precisely reversible. The Satoshi compact CVarint (not Varint) was used.
Here is my Python interpretation, but it seems to have created characters outside the ASCII range, so I can't verify it.
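
On the ASCII point: a base-128 encoding that flags continuation bytes with 0x80 will produce values above 0x7F for anything longer than one byte, so non-ASCII output is expected rather than a sign of failure. One hedged way to verify it anyway is to compare hex strings instead of characters. The sample bytes below are arbitrary placeholders, not real blockchain data.

```python
# Arbitrary sample bytes; values >= 0x80 are not ASCII
raw = bytes([0x83, 0x45, 0x80, 0x00, 0x7F])

hex_text = raw.hex()            # printable, ASCII-only representation
print(hex_text)                 # 834580007f

restored = bytes.fromhex(hex_text)
assert restored == raw          # exact round trip

# Which bytes fall outside the ASCII range?
non_ascii = [b for b in raw if b > 0x7F]
print([hex(b) for b in non_ascii])  # ['0x83', '0x80']
```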

serialize.h of the period doesn't address this fully. I would appreciate any pointers or artifacts.

def encode_camount(amount):
    # CAmount transformation
    if amount == 0:
        transformed_value = 0
    else:
        e = 0
        while amount % 10 == 0 and e < 9:
            amount //= 10
            e += 1
        if e < 9:
            d = amount % 10
            n = amount // 10
            transformed_value = 1 + 10 * (9 * n + d - 1) + e
        else:
            transformed_value = 1 + 10 * (amount - 1) + 9

    # MSB base-128 encoding: every byte except the last has bit 0x80 set,
    # and 1 is subtracted from the remaining value at each continuation
    encoded_bytes = []
    while True:
        byte = transformed_value & 0x7F
        if encoded_bytes:
            byte |= 0x80
        encoded_bytes.insert(0, byte)
        if transformed_value <= 0x7F:
            break
        transformed_value = (transformed_value >> 7) - 1

    return encoded_bytes

def decode_camount(encoded_bytes):
    # Decode the variable-length integer, most significant byte first
    decoded_value = 0
    for byte in encoded_bytes:
        decoded_value = (decoded_value << 7) | (byte & 0x7F)
        if byte & 0x80:
            decoded_value += 1  # undo the -1 applied by the encoder

    # Reverse the CAmount transformation
    if decoded_value == 0:
        return 0
    decoded_value -= 1  # remove the +1 offset before extracting e
    e = decoded_value % 10
    decoded_value //= 10
    if e < 9:
        n = decoded_value // 9
        d = decoded_value % 9 + 1
        original_amount = n * 10 + d
    else:
        original_amount = decoded_value + 1
    original_amount *= 10 ** e

    return original_amount


input_string = ""



# Step 1: Encoding - Convert each character to a byte and encode
encoded_data = [encode_camount(ord(c)) for c in input_string]
print("Encoded Data:", encoded_data)

# Step 2: Decoding - Decode each byte sequence
decoded_bytes = [decode_camount(data) for data in encoded_data]
print("Decoded Bytes:", decoded_bytes)

# Step 3: Re-encoding - Re-encode the decoded bytes
reencoded_data = [encode_camount(b) for b in decoded_bytes]
print("Reencoded Data:", reencoded_data)

# Step 4: Convert reencoded data back to byte literals
reencoded_bytes = bytearray()
for data in reencoded_data:
    for byte in data:
        reencoded_bytes.append(byte)

# Convert bytearray to string for display
reencoded_string = ''.join(format(x, '02x') for x in reencoded_bytes)
print("Reencoded Byte String:", reencoded_string)

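# Step 5: Verification - decoding must return the original code points,
# and re-encoding must reproduce the same bytes
verification = (decoded_bytes == [ord(c) for c in input_string]
                and reencoded_data == encoded_data)
print("Verification Successful:", verification)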

Thx for any pointers beyond a Google search.
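
A quick sanity check for the amount transformation on its own, separate from the byte encoding. This is a sketch modelled on my reading of the amount compression in Bitcoin Core's compressor code; compress_amount, decompress_amount and COIN are names chosen here for illustration. Round amounts should come out small: for example, 50 coins has nine or more trailing zeros, so it takes the e = 9 branch and gives 1 + 10 * (5 - 1) + 9 = 50.

```python
COIN = 100_000_000  # satoshis per bitcoin


def compress_amount(n: int) -> int:
    """Map an amount to a smaller integer by factoring out trailing zeros."""
    if n == 0:
        return 0
    e = 0
    while n % 10 == 0 and e < 9:
        n //= 10
        e += 1
    if e < 9:
        d = n % 10  # last non-zero digit, 1..9
        n //= 10
        return 1 + (n * 9 + d - 1) * 10 + e
    return 1 + (n - 1) * 10 + 9


def decompress_amount(x: int) -> int:
    """Inverse of compress_amount."""
    if x == 0:
        return 0
    x -= 1  # remove the +1 offset before extracting e
    e = x % 10
    x //= 10
    if e < 9:
        d = x % 9 + 1
        n = (x // 9) * 10 + d
    else:
        n = x + 1
    return n * 10 ** e


for amount in (0, 1, COIN // 100, COIN, 50 * COIN, 21_000_000 * COIN):
    packed = compress_amount(amount)
    assert decompress_amount(packed) == amount
    print(amount, "->", packed)
```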