Pages:
Author

Topic: 10 BTC 4 U 2 STEAL - Protected by a weak 5-letter password - crack & it's yours! - page 7. (Read 20210 times)

legendary
Activity: 1008
Merit: 1000
Stupid question, but is the encryption/decryption function the exact same one that is used when I open the binary you provided and click on "tools --> address utility" and then type in an "encryption phrase" ?

i.e., in principle, could one steal the BTC from the wallet you provided by trying every AAAAA through zzzzz combination into that box and clicking on "generate"?
legendary
Activity: 1358
Merit: 1002
Also the Password is Steal Tongue

LMAO

Mike has a sense of humour.
Too bad this test only showed that people and their predictable way of thinking are security biggest enemy.
hero member
Activity: 507
Merit: 500
Isnt there a  trick you can use prehashed tables to dot his in rapid fire?

somewhere i have a CD/DVD/USB with ever possible windows XP password hash  at the time that wintergen would spit out... made cracking brute-force a lot easier...
Check the BIP 38. It isn't quite that simple.

Let me fire up the CRAY..... Wow well so I am looking at a 1TB+ of data for hash tables Tongue


Also the Password is Steal Tongue
donator
Activity: 1218
Merit: 1079
Gerald Davis
The belief that C# (and Java) is god awful slow is simply an urban legend.  Maybe that was true in .NET 0.9b but it hasn't been true for a while now.  

Obviously it is impossible to boil down an entire language (and all compiler implementations, and all possible source codes, and all possible implementation skills down to a single number however)....
http://shootout.alioth.debian.org/u32/which-programs-are-fastest.php

Code:
C++ (g++)                      1.00
C (gcc)                        1.16
Java (Java7)                   1.59
C# (mono)                      2.20
Javascript (chrome V8)         3.04

Execution time's normalized to 1.00 for C++ execution time.   Sure Java, C#, and Javascript are slower but they aren't magnitudes slower and to cut a 36 year brute force search down to say 1 month you are talking about needing MORE than 2 magnitudes in performance increase. 

JIT compilers have come a long way.   C++ compilers have also come along way too.  C++ essnetially adds no overhead to writing native C.  Remember there is a skill factor to consider.  Compilers essentially are a skill normalizer.  Maybe one person in twenty can write better C code than a C++ compiler but then again if that person contributed to C++ compiler enhancement that knowledge is then shared.   




scrypt was simply intended to be very slow.  Even if the C# implementation is poorly written (or compiled) it is unlikely one can expect to gain 100x increase in performance by just switching languages.  There is a reason why fastcash4bitcoins uses bcrypt.  bcrypt is a similar algorithm.  scrypt is even more memory hard however we still use bcrypt simply because it has been around longer.  Anyone taking simple SHA hash of a password is just making the job of an attacker a couple MILLION times easier (no not an exaggeration).

There are two possible vulnerabilities:
a) dictionary attacks (to include dictionary substitution attacks), weak password is still weak even with scrypt. 
b) precomputation attacks (if the algorithm doesn't have some salt value then given enough time a group of hackers could build a distributed database of ALL 1,2,3,4,5,6,7,8,9,10... char passwords).
sr. member
Activity: 306
Merit: 257

Is this really the bottleneck?  I'd have figured that there would be a performance penalty stemming from all the allocation of objects on the heap and have never dug into the inner workings of scrypt this far.  Although I have worked with assembly quite a bit, I have never done so in the context of C#, and have assumed that the JIT compiler does a pretty good job of being efficient at the instruction level other than incurring a ton of overhead moving in and out of objects it so cautiously allocates.

How did you arrive at the above?  If I try to look for the same piece of code in my app, I get this:

Looks to me like it's going about it a much longer way than what you've quoted.  Are we looking at the same implementation?  (the code you've quoted isn't even something I could find verbatim in the implementation I used).  All of these instructions look like they are accomplishing nothing more than checking that the array references are in bounds, and the code you're quoting doesn't reference an array.  In fact it is checking very redundantly in spite of obvious possible optimizations (e.g. checking to see that x[0] is a in bounds immediately after having verified that x[4] is).  And it doesn't unroll the call to R(), something I wouldn't have expected anyway.

EDIT: not surprisingly, I get substantially better performance compiling this as a "release" build and then not running it under the debugger, the scrypt operation taking closer to 1 second versus 3 seconds without changing a single line of code, even though the release build still shows all the wasteful bounds checks if I look at the disassembly.

You are right about the boundary checks. They can be avoided with unsafe pointers though. Also, C# doesn't do inlines well.
So I hand-optimized the inner loop:
Code:
               int i;
                uint x0 = input[0+ inputOffset];
                uint x1 = input[1 + inputOffset];
                uint x2 = input[2 + inputOffset];
                uint x3 = input[3 + inputOffset];
                uint x4 = input[4 + inputOffset];
                uint x5 = input[5 + inputOffset];
                uint x6 = input[6 + inputOffset];
                uint x7 = input[7 + inputOffset];
                uint x8 = input[8 + inputOffset];
                uint x9 = input[9 + inputOffset];
                uint x10 = input[10 + inputOffset];
                uint x11 = input[11 + inputOffset];
                uint x12 = input[12 + inputOffset];
                uint x13 = input[13 + inputOffset];
                uint x14 = input[14 + inputOffset];
                uint x15 = input[15 + inputOffset];

                for (i = rounds; i > 0; i -= 2)
                {

                    x4 ^= R(x0 + x12, 7); x8 ^= R(x4 + x0, 9);
                    x12 ^= R(x8 + x4, 13); x0 ^= R(x12 + x8, 18);
                    x9 ^= R(x5 + x1, 7); x13 ^= R(x9 + x5, 9);
                    x1 ^= R(x13 + x9, 13); x5 ^= R(x1 + x13, 18);
                    x14 ^= R(x10 + x6, 7); x2 ^= R(x14 + x10, 9);
                    x6 ^= R(x2 + x14, 13); x10 ^= R(x6 + x2, 18);
                    x3 ^= R(x15 + x11, 7); x7 ^= R(x3 + x15, 9);
                    x11 ^= R(x7 + x3, 13); x15 ^= R(x11 + x7, 18);
                    x1 ^= R(x0 + x3, 7); x2 ^= R(x1 + x0, 9);
                    x3 ^= R(x2 + x1, 13); x0 ^= R(x3 + x2, 18);
                    x6 ^= R(x5 + x4, 7); x7 ^= R(x6 + x5, 9);
                    x4 ^= R(x7 + x6, 13); x5 ^= R(x4 + x7, 18);
                    x11 ^= R(x10 + x9, 7); x8 ^= R(x11 + x10, 9);
                    x9 ^= R(x8 + x11, 13); x10 ^= R(x9 + x8, 18);
                    x12 ^= R(x15 + x14, 7); x13 ^= R(x12 + x15, 9);
                    x14 ^= R(x13 + x12, 13); x15 ^= R(x14 + x13, 18);
                }
                output[0 + outputOffset] = x0 + input[0 + inputOffset];
                output[1 + outputOffset] = x1 + input[1 + inputOffset];
                output[2 + outputOffset] = x2 + input[2 + inputOffset];
                output[3 + outputOffset] = x3 + input[3 + inputOffset];
                output[4 + outputOffset] = x4 + input[4 + inputOffset];
                output[5 + outputOffset] = x5 + input[5 + inputOffset];
                output[6 + outputOffset] = x6 + input[6 + inputOffset];
                output[7 + outputOffset] = x7 + input[7 + inputOffset];
                output[8 + outputOffset] = x8 + input[8 + inputOffset];
                output[9 + outputOffset] = x9 + input[9 + inputOffset];
                output[10 + outputOffset] = x10 + input[10 + inputOffset];
                output[11 + outputOffset] = x11 + input[11 + inputOffset];
                output[12 + outputOffset] = x12 + input[12 + inputOffset];
                output[13 + outputOffset] = x13 + input[13 + inputOffset];
                output[14 + outputOffset] = x14 + input[14 + inputOffset];
                output[15 + outputOffset] = x15 + input[15 + inputOffset];

EDIT: Oops, got it wrong the first time, corrected. Inlining doesn't get you much. Still, x2 faster than original implementation.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
By "giving bits" I mean giving information that cuts potential search space by half for each bit given, an information theoretical bit, not referring to ASCII. Like, saying whether a letter in a given spot is upper or lower case is to leak one bit.
donator
Activity: 2058
Merit: 1054
By the way, this money is going to get taken one way or another, even if I have to start leaking bits of the password to speed up the process.

Tell me the first 4 bits and I'll get it in no time.  Cool

That means if I don't tell you the first 4 bits, you'll get it in 16 * (no time).

But if you know some way where just 4 bits would give you more than a 16x advantage, please share!

Okay, I will.

Possibilities for ASCII code of first letter with no info: 65...90, 97...122 (52 possibilities, 2704 possibilities for first two letters)
Provide the following bits: ???11??? ???11???

This narrows down the possibilities for each letter to only six options: 01011000 01011001 01011010 01111000 01111001 01111010 (88,89,90,120,121,122 = X,Y,Z,x,y,z), so 36 possibilities for the first two letters. 2704/36 = 75.11x advantage.
Yes, if those are indeed the bit values at these places. If they are other values the advantage will be markedly less than x16.

In general, querying 4 bits can never cut down your search space to less than 1/16 on average.

(I'm guessing you already knew that, this is more a service to those confused about the "magic".)
sr. member
Activity: 430
Merit: 250
Isnt there a  trick you can use prehashed tables to dot his in rapid fire?

somewhere i have a CD/DVD/USB with ever possible windows XP password hash  at the time that wintergen would spit out... made cracking brute-force a lot easier...
Check the BIP 38. It isn't quite that simple.
hero member
Activity: 507
Merit: 500
Isnt there a  trick you can use prehashed tables to dot his in rapid fire?

somewhere i have a CD/DVD/USB with ever possible windows XP password hash  at the time that wintergen would spit out... made cracking brute-force a lot easier...
sr. member
Activity: 330
Merit: 397
By the way, this money is going to get taken one way or another, even if I have to start leaking bits of the password to speed up the process.

Tell me the first 4 bits and I'll get it in no time.  Cool

That means if I don't tell you the first 4 bits, you'll get it in 16 * (no time).

But if you know some way where just 4 bits would give you more than a 16x advantage, please share!

Okay, I will.

Possibilities for ASCII code of first letter with no info: 65...90, 97...122 (52 possibilities, 2704 possibilities for first two letters)
Provide the following bits: ???11??? ???11???

This narrows down the possibilities for each letter to only six options: 01011000 01011001 01011010 01111000 01111001 01111010 (88,89,90,120,121,122 = X,Y,Z,x,y,z), so 36 possibilities for the first two letters. 2704/36 = 75.11x advantage.
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
Is this going to require a lot of code to implement in C? I can probably do it, but I really don't want to bother if it's going to be a lot of code.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
It would take 36 years to run on the slow C# crappy implementation of scrypt that I have bundled with my app.  Simply feeding the same input to a more efficient implementation should chop this figure down by orders of magnitude.

C# generates this code for the inner loop:

                    x4 ^= (x0 << 7) | (x12 >> (32 - 7));
000002bc  mov         eax,dword ptr [ebp-28h]
000002bf  shl         eax,7
000002c2  mov         edx,dword ptr [ebp-58h]
000002c5  shr         edx,19h
000002c8  or          eax,edx
000002ca  xor         dword ptr [ebp-38h],eax

I don't think C++ compiler would do any better


Is this really the bottleneck?  I'd have figured that there would be a performance penalty stemming from all the allocation of objects on the heap and have never dug into the inner workings of scrypt this far.  Although I have worked with assembly quite a bit, I have never done so in the context of C#, and have assumed that the JIT compiler does a pretty good job of being efficient at the instruction level other than incurring a ton of overhead moving in and out of objects it so cautiously allocates.

How did you arrive at the above?  If I try to look for the same piece of code in my app, I get this:

Code:
                   x[4] ^= R(x[0] + x[12], 7);
00000113  mov         eax,dword ptr [ebp+8]
00000116  cmp         dword ptr [eax+4],4
0000011a  ja          00000121
0000011c  call        5A95AEC4
00000121  lea         eax,[eax+18h]
00000124  mov         dword ptr [ebp-54h],eax
00000127  mov         eax,dword ptr [ebp-54h]
0000012a  mov         eax,dword ptr [eax]
0000012c  mov         dword ptr [ebp-58h],eax
0000012f  mov         eax,dword ptr [ebp+8]
00000132  cmp         dword ptr [eax+4],0
00000136  ja          0000013D
00000138  call        5A95AEC4
0000013d  mov         ecx,dword ptr [eax+8]
00000140  mov         eax,dword ptr [ebp+8]
00000143  cmp         dword ptr [eax+4],0Ch
00000147  ja          0000014E
00000149  call        5A95AEC4
0000014e  add         ecx,dword ptr [eax+38h]
00000151  mov         edx,7
00000156  call        FB56FE08
0000015b  mov         dword ptr [ebp-5Ch],eax
0000015e  mov         eax,dword ptr [ebp-58h]
00000161  xor         eax,dword ptr [ebp-5Ch]
00000164  mov         edx,dword ptr [ebp-54h]
00000167  mov         dword ptr [edx],eax


Looks to me like it's going about it a much longer way than what you've quoted.  Are we looking at the same implementation?  (the code you've quoted isn't even something I could find verbatim in the implementation I used).  All of these instructions look like they are accomplishing nothing more than checking that the array references are in bounds, and the code you're quoting doesn't reference an array.  In fact it is checking very redundantly in spite of obvious possible optimizations (e.g. checking to see that x[0] is a in bounds immediately after having verified that x[4] is).  And it doesn't unroll the call to R(), something I wouldn't have expected anyway.

EDIT: not surprisingly, I get substantially better performance compiling this as a "release" build and then not running it under the debugger, the scrypt operation taking closer to 1 second versus 3 seconds without changing a single line of code, even though the release build still shows all the wasteful bounds checks if I look at the disassembly.
sr. member
Activity: 306
Merit: 257
It would take 36 years to run on the slow C# crappy implementation of scrypt that I have bundled with my app.  Simply feeding the same input to a more efficient implementation should chop this figure down by orders of magnitude.

C# generates this code for the inner loop:

                    x4 ^= (x0 << 7) | (x12 >> (32 - 7));
000002bc  mov         eax,dword ptr [ebp-28h]
000002bf  shl         eax,7
000002c2  mov         edx,dword ptr [ebp-58h]
000002c5  shr         edx,19h
000002c8  or          eax,edx
000002ca  xor         dword ptr [ebp-38h],eax

I don't think C++ compiler would do any better
full member
Activity: 196
Merit: 100
Well I'd started and my F***** RAID just crashed.

Ouch, hope you manage to get it back up and running.


I think I have just found something slower than generating bitcoins... 4 hours later
full member
Activity: 196
Merit: 100
Well I'd started and my F***** RAID just crashed.

Ouch, hope you manage to get it back up and running.

If we look down we see:

byte[] checksum = sha256.ComputeHash(utf8.GetBytes(passphrase + "?"));

            if (hex[2] != 0x80) {
                if ((checksum[0] & 0x7f) != hex[2] || (checksum[1] & 0x7e) != (hex[3] & 0x7e)) {
                    return false;
                }
            }

It seems we can recover a partial solution, because if it is NOT a partial solution the above will fail.

This isn't it.  I have implemented multiple ways to encrypt keys, the contest key has a prefix of 6Pf and therefore uses the EC Multiply method I detail in BIP 38.  The code you're quoting here is from ShaPassphraseKeyPair.cs, which is for a simpler SHA256-based algorithm that doesn't use scrypt, and the code you've quoted doesn't ever get reached when you attempt to decrypt the note in the OP.  The class you want to be poking at is Bip38KeyPair.cs, specifically the portion that creates a Bip38Intermediate from the passphrase and tries to use it to decrypt the key.

The use of EC Multiply and the Bip38Intermediate code isn't what makes it slow, rather, these are responsible for a couple of features: 1 - it allows one person to know the passphrase and give away only an "intermediate code", and a second person to generate bitcoin addresses with it that only the first person's passphrase can spend (which I intend to use to offer two-factor physical bitcoins).  2 - it happens to allow one passphrase to be expensively hashed once and then used to create thousands of new bitcoin addresses, each with unique per-address salt but the same passphrase, very quickly.

Quote
By getting the products up-to this stage, popping them over the serial port to an fpga...... and since Bitcoin is DOUBLE sh256
We can get two engines cycling thrugh the Sha256 keys for the range AAAAA-zzzzz ( actually by splitting the space over two devices we can half the searchtime with a brute it becomes 190102016

The algorithm that needs to be accelerated to make this work is scrypt, rather than sha256, so the FPGA may not be of much use.


Yep my RAID crashed during reading the source... then it crashed during the rebuild
TIP #1 Synology kit is great until something goes wrong.

I thought perhaps it was a little too easy......, but I have seen far more stupid things done with "secure" fingerprint diskdrives.

As regards accelerating scryp, it is unlikely as it was specifically written to ensure it cannot be pipeline easily

vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
I imagined there'd be some Litecoin fans who would be experts in knowing what kind of performance to expect on an optimized implementation of scrypt.  By "optimized", I mostly mean an implementation that isn't handicapped by being written in C#.  What I figured would crack this is maybe someone already has a good scrypt that could be compiled into a DLL and called from the C# environment so the better-performing code could be used without rewriting anything in another language.

Taking scrypt a little further, I was thinking it is almost time for a new proposal for a standardized way to do brainwallets in place of the common practice of using SHA256.

In a brainwallet proposal, I would propose that the user would be asked for their postal code and government ID number (however they interpret that) for use as salt, to generate or redeem a brainwallet.  Those two entries would be stripped of all but letters and numbers, forced to uppercase, and concatenated, simply so that the user has a maximum chance of remembering something that produces the same final result, without having to remember how they formatted it.  That personal data would be used as salt, run through scrypt, and then they could choose a strong passphrase without it having to be "to the moon" in length to be secure.
legendary
Activity: 1904
Merit: 1037
Trusted Bitcoiner
The algorithm that needs to be accelerated to make this work is scrypt

I'm thinking scrypt cannot be accelerated enough, its deigned to take time, this is what makes it strong.

I'm no expert, this is the conclusion i came to after a few mins of poking around.

scrypt is no joke.
vip
Activity: 1386
Merit: 1140
The Casascius 1oz 10BTC Silver Round (w/ Gold B)
Well I'd started and my F***** RAID just crashed.

Ouch, hope you manage to get it back up and running.

If we look down we see:

byte[] checksum = sha256.ComputeHash(utf8.GetBytes(passphrase + "?"));

            if (hex[2] != 0x80) {
                if ((checksum[0] & 0x7f) != hex[2] || (checksum[1] & 0x7e) != (hex[3] & 0x7e)) {
                    return false;
                }
            }

It seems we can recover a partial solution, because if it is NOT a partial solution the above will fail.

This isn't it.  I have implemented multiple ways to encrypt keys, the contest key has a prefix of 6Pf and therefore uses the EC Multiply method I detail in BIP 38.  The code you're quoting here is from ShaPassphraseKeyPair.cs, which is for a simpler SHA256-based algorithm that doesn't use scrypt, and the code you've quoted doesn't ever get reached when you attempt to decrypt the note in the OP.  The class you want to be poking at is Bip38KeyPair.cs, specifically the portion that creates a Bip38Intermediate from the passphrase and tries to use it to decrypt the key.

The use of EC Multiply and the Bip38Intermediate code isn't what makes it slow, rather, these are responsible for a couple of features: 1 - it allows one person to know the passphrase and give away only an "intermediate code", and a second person to generate bitcoin addresses with it that only the first person's passphrase can spend (which I intend to use to offer two-factor physical bitcoins).  2 - it happens to allow one passphrase to be expensively hashed once and then used to create thousands of new bitcoin addresses, each with unique per-address salt but the same passphrase, very quickly.

Quote
By getting the products up-to this stage, popping them over the serial port to an fpga...... and since Bitcoin is DOUBLE sh256
We can get two engines cycling thrugh the Sha256 keys for the range AAAAA-zzzzz ( actually by splitting the space over two devices we can half the searchtime with a brute it becomes 190102016

The algorithm that needs to be accelerated to make this work is scrypt, rather than sha256, so the FPGA may not be of much use.
donator
Activity: 2058
Merit: 1054
Well I'd started and my F***** RAID just crashed.

So I may as well share it.... if I understand it correctly.......

It seems that it may have a similar weakness to the zip format.. if I'm not mistaken.


public override bool DecryptWithPassphrase(string passphrase){
.....}


If we look down we see:

byte[] checksum = sha256.ComputeHash(utf8.GetBytes(passphrase + "?"));

            if (hex[2] != 0x80) {
                if ((checksum[0] & 0x7f) != hex[2] || (checksum[1] & 0x7e) != (hex[3] & 0x7e)) {
                    return false;
                }
            }

It seems we can recover a partial solution, because if it is NOT a partial solution the above will fail.

By getting the products up-to this stage, popping them over the serial port to an fpga...... and since Bitcoin is DOUBLE sh256
We can get two engines cycling thrugh the Sha256 keys for the range AAAAA-zzzzz ( actually by splitting the space over two devices we can half the searchtime with a brute it becomes 190102016

Each time we get a 'hit' from the above, we pop it back to the computer to drop it into the code that follows the above code in  "DecryptWithPassphrase"

so even with a XUPV5 I can get over 500MHS through the key address space

52*52*52*52*52=380204032

0.76 seconds  Unless my maths have broken down.

Like I say my development env. crashed so I've nothing to test with.
This part of the code doesn't even run when doing the decryption. It has nothing to do with the problem. The code you need is in Bip38KeyPair.cs.

Anyway, doing SHA256 on the entire input range takes about 100 seconds, so that's more or less what you could save (out of decades) by offloading such calculations (if they exist).

I just read this:
Quote
On modern hardware and with default parameters, the cost of cracking the password on a file encrypted by scrypt enc is approximately 100 billion times more than the cost of cracking the same password on a file encrypted by openssl enc; this means that a five-character password using scrypt is stronger than a ten-character password using openssl.

https://www.tarsnap.com/scrypt.html
There's no way anyone is going to crack this via simple brute force. A dictionary attack is the only plausible option. If the password is a random jumble of lowercase and uppercase characters, I doubt anyone will crack it.
Perhaps for the first clue casascius will say if the password is random or meaningful.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
I just read this:
Quote
On modern hardware and with default parameters, the cost of cracking the password on a file encrypted by scrypt enc is approximately 100 billion times more than the cost of cracking the same password on a file encrypted by openssl enc; this means that a five-character password using scrypt is stronger than a ten-character password using openssl.

https://www.tarsnap.com/scrypt.html
There's no way anyone is going to crack this via simple brute force. A dictionary attack is the only plausible option. If the password is a random jumble of lowercase and uppercase characters, I doubt anyone will crack it.
Pages:
Jump to: