Topic: reverse-engineer very large JavaScript web app [Suggest reasonable price] (Read 2945 times)

member
Activity: 76
Merit: 10
Sorry, I decided to shelve this project.  I think I'm going to try other approaches to solving this.
full member
Activity: 227
Merit: 100
Regular expressions on their own are not suitable.

Consider using http://www.antlr.org/wiki/display/ANTLR3/Antlr3PerlTarget to define a context-free grammar (CFG) and build an AST of the JavaScript code.

Doing naming heuristics is going to be a bitch.

Yeah, perhaps an AST would be the best way to de-obfuscate JavaScript. I haven't checked, but I'm sure there are existing open-source modules for producing an AST from JavaScript.

Here is a PHP one: http://timwhitlock.info/blog/2009/11/14/jparser-and-jtokenizer-released/
member
Activity: 76
Merit: 10
Regular expressions on their own are not suitable.

Consider using http://www.antlr.org/wiki/display/ANTLR3/Antlr3PerlTarget to define a context-free grammar (CFG) and build an AST of the JavaScript code.

Doing naming heuristics is going to be a bitch.

Yeah, perhaps an AST would be the best way to de-obfuscate JavaScript. I haven't checked, but I'm sure there are existing open-source modules for producing an AST from JavaScript.
member
Activity: 76
Merit: 10
I have already added appropriate line breaks and spacing to all the JavaScript. So the 80,000-line figure is with the added line breaks.
member
Activity: 96
Merit: 10
No, I don't. That is part of the work -- to know the meaningful names, you must figure out what the code does.

Then I'm not sure a Perl script is what you are currently looking for...

This would be a completely manual process (with search/replace done in an editor): figure out what each function does, how it should be named, and how the rename affects the rest of the script. Once all of that has been figured out, the code would already be converted, and a smart person would have added plenty of inline comments along the way (it would only have helped them).

A Perl (or any language) script at this point wouldn't serve much purpose except for replaying what has already been figured out...

How big is the JS file you are trying to un-obfuscate? Is it currently a one-line minified file X characters long, or has it already been beautified so that the functions are spread over X lines?
full member
Activity: 227
Merit: 100
Regular expressions on their own are not suitable.

Consider using http://www.antlr.org/wiki/display/ANTLR3/Antlr3PerlTarget to define a context-free grammar (CFG) and build an AST of the JavaScript code.

Doing naming heuristics is going to be a bitch.
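
To make the point concrete, here is a minimal sketch of what goes wrong with a line-by-line substitution such as s/Ay/other_name/g from the original post. The input string is made up for illustration (only "Ay" and "other_name" come from the thread), and the two replacements are plain JavaScript equivalents of the Perl substitutions.

```javascript
// Hypothetical input: "Ay" is the identifier to rename; "Ayx" is a different
// identifier, and "Ay" also happens to appear inside a string literal.
const source = 'var Ay = 1; var Ayx = Ay + 2; log("Ay is set");';

// Naive substring replacement, equivalent to s/Ay/other_name/g.
const naive = source.replace(/Ay/g, "other_name");

// Word-boundary version, equivalent to s/\bAy\b/other_name/g.
const bounded = source.replace(/\bAy\b/g, "other_name");

console.log(naive);
// var other_name = 1; var other_namex = other_name + 2; log("other_name is set");

console.log(bounded);
// var other_name = 1; var Ayx = other_name + 2; log("other_name is set");
```

The word boundary stops "Ayx" from being corrupted, but the string literal is still rewritten, because a regexp cannot tell code from string contents. Telling those apart is what a tokenizer or an AST is for.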
hero member
Activity: 588
Merit: 500
It's going to be very hard to provide any sort of reasonable price suggestion without knowing more about just what it is you want decoded.
member
Activity: 76
Merit: 10
Also, I found a bug in your example Perl script. :P These lines should be:

Code:
open INPUT_JS, "<", "input.js" or die "Failed to open";
open OUTPUT_JS, ">", "output.js" or die "Failed to open";

Additionally, I don't think a simple one-time pass would be ideal for un-obfuscating JavaScript. The biggest issue is global vs. local scope and how to handle it appropriately.

Fixed in original post. Thanks.
member
Activity: 76
Merit: 10
Do you already have the meaningful names that the variables would be converted to, or would that be part of the work?

No, I don't. That is part of the work -- to know the meaningful names, you must figure out what the code does.
legendary
Activity: 1372
Merit: 1007
1davout
Please supply a list of Perl regexp transformations to rename the identifiers (JS variables, function names, etc.) to meaningful names. They are currently things like "Cb", "Fc", "T9a", etc.
Is the regexp supposed to magically guess a meaningful name from a two- or three-letter variable name?
member
Activity: 76
Merit: 10
No, it doesn't have to be Perl, and no, it doesn't have to be a one-pass series of regexp search&replace statements.  If you have a better way to do it, great!
member
Activity: 96
Merit: 10
Also, I found a bug in your example Perl script. :P These lines should be:

Code:
open INPUT_JS, "<", "input.js" or die "Failed to open";
open OUTPUT_JS, ">", "output.js" or die "Failed to open";

Additionally, I don't think a simple one-time pass would be ideal for un-obfuscating JavaScript. The biggest issue is global vs. local scope and how to handle it appropriately.
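
A small sketch of the scope problem, under stated assumptions: the two functions below are invented (only the short names "Cb" and "Fc" are borrowed from the thread), the new names "text" and "count" are hypothetical, and renameInFunction is a toy helper that finds a function body by counting braces. It ignores braces inside strings, regex literals and comments, and it would also rename a property called "a", so it is no substitute for a real parser.

```javascript
// The same short name "a" means something different in each function.
const source = [
  "function Cb(a) { return a.length; }",
  "function Fc(a) { return a * 2; }",
].join("\n");

// A global rename cannot tell the two "a"s apart.
const globalRename = source.replace(/\ba\b/g, "text");

// Escape a string so it can be used literally inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Toy helper (hypothetical): apply a rename map only inside one named function.
function renameInFunction(code, functionName, mapping) {
  const header = new RegExp("\\bfunction\\s+" + escapeRegExp(functionName) + "\\s*\\(");
  const match = header.exec(code);
  if (!match) return code;
  const start = match.index;

  // Walk from the opening brace to its matching closing brace.
  let end = -1;
  let depth = 0;
  for (let i = code.indexOf("{", start); i !== -1 && i < code.length; i++) {
    if (code[i] === "{") depth++;
    else if (code[i] === "}" && --depth === 0) { end = i + 1; break; }
  }
  if (end === -1) return code;

  // Rename whole-word occurrences within the function (header included).
  let body = code.slice(start, end);
  for (const [oldName, newName] of Object.entries(mapping)) {
    body = body.replace(new RegExp("\\b" + escapeRegExp(oldName) + "\\b", "g"), newName);
  }
  return code.slice(0, start) + body + code.slice(end);
}

let renamed = renameInFunction(source, "Cb", { a: "text" });
renamed = renameInFunction(renamed, "Fc", { a: "count" });

console.log(globalRename);
// function Cb(text) { return text.length; }
// function Fc(text) { return text * 2; }

console.log(renamed);
// function Cb(text) { return text.length; }
// function Fc(count) { return count * 2; }
```

The global version gives a number parameter the name "text". Getting this right across 80,000 lines means tracking which declaration each use of a name refers to, which again points toward an AST with scope information.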
hero member
Activity: 588
Merit: 500
Does it have to be Perl? It may turn out to be easier to do it in another language. (Not that I have anything against Perl.)

Also, I found a bug in your example Perl script. :P These lines should be:

Code:
open INPUT_JS, "<", "input.js" or die "Failed to open";
open OUTPUT_JS, ">", "output.js" or die "Failed to open";
member
Activity: 96
Merit: 10
Do you already have the meaningful names that the variables would be converted to, or would that be part of the work?
member
Activity: 76
Merit: 10
I have about 80,000 lines of JavaScript along with an HTML file. Please supply a list of Perl regexp transformations to rename the identifiers (JS variables, function names, etc.) to meaningful names. They are currently things like "Cb", "Fc", "T9a", etc.

(EDIT May 29: Deliverable is not a script; see the note at the end of this post)
Each line in the Perl script you give to me should transform $_, assuming $_ is a line of the JavaScript code that I supply you. For example:

Code:
s/this\.gB/this.descriptive_name/g;

Example of complete Perl script (including boilerplate code):

Code:
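#!/usr/bin/perl -w

use warnings;
use strict;

# Three-argument open keeps the mode separate from the file name.
open INPUT_JS, "<", "input.js" or die "Failed to open";
open OUTPUT_JS, ">", "output.js" or die "Failed to open";

# Each pass through the loop puts one line of the input JavaScript in $_.
while (<INPUT_JS>) {
    s/this\.gB/this.descriptive_name/g;
    s/Ay/other_name/g;
    # ... hundreds of other lines
    print OUTPUT_JS $_;
}

close INPUT_JS;
close OUTPUT_JS;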

Example input:

Code:
function YPb(b, a) {
    var c = a.ka();
    return c != 2 && c != 3 && c != 4 ? !1 : (c = uj(b.Fa())) && c.Ef() ? a.jv() ? c.Ef().ha() && c.Ef().Rb() : a.Da() : !1
}
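
As an illustration of the purely mechanical part of the job, the example input can be unrolled into if statements without knowing what any name means. This is only a sketch: the identifiers are left exactly as they are, YPbExpanded is a made-up name for the rewritten copy, and uj is a stand-in stub (its real behaviour is unknown) so that the snippet runs on its own.

```javascript
// Stand-in for the real uj(), whose behaviour is unknown; it returns its argument.
function uj(x) { return x; }

// Original, copied from the example input.
function YPb(b, a) {
    var c = a.ka();
    return c != 2 && c != 3 && c != 4 ? !1 : (c = uj(b.Fa())) && c.Ef() ? a.jv() ? c.Ef().ha() && c.Ef().Rb() : a.Da() : !1
}

// Same logic with the nested ternaries unrolled; identifiers unchanged.
function YPbExpanded(b, a) {
    var c = a.ka();
    if (c != 2 && c != 3 && c != 4) {
        return false;               // !1 is the minifier's spelling of false
    }
    c = uj(b.Fa());
    if (!(c && c.Ef())) {
        return false;
    }
    if (a.jv()) {
        return c.Ef().ha() && c.Ef().Rb();
    }
    return a.Da();
}
```

The conditional operator is right-associative, so the original reads as A ? false : (B ? (C ? D : E) : false). Choosing meaningful names for ka, Ef, jv and the rest still requires understanding the surrounding code.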

Execution time is not important, but correctness and completeness are.  Specify a price that takes into account Bitcoin's value fluctuations, since its value is likely to change significantly over the course of this work.  After I pay you, if I like your work, a follow-up job will be to comment the result of this job.

I prefer IRC for communication. Please specify if this is okay with you.  Keep me updated periodically on your progress.  I can pay incrementally as you get more of the work done.

Deliverable will be the Perl script (not the output JS). (EDIT May 29: the deliverable is the JavaScript: see note at the end) Payment will be in Bitcoin.

tl;dr: This is not a quick job, but it will definitely stimulate your gray matter. Can anyone handle this? Is anyone man enough for this? :P

I DIDN'T THINK SO.

Useful Resources: http://perldoc.perl.org/perlre.html http://jsbeautifier.org/ http://www.prototypejs.org/api https://developer.mozilla.org/en/JavaScript  https://developer.mozilla.org/en/JavaScript/Guide/Closures

EDIT May 29:
Now, the deliverable is the de-obfuscated JavaScript, along with the complete, repeatable procedure (in English and/or pseudo-code) that you followed to de-obfuscate it.