Preface
This article assumes a preliminary understanding of Abstract Syntax Tree (AST) structure and BabelJS. If you need a refresher, my introductory article on the usage of Babel covers both.
It also assumes that you’ve read my article about constant folding. If you haven’t already, I’d recommend reading it first.
Introduction
JSFuck is an esoteric and educational programming style based on the atomic parts of JavaScript. It uses only six different characters to write and execute code. I won’t be covering the intricacies of how JSFuck operates, so please refer to the official site if you’d like to learn more about it.
Example 1: A Simple JSFuck Case
Code obfuscated in JSFuck style tends to look like this:
```js
// JSFuckObfuscated.js
+~!+!+!~+!!(
  !+(
    !+[] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    [] +
    (!+[] + !![] + !![]) +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![] + !![]) +
    (!+-[] + +-!![] + -[]) +
    (!+[] + !![] + !![] + !![]) +
    -~~~[] +
    (!+[] + !![] + !![] + !![] + !![] + !![]) +
    (!+[] + !![] + !![] + !![]) +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![])
  ) /
  +(
    !+[] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    !![] +
    [] +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![] + !![]) +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![] + !![]) +
    (!+[] + !![] - []) +
    (!+[] + !![] + !![]) +
    (!+-[] + +-!![] + -[]) +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![] + !![]) +
    (!+[] + !![] + !![] + !![] + !![]) +
    (!+[] + !![] + !![] + !![] + !![] + !![] + !![])
  )
);
```
Let’s try evaluating this code in the console:
We can see that it evaluates to a constant: `-1`. Our goal is to simplify code that looks like this down to a constant number.
Before anything, let’s try reusing the code from my constant folding article.
Trying the Original Deobfuscator
```javascript
/**
 * Deobfuscator.js
 * The babel script used to deobfuscate the target file
 *
 */
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const t = require("@babel/types");
const generate = require("@babel/generator").default;
const beautify = require("js-beautify");
const { readFileSync, writeFile } = require("fs");

/**
 * Main function to deobfuscate the code.
 * @param source The source code of the file to be deobfuscated
 *
 */
function deobfuscate(source) {
  //Parse AST of Source Code
  const ast = parser.parse(source);

  // Visitor for constant folding
  const foldConstantsVisitor = {
    BinaryExpression(path) {
      let { confident, value } = path.evaluate(); // Evaluate the binary expression
      if (!confident) return; // Skip if not confident
      let actualVal = t.valueToNode(value); // Create a new node, infer the type
      if (!t.isLiteral(actualVal)) return; // Skip if not a Literal type (e.g. StringLiteral, NumericLiteral, Boolean Literal etc.)
      path.replaceWith(actualVal); // Replace the BinaryExpression with the simplified value
    },
  };

  // Execute the visitor
  traverse(ast, foldConstantsVisitor);

  // Code Beautification
  let deobfCode = generate(ast, { comments: false }).code;
  deobfCode = beautify(deobfCode, {
    indent_size: 2,
    space_in_empty_paren: true,
  });
  // Output the deobfuscated result
  writeCodeToFile(deobfCode);
}
/**
 * Writes the deobfuscated code to output.js
 * @param code The deobfuscated code
 */
function writeCodeToFile(code) {
  let outputPath = "output.js";
  writeFile(outputPath, code, (err) => {
    if (err) {
      console.log("Error writing file", err);
    } else {
      console.log(`Wrote file to ${outputPath}`);
    }
  });
}

deobfuscate(readFileSync("./JSFuckObfuscated.js", "utf8"));
```
After processing the obfuscated script with the babel plugin above, we get the following result:
Post-Deobfuscation Result
```javascript
+~!+!+!~+!!0;
```
That’s a lot of simplification! We could now deduce from manual inspection that the result is equal to `-1`. But we’d prefer our deobfuscator to do all the work for us. Time to analyze our code and make some changes!
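If you’d like to check that manual inspection for yourself, here’s a small sketch in plain JavaScript that applies the unary operators one at a time, starting from the innermost (rightmost) one. The `applyUnaryChain` helper and the `unaryOps` table are names I made up for this illustration; they aren’t part of Babel.

```javascript
// Illustrative helper (not part of Babel): applies a chain of prefix
// operators to a starting value, innermost (rightmost) operator first.
const unaryOps = {
  "+": (v) => +v,
  "-": (v) => -v,
  "!": (v) => !v,
  "~": (v) => ~v,
};

function applyUnaryChain(operators, start) {
  const steps = [];
  let value = start;
  // Walk the operator string from right to left, e.g. "+~!" applies ! first.
  for (const op of [...operators].reverse()) {
    value = unaryOps[op](value);
    steps.push(`${op} -> ${String(value)}`);
  }
  return { value, steps };
}

const result = applyUnaryChain("+~!+!+!~+!!", 0);
console.log(result.steps.join("\n")); // one line per operator applied
console.log(result.value); // -1
```

Each step only ever produces a boolean or a small integer, which is why the whole chain collapses to a single constant.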
Analysis Methodology
As always, we start our analysis using AST Explorer.
But first, let’s make a small simplification to the analysis process. Normally, we would paste the entire obfuscated script into AST explorer. However, we already know that our original constant folding visitor can do the majority of the cases for us. So, instead of analyzing the entire original script, we can shift our focus to what our deobfuscator is not doing. Therefore, we’ll only analyze the resulting code of our deobfuscator to figure out what we need to add.
That means we only need to analyze this one-liner: `+~!+!+!~+!!0;`. Let’s paste that into AST Explorer and see what we get.
Even though this code contains `+` operators, there are no `BinaryExpression`s present. In this case, the `+`’s are unary operators. In fact, this code only contains nodes of type `UnaryExpression`, which ultimately act on a single `NumericLiteral` node.
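To make that shape concrete, here’s a hand-written sketch of the tree for a shorter chain, `+~!0`, using plain objects shaped like Babel’s nodes. Only the `type`, `operator`, `argument`, and `value` fields are shown (real nodes carry more, such as location data), and `listNodeTypes` is a hypothetical helper for this illustration.

```javascript
// Hand-written, simplified node objects for the expression `+~!0`.
const tree = {
  type: "UnaryExpression",
  operator: "+",
  argument: {
    type: "UnaryExpression",
    operator: "~",
    argument: {
      type: "UnaryExpression",
      operator: "!",
      argument: { type: "NumericLiteral", value: 0 },
    },
  },
};

// Hypothetical helper: lists the node types from the outermost node inward.
function listNodeTypes(node) {
  const types = [];
  for (let current = node; current; current = current.argument) {
    types.push(current.type);
  }
  return types;
}

console.log(listNodeTypes(tree));
// [ 'UnaryExpression', 'UnaryExpression', 'UnaryExpression', 'NumericLiteral' ]
```

Notice that no `BinaryExpression` shows up anywhere in the chain, even though `+` appears in the source.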
So, you may have realized by now why our deobfuscator doesn’t fully work. It only accounts for `BinaryExpression`s, and we have yet to add functionality to handle `UnaryExpression`s! So, let’s do that.
Writing the Deobfuscator Logic
Thankfully for us, the `path.evaluate()` method can also be used for `UnaryExpression`s. So, we should also create a visitor for nodes of type `UnaryExpression`, and run the same transformation for them.
If you’re still new to Babel, your first instinct might be to create two separate visitors: one for `UnaryExpression`s and one for `BinaryExpression`s, then copy-paste the original plugin code into both. However, there is a much cleaner way of accomplishing the same thing. Babel allows you to run the same function for multiple node types by separating them with a `|` in the method name, written as a string. In our case, that would look like: `"BinaryExpression|UnaryExpression"(path)`.
In essence, all we need to do is change `BinaryExpression(path)` to `"BinaryExpression|UnaryExpression"(path)` in our deobfuscator. This will mostly work, but I want to explain some interesting findings regarding the evaluation of `UnaryExpression`s.
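As a rough mental model of what the `|` shorthand means, the sketch below expands a pipe-separated key into one handler per node type. To be clear, this is not Babel’s actual implementation, and `expandVisitorKeys` is a name I made up for the illustration; the handlers here also receive plain objects rather than real paths.

```javascript
// Illustration only: split "A|B" visitor keys into separate entries
// that all share the same handler function.
function expandVisitorKeys(visitor) {
  const expanded = {};
  for (const [key, handler] of Object.entries(visitor)) {
    for (const type of key.split("|")) {
      expanded[type] = handler;
    }
  }
  return expanded;
}

const visited = [];
const visitor = expandVisitorKeys({
  "BinaryExpression|UnaryExpression"(node) {
    visited.push(node.type);
  },
});

visitor.BinaryExpression({ type: "BinaryExpression" });
visitor.UnaryExpression({ type: "UnaryExpression" });
console.log(visited); // [ 'BinaryExpression', 'UnaryExpression' ]
```

The takeaway is simply that both node types end up sharing one function body, so there’s nothing to copy-paste.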
The Problem
Problems with UnaryExpression Evaluation
`path.evaluate()`, `UnaryExpression`s, and `t.valueToNode()` don’t work very well with each other due to their source code implementation. I’ll explain with a short example:
Let’s say we have the following code:
```js
~false;
```
and we want to simplify it to:
```js
-1;
```
If we use the original code from the constant folding article and only replace the method name, we’ll have this visitor:
```js
// ...
const foldConstantsVisitor = {
  "BinaryExpression|UnaryExpression"(path) {
    let { confident, value } = path.evaluate(); // Evaluate the binary expression
    if (!confident) return; // Skip if not confident
    let actualVal = t.valueToNode(value); // Create a new node, infer the type
    if (!t.isLiteral(actualVal)) return; // Skip if not a Literal type (e.g. StringLiteral, NumericLiteral, Boolean Literal etc.)
    path.replaceWith(actualVal); // Replace the BinaryExpression with the simplified value
  },
};

// Execute the visitor
traverse(ast, foldConstantsVisitor);

// ...
```
But, if we run this, we’ll see that it returns:
```js
~false;
```
which isn’t simplified at all.
Here’s why:
- `t.valueToNode()`’s implementation. Running `path.evaluate()` correctly returns an integer value, `-1`. However, `t.valueToNode(-1)` doesn’t create a `NumericLiteral` node with a value of `-1` as we would expect. Instead, it creates another `UnaryExpression` node, with properties `operator: -` and `argument: 1`. As such, `if (!t.isLiteral(actualVal)) return` results in an early return before replacement.
- Even if we delete `if (!t.isLiteral(actualVal)) return` from our code, there’s still an issue. Since `t.valueToNode(-1)` constructs a `UnaryExpression`, our visitor also handles `UnaryExpression`s, and we have no additional checks, the replacement node gets visited and replaced again, over and over. The result is infinite recursion, which crashes our program once the maximum call stack size is exceeded.
- Though not directly applicable to this code snippet, another problem is worth mentioning. Unary expressions can also have a `void` operator. Based on Babel’s source code, calling `path.evaluate()` on any `UnaryExpression` with a `void` operator will simplify it to `undefined`, regardless of what the argument is.
This can be problematic in some cases, such as this example:
- Snippet 1:
```js
var a = 1;
function set() {
  a = 2;
}
void set();
console.log(a); // => 2
```
Calling `path.evaluate()` to simplify the `void set()` `UnaryExpression` yields this:
- Snippet 2:
```js
var a = 1;
function set() {
  a = 2;
}
undefined;

console.log(a); // => 1
```
The two pieces of code above are clearly not the same, as you can verify with their output.
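Returning to the second problem in the list above (the endless re-visiting), here’s a rough way to reproduce it without Babel. The sketch uses a tiny stand-in for `t.valueToNode()` that, like the behaviour described above, represents a negative number as a `-` `UnaryExpression`. The names `toNode`, `evaluateNode`, and `foldRepeatedly` are hypothetical, and the step cap exists only so that the demo terminates.

```javascript
// Simplified stand-in for t.valueToNode(): negative numbers become
// a "-" UnaryExpression wrapping a non-negative NumericLiteral.
function toNode(value) {
  const literal = { type: "NumericLiteral", value: Math.abs(value) };
  return value < 0
    ? { type: "UnaryExpression", operator: "-", argument: literal }
    : literal;
}

// Minimal evaluator for the two operators used in this demo (~ and -).
function evaluateNode(node) {
  if (node.type === "NumericLiteral" || node.type === "BooleanLiteral") {
    return node.value;
  }
  const arg = evaluateNode(node.argument);
  return node.operator === "~" ? ~arg : -arg;
}

// Mimics "replace the node, then visit the replacement again".
function foldRepeatedly(node, skipMinus, maxSteps = 5) {
  let steps = 0;
  while (node.type === "UnaryExpression" && steps < maxSteps) {
    if (skipMinus && node.operator === "-") break; // the guard
    node = toNode(evaluateNode(node));
    steps++;
  }
  return steps;
}

const tildeFalse = {
  type: "UnaryExpression",
  operator: "~",
  argument: { type: "BooleanLiteral", value: false },
};

console.log(foldRepeatedly(tildeFalse, false)); // 5: only the cap stops it
console.log(foldRepeatedly(tildeFalse, true)); // 1: folded once, then skipped
```

Without the guard, every replacement is itself a `UnaryExpression` that evaluates to `-1`, so there is never a point at which the folding settles.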
The Fix
Thankfully, these three conditions are simple to account for. We can solve each of them as follows:
- Delete the `if (!t.isLiteral(actualVal)) return` check.
- Add a check at the beginning of the visitor method to skip the node if it is a `UnaryExpression` with a `-` operator.
- Add a check at the beginning of the visitor method to skip the node if it is a `UnaryExpression` with a `void` operator.
I’ve also neglected to mention this before, but when using `path.evaluate()`, it’s best practice to also skip the replacement of nodes when it evaluates to `Infinity` or `-Infinity` by returning early. This is because `t.valueToNode(Infinity)` creates a node of type `BinaryExpression`, which looks like `1 / 0`. Similarly, `t.valueToNode(-Infinity)` creates a node of type `UnaryExpression`, which looks like `-(1 / 0)`. In both of these cases, it can cause an infinite loop since our visitor will also visit the created nodes, which will crash our deobfuscator.
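To see why these values need special care, it helps to look at the arithmetic that the generated nodes stand for. The sketch below is plain JavaScript, and `isSafeToFold` is a hypothetical helper rather than a Babel API. It also rejects `NaN`, on my assumption (not something covered in this article) that `NaN` would be rebuilt as a division expression in much the same way.

```javascript
// The expressions that the generated nodes correspond to:
console.log(1 / 0); // Infinity
console.log(-(1 / 0)); // -Infinity

// Hypothetical guard: only fold values that can be written back
// without producing another expression our visitor would fold again.
function isSafeToFold(value) {
  if (typeof value !== "number") return true; // strings, booleans, etc.
  return Number.isFinite(value); // rejects Infinity, -Infinity and NaN
}

console.log(isSafeToFold(42)); // true
console.log(isSafeToFold("abc")); // true
console.log(isSafeToFold(Infinity)); // false
console.log(isSafeToFold(-Infinity)); // false
console.log(isSafeToFold(NaN)); // false
```

Negative finite numbers still pass this check; they’re covered separately by the `-` operator check described above.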
Summarizing the Logic
So putting that all together, we have the following logic for our deobfuscator:
- Traverse the AST for nodes of type `BinaryExpression` and `UnaryExpression`.
- Upon encountering one:
  - Check if it is of type `UnaryExpression` and uses a `void` or `-` operator. If the condition is true, skip the node by returning.
  - Evaluate the node using `path.evaluate()`.
  - If `path.evaluate()` returns `{confident: false}`, `{value: Infinity}`, or `{value: -Infinity}`, skip the node by returning.
  - Construct a new node from the returned `value`, and replace the original node with it.
The Babel implementation looks like this:
Babel Deobfuscation Script
```js
/**
 * Deobfuscator.js
 * The babel script used to deobfuscate the target file
 *
 */
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const t = require("@babel/types");
const generate = require("@babel/generator").default;
const beautify = require("js-beautify");
const { readFileSync, writeFile } = require("fs");

/**
 * Main function to deobfuscate the code.
 * @param source The source code of the file to be deobfuscated
 *
 */
function deobfuscate(source) {
  //Parse AST of Source Code
  const ast = parser.parse(source);

  // Visitor for constant folding
  const constantFold = {
    "BinaryExpression|UnaryExpression"(path) {
      const { node } = path;
      if (
        t.isUnaryExpression(node) &&
        (node.operator == "-" || node.operator == "void")
      )
        return;
      let { confident, value } = path.evaluate(); // Evaluate the binary expression
      if (!confident || value == Infinity || value == -Infinity) return; // Skip if not confident

      let actualVal = t.valueToNode(value); // Create a new node, infer the type
      path.replaceWith(actualVal); // Replace the BinaryExpression with the simplified value
    },
  };

  //Execute the visitor
  traverse(ast, constantFold);
  // Code Beautification
  let deobfCode = generate(ast, { comments: false }).code;
  deobfCode = beautify(deobfCode, {
    indent_size: 2,
    space_in_empty_paren: true,
  });
  // Output the deobfuscated result
  writeCodeToFile(deobfCode);
}
/**
 * Writes the deobfuscated code to output.js
 * @param code The deobfuscated code
 */
function writeCodeToFile(code) {
  let outputPath = "output.js";
  writeFile(outputPath, code, (err) => {
    if (err) {
      console.log("Error writing file", err);
    } else {
      console.log(`Wrote file to ${outputPath}`);
    }
  });
}

deobfuscate(readFileSync("./jsFuckObfuscated.js", "utf8"));
```
After processing the obfuscated script with the babel plugin above, we get the following result:
Post-Deobfuscation Result
```js
-1;
```
And we finally arrive at the correct constant value!
Example 2: A More Peculiar Case
That first example was just a warm-up, and not what I really wanted to focus on (hence the title of the article). This next example isn’t too much more difficult, but it will require you to think a bit outside of the box.
Here’s our obfuscated sample:
```js
let obbedVar = +([[[[[[]], , ,]]]] != 0);
```
Let’s try running this in a JavaScript console to see what it simplifies to:
So, it simplifies to a numeric literal, `1`!
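If you’re curious why the console reports `1`, the coercion chain can be traced step by step in plain JavaScript. The variable names below are mine, chosen only for this walkthrough.

```javascript
const nested = [[[[[[]], , ,]]]];

// The innermost populated array has one real element and two holes.
const inner = nested[0][0][0];
console.log(inner.length); // 3

// `!=` against a number coerces the array to a primitive. Arrays
// stringify via join(","), and empty arrays or holes contribute "".
const asString = String(nested);
console.log(JSON.stringify(asString)); // ",,"

// The string is then converted to a number, giving NaN.
const asNumber = Number(asString);
console.log(asNumber); // NaN

// NaN is unequal to everything, so the comparison is true, and +true is 1.
const comparison = nested != 0;
console.log(comparison); // true
console.log(+comparison); // 1
```

So the empty elements matter only because of the commas they contribute when the array is turned into a string.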
The structure of the obfuscated sample looks similar to that of the first example. We know that it leads to a constant, so let’s first try running this sample through the improved deobfuscator we created above.
If you do that, you’ll see that it yields:
```js
let obbedVar = +([[[[[[]], , ,]]]] != 0);
```
Which is no different! So, why is our code breaking?
Analysis Methodology
The Problem
Intuitively, you can probably guess what’s causing the issue. The only real difference is that there seems to be an array containing blank elements: `[, , ,]`.
But why would that even matter? Let’s paste our code into AST Explorer to try and figure out what’s going on.
We know that everything else seems normal except for the array containing empty elements, so let’s focus on that. We can highlight the empty elements in the code using our cursor to automatically show their respective nodes on the right-hand side.
You’ll notice something strange! There are elements of the array that are `null`. Keep in mind that, in Babel, the node types `NullLiteral` and `Identifier` are used to represent `null` and `undefined` respectively.
But in this case, we don’t even have a node! The element is simply `null`.
Let’s look inside Babel’s source code implementation for `path.evaluate()` to see why the script breaks when encountering this. You can view the original script in the official GitHub repository, or by navigating to `\node_modules\@babel\traverse\lib\path\evaluation.js`.
The relevant logic, which handles array expressions, can be found in the `_evaluate()` function, which runs as a helper for the main `evaluate()` function.
We can see that when `path.evaluate()` is called on an array expression, it tries to recursively evaluate all of its inner elements. However, if the evaluation fails or returns `confident: false` for any element in the array, the entire evaluation short-circuits.
Babel does actually have an implementation to handle occurrences of `undefined` and `null` in source code. However, that only applies once they’re represented by a node of type `NullLiteral` or `Identifier`. In `evaluation.js`, there isn’t any handling for when a bare `null` value is encountered in place of a node, so the method will return `confident: false` whenever an empty array element is encountered.
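The short-circuit described above can be modelled in a few lines of plain JavaScript. This is a deliberately simplified stand-in and not Babel’s code: `evaluateSketch` is a hypothetical function that only understands numeric literals, the `undefined` identifier, and arrays, and it works on hand-written node objects rather than real paths.

```javascript
// Simplified model of confident/non-confident evaluation.
function evaluateSketch(node) {
  // A hole in an array has no node at all, so nothing matches it.
  if (node === null) return { confident: false, value: undefined };

  if (node.type === "NumericLiteral") {
    return { confident: true, value: node.value };
  }
  if (node.type === "Identifier" && node.name === "undefined") {
    return { confident: true, value: undefined };
  }
  if (node.type === "ArrayExpression") {
    const values = [];
    for (const element of node.elements) {
      const result = evaluateSketch(element);
      // One non-confident element makes the whole array non-confident.
      if (!result.confident) return { confident: false, value: undefined };
      values.push(result.value);
    }
    return { confident: true, value: values };
  }
  return { confident: false, value: undefined };
}

const one = { type: "NumericLiteral", value: 1 };
const undef = { type: "Identifier", name: "undefined" };

// Shape of `[1, , ]`: the hole is stored as null in `elements`.
const withHole = { type: "ArrayExpression", elements: [one, null] };
// Shape of `[1, undefined]`: every element is a real node.
const withUndefined = { type: "ArrayExpression", elements: [one, undef] };

console.log(evaluateSketch(withHole).confident); // false
console.log(evaluateSketch(withUndefined).confident); // true
```

The only difference between the two inputs is whether the second element is a node or a bare `null`, and that alone decides whether the array can be evaluated.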
The Fix
We shouldn’t give up though, since we KNOW that it’s possible to evaluate the code to a constant because we tested it in a console before. Let’s use a console again, this time to see what an empty element in an array is actually equal to:
We can see that trying to access an empty element in an array returns `undefined`! Okay, but how does that help us?
Recall that Babel has an implementation for handling `undefined` in `evaluation.js`. However, the reason it didn’t work was that the empty array elements aren’t represented by a node at all. To fix our problem, all we have to do is replace any empty elements in arrays with `undefined` beforehand, so Babel can recognize them as the `undefined` identifier and evaluate them properly!
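Before relying on that substitution, it’s worth checking where a hole and an explicit `undefined` behave alike and where they don’t. The plain-JavaScript sketch below suggests the swap is harmless for the kind of coercion used in this sample, though the two are not equivalent in general (for example, with the `in` operator).

```javascript
const withHoles = [[], , ,];
const withUndefined = [[], undefined, undefined];

// Reading a hole yields undefined, just like an explicit undefined.
console.log(withHoles[1] === undefined); // true

// Both arrays stringify identically, so coercion-based tricks
// such as `array != 0` give the same answer after the substitution.
console.log(String(withHoles) === String(withUndefined)); // true
console.log(+(withHoles != 0) === +(withUndefined != 0)); // true

// But a hole is not an own property, whereas undefined is.
console.log(1 in withHoles); // false
console.log(1 in withUndefined); // true
```

In other words, treat the replacement as a deobfuscation convenience for constant expressions like this one, not as a transformation that is safe for arbitrary code.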
Writing the Deobfuscator Logic
The deobfuscator logic is as follows:
- Traverse the AST for `ArrayExpression`s. When one is encountered:
  - Check each element to see if it is falsy. A node representation of `undefined` or `null` still will not be falsy, since a node is an object.
  - If it’s falsy, replace it with the node representation of `undefined`.
- Run our constant folding plugin from Example 1.

The Babel deobfuscation code is shown below:
Babel Deobfuscation Script
```js
/**
 * Deobfuscator.js
 * The babel script used to deobfuscate the target file
 *
 */
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const t = require("@babel/types");
const generate = require("@babel/generator").default;
const beautify = require("js-beautify");
const { readFileSync, writeFile } = require("fs");

/**
 * Main function to deobfuscate the code.
 * @param source The source code of the file to be deobfuscated
 *
 */
function deobfuscate(source) {
  //Parse AST of Source Code
  const ast = parser.parse(source);

  // Visitor that replaces empty array elements (holes) with `undefined`
  const fixArrays = {
    ArrayExpression(path) {
      for (const elem of path.get("elements")) {
        if (!elem.node) {
          elem.replaceWith(t.valueToNode(undefined));
        }
      }
    },
  };

  traverse(ast, fixArrays);

  // Visitor for constant folding
  const constantFold = {
    "BinaryExpression|UnaryExpression"(path) {
      const { node } = path;
      if (
        t.isUnaryExpression(node) &&
        (node.operator == "-" || node.operator == "void")
      )
        return;
      let { confident, value } = path.evaluate(); // Evaluate the binary expression
      if (!confident || value == Infinity || value == -Infinity) return; // Skip if not confident

      path.replaceWith(t.valueToNode(value)); // Replace the BinaryExpression with a new node of inferred type
    },
  };

  //Execute the visitor
  traverse(ast, constantFold);
  // Code Beautification
  let deobfCode = generate(ast, { comments: false }).code;
  deobfCode = beautify(deobfCode, {
    indent_size: 2,
    space_in_empty_paren: true,
  });
  // Output the deobfuscated result
  writeCodeToFile(deobfCode);
}
/**
 * Writes the deobfuscated code to output.js
 * @param code The deobfuscated code
 */
function writeCodeToFile(code) {
  let outputPath = "output.js";
  writeFile(outputPath, code, (err) => {
    if (err) {
      console.log("Error writing file", err);
    } else {
      console.log(`Wrote file to ${outputPath}`);
    }
  });
}

deobfuscate(readFileSync("./jsFuckObfuscated.js", "utf8"));
```
After processing the obfuscated script with the babel plugin above, we get the following result:
Post-Deobfuscation Result
```js
let obbedVar = 1;
```
And we’ve successfully simplified it down to a constant!
Conclusion
I will admit that this article may have been a bit longer than it needed to be. However, I felt that for my beginner-level readers, it would be more helpful to explain the entire reverse-engineering thought process, including where and why some things go wrong, and the logical process of constructing a solution.
Okay, that’s all I have to cover for today. If you’re interested, you can find the source code for all the examples in this repository.
I hope this article helped you learn something new. Thanks for reading, and happy reversing! 😄