Hey, have you ever thought about how cool and unique your algorithms are? A lot of programmers and companies do, which is why they might be hesitant to share their work with everyone. The problem can be eased by moving part of the code to the server (for client-server applications), but this approach isn't always possible. Sometimes, we have to leave sensitive code sections right out in the open.
In this article, we're going to take a look at obfuscation in JavaScript: ways to hide algorithms and make code harder to study. We'll also explore what an AST is and discuss tools that can interact with it to implement obfuscation.
Here's a silly example. Imagine this:
Bob goes to a site that's giving away computer monitors. Bob's monitor is better, but free stuff is always nice!
When Bob visits the site, JavaScript runs in the browser, collecting data about the user's device and sending it to the server:
let w = screen.width,
    h = screen.height;
// Let's say there's some logic with a check.
console.info(w, h);
Unfortunately, Bob can't access the giveaway page, and he's pretty upset about it. He doesn't understand why. Then he learns in the rules of the giveaway that users with big, good monitors are not allowed.
Luckily, Bob had taken some computer science classes in high school. He opens the developer console by hitting F12, studies the script, and realizes that the organizers check the screen resolution. He then decides to participate from his phone and successfully passes the test.
A fictional story with a happy ending - but it wouldn't have gone so well if the main character had seen this instead of the previous code:
l=~[];l={___:++l,$$$$:(![]+"")[l],__$:++l,$_$_:(![]+"")[l],_$_:++l,$_$$:({}+"")[l],$$_$:(l[l]+"")[l],_$$:++l,$$$_:(!""+"")[l],$__:++l,$_$:++l,$$__:({}+"")[l],$$_:++l,$$$:++l,$___:++l,$__$:++l};l.$_=(l.$_=l+"")[l.$_$]+(l._$=l.$_[l.__$])+(l.$$=(l.$+"")[l.__$])+((!l)+"")[l._$$]+(l.__=l.$_[l.$$_])+(l.$=(!""+"")[l.__$])+(l._=(!""+"")[l._$_])+l.$_[l.$_$]+l.__+l._$+l.$;l.$$=l.$+(!""+"")[l._$$]+l.__+l._+l.$+l.$$;l.$=(l.___)[l.$_][l.$_];l.$(l.$(l.$$+"\""+(![]+"")[l._$_]+l.$$$_+l.__+"\\"+l.$__+l.___+"\\"+l.__$+l.$$_+l.$$$+"\\"+l.$__+l.___+"=\\"+l.$__+l.___+"\\"+l.__$+l.$$_+l._$$+l.$$__+"\\"+l.__$+l.$$_+l._$_+l.$$$_+l.$$$_+"\\"+l.__$+l.$_$+l.$$_+".\\"+l.__$+l.$$_+l.$$$+"\\"+l.__$+l.$_$+l.__$+l.$$_$+l.__+"\\"+l.__$+l.$_$+l.___+",\\"+l.$__+l.___+"\\"+l.__$+l.$_$+l.___+"\\"+l.$__+l.___+"=\\"+l.$__+l.___+"\\"+l.__$+l.$$_+l._$$+l.$$__+"\\"+l.__$+l.$$_+l._$_+l.$$$_+l.$$$_+"\\"+l.__$+l.$_$+l.$$_+".\\"+l.__$+l.$_$+l.___+l.$$$_+"\\"+l.__$+l.$_$+l.__$+"\\"+l.__$+l.$__+l.$$$+"\\"+l.__$+l.$_$+l.___+l.__+";\\"+l.__$+l._$_+l.$$__+l._$+"\\"+l.__$+l.$_$+l.$$_+"\\"+l.__$+l.$$_+l._$$+l._$+(![]+"")[l._$_]+l.$$$_+".\\"+l.__$+l.$_$+l.__$+"\\"+l.__$+l.$_$+l.$$_+l.$$$$+l._$+"(\\"+l.__$+l.$$_+l.$$$+",\\"+l.$__+l.___+"\\"+l.__$+l.$_$+l.___+");"+"\"")())();
I assure you, it's not gibberish, it's JavaScript! And it performs the same actions. You can try running the code in the browser console.
I guess in this case, our hero would've just accepted his fate by not taking part in the giveaway, and the organizers would've kept their plan.
So what's the point here? Congrats - you've just seen the jjencode tool in action. You've also gotten a glimpse of what obfuscation is and how it can be used. In summary, obfuscation is the process of converting program code or data into a form that's hard for humans to understand but still works for a machine or program.
Enough theory, let's move on to more practical examples.
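To take some of the mystery out of the jjencode output, here is a minimal sketch of the idea it relies on: JavaScript type coercion yields strings such as "false" or "[object Object]", and individual letters can be picked out of them by index. The variable names below are made up for illustration; the real tool generates much denser code.

```javascript
// Type coercion gives us strings to pick letters from.
const falseText = ![] + "";          // "false"
const trueText = !"" + "";           // "true"
const objectText = ({}) + "";        // "[object Object]"
const undefinedText = [][[]] + "";   // "undefined"

// Spell "constructor" letter by letter using only indexes into those strings.
const word =
  objectText[5] + objectText[1] + undefinedText[1] + falseText[3] +
  trueText[0] + trueText[1] + trueText[2] +
  objectText[5] + trueText[0] + objectText[1] + trueText[1];

// (0).constructor is Number, and Number.constructor is Function,
// so we reach the Function constructor without ever naming it.
const FunctionCtor = (0)[word][word];

// Build and run code from a string, as the generated script does at the end.
const result = FunctionCtor("return 6 * 7")();
console.log(word, result);
```

The obfuscated giveaway script follows the same pattern: it assembles the source text character by character and hands it to the `Function` constructor.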
Now let's try the kind of obfuscation you are more likely to find online. We'll use code that's a bit more interesting because it contains our "know-how" operations, and we'd rather not have everyone who isn't too lazy to press F12 find out about them:
function getGpuData(){
  let cnv = document.createElement("canvas");
  let ctx = cnv.getContext("webgl");
  const rendererInfo = ctx.getParameter(ctx.RENDERER);
  const vendorInfo = ctx.getParameter(ctx.VENDOR);
  return [rendererInfo, vendorInfo]
}

function getLanguages(){
  return window.navigator.languages;
}

let data = {};
data.gpu = getGpuData();
data.langs = getLanguages();
console.log(JSON.stringify(data))
This code collects device and browser data and prints the result to the console, for example (we'll use this output to check that the code still works after each transformation):
{"gpu":["ANGLE (NVIDIA, NVIDIA GeForce GTX 980 Direct3D11 vs_5_0 ps_5_0), or similar","Mozilla"],"langs":["en-US","en"]}
Now let's take the above code and run it through a popular JS obfuscator, obfuscator.io. As a result, we get code like this:
function _0x9591(_0x587f42,_0x4b4b1a){const _0x581ade=_0x581a();return _0x9591=function(_0x9591f0,_0x20e0a4){_0x9591f0=_0x9591f0-0x18a;let _0x2b716b=_0x581ade[_0x9591f0];return _0x2b716b;},_0x9591(_0x587f42,_0x4b4b1a);}const _0x5d747e=_0x9591;(function(_0x41cdc1,_0x26e305){const _0xd47419=_0x9591,_0x3c47fc=_0x41cdc1();while(!![]){try{const _0x1c6e63=-parseInt(_0xd47419(0x18c))/0x1*(parseInt(_0xd47419(0x18d))/0x2)+parseInt(_0xd47419(0x18f))/0x3+-parseInt(_0xd47419(0x18b))/0x4+-parseInt(_0xd47419(0x195))/0x5+parseInt(_0xd47419(0x196))/0x6+-parseInt(_0xd47419(0x19e))/0x7*(parseInt(_0xd47419(0x192))/0x8)+parseInt(_0xd47419(0x19a))/0x9;if(_0x1c6e63===_0x26e305)break;else _0x3c47fc['push'](_0x3c47fc['shift']());}catch(_0x2210e4){_0x3c47fc['push'](_0x3c47fc['shift']());}}}(_0x581a,0x5b85c));function _0x59d10d(){const _0x12260c=_0x9591;let _0x14403e=document[_0x12260c(0x197)](_0x12260c(0x191)),_0xf297ee=_0x14403e[_0x12260c(0x19b)](_0x12260c(0x199));const _0x16d7eb=_0xf297ee[_0x12260c(0x19f)](_0xf297ee[_0x12260c(0x198)]),_0x3174f4=_0xf297ee[_0x12260c(0x19f)](_0xf297ee[_0x12260c(0x193)]);return[_0x16d7eb,_0x3174f4];}function _0x157cda(){const _0x52b881=_0x9591;return window[_0x52b881(0x19c)][_0x52b881(0x18a)];}let _0x421797={};_0x421797[_0x5d747e(0x19d)]=_0x59d10d(),_0x421797[_0x5d747e(0x194)]=_0x157cda(),console[_0x5d747e(0x190)](JSON[_0x5d747e(0x18e)](_0x421797));function _0x581a(){const _0x3fdf5e=['webgl','15135525QqurjW','getContext','navigator','gpu','304409xUlnUb','getParameter','languages','1546148RYMKQN','14903JFRqxJ','96TioORm','stringify','817929YcOxtF','log','canvas','80ELkOfJ','VENDOR','langs','3339820dAlRZJ','3751338qfcHSk','createElement','RENDERER'];_0x581a=function(){return _0x3fdf5e;};return _0x581a();}
Voila! Now only a machine will be happy to parse this code (you and I are probably not among them). Nevertheless, it still works and produces the same result. Note the changes:
Variable and function names were replaced with meaningless ones such as `_0x587f42`.
Property names and strings were moved out of the code: `document.createElement("canvas")` turned into `document[_0x12260c(0x197)](_0x12260c(0x191))`. This was made possible by using computed properties.
The last technique is perhaps the nastiest one here, because it burdens static code analysis.
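A stripped-down sketch of that last technique may help. The table contents, the `lookup` name, and the offset below are made up for illustration, and the real obfuscator additionally rotates and encodes its string array.

```javascript
// Hypothetical string table; every property name lives here, not in the code.
const table = ["stringify", "langs", "gpu"];

// Indexes are shifted by an offset, so call sites show opaque hex numbers.
const OFFSET = 0x18a;
function lookup(index) {
  return table[index - OFFSET];
}

// Computed properties: obj[expr] instead of obj.name.
const data = {};
data[lookup(0x18c)] = ["renderer", "vendor"];   // data.gpu = ...
data[lookup(0x18b)] = ["en-US", "en"];          // data.langs = ...
const json = JSON[lookup(0x18a)](data);         // JSON.stringify(data)
console.log(json);
```

A reader (or a static analyzer) can no longer see which property is accessed without first evaluating `lookup`.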
Alright, looks like all the secrets are hidden. Shall we deploy the code to production?
Wait... If there are services for code obfuscation, perhaps there are some that can reverse it. Absolutely, and more than one! Let's try one of them, webcrack, and see if we can get the original, readable code back. Below is the result of using this deobfuscator:
function _0x59d10d() {
  let _0x14403e = document.createElement("canvas");
  let _0xf297ee = _0x14403e.getContext("webgl");
  const _0x16d7eb = _0xf297ee.getParameter(_0xf297ee.RENDERER);
  const _0x3174f4 = _0xf297ee.getParameter(_0xf297ee.VENDOR);
  return [_0x16d7eb, _0x3174f4];
}

function _0x157cda() {
  return window.navigator.languages;
}

let _0x421797 = {};
_0x421797.gpu = _0x59d10d();
_0x421797.langs = _0x157cda();
console.log(JSON.stringify(_0x421797));
Oops. It did not restore the variable names, of course, but thanks for that much.
So it turns out that the only obstacle to calmly studying our code in this case is the researcher's willpower to use a deobfuscator. Undoubtedly, it is also possible to use other solutions and customizations, but for any popular obfuscation, we should most likely expect popular deobfuscation.
Should we despair and give up our secrets without a fight? Of course not! Let's see what more we can do...
Indeed, anyone who can obfuscate code while writing it might seem like a natural-born magician.
Perhaps you've even unintentionally wielded such "spells" yourself at some point. But what happens when those arcane skills fade away, dulled by the critiques of "senior programmers"? And now, with a clever idea that could make your program harder to analyze, where do you turn?
The answer lies in tools designed to interact with code structures directly, enabling you to modify them with precision. Let's dive into how these tools can help.
You could try modifying code by treating it as plain text, replacing specific constructions using regular expressions, for example. But let's be honest: this approach is more likely to break your code (and waste your time) than successfully obfuscate it.
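A quick way to see the risk is to run a naive text-based rename. The sample below is a hypothetical variant of a small function in which the string happens to contain the same word as the variable.

```javascript
// Hypothetical sample: a function, a variable, and a string all use "test".
const source = [
  "function test() {",
  '  let test = "test data";',
  "  console.log(test);",
  "}",
  "test();",
].join("\n");

// Naive rename: every whole word `test` becomes `changed`.
const naive = source.replace(/\btest\b/g, "changed");
console.log(naive);
```

The regular expression cannot tell a variable from a function name or from text inside a string, so all three are rewritten and the program now prints different data.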
For a more reliable and controlled modification process, it's better to work with an abstract structure: a tree, specifically an Abstract Syntax Tree (AST). By traversing the AST, you can systematically alter the elements and constructs you're targeting with precision and confidence.
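To make "tree" and "traversing" concrete, here is a hand-written, simplified tree for the statement `let id = "";` together with a tiny depth-first walk. The node type names follow what Babel's parser uses, but real nodes carry more fields (positions, comments, and so on), and the `walk` helper is an assumption of this sketch, not a Babel API.

```javascript
// Simplified AST for: let id = "";
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "id" },
      init: { type: "StringLiteral", value: "" },
    },
  ],
};

// Depth-first walk: calls `visit` for every object that looks like a node.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string") visit(node);
  for (const key of Object.keys(node)) walk(node[key], visit);
}

// Collect the node types in the order the walk meets them.
const seen = [];
walk(ast, (node) => seen.push(node.type));
console.log(seen.join(" > "));
```

Because every construct is a typed node, a transformation can target exactly the `Identifier` or `StringLiteral` nodes it cares about and leave everything else alone.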
There are different tools for working with JS code, and they differ in the AST they produce. In this article, we will use Babel to illustrate. You don't need to install anything: you can experiment with everything in online resources like astexplorer.
(If you don't want to mess with Babel, check out shift-refactor. It lets you interact with the AST using CSS selectors, a pretty minimalistic and convenient approach for studying and modifying code. But it uses its own AST format, different from Babel's. You can test your CSS queries for this tool in the shift-query interactive demo.)
Now let's see how these tools can be used without leaving the browser, based on a simple example. Suppose we need to change the name of the `test` variable in the same-named function to `changed`:
function test(){ // Not here
  let test = "some data"; // Should become changed
  let id = "";
  console.log(test); // Should become changed
}
test(); // Not here
Paste this code into astexplorer (select JavaScript and @babel/parser at the top); it should appear there as an AST. You can click on the `test` variable to see the syntax for this code section in the right window:
To solve our problem, we can write the following Babel plugin, which goes through our code, looks at every name (identifier) in it, and renames those that meet certain conditions. Let's paste it into the bottom left window in astexplorer (turn on the Transform slider and select babelv7 to make it appear):
function transformCode() {
  return {
    name: "change-name",
    visitor: {
      Identifier(path) {
        // Output information about the current node and environment
        console.log(path)
        // We are only interested in the name "test"
        // + it should be in the function
        if(
          path.node.name === "test" &&
          path.parent.type === "FunctionDeclaration"
        ) {
          // Rename. This method will take care of all the references
          path.scope.rename(path.node.name, "changed")
        }
      },
    },
  };
}

module.exports = transformCode;
Console output is included in this plugin for a reason: it lets us debug the plugin by examining the output in the browser console. In this case, we output information about all nodes of `Identifier` type. This information contains data about the node itself (`node`), the parent node (`parent`), and the environment (`scope`, which contains the variables created in the current context and the references to them):
Thus, in the bottom right window, we can notice that the variable in our source code has been successfully changed without affecting other identifiers:
function test() {
  let changed = "some data"; // <-
  let id = "";
  console.log(changed); // <-
}
I hope, based on this example, it became a little clearer how we can parse and modify the code. Anyway, let me summarize the work done:
We converted the code to an AST using Babel via astexplorer.
By examining the AST, we saw that the `test` variable is represented by a node of `Identifier` type, whose name is stored in the `name` property.
Next, using our Babel plugin, we visited all the identifiers and changed the name of those in the function named `test` to `changed`.
It is clear now how code modification can be done. Let's try something more useful, something we can actually call obfuscation :) We'll take the more complex code we tried to obfuscate in the previous section and change all the variable and function names in it to random ones. That way, a potential reverse engineer has less information about the purpose of each code element.
Also, feel free to use any JS code to debug problems. As they say, there's no better teacher than pain.
The following plugin will help us to get the job done:
// Storage with used identifier names
const usedIdentifiers = new Set();

// Generates a random string of 1 to 4 chars from the `characters` alphabet
// The generated string must be unique (not previously used by identifiers)
function generateRndName() {
  const characters = "abcdefghijklmnopqrstuvwxyz_";
  let randomIdentifier;
  do {
    // Start every attempt from an empty string so a retry cannot grow the name
    randomIdentifier = "";
    const length = Math.floor(Math.random() * 4);
    for (let i = 0; i <= length; i++) {
      randomIdentifier += characters.charAt(
        Math.floor(Math.random() * characters.length)
      );
    }
  } while (usedIdentifiers.has(randomIdentifier));
  usedIdentifiers.add(randomIdentifier);
  return randomIdentifier;
}

// Go through all nodes of `Identifier` type
// Change their names to random ones along with all references
function transformCode() {
  return {
    name: "hide-names",
    visitor: {
      Identifier(path) {
        path.scope.rename(path.node.name, generateRndName());
      },
    },
  };
}

module.exports = transformCode;
What does this code do? Pretty much the same as in the previous example:
We go through all nodes of `Identifier` type;
We use the `generateRndName` function to randomly generate names for identifiers, without any conditions.
As a result of our plugin execution, we get the following code with random names for variables and functions:
function hjj() {
  let bq = document.createElement("canvas");
  let c = bq.getContext("webgl");
  const a_x = c.getParameter(c.RENDERER);
  const nry = c.getParameter(c.VENDOR);
  return [a_x, nry];
}

function sztc() {
  return window.navigator.languages;
}

let _o = {};
_o.gpu = hjj();
_o.langs = sztc();
console.log(JSON.stringify(_o));
You can check it by executing the code in the console - after our manipulations, it still works! And this is the main quality of a good obfuscator.
But what about the quality of our obfuscation? As for me, the evil is not too strong yet: even with the names replaced, an experienced programmer will easily understand the purpose of this code. And what's the point, if any JS minifier can handle this task? Is it possible to do something more practical and more troublesome for a reverser? There is one more spell...
I may have been a bit overconfident when I wrote "everything", but what we are going to do now will hide the actions of our code to the maximum extent possible. In this section, we will conceal strings and various object properties in order to complicate static analysis and potentially prevent the "client" from digging into our code!
Let's take the code with hidden names obtained at the previous stage and apply the following plugin to it:
function transformCode(babel) {
  const { types: t } = babel;
  // All of our properties/strings to be replaced in the code will be here.
  let data = [];

  return {
    name: "hide-props-strings",
    visitor: {
      // 1. Find the `Program` root node
      // And insert into the beginning of the code a function that returns string properties by index
      Program(path) {
        // Body of the created function
        let funcBody = t.blockStatement([
          // Declare the variable that stores the string properties
          // let data = [...]
          t.variableDeclaration("let", [
            t.variableDeclarator(t.identifier("data"), t.arrayExpression(data)),
          ]),
          // Return the requested entry after decoding it from base64
          // return atob(data[data_index])
          t.returnStatement(
            t.callExpression(t.identifier("atob"), [
              t.memberExpression(
                t.identifier("data"),
                t.identifier("data_index"),
                true
              ),
            ])
          ),
        ]);
        // Create a `getData` function with 1 argument `data_index`
        let func = t.functionDeclaration(
          t.identifier("getData"),
          [t.identifier("data_index")],
          funcBody
        );
        // Insert the function at the beginning
        path.node.body.unshift(func);
      },
      // 2. Visit nodes of type `MemberExpression`. Replace properties with `getData` calls
      // For example `document.createElement` will be `document[getData(0)]`
      MemberExpression({ node }) {
        // Avoid nodes that have already been "checked out" and where the `data_index` property is present, so as not to affect the new `getData` function.
        let prop = node.property.name;
        if (node.computed) return;
        if (prop == "data_index") return;
        // Put this property into the "storage" in `getData`
        data.push(t.stringLiteral(btoa(prop)));
        // Replace the property with a `getData` function call with the appropriate index
        node.property = t.callExpression(t.identifier("getData"), [
          t.numericLiteral(data.length - 1),
        ]);
        // Make the property computed
        node.computed = true;
      },
      // 3. Visit nodes of `StringLiteral` type. Replace strings with `getData` calls
      StringLiteral(path) {
        // Put the string into the "storage" in `getData`
        data.push(t.stringLiteral(btoa(path.node.value)));
        // Create a call to the `getData` function with the appropriate index
        const c = t.callExpression(t.identifier("getData"), [
          t.numericLiteral(data.length - 1),
        ]);
        // Replace this node with a newly created node
        path.replaceWith(c);
      },
    },
  };
}

module.exports = transformCode;
The code comments already describe how this plugin works, but let's briefly go through it step by step:
We create a `data` array that will store all the properties and strings to be replaced in the code. This array is used in the `getData` function, which returns our data;
Next, we traverse the AST and find the root `Program` node, through which the `getData` function (it returns properties and strings at a given index) is inserted at the beginning of our code;
Then we visit the nodes of type `MemberExpression` and replace properties with calls to the `getData` function. In this case, constructs like `document.createElement` are turned into `document[getData(0)]`, thanks to computed properties. Along the way, we put the names of the properties into the `data` array;
Finally, we visit nodes of type `StringLiteral`, where we also replace strings with a call to `getData` with the desired index.
It is worth mentioning that these steps are not performed one after another as separate passes: each handler runs whenever the traversal reaches a node of the matching type.
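The interleaving can be illustrated without Babel on a hand-written, simplified tree for `document.createElement("canvas")`. The `transform` and `print` helpers below are assumptions of this sketch and only cover the node types it needs; Babel's real traversal is far more general.

```javascript
// Simplified AST for: document.createElement("canvas")
const ast = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    object: { type: "Identifier", name: "document" },
    property: { type: "Identifier", name: "createElement" },
    computed: false,
  },
  arguments: [{ type: "StringLiteral", value: "canvas" }],
};

const data = []; // base64 entries, in the order the walk meets them

// Builds the node for a `getData(<index>)` call.
function getDataCall(index) {
  return {
    type: "CallExpression",
    callee: { type: "Identifier", name: "getData" },
    arguments: [{ type: "NumericLiteral", value: index }],
  };
}

// Returns a transformed copy of the node; properties and strings are
// handled as they are encountered, not in separate passes.
function transform(node) {
  if (node.type === "StringLiteral") {
    const index = data.push(btoa(node.value)) - 1;
    return getDataCall(index);
  }
  if (node.type === "MemberExpression" && !node.computed) {
    // Reserve the index before descending, so nested members cannot shift it.
    const index = data.push(btoa(node.property.name)) - 1;
    return {
      ...node,
      object: transform(node.object),
      property: getDataCall(index),
      computed: true,
    };
  }
  if (node.type === "CallExpression") {
    return {
      ...node,
      callee: transform(node.callee),
      arguments: node.arguments.map(transform),
    };
  }
  return node;
}

// Minimal printer for the node types used above.
function print(node) {
  switch (node.type) {
    case "Identifier": return node.name;
    case "NumericLiteral": return String(node.value);
    case "StringLiteral": return JSON.stringify(node.value);
    case "MemberExpression":
      return node.computed
        ? `${print(node.object)}[${print(node.property)}]`
        : `${print(node.object)}.${print(node.property)}`;
    case "CallExpression":
      return `${print(node.callee)}(${node.arguments.map(print).join(", ")})`;
    default: throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const output = print(transform(ast));
console.log(output, data);
```

The property name gets index 0 and the string gets index 1 simply because the walk reaches them in that order.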
As a result of executing this plugin, we will get the following code:
function getData(data_index) {
  let data = ["Y3JlYXRlRWxlbWVudA==", "Y2FudmFz", "Z2V0Q29udGV4dA==", "d2ViZ2w=", "Z2V0UGFyYW1ldGVy", "UkVOREVSRVI=", "Z2V0UGFyYW1ldGVy", "VkVORE9S", "bGFuZ3VhZ2Vz", "bmF2aWdhdG9y", "Z3B1", "bGFuZ3M=", "bG9n", "c3RyaW5naWZ5"];
  return atob(data[data_index]);
}

function hjj() {
  let bq = document[getData(0)](getData(1));
  let c = bq[getData(2)](getData(3));
  const a_x = c[getData(4)](c[getData(5)]);
  const nry = c[getData(6)](c[getData(7)]);
  return [a_x, nry];
}

function sztc() {
  return window[getData(9)][getData(8)];
}

let _o = {};
_o[getData(10)] = hjj();
_o[getData(11)] = sztc();
console[getData(12)](JSON[getData(13)](_o));
As you can see from the resulting code, all properties have been replaced by `getData` function calls with a given index. We did the same thing with strings and started to get them through function calls. The property names and strings themselves were encoded with `base64` to make them more difficult to notice...
I guess you have already noticed - this plugin, and the code in general, has flaws at this stage. For example, the following things could be corrected:
The function returning our properties and strings screams about its purpose: `getData`. The good thing is that this can be corrected by applying our first plugin, which renames identifiers.
The strings inside the `getData` function are not reliably protected; it is quite easy to recover their initial values, because it is only `base64`. This problem is harder to solve; for example, you can rework the `getData` function and apply encryption instead of the well-known encoding.
There is only one `getData` function, and it is not difficult to write a script that replaces all its calls with the original values by pulling out and executing the function itself.
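As a rough illustration of the second point, the table entries could be XOR-ed with a key before being base64-encoded. To be clear, this is a stand-in rather than real encryption: the key (`KEY` below, a made-up value) ships with the code, so it only raises the effort needed and defeats a plain base64 decode.

```javascript
// Hypothetical key; in practice it would be hidden or derived at run time.
const KEY = "k3y";

// XOR every character with the key, repeating the key as needed.
// Applying it twice with the same key restores the original text.
function xorWithKey(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(
      text.charCodeAt(i) ^ key.charCodeAt(i % key.length)
    );
  }
  return out;
}

// Done at transformation time: scramble first, then base64 for transport.
const encode = (text) => btoa(xorWithKey(text, KEY));
// Done at run time inside the lookup function.
const decode = (blob) => xorWithKey(atob(blob), KEY);

const data = ["createElement", "canvas"].map(encode);

function getData(index) {
  return decode(data[index]);
}

console.log(data, getData(0), getData(1));
```

Someone who copies and executes `getData` still gets the strings back, which is exactly the third flaw in the list above.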
Despite all this simplicity and downsides, I think it can already be called obfuscation. But then again, how do we differ from the open-source obfuscators, since they do similar things?
We've got to remember the original problem: those obfuscations were a piece of cake for public deobfuscators. Now, let's take the code we got and try to deobfuscate it in webcrack (hopefully, it still can't tackle our spell). I guess you could say the practical goal has been achieved: our "protected" code can no longer be pulled back in one click via a public deobfuscator.
Now let's learn a brand-new spell. Although public deobfuscators cannot handle our plugins, once we have studied the concept behind our obfuscation we can notice patterns that can be used to restore the source code.
Let's get into it, and specifically take advantage of the flaws noted earlier: the strings are only `base64`-encoded, and the single `getData` function that returns them can be copied out and executed.
Given these disadvantages, we can implement the following plugin:
// Name of the function, which retrieves properties and strings
let functionName = "getData"

// Function copied from the obfuscated code
function getData_copy(data_index) {
  let data = ["Y3JlYXRlRWxlbWVudA==", "Y2FudmFz", "Z2V0Q29udGV4dA==", "d2ViZ2w=", "Z2V0UGFyYW1ldGVy", "UkVOREVSRVI=", "Z2V0UGFyYW1ldGVy", "VkVORE9S", "bGFuZ3VhZ2Vz", "bmF2aWdhdG9y", "Z3B1", "bGFuZ3M=", "bG9n", "c3RyaW5naWZ5"];
  return atob(data[data_index]);
}

function transformCode(babel) {
  const { types: t } = babel;
  return {
    name: "deobf-str-props",
    visitor: {
      // 1. remove the `getData` function from the code
      FunctionDeclaration(path){
        if(path.node.id.name !== functionName) return
        path.remove()
      },
      // 2. Go through all calls with the name `getData`
      // Call the copied function with the current argument
      // Replace the call with the obtained result
      CallExpression(path) {
        if(path.node.callee.name !== functionName) return
        let index = path.node.arguments[0].value
        let str = t.stringLiteral(getData_copy(index))
        path.replaceWith(str)
      }
    },
  };
}

module.exports = transformCode;
Let's describe the functionality of this deobfuscation plugin:
We copied the `getData` function from the obfuscated code; by executing it with the required argument (index) we can get the required string;
We went through all `getData` function calls and replaced them with the result of its execution;
Finally, we found the `getData` function in the AST and removed it from the code, since it is no longer needed there.
As a result, we get the following code:
function hjj() {
  let bq = document["createElement"]("canvas");
  let c = bq["getContext"]("webgl");
  const a_x = c["getParameter"](c["RENDERER"]);
  const nry = c["getParameter"](c["VENDOR"]);
  return [a_x, nry];
}

function sztc() {
  return window["navigator"]["languages"];
}

let _o = {};
_o["gpu"] = hjj();
_o["langs"] = sztc();
console["log"](JSON["stringify"](_o));
Thus, by exploiting the weaknesses shown above, we were able to get rid of the obfuscation that hides properties and strings with a simple Babel plugin.
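The restored code still uses bracket access such as `console["log"]`. A natural follow-up step is to turn those back into dot notation. The sketch below does this on hand-written, simplified nodes; the `toDotNotation` helper and the conservative ASCII-only identifier check are assumptions of this sketch, and in a real Babel plugin the same logic would live in a `MemberExpression` visitor.

```javascript
// Conservative check: plain ASCII identifiers only.
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Returns a deep copy of the tree where obj["name"] becomes obj.name
// whenever "name" can be written as an identifier.
function toDotNotation(node) {
  if (Array.isArray(node)) return node.map(toDotNotation);
  if (node === null || typeof node !== "object") return node;

  const copy = {};
  for (const key of Object.keys(node)) copy[key] = toDotNotation(node[key]);

  if (
    copy.type === "MemberExpression" &&
    copy.computed &&
    copy.property.type === "StringLiteral" &&
    IDENTIFIER.test(copy.property.value)
  ) {
    copy.property = { type: "Identifier", name: copy.property.value };
    copy.computed = false;
  }
  return copy;
}

// Simplified AST for: console["log"]
const before = {
  type: "MemberExpression",
  object: { type: "Identifier", name: "console" },
  property: { type: "StringLiteral", value: "log" },
  computed: true,
};

const after = toDotNotation(before);
console.log(after.computed, after.property);
```

Keys that are not valid identifiers, such as `"my-key"`, are left as computed access, so the rewritten code keeps its behaviour.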
I hope this small example explains how you can fight such nuisances with the help of babel. Using these approaches, you can also solve more complex obfuscations - the main thing is to find patterns in the code and skillfully operate with AST.
We've explored obfuscation, a technique designed to make reverse engineering code more challenging, and the tools available to implement it. While public solutions exist for obfuscating JavaScript code, many are just as easily bypassed by equally public deobfuscators.
To truly safeguard your code, it's crucial to develop custom solutions that can't be undone with off-the-shelf tools. A reliable approach to obfuscating JavaScript is to create custom Babel plugins. These plugins interact directly with the Abstract Syntax Tree (AST) of your code, transforming it into a form that's significantly harder to read and analyze.
Of course, this field has well-known techniques and approaches to obfuscation, yet it remains open to creativity and new "tricks" that can potentially make code analysis more challenging. Despite the variety of such techniques, they don't guarantee complete secrecy of algorithms, as the code ultimately resides "in the hands" of the client. Additionally, debugging tools can simplify the process of studying the code. Obfuscation primarily serves to deter less motivated researchers, thereby increasing the cost and effort required for reverse engineering.
There are also more advanced approaches. One of them is code virtualization, or simply put, creating a virtual machine in JS that executes custom bytecode. This approach almost completely rules out static analysis and makes debugging as difficult as possible. However, this is a separate subject for discussion...
I hope you found this topic useful, and that you won't blame yourself or your programmers for initially obfuscated code anymore. Appreciate these wizards! I will be glad to discuss the latest trends in magic with you.
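Just to give a feel for the idea, here is a toy stack machine. The opcodes and the `run` function are invented for this sketch; production virtualizers use far richer instruction sets and protect the bytecode as well.

```javascript
// Hypothetical opcodes, invented for this sketch.
const OP = { PUSH: 0, ADD: 1, MUL: 2, RET: 3 };

// Executes a flat array of opcodes and operands on a value stack.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH:
        stack.push(bytecode[pc++]); // operand follows the opcode
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case OP.RET:
        return stack.pop();
      default:
        throw new Error(`Unknown opcode ${op} at position ${pc - 1}`);
    }
  }
  throw new Error("Program ended without RET");
}

// Bytecode for (2 + 3) * 4; the original expression appears nowhere in the code.
const program = [OP.PUSH, 2, OP.PUSH, 3, OP.ADD, OP.PUSH, 4, OP.MUL, OP.RET];
const result = run(program);
console.log(result);
```

A reader who sees only the interpreter and an array of numbers has to understand the instruction set before the protected logic starts to make sense.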