Deobfuscating Javascript Malware

An edited version of this post has been added to my company blog at Checkmate

A few days ago I was greeted by a Google Safe Browsing warning when I tried to visit a ‘known’ site. As I was sure it was supposed to be a clean and harmless site, I decided to dig further into the problem. The trail led to an interesting amount of code, concepts and techniques.

Malware writers are very smart nowadays (haven’t they always been?). They know that once their code is understood, it is most likely to be detected by anti-malware applications. To delay detection by such applications, they resort to a wide range of techniques. In this blog post I’ll be discussing one of the most potent and easily created kinds of malware: malicious JavaScript.

Javascript has become the boon and bane of the Internet. It provides greater interactivity with the user but can also be used by malware writers to infect innocent users. Javascript is a client-side scripting technology which means the processing of the script is handled by the user’s browser.

Obfuscation is the concealment of intended meaning in communication, making communication confusing, intentionally ambiguous, and more difficult to interpret.

JavaScript is sometimes obfuscated to prevent users from easily understanding its functionality. (A legitimate use is to prevent code theft.)

There may be many ways to obfuscate code and, similarly, multiple ways to de-obfuscate it. What I’ve presented below is very raw and will not be enough to analyze many malicious scripts. But since this is the beginning for me, I thought it may help others too.

Disclaimer: Links presented below are live at the time of writing this blog post. Please do not visit them if you do not know what you are getting into.

First things first: we need to get the HTML source of the malicious page. We can use either wget/curl or Malzilla, which is what I used. I observed that this page depends on the HTTP referrer: if the domain receives a request for the page without a ‘valid’ HTTP referrer, the page is not returned.
We get the ‘bad’ HTML at http://mybetorwager.cn:8080/index.php with a valid HTTP referrer.
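
For readers who prefer a script to wget/curl or Malzilla, the same request can be sketched in Node.js. This is only a minimal sketch: `buildRequestOptions` and `fetchSource` are hypothetical helper names, and both URLs in the usage note are placeholders, since the referrer the server actually accepted is not recorded here.

```javascript
const http = require('http');

// Build request options that carry a Referer header (hypothetical helper).
function buildRequestOptions(pageUrl, referrer) {
  const u = new URL(pageUrl);
  return {
    hostname: u.hostname,
    port: u.port || 80,
    path: u.pathname + u.search,
    method: 'GET',
    headers: { Referer: referrer },
  };
}

// Fetch the page source and hand it to a callback as text.
// Only run this from an isolated analysis machine.
function fetchSource(pageUrl, referrer, onDone) {
  const req = http.request(buildRequestOptions(pageUrl, referrer), (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => onDone(null, body));
  });
  req.on('error', onDone);
  req.end();
}
```

A call would look like `fetchSource('http://malicious.example:8080/index.php', 'http://referring-site.example/', callback)`, with the placeholders swapped for the page under analysis and a referrer the server accepts.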

The complete HTML source can be viewed here

The code starts off with the following in the SCRIPT tag.

Vhotzdq(function(p,a,c,k,e,d)

This section of the code shows that the JavaScript has been packed with the popular Dean Edwards JS Packer. The packer is available online as well as in downloadable form. We use a GreaseMonkey script, “Decode It!”, to enable the online ‘Decoder’ on the webpage.


We paste the code from Vhotzdq(function(p,a,c,k,e,d) onwards till the end and rename the function Vhotzdq to eval. This lets the decoder evaluate the packed code and show the result, the output of which can be found here
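
A variation on the rename step, for readers who would rather not evaluate anything: the packed function only returns the unpacked source as a string, so the wrapper name can be bound to a function that records its argument instead of running it. The sketch below is an illustration, not the sample itself. The payload is a harmless stand-in written in the general shape of packer output, and `captured` is a hypothetical name.

```javascript
// Collects whatever the packed script hands to its wrapper function.
const captured = [];

// Stand-in for the wrapper: record the unpacked source, never execute it.
function Vhotzdq(unpackedSource) {
  captured.push(unpackedSource);
}

// Harmless stand-in payload in the general shape of packer output:
// p = packed text, a = radix, c = word count, k = dictionary of words.
Vhotzdq(function (p, a, c, k, e, d) {
  while (c--) {
    if (k[c]) {
      // Swap each numeric token back for its dictionary word.
      p = p.replace(new RegExp('\\b' + c.toString(a) + '\\b', 'g'), k[c]);
    }
  }
  return p;
}('0("1")', 10, 2, 'alert|hi'.split('|'), 0, {}));

console.log(captured[0]); // alert("hi")
```

The unpacked text is only printed, so a later stage that is itself packed or escaped can be inspected before deciding what to do with it.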

—————————————————-

Update 2: 16th September,2009
It seems Dean Edwards has written an UNPACKER as well. It can be accessed at http://dean.edwards.name/unpacker/. If using this tool, simply replace Vhotzdq with eval and run the script. No additional GreaseMonkey scripts are necessary :-)

—————————————————-

Unpacked Javascript using Dean Edwards Packer

As can be seen above, we need to unescape the code to get the decoded output. This can be done in multiple ways:

Using the 'unescape' feature provided by PHP Charset Encoder

The decoded output of the above step can be found here
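
The same unescape step can also be done directly in JavaScript. A minimal sketch, where the escaped string is a harmless stand-in rather than a fragment of the actual sample:

```javascript
// Harmless stand-in for an escaped fragment like the ones in the unpacked script.
const escaped = '%64%6F%63%75%6D%65%6E%74%2E%77%72%69%74%65';

// unescape() is a legacy global that understands both %XX and %uXXXX escapes.
const decoded = unescape(escaped);
console.log(decoded); // document.write

// decodeURIComponent() handles %XX (as UTF-8) but rejects %uXXXX sequences.
const alsoDecoded = decodeURIComponent(escaped);
```

For plain `%XX` escapes either function works. `unescape()` is the one to reach for when the script mixes in `%uXXXX` sequences.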

Now the code is in a more human-readable format. To further complicate analysis, the malware authors have applied small string manipulations to the code. Also, the variable names have been obfuscated or mangled. This does not pose a problem for us, as the variables can be renamed to anything.

Note that a certain block of code is still encoded. On decoding this further, I was presented with what looked like a statement in a non-English language. I wasn’t able to figure out the use of this code. A guess would be that a message/error is inserted here, most likely in the malware author’s native language. Another malware analysis shows this section as the shellcode. I will update this as I get more information on how to decode it.

—————————————————-

Update 1: 14th September,2009
OK, it turns out that the segment was indeed the shellcode. Using the Malzilla tool we concatenate the variable “var unf57UBnT
This presents us with an encoding which seems to be UCS2. Next, we can either use Malzilla to convert UCS2 to Hex (which does not provide reliable results) or use a shellcode to EXE converter available at http://sandsprite.com/shellcode_2_exe.php.

ShellCode 2 EXE
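
If neither tool is at hand, the UCS2-to-hex conversion can be sketched by hand. This assumes the shellcode string is made of `%uXXXX` escapes (which is how such strings usually appear in script) and that each 16-bit word is stored little-endian, as on x86. `ucs2ToHexBytes` is a hypothetical helper name and the input is a two-word stand-in, not the actual payload.

```javascript
// Convert "%uXXXX" escapes into a hex string of raw bytes.
// Assumption: each %uXXXX is a 16-bit word stored little-endian (x86),
// so %u54EB becomes the bytes EB 54.
function ucs2ToHexBytes(escapedShellcode) {
  const words = escapedShellcode.match(/%u[0-9a-fA-F]{4}/g) || [];
  return words
    .map((word) => {
      const high = word.slice(2, 4); // first two hex digits
      const low = word.slice(4, 6);  // last two hex digits
      return (low + high).toLowerCase();
    })
    .join('');
}

console.log(ucs2ToHexBytes('%u54EB%u9090')); // eb549090
```

The resulting hex string can then be loaded into a hex editor or a shellcode-to-EXE converter.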


Once we obtain the EXE from the shellcode, we can analyze this executable in a tool called FileInsight, developed by McAfee Labs. Below is a snapshot of the FileInsight analysis output, which shows the malicious URL.
FileInsight - Shellcode.exe analysis

URLMON.DLL is a system DLL commonly used by malware to download files from online locations.
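
A rough stand-in for spotting such a URL without a dedicated tool is a simple "strings" pass over the raw bytes. This is not what FileInsight does internally, only a sketch. It assumes the URL sits in the shellcode as plain ASCII, which fails if the payload is XOR-encoded. `printableRuns` is a hypothetical helper and the sample bytes are made up.

```javascript
// Return every run of printable ASCII at least minLength characters long.
function printableRuns(bytes, minLength = 6) {
  const runs = [];
  let current = '';
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      if (current.length >= minLength) runs.push(current);
      current = '';
    }
  }
  if (current.length >= minLength) runs.push(current);
  return runs;
}

// Made-up bytes: two NOPs, a placeholder URL, a terminator, a short jump.
const sample = Buffer.from([
  0x90, 0x90,
  ...Buffer.from('http://example.invalid/x.exe'),
  0x00, 0xeb, 0x54,
]);

console.log(printableRuns(sample));
```

Strings such as a URL or a DLL name surfacing from this pass are a quick hint about what the shellcode is meant to download.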

—————————————————-
The next step is to execute the ‘replace’ functions, which use regular expressions to clean up the manipulated code.
As an example, below is a line of code from our decoded output.

rqeqG6Spq.setAttribute('i#)@d!'.replace(/\(|\!|&|\$|@|\^|\)|#/ig, ''),rqeqG6Spq);

Let’s look at this code in detail:

rqeqG6Spq –> a declared variable
setAttribute –> a method called on the object held in rqeqG6Spq
/\(|\!|&|\$|@|\^|\)|#/ig –> the regular expression
(In JavaScript, a regex pattern is written between /…../ .
‘g’ indicates a global match and ‘i’ a case-insensitive search.)
.replace() –> a JavaScript string method, which here runs the regex on the string i#)@d! and replaces every match with an empty string

After executing the replace() function, the output looks like this:

rqeqG6Spq.setAttribute('id',rqeqG6Spq);

Similar replace operations are performed at all other places, till we get the final output as shown here
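
Doing these replacements by hand gets tedious, so the step can be scripted. The sketch below rewrites the source text, folding each quoted literal followed by `.replace(/…/flags, '')` into its cleaned-up literal without executing the script. `foldReplaceCalls` is a hypothetical helper, and it assumes the calls follow the exact shape of the line above (single-quoted literal, regex literal, empty replacement string).

```javascript
// Matches: 'literal'.replace(/pattern/flags, '')
const CALL = /'([^']*)'\.replace\(\/((?:\\.|[^\/\\])+)\/([a-z]*),\s*''\)/g;

// Replace every such call in the source text with the cleaned string literal.
function foldReplaceCalls(source) {
  return source.replace(CALL, (whole, literal, pattern, flags) => {
    const cleaned = literal.replace(new RegExp(pattern, flags), '');
    return "'" + cleaned + "'";
  });
}

const line =
  "rqeqG6Spq.setAttribute('i#)@d!'.replace(/\\(|\\!|&|\\$|@|\\^|\\)|#/ig, ''),rqeqG6Spq);";

console.log(foldReplaceCalls(line));
// rqeqG6Spq.setAttribute('id',rqeqG6Spq);
```

Only the junk-stripping regex is ever run, and only against the string literal, so none of the malicious logic executes.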

NOTE: Your Anti-Malware may issue an alert when you try to visit the above link. I have modified the malicious URL a bit so the script won’t move ahead.

We are now at a stage where we can make a few observations on what the JavaScript does and how it works.
The original malicious URL is found to be http://3c8.ru:8080/welcome.php. This domain serves the malware payload.
The script tries to exploit a vulnerability in ActiveX which allows it to download and execute a malicious binary.
I haven’t had the chance to go deeper into the execution of the malware, but once I get the time, I’ll look into analyzing the binary as well.

Before I end this long post, a quick note: to automate this entire process, we can use an online tool called Wepawet, a service for detecting and analyzing web-based malware. It currently handles Flash and JavaScript files.
You can find the result of the analysis of our malicious page at http://wepawet.iseclab.org/view.php?hash=07fc283602731721a97f196c3ab19092&type=js
It provides a comprehensive analysis.

Also, do check out the VirusTotal scan results for the obfuscated and de-obfuscated JavaScript:
Obfuscated: detection rate is 2/41
De-obfuscated: detection rate is 14/41

I guess that’s it. Hope you liked this basic tutorial. Do leave your feedback in the comments section below.

5 thoughts on “Deobfuscating Javascript Malware”

  1. Great analysis!

    I am currently working on an automated remote virus scanner to detect exactly these kinds of things and add it to my existing vulnerability scanner. One tool I find valuable is Mozilla js, which allows evaluation of JavaScript outside the browser; this helps when the script generates more code (such as nested, generated JavaScript attacks, each one obfuscated).

    The hardest part is detecting signatures like the shellcode you show, but I think just flagging obfuscated code as potential malware if it doesn’t match the DB is enough in most cases.

    • Thanks for the comment Charlie.

      You can add “packed” JS code to your detection mechanisms as well. I’ve seen a lot of JS malware using packers, Dean Edwards’ Packer being the most popular.

      From my experience, I would say, to optimize detection of malicious JS files, look into three aspects of the code:

      1. Packed Code
      2. Completely encoded code
      3. Random variable names – a kind of string obfuscation

      Based on this you can assign the JS code a ‘malicious’ rank.
      Some other tools you can look into are Rhino and SpiderMonkey
