JavaScript obfuscation in PDF: Sky is the limit
I know that most of you are probably already sick of malicious PDF documents, but one of our readers, Will Thomson, sent in a really interesting one that uses some more advanced obfuscation techniques, and I wanted to share it with everyone. So, let's get to work.
The PDF document Will sent contains a JavaScript section that gets called immediately after the document is opened. This is pretty standard today. The JavaScript section is relatively short and, as the code excerpt below shows, it uses the app.doc.getAnnots() call (line 17) to retrieve the document's annotations:
When called like this, app.doc.getAnnots() returns an array of objects containing all of the document's annotations. This is important to remember.
Now, in line 4 you can see that the num variable is set to 1, which causes the sum variable (line 18) to contain the second annotation's subject field (array indices start at 0). This string is then deobfuscated by a loop at line 27, and finally, at line 39, the second JavaScript layer is executed with an eval() call.
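The flow of this first layer can be sketched roughly as follows. This is only an illustration and not the sample's code: the stub app object, the annotation contents and the hex-pair encoding are assumptions made so the snippet runs in a standalone interpreter. Only the app.doc.getAnnots() call, the index 1 and the subject field come from the sample, and the final eval() is replaced with a harmless print.

```javascript
// Hypothetical stand-in for the object Acrobat Reader provides.
// Outside the reader, "app" does not exist.
var app = {
  doc: {
    getAnnots: function () {
      return [
        { subject: "first annotation" },
        { subject: "61 6c 65 72 74 28 31 29" } // hex bytes of: alert(1)
      ];
    }
  }
};

var num = 1;                      // index of the second annotation
var annots = app.doc.getAnnots(); // array of annotation objects
var sum = annots[num].subject;    // obfuscated second layer

// Assumed encoding for this sketch: space-separated hex bytes.
// The loop in the real sample is different.
var parts = sum.split(" ");
var decoded = "";
for (var i = 0; i < parts.length; i++) {
  decoded += String.fromCharCode(parseInt(parts[i], 16));
}

// The sample would call eval(decoded) here; printing is the safe alternative.
console.log(decoded);
```

Printing instead of evaluating is the usual way to peel off one layer at a time without running the next one.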
This second JavaScript layer looks familiar at first sight: the huge "blob" of obfuscated data is passed as an argument to a function called iXM__8f_ITb, which deobfuscates it. One trap is immediately visible: the author used arguments.callee to prevent us from modifying the deobfuscation function, but this can normally be evaded easily by redefining eval(). However, if this is done and the function is called, it only prints out some meaningless numbers! Yet the PDF is malicious when tested on a vulnerable Acrobat Reader, so something else must be going on here.
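To illustrate why a self-referencing decoder resists editing, here is a minimal sketch of the general idea, assuming the decoder derives its key from its own source text. The function names, the length-based key and the XOR encoding are all hypothetical. The sample reaches its own source through arguments.callee, while the sketch uses the function's name so that it also runs in strict mode.

```javascript
// Key derived from a function's own source text. Using only the length keeps
// the sketch short; a real sample would more likely hash the whole text.
function keyFromSource(fn) {
  return fn.toString().length & 0xff;
}

// Untouched decoder (hypothetical): XORs every byte with the derived key.
function decoder(data) {
  var key = keyFromSource(decoder);
  var out = "";
  for (var i = 0; i < data.length; i++) {
    out += String.fromCharCode(data[i] ^ key);
  }
  return out;
}

// The same decoder after an analyst added one logging statement.
function patchedDecoder(data) {
  var key = keyFromSource(patchedDecoder);
  var out = "";
  for (var i = 0; i < data.length; i++) {
    out += String.fromCharCode(data[i] ^ key);
  }
  console.log("decoded so far: " + out);
  return out;
}

// Encode a harmless payload with the key of the untouched decoder.
var payload = "alert(1)";
var blob = [];
for (var j = 0; j < payload.length; j++) {
  blob.push(payload.charCodeAt(j) ^ keyFromSource(decoder));
}

var clean = decoder(blob);           // recovers the payload
var tampered = patchedDecoder(blob); // source changed, so the key changed
console.log(clean === payload);
console.log(tampered === payload);
```

Because the edit changes the function's text, the derived key changes with it and the output turns into garbage. This is why analysts prefer to leave the decoder untouched and hook eval() instead.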
Take a look at the code below, which I tidied up a bit so it is easier to read:
Lines 6-13 are especially important. So, what do the attackers do here?
- First, the variable n_AXr11_7Wdj is assigned the value 0,
- Then, on line 8, the existence of the app object is tested. This object is created by Acrobat Reader, so if you run the script outside of the reader (for example, with SpiderMonkey or another JavaScript interpreter) the test will fail, since the object does not exist. We can create that object easily, but we are still not done,
- On line 10, the h__l_S_1__f variable is set to pr[n_AXr11_7Wdj].subject. Since n_AXr11_7Wdj is 0, this equals pr[0].subject. Remember what the pr array is? It contains the annotations. In other words, this uses the first annotation.
If all of these checks pass, the contents of the first annotation are used as the deobfuscation key; if any part fails, the deobfuscation function simply prints some numbers. Clever indeed! The final JavaScript layer just exploits the old Collab.collectEmailInfo vulnerability.
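The environment-keyed step can be sketched as follows. This is a simplified illustration under stated assumptions, not the sample's code: the env parameter stands in for the global scope (in the reader, app is a global), and the XOR encoding, the fallback key, the stub object and the annotation values are hypothetical.

```javascript
// Decode "data" with the subject of the first annotation as the key.
// "env" stands in for the global scope; it is an assumption of this sketch.
function deobfuscate(data, env) {
  var n = 0;
  var key = "0"; // fallback used when the reader environment is missing
  if (env.app && env.app.doc) {
    var pr = env.app.doc.getAnnots(); // array of annotations
    key = pr[n].subject;              // first annotation is the key
  }
  var out = "";
  for (var i = 0; i < data.length; i++) {
    out += String.fromCharCode(data[i] ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// Hypothetical stub an analyst could define before running the script
// in a standalone interpreter.
var readerStub = {
  app: {
    doc: {
      getAnnots: function () {
        return [{ subject: "k3y" }, { subject: "second layer" }];
      }
    }
  }
};

// Build a test blob by encoding a harmless string with the real key.
var payload = "final layer";
var realKey = "k3y";
var blob = [];
for (var j = 0; j < payload.length; j++) {
  blob.push(payload.charCodeAt(j) ^ realKey.charCodeAt(j % realKey.length));
}

var withStub = deobfuscate(blob, readerStub); // correct key, payload recovered
var withoutApp = deobfuscate(blob, {});       // no app object, garbage output
console.log(withStub === payload);
console.log(withoutApp === payload);
```

The practical consequence for analysis is that stubbing the app object is not enough on its own: the stub has to return the annotation subjects extracted from the actual PDF, or the key will be wrong.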
Why are the attackers going to such lengths with obfuscation, you might ask? Well, the obvious answer is to make detection (and analysis) more difficult. And it looks like they were successful: when I uploaded the document to VirusTotal (VT), only 6 out of 39 AV programs detected it as malicious, with most of the big names missing it. Wepawet, another great service by UCSB for analyzing malicious JavaScript/PDF/Flash files, handled the file better and managed to analyze it correctly. Kudos to the UCSB team.
While a lot has already been said about the importance of patching Adobe Reader installations, I would like to stress it again, as attackers are clearly not sleeping.
--
Bojan
INFIGO IS