    The Beginner’s Guide to Adobe PDF Malware Reverse Engineering, Part 2

    By BUFFERZONE Team, 15/06/2023

    Target audience: Cybersecurity specialists

    Tags: Adobe PDF, Malware, Content Disarm and Reconstruction (CDR), Reverse Engineering

    In this blog, we continue from Part 1 of the PDF malware analysis and investigate more complex malware.

    Collection:

    In this blog, we retrieve a suspicious file from MalwareBazaar [4] and examine the PDF file together (remember to work inside a virtual machine). Using the “file_type:pdf” filter, we list the most recently uploaded PDF files. We then download the latest file, which has the SHA-256 hash: 304a28d5e9010331c8f183b5932d0420410cf5e749f84cdd02d9992abd397285. We chose this file because it does not use a phishing/luring style and has interesting attributes that we wish to discuss.
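
    Before analyzing a downloaded sample, it is worth confirming that the file on disk is the one we intended to fetch. The following is a minimal sketch, assuming the sample has already been saved locally; the function names are our own and are not part of any MalwareBazaar tooling.

```python
import hashlib

# SHA-256 of the sample discussed in this blog, as listed on MalwareBazaar.
EXPECTED_SHA256 = "304a28d5e9010331c8f183b5932d0420410cf5e749f84cdd02d9992abd397285"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file on disk has the expected SHA-256 hash."""
    return sha256_of_file(path) == expected.lower()
```

    Calling matches_expected("<file>") inside the virtual machine should return True for the sample used here; any other result suggests a different or corrupted download.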

    This file now has a high detection rate on VirusTotal [7]:

    This suggests that the file uses known attack vectors. To verify this, we start with static analysis.

    Reverse Engineering PDF File Using Static Analysis

    In this blog, we focus on PDFiD [5], Pdfalyzer [6], and pdf-parser [5].

    PDFiD

    By running python pdfid.py <file> we will get the following output:

    PDFiD reports 25 objects, 4 streams, and 2 pages, along with the key execution indicators /AA, /OpenAction, and /Launch. Such execution commonly involves /JS and /JavaScript, which indicate that active scripting is used. This helps us prioritize the initial search with Pdfalyzer.
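
    To illustrate the idea behind this kind of triage, the following is a simplified sketch that counts the names mentioned above in the raw bytes of a file. It is not how PDFiD is implemented: PDFiD also handles obfuscated (hex-escaped) names and reports more fields, which this sketch ignores. The function name and the keyword list are our own.

```python
import re

# Names that this blog treats as execution indicators.
SUSPICIOUS_NAMES = ["/AA", "/OpenAction", "/Launch", "/JS", "/JavaScript"]


def count_pdf_names(data, names=SUSPICIOUS_NAMES):
    """Count occurrences of each PDF name in raw bytes.

    A name is only counted when it is not followed by another letter or
    digit, so "/AA" does not match inside a longer name such as "/AAA".
    """
    counts = {}
    for name in names:
        pattern = re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9])"
        counts[name] = len(re.findall(pattern, data))
    return counts
```

    Reading the sample with open("<file>", "rb").read() and passing the bytes to count_pdf_names gives a quick view of which indicators are present. Names inside compressed object streams are not visible to such a raw scan.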

    Pdfalyzer

    By running pdfalyze <file>, we can explore the file tree structure:

    From this we can observe that object 23 (Action: Launch) and object 22 (Action: JavaScript) are interesting and highlighted in red. We will start by examining object 22 and then move to object 23.

    We observe that the JavaScript executes exportDataObject [9]. According to the documentation, the “cName” parameter is mandatory and names the file attachment to export. The optional “nLaunch” parameter takes one of three values:

    0: The file is saved and not opened.

    1: The file is saved and then opened.

    2: Acrobat saves the file attachment to a temporary file and then asks the operating system to open it (Acrobat does not know which programs handle specific file types, whereas the OS does).
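
    When reviewing extracted JavaScript, it can help to pull out the exportDataObject arguments automatically. The following is a rough sketch that uses regular expressions, not a JavaScript parser, so it only handles simple literal calls. The function name and the example attachment name are hypothetical, and treating a missing nLaunch as 0 is our assumption about the default.

```python
import re

# Meaning of each nLaunch value, as summarized in the list above.
NLAUNCH_MEANING = {
    0: "save the attachment only",
    1: "save the attachment, then open it",
    2: "save to a temporary file, then ask the OS to open it",
}

# Matches exportDataObject({ ... }) with a literal object argument.
EXPORT_CALL = re.compile(
    r"exportDataObject\s*\(\s*\{(?P<args>.*?)\}\s*\)", re.DOTALL
)


def describe_export_calls(script):
    """Return (cName, nLaunch, meaning) for each exportDataObject call found."""
    results = []
    for match in EXPORT_CALL.finditer(script):
        args = match.group("args")
        name = re.search(r"cName\s*:\s*[\"']([^\"']*)[\"']", args)
        launch = re.search(r"nLaunch\s*:\s*(\d+)", args)
        # Assumption: a missing nLaunch is treated as 0.
        n_launch = int(launch.group(1)) if launch else 0
        results.append((
            name.group(1) if name else None,
            n_launch,
            NLAUNCH_MEANING.get(n_launch, "unknown value"),
        ))
    return results
```

    For a script such as this.exportDataObject({ cName: "sample.exe", nLaunch: 2 }); the sketch reports the attachment name together with the temporary-save-and-open behavior, which is the combination that deserves the most attention.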

    Object 23 (Action: Launch) then launches cmd.exe with the file that was saved.

    The form is found in object 20, which leads to object 21:

    Pdf-parser

    Object 21 appears to be a FlateDecode stream. To analyze it, we will follow these steps:

    1. Execute the pdf-parser [5] command and save the file:

    > python pdf-parser.py -f -o 21 -d extract_21 <file>

    2. The file will be saved as extract_21.
    3. Next, we use the Linux file command (a file-type detector) to determine the object’s nature. The attachment exported by the JavaScript in object 22 is identified as a PE32 executable rather than a PDF.

    > file extract_21

    The result will indicate: PE32 executable (GUI) Intel 80386, for MS Windows.

    4. Now, let us calculate the MD5 hash of the exported file:

    > md5 extract_21

    5f00d238716e3f337786f4355b2b9787
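
    The decode and file-type steps above can also be approximated in a few lines of Python. This is a minimal sketch under stated assumptions: it expects the raw bytes of a FlateDecode stream as input, and its PE check is a simple header heuristic rather than a replacement for the file command. The function names are our own.

```python
import zlib


def inflate_stream(raw):
    """Decompress the body of a /FlateDecode stream (zlib data)."""
    return zlib.decompress(raw)


def looks_like_pe(data):
    """Heuristic check for a Windows PE file: 'MZ' header plus PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # Offset 0x3C holds a little-endian pointer to the PE signature.
    pe_offset = int.from_bytes(data[0x3C:0x40], "little")
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

    With pdf-parser, the -f option already applies the stream filter, so extract_21 is the decoded content; calling looks_like_pe on its bytes should agree with the output of the file command.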

    The next step is to search for the executable’s MD5 hash on VirusTotal: 5f00d238716e3f337786f4355b2b9787
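
    The hash calculation and lookup can be scripted as well. The sketch below computes the MD5 of the extracted bytes and builds a VirusTotal page address; the URL pattern is an assumption modeled on the link in reference [7], and the function names are our own. It does not query VirusTotal itself.

```python
import hashlib
import re

# Assumed URL pattern, modeled on the VirusTotal link in reference [7].
VT_FILE_URL = "https://www.virustotal.com/gui/file/{digest}"


def md5_hex(data):
    """Return the hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def virustotal_url(digest):
    """Build a VirusTotal page URL for an MD5 (32 hex chars) or SHA-256 (64)."""
    digest = digest.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{32}|[0-9a-f]{64}", digest):
        raise ValueError("expected an MD5 or SHA-256 hex digest")
    return VT_FILE_URL.format(digest=digest)
```

    Passing the contents of extract_21 to md5_hex and the result to virustotal_url yields an address that can be opened in a browser outside the analysis machine.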

    We can conclude that the file contains a malicious embedded object (detected by 57 engines).

    Summary

    In this blog, we built on the first part and investigated a more intricate PDF malware attack. Attack patterns may vary, but the research approach remains consistent. We hope you found Part 2 of the beginner’s PDF guide informative. Please visit our website for upcoming blog posts.

     

    References

    [1] Adobe PDF, https://www.adobe.com/acrobat/about-adobe-pdf.html

    [2] Common Crawl data statistics, https://commoncrawl.github.io/cc-crawl-statistics/plots/mimetypes

    [3] Dubin, Ran. “Content Disarm and Reconstruction of PDF Files.” IEEE Access (2023).

    [4] MalwareBazaar, Public Malware Repository, https://bazaar.abuse.ch/

    [5] Didier Stevens, PDF tools, https://blog.didierstevens.com/programs/pdf-tools/

    [6] Pdfalyzer, https://github.com/michelcrypt4d4mus/pdfalyzer

    [7] VirusTotal, https://www.virustotal.com/gui/file/d0265161d0ed290ff81ff99e4571de9b709b357c9e663ad2b4519b68497705f5

    [8] Yara, https://virustotal.github.io/yara/

    [9] Adobe PDF importing and exporting attachments, https://acrobatusers.com/tutorials/print/importing-and-exporting-pdf-file-attachments-acrobat-javascript/