The Beginner’s Guide To – RTF Malware Reverse Engineering Part 2
By BUFFERZONE Team, 21/08/2023
Target: Cybersecurity specialist
Tags: RTF, Word, Malware, Content Disarm and Reconstruction (CDR), Reverse Engineering, Zero-Trust.
In this second blog post, we focus on RTF file analysis with RTFOBJ from the OleTools suite, and we use the “file” command and an MD5 hash to verify the content of the dropped file. For more information about the RTF file structure, please review the first part of this series (Link) and the detailed research on RTF Content Disarm and Reconstruction (CDR).
Malware Investigation Research Steps:
Investigating RTF malware requires a careful and systematic approach. Below are highly suggested steps we conduct in our research:
- Isolation: Always work in a safe environment when dealing with potential malware. This usually means using a sandbox or a dedicated, isolated system that is not connected to your network. In this blog, we work inside an Ubuntu virtual machine.
- Collection: The first step is gathering potentially malicious RTF files. These can be sourced from spam emails, suspicious websites, or threat intelligence feeds. We use MalwareBazaar, a public malware repository, to obtain interesting samples for analysis.
- Static Analysis: Start by examining the RTF file without executing it. This includes viewing the file metadata, the structure, and any embedded objects, scripts, or unusual elements. In this blog, we use RTFOBJ from the OleTools suite, together with the Yara static signature engine and the malware signatures from the ditekshen detection Yara rule set.
- Dynamic Analysis: This involves monitoring the behavior of the RTF file when it is opened. You would typically use a sandbox environment for this, which can safely log the actions of the file, such as network connections, file system modifications, or registry changes. Dynamic analysis often exposes evasive behaviors that were missed during static analysis or that the analyst has not seen before. This part is outside the focus of this blog.
- Payload Extraction: If the RTF has an embedded payload, this will need to be extracted for further analysis. This could be another file, a script, or something else. Payload extraction can be done as part of either the static or the dynamic analysis.
- Code Analysis: If the RTF includes embedded or obfuscated code, such as OLE (Object Linking and Embedding) objects or PowerShell, this must be analyzed. This involves de-obfuscating the code, understanding its functionality, and identifying any potential exploits or vulnerabilities it might use. This will be done as part of our static analysis investigation.
- Threat Intelligence Correlation: Correlate the information collected about the malware with threat intelligence data. This can reveal the possible threat actors, their campaigns and methods, and whether the malware has been observed before. This step starts after collection and continues during the static and dynamic analysis. When we discover Indicators of Compromise (IOCs), such as hashes of dropped files (SHA-256 or MD5), URLs, and IP addresses found in the file, we can use threat intelligence to better understand the file’s capabilities.
- Reporting: Finally, document your findings. This report should detail the characteristics of the malware, how it works, its impact, and recommended mitigation strategies.
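As a rough illustration of the IOC collection mentioned in the steps above, the following Python sketch hashes a sample and pulls out URL and IPv4 candidates with regular expressions. The function name collect_iocs and both patterns are our own assumptions for illustration; they are deliberately loose and will miss obfuscated or defanged indicators, so treat the output as a starting point rather than a complete IOC list.

```python
import hashlib
import re

# Deliberately simple patterns (assumptions for illustration only).
URL_RE = re.compile(rb"https?://[^\s\"'<>{}\\]+", re.IGNORECASE)
IPV4_RE = re.compile(
    rb"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)


def collect_iocs(data: bytes) -> dict:
    """Return hashes plus URL and IPv4 candidates found in raw file bytes."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        # Sets remove duplicates; sorting keeps the report stable.
        "urls": sorted({m.decode("ascii", "replace") for m in URL_RE.findall(data)}),
        "ipv4": sorted({m.decode("ascii") for m in IPV4_RE.findall(data)}),
    }
```

Inside the isolated virtual machine, this could be called as collect_iocs(open(sample_path, "rb").read()), where sample_path is a placeholder for the file under investigation.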
Always remember to stay safe when investigating potential malware, and only do so in a controlled and isolated environment. It is important to keep systems and software up to date to protect against known vulnerabilities that malware often exploits. This tutorial is for educational purposes only. Please take full responsibility while handling dangerous malicious files.
In this blog, we will investigate sha256: 0ea61e3db99c96cf0b148d6f2ebab3ed8860c17be0298a7e5469330b0eecb7d7
The first stage is reviewing the file in VirusTotal to get its reputation and other information about it.
We can observe that the file is detected as malicious by 36 engines, and that the popular threat label is trojan.noon/generickds.
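The same lookup can be scripted. The sketch below assumes the public VirusTotal API v3 file endpoint and an API key of your own; the helper names lookup_hash and summarize_report are hypothetical, and only two fields of the report are read. It is a minimal sketch without the error handling or rate limiting that a real workflow would need.

```python
import json
import urllib.request

# VirusTotal API v3 file report endpoint; accepts an MD5, SHA-1, or SHA-256 hash.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{}"


def summarize_report(report: dict) -> dict:
    """Pick the detection count and suggested threat label out of a file report."""
    attrs = report.get("data", {}).get("attributes", {})
    stats = attrs.get("last_analysis_stats", {})
    label = attrs.get("popular_threat_classification", {}).get("suggested_threat_label")
    return {"malicious": stats.get("malicious", 0), "label": label}


def lookup_hash(file_hash: str, api_key: str) -> dict:
    """Query VirusTotal for a hash (needs network access and a valid API key)."""
    req = urllib.request.Request(
        VT_FILE_URL.format(file_hash), headers={"x-apikey": api_key}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return summarize_report(json.load(resp))
```

Only the hash leaves the analysis machine in this flow, which is usually preferable to uploading the sample itself.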
Based on the VMRay sandbox analysis (Link), the attack was triggered by winword.exe using RPC communication. The malware exploited the Equation Editor vulnerability (CVE-2017-11882), which in turn launched the command line (cmd.exe) to execute the dropped DLL. Following this, rundll32.exe carried out a process injection attack, advancing the assault.
Now let us research the file from the static point of view.
To begin our analysis, we use a script called Oleid. This script examines OLE files and identifies characteristics that may indicate malicious intent, such as the presence of VBA macros and embedded Flash objects. However, Oleid did not indicate any suspicious activity for this file.
We also ran RTFOBJ on the file and discovered that it contains two problematic OLE objects. The first is presented as a log file, but RTFOBJ identifies its true type and indicates that it is a PE file (an executable or DLL) rather than a log file. The second object is an OLE Equation 3.0 object, which is still common in today’s attacks: although the vulnerability was initially discovered in 2017, malware authors still use it heavily with different evasive permutations.
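To give a feel for where a class name such as “Equation.3” comes from, here is a simplified Python sketch that decodes the hex data of an embedded object and reads the OLE 1.0 object header (version, format ID, and three length-prefixed strings, followed by the native data for embedded objects). This is our own illustration under the assumption of a clean, unobfuscated hex stream; it is not RTFOBJ’s implementation, and the function names are hypothetical.

```python
import struct


def read_lp_string(buf: bytes, pos: int):
    """Read a length-prefixed, null-terminated ANSI string; return (text, next_pos)."""
    (length,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    raw = buf[pos:pos + length]
    return raw.rstrip(b"\x00").decode("latin-1"), pos + length


def parse_ole1_header(hex_objdata: str) -> dict:
    """Parse the OLE 1.0 header found in the hex payload of an RTF object."""
    # RTF writers may break the hex stream with whitespace; remove it first.
    buf = bytes.fromhex("".join(hex_objdata.split()))
    _ole_version, format_id = struct.unpack_from("<II", buf, 0)
    class_name, pos = read_lp_string(buf, 8)
    _topic_name, pos = read_lp_string(buf, pos)
    _item_name, pos = read_lp_string(buf, pos)
    info = {"format_id": format_id, "class_name": class_name}
    if format_id == 2:  # 2 = embedded object, 1 = linked object
        (size,) = struct.unpack_from("<I", buf, pos)
        info["data_size"] = size
        info["data"] = buf[pos + 4:pos + 4 + size]
    return info
```

Real samples often break these assumptions on purpose (odd hex digits, nested control words, fake headers), which is exactly why a dedicated parser such as RTFOBJ is the better tool for actual investigations.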
To examine the first object, we run RTFOBJ with the parameters “-s 0”, which saves object 0 to a path reported by RTFOBJ. The next step is to run the Linux file command: file <file_path>. This reveals the true file format, which in this case is a PE32+ (64-bit) DLL. We can also generate the MD5 hash of the dropped file by running md5sum <file_path> (the command is named md5 on macOS and BSD). The resulting MD5 hash for this file is: 998c79456d9782eb1a03140e04f36d46.
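If the file and MD5 command-line tools are not at hand, the same two checks can be approximated in Python. The sketch below reads only the PE header fields needed to tell PE32 from PE32+ and an executable from a DLL; the function names are our own, and the check is far less thorough than the real file command.

```python
import hashlib
import struct

IMAGE_FILE_DLL = 0x2000  # COFF Characteristics flag that marks a DLL


def describe_pe(data: bytes) -> str:
    """Give a short, file-like description of a PE image from its headers."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not a PE file"
    # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\x00\x00" or len(data) < pe_off + 26:
        return "not a PE file"
    # Characteristics is the last field of the 20-byte COFF header;
    # the optional header, which starts with its magic value, follows it.
    (characteristics,) = struct.unpack_from("<H", data, pe_off + 22)
    (magic,) = struct.unpack_from("<H", data, pe_off + 24)
    fmt = {0x10B: "PE32", 0x20B: "PE32+"}.get(magic, "PE (unknown optional header)")
    kind = "DLL" if characteristics & IMAGE_FILE_DLL else "executable"
    return f"{fmt} {kind}"


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a hex string."""
    return hashlib.md5(data).hexdigest()
```

Both functions take the raw bytes of the dropped file, for example open(dropped_path, "rb").read(), where dropped_path stands for the path that RTFOBJ reported.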
Now let us search for the MD5 hash in VirusTotal:
We can observe that the DLL file is detected by 42 engines, and it is classified as trojan.noon/formbook malware.
In this installment, we explored RTFOBJ as a research tool, an alternative to the rtfdump.py we discussed in our last post. However, RTF malware research can be complicated. Malware creators frequently employ evasion techniques that sidestep basic static analysis and file parsing. These techniques often go unnoticed by the RTF reader during its operation. To identify such deceptions, we recommend using Yara signatures, as we have highlighted before.
Content Disarm and Reconstruction (CDR) technology emerges as a robust countermeasure. CDR strips away potential attack vectors, regardless of whether they are known to be malicious.
Stay tuned: Our upcoming post will further explore RTF file format threats and demonstrate how CDR acts as a formidable line of defense.
Ran Dubin, “Content Disarm and Reconstruction of RTF Files a Zero File Trust Methodology,” in IEEE Transactions on Information Forensics and Security, vol. 18, pp. 1461-1472, 2023, doi: 10.1109/TIFS.2023.3241480.
OleTools, https://github.com/decalage2/oletools
Yara, the pattern matching Swiss knife for malware researchers, https://virustotal.github.io/yara/
Ditekshen, Yara signatures, https://github.com/ditekshen/detection
Didier Stevens, rtfdump, https://github.com/DidierStevens/DidierStevensSuite/blob/master/rtfdump.py
CVE-2017-11882, https://cve.mitre.org/cgi-bin/cvename.cgi?name=cve-2017-11882
Decalage, RTF evasion tricks, https://decalage.info/rtf_tricks
Inquest, Inquest Yara rules for Virus Total, https://github.com/InQuest/yara-rules-vt