The Beginner's Guide – Preventing the Invisible Malware: What Is Steganalysis and How CDR Can Improve Our Security (Part-2)
By BUFFERZONE Team, 31/07/2023
Target: Consumers
Tags: Content Disarm and Reconstruction (CDR), Malware, Images, Steganography, Zero-Trust
In our initial blog post, we explored steganography, the technique malware authors use to hide malicious code within images. In this post, we focus on steganalysis, the set of advanced detection techniques designed to uncover such hidden content. We examine the limitations of these tools and explore how the zero-trust approach of Content Disarm and Reconstruction (CDR) prevention can address them. Our upcoming blog (Part-3) will provide insights into reverse engineering evasive malware discovered within images.
Steganalysis refers to the field of study and techniques used to detect the presence of hidden information within digital media, such as images, audio files, or videos, that has been concealed through steganography. Steganography involves the covert embedding of data within a carrier medium, making it imperceptible to casual observers. Steganalysis aims to uncover and analyze the hidden data, identify the steganographic algorithms or methods used, and determine if a given media file contains hidden information. It involves the application of statistical analysis, signal processing, machine learning, and other computational methods to reveal the presence of steganography and distinguish between innocent media and steganographic content. Steganalysis plays a crucial role in digital forensics, security, and counterintelligence, providing means to detect covert communication and potential malicious activities.
Steganalysis Methods
Muralidharan et al. [1] provide a detailed survey of state-of-the-art image steganalysis. We can divide steganalysis methods into three categories:
- Statistical Analysis: Statistical analysis is a fundamental approach in steganalysis. It involves analyzing the statistical properties of images to detect hidden information. Common techniques include histogram analysis, spatial domain analysis, and frequency domain analysis [4].
- Machine Learning-Based Methods: With the advent of machine learning algorithms, steganalysis has witnessed significant advancements. Various machine learning models, such as support vector machines (SVM), artificial neural networks (ANN), and deep learning architectures, have been applied to steganalysis tasks. These models learn from a vast amount of data and can detect subtle patterns indicative of steganography.
- Rich Model Features: Steganalysis methods can leverage rich model features to enhance detection accuracy. These features encompass higher-level image characteristics, such as texture, color, and spatial relationships. By extracting and analyzing these features, steganalysis algorithms can effectively distinguish between regular and steganographic images.
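To make the statistical category more concrete, here is a minimal sketch of one classic histogram test: a chi-square statistic over pairs of adjacent pixel values. It is an illustration under stated assumptions, not the method of any specific tool surveyed in [1]. The cover data is a hypothetical toy signal that is deliberately biased toward even values, and the helper names (`pov_chi_square`, `toy_cover`, `embed_lsb`) are made up for this example. Real images have far messier histograms, so real detectors are considerably more involved.

```python
import random
from collections import Counter


def pov_chi_square(pixels):
    """Chi-square statistic over pairs of values (2k, 2k+1).

    Full LSB replacement tends to equalize the two counts in each pair,
    so a LOW statistic hints that the LSB plane may have been overwritten.
    """
    counts = Counter(pixels)
    stat = 0.0
    for k in range(128):
        even, odd = counts[2 * k], counts[2 * k + 1]
        expected = (even + odd) / 2
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat


def toy_cover(n, rng):
    """Hypothetical cover: 8-bit values deliberately biased toward even numbers."""
    pixels = []
    for _ in range(n):
        value = min(255, max(0, int(rng.gauss(120, 30))))
        if rng.random() < 0.6:
            value &= ~1  # clear the LSB to create an uneven histogram
        pixels.append(value)
    return pixels


def embed_lsb(pixels, bits):
    """Replace the LSB of the first len(bits) pixels with the payload bits."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out


rng = random.Random(0)
cover = toy_cover(20000, rng)
payload = [rng.getrandbits(1) for _ in cover]  # random bits, one per pixel
stego = embed_lsb(cover, payload)

cover_stat = pov_chi_square(cover)
stego_stat = pov_chi_square(stego)
print(f"cover statistic: {cover_stat:.1f}")
print(f"stego statistic: {stego_stat:.1f}")
```

In this toy setup the statistic drops sharply once every least significant bit carries payload, which is the kind of signal a histogram-based detector looks for.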
However, steganalysis is far from perfect, and the following limitations exist:
- Single Dataset Limitation: Many steganalysis methods are created, tested, and utilized only on a single dataset [1]. This can lead to a lack of versatility, potentially limiting the effectiveness of these methods when faced with different datasets. The methods might fail to generalize well across diverse scenarios and image collections, which may affect their real-world applicability.
- Specificity of Targeted Steganography Schemes: The paper [1] points out that many steganalysis methods seem to target only specific steganography schemes. This means that while they might be effective in detecting and analyzing certain steganographic methods, they might be inefficient or entirely ineffective against others. This narrow focus might limit the overall effectiveness of such steganalysis methods.
- Difficulty with Advanced Steganography Methods: The paper [1] highlights that some steganography techniques, such as coverless and Generative Adversarial Networks (GAN) based steganography, are not adequately countered by current steganalysis methods. These more advanced methods present a significant challenge for steganalysis, indicating that the field may struggle to keep pace with the evolution of steganography techniques.
- High Embedding Rates: Steganography techniques that employ a high embedding rate can pose challenges for steganalysis. When a large amount of data is hidden within an image, it becomes more difficult to detect the presence of hidden information. Steganalysis algorithms may struggle to differentiate between legitimate image noise and the embedded data, especially if the original image is unknown.
- Adaptive Steganography: Adaptive steganography techniques dynamically adjust the embedding process based on specific image characteristics. These methods can evade traditional steganalysis methods by exploiting vulnerabilities in the detection algorithms. As a result, detecting adaptive steganography becomes a daunting task for steganalysis systems.
- Low-Bit Attacks: Attackers employing low-bit steganography techniques embed a minimal amount of data into the cover image. This method aims to stay below the detection threshold of steganalysis algorithms, making the hidden information less noticeable. Steganalysis methods optimized for higher embedding rates may fail to detect such subtle alterations, rendering them ineffective against low-bit attacks.
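The low-bit limitation above can be illustrated with a small sketch. It assumes plain LSB replacement on a hypothetical cover of uniformly random 8-bit values, and the helper names are invented for this example. It only counts how many pixels actually change at different embedding rates. It does not model any particular detector.

```python
import random


def embed_lsb(pixels, bits, positions):
    """Write each payload bit into the LSB of the pixel at the given position."""
    out = list(pixels)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & ~1) | bit
    return out


def changed_fraction(cover, stego):
    """Fraction of pixels whose value differs between cover and stego."""
    return sum(a != b for a, b in zip(cover, stego)) / len(cover)


rng = random.Random(1)
cover = [rng.randrange(256) for _ in range(10000)]  # hypothetical cover data

results = {}
for rate in (1.0, 0.1, 0.01):  # payload bits per pixel
    n_bits = int(len(cover) * rate)
    positions = rng.sample(range(len(cover)), n_bits)  # spread the payload out
    bits = [rng.getrandbits(1) for _ in range(n_bits)]
    stego = embed_lsb(cover, bits, positions)
    results[rate] = changed_fraction(cover, stego)
    print(f"rate={rate:<5} changed pixels={results[rate]:.4f}")
```

Because a payload bit only alters a pixel when it differs from the existing least significant bit, the fraction of changed pixels can never exceed the embedding rate. At a rate of 0.01 a detector has at most one percent of the pixels to work with, and each of those differs from the original by a single intensity level.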
Steganalysis is a trust-based detection solution, and given the drawbacks above, evasive malware can bypass it. As a result, zero-trust prevention based on CDR is needed.
How Does Image Content Disarm and Reconstruction Work?
A recent study [2] examines an alternative approach for neutralizing steganography and malware attacks within images. Our method is similar and also relies on transcoding.
Image transcoding involves converting an image file from one format to another, which may entail modifying the resolution, color depth, and format of the image data. In the broader context of digital media, transcoding refers to the direct conversion of encoding between different formats [2].
Transcoding is typically performed when the target device lacks support for the original image format or has limited storage capacity, necessitating a reduction in file size [2]. For instance, a high-resolution JPEG file might be transcoded into a lower resolution PNG file for improved website loading speed due to its smaller file size.
The process of image transcoding consists of two steps. Initially, the original data is decoded into an intermediate uncompressed format, after which it is encoded into the desired target format. This transcoding process can be either lossy or lossless. In lossy transcoding, certain information is lost during the conversion, resulting in a potential degradation of image quality. This method is commonly employed when the target device has limited storage capacity. Conversely, lossless transcoding retains all information and preserves image quality [2]. Typically, scaling modifications are utilized during transcoding.
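The two steps described above can be sketched with the Python standard library alone. The formats are an assumption made for illustration: the source is a binary PPM (P6) and the target is a minimal PNG, neither of which is prescribed by [2]. The function names are hypothetical. This sketch is lossless and skips scaling, so it only shows the decode-then-encode structure.

```python
import struct
import zlib


def decode_ppm(data):
    """Step 1: decode an 8-bit binary PPM (P6) into width, height, raw RGB bytes."""
    tokens, pos = [], 0
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():        # skip whitespace
            pos += 1
        if data[pos:pos + 1] == b"#":             # header comment: skip the line
            pos = data.index(b"\n", pos) + 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    magic, width, height, maxval = tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6" or maxval != 255:
        raise ValueError("this sketch only handles 8-bit binary PPM")
    pos += 1                                      # one whitespace byte precedes the raster
    raster = data[pos:pos + width * height * 3]   # anything after the raster is ignored
    if len(raster) != width * height * 3:
        raise ValueError("truncated pixel data")
    return width, height, raster


def png_chunk(kind, payload):
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    body = kind + payload
    return struct.pack(">I", len(payload)) + body + struct.pack(">I", zlib.crc32(body))


def encode_png(width, height, rgb):
    """Step 2: encode raw RGB bytes as a minimal truecolor PNG."""
    stride = width * 3
    # Each scanline is prefixed with filter type 0 (no filtering).
    scanlines = b"".join(
        b"\x00" + rgb[y * stride:(y + 1) * stride] for y in range(height)
    )
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(scanlines))
        + png_chunk(b"IEND", b"")
    )


# A 2x2 image with a header comment and bytes appended after the pixel data.
pixels = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255])
source = b"P6\n# author: someone\n2 2\n255\n" + pixels + b"TRAILING-PAYLOAD"

width, height, rgb = decode_ppm(source)
png = encode_png(width, height, rgb)
print(len(source), "bytes in,", len(png), "bytes out")
```

Only the decoded pixel grid crosses from the first step to the second, so the header comment and the appended bytes in this example never reach the output file.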
It is important to note that transcoding differs from compression and trans-muxing/rewrapping. Compression involves reducing file size without altering the format, while trans-muxing/rewrapping changes the container format while keeping the data intact [2].
In summary, image transcoding plays a vital role in modern digital workflows by facilitating the conversion of images to the most suitable format for their intended use. It enables consistent viewing of image content across a diverse range of devices with varying capabilities and constraints [2].
Image Content Disarm and Reconstruction (CDR) employs transcoding and scaling techniques to fortify image files against evasive steganography and concealed metadata. This approach generates a new image file in a different format, devoid of metadata and extraneous information. The transcoded file can later be converted back to the original format. Transcoding has been shown to be an effective measure against malware attacks [2].
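As a rough illustration of why the scaling step matters, the sketch below embeds a payload in the least significant bits of a toy grayscale image, then rescales the image and tries to read the payload back. Everything here is an assumption made for the example: the gradient cover, the 2x box-average downscale followed by pixel repetition, and the helper names. The pipeline in [2] and production CDR tools may use different filters and formats.

```python
import random


def embed_lsb(image, bits):
    """Replace the LSB of every pixel with the next payload bit."""
    it = iter(bits)
    return [[(p & ~1) | next(it) for p in row] for row in image]


def extract_lsb(image):
    """Read back the LSB of every pixel in row-major order."""
    return [p & 1 for row in image for p in row]


def rescale(image, factor=2):
    """Downscale by box averaging, then upscale by repeating each averaged pixel."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, factor):
        for bx in range(0, w, factor):
            ys = range(by, min(by + factor, h))
            xs = range(bx, min(bx + factor, w))
            block = [image[y][x] for y in ys for x in xs]
            avg = sum(block) // len(block)
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out


rng = random.Random(7)
size = 64
# Hypothetical cover: a smooth diagonal gradient with values from 0 to 252.
cover = [[(x + y) * 2 for x in range(size)] for y in range(size)]
payload = [rng.getrandbits(1) for _ in range(size * size)]

stego = embed_lsb(cover, payload)
disarmed = rescale(stego)

recovered = extract_lsb(disarmed)
match = sum(a == b for a, b in zip(payload, recovered)) / len(payload)
max_diff = max(
    abs(a - b) for row_a, row_b in zip(cover, disarmed) for a, b in zip(row_a, row_b)
)
print(f"payload bits recovered after rescaling: {match:.1%}")
print(f"largest pixel change versus the cover: {max_diff}")
```

Before rescaling, the payload reads back exactly. Afterwards, the recovered bits no longer reproduce it, while each pixel in this smooth toy image moves by only a few intensity levels. The approach never needs to decide whether a payload is present, which is what makes it a prevention step instead of a detection step.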
Summary
Steganography attacks are becoming more prevalent [3], and current detection methods have notable limitations. Content Disarm and Reconstruction (CDR) therefore emerges as a dependable solution that neutralizes hidden content without substantial visual alterations [1]. Integrating CDR into your security infrastructure merits consideration.
The images below show a picture before and after CDR is applied, with the "before" image on the right and the "after" image on the left. No visual differences are perceptible to the naked eye.
To summarize, a zero-trust, prevention-based approach to files is efficient and effective at countering elusive threats that traditional detection methods may overlook.
In the next blog post, we will reverse engineer malicious images.
References
[1] Muralidharan, T., Cohen, A., Cohen, A., & Nissim, N. (2022). The infinite race between steganography and steganalysis in images. Signal Processing, 108711.
[2] Belkind, E., Dubin, R., & Dvir, A. (2023). Open Image Content Disarm and Reconstruction. https://arxiv.org/abs/2307.14057
[3] Security Boulevard. Steganography in Cybersecurity: A Growing Attack Vector. https://securityboulevard.com/2022/05/steganography-in-cybersecurity-a-growing-attack-vector/
[4] Muralidharan, T., et al. (2022). The infinite race between steganography and steganalysis in images. Signal Processing, 108711.