Extracting ADS using Linux

This post covers a feature of the NTFS file system known as the Alternate Data Stream (ADS), focusing on how to identify and forensically extract these data streams from an NTFS partition using a Linux host.


Alternate Data Streams (ADS)

Despite being discussed fairly regularly in the forensic community, Alternate Data Streams (ADS) are not a well-known feature of the NTFS file system, nor are they well documented officially. ADS were first implemented in Windows NT 3.1 to allow compatibility with the Hierarchical File System (HFS) used by Macintosh systems at the time. The reason lies in how HFS stores data using two main components: a data fork and a resource fork. On HFS, the data fork held the actual data and the resource fork was used by the host operating system to interpret that data. This functionality is loosely analogous to file extensions on Windows systems.

The Alternate Data Stream on the Windows NTFS file system was designed to play the part of the HFS resource fork, providing Windows with a way to interpret the data on HFS volumes. Bear in mind that every file in NTFS consists of at least one visible data stream, usually referred to as $DATA or the ‘unnamed data stream’. An ADS is simply another data stream attached to a given file, which is hidden from the user and even from programs such as Windows Explorer. Some files on Windows systems commonly incorporate multiple data streams, for example to hold metadata. A Microsoft Word document will often contain metadata (author, word count, page number, etc.) that is attached to the document via an ADS. It is easiest to think of ADS as hidden files that are ‘attached’ to visible ones.


Abusing Alternate Data Streams

The ability to add another data stream to any file on NTFS, which is not only hidden from the user but also difficult for security programs to detect, carries a high potential for abuse. Penetration testers sometimes use ADS to bypass expected behaviour in applications which do not account for input that includes an Alternate Data Stream. A historical example is CVE-1999-0278, in which remote attackers could obtain the source code of ASP files served by Internet Information Services (IIS) simply by appending the string “::$DATA” to the URL.

Alternate Data Streams on NTFS also exhibit unique qualities which make them a desirable target for attackers wishing to abuse them:

  • No attributes
  • Unreported size
  • No size limitations
  • Multiple streams per file
  • Can affect directories and drives
  • Difficult to unintentionally access
  • Often go undetected by security programs

A common method attackers employ to abuse Alternate Data Streams is to add an executable payload to any file on the NTFS file system, which can then be executed with specific commands without the user's knowledge. Removing a malicious ADS is as simple as deleting the file it is attached to. However, should the ADS be attached to a critical system object, such as the root directory of the file system, it would be much harder to remove reliably.


Forensic Relevance of ADS

From a forensic perspective, NTFS Alternate Data Streams have severe implications for Anti-Forensics, as an attacker could potentially hide multiple incriminating files or malicious payloads in hidden data streams on other files. However, these Alternate Data Streams are not hidden from or exempt from forensic analysis programs, as will be demonstrated in later sections. Additionally, there are applications available to forensic examiners that are purposely designed to detect hidden data streams on an NTFS system.

These tools are mostly Windows-based, and some of them must be run against a live file system, which may not be considered forensically sound. This small observation formed the basis for the experimentation conducted in the later sections of this post, which demonstrates that Linux tools are capable of detecting ADS on an NTFS image file.


Assumption and Methodology

After researching the role of data streams on NTFS, I was naturally curious as to whether I could detect and forensically extract Alternate Data Streams using Linux commands. The assumption during these tests is that a forensic examiner has obtained a Windows NTFS image file and they want to run commands from a Linux-based host to ensure that nothing is hidden via suspicious data streams.

With this in mind, I broke the experiment down into the steps it would need to take, which I formed into a simple methodology as follows:

  • Create a Windows 10 Virtual Machine in VMware
  • Create some superfluous ADS using the Command Prompt
  • Convert the VMDK image to RAW format
  • Forensically mount the image on Linux
  • Identify ADS using relevant commands
  • Extract any suspicious ADS from the disk image

Experimentation and Results

Firstly, I created a clean Windows 10 (Version 1803) virtual machine with VMware Workstation 12 and used a single 40GB disk image file during the installation process. The next step was to create some example Alternate Data Streams using the Windows Command Prompt (cmd.exe), as shown in Figure 1 below:

Figure 1 — Creating Alternate Data Streams

The first command shown above creates a simple text ADS (hidden.txt), whose content is redirected from an echo command containing the string “First Alternate Data Stream”:

echo "First Alternate Data Stream" > test.txt:hidden.txt

Note that the host file (test.txt) does not need to exist beforehand; the result will be an empty text file with a hidden ADS (hidden.txt) attached to it. The second command shown in Figure 1 demonstrates how an executable file could be attached as an Alternate Data Stream. In this case, the ‘type’ command (similar to Linux ‘cat’) is used to output the contents of the default Notepad executable, which is then redirected to an ADS (hidden.exe) attached to another empty file (important.docx).

type C:\Windows\System32\notepad.exe > important.docx:hidden.exe

Running a ‘dir’ command on the directory containing the superfluous ADS files will only show the contents and basic properties (file size) of the empty files they are attached to, with no indication that Alternate Data Streams exist.

It is worth pointing out at this stage of the testing that, as long as your Windows system has PowerShell version 3.0 or later, you can read from and write to Alternate Data Streams using PowerShell commands. You can use the ‘Get-ChildItem’ (gci) PowerShell command to recursively check a directory for Alternate Data Streams, as shown in Figure 2 below:

Figure 2 — PowerShell Recursively List Streams

The command shown above recursively lists the data streams associated with files in a given directory and, through use of the ‘where’ command, shows only those data streams other than the default $DATA:

gci -recurse | % { gi $_.FullName -stream * } | where stream -ne ':$DATA'

Should a suspicious ADS be identified using this command, another PowerShell command can be issued to read the data it contains. This command is ‘Get-Content’ and can be utilised as follows:

Figure 3 — Read Contents of ADS

As Figure 3 above shows, simply supply the -Path parameter with the original file path and the -Stream parameter with the name of the ADS. In the example used in this test, the content of the ADS was simply the text created by the echo command in Figure 1.

With the ADS in place, the next step is to forensically analyse the contents of the virtual machine disk image using a Linux host. Because I want the disk image file to be compatible with commands provided by the Sleuth Kit, the VMDK image first needs to be converted to the RAW image format. This can be achieved using the ‘qemu-img’ command, typically provided by installing QEMU for Linux systems:

Figure 4 — QEMU Image Conversion

As shown in Figure 4, you can convert any VMware disk image file (VMDK) to RAW format by following this syntax for ‘qemu-img’:

qemu-img convert -O raw <INPUT>.vmdk <OUTPUT>.raw

The duration of this conversion process will depend on how large the input VMDK file is, but it should result in a RAW image of the Windows virtual machine which can now be interrogated with Linux forensic tools.

The next step in the testing process is to attach the newly created RAW Windows disk image to a loop device, create the relevant partition mappings and then forensically mount the desired partition to a directory. This process is demonstrated below in Figure 5:

Figure 5 — Create Loop Device and Mount Partition

The partition mapping can be mounted using the normal Linux ‘mount’ command; however, I decided to invoke the NTFS driver for Linux directly because it let me read information more reliably from any Alternate Data Streams that may be present on the image. The ‘ntfs-3g’ driver can be easily installed on most mainstream distributions of Linux and provides an option to expose all of the data streams associated with each file.

In Figure 5, the RAW disk image is first queried with the ‘mmls’ command from the Sleuth Kit forensic toolkit to look at the partition layout of the virtual machine. The ‘kpartx’ tool then automatically creates a loop device for the disk image and dynamically adds partition mappings for the detected NTFS volumes. The desired partition is the one associated with the block device ‘/dev/mapper/loop0p2’, as the first mapping is simply the Windows System Reserved partition. The next command shown mounts the desired partition to a directory on the Linux system using the NTFS driver:

ntfsmount -o ro,streams_interface=windows /dev/mapper/loop0p2 /mnt/Analysis/Windows

The (-o) parameter of this command specifies two things: the partition is to be mounted read-only (ro), and named NTFS data streams are to remain readable (streams_interface=windows). The last two arguments of this command simply specify the partition mapping containing the user data and where on the Linux system it is to be mounted. As shown in Figure 5, the user data can now be accessed from the specified mount point.
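
For anyone who prefers text to screenshots, the partition layout step can be sketched roughly as follows. This is only an illustration under a few assumptions: the image name `windows10.raw` and the helper `ntfs_partitions` are hypothetical, and the parsing assumes the usual `mmls` table for a DOS/MBR disk with 512-byte sectors. The helper prints the slot, start sector and byte offset of each NTFS row, which is useful if you would rather work from an offset than from the `kpartx` mappings.

```shell
#!/bin/sh
# Sketch only: list NTFS partitions from `mmls` output read on stdin.
# Assumes the default mmls table layout and 512-byte sectors.
ntfs_partitions() {
    awk '
        /NTFS/ {
            # $1 = slot index (e.g. "002:"), $3 = zero-padded start sector
            slot = $1
            sub(/:$/, "", slot)
            printf "%s %.0f %.0f\n", slot, $3, $3 * 512
        }
    '
}

# Typical use against a real image (the mapping steps need root):
#   mmls windows10.raw | ntfs_partitions
#   kpartx -av windows10.raw    # adds /dev/mapper/loopNpM mappings
#   kpartx -d windows10.raw     # removes them again after unmounting
```

Note the difference in units: Sleuth Kit tools take their `-o` offset in sectors, whereas a loop mount takes `offset=` in bytes, which is why the helper prints both.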

With the virtual machine converted and successfully mounted, the next phase is to identify any Alternate Data Streams that may be present on the NTFS partition. However, even browsing on the command line to the directory where the superfluous ADS were created will not display the Alternate Data Streams.
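
Although a directory listing hides them, a stream whose name is already known should still be readable directly. With `streams_interface=windows`, the ntfs-3g manual describes addressing a stream as `file:stream`, just as on Windows. The sketch below wraps that in a small helper; the name `copy_stream` and the paths in the usage comment are hypothetical, and the SHA-256 line is simply there so the copy can be referenced in case notes.

```shell
#!/bin/sh
# Sketch only: copy a named stream out of a mount that uses
# streams_interface=windows, where streams are addressed as "file:stream".
copy_stream() {
    host_file=$1    # path to the visible file on the mounted volume
    stream=$2       # ADS name, e.g. hidden.txt
    out_file=$3     # destination on the analysis host

    cat -- "${host_file}:${stream}" > "$out_file" || return 1
    sha256sum -- "$out_file"
}

# Example (paths are placeholders):
#   copy_stream /mnt/Analysis/Windows/<PATH>/test.txt hidden.txt ./hidden.txt
```

This only helps once the stream name is known, so it complements rather than replaces the enumeration step.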

Interestingly, the Alternate Data Streams can be viewed by reading the extended attributes of the mounted files using a Linux command called ‘getfattr’. This command is incorporated into most Linux distributions by default under the package ‘attr’. Following the documentation for the NTFS driver, the ADS can be read by specifying the following extended attribute in ‘getfattr’:

ntfs.streams.list

Additionally, the ‘getfattr’ command can be used recursively (-R) to search the entire NTFS partition for Alternate Data Streams and enumerate them, as shown in Figure 6 below:

Figure 6 — Identify ADS using Extended Attributes

The output of the extended attributes displays the superfluous Alternate Data Streams created at the start of the testing, along with two others I intentionally created, one of which I attached to the ‘TM4n6’ directory itself. In addition, other ADS were identified in the form of Zone Identifiers created by Microsoft Edge, which is expected behaviour and not indicative of anything malicious. Remember that many applications on Windows utilise ADS and their mere presence should not immediately arouse suspicion.
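
For anyone who prefers text to screenshots, the recursive query and a way of tidying its output might look roughly like this. The `getfattr` flags used (`-R` to recurse, `-n` to request a single named attribute) are standard. The helper `ads_from_getfattr` is hypothetical, and it assumes that multiple stream names within one `ntfs.streams.list` value are separated by NUL bytes, which `getfattr` prints as `\000` in its default text encoding.

```shell
#!/bin/sh
# Sketch only: turn `getfattr -n ntfs.streams.list` output (on stdin)
# into one "path:stream" line per Alternate Data Stream.
ads_from_getfattr() {
    awk '
        # Each record starts with a "# file: <path>" header line.
        /^# file: / { file = substr($0, 9); next }

        /^ntfs\.streams\.list="/ {
            value = $0
            sub(/^ntfs\.streams\.list="/, "", value)
            sub(/"$/, "", value)
            # Assumption: stream names are separated by an escaped NUL.
            n = split(value, names, /\\000/)
            for (i = 1; i <= n; i++)
                if (names[i] != "") print file ":" names[i]
        }
    '
}

# Typical use (errors for files without the attribute go to stderr):
#   getfattr -R -n ntfs.streams.list /mnt/Analysis/Windows 2>/dev/null \
#       | ads_from_getfattr
```

Appending something like `grep -v ':Zone\.Identifier$'` to the pipeline would hide the expected Zone Identifier streams and leave only the ones worth a closer look.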

With the Alternate Data Streams on this particular NTFS partition identified through the extended attributes, the command-set incorporated into the Sleuth Kit can be utilised to drill down on them. This phase of the testing will focus on how to extract information from the Alternate Data Streams previously found in the extended attributes.

Figure 7 — Sleuth Kit Commands to Extract ADS

In Figure 7 above, the ‘fls’ command was used to gather more information about the files previously located through the extended attributes. Interestingly, this Sleuth Kit command automatically lists the superfluous Alternate Data Streams and provides each one's own metadata address (the MFT entry number followed by the attribute type and ID), as highlighted in the output of the command. With this information, the contents of the ADS can be extracted using another Sleuth Kit command called ‘icat’. As Figure 7 shows, the content of hidden.txt was successfully extracted, and the output can be redirected to another file on the host system for further analysis if necessary.
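
The same two steps might be scripted roughly as follows. This sketch assumes `fls` is run with `-r` (recurse) and `-p` (full paths) and prints each entry as a type, a metadata address and a tab-separated path, so that named streams show up as `path:stream` with their own `entry-128-id` address. The helper name `ads_from_fls` is hypothetical and the address in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch only: pick named $DATA streams out of `fls -r -p` output (stdin).
# Prints "<metadata address><TAB><path:stream>" for each match.
ads_from_fls() {
    awk -F '\t' '
        # Field 1 is "<type> [*] <address>:" and field 2 is the path.
        # Attribute type 128 is $DATA; a colon in the path marks a named stream.
        $2 ~ /:/ && $1 ~ /-128-/ {
            n = split($1, parts, " ")
            addr = parts[n]
            sub(/:$/, "", addr)
            print addr "\t" $2
        }
    '
}

# Typical use against the partition mapping from earlier:
#   fls -r -p /dev/mapper/loop0p2 | ads_from_fls
#   icat /dev/mapper/loop0p2 <ENTRY>-128-<ID> > hidden.txt
```

Expect a few NTFS system files such as `$BadClus:$Bad` and `$Secure:$SDS` in the results; these named streams are a normal part of the file system rather than anything suspicious.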


Experimentation Failures and Changes

I thought it would be beneficial to also include a few tests that I ran during this experimentation which either did not work as intended or outright failed, just in case anyone can learn from them. Remember that the sole purpose of this testing was to determine whether or not I could reliably extract information from NTFS Alternate Data Streams using nothing but Linux commands against a Windows disk image.

Originally, before I discovered that the extended attributes of the mounted NTFS partition could be read using ‘getfattr’, I was going to use PowerShell commands. In 2016, Microsoft released PowerShell as open source and it has been available ever since on their GitHub repository. Interestingly, this repository also contains PowerShell builds for Linux distributions such as Arch, Debian, Ubuntu, CentOS, RHEL and Fedora. Installing the RPM package on my RedHat-based distribution was easy and I could invoke a PowerShell prompt using the command ‘pwsh’. However, the PowerShell commands seen in Figures 2 and 3 would not work as intended, as the relevant commands (Get-Item and Get-Content) lacked the ability to interact with data streams. I am not certain precisely why this is, but it is unlikely to be a version problem, since the open-source releases for Linux are all newer than version 3.0. The more likely explanation is that the ‘-Stream’ parameter is only available on Windows, as Microsoft's documentation for these cmdlets states.

During the testing phase where I initially created the ADS, I ran known PowerShell commands to detect them on the live Windows system (shown in Figures 2 and 3). However, these commands did not seem to cover directories, as they failed to identify an ADS I had created on the user folder. This proved not to be a major issue in the end, as the extended attributes on Linux included directories in the output.

I am well aware that mounting the NTFS partition normally using the Linux ‘mount’ command would work just as well as directly specifying the NTFS driver for Linux. There is no harm in using the typical ‘mount’ command, as the extended attributes showing the ADS will still be present regardless of whether the driver is invoked explicitly or not. However, I chose to explicitly use the Linux NTFS driver for a number of reasons: I could specify the attribute I wanted in the output (streams), I received fewer errors when searching for ADS attached to directories and, finally, large ADS files report an error if you try to read their contents without the NTFS driver. Overall, I used the NTFS driver to make the ADS output easier to manage and more reliable.


Concluding Statements

The experimentation conducted as part of this post set out to determine a reliable method for locating and extracting Alternate Data Streams from an NTFS partition using Linux commands. It was found that by forensically mounting the partition to the Linux host and reading the extended attributes of the files it contains, ADS could be identified. Should any of the ADS prove suspicious or warrant further investigation, the experimentation showed that Sleuth Kit commands could extract the data from these files.

Of course, the overall forensic relevance of ADS can be debated; however, I think they are a very important aspect of Windows forensics due to their potential for Anti-Forensics. Additionally, NTFS Alternate Data Streams have provided other useful artifacts to forensic examiners in the form of Zone Identifiers. Finally, the history ADS have in the field of information security means they could be a useful source of information to those in Incident Response.

Thank you for reading and I hope this post has taught you something new!

If you have any questions about this post, please send me an email using the form in the ‘Contact’ section at the top of the page. Alternatively, you can send me a message on Twitter.


Resources

While doing extra research for the content covered in this post, I found these online resources to be particularly insightful. I have also added a brief description of each one for convenience:

FlexHex: More information about NTFS data streams
ForensicFocus: A good article dissecting ADS
InfoSec Institute: A useful video covering Alternate Data Streams