
What’s your name? … My how you have changed.

Enable editing in Microsoft word

For several months I have been seeing documents that carry an embedded file and two versions of shellcode stored in the document property values “Company” and “category”.

The VBA code also uses “custom.xml” to supply obfuscated custom properties.

Looking around, the only information I have found so far is this great write-up by HP Threat Research, located here.

According to that blog post, they call the sample SVCReady based on an ET Pro network signature for the C2 traffic.

But what about the rest of the samples using the same type of “loader”? Can we just call those documents SVCReadyLdr or SVCReadyLoader in order not to confuse the traffic with the loader document?

Trying to get a better handle on when this type of loader began showing up, I wrote a YARA rule to scan my personal repository for samples I had already looked at.

rule Find_SVC_Ready_Like_Property_Sheets
{
    meta:
        author = "David Ledbetter"
        source = "https://twitter.com/0xToxin/status/1564289244084011014"
        description = "Rule to find extracted Document property sheets containing SVC Ready like Shellcode."
        created = "2022-08-31"

    strings:
        $Cat1 = "<cp:category>"                    // category property XML name
        $Comp1 = "<Company>"                       // Company property XML name
        $NewNop = "6f6f6f6f6f6f6f6f6f6f6f6f6f6f"   // new-style shellcode start
        $OldNOP = "9090909090909090909090909090"   // old-style shellcode start

    condition:
        ($Cat1 and ($NewNop or $OldNOP)) or ($Comp1 and ($NewNop or $OldNOP))
}

This YARA rule will only match the two property sheets once the document has been decompressed, because the property sheets are stored compressed inside the document's ZIP container.
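The decompression step can also be folded into the scan. The following is a minimal sketch, not the author's tooling: it opens the document as a ZIP archive in memory and applies the same string logic as the rule to each part under “docProps”. The function name `find_suspect_property_sheets` is hypothetical, and the sketch assumes the property sheets live in `docProps/*.xml`, as they do in a standard Office Open XML package.

```python
import io
import re
import zipfile

# Marker strings mirror the YARA rule; the 14-byte sled length is taken from it.
PROPERTY_TAGS = (b"<cp:category>", b"<Company>")
SLED_PATTERN = re.compile(rb"(?:6f){14}|(?:90){14}")


def find_suspect_property_sheets(document_bytes):
    """Return names of docProps/*.xml parts holding a property tag and a hex sled."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        for name in archive.namelist():
            if not (name.startswith("docProps/") and name.endswith(".xml")):
                continue
            content = archive.read(name)
            has_tag = any(tag in content for tag in PROPERTY_TAGS)
            if has_tag and SLED_PATTERN.search(content):
                hits.append(name)
    return hits
```

Passing the raw bytes of a document returns a list of matching part names, or an empty list when nothing matches. A file that is not a valid ZIP archive raises `zipfile.BadZipFile`, which a caller would need to handle.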

File 1:

The very first sample I could find in my repository was from March 22, 2022, posted on Twitter by @pr0xylife and located here. The final malware was identified as IcedID.

Taking a look at the VBA, we see that it reads one of two properties depending on whether it detects a 32-bit or 64-bit system.

The highlighted string shows the call to the reverse string function that will get a value from “ActiveDocument.CustomDocumentProperties” and then reverse the string.
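The same lookup can be reproduced offline against the extracted “docProps/custom.xml”. This is a minimal sketch under the assumption that the value only needs reversing, as described above; the function name `reversed_custom_properties` is hypothetical, and a sample that layers further encoding on top would need more work.

```python
import xml.etree.ElementTree as ET

# Namespace used by docProps/custom.xml in Office Open XML packages.
CUSTOM_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}"


def reversed_custom_properties(custom_xml):
    """Map each custom property name to its value with the characters reversed."""
    root = ET.fromstring(custom_xml)
    result = {}
    for prop in root.iter(CUSTOM_NS + "property"):
        # Collect the text of the property's value element (vt:lpwstr for strings).
        value = "".join(prop.itertext()).strip()
        result[prop.get("name")] = value[::-1]
    return result
```

The function takes the XML text of the part and returns a dictionary, so each reversed value can be inspected by property name.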

Looking at the value that gets called here we see it will download a file.

That downloaded file will end up getting loaded by the “loader” embedded in the shellcode.

We can see here that the shellcode starts with 0x90, a NOP (no operation).

Here we see that the shellcode builds a series of stack strings and API calls to load the loader.

The hash of the extracted loader is SHA256 : AFF07C6F1D01971C21F9FDC55736BCE98A3F98591A077172708BF147AF7DE4EE

So this sample runs the VBA, extracts a URL and the shellcode, then loads the “loader”, which in turn loads the downloaded file.

Downloaded file hash: SHA256 : 98B3471AC865E7CC6CC5712AB0DB76C476FD861828267284A6AA40C802737B2E

The next thing I did was to get the IOC list from the blog post. The blog was posted on 6-6-2022.

The dates here are the first-seen dates on InQuest Labs, not necessarily the dates the files were created. Two files were not found. Notice that the bulk of these files were not seen until after the blog post was published.

File 2:

Let’s take a look at the one with the folder label “IOC-DOC-10”. The ones with labels are the ones I downloaded to look at. For the rest, I verified the “type” by looking at the code on InQuest Labs.

Decompressing this file and looking in the top folder, we see something we don’t normally see: a file named “svc32.dll”. That is not normal for a document.
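Spotting an out-of-place file like this can be automated. The sketch below lists top-level entries in the package that are not on an allow-list. The allow-list and the function name `unexpected_top_level_entries` are assumptions made for illustration; the list covers the names commonly found at the top of a Word package and would need `xl` or `ppt` added for other Office formats.

```python
import io
import zipfile

# Top-level names commonly present in a Word package; this allow-list is an assumption.
EXPECTED_TOP_LEVEL = {"[Content_Types].xml", "_rels", "docProps", "word", "customXml"}


def unexpected_top_level_entries(document_bytes):
    """List top-level files or folders in the package that are not on the allow-list."""
    with zipfile.ZipFile(io.BytesIO(document_bytes)) as archive:
        top_level = {name.split("/", 1)[0] for name in archive.namelist()}
    return sorted(top_level - EXPECTED_TOP_LEVEL)
```

For a document laid out like this sample, the result would include “svc32.dll”.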

Looking at that file in a hex editor, we can see it is an executable and that it also appears to be UPX packed.
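The same two observations can be checked without a hex editor. This is a rough sketch with a hypothetical function name, `describe_embedded_file`: it tests for the “MZ” magic at the start of the file and searches for the section names and marker that UPX normally leaves behind. The markers can be renamed or stripped, so an empty result does not prove the file is unpacked.

```python
def describe_embedded_file(data):
    """Report whether a blob starts with the MZ magic and carries UPX markers."""
    is_pe = data[:2] == b"MZ"
    # UPX normally names its sections UPX0/UPX1 and leaves a "UPX!" marker.
    upx_markers = [m.decode() for m in (b"UPX0", b"UPX1", b"UPX!") if m in data]
    return {"looks_like_pe": is_pe, "upx_markers": upx_markers}
```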

svc32.dll – File Hash SHA256 : D3E69A33913507C80742A2D7A59C889EFE7AA8F52BEEF8D172764E049E03EAD5

Looking at the VBA, we can see that this looks similar to the first one we looked at, where it decodes strings and then calls property values.

Only in this case it is calling the embedded file instead of downloading it from the internet.

It is still using the same type of shellcode in the properties.

It still has the embedded loader in the shellcode. At this point you could just extract the loader part and inspect it with your normal executable tools.

Extracted Loader file hash: SHA256 : 0F539DE61CF492698E8086910117F361E8D35ABD0D32A9037A6CC1BD1280EC73

An interesting artifact I stumbled onto by accident is this. Notice the “TOPS-20” under the Host OS column. I don’t normally view files this way but it is still interesting to me.

File 3:

Let’s take a look at the first one on the list, from June 4, 2022.

This file has the same “TOPS-20”. Could this be used as an indicator? I would have to research it more to be sure.

Looking at the embedded file in this document we see it is now obfuscated in some way.

Looking in the “docProps” folder we see there is no “custom.xml” file that would contain the custom values.

Looking at the VBA code for “ThisDocument”, we can see they didn’t bother with any obfuscation in this version. This is the clearest view we get of how this works.

At this point we discover that the shellcode is obfuscated too, and it no longer has the embedded loader readily available to just pluck out.

File 4:

Skipping ahead to 08-22-2022 we see this sample on Twitter here from @James_inthe_box.

First of all, this file has an extension of “RTF”, but upon closer inspection it is a zipped Word document.
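A mismatch like this is easy to catch by checking the leading bytes rather than trusting the extension. The following is a minimal sketch with a hypothetical function name, `sniff_container`; it only distinguishes the three container types relevant here and returns “unknown” for anything else.

```python
def sniff_container(data):
    """Classify a file by its leading bytes instead of trusting the extension."""
    if data.startswith(b"{\\rtf"):
        return "rtf"
    if data.startswith(b"PK\x03\x04"):
        return "zip (possible OOXML document)"
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "ole2 (legacy Office document)"
    return "unknown"
```

A genuine RTF file begins with `{\rtf`, while a zipped Word document begins with the ZIP local file header `PK\x03\x04`.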

We can see this one also has the TOPS-20 Host OS.

Interestingly enough, we can see that this embedded encoded file also starts with a 0xEA byte.

Something new we see here is that the shellcode now starts with 0x6F instead of 0x90.

So now let’s look at the VBA and see what we have.

We can now see the obfuscation has gone a little overboard.

They have now added a case statement to get the values.

This is the function that converts the shellcode hex string to bytes.

Here we can see several encoded strings and the decoding function.

If you don’t want to step through this in the VBA debugger, or the project is password protected but you can extract the VBA with a tool, one option is to build your own string decoder in Microsoft Word. The function has stayed the same across the samples I looked at, but the keys change.

Drop in three text boxes and a button then it is just a matter of borrowing some of the code and tying it to the button click.

I also made a VB.NET version because the Doc version requires you to use the keyboard to copy and paste the values.

At this point I was stumped by the shellcode. I’m still finding my way around IDA, so I reached out to Twitter. French @notareverser answered the call.

After I sent them this public file, they were able to quickly figure out that the hex string “6f6f6f6f6f6f6f6f6f6f6f6f6f6f6f6f6f6f6f” appeared to be xored with 0xFF.

I can quickly validate that with this xor tool. So now we know we have to xor the shellcode. But I did not see any xor function in the VBA, so what are they really doing?

‘French’ pointed me back to the “255” (0xFF) in the vba code.

String name obfuscation aside, this loops through the hex string (the shellcode), converts it to a byte array, then subtracts each byte from 255 and writes the result to a decoded byte array. For a single byte, 255 minus the value works out the same as an xor with 255 (0xFF).
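The equivalence is easy to confirm. Here is a small sketch of both forms, with hypothetical function names that are not taken from the VBA: `decode_first_layer` follows the subtraction described above, and `xor_ff` performs the xor.

```python
def decode_first_layer(hex_string):
    """Convert the hex text to bytes, then subtract each byte from 255."""
    encoded = bytes.fromhex(hex_string)
    return bytes(255 - b for b in encoded)


def xor_ff(hex_string):
    """The same transformation written as an XOR with 0xFF."""
    return bytes(b ^ 0xFF for b in bytes.fromhex(hex_string))
```

Both functions turn “6f6f6f6f” into four 0x90 bytes, and they agree for every possible byte value because subtracting from 255 flips all eight bits without ever borrowing.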

Now that we have the shellcode first layer decoded let’s take a look at it in IDA Free.

Once French @notareverser explained this to me it was easy to follow along in IDA.

It makes a series of jumps and calculations to work out the offset of the encoded data. In this case it ends up at offset 0x9C.

The decoding function here uses a byte array key and a single byte xor key to decode the encoded payload.

#!/usr/bin/env python3

import sys
import struct

# Decoder by @notareverser

# decode using an xor key buffer as well as a single byte key
def doubleDecode(data, longKey, byteKey):

    odata = bytearray()
    for x in range(len(data)):
        odata.append(data[x] ^ longKey[x%len(longKey)] ^ byteKey)
    return odata

 

with open(sys.argv[1], 'rb') as handle:
    _data = handle.read()

# when decoding the external payload, the start offset is zero
# when decoding from the embedded shellcode, the start offset is 0x9c

payloadStart = 0
#payloadStart = 0x9c

data = _data[payloadStart:]

# the first dword is unused
fugazi, keySize, payloadSize = struct.unpack_from("<LLL", data, 0)
byteKey = 0xaa
startOfKey = 12 # 3 dwords
startOfPayload = startOfKey+keySize
longKey = data[startOfKey:startOfPayload]
encodedData = data[startOfPayload:startOfPayload+payloadSize]
decodedData = doubleDecode(encodedData, longKey, byteKey)

sys.stdout.buffer.write(decodedData)

 

Above is the Python version of the decoder that was sent to me.

Looking back at IDA and at the end of the sub code, we see the data section, and it is at offset 0x9C. We could possibly just use this visual cue to locate the offset where the encoded data starts too.

According to the Python decoder, they are using a 12-byte header in these files.

These 12 bytes are split up into three Little Endian values. Reversing the bytes also gives an idea that in loc74 it is possibly looking for the first value of 0xE8 in the encoded data.

So if we add 12 bytes (the size of the header) to 0x9C, the key array will start at 0xA8.

Adding the key length to the key start, 0xA8 + 0x03FD = 0x4A5, gives the offset where the encoded data starts.

Here we can see that, after decoding the shellcode and dropping it in CyberChef to get the assembly, there are still stack strings at the beginning of the shellcode.
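The offset arithmetic can be written out as a quick check. This sketch uses a hypothetical helper named `layout`; the 12-byte header size, the 0x9C data offset and the 0x03FD key length are the values discussed in the text.

```python
HEADER_SIZE = 12  # three little-endian dwords, per the decoder


def layout(data_offset, key_size):
    """Return (key_start, payload_start) for a blob whose header begins at data_offset."""
    key_start = data_offset + HEADER_SIZE
    payload_start = key_start + key_size
    return key_start, payload_start


key_start, payload_start = layout(0x9C, 0x3FD)
print(hex(key_start), hex(payload_start))  # 0xa8 0x4a5
```

The same helper works for the external payload by passing a data offset of 0.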

Scrolling down further we can see where the decoded loader executable is.

Extracted decoded loader hash: SHA256 : E63A15F138521F52293829224233F5A053321871571D701E0E16698F5D99AE5F

After using the decoder at offset 0x00 with the single-byte key of 0xAA, we have the decoded final malware.

One thing I have not seen before is the strings in these samples. This is something I may need to dig into more deeply later to understand how it works.

Decoded file hash: SHA256 : DA8C22842EA611DDD736E329A952658B2EF6A49085E9F8503EF5A5346E2CCE67

File 5:

The final file I want to show is an Excel file that I stumbled onto while researching this.

Up until this point, all I had seen were Word documents.

The sample can be located here on InQuest Labs.

This is what the file looks like, and it had to have the file extension of xlsm or Office 2010 would not open it.

Looking at the IOCs on InQuest Labs, we see that a lot of indicators were extracted.

The interesting part is that they were found in the “__SRP_0” file.

So was this file modified and reused for malicious purposes?

Looking at the VBA in this file, it is similar to the last one, but here they are loading the amsi.dll file. The values in the arrays are the “patch” values for the AMSI scan buffer.

We can see the corresponding “Name” to the api call to “LoadLibraryA” to load amsi.dll.

We can see that the shellcode here starts with “6f” like the earlier version, so we need to xor it with 0xFF and then we can load it into IDA to see what the key and offset are.
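Since both styles have shown up across these samples, the check and the xor can be combined into one step. This is a rough sketch with a hypothetical function name, `normalize_shellcode`; treating a run of four 0x6F bytes as the sign of the encoded style is an assumption made for illustration.

```python
def normalize_shellcode(hex_string):
    """Return raw shellcode bytes, undoing the 0xFF layer when the sled is 0x6F."""
    raw = bytes.fromhex(hex_string.strip())
    if raw.startswith(b"\x6f" * 4):
        # New style: the NOP sled is encoded, so xor every byte with 0xFF.
        return bytes(b ^ 0xFF for b in raw)
    # Old style: the hex text already holds the raw shellcode.
    return raw
```

The result can be written to disk and opened in a disassembler in either case.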

Looking here, we see it is still using the single-byte xor key of 0xAA.

Using the visual cue for the offset, we can see it is still at offset 0x9C.

Still using stack strings.

Offset of the loader.

Extracted loader hash:  SHA256 : 9836B95DEF864B9517BAD9E414EC4589A2F68D5FE503D55D66FFBA0973DFD799

We can see that this file also starts with 0xEA.

After decoding using an offset of “0” and the single-byte decoding key of 0xAA, we have the final file.

And the decoded file hash: SHA256 : 2F790BF4AF8106EC4584347DE9BC5B10146D9235B8E7C396865AA5B37436D204

In this series of five files we have seen the evolution of this loader as it steadily adds new forms of obfuscation to the VBA as well as the shellcode. We have also seen that it uses Excel as well as Word documents.

Since the files are “zipped”, there is no easy way to build detections against the compressed file. You can’t use a size for sections because of different compression ratios.

The best way I can think of to build detections is against the decompressed files.

Again, I would like to thank French @notareverser for their help with IDA and the shellcode, and for sending over the Python decoder to drive home the idea of how it works. Thanks also for taking the time to teach me how to read the IDA output.
