Analyzing Sophisticated PowerShell Targeting Japan

In this article, we dissect a sophisticated multi-stage PowerShell script that is targeting users in Japan. We found this instance on HybridAnalysis a few days back (on March 7). This sample stands out because it uses multiple layers of obfuscation, encryption, and steganography to protect its final payload from detection. At the time of writing, none of the antivirus engines on VirusTotal detects this attack.

The sample we’ll dive into first appeared on our radar on March 7. The initial sample and some relevant reports:

At the time of this blog post, none of the AVs on VirusTotal detects this sample (0/57). We’ll see why when we dive into the analysis.

Preliminaries

In this section, we review the obfuscation techniques that this malware uses frequently. It relies heavily on the string formatting operator (-f) and the escape character (`) in PowerShell to obfuscate its final payload. These techniques were introduced by Daniel Bohannon in Invoke-Obfuscation: PowerShell obFUsk8tion Techniques & How To (Try To) D””e`Tec`T’Th’+‘em’ and are also implemented in his Invoke-Obfuscation framework.

Basic string formatting in PowerShell is performed by the -f operator. The following line shows the syntax of this operator.

“String with placeholders” -f List of string values separated with comma (,) character

Placeholders occur in the format of {Index,Alignment:Format}, where Index is an index in the list of string values appearing after the -f operator. Alignment and format are not commonly used for obfuscation purposes, so we ignore them.

Example:

Consider the following PowerShell statement picked from the main sample:

"{2}{1}{0}" -f ']','te[','By'

After applying the -f operator, we get ‘Byte[]’ (‘By’ is at index 2, ‘te[’ is at index 1, and ‘]’ is at index 0).
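The substitution above can be reproduced outside PowerShell. The following is a minimal Python sketch, assuming only plain {Index} placeholders (no alignment or format parts); the function name apply_format is ours and is not taken from the sample or from the authors' script.

```python
import re

def apply_format(template, args):
    """Resolve plain {N} placeholders the way PowerShell's -f operator does.

    Assumption: placeholders carry only an index, which is the form
    used for obfuscation in the sample.
    """
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], template)

# The statement from the main sample: "{2}{1}{0}" -f ']','te[','By'
print(apply_format("{2}{1}{0}", ["]", "te[", "By"]))  # Byte[]
```

The same helper resolves the other format strings quoted in this article, for example "{0}{1}" -f 'E','du'.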

The grave accent character (`), also called the backtick, is used to escape characters that have special meaning in PowerShell. For example, to use ” inside “place ” first”, you need to escape the double quote by placing the grave character before it, so you write “place `” first”.

When the escape character precedes a normal character that does not form a recognized escape sequence (such as `n or `t), it does not change that character’s meaning. In other words, the backtick is ignored as if it did not exist.

Example:
Consider the following snippet extracted from the sample.

"lOAD`WiThPart`iAlN`AmE"

Respectively, `W, `i, and `A are W, i, and A, so the string is equal to “lOADWiThPartiAlNAmE”.
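A minimal Python sketch of this cleanup step follows. It assumes the Windows PowerShell 5.1 set of escape sequences (`0, `a, `b, `f, `n, `r, `t, `v) and leaves those, as well as escaped quotes, dollar signs, and backticks, untouched; the name strip_backticks is hypothetical.

```python
import re

# Assumption: characters that keep a special meaning after a backtick in
# Windows PowerShell 5.1. Escape sequences are case-sensitive, so `A is just A.
MEANINGFUL = set("0abfnrtv`\"$")

def strip_backticks(text):
    """Drop backticks that precede ordinary characters."""
    def repl(match):
        char = match.group(1)
        return match.group(0) if char in MEANINGFUL else char
    return re.sub(r"`(.)", repl, text)

print(strip_backticks("lOAD`WiThPart`iAlN`AmE"))  # lOADWiThPartiAlNAmE
```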

Unwrapping obfuscation/encryption layers

In this section, we unwrap the obfuscation layers one by one, based on the techniques discussed earlier. You can also use the following Python script that we created to automatically deobfuscate encoded PowerShell scripts based on these techniques.

Different types of obfuscation in a PowerShell script
Fig 1. The original PowerShell script.

Fig 1 depicts the original PowerShell script we obtained from HybridAnalysis. On line 1, grave characters are used to obfuscate “lOADWiThPartiAlNAmE”. Then the string formatting technique is used to obfuscate the “System.Security” and “Out-Null” strings.

To deobfuscate the script, we can use the Python script that we developed, which handles these two specific techniques.

python bash-deobfuscator.py -f obfuscated.powershell.script.ps1

Fig 2 shows the result of the first layer of deobfuscation of the original script.

Fig 2. The original code after unwrapping the first layer of obfuscation (stage 1)

In the second line, the Set-Alias (sa) cmdlet is used to create a new alias, “DF”, for “new-object”.

The code consists of two functions (from lines 3 to 18 and from 19 to 21). On line 21, we have:

${ZAE} = (&("get-culture"))."pAreNT"."nAmE"[0];

The Get-Culture cmdlet retrieves the current culture of the system. The name field is based on RFC 4646. You can see the full list of language codes in this GitHub repository. The first character of the parent culture’s name field is assigned to the ZAE variable.

On line 22, the function pLank is called with two arguments: mIss is set to a long encrypted string and colSs is set to ZAE. colSs is then passed as the first parameter to the constructor of the Rfc2898DeriveBytes class; the resulting object is used to create two keys, namely DcZ and DeFs, on lines 10 and 11. These keys are then passed to the CreateDecryptor method (DcZ as the key and DeFs as the IV) of the RijndaelManaged object created on line 4.

In short, the pLank function decrypts its first parameter using keys constructed from its second parameter. The second parameter is a single lowercase English letter (a-z). By testing these letters one by one, we found that the decryption produces meaningful output only when the character is j. ja-JP is the only language code that starts with j. Hence, this malware targets systems whose current culture is set to ja-JP. It is reasonable to assume that such systems are located in Japan. As a result, we believe that this malware instance is targeting Japanese users and systems.
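The brute-force idea can be sketched in Python. Rfc2898DeriveBytes implements PBKDF2 with HMAC-SHA1, and successive GetBytes calls continue the same output stream, so a key and an IV can be modelled as consecutive slices of one derivation. The salt, the key and IV sizes, and the names below are placeholders we chose for illustration; the real values come from the stage 1 script and are not reproduced in this text. The sketch stops at key derivation, since the AES step needs a third-party library.

```python
import hashlib
import string

# Hypothetical placeholders: take the real salt and sizes from the script.
SALT = b"\x01\x02\x03\x04\x05\x06\x07\x08"
ITERATIONS = 1000          # .NET default when no iteration count is passed
KEY_LEN, IV_LEN = 32, 16   # assumed sizes for the key (DcZ) and IV (DeFs)

def derive_key_iv(password):
    """Model GetBytes(KEY_LEN) followed by GetBytes(IV_LEN) on one object."""
    stream = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), SALT, ITERATIONS, KEY_LEN + IV_LEN
    )
    return stream[:KEY_LEN], stream[KEY_LEN:]

# One candidate (key, IV) pair per lowercase letter; each pair would then be
# tried against the encrypted blob until the output is readable PowerShell.
candidates = {letter: derive_key_iv(letter) for letter in string.ascii_lowercase}
print(len(candidates), "candidate key/IV pairs")
```

With only 26 candidates, the culture check offers no real secrecy; it works as an environmental gate that keeps the payload from decrypting on non-Japanese systems and in generic sandboxes.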

Finally, the decrypted code is executed on line 25.

To get the decrypted script, we can use the debugger in PowerShell ISE, which is installed on Windows 10 by default. We change line 22 to ${ZAE} =‘j’, replace line 25 with something harmless to avoid inadvertently executing the malware, and then place a breakpoint on line 25. The following gif file shows how to do it.

Fig 3. Debugging the PowerShell code to decrypt the payload.

Fig 4 shows the script that we get on line 25 after decryption and decompression.

Fig 4. The payload after the first round of decryption (stage 2)

In the red part, GV is an alias for Get-Variable. Get-Variable ‘*mDR*’ matches the MaximumDriveCount variable. “MaximumDriveCount”[3,11,2] is [‘i’,‘e’,‘x’], and applying the -join operator gives “iex”.

Fig 5. The output of Get-Variable ‘*mDR*’ statement
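The index trick is easy to verify. A short Python sketch (the variable names are ours) picks the same three positions out of the variable name:

```python
name = "MaximumDriveCount"

# PowerShell's "MaximumDriveCount"[3,11,2] selects these three characters;
# -join then concatenates them.
command = "".join(name[i] for i in (3, 11, 2))
print(command)  # iex
```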

The remaining code evaluates to a string. The long literal string is first decoded with the frOMbaSe64stRING method, then decompressed with DEFlATeSTREam. Finally, the result is converted to ASCII. We can run this part with PowerShell or PowerShell ISE to remove this layer, after deleting the red part in Fig 4 so that nothing is executed. By doing so, we get the code presented in Fig 6.
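This layer can also be removed without running any PowerShell. The sketch below assumes the literal is standard base64 wrapping a raw DEFLATE stream, which is the format .NET's DeflateStream reads; unwrap_layer and wrap_layer are hypothetical helper names, and wrap_layer exists only to build a harmless test input.

```python
import base64
import zlib

def unwrap_layer(b64_text):
    """Base64-decode, inflate a raw DEFLATE stream, and decode as ASCII."""
    raw = base64.b64decode(b64_text)
    # wbits=-15 selects raw DEFLATE with no zlib or gzip header
    return zlib.decompress(raw, -15).decode("ascii")

def wrap_layer(text):
    """Inverse operation, used here only to produce a demo input."""
    compressor = zlib.compressobj(wbits=-15)
    data = compressor.compress(text.encode("ascii")) + compressor.flush()
    return base64.b64encode(data).decode("ascii")

demo = wrap_layer("Write-Host 'next stage'")
print(unwrap_layer(demo))  # Write-Host 'next stage'
```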

Fig 6. The second round of deobfuscation (stage 3)

Instead of relying on defined variables, this stage relies on environment variables to construct “iex”. In the rest of the command, the numbers are XORed with 0x0c (12), converted to characters, and joined together to form a string (note: this part is very slow). The resulting string is then executed with iex.
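Doing the XOR step offline avoids the slow PowerShell loop. A minimal Python sketch, with xor_decode as a hypothetical helper name and a three-number demo input instead of the sample's real data:

```python
def xor_decode(numbers, key=0x0C):
    """XOR each number with the key and join the resulting characters."""
    return "".join(chr(n ^ key) for n in numbers)

# Demo input: the characters of "iex" XORed with 0x0c
print(xor_decode([101, 105, 116]))  # iex
```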

Fig 7. The third round of deobfuscation (stage 4)

In this stage, a string is constructed and piped to iex (the red part). $PSHome is a predefined variable that points to the PowerShell home directory.

To construct the string, each number (represented in binary) is first converted to a string, then to an Int16, and then to a character. Finally, all of the resulting characters are joined together to form the final string.
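The same conversion takes one line in Python. The sketch below assumes the numbers have already been split into individual binary tokens; decode_binary is a hypothetical name and the input is a demo value, not data from the sample.

```python
def decode_binary(tokens):
    """Parse each token as a base-2 integer and map it to a character."""
    return "".join(chr(int(token, 2)) for token in tokens)

print(decode_binary(["1101001", "1100101", "1111000"]))  # iex
```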

Fig 8. The fourth round of deobfuscation (stage 5)

Again, the red part constructs iex. In the rest of this script, the format operator is used to construct a string. Then some characters and substrings are replaced by other characters using the replace function (the blue part). The resulting string is shown in Fig 9.

Fig 9. The fifth round of deobfuscation (stage 6)

The same techniques are used in this PowerShell script. After deobfuscating this phase, we get:

Fig 10. The sixth round of deobfuscation (stage 7)

We can again use our Python script to reverse the string formatting technique (note that the file is ASCII-encoded, so change the encoding in the Python script from utf_16 to utf_8). We can also beautify the code to better understand what it does.

Fig 11. The seventh round of deobfuscation (stage 8)

On line 44, depending on the Windows version, either the v6B or the v10A function is called. Lines 3 to 27 are similar to lines 1 to 21, with a few lines added. If the Windows version is equal to 6, v6B is called. This function first attempts to download an image from two URLs (if the first URL is not available, the second one is tried). If the size of the image is above 55555, a string that is embedded in the image is extracted on lines 35 to 36.

Fig 12. The image containing PowerShell code (steganography technique)

As pointed out by @JaromirHorejsi, the steganography extraction function in this sample is very similar to the technique used by ursnif (similar to https://twitter.com/DissectMalware/status/1057518886709612546).

Fig 13. The steganography technique is similar to the technique used by ursnif
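The extraction routine itself is only visible in the figures, so the following Python sketch is an assumption rather than a transcription. It illustrates one publicly known pixel-based scheme (the one used by the Invoke-PSImage tool), in which each payload byte is split across the low four bits of the blue and green channels of one pixel. The function names and the in-memory pixel list are ours, and the sample's exact bit layout may differ.

```python
def extract_payload(pixels, length):
    """Rebuild bytes from (r, g, b) pixels given in row-major order.

    Assumed layout: high nibble in the blue channel's low 4 bits,
    low nibble in the green channel's low 4 bits.
    """
    out = bytearray()
    for _, green, blue in pixels[:length]:
        out.append(((blue & 0x0F) << 4) | (green & 0x0F))
    return bytes(out)

def embed_payload(pixels, payload):
    """Inverse operation, used here only to build a demo image."""
    result = list(pixels)
    for i, byte in enumerate(payload):
        red, green, blue = result[i]
        result[i] = (red, (green & 0xF0) | (byte & 0x0F), (blue & 0xF0) | (byte >> 4))
    return result

cover = [(200, 120, 64)] * 16          # stand-in for decoded image pixels
stego = embed_payload(cover, b"iex")
print(extract_payload(stego, 3))       # b'iex'
```

Because only the low bits of each channel change, the carrier image still looks like an ordinary picture, which is what lets it pass as a harmless download.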

The extracted code is then decrypted and decompressed with the Nice function (line 39). The resulting base64-encoded string is first decoded and then executed (line 40).

If the Windows version is not equal to 6, the v10A function is called. First, a base64-encoded string is decoded and the result is loaded with the [Reflection.Assembly]::Load method. From this call, we can infer that the data is a .NET DLL binary. Then the static function Stefan.gavbo.pf() is called.

Fig 14. Embedded .NET dll
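A quick way to corroborate that inference is to decode the string and check for the "MZ" signature that starts every PE file, including .NET assemblies. The helper below is a minimal sketch with a hypothetical name, and the demo inputs are stand-ins rather than the embedded blob from the sample.

```python
import base64

def looks_like_pe(b64_text):
    """Return True if the decoded data starts with the PE 'MZ' signature."""
    data = base64.b64decode(b64_text)
    return data[:2] == b"MZ"

# Stand-in inputs: a fake PE header and an unrelated string
print(looks_like_pe(base64.b64encode(b"MZ\x90\x00\x03").decode("ascii")))  # True
print(looks_like_pe(base64.b64encode(b"not a binary").decode("ascii")))    # False
```

A related shortcut is to look at the base64 text directly: data that begins with the bytes 4D 5A 90 encodes to a string starting with "TVqQ".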

The pf function downloads another image, shown in Fig 15.

Fig 15. The image that is downloaded by pf function. It contains PowerShell code.

The code extracted from the downloaded image in v6B is shown in the following figure.

Fig 16. The PowerShell code extracted from the downloaded image

After deobfuscation

Fig 17. After the first round of deobfuscation of the extracted PowerShell script

After another round of deobfuscation

Fig 18. After the second round of deobfuscation of the extracted PowerShell script

After yet another round of deobfuscation, we get:

Fig 19. After the third round of deobfuscation of the extracted PowerShell script

The code downloads another image from one of the following URLs and then extracts, decompresses and executes the embedded PowerShell code.

https://i.imgur.com/vwN9O7y.png

Fig 20. The image downloaded by the script containing PowerShell code (steganography technique)

The embedded code is:

Fig 21. The extracted code from the image

Fig 22. Unwrapping the embedded PE binary

Please note that if you use the Python script to deobfuscate this stage, you need to comment out line 37.

On line 2415 in Fig 22, we have:

$NiSs=$Ni."LC`Id"; $aCc=@(($niss %4),2,($niss %6),4,($niss %7),($niss %9),($niss %11),($niss %12),10,($niss %100),(($niss %50)+10),(($niss %50)-10),(($niss %800)+9));[byte[]]$MjT=$null;$sPKk=""+$NiSs;.("{0}{1}"-f 'E','du') ${glOBaL:`M`G`GG} ([system.text.encoding]::"As`cIi"."gEtBy`TES"($sPKk)) ([ref]$MjT) $aCc;${glOBaL:`M`G`GG}=([System.Text.Encoding]::"Ut`F8"."g`e`TsTrinG"($MjT))

$Ni is assigned at the end of line 15 ($Ni=Get-Culture). The language id for the current culture is assigned to $NiSs ($Ni.LCId). The language id for ja-JP is 1041. To make the code work, we replace it with that value.

$NiSs=1041; $aCc=@(($niss %4),2,($niss %6),4,($niss %7),($niss %9),($niss %11),($niss %12),10,($niss %100),(($niss %50)+10),(($niss %50)-10),(($niss %800)+9));[byte[]]$MjT=$null;$sPKk=""+$NiSs;.("{0}{1}"-f 'E','du') ${glOBaL:`M`G`GG} ([system.text.encoding]::"As`cIi"."gEtBy`TES"($sPKk)) ([ref]$MjT) $aCc;${glOBaL:`M`G`GG}=([System.Text.Encoding]::"Ut`F8"."g`e`TsTrinG"($MjT))

Now we can debug the code and extract the embedded PE file.
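To see what the patched line feeds into the Edu call, the $aCc array can be evaluated by hand. The Python sketch below mirrors the arithmetic of that line for a given language id; build_acc is a name we introduce, and what Edu does with the array is not covered here.

```python
def build_acc(lcid):
    """Mirror the $aCc array expression for a given language id."""
    return [
        lcid % 4, 2, lcid % 6, 4, lcid % 7, lcid % 9, lcid % 11, lcid % 12, 10,
        lcid % 100, (lcid % 50) + 10, (lcid % 50) - 10, (lcid % 800) + 9,
    ]

print(build_acc(1041))
# [1, 2, 3, 4, 5, 6, 7, 9, 10, 41, 51, 31, 250]

# $sPKk is the language id as a string, passed to Edu as ASCII bytes
print(str(1041).encode("ascii"))  # b'1041'
```

Any other language id produces a different array, so this second culture check gates the embedded PE in the same way the first one gated stage 2.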

References

IOCs