How to decompile any Python binary
At F-Secure we often encounter binary payloads that are generated from compiled Python. These are usually generated with tools such as py2exe or PyInstaller to create a Windows executable. A notable example was the Triton malware recently discovered by FireEye[1], which used this exact technique. Because of the variety of payloads we see, we frequently had to rely on multiple decompilation scripts or manual intervention to obtain the source code. To speed this process up, we decided to create a single analysis script that can decompile both py2exe and PyInstaller files and return the output.
Python to Executable
To start off we’re going to show you how payloads can be compiled in py2exe and PyInstaller.
To create a payload using py2exe:
- Install the py2exe package from http://www.py2exe.org/
- For the payload (in this case, we will name it hello.py), write a setup.py like the one in Figure 1. The option “bundle_files” with the value of 1 bundles everything, including the Python interpreter, into one exe.
- Once the script is ready, we will issue the command “python setup.py py2exe”. This will create the executable, just like in Figure 2.
from distutils.core import setup
import py2exe, sys, os

sys.argv.append('py2exe')

setup(
    options = {'py2exe': {'bundle_files': 1}},
    #windows = [{'script': "hello.py"}],
    console = [{'script': "hello.py"}],
    zipfile = None,
)
Figure 1
C:\Users\test\Desktop\test>python setup.py py2exe
running py2exe
*** searching for required modules ***
*** parsing results ***
*** finding dlls needed ***
*** create binaries ***
*** byte compile python files ***
*** copy extensions ***
*** copy dlls ***
copying C:\Python27\lib\site-packages\py2exe\run.exe -> C:\Users\test\Desktop\test\dist\hello.exe
Adding python27.dll as resource to C:\Users\test\Desktop\test\dist\hello.exe
Figure 2
To create a payload using PyInstaller:
- Install PyInstaller using pip (pip install pyinstaller).
- After that, we will issue the command “pyinstaller --onefile hello.py” (a reminder that ‘hello.py’ is our payload). This will bundle everything into one executable, as shown in Figure 3.
C:\Users\test\Desktop\test>pyinstaller --onefile hello.py
108 INFO: PyInstaller: 3.3.1
108 INFO: Python: 2.7.14
108 INFO: Platform: Windows-10-10.0.16299
………………………………
5967 INFO: checking EXE
5967 INFO: Building EXE because out00-EXE.toc is non existent
5982 INFO: Building EXE from out00-EXE.toc
5982 INFO: Appending archive to EXE C:\Users\test\Desktop\test\dist\hello.exe
6325 INFO: Building EXE from out00-EXE.toc completed successfully.
Figure 3
One Script to Rule Them All
A number of useful Python decompilation scripts already exist, including unpy2exe.py, pyinstxtractor.py and uncompyle6; however, each supports different options and file types. To speed up analysis, the Countercept team created a single script (available on GitHub) that acts as a one-stop shop for decompilation, calling the other scripts as needed.
The script operates as follows:
- Once a binary is specified as input, the script automatically determines whether it was packed using py2exe or PyInstaller.
- After the check completes, it unpacks the binary using either unpy2exe.py or pyinstxtractor.py.
- If the script detects any encrypted bytecode, it asks whether it should proceed with decryption (Figure 7).
- Once everything is unpacked, it decompiles all the extracted Python bytecode using uncompyle6.
- Occasionally the main Python file, which contains the main logic for the program, can’t be decompiled. Usually this is because the bytecode is missing the magic bytes that identify the Python version. The “prepend” option in this script can be used to overcome this.
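The first step above, working out which packer produced a binary, can be sketched with a simple byte search. This is a heuristic illustration only and not the logic used by python_exe_unpack.py, which relies on unpy2exe.py and pyinstxtractor.py. It assumes that PyInstaller archives carry the marker bytes "MEI" followed by 0x0c 0x0b 0x0a 0x0b 0x0e, and that py2exe stores the script in a PE resource named PYTHONSCRIPT (resource names are stored as UTF-16LE). The function and constant names are our own.

```python
# Heuristic packer detection: a sketch, not the logic of python_exe_unpack.py.
# Assumption: PyInstaller's embedded archive carries this marker.
PYINSTALLER_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"
# Assumption: py2exe keeps the script in a PE resource named PYTHONSCRIPT,
# and PE resource names are stored as UTF-16LE strings.
PY2EXE_RESOURCE = "PYTHONSCRIPT".encode("utf-16-le")


def detect_packer(data):
    """Return 'pyinstaller', 'py2exe', or None for the raw bytes of an exe."""
    if PYINSTALLER_MAGIC in data:
        return "pyinstaller"
    if PY2EXE_RESOURCE in data:
        return "py2exe"
    return None


def detect_packer_from_file(path):
    """Read a file from disk and apply the same heuristic."""
    with open(path, "rb") as handle:
        return detect_packer(handle.read())
```

A plain byte search like this can misfire on unrelated files that happen to contain the same bytes, so a proper PE resource parser is the safer choice for real triage.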
The available options are shown below:
test@test:python python_exe_unpack.py
[*] On Python 2.7
usage: python_exe_unpack.py [-h] [-i INPUT] [-o OUTPUT] [-p PREPEND]

This program will detect, unpack and decompile binary that is packed in either py2exe or pyinstaller. (Use only one option either -i or -p)

optional arguments:
  -h, --help  show this help message and exit
  -i INPUT    exe that is packed using py2exe or pyinstaller (Use -o to specify the output directory)
  -o OUTPUT   folder to store your unpacked and decompiled code. (Otherwise will default to current working directory and inside the folder "unpacked")
  -p PREPEND  Option that prepend pyc without magic bytes. (Usually for pyinstaller main python file)
Figure 4
A Python binary can be decompiled by passing it to the script with the “-i” argument, as below. Figure 5 shows a py2exe example and Figure 6 shows a PyInstaller example:
test@test:python python_exe_unpack.py -i sample/malware_1.exe
[*] On Python 2.7
[*] This exe is packed using py2exe
[*] Unpacking the binary now
Figure 5
test@test:python python_exe_unpack.py -i sample/malware_2.exe
[*] On Python 2.7
[*] Processing sample/malware_42d5f609c0143ec808b45b247f2cbf8decce5bee0572a30c2437ecb6bf8b37b4
[*] Pyinstaller version: 2.0
[*] This exe is packed using pyinstaller
[*] Unpacking the binary now
[*] Python version: 26
[*] Length of package: 7346701 bytes
[*] Found 66 files in CArchive
[*] Beginning extraction...please standby
[!] Warning: The script is running in a different python version than the one used to build the executable
    Run this script in Python26 to prevent extraction errors(if any) during unmarshalling
[*] Found 423 files in PYZ archive
[*] Successfully extracted pyinstaller exe.
Figure 6
PyInstaller has an option to encrypt the Python bytecode that is bundled with the exe (typically the other modules required by the main Python file). As Figure 7 shows, once the script detects encrypted Python bytecode, it asks whether to decrypt it using the key that it retrieved from the exe itself.
test@test:python python_exe_unpack.py -i sample/malware_3.exe
[*] On Python 2.7
[*] Processing sample/hello-pyinstaller-encrypted.exe
[*] Pyinstaller version: 2.1+
[*] This exe is packed using pyinstaller
[*] Unpacking the binary now
[*] Python version: 27
[*] Length of package: 3210322 bytes
[*] Found 20 files in CArchive
[*] Beginning extraction...please standby
[*] Found 196 files in PYZ archive
[!] Error: Failed to decompress heapq, probably encrypted. Extracting as is.
[!] Error: Failed to decompress encodings.cp932, probably encrypted. Extracting as is.
[!] Error: Failed to decompress encodings.johab, probably encrypted. Extracting as is.
[!] Error: Failed to decompress functools, probably encrypted. Extracting as is.
[!] Error: Failed to decompress random, probably encrypted. Extracting as is.
..........................................
[!] Error: Failed to decompress encodings.cp950, probably encrypted. Extracting as is.
[*] Successfully extracted pyinstaller exe.
[*] Encrypted pyc file is found. Decrypt it? [y/n]y
decompiled 194 files: 0 okay, 2 failed
[+] Binary unpacked successfully
Figure 7
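The “Failed to decompress ..., probably encrypted” messages in Figure 7 suggest that the extractor attempts to decompress each archive entry and treats a failure as a sign of encryption. The sketch below illustrates that idea with zlib from the standard library. It is our reading of the log output rather than the extractor's actual code, and the function name try_decompress is hypothetical.

```python
import zlib


def try_decompress(blob):
    """Return (data, True) if blob inflates with zlib, else (blob, False).

    Assumption: archive entries are zlib-compressed, so an entry that does
    not inflate is "probably encrypted" and is kept as is.
    """
    try:
        return zlib.decompress(blob), True
    except zlib.error:
        return blob, False
```

An entry that comes back with False would then be a candidate for the decryption step that the script offers.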
Challenges with Python bytecode
Currently, the Python bytecode file produced by unpy2exe or pyinstxtractor might not be complete, in which case uncompyle6 cannot recognize it and recover the plain Python source code. This is caused by a missing Python bytecode version number (the magic bytes at the start of the file). We therefore included a prepend option, which adds a Python bytecode version number to the file so that it can be decompiled. As Figure 8 shows, uncompyle6 returns an error when we try to decompile the extracted file directly. Once we use the prepend option (Figure 9), the Python source code is decompiled successfully.
test@test: uncompyle6 unpacked/malware_3.exe/archive.py
Traceback (most recent call last):
……………………….
ImportError: File name: 'unpacked/malware_3.exe/__pycache__/archive.cpython-35.pyc' doesn't exist
Figure 8
test@test:python python_exe_unpack.py -p unpacked/malware_3.exe/archive
[*] On Python 2.7
[+] Magic bytes is already appeneded.
# Successfully decompiled file
[+] Successfully decompiled.
Figure 9
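To illustrate what prepending the version number involves, here is a minimal sketch for Python 2.7 bytecode. A Python 2.7 .pyc file starts with an 8-byte header: a 2-byte magic number (62211 for Python 2.7) followed by "\r\n", then a 4-byte modification timestamp. The helper names below are hypothetical, the timestamp is set to zero as a placeholder, and other Python versions use different magic numbers and header lengths, so this is not a drop-in replacement for the script's prepend option.

```python
import struct

# Magic number for Python 2.7 bytecode (the 62211 that uncompyle6 reports).
PY27_MAGIC_NUMBER = 62211


def build_pyc_header(magic_number=PY27_MAGIC_NUMBER, timestamp=0):
    """Build the 8-byte Python 2.7 header: magic + CRLF, then a 4-byte mtime."""
    magic = struct.pack("<H", magic_number) + b"\r\n"
    return magic + struct.pack("<I", timestamp)


def prepend_header(raw, magic_number=PY27_MAGIC_NUMBER):
    """Return raw with a header added, unless it already starts with the magic."""
    header = build_pyc_header(magic_number)
    if raw[:4] == header[:4]:
        return raw  # header already present, leave the data untouched
    return header + raw
```

The resulting bytes can be written to a file with a .pyc extension and passed to uncompyle6.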
Real World Example
To demonstrate how the script can be used in the real world, we will test it against a Triton malware sample.
- Download the sample using the hash given in [1]. SHA1: dc81f383624955e0c0441734f9f1dabfe03f373c
- Run the script with the sample as input (Figure 10).
- As Figure 11 shows, the Python code is retrieved and decompiled successfully.
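Before running the script, it is worth confirming that the downloaded file matches the SHA1 quoted above. A minimal sketch using hashlib from the standard library follows; the function names are our own, and the path in the usage note is the one shown in Figure 10.

```python
import hashlib

# SHA1 of the Triton sample, as quoted in the steps above.
EXPECTED_SHA1 = "dc81f383624955e0c0441734f9f1dabfe03f373c"


def sha1_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA1):
    """Compare the file's SHA1 with the expected hex digest, ignoring case."""
    return sha1_of_file(path) == expected.lower()
```

For example, matches_expected("sample/triton_sample") should return True for a correctly downloaded sample.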
test@test: python python_exe_unpack.py -i sample/triton_sample
[*] On Python 2.7
[*] This exe is packed using py2exe
[*] Unpacking the binary now
# Successfully decompiled file
Figure 10
# uncompyle6 version 2.11.5
# Python bytecode 2.7 (62211)
# Decompiled from: Python 2.7.12 (default, Nov 20 2017, 18:23:56)
# [GCC 5.4.0 20160609]
# Embedded file name: script_test.py
# Compiled at: 2017-12-24 08:05:33
import TsHi
import sh
import struct
import time
import sys

def PresetStatusField(TsApi, value):
    if len(value) != 4:
        return -1
    script_code = '\x80\x00@<\x00\x00b\x80@\x00\x80<@ \x03|\x1c\x00\x82@\x04\x00b\x80`\x00\x80<@ \x03|\x0c\x00\x82@\x18\x00B8\x1c\x00\x00H\x80\x00\x80<\x00\x01\x84`@ \x02|\x18\x00\x80@\x04\x00B8\xc4\xff\xffK' + value[2:4] + '\x80<' + value[0:2] + '\x84`\x00\x00\x82\x90\xff\xff`8\x02\x00\x00D'
    AppendResult = TsApi.SafeAppendProgramMod(script_code)
    if not AppendResult:
        return -1
    cp_info = TsApi.GetCpStatus()
    status = cp_info[40:44]
    if status != value:
        return 0
    return 1
………………………………………..
Figure 11
Problem solved! We’ve hopefully demonstrated how you can now decompile any Python binary, which should make your own hunt teams more efficient. The project is available on GitHub and we welcome contributions.
[1] https://www.fireeye.com/blog/threat-research/2017/12/attackers-deploy-new-ics-attack-framework-triton.html