Combining with the PeFile module

pefile is a Python module for parsing and analyzing Windows Portable Executable (PE) files. It reads file metadata, the section table, and the import and export tables, exposing the internal structure of a PE file for security analysis, antivirus research, reverse engineering, and similar work.

Dynamically read PE header

The following code uses the x32dbg debugger to obtain the base address of the target process's main module and reads the first 1024 bytes of that module as PE file data. It then parses the data with the pefile module and prints some key fields: the Magic and MajorLinkerVersion fields of OPTIONAL_HEADER and the Signature field of NT_HEADERS. Finally, it closes the connection to the debugger.

python
from x32dbg import Debugger
import pefile

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)

    if dbg.connect() == True:
        # Get the base address of the current process
        base_address = dbg.get_main_module_base()
        print("Main program base address = {}".format(hex(base_address)))

        # Read the first 1024 bytes of the module, one byte at a time
        byte_array = bytearray()
        for index in range(0, 1024):
            read = dbg.get_memory_byte(base_address + index)
            byte_array.append(read)

        # Parse the bytes and dump OPTIONAL_HEADER as a dictionary
        pe_ptr = pefile.PE(data=byte_array)
        optional_header = pe_ptr.OPTIONAL_HEADER.dump_dict()

        # Read specific fields
        magic = optional_header.get("Magic")
        print(magic)

        majorlinkerversion = optional_header.get("MajorLinkerVersion")
        print(majorlinkerversion)

        # Print the Signature field of NT_HEADERS
        nt = pe_ptr.NT_HEADERS.dump_dict()
        print(nt.get("Signature").get("Value"))

        dbg.close_connect()
    else:
        print("Failed to connect debugger")

After the program runs successfully, it will output the following content:

python
Main program base address = 0x330000
{'FileOffset': 280, 'Offset': 0, 'Value': 267}
{'FileOffset': 282, 'Offset': 2, 'Value': 12}
17744
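
The printed values are easier to interpret once decoded. As a small sketch using only the standard library (the helper names `describe_magic` and `describe_signature` are illustrative and not part of pefile or x32dbg), the following shows that a Magic of 267 is 0x10B, which identifies a PE32 (32-bit) optional header, and that a Signature of 17744 is 0x4550, stored in the file as the little-endian bytes `PE\0\0`:

```python
import struct

# Optional-header Magic values defined by the PE format
MAGIC_NAMES = {
    0x10B: "PE32 (32-bit)",
    0x20B: "PE32+ (64-bit)",
}


def describe_magic(value):
    """Return a readable name for an OPTIONAL_HEADER Magic value."""
    return MAGIC_NAMES.get(value, "unknown")


def describe_signature(value):
    """Return the four signature bytes as they are stored in the file."""
    return struct.pack("<I", value)


if __name__ == "__main__":
    print(hex(267), describe_magic(267))
    print(hex(17744), describe_signature(17744))
```

The same helpers can be applied to the `Value` entries of the dictionaries returned by `dump_dict()`.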

Dynamically read PE section table

This code also uses x32dbg to obtain the main module's base address and read the first 1024 bytes of the module as PE file data. It then parses the data with pefile to obtain the section information: the name, virtual size, virtual address, raw data size, raw data pointer, and characteristics of each section. Together these fields describe how the file is laid out on disk and in memory.

The key steps are:

INFO

  • Connect to the debugger and obtain the main module base address of the target process.
  • Read the first 1024 bytes of the main module as the data of the PE file.
  • Use the pefile module to parse the data and read the IMAGE_SECTION_HEADER fields of each section.
  • Print each section's name, virtual size, virtual address, raw data size, raw data pointer, and characteristics.

python
from x32dbg import Debugger
import pefile

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)

    if dbg.connect() == True:
        # Get the base address of the current process
        base_address = dbg.get_main_module_base()
        print("Main program base address = {}".format(hex(base_address)))

        # Read the first 1024 bytes of the module, one byte at a time
        byte_array = bytearray()
        for index in range(0, 1024):
            read_byte = dbg.get_memory_byte(base_address + index)
            byte_array.append(read_byte)

        # Parse the bytes directly from memory. This avoids writing a
        # temporary file that would otherwise be left behind on disk.
        pe = pefile.PE(data=bytes(byte_array))

        # Get the IMAGE_SECTION_HEADER field of each section
        for section in pe.sections:
            print("Section Name:", section.Name.decode().strip('\x00'))
            print("Virtual Size:", hex(section.Misc_VirtualSize))
            print("Virtual Address:", hex(section.VirtualAddress))
            print("Size of Raw Data:", hex(section.SizeOfRawData))
            print("Pointer to Raw Data:", hex(section.PointerToRawData))
            print("Characteristics:", hex(section.Characteristics))
        dbg.close_connect()
    else:
        print("Failed to connect debugger")

After the program runs successfully, it will output the following content:

python
Main program base address = 0x330000
Section Name: .text
Virtual Size: 0xb74
Virtual Address: 0x1000
Size of Raw Data: 0xc00
Pointer to Raw Data: 0x400
Characteristics: 0x60000020
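
The Characteristics value is a bit mask. As a hedged sketch (the table below covers only a subset of the flags defined by the PE format, and `decode_section_characteristics` is an illustrative helper, not a pefile function), the value 0x60000020 printed above decodes to a readable, executable code section:

```python
# A subset of the IMAGE_SCN_* section flags from the PE format
SECTION_FLAGS = {
    0x00000020: "CNT_CODE",
    0x00000040: "CNT_INITIALIZED_DATA",
    0x00000080: "CNT_UNINITIALIZED_DATA",
    0x02000000: "MEM_DISCARDABLE",
    0x20000000: "MEM_EXECUTE",
    0x40000000: "MEM_READ",
    0x80000000: "MEM_WRITE",
}


def decode_section_characteristics(value):
    """Return the names of the known flags set in a Characteristics value."""
    return [name for bit, name in sorted(SECTION_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    # Value taken from the .text section in the output above
    print(decode_section_characteristics(0x60000020))
```

Inside the loop over `pe.sections`, the helper could be called as `decode_section_characteristics(section.Characteristics)`.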

Verify that PE enables protection mode

This code enumerates all loaded modules and checks several security features of each one by testing bits in the DllCharacteristics field of the optional header: base address randomization (ASLR), DEP (Data Execution Prevention), forced integrity checking, and the SEH (Structured Exception Handling) flag.

The abbreviations here mean:

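  • ASLR: Address Space Layout Randomization (bit 0x40)
  • DEP: Data Execution Prevention (bit 0x100)
  • Force Integrity: forced code integrity check (bit 0x80)
  • SEH Protection: the Structured Exception Handling flag (bit 0x400). In the PE format this bit is named IMAGE_DLLCHARACTERISTICS_NO_SEH and marks an image that does not use structured exception handling.
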
python
from x32dbg import Debugger
import pefile

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)
    if dbg.connect() == True:
        # Get all loaded modules
        module_list = dbg.get_module()
        print("Name\t\tASLR\t\t DEP\t\tForce Integrity\t\tSEH Protection\t\t")
        print("-" * 100)

        for module_index in module_list:
            print("{:15}\t\t".format(module_index.get("Name")), end="")

            # Read the first 4096 bytes of the module as the data of the PE file
            byte_array = bytearray()
            for index in range(0, 4096):
                read_byte = dbg.get_memory_byte(module_index.get("Base") + index)
                byte_array.append(read_byte)

            # Use pefile to parse PE file data
            oPE = pefile.PE(data=byte_array)

            characteristics = oPE.OPTIONAL_HEADER.DllCharacteristics

            # ASLR => DllCharacteristics & 0x40 == 0x40
            print("{}\t\t\t".format((characteristics & 0x40) == 0x40), end="")

            # DEP => DllCharacteristics & 0x100 == 0x100
            print("{}\t\t\t".format((characteristics & 0x100) == 0x100), end="")

            # Force Integrity => DllCharacteristics & 0x80 == 0x80
            print("{}\t\t\t".format((characteristics & 0x80) == 0x80), end="")

            # SEH => DllCharacteristics & 0x400 == 0x400
            print("{}\t\t\t".format((characteristics & 0x400) == 0x400), end="")
            print()

        dbg.close_connect()
    else:
        print("Failed to connect debugger")

After the program runs successfully, it will output the following content:

python
Name                ASLR    DEP     Force Integrity    SEH Protection
-------------------------------------------------------------------------
win32project.exe    True    True    False              False
msvcr120.dll        True    True    False              False
kernel32.dll        True    True    False              False
msvcp_win.dll       True    True    False              False
win32u.dll          True    True    False              True
ucrtbase.dll        True    True    False              False
user32.dll          True    True    False              False
gdi32.dll           True    True    False              True
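
The four bit tests can also be written with named constants, which makes the meaning of each mask explicit. This is a minimal sketch: the constant names follow the IMAGE_DLLCHARACTERISTICS_* names of the PE format with the prefix dropped, `decode_dll_characteristics` is an illustrative helper, and the sample value 0x8140 is chosen for demonstration rather than read from a real module.

```python
# DllCharacteristics bits from the PE format (the four tested above)
DYNAMIC_BASE = 0x0040      # image can be relocated at load time (ASLR)
FORCE_INTEGRITY = 0x0080   # code integrity checks are enforced
NX_COMPAT = 0x0100         # image is compatible with DEP
NO_SEH = 0x0400            # image does not use structured exception handling


def decode_dll_characteristics(value):
    """Map a DllCharacteristics value to the four flags as booleans."""
    return {
        "ASLR": bool(value & DYNAMIC_BASE),
        "DEP": bool(value & NX_COMPAT),
        "Force Integrity": bool(value & FORCE_INTEGRITY),
        "NO_SEH": bool(value & NO_SEH),
    }


if __name__ == "__main__":
    # Sample value for illustration only
    print(decode_dll_characteristics(0x8140))
```

With a parsed module, the call would be `decode_dll_characteristics(oPE.OPTIONAL_HEADER.DllCharacteristics)`.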

Converting between virtual addresses and file offsets

This code connects to the target process through x32dbg and parses the main module's file on disk with pefile. It then converts a virtual address (VA) to a file offset address (FOA), an FOA back to a VA, and an FOA to a relative virtual address (RVA). These three address forms come up constantly in reverse engineering and security analysis, because a debugger reports VAs while a file on disk is indexed by offset.

The three methods of the PEUtils class cover these conversions:

INFO

  • get_offset_from_va: converts a virtual address (VA) to a file offset address (FOA). Use it to find where an address seen in memory is stored in the file on disk, for example to read or patch the corresponding bytes.
  • get_va_from_foa: converts a file offset address (FOA) to a virtual address (VA). Use it when parsing or modifying a PE file to find where a given file position is mapped in the running process.
  • get_rva_from_foa: converts a file offset address (FOA) to a relative virtual address (RVA), that is, an address relative to the image base of the loaded module. If no section contains the offset, it returns 0.

python
from x32dbg import Debugger
import pefile

class PEUtils:
    def __init__(self, debugger):
        self.debugger = debugger

    def get_offset_from_va(self, pe_ptr, va_address):
        memory_image_base = self.debugger.get_main_module_base()
        memory_local_rva = va_address - memory_image_base
        foa = pe_ptr.get_offset_from_rva(memory_local_rva)
        return foa

    def get_va_from_foa(self, pe_ptr, foa_address):
        rva = pe_ptr.get_rva_from_offset(foa_address)
        memory_image_base = self.debugger.get_main_module_base()
        va = memory_image_base + rva
        return va

    def get_rva_from_foa(self, pe_ptr, foa_address):
        sections = [s for s in pe_ptr.sections if s.contains_offset(foa_address)]
        if sections:
            section = sections[0]
            return (foa_address - section.PointerToRawData) + section.VirtualAddress
        else:
            return 0

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)
    if dbg.connect() == True:

        # Initialize Conversion Class
        pe_utils = PEUtils(dbg)
        pe = pefile.PE(name=dbg.get_main_module_path())

        # Read addresses from files
        rva = pe.OPTIONAL_HEADER.AddressOfEntryPoint
        va = pe.OPTIONAL_HEADER.ImageBase + pe.OPTIONAL_HEADER.AddressOfEntryPoint
        foa = pe.get_offset_from_rva(pe.OPTIONAL_HEADER.AddressOfEntryPoint)
        print("File VA address: {} File FOA address: {} Retrieve RVA address from file: {}".format(hex(va), foa, hex(rva)))

        # Convert VA virtual address to FOA file offset
        eip = dbg.get_eip()
        foa = pe_utils.get_offset_from_va(pe, eip)
        print("Virtual address: 0x{:x} corresponds to file offset: {}".format(eip, foa))

        # Convert FOA file offset to VA virtual address
        va = pe_utils.get_va_from_foa(pe, foa)
        print("File address: {} Corresponding virtual address: 0x{:x}".format(foa, va))

        # Convert FOA file offset address to RVA relative address
        rva = pe_utils.get_rva_from_foa(pe, foa)
        print("File address: {} Corresponding RVA relative address: 0x{:x}".format(foa, rva))

        dbg.close_connect()
    else:
        print("Failed to connect debugger")

After the program runs successfully, it will output the following content:

python
File VA address: 0x4015bb File FOA address: 2491 Retrieve RVA address from file: 0x15bb
Virtual address: 0x3315bb corresponds to file offset: 2491
File address: 2491 Corresponding virtual address: 0x3315bb
File address: 2491 Corresponding RVA relative address: 0x15bb
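
The arithmetic behind these conversions can be reproduced without pefile. The sketch below is a simplified illustration, not pefile's implementation: it ignores alignment adjustments and addresses that fall inside the headers, and the `Section` tuple and function names are hypothetical. It uses the .text values printed in the section table example (virtual address 0x1000, raw data pointer 0x400, raw data size 0xc00) and the module base 0x330000.

```python
from collections import namedtuple

# Minimal stand-in for a section header (hypothetical type for illustration)
Section = namedtuple("Section", "name virtual_address raw_pointer raw_size")


def rva_to_foa(sections, rva):
    """Convert an RVA to a file offset, or return None if it is not file-backed."""
    for s in sections:
        offset = rva - s.virtual_address
        if 0 <= offset < s.raw_size:
            return offset + s.raw_pointer
    return None


def foa_to_rva(sections, foa):
    """Convert a file offset to an RVA, or return None if no section holds it."""
    for s in sections:
        offset = foa - s.raw_pointer
        if 0 <= offset < s.raw_size:
            return offset + s.virtual_address
    return None


def va_to_foa(sections, image_base, va):
    """Convert a VA to a file offset by first subtracting the image base."""
    return rva_to_foa(sections, va - image_base)


if __name__ == "__main__":
    sections = [Section(".text", 0x1000, 0x400, 0xC00)]
    image_base = 0x330000  # base address reported by the debugger

    print(va_to_foa(sections, image_base, 0x3315BB))   # 2491
    print(hex(foa_to_rva(sections, 2491)))             # 0x15bb
    print(hex(image_base + foa_to_rva(sections, 2491)))  # 0x3315bb
```

Returning `None` for an address outside every section keeps a failed lookup distinct from a legitimate result of 0.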