Hook Scanning for Application layer Processes
Application-layer hooking is a programming technique that lets a program define or extend its behavior by running specific code when a particular event occurs. Third-party applications commonly use hooks to implement extensions. An inline hook, however, patches code in memory while the file on disk keeps the original bytes, and this difference makes hooks detectable. The principle of hook scanning is to disassemble the code read from the PE file on disk and compare it with the disassembly of the same code in memory. Where the two differ, and the difference is not explained by relocation, the program has most likely been hooked at that location.
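The comparison principle can be illustrated without a disassembler. The following is a minimal sketch, not the scanner built later in this article: it compares two byte buffers and reports each run of differing bytes. The function name `diff_ranges` and the sample bytes are assumptions made purely for illustration.

```python
def diff_ranges(disk: bytes, memory: bytes):
    """Return (offset, disk_bytes, memory_bytes) for each run of differing bytes."""
    runs = []
    start = None
    limit = min(len(disk), len(memory))
    for i in range(limit):
        if disk[i] != memory[i]:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            runs.append((start, disk[start:i], memory[start:i]))
            start = None
    if start is not None:          # a run that reaches the end of the buffer
        runs.append((start, disk[start:limit], memory[start:limit]))
    return runs


# Hypothetical prologue as stored on disk:
# push ebp; mov ebp, esp; sub esp, 0x1c; push esi
disk_code = bytes.fromhex("558bec83ec1c56")
# The same location in memory after a 5-byte "jmp rel32" (opcode 0xE9) overwrote the start
memory_code = bytes.fromhex("e9fb0f00001c56")

for offset, on_disk, in_memory in diff_ranges(disk_code, memory_code):
    print(hex(offset), on_disk.hex(), in_memory.hex())
```

A byte-level comparison like this only shows where the bytes differ; disassembling both sides, as done in the rest of the article, shows which instructions were replaced.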
This case uses the Capstone engine to implement hook scanning. It was chosen because it provides Python bindings and therefore works easily alongside the LyScript plugin. Capstone is also widely used in reverse engineering, vulnerability analysis, malicious code analysis, and related fields, and tools such as radare2 build on it.
The main features of the Capstone engine include:
- Multiple architectures: supports instruction sets such as x86, ARM, MIPS, and PowerPC, and runs on different platforms.
- Lightweight and efficient: written in C language, the code is concise and efficient, and the disassembly speed is fast.
- Easy to use: Provides easy-to-use APIs and documentation, supporting multiple programming languages such as Python, Ruby, Java, etc.
- Customizability: Provides multiple configurable options to meet the needs of different users.
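Installing the Capstone engine is straightforward: run pip install capstone. To disassemble with Capstone, create a disassembler object and pass it the machine code bytes together with the address of the first instruction, for example md.disasm(HexCode, 0). Capstone does not parse PE files itself, so the bytes must first be extracted from the file.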
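As a minimal sketch of the call described above, the snippet below disassembles three bytes of 32-bit x86 code. The helper name `disassemble`, the sample bytes, and the base address 0x401000 are assumptions for illustration; the guarded import is only there so the snippet fails with a clear message when Capstone is not installed.

```python
try:
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32
except ImportError:  # Capstone is missing; install it with "pip install capstone"
    Cs = None


def disassemble(code: bytes, base: int):
    """Return (address, text) pairs for 32-bit x86 machine code."""
    if Cs is None:
        raise RuntimeError("capstone is not installed")
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    # The second argument of disasm() is the address assigned to the first instruction
    return [(insn.address, f"{insn.mnemonic} {insn.op_str}".strip())
            for insn in md.disasm(code, base)]


if Cs is not None:
    # 55 encodes "push ebp", 8B EC encodes "mov ebp, esp"
    for address, text in disassemble(b"\x55\x8b\xec", 0x401000):
        print(f"{hex(address)} | {text}")
```

Passing the section's virtual address as the second argument makes Capstone report final addresses directly, which is an alternative to adding the base afterwards as the code in this article does.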
Read the disk machine code and disassemble it
The following code first uses the pefile library to load the PE file and obtain its ImageBase, together with the VirtualAddress, Misc_VirtualSize, and PointerToRawData fields of the .text section header. It then calculates the start address StartVA and the end address StopVA of the .text section and reads the section's raw data from the file at the PointerToRawData offset. Finally, the data is disassembled with capstone and returned as a list.
python
from capstone import *
import pefile

# Disassemble the .text section of a file on disk
def DisassemblyFile(FilePath):
    ref_list = []
    # Load the PE file
    pe = pefile.PE(FilePath)
    # Get the preferred base address of the program
    ImageBase = pe.OPTIONAL_HEADER.ImageBase
    # Find the .text section to extract disassembly information
    for item in pe.sections:
        # Section names are 8 bytes long and padded with NUL bytes
        if item.Name.rstrip(b'\x00') == b'.text':
            VirtualAddress = item.VirtualAddress
            VirtualSize = item.Misc_VirtualSize
            ActualOffset = item.PointerToRawData
    # Calculate start and stop virtual addresses of the .text section
    StartVA = ImageBase + VirtualAddress
    StopVA = ImageBase + VirtualAddress + VirtualSize
    # Open the file and read the binary data of the .text section
    with open(FilePath, "rb") as fp:
        fp.seek(ActualOffset)
        HexCode = fp.read(VirtualSize)
    # Initialize Capstone disassembler for x86 (32-bit)
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    # Disassemble the binary code, numbering instructions from offset 0
    for item in md.disasm(HexCode, 0):
        # Calculate the virtual address of each instruction
        addr = hex(int(StartVA) + item.address)
        # Construct a dictionary with address and assembly instruction
        dic = {"Address": str(addr), "Assembly": item.mnemonic + " " + item.op_str}
        # Append the dictionary to the result list
        ref_list.append(dic)
    return ref_list

if __name__ == "__main__":
    # Example usage of DisassemblyFile function
    dasm = DisassemblyFile("C://Win32Project.exe")
    # Print out disassembled instructions
    for index in dasm:
        print("{} | {}".format(index.get("Address"), index.get("Assembly")))
Running the above code prints the disassembly of the entire .text section of Win32Project.exe. Part of the output is shown below:
python
0x401000 | push ebp
0x401001 | mov ebp, esp
0x401003 | sub esp, 0x1c
0x401006 | push esi
0x401007 | mov esi, dword ptr [0x4020e0]
0x40100d | push edi
0x40100e | mov edi, dword ptr [ebp + 8]
0x401011 | push 0x64
0x401013 | push 0x403438
0x401018 | push 0x67
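The addresses in the listing follow directly from the section header arithmetic. The sketch below restates that calculation and adds the reverse mapping from a virtual address back to a file offset. `SectionInfo` is a hypothetical stand-in for the fields that pefile exposes. The size 0xE00 and the raw offset 0x400 are assumed sample values, while ImageBase 0x400000 and VirtualAddress 0x1000 are simply values consistent with the first address 0x401000 shown above.

```python
from dataclasses import dataclass


@dataclass
class SectionInfo:
    """Hypothetical stand-in for the section header fields read via pefile."""
    virtual_address: int      # RVA of the section (VirtualAddress)
    virtual_size: int         # size of the section in memory (Misc_VirtualSize)
    pointer_to_raw_data: int  # file offset of the section data (PointerToRawData)


def section_va_range(image_base: int, sec: SectionInfo):
    """Return (StartVA, StopVA) of a section for a given image base."""
    start_va = image_base + sec.virtual_address
    return start_va, start_va + sec.virtual_size


def va_to_file_offset(va: int, image_base: int, sec: SectionInfo) -> int:
    """Map a virtual address inside the section to its offset in the file."""
    start_va, stop_va = section_va_range(image_base, sec)
    if not start_va <= va < stop_va:
        raise ValueError("address is outside the section")
    return va - start_va + sec.pointer_to_raw_data


text = SectionInfo(virtual_address=0x1000, virtual_size=0x0E00, pointer_to_raw_data=0x400)
start, stop = section_va_range(0x400000, text)
print(hex(start), hex(stop))                             # 0x401000 0x401e00
print(hex(va_to_file_offset(0x401006, 0x400000, text)))  # 0x406
```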
Read memory machine code and disassemble
Next, the machine code of the loaded PE image is read from memory and disassembled with the Capstone engine. In the following example, the DisassemblyMemory function implements the memory disassembly. The script first calls get_memory_localbase to obtain the base address of the program and get_memory_localsize to obtain its length. Inside DisassemblyMemory, get_memory_byte reads the machine code from memory one byte at a time, and md.disasm completes the disassembly.
python
from x32dbg import Debugger
from capstone import *

# Disassemble code read from the memory of the debugged process
def DisassemblyMemory(address, offset, length):
    ref_list = []
    # Store the memory contents as a byte array
    ref_memory_list = bytearray()
    # Read the data byte by byte
    for index in range(offset, length):
        char = dbg.get_memory_byte(address + index)
        ref_memory_list.append(char)
    # Disassemble data in memory
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    # Note: the second argument of disasm() is the address assigned to the first
    # instruction, so 0x1 shifts every reported address by one byte relative to
    # the on-disk listing (which uses 0)
    for item in md.disasm(bytes(ref_memory_list), 0x1):
        addr = hex(int(address) + item.address)
        ref_list.append({"Address": str(addr), "Assembly": item.mnemonic + " " + item.op_str})
    return ref_list

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)
    if False == dbg.connect():
        exit()
    # Read the .text base address from memory
    pe_base = dbg.get_memory_localbase()
    # Read the length of .text in memory
    pe_size = dbg.get_memory_localsize()
    # Get memory disassembly code
    dasm = DisassemblyMemory(pe_base, 0, pe_size)
    # Print out disassembled instructions
    for index in dasm:
        print("{} | {}".format(index.get("Address"), index.get("Assembly")))
    dbg.close()
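Running the above code prints the disassembly of Win32Project.exe as loaded in memory. Comparing the two listings shows that the instruction sequences are identical and differ only in their addresses: the module was loaded at a different base address, and the memory listing is shifted by one further byte because md.disasm is called with 0x1 as the starting address.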
python
0xd61001 | push ebp
0xd61002 | mov ebp, esp
0xd61004 | sub esp, 0x1c
0xd61007 | push esi
0xd61008 | mov esi, dword ptr [0xd620e0]
0xd6100e | push edi
0xd6100f | mov edi, dword ptr [ebp + 8]
0xd61012 | push 0x64
0xd61014 | push 0xd63438
0xd61019 | push 0x67
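The address differences between the two listings can be checked with a little arithmetic. The sketch below uses only values that appear in the listings above and assumes, as is usual for a relocated image, that every absolute address moves by the same amount.

```python
# Absolute operands taken from the two listings above: (on disk, in memory)
operand_pairs = [
    (0x4020E0, 0xD620E0),  # mov esi, dword ptr [...]
    (0x403438, 0xD63438),  # push ...
]

# Every pair differs by the same amount: the distance between the preferred
# image base stored in the file and the base the module was actually loaded at
deltas = {mem - disk for disk, mem in operand_pairs}
assert len(deltas) == 1
load_delta = deltas.pop()
print(hex(load_delta))  # 0x960000

# The instruction addresses differ by the load delta plus one byte, because the
# memory listing was produced with md.disasm(..., 0x1) rather than md.disasm(..., 0)
address_shift = 0xD61001 - 0x401000 - load_delta
print(hex(address_shift))  # 0x1
```

Any difference in an operand that equals this load delta is therefore a relocation artifact rather than evidence of a hook.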
Comparing disk and memory disassembly
Finally, comparing the memory and disk listings shows which locations have been hooked. Before comparing, the linear addresses in memory must be made consistent with those on disk, because the two listings use different base addresses. Two functions are provided: scan_for_hooks_all reports every instruction whose disassembly text differs, while scan_for_hooks first checks that both instructions lie at the same offset from their respective base address.
python
from x32dbg import Debugger
from capstone import *
import pefile

# Disassemble code read from the memory of the debugged process
def DisassemblyMemory(address, offset, length):
    ref_list = []
    # Store the memory contents as a byte array
    ref_memory_list = bytearray()
    # Read the data byte by byte
    for index in range(offset, length):
        char = dbg.get_memory_byte(address + index)
        ref_memory_list.append(char)
    # Disassemble data in memory
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    # Note: the second argument of disasm() is the address assigned to the first
    # instruction, so 0x1 shifts every reported address by one byte relative to
    # the on-disk listing (which uses 0)
    for item in md.disasm(bytes(ref_memory_list), 0x1):
        addr = hex(int(address) + item.address)
        ref_list.append({"Address": str(addr), "Assembly": item.mnemonic + " " + item.op_str})
    return ref_list

# Disassemble the .text section of a file on disk
def DisassemblyFile(FilePath):
    ref_list = []
    # Load the PE file
    pe = pefile.PE(FilePath)
    # Get the preferred base address of the program
    ImageBase = pe.OPTIONAL_HEADER.ImageBase
    # Find the .text section to extract disassembly information
    for item in pe.sections:
        # Section names are 8 bytes long and padded with NUL bytes
        if item.Name.rstrip(b'\x00') == b'.text':
            VirtualAddress = item.VirtualAddress
            VirtualSize = item.Misc_VirtualSize
            ActualOffset = item.PointerToRawData
    # Calculate start and stop virtual addresses of the .text section
    StartVA = ImageBase + VirtualAddress
    StopVA = ImageBase + VirtualAddress + VirtualSize
    # Open the file and read the binary data of the .text section
    with open(FilePath, "rb") as fp:
        fp.seek(ActualOffset)
        HexCode = fp.read(VirtualSize)
    # Initialize Capstone disassembler for x86 (32-bit)
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    # Disassemble the binary code, numbering instructions from offset 0
    for item in md.disasm(HexCode, 0):
        # Calculate the virtual address of each instruction
        addr = hex(int(StartVA) + item.address)
        # Construct a dictionary with address and assembly instruction
        dic = {"Address": str(addr), "Assembly": item.mnemonic + " " + item.op_str}
        # Append the dictionary to the result list
        ref_list.append(dic)
    return ref_list

# Report every instruction whose disassembly text differs (compared by index)
def scan_for_hooks_all(dasm_memory, dasm_file, mem_base, file_base):
    hooks_found = False
    for index in range(min(len(dasm_memory), len(dasm_file))):
        if dasm_memory[index]["Assembly"] != dasm_file[index]["Assembly"]:
            print("Hook found at address: {}".format(dasm_memory[index]["Address"]))
            print("Memory disassembly: {}".format(dasm_memory[index]["Assembly"]))
            print("File disassembly: {}".format(dasm_file[index]["Assembly"]))
            hooks_found = True
    return hooks_found

# Report differences only between instructions at the same relative offset
def scan_for_hooks(dasm_memory, dasm_file, mem_base, file_base):
    hooks_found = False
    for index in range(min(len(dasm_memory), len(dasm_file))):
        # Subtract the memory or file base address to obtain a relative address
        mem_addr = int(dasm_memory[index]["Address"], 16) - mem_base
        file_addr = int(dasm_file[index]["Address"], 16) - file_base
        # Skip if the relative addresses don't match
        if mem_addr != file_addr:
            continue
        if dasm_memory[index]["Assembly"] != dasm_file[index]["Assembly"]:
            print("Hook found at address: {}".format(dasm_memory[index]["Address"]))
            print("Memory disassembly: {}".format(dasm_memory[index]["Assembly"]))
            print("File disassembly: {}".format(dasm_file[index]["Assembly"]))
            hooks_found = True
    return hooks_found

if __name__ == "__main__":
    dbg = Debugger(address="127.0.0.1", port=6589)
    if False == dbg.connect():
        exit()
    # Read the .text base address from memory
    pe_base = dbg.get_memory_localbase()
    # Read the length of .text in memory
    pe_size = dbg.get_memory_localsize()
    # Get memory disassembly code
    dasm_memory = DisassemblyMemory(pe_base, 0, pe_size)
    # Get file disassembly code
    dasm_file = DisassemblyFile("C://Win32Project.exe")
    # Use the first address of each listing as its base so that the relative
    # offsets computed in scan_for_hooks line up
    mem_base = int(dasm_memory[0]["Address"], 16)
    file_base = int(dasm_file[0]["Address"], 16)
    # Scan for hooks
    hooks_found = scan_for_hooks_all(dasm_memory, dasm_file, mem_base, file_base)
    if not hooks_found:
        print("No hooks found.")
    dbg.close()
First, run the scan_for_hooks_all function, which prints every differing instruction together with its memory address. Because the base addresses in the file and in memory differ, this method also reports instructions that differ only in a relocated address, as shown below:
python
Hook found at address: 0xd61008
Memory disassembly: mov esi, dword ptr [0xd620e0]
File disassembly: mov esi, dword ptr [0x4020e0]
Hook found at address: 0xd61014
Memory disassembly: push 0xd63438
File disassembly: push 0x403438
Hook found at address: 0xd61020
Memory disassembly: push 0xd63370
File disassembly: push 0x403370
Hook found at address: 0xd6102c
Memory disassembly: call 0xf1
File disassembly: call 0xf0
Hook found at address: 0xd61036
Memory disassembly: nop
File disassembly: push 0
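Several of the entries above differ only in an absolute address that moved with the module. One way to filter them out, shown here as a rough sketch rather than the method used by the script in this article, is to rewrite image-relative addresses as offsets before comparing the text. The names `normalize` and `is_relocation_only`, the in-memory image base 0xD60000, and the image size 0x10000 are assumptions for illustration. The regular expression is deliberately naive and treats any hex literal inside the image range as an address.

```python
import re

HEX_LITERAL = re.compile(r"0x[0-9a-fA-F]+")


def normalize(assembly: str, base: int, size: int) -> str:
    """Rewrite hex literals inside [base, base + size) as base-relative offsets."""
    def rebase(match):
        value = int(match.group(0), 16)
        if base <= value < base + size:
            return "base+" + hex(value - base)
        return match.group(0)
    return HEX_LITERAL.sub(rebase, assembly)


def is_relocation_only(mem_asm, file_asm, mem_image_base, file_image_base, image_size):
    """True if both instructions are equal once image addresses are rebased."""
    return (normalize(mem_asm, mem_image_base, image_size)
            == normalize(file_asm, file_image_base, image_size))


MEM_BASE, FILE_BASE, IMAGE_SIZE = 0xD60000, 0x400000, 0x10000  # assumed values

pairs = [
    ("push 0xd63438", "push 0x403438"),  # relocated address
    ("nop", "push 0"),                   # genuinely different instruction
]
for mem_asm, file_asm in pairs:
    print(mem_asm, "|", file_asm, "|",
          is_relocation_only(mem_asm, file_asm, MEM_BASE, FILE_BASE, IMAGE_SIZE))
```

The sketch does not cover the call 0xf1 / call 0xf0 entry, which differs because the two listings were disassembled with different starting addresses rather than because of relocation.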
The scan_for_hooks function compares only instructions located at the same offset from their respective base address. Once the differences caused by relocated addresses are excluded, the remaining output is as follows:
python
Hook found at address: 0xd61037
Memory disassembly: nop
File disassembly: push 0