Disassembly interface

Disassembly converts machine code or compiled binaries back into human-readable assembly code. Assembly is a low-level language that maps closely to hardware instructions, yet it is far easier to read and analyze than raw machine code. Through disassembly, programmers can understand a program's internal workings, diagnose problems, and reverse engineer compiled software. It is used to analyze, modify, and optimize compiled programs, and is one of the core tools of software reverse engineering and security research.

Disassembly

get_disassembly_line

This series provides several disassembly functions. The first is get_disassembly_line, which takes a memory address as an integer and disassembles the instruction at that address. On success it returns a dictionary containing the instruction's address, its assembly text, and its size in bytes, so you can pick out whichever fields you need. On failure it returns False.

python
>>> eip = dbg.get_eip()
>>> hex(eip)
'0x1915bb'
>>>
>>> dasm = dbg.get_disassembly_aline(eip)
>>> dasm
{'Address': 1643963, 'Assembly': 'call 0x001918A1', 'Size': 5}
>>>
>>> hex(dasm.get("Address"))
'0x1915bb'
>>> dasm.get("Assembly")
'call 0x001918A1'
>>> dasm.get("Size")
5
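Because the call returns either a dictionary or a failure value, it can be convenient to wrap the result handling in a small helper. The following is a minimal sketch, not part of the library: `format_line` is a hypothetical name, and the sample dictionary is copied from the session above so the snippet runs without a debugger attached.

```python
def format_line(dasm):
    """Format one disassembly result as 'address | assembly | size'.

    `dasm` is a dictionary shaped like the one shown above, or a falsy
    value when disassembly failed. Illustrative helper, not library API.
    """
    if not dasm:
        return "<disassembly failed>"
    return "{:08x} | {} | {}".format(
        dasm["Address"], dasm["Assembly"], dasm["Size"]
    )


# Sample value copied from the session above.
sample = {'Address': 1643963, 'Assembly': 'call 0x001918A1', 'Size': 5}
print(format_line(sample))
print(format_line(False))
```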

get_disassembly_type

For more detail about an instruction, use get_disassembly_type, which also takes a memory address. On success it returns a dictionary in which IsBranch indicates whether the instruction is a branch, IsCall indicates whether it is a call, and Type gives the type of the instruction.

python
>>> eip = dbg.get_eip()
>>>
>>> dasm = dbg.get_disassembly_type(eip)
>>> dasm
{
	'Address': 1643963, 
	'Assembly': 'call 0x001918A1', 
	'Size': 5, 
	'IsBranch': 1, 
	'IsCall': 1, 
	'Type': 4
}
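One way to use these flags is to sort instructions into coarse control-flow categories. The sketch below assumes only the IsBranch and IsCall fields shown above. `describe_flow` is a hypothetical helper, and it deliberately ignores Type because the meaning of its numeric values is not documented here. In the sample a call sets both flags, so the helper checks IsCall first.

```python
def describe_flow(info):
    """Return 'call', 'branch', or 'other' based on the IsCall/IsBranch flags.

    IsCall is tested first because the sample output above sets both
    IsBranch and IsCall for a call instruction.
    """
    if info.get("IsCall"):
        return "call"
    if info.get("IsBranch"):
        return "branch"
    return "other"


# Sample value copied from the session above.
sample = {
    'Address': 1643963,
    'Assembly': 'call 0x001918A1',
    'Size': 5,
    'IsBranch': 1,
    'IsCall': 1,
    'Type': 4,
}
print(describe_flow(sample))
```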

get_disassembly_count

To disassemble several instructions at once, call get_disassembly_count with a starting address and an instruction count. It disassembles that many consecutive instructions, starting at the given address and moving toward higher addresses, and returns one dictionary per instruction.

python
>>> eip = dbg.get_eip()
>>> dasm = dbg.get_disassembly_count(eip,10)
>>>
>>> for index in range(0,9):
...     address = dasm[index].get("Address")
...     assembly = dasm[index].get("Assembly")
...     size = dasm[index].get("Size")
...     print("{:08x} | {:40} | {}".format(address,assembly,size))
...
001915bb | call 0x001918A1                          | 5
001915c0 | jmp 0x0019140E                           | 5
001915c5 | push ebp                                 | 1
001915c6 | mov ebp, esp                             | 2
001915c8 | call dword ptr ds:[0x00192018]           | 6
001915ce | push 0x01                                | 2
001915d0 | mov dword ptr ds:[0x00193354], eax       | 5
001915d5 | call 0x00191B2C                          | 5
001915da | push dword ptr ss:[ebp+0x08]             | 3
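A listing like this can be sanity-checked by confirming that each instruction begins where the previous one ends. The sketch below rebuilds the nine entries printed above as plain dictionaries, so it runs without a debugger, and iterates over the entries directly instead of by index. `is_contiguous` is a hypothetical helper name.

```python
def is_contiguous(listing):
    """Return True if every entry starts exactly where the previous one ends."""
    return all(
        cur["Address"] + cur["Size"] == nxt["Address"]
        for cur, nxt in zip(listing, listing[1:])
    )


# Entries copied from the output above.
listing = [
    {"Address": 0x001915BB, "Assembly": "call 0x001918A1", "Size": 5},
    {"Address": 0x001915C0, "Assembly": "jmp 0x0019140E", "Size": 5},
    {"Address": 0x001915C5, "Assembly": "push ebp", "Size": 1},
    {"Address": 0x001915C6, "Assembly": "mov ebp, esp", "Size": 2},
    {"Address": 0x001915C8, "Assembly": "call dword ptr ds:[0x00192018]", "Size": 6},
    {"Address": 0x001915CE, "Assembly": "push 0x01", "Size": 2},
    {"Address": 0x001915D0, "Assembly": "mov dword ptr ds:[0x00193354], eax", "Size": 5},
    {"Address": 0x001915D5, "Assembly": "call 0x00191B2C", "Size": 5},
    {"Address": 0x001915DA, "Assembly": "push dword ptr ss:[ebp+0x08]", "Size": 3},
]

# Iterate over the entries themselves; no index bookkeeping is needed.
for entry in listing:
    print("{:08x} | {:40} | {}".format(
        entry["Address"], entry["Assembly"], entry["Size"]))

total_bytes = sum(entry["Size"] for entry in listing)
print("contiguous:", is_contiguous(listing), "bytes:", total_bytes)
```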

get_disassembly_

The functions above are wrappers that bundle several fields together. If you need only specific values, call the lower-level functions below and combine them as needed:

  • get_disassembly_text takes a memory address as an integer and returns the disassembly text of the instruction at that address.
  • get_disassembly_size takes a memory address as an integer and returns the length in bytes of the machine code at that address.
  • get_disassembly_operand takes a memory address as an integer and returns the operand of the instruction at that address.
  • get_branch_destination takes a memory address as an integer and returns the destination of the call or jmp instruction at that address.

python
>>> eip = dbg.get_eip()
>>>
>>> dbg.get_disassembly_text(eip)
'call win32project.1918A1'
>>>
>>> dbg.get_disassembly_size(eip)
5
>>>
>>> dbg.get_disassembly_operand(eip)
{'Operands': 1644705, 'Size': 0}
>>>
>>> dbg.get_branch_destination(eip)
1644705
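As an illustration of combining these primitives, the sketch below walks forward through consecutive instructions using only a text lookup and a size lookup. `walk` is a hypothetical helper. It takes the two lookups as callables (for example dbg.get_disassembly_text and dbg.get_disassembly_size) and assumes that a failed size lookup yields a falsy value. The stand-in dictionary lets it run without a debugger.

```python
def walk(get_text, get_size, start, count):
    """Collect up to `count` (address, text, size) tuples starting at `start`.

    `get_text` and `get_size` are callables that take an integer address.
    The walk stops early if a size lookup returns a falsy value
    (an assumption about how failure is reported).
    """
    rows = []
    address = start
    for _ in range(count):
        size = get_size(address)
        if not size:
            break
        rows.append((address, get_text(address), size))
        address += size  # the next instruction starts right after this one
    return rows


# Stand-in data so the sketch runs without a debugger attached.
fake = {
    0x1915BB: ("call 0x001918A1", 5),
    0x1915C0: ("jmp 0x0019140E", 5),
    0x1915C5: ("push ebp", 1),
}

rows = walk(
    lambda addr: fake[addr][0],
    lambda addr: fake.get(addr, ("", 0))[1],
    0x1915BB,
    10,
)
for address, text, size in rows:
    print("{:08x} | {:20} | {}".format(address, text, size))
```

With a live session, the same function could be called as walk(dbg.get_disassembly_text, dbg.get_disassembly_size, eip, 10), assuming both functions behave as described above.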

Assembly

assemble_code_hex

The assembly functions translate an assembly instruction supplied by the user into hexadecimal machine code. To get the full details of the encoded instruction, call assemble_code_hex. It takes the instruction as a string and, on success, returns a dictionary containing the instruction text, its machine code as a hex string, and its size in bytes.

python
>>> dbg.assemble_code_hex("push esi")
{'Assembly': 'push esi', 'Hex': '56', 'Size': 1}
>>>
>>> dbg.assemble_code_hex("push edi")
{'Assembly': 'push edi', 'Hex': '57', 'Size': 1}
>>>
>>> assembly = dbg.assemble_code_hex("mov eax,1")
>>>
>>> assembly.get("Hex")
'B801000000'
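The Hex field is a plain hexadecimal string, so the standard library can turn it into raw bytes. The sketch below uses a hypothetical `hex_to_bytes` helper that also checks the byte count against Size, with sample values copied from the session above. It then decodes the immediate operand of the mov eax,1 encoding, where the B8 opcode is followed by a 32-bit little-endian value.

```python
def hex_to_bytes(assembled):
    """Convert the 'Hex' field to bytes and check it against 'Size'."""
    raw = bytes.fromhex(assembled["Hex"])
    if len(raw) != assembled["Size"]:
        raise ValueError("Hex length does not match Size")
    return raw


# Sample value copied from the session above.
sample = {'Assembly': 'push esi', 'Hex': '56', 'Size': 1}
print(hex_to_bytes(sample))

# 'B801000000' is the Hex value shown above for "mov eax,1":
# opcode B8 followed by a 32-bit little-endian immediate.
raw = bytes.fromhex("B801000000")
immediate = int.from_bytes(raw[1:5], "little")
print(len(raw), immediate)
```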

assemble_code_size

If you only need the length of an instruction and nothing else, use assemble_code_size. It takes the instruction as a string and returns its encoded length in bytes.

python
>>> dbg.assemble_code_size("mov eax,1")
5
>>> dbg.assemble_code_size("jmp esp")
2
>>> dbg.assemble_code_size("push ecx")
1
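A common use for instruction lengths is working out how much padding a patch needs when a shorter instruction replaces a longer one. The sketch below is a hypothetical `nop_padding` helper that assumes a single-byte nop. The sizes in the example come from the sessions above: a 5-byte call being replaced by the 2-byte jmp esp.

```python
def nop_padding(old_size, new_size):
    """Return how many 1-byte NOPs are needed after a shorter replacement.

    Raises ValueError if the replacement would overrun the original
    instruction and clobber the bytes that follow it.
    """
    if new_size > old_size:
        raise ValueError("replacement is longer than the original instruction")
    return old_size - new_size


# Replacing a 5-byte call with the 2-byte "jmp esp" from the session above.
padding = nop_padding(5, 2)
print(padding)
```

With a live session, the two sizes could come from get_disassembly_size for the existing instruction and assemble_code_size for the replacement.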

assemble_write_memory

To write an assembled instruction into memory, use either assemble_at or assemble_write_memory. Both take a memory address and an assembly instruction, assemble the instruction, and write the resulting machine code to that address.

python
>>> eip = dbg.get_eip()
>>> hex(eip)
'0x7733f127'
>>>
>>> dbg.assemble_at(eip,"nop")
True
>>> dbg.assemble_write_memory(eip,"mov eax,1")
True
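To write several instructions back to back, the write call can be paired with assemble_code_size to advance the address. The following is a sketch under stated assumptions, not the library's own method. `write_sequence` is a hypothetical helper that expects an object exposing assemble_write_memory and assemble_code_size as used in this section, and it assumes that the reported size matches the number of bytes actually written at each address.

```python
def write_sequence(dbg, start, instructions):
    """Assemble `instructions` one after another, starting at `start`.

    Returns the address just past the last instruction written, or None
    if any write reports failure. `dbg` is assumed to provide
    assemble_write_memory(address, text) and assemble_code_size(text).
    """
    address = start
    for text in instructions:
        if not dbg.assemble_write_memory(address, text):
            return None
        # Advance by the encoded length so the next write does not overlap.
        address += dbg.assemble_code_size(text)
    return address
```

A call such as write_sequence(dbg, eip, ["mov eax,1", "push ecx"]) would then place the second instruction five bytes after the first. Instructions whose encoding depends on their address, such as relative jumps, may need extra care because their size can vary with the target distance.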