Reference：

https://arttnba3.cn/2022/11/24/ANGR-0X00-ANGR_CTF/

angr_ctf 笔记

一、环境搭建

安装依赖：

sudo apt-get install gcc-multilib

安装angr：

sudo pip3 install angr

题目环境：

git clone https://github.com/jakespringer/angr_ctf.git

使用说明：

cd 00_angr_find/
sudo python3 generate.py
	Usage: ./generate.py [seed] [output_file]
sudo python3 generate.py 654123 00_angr_find

二、angr-基本用法

>>> project = angr.Project(path_to_file)	#第一步就是创建一个 angr.Project 类
>>> project.arch			#返回程序使用的指令架构
# <Arch X86 (LE)>
>>> hex(project.entry)		#返回程序入口点
'0x80490b0'
>>> project.filename		#返回程序名
'./00_angr_find'

factory-实用类工厂

project.factory 提供了一些类的构造器

（1）block

angr 以基本块为单位分析代码，可以通过 project.factory.block(address) 获取给定地址所在的基本块，一个 Block 类实例：

>>> block = project.factory.block(project.entry)
>>> block
<Block for 0x80490b0, 20 bytes>
>>> block.pp()
         _start:
80490b0  endbr32 
80490b4  xor     ebp, ebp
80490b6  pop     esi
80490b7  mov     ecx, esp
80490b9  and     esp, 0xfffffff0
80490bc  push    eax
80490bd  push    esp
80490be  push    edx
80490bf  call    0x80490dd
>>> block.instructions		#指令个数
9
>>> block.instruction_addrs
(134516912, 134516916, 134516918, 134516919, 134516921, 134516924, 134516925, 134516926, 134516927)					#指令对应地址
>>> 

（2）state - 模拟执行状态

angr 使用 SimState 类表示一个 模拟的程序状态 （simulated program state），各种操作实际上是由一个 state 步进到另一个 state 的过程

project.factory.entry_state() ：获取一个程序的初始执行状态

project.factory.blank_state(addr) ：获取一个程序从指定地址开始执行的空白状态

>>> state = project.factory.entry_state()	#获得程序初始执行状态
# 寄存器状态组，可以通过寄存器名称来访问对应的寄存器
>>> state.regs
<angr.state_plugins.view.SimRegNameView object at 0x7fe6a3cccd00>
>>> state.regs.esp			# visit esp		
<BV32 0x7ffeffac>
>>> state.regs.ebp			# visit ebp		
<BV32 0x0>
>>> state.regs.eip			# visit eip
<BV32 0x80490b0>
>>> state.mem[0x80490B0].long	# 通过 state.mem[addr].type 完成内存访问
<long (32 bits) <BV32 0xfb1e0ff3> at 0x80490b0>
>>> state.memory.load(0x80490b0, 8) # 获取该地址上指定大小的位向量
<BV64 0xf30f1efb31ed5e89>
>>> state.posix						# POSIX 相关的环境接口
<angr.state_plugins.posix.SimSystemPosix object at 0x7fe6a3ccca60>

state.regs：寄存器状态组，其中每个寄存器都为一个 位向量 （BitVector），可以通过寄存器名称来访问对应的寄存器（例如 state.regs.esp -= 12 ）

state.mem：该状态的内存访问接口，可以直接通过 state.mem[addr].type 完成内存访问（例如 state.mem[0x1000].long = 4 ，对于读而言还需指定 .resolved 或 .concrete 表示位向量或是实际值，例如 state.mem[0x1000].long.concrete）

state.memory：另一种形式的内存访问接口：

state.memory.load(addr, size_in_bytes) ：获取该地址上指定大小的位向量
state.memory.store(addr, bitvector) ：将一个位向量存储到指定地址

state.posix：POSIX 相关的环境接口，例如 state.posix.dumps(fileno) 获取对应文件描述符上的流

（3）simulation_manager - 模拟执行器

angr 将一个状态的执行方法独立成一个 SimulationManager 类，以下两种写法等效：

>>> simgr = project.factory.simgr(state)
>>> simgr = project.factory.simulation_manager(state)

simgr.step()：以基本块为单位的单步执行

simgr.explore(addr)：路径探索，即执行到指定地址并进行约束求解，将执行完成的状态放在 simgr.found

simgr.found ：若无法求解则该列表为空

00_angr_find

源程序中会对我们输入的s1进行简单的加密（complex_function），使得加密后的结果需要等于 “RIMNUAWT”，所以我们通过 angr 路径探索，直接将目的地址设置为 puts(“Good Job.”)，然后求解即可。

最终效果如下所示：

import angr
import sys

path_to_file = './00_angr_find'

def main():
    # Create an Angr project, subsequent operations are built upon it
    project = angr.Project(path_to_file)

    # these two options make angr's result more efficient and accurate
    initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # initial_state = project.factory.entry_state()

    # Create a simulation manager initialized with the starting state. 
    simulation = project.factory.simgr(initial_state)

    # input addr we want to reach
    print_good_address = 0x80492F8
    
    # This function will keep executing until it either finds a solution or it 
    # has explored every possible path through the executable.
    simulation.explore(find = print_good_address)

    # Check that we have found a solution
    if simulation.found:
        # The explore method stops after it finds a single state that arrives at the
        # target address.
        solution_state = simulation.found[0]
        # Print the string that Angr wrote to stdin to follow solution_state. This 
        # is our solution.
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

01_angr_avoid

相较于前一题，这题增加了 avoid 参数的使用

通常使用模拟器的 .explore() 方法来进行路径探索，传入的默认参数为 find：

find 参数：一个/一组令模拟器终止运行的地址，符合的执行状态结果会被放到 .found 列表中

avoid 参数：模拟器运行中应当要避开的地址，当一个状态执行到这样的地址时，其会被放在 .avoided 列表中并不再往后执行

num_find 参数：指定需要寻找的解状态的数量，若未指定则会在 .found 列表中存储所有的解状态

main函数过大不能反编译，但可以通过汇编找到关键逻辑

后面会发现有很多调用 avoid_me 的函数调用，

为使我们能够输出 Good Job，因此 should_succeed 应该为 1，而 avoid_me 会将该值置为 0，所以这里我们需要避免执行 avoid_me 函数。

在前一题的基础上加上如下代码即可：

avoid_path_address = 0x08049243
simulation.explore(find = print_good_address, avoid = avoid_path_address)

exp

import angr
import sys

path_to_file = './01_avoid'

def main():
    project = angr.Project(path_to_file)
    initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)
    print_good_address = 0x08049280
    avoid_path_address = 0x08049243
    simulation.explore(find = print_good_address, avoid = avoid_path_address)
    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

02_angr_find_condition

explore 中的参数 find 和 avoid 可以不只是目标地址，而且还可以是自定义函数，参数为模拟状态，返回值为判定条件的布尔值。

若要寻找一条输出指定字符串的路径，可以选择通过判断该字符串是否在输出中的方式，可以通过 state.posix.dumps(文件描述符) 来获取对应文件描述符上的字符流：

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

simulation.explore(find = find_path, avoid = avoid_path)

与第一题类似，但是从反汇编可以看到有很多的路径

这导致我们想要依靠指定一个目标地址，然后探索出对应的路径并不现实，所以这里我们就需要通过自定义函数来设置路径探索条件。

import angr
import sys

path_to_file = './02_find_condition'

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    project = angr.Project(path_to_file)
    initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)
    # print_good_address = 0x08049280
    simulation.explore(find = find_path, avoid = avoid_path)
    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

三、angr-符号化

Claripy - angr 的求解引擎

（1）bitvector - 位向量

位向量（bitvector）是 angr 求解引擎中的一个重要部分，其表示了 一组位 （a sequence of bits）

可以通过 claripy.BVV(int_value, size_in_bits) 或 claripy.BVV(string_value) 创建带有具体值（concrete value）的指定长度的位向量值（bitvector value）：

>>> import claripy
>>> bvv = claripy.BVV(b'bsd_crow')
>>> bvv
<BV64 0x6273645f63726f77>
>>> bvv = claripy.BVV(0xdeadbeef, 32)
>>> bvv
<BV32 0xdeadbeef>

相同长度的位向量可以进行运算，对于不同长度的位向量则可以通过

.zero_extend(extended_bits) ：完成位扩展（0填充）后进行运算，位运算也会存在溢出

>>> bvv = claripy.BVV(b'bsd_crow')
>>> bvv2 = claripy.BVV(0xdeadbeef, 32)
>>> bvv2 = bvv2.zero_extend(32)
>>> bvv + bvv
<BV64 0xc4e6c8bec6e4deee>
>>> bvv + bvv2
<BV64 0x6273646042202e66>
>>> bvv * bvv2
<BV64 0x1aecf1db4bfb6219>

位向量除了代表具体值（concrete value）的 bitvector value 外，还有代表符号变量（symbolic variable）的 bitvector symbol

claripy.BVS(name, size_in_bits) ：创建带名字的指定长度的位向量符号（bitvector symbol）

>>> bvs = claripy.BVS('x', 32)
>>> bvs2 = claripy.BVS('y', 32)
>>> bvs + bvs2
<BV32 x_0_32 + y_1_32>
>>> (bvs - bvs2)/bvs
<BV32 (x_0_32 - y_1_32) / x_0_32>

可以通过 .op 与 .args 获得位向量的运算类型与参数：

>>> bvv.op
'BVV'
>>> bvs.op
'BVS'
>>> bvs.args
('x_0_32', None, None, None, False, False, None)
>>> bvv.args
(7094124198891777911, 64)

state.solver - 状态的求解接口

在对位向量符号进行具体值求解的时候，可以将位向量符号放到状态内存或寄存器中，在完成路径探索后，就可以使用 solution_state.solver.eval 来获取对应状态下的值。

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers

# set the init_state's register to corresponding symbolic variables
initial_state.regs.eax = password_0

solution_0 = solution_state.solver.eval(password_0)

03_angr_symbolic_register

符号化寄存器

initial_state.regs.eax = password_0

创建一个从 get_user_input 之后开始执行的空白状态，将对应的寄存器设置为位向量符号，然后求解并获取对应符号变量值。

import angr
import sys
import claripy

path_to_file = './03_symbolic_registers'

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    project = angr.Project(path_to_file)
    start_addr = 0x0804958A	# set addr after get_user_input
    initial_state = project.factory.entry_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # initial_state = project.factory.blank_state(addr = start_addr)

    # create symbolic variables with claripy
    password_0 = claripy.BVS('password_0', 32) # 32-bit registers
    password_1 = claripy.BVS('password_1', 32) # 32-bit registers
    password_2 = claripy.BVS('password_2', 32) # 32-bit registers
    
    # set the init_state's register to corresponding symbolic variables
    initial_state.regs.eax = password_0
    initial_state.regs.ebx = password_1
    initial_state.regs.edx = password_2

    # solve question
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]
        # get answer from solver.eval
        solution_0 = solution_state.solver.eval(password_0)
        solution_1 = solution_state.solver.eval(password_1)
        solution_2 = solution_state.solver.eval(password_2)

        print("password_0 ==> {}".format(hex(solution_0)))
        print("password_1 ==> {}".format(hex(solution_1)))
        print("password_2 ==> {}".format(hex(solution_2)))
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

04_angr_symbolic_stack

如题目所示这道题需要我们将栈设置为符号变量进行求解

initial_state.stack_push(password_0)

程序还是很简单的，输入两个数进行约束求解，虽然用前面的方法也能进行解决，但是这道题我们主要学习如何将栈空间设置为符号变量，需要注意的是当我们通过下面命令创建一个空白状态时，在必要时我们需要自行设置当前状态环境，以便符号执行。

initial_state = project.factory.blank_state(addr = start_addr)

假设我们这里需要从 scanf 后的 add esp, 10h 开始符号执行，那我们就相当于跳过了 scanf 的执行，需要我们设置一下当前的栈空间。

关于这一点源文件 scanfold04 中做了详细介绍，我这里就简单说明一下

# We are jumping into the middle of a function! Therefore, we need to account
# for how the function constructs the stack. The second instruction of the
# function is:
#   mov    %esp,%ebp
# At which point it allocates the part of the stack frame we plan to target:
#   sub    $0x18,%esp
# Note the value of esp relative to ebp. The space between them is (usually)
# the stack space. Since esp was decreased by 0x18
#
#        /-------- The stack --------\
# ebp -> |                           |
#        |---------------------------|
#        |                           |
#        |---------------------------|
#         . . . (total of 0x18 bytes)
#         . . . Somewhere in here is
#         . . . the data that stores
#         . . . the result of scanf.
# esp -> |                           |
#        \---------------------------/

因为我们将 start_addr 直接设置到了 main 函数中间，所以我们需要从分析 main 开始设置正确的栈空间排布，

.text:08049385 89 E5                         mov     ebp, esp
.text:08049387 83 EC 18                      sub     esp, 18h

从 start_addr 开始时，对应的栈排布为

#            /-------- The stack --------\
# ebp ->     |          padding          |
#            |---------------------------|
# ebp - 0x01 |       more padding        |
#            |---------------------------|
# ebp - 0x02 |     even more padding     |
#            |---------------------------|
#                        . . .               <- How much padding? Hint: how
#            |---------------------------|      many bytes is password0?
# ebp - 0x0b |   password0, second byte  |
#            |---------------------------|
# ebp - 0x0c |   password0, first byte   |
#            |---------------------------|
# ebp - 0x0d |   password1, last byte    |
#            |---------------------------|
#                        . . .
#            |---------------------------|
# ebp - 0x10 |   password1, first byte   |
#            |---------------------------|
#                        . . .
#            |---------------------------|
# esp ->     |                           |
#            \---------------------------/

我们这里需要准确找到两个变量的正确位置，从而来设置符号变量

反应在 angr 中的代码为：

# Since we are starting after scanf, we are skipping this stack construction
# step. To make up for this, we need to construct the stack ourselves.
initial_state.regs.ebp = initial_state.regs.esp

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers
password_1 = claripy.BVS('password_1', 32) # 32-bit registers
# As you can see, the call to scanf looks like this:
# scanf(  0x80489c3,   ebp - 0xc,   ebp - 0x10  )
#      format_string    password0    password1
# From this, we can construct our new
# that's why we decrease the esp 0x8
initial_state.regs.esp -= 0x8

# set the init_state's stack to correspond symbolic variables
initial_state.stack_push(password_0)
initial_state.stack_push(password_1)
# restore esp 
initial_state.regs.esp -= 0x18   # equal sub rsp, 18h after the instruction of add  esp, 10h

import angr
import sys
import claripy

path_to_file = './04_symbolic_stack'

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    project = angr.Project(path_to_file)
    start_addr = 0x0804939F
    initial_state = project.factory.blank_state(addr = start_addr)

    # Since we are starting after scanf,we are skipping this stack construction
    # step. To make up for this, we need to construct the stack ourselves.
    initial_state.regs.ebp = initial_state.regs.esp

    # create symbolic variables with claripy
    password_0 = claripy.BVS('password_0', 32) # 32-bit registers
    password_1 = claripy.BVS('password_1', 32) # 32-bit registers

    # As you can see, the call to scanf looks like this:
    # scanf(  0x80489c3,   ebp - 0xc,   ebp - 0x10  )
    #      format_string    password0    password1
    # From this, we can construct our new
    # that's why we decrease the esp 0x8
    initial_state.regs.esp -= 0x8

    # set the init_state's stack to correspond symbolic variables
    initial_state.stack_push(password_0)
    initial_state.stack_push(password_1)

    # restore esp 
    initial_state.regs.esp -= 0x18

    # solve question
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]
        # get answer from solver.eval
        solution_0 = solution_state.solver.eval(password_0)
        solution_1 = solution_state.solver.eval(password_1)

        print("password_0 ==> {}".format(solution_0))
        print("password_1 ==> {}".format(solution_1))
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

05_angr_symbolic_memory

顾名思义即符号化内存，可以使用 state.memory 的对应接口进行操作：

state.memory.load(addr, size_in_bytes) ：获取该地址上指定大小的位向量
state.memory.store(addr, bitvector) ：将一个位向量存储到指定地址
solution_state.solver.eval(bitvector, cast_to=bytes).decode()：获取指定符号的值

要求输入四个值，每个值占 8 个字节。

这里唯一需要注意的一点是，要将 start_addr 设置为紧接着 scanf 之后的一条指令的地址

import angr
import sys
import claripy

path_to_file = './05_symbolic_memory'

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    project = angr.Project(path_to_file)
    start_addr = 0x08049299 # first insn after scanf()
    initial_state = project.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # initial_state = project.factory.entry_state(addr = start_addr)

    # create symbolic variables with claripy
    password_0 = claripy.BVS('password_0', 64) # 32-bit registers
    password_1 = claripy.BVS('password_1', 64) # 32-bit registers
    password_2 = claripy.BVS('password_2', 64) # 32-bit registers
    password_3 = claripy.BVS('password_3', 64) # 32-bit registers

    # set symbolic memory variable
    initial_state.memory.store(0x0957F640, password_0)
    initial_state.memory.store(0x0957F648, password_1)
    initial_state.memory.store(0x0957F650, password_2)
    initial_state.memory.store(0x0957F658, password_3)

    # solve question
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]
        # get answer from solver.eval
        solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()
        solution_1 = solution_state.solver.eval(password_1, cast_to=bytes).decode()
        solution_2 = solution_state.solver.eval(password_2, cast_to=bytes).decode()
        solution_3 = solution_state.solver.eval(password_3, cast_to=bytes).decode()

        print("password_0 ==> {}".format(solution_0))
        print("password_1 ==> {}".format(solution_1))
        print("password_2 ==> {}".format(solution_2))
        print("password_3 ==> {}".format(solution_3))
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

06_angr_symbolic_dynamic_memory

与上一题类似但是这题将值保存在了堆中，虽然我们不知道堆地址但是无妨，因为符号执行压根不会去执行程序，因此我们这里直接设置一个 fake 地址，也没有任何关系，buffer0 指针依旧是在 bss 上的，我们链接起来就好

buf0_addr = 0x9CECEA8
fake_chunk0 = 0xdeadbee0
# set buf_addr ==> fake_chunk0, because the program will not excute really
initial_state.memory.store(buf0_addr, fake_chunk0, endness=project.arch.memory_endness)
# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 64) # 32-bit registers

# set symbolic memory variable
initial_state.memory.store(fake_chunk0, password_0)

注意事项：试想如果我们这里 buffer0 是在栈上怎么办，那么 04 就可以派上用场，依旧能解决问题。

import angr
import sys
import claripy


def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    path_to_file = './06_symbolic_dynamic_memory'
    project = angr.Project(path_to_file)
    
    start_addr = 0x08049299 # first insn after scanf()
    initial_state = project.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # initial_state = project.factory.entry_state(addr = start_addr)
    buf0_addr = 0x9CECEA8
    buf1_addr = 0x9CECEAC
    fake_chunk0 = 0xdeadbee0
    fake_chunk1 = 0xdeadbef0
    # set buf_addr ==> fake_chunk0, because the program will not excute really
    initial_state.memory.store(buf0_addr, fake_chunk0, endness=project.arch.memory_endness)
    initial_state.memory.store(buf1_addr, fake_chunk1, endness=project.arch.memory_endness)


    # create symbolic variables with claripy
    password_0 = claripy.BVS('password_0', 64) # 32-bit registers
    password_1 = claripy.BVS('password_1', 64) # 32-bit registers

    # set symbolic memory variable
    initial_state.memory.store(fake_chunk0, password_0)
    initial_state.memory.store(fake_chunk1, password_1)

    # solve question
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]
        # get answer from solver.eval
        solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()
        solution_1 = solution_state.solver.eval(password_1, cast_to=bytes).decode()

        print("password_0 ==> {}".format(solution_0))
        print("password_1 ==> {}".format(solution_1))
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

07_angr_symbolic_file

即符号化文件，关键代码如下：

angr.SimFile()：来创建一个模拟文件

state.fs.insert(sim_file) ：将 SimFile 插入到一个状态的文件系统中

symbolic_file_size_bytes = 0x40

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', symbolic_file_size_bytes * 8) # 0x40 file_size
filename = 'EZJXEZBK.txt'
password_file = angr.storage.SimFile(filename, password_0, size = symbolic_file_size_bytes)

# load the SimFile
initial_state.fs.insert(filename, password_file)

通过上面的代码可以通过 angr 创建一个模拟文件，从而用来符号执行

ignore_me 会将输入的 buffer 内容读入到文件中，我们这里所做的就是需要模拟这个文件，start_addr 我们设置在 ignore_me 的后一条汇编指令。

exp

import angr
import sys
import claripy


def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
    path_to_file = './07_symbolic_file'
    project = angr.Project(path_to_file)
    
    start_addr = 0x0804946C # first insn returned from ignoreme()
    initial_state = project.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # initial_state = project.factory.entry_state(addr = start_addr)
    symbolic_file_size_bytes = 0x40

    # create symbolic variables with claripy
    password_0 = claripy.BVS('password_0', symbolic_file_size_bytes * 8) # 0x40 file_size
    filename = 'EZJXEZBK.txt'
    password_file = angr.storage.SimFile(filename, password_0, size = symbolic_file_size_bytes)
    
    # load the SimFile
    initial_state.fs.insert(filename, password_file)
    
    # solve question
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]
        # get answer from solver.eval
        solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()

        print("password_0 ==> {}".format(solution_0))
    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

四、angr-约束条件

08_angr_constraints

乍一看这道题跟前面的题目都差不多（实际上确实差不多），但是如果使用前面路径探索的代码会发现长时间跑不出结果，原因就在于 check_equals 中暗藏玄机，

这里逐个判断每个字符是否相等，若相同时会让计数器 v3 加 1，但这里实际存在一个分支路径爆炸的问题，来看看汇编进行理解。

由于总共有16个字符，所以要进行16分支判断，也就是说我们的符号执行过程中需要保存 2^16 = 65536 个分支，所以这会导致执行起来的效果变得异常慢，由于每次判断时都是与固定字符串进行比较，所以我们这里可以通过添加约束条件，从而在执行 check_equals 函数前，就把我们需要的正确的输入值 buffer 给拿到。

solution_state.add_constraints(constrain_parameter_bitvector == b'BOIKVOJYEZJXEZBK')

注意事项：

需要知道的是，在添加完约束条件后，angr 会调用 z3 完成对满足约束条件的值的求解
探索路径 addr_to_check_constraint 设置在 check_equals 之前的一行指令地址

import angr
import sys
import claripy

def solve():
    path_to_file = './08_constraints'
    proj = angr.Project(path_to_file)

    start_addr = 0x080492ED     # first instruction after scanf
    init_state = proj.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # create symbolic varialble
    buffer_addr = 0x0804C034
    buffer_size = 0x10
    buffer = claripy.BVS('buffer', buffer_size*8)

    # set addr to correspond var
    init_state.memory.store(buffer_addr, buffer)

    addr_to_check_constraint = 0x08049339  # first instruction before check_equals
    simulation = proj.factory.simgr(init_state)
    simulation.explore(find = addr_to_check_constraint)

    # check result
    if simulation.found:
        solution_state = simulation.found[0]

        # get encoded buffer content
        constrain_parameter_bitvector = solution_state.memory.load(buffer_addr, buffer_size)
        # add constrain condition
        solution_state.add_constraints(constrain_parameter_bitvector == b'BOIKVOJYEZJXEZBK')
        solution = solution_state.solver.eval(buffer, cast_to=bytes).decode()
        print(solution)
    else:
        raise Exception("Could not find the solution")

if __name__ == '__main__':
    solve()

五、angr-函数操作

project.hook(): 函数钩子

在实际需求中，angr 还可以使用 project.hook(addr = call_insn_addr, hook = my_function, length = n) 来 hook 掉对应的 call 指令，其中

call_insn_addr ：为 call 指令的地址

my_function：为我们的自定义函数

length：为 call 指令的长度：

# method 1
@project.hook(0x1234, length=5)
def my_hook_func(state):
    # do something, this is an example
    state.regs.eax = 0xdeadbeef

# method 2
def my_hook_func2(state):
    # do something, this is an example
    state.regs.eax = 0xdeadbeef
proj.hook(addr = 0x5678, hook = my_hook_func2, length = 5)

09_angr_hooks

这题相当于第 8 题，多了一个将 password 进行加密后，在输入一次，判断输入是否和加密后的 password 相同，可以发现用第 8 题的脚本，仍然可以跑出第一部分结果，但是这道题我们要用hook来模拟 check_equals。

但在这道题中我们可以学习到连续状态的路径探索，以往我们都是一个状态 explore 一次，在这道题中我们会 explore 两次。

simulation.explore(find = addr_to_check_constraint) # 第一次
# step.1 compute first solution
if simulation.found == 0:
    raise Exception("could not find solution")

check_state = simulation.found[0]
# add constrain condition
check_state.add_constraints(check_state.regs.eax == 1)
solution = check_state.solver.eval(buffer, cast_to=bytes).decode()
print("solution ==> {}".format(solution))

# step.2 compute second solution
simulation = project.factory.simgr(check_state)
simulation.explore(find = 0x080493A3)       #  # 第二次
if simulation.found == 0:
    raise Exception("could not find solution")

由于这里依然存在路径爆炸的问题，所以我们这里使用 hook 来解决这一问题。

注意事项：这里使用 claripy.If() 创建一个比较并将值给到 eax 寄存器作为返回值，最后就是常规的内存符号化后求解即可

import angr
import sys
import claripy

def solve():
    path_to_file = './09_hooks'
    project = angr.Project(path_to_file)
    password_str = b'BOIKVOJYEZJXEZBK'

    start_addr = 0x080492FD    # first instruction after scanf
    init_state = project.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    buffer_addr = 0x0804C038
    buffer_size = 0x10
    buffer = claripy.BVS('buffer', buffer_size * 8)
    # store symbolic var
    init_state.memory.store(buffer_addr, buffer)

    # since we skip qmemcpy for password, we have to write origianl content into its addr
    password_addr = 0x0804C04C
    password_size = 0x10
    password = claripy.BVV(int.from_bytes(password_str, "big"), password_size * 8)
    init_state.memory.store(password_addr, password)

    # set hook function to replace check_equals
    first_check_equals_addr = 0x804934E
    @project.hook(first_check_equals_addr, length=5)
    def my_hook_func(state):
        buffer = state.memory.load(buffer_addr, buffer_size)
        state.regs.eax = claripy.If(buffer == password_str, #constraint
                                    # if success return 1, else return 0
                                    claripy.BVV(1, 32), claripy.BVV(0, 32))

    # explore path
    addr_to_check_constraint = 0x08049353    # Attention：first instruction after check_equals
    simulation = project.factory.simgr(init_state)
    simulation.explore(find = addr_to_check_constraint)

    # step.1 compute first solution
    if simulation.found == 0:
        raise Exception("could not find solution")

    check_state = simulation.found[0]
    # add constrain condition
    check_state.add_constraints(check_state.regs.eax == 1)
    solution = check_state.solver.eval(buffer, cast_to=bytes).decode()
    print("solution ==> {}".format(solution))

    # step.2 compute second solution
    simulation = project.factory.simgr(check_state)
    simulation.explore(find = 0x080493A3)       # last instruction before scanf
    if simulation.found == 0:
        raise Exception("could not find solution")

    solution_state = simulation.found[0]
    password = solution_state.memory.load(password_addr, password_size)
    solution2 = solution_state.solver.eval(password, cast_to=bytes).decode()
    print("solution2 ==> {}".format(solution2))

if __name__ == '__main__':
    solve()

angr.SimProcedure - 模拟函数（过程）

在 angr 中 angr.SimProcedure 类用来表示在一个状态上的一个运行过程——即函数实际上是一个 SimPrecedure

可以通过创建一个继承自 angr.SimProcedure 的类并重写 run() 方法的方式来表示一个自定义函数，其中 run() 方法的参数为该函数所接收的参数：

class MyProcedure(angr.SimProcedure):
    def run(self, arg1, arg2):
        # do something, this's an example
        return self.state.memory.load(arg1, arg2)

自定义函数会对文件中原有函数进行替换，angr 默认会用内置的一些 SimProcedure 去替换一些库函数。

在已经有该二进制文件的符号表，可以直接使用 project.hook_symbol(symbol_str, sim_procedure_instance) 来自动 hook 掉文件中所有的对应符号，其中 run() 方法的参数为被替换函数所接收的参数：

import angr
import claripy

class MyProcedure(angr.SimProcedure):
    def run(self, arg1, arg2):
        # do something, this's an example
        return self.state.memory.load(arg1, arg2)

proj = angr.Project('./test')
proj.hook_symbol('func_to_hook', MyProcedure())

在 SimProcedure 的 run() 过程中也可以使用一些有用的成员函数：

ret(expr): 函数返回
jump(addr): 跳转到指定地址
exit(code): 终止程序
call(addr, args, continue_at): 调用文件中的函数
inline_call(procedure, *args): 内联地调用另一个 SimProcedure

10_angr_simprocedures

这道题有点反编译出来的效果不是很好，可以看汇编代码来进行理解。

在执行完 complex_function 之后，下面会有非常多的伪路径，其中每个都包含 check_equals 函数调用，而且我们前面也知道每个 check_equals 中都会存在路径爆炸，尤其在有这么多伪路径的情况下，我们不可能在像第9题那样挨个去为每个调用点设置 hook 函数。

所以这时候我们可以通过 simprocedures 为所有函数 check_equals 设置 hook。

方法跟上一题 hook 函数类似，但是这里我们用类代替写 hook。

class ReplacementCheckEquals(angr.SimProcedure):
    def run(self, buffer_addr, buffer_len):
        buffer = self.state.memory.load(buffer_addr, buffer_len)
        compared_str = b'BOIKVOJYEZJXEZBK'
        return claripy.If(buffer == compared_str, claripy.BVV(1, 32), claripy.BVV(0, 32))
project.hook_symbol('check_equals_BOIKVOJYEZJXEZBK', ReplacementCheckEquals())

import angr
import sys
import claripy

class ReplacementCheckEquals(angr.SimProcedure):
    def run(self, buffer_addr, buffer_len):
        buffer = self.state.memory.load(buffer_addr, buffer_len)
        compared_str = b'BOIKVOJYEZJXEZBK'
        return claripy.If(buffer == compared_str, claripy.BVV(1, 32), claripy.BVV(0, 32))

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
    path_to_file = './10_simprocedures'
    project = angr.Project(path_to_file)

    # set context environment
    initial_state = project.factory. entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # hook the check_equals()
    project.hook_symbol('check_equals_BOIKVOJYEZJXEZBK', ReplacementCheckEquals())

    # begin to explore path
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)
    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception("Could not find the solution")


if __name__ == '__main__':
    exp()

11_angr_sim_scanf

反编译出来的效果依然不尽人意，程序中存在很多 scanf 的伪路径，这一题我们可以直接用第2题的脚本跑通，但是我们需要通过这一题来对前面学到的 simprocedure 对 scanf 函数进行模拟，关键代码如下。

class MyScanfProcedure(angr.SimProcedure):
    def run(self, fmt_str, buffer0_addr, buffer1_addr):
        buffer0 = claripy.BVS('buffer0', 4 * 8)
        buffer1 = claripy.BVS('buffer1', 4 * 8)
        self.state.memory.store(buffer0_addr, buffer0)
        self.state.memory.store(buffer1_addr, buffer1)
        self.state.globals['buffer_0'] = buffer0
        self.state.globals['buffer_1'] = buffer1
        
# hook the check_equals()
project.hook_symbol('__isoc99_scanf', MyScanfProcedure())
buffer0 = solution_state.globals['buffer_0']
password0 = solution_state.solver.eval(buffer0, cast_to = bytes)
print("password0: {}".format(int.from_bytes(password0, 'little')))

注意事项：若是在 run() 方法内创建 BVS ，则可以通过将其储存到 state.globals 列表的方式以便后续取用

import angr
import sys
import claripy

class MyScanfProcedure(angr.SimProcedure):
    def run(self, fmt_str, buffer0_addr, buffer1_addr):
        buffer0 = claripy.BVS('buffer0', 4 * 8)
        buffer1 = claripy.BVS('buffer1', 4 * 8)
        self.state.memory.store(buffer0_addr, buffer0)
        self.state.memory.store(buffer1_addr, buffer1)
        self.state.globals['buffer_0'] = buffer0
        self.state.globals['buffer_1'] = buffer1

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
    path_to_file = './11_sim_scanf'
    project = angr.Project(path_to_file)

    # set context environment
    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # hook the check_equals()
    project.hook_symbol('__isoc99_scanf', MyScanfProcedure())

    # begin to explore path
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path)
    
    if simulation.found:
        solution_state = simulation.found[0]
        # get value method1
        # buffer0 = solution_state.memory.load(0x80605C4, 4)
        # buffer1 = solution_state.memory.load(0x8057274, 4)

        # get value method2
        buffer0 = solution_state.globals['buffer_0']
        buffer1 = solution_state.globals['buffer_1']
        password0 = solution_state.solver.eval(buffer0, cast_to = bytes)
        password1 = solution_state.solver.eval(buffer1, cast_to = bytes)
        print("password0: {}".format(int.from_bytes(password0, 'little')))
        print("password1: {}".format(int.from_bytes(password1, 'little')))

    else:
        raise Exception("Could not find the solution")

if __name__ == '__main__':
    exp()

六、angr-路径合并

12_angr_veritesting

这道题与前面不一样的是，在边执行 complex_function 的时候，边进行 逐个字符检查，会导致路径爆炸。

在angr中可以通过在创建 simgr 时指定参数 veritesting=True，这样 angr 就会在运行过程中进行路径合并，缓解路径爆炸的问题。

不知道这道题的成功率为什么忽高忽低，偶尔会很快，偶尔又比较慢。

import angr
import sys
import claripy

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
    path_to_file = './12_veritesting'
    project = angr.Project(path_to_file)

    # set context environment
    initial_state = project.factory.entry_state()

    # begin to explore path
    simulation = project.factory.simgr(initial_state, veritesting = True)
    simulation.explore(find = 0x80492FF, avoid = 0x8049311)   
    # simulation.explore(find = find_path, avoid = avoid_path) 
    
    if simulation.found:
        solution_state = simulation.found[0]
        print("solution ==> {}".format(solution_state.posix.dumps(sys.stdin.fileno())))

    else:
        raise Exception("Could not find the solution")


if __name__ == '__main__':
    exp()

七、angr-库操作

13_angr_static_binary

与之前的题目不一样的是，这道题采用的是静态编译，我们之前都做的是动态编译的题目，这两者之间的区别是后者在执行时，angr 会自动通过 SimProcedures 去替换标准库函数，因为这样的效率会更快，而在静态编译的题目中，angr 并不会自动去替换，而是需要我们手动设置 hook 函数。样例如下：

angr.SIM_PROCEDURES['libc']['strcmp']
angr.SIM_PROCEDURES['libc']['scanf']
angr.SIM_PROCEDURES['libc']['printf']
angr.SIM_PROCEDURES['libc']['puts']
angr.SIM_PROCEDURES['libc']['exit']

同样的我们还是可以使用前面的 project.hook 的方法去进行设置 hook 函数：

project.hook(malloc_address, angr.SIM_PROCEDURES['libc']['malloc']())

需要注意的是 __libc_start_main 需要采用下面的方法去进行替换，如果不用 SimProcedures hook 掉这个函数，angr 的执行时间将会变得很长。

project.hook_symbol('__libc_start_main', 
                        angr.SIM_PROCEDURES['glibc']['__libc_start_main']())

注意事项：在上面比如说我们要 hook 掉 puts 函数，不像前面的题那样，在 call 处进行 hook ，而是在该函数静态编译后的代码实现处

最后效果如下：

import angr
import sys
import claripy

def find_path(state):
    return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
    return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
    path_to_file = './13_static_binary'
    project = angr.Project(path_to_file)

    # set context environment
    initial_state = project.factory.entry_state()

    # hook function
    project.hook(0x08051F00, angr.SIM_PROCEDURES['libc']['scanf']())
    project.hook(0x08051EB0, angr.SIM_PROCEDURES['libc']['printf']())
    project.hook(0x0805EED0, angr.SIM_PROCEDURES['libc']['puts']())
    project.hook(0x0806D8C0, angr.SIM_PROCEDURES['libc']['strcmp']())
    project.hook_symbol('__libc_start_main', 
                        angr.SIM_PROCEDURES['glibc']['__libc_start_main']())

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = find_path, avoid = avoid_path) 
    
    if simulation.found:
        solution_state = simulation.found[0]
        print("solution ==> {}".format(solution_state.posix.dumps(sys.stdin.fileno())))
    else:
        raise Exception("Could not find the solution")

if __name__ == '__main__':
    exp()

14_angr_shared_library

$ sudo python3 generate.py 123654 14__shared_library
/usr/bin/ld: cannot find 14__shared_library: No such file or directory
/usr/bin/ld: cannot find -l14__shared_library: No such file or directory
collect2: error: ld returned 1 exit status

编译的时候会报错，修改为下面的命令即可

$ sudo python3 generate.py 123654 ./14__shared_library

虽然这题我们可以通过前面第一题的脚本直接暴力出，但是这一题我们需要学习的是模拟动态库的符号执行，也就是对 .so 文件进行符号执行，源程序中会调用 lib14__shared_library.so 中的 validate 函数。

这是具体的实现

需要注意的是动态库开了 PIE，所以我们需要为程序手动随机设置一个基地址，设置方法如下：

base_addr = 0x400000
project = angr.Project(path_to_file, load_options = {
    'main_opts': {
    	'custom_base_addr': base_addr
    }
})

将符号执行模拟地址设置于 validate 函数开始，并我们提前向栈里面传递参数（32位），来模拟调用 validate 时的状态。

# emulate validate func ==> validate(password, 8)
initial_state.regs.ebp = initial_state.regs.esp
initial_state.stack_push(8)
initial_state.stack_push(password_addr)
initial_state.stack_push(0xdeadbeef) # ret addr

最后效果如下：

import angr
import sys
import claripy

def main():
    path_to_file = './lib14__shared_library.so'
    base_addr = 0x400000
    project = angr.Project(path_to_file, load_options = {
        'main_opts': {
            'custom_base_addr': base_addr
        }
    })

    start_addr = base_addr + 0x0001244 # addr of validate
    initial_state = project.factory.blank_state(
        addr = start_addr,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    password_addr = base_addr + 0x100000
    password = claripy.BVS('password', 8 * 8)
    initial_state.memory.store(password_addr, password)

    # emulate validate func ==> validate(password, 8)
    initial_state.regs.ebp = initial_state.regs.esp
    initial_state.stack_push(8)
    initial_state.stack_push(password_addr)
    initial_state.stack_push(0xdeadbeef) # ret addr


    # initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)
    find_path = base_addr + 0x000012ED
    simulation.explore(find = find_path)
    
    if simulation.found:
        solution_state = simulation.found[0]
        solution_state.add_constraints(solution_state.regs.eax == 1)
        solution = solution_state.solver.eval(password, cast_to=bytes).decode()
        print(solution)

    else:
        raise Exception("Could not find the solution!")

if __name__ == '__main__':
    main()

八、angr-漏洞利用

15_angr_arbitrary_read

程序中存在一个溢出，我们的任务是让程序输出 Good Job.，从pwn的角度来说这个实现非常简单，但是我们这里要学习如何使用 angr 来进行漏洞利用。

源程序中调用的关键函数主要有两个，一个是 scanf 用来接受字符串，一个是 puts 用于输出字符串。

（1）为了获取到我们输入到 scanf 中的字符串，我们将该函数进行 hook，下面代码参考 a3 师傅。

class MySimScanfProcedure(angr.SimProcedure):
    def run(self, fmt_str, key_addr, chr_arr_addr):
        key_bvs = claripy.BVS('key', 4 * 8)
        chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
        for ch in chr_arr_bvs.chop(bits = 8):
            self.state.add_constraints(ch >= '0', ch <= 'z')
        self.state.memory.store(key_addr, key_bvs,
                                endness = project.arch.memory_endness)
        self.state.memory.store(chr_arr_addr, chr_arr_bvs)
        self.state.globals['password_0'] = key_bvs
        self.state.globals['password_1'] = chr_arr_bvs

注意事项：上面代码使用 BVS.chop(bits=n) 来将符号位向量按照一定尺寸进行分割，将字符限制在了 ‘0’-‘z’ 之间。

（2）为了检查 puts 输出的字符串是否是 ‘Good Job.’，我们在设置一个自定义函数进行判断。

def is_success(state):
    puts_plt_addr = 0x08049060
    if state.addr != puts_plt_addr:
        return False
    good_str_addr = 0x424F494B
    puts_param = state.memory.load(state.regs.esp + 4, 4,
                                endness = project.arch.memory_endness)
    if state.solver.symbolic(puts_param):
        copy_state = state.copy()
        copy_state.add_constraints(puts_param == good_str_addr)
        if copy_state.satisfiable():
            state.add_constraints(puts_param == good_str_addr)
            return True
        else:
            return False
    else:
        return False

其中检查的地址设在了 puts_plt 处的位置，并通过获取字符串地址，在设置约束条件之后即可，求解出正确的输入，下面是对上面代码中的一些关键函数进行说明。

使用 state.memory.load() 将 puts() 的参数提取出来
使用 state.solver.symbolic() 判断 puts() 的参数是否是我们的 BVS
使用 state.copy() 获取当前状态的副本，之后在该副本上添加约束以避免影响到原状态
使用 state.satisfiable() 判断是否满足约束，若是则添加到原状态中求解即可

import angr
import sys
import claripy

def exp():
    path_to_file = './15_arbitrary_read'
    project = angr.Project(path_to_file)
    class MySimScanfProcedure(angr.SimProcedure):
        def run(self, fmt_str, key_addr, chr_arr_addr):
            key_bvs = claripy.BVS('key', 4 * 8)
            chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
            for ch in chr_arr_bvs.chop(bits = 8):
                self.state.add_constraints(ch >= '0', ch <= 'z')
            self.state.memory.store(key_addr, key_bvs,
                                    endness = project.arch.memory_endness)
            self.state.memory.store(chr_arr_addr, chr_arr_bvs)
            self.state.globals['password_0'] = key_bvs
            self.state.globals['password_1'] = chr_arr_bvs

    def is_success(state):
        puts_plt_addr = 0x08049060
        if state.addr != puts_plt_addr:
            return False
        good_str_addr = 0x424F494B
        puts_param = state.memory.load(state.regs.esp + 4, 4,
                                    endness = project.arch.memory_endness)
        if state.solver.symbolic(puts_param):
            copy_state = state.copy()
            copy_state.add_constraints(puts_param == good_str_addr)
            if copy_state.satisfiable():
                state.add_constraints(puts_param == good_str_addr)
                return True
            else:
                return False
        else:
            return False

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # hook scanf func
    project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = is_success)

    if simulation.found:
        print("found")
        solution_state = simulation.found[0]
        print("password_0 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_0'])))
        print("password_1 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_1'], cast_to = bytes)))
    else:
        raise Exception("Could not found solution")

if __name__ == '__main__':
    exp()

16_angr_arbitrary_write

跟上一题类似，不过这一题要做的就是修改把路径探索条件设置在 strcpy 那里进行条件检查：

def is_success(state):
    strcpy_plt_addr = 0x08049080
    if state.addr != strcpy_plt_addr:
        return False
    good_str_addr = 0x0804A031
    dest_param1 = state.memory.load(state.regs.esp + 4, 4,
                                endness = project.arch.memory_endness)
    s_param2 = state.memory.load(state.regs.esp + 8, 4,
                                endness = project.arch.memory_endness)
    compared_str = state.memory.load(s_param2, 8)
    password_addr = 0x44414D48

    if state.solver.symbolic(dest_param1) and state.solver.symbolic(compared_str):
        copy_state = state.copy()
        copy_state.add_constraints(dest_param1 == password_addr)
        copy_state.add_constraints(compared_str == b"VMHNNFKL")
        if copy_state.satisfiable():
            state.add_constraints(dest_param1 == password_addr)
            state.add_constraints(compared_str == b"VMHNNFKL")
            return True
        else:
            return False
    else:
        return False

一开始这题我没有设置 is_success 检查，直接暴力探索，但是好像跑不出来结果

import angr
import sys
import claripy

def exp():
    path_to_file = './16_arbitrary_write'
    project = angr.Project(path_to_file)
    class MySimScanfProcedure(angr.SimProcedure):
        def run(self, fmt_str, key_addr, chr_arr_addr):
            key_bvs = claripy.BVS('key', 4 * 8)
            chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
            for ch in chr_arr_bvs.chop(bits = 8):
                self.state.add_constraints(ch >= '0', ch <= 'z')
            self.state.memory.store(key_addr, key_bvs,
                                    endness = project.arch.memory_endness)
            self.state.memory.store(chr_arr_addr, chr_arr_bvs)
            self.state.globals['password_0'] = key_bvs
            self.state.globals['password_1'] = chr_arr_bvs

    def is_success(state):
        strcpy_plt_addr = 0x08049080
        if state.addr != strcpy_plt_addr:
            return False
        good_str_addr = 0x0804A031
        dest_param1 = state.memory.load(state.regs.esp + 4, 4,
                                    endness = project.arch.memory_endness)
        s_param2 = state.memory.load(state.regs.esp + 8, 4,
                                    endness = project.arch.memory_endness)
        compared_str = state.memory.load(s_param2, 8)
        password_addr = 0x44414D48

        if state.solver.symbolic(dest_param1) and state.solver.symbolic(compared_str):
            copy_state = state.copy()
            copy_state.add_constraints(dest_param1 == password_addr)
            copy_state.add_constraints(compared_str == b"VMHNNFKL")
            if copy_state.satisfiable():
                state.add_constraints(dest_param1 == password_addr)
                state.add_constraints(compared_str == b"VMHNNFKL")
                return True
            else:
                return False
        else:
            return False

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )
    # hook scanf func
    project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find = is_success)

    if simulation.found:
        print("found")
        solution_state = simulation.found[0]
        print("password_0 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_0'])))
        print("password_1 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_1'], cast_to = bytes)))
    else:
        raise Exception("Could not found solution")

if __name__ == '__main__':
    exp()

17_angr_arbitrary_jump

上面的 read_input 存在栈溢出

stash 类型

在 angr 有以下几种 stash：

simgr.active：活跃的状态列表。在未指定替代的情况下会被模拟器默认执行
simgr.deadended：死亡的状态列表。当一个状态无法再被继续执行时（例如没有有效指令、无效的指令指针、不满足其所有的后继（successors））便会被归入该列表
simgr.pruned：被剪枝的状态列表。在指定了 LAZY_SOLVES 时，状态仅在必要时检查可满足性，当一个状态在指定了 LAZY_SOLVES 时被发现是不可满足的（unsat），状态层（state hierarchy）将会被遍历以确认在其历史中最初变为不满足的时间，该点及其所有后代都会被剪枝（pruned）并放入该列表
simgr.unconstrained：不受约束的状态列表。当创建 SimulationManager 时指定了 save_unconstrained=True，则被认为不受约束的（unconstrained，即指令指针被用户数据或其他来源的符号化数据控制）状态会被归入该列表
simgr.unsat：不可满足的状态列表。当创建 SimulationManager 时指定了 save_unsat=True，则被认为无法被满足的（unsatisfiable，即存在约束冲突的状态，例如在同一时刻要求输入既是"AAAA" 又是 "BBBB"）状态会被归入该列表

还有一种不是 stash 的状态列表——errored，若在执行中产生了错误，则状态与其产生的错误会被包裹在一个 ErrorRecord 实例中（可通过 record.state 与 record.error 访问），该 record 会被插入到 errored 中，我们可以通过 record.debug() 启动一个调试窗口

stash 操作

我们可以使用 stash.move() 来在 stash 之间转移放置状态，用法如下：

simgr.move(from_stash = 'unconstrained', to_stash = 'active')

在转移当中我们还可以通过指定 filter_func 参数来进行过滤：

def filter_func(state):
    print_good_addr = 0x424F4954
    return state.satisfiable(
        extra_constraints = (state.regs.eip == print_good_addr, ))
        
simulation.move(from_stash = 'unconstrained', to_stash = 'found', filter_func = filter_func)

stash 本质上是个 list，因此在初始化时可以通过字典的方式指定每个 stash 的初始内容：

simulation = project.factory.simgr(initial_state,
                              save_unconstrained = True,
                              stashes = {
                                  'active':[initial_state],
                                  'unconstrained':[],
                                  'found':[]
                              })

import angr
import angr.factory
import claripy

def filter_func(state):
    print_good_addr = 0x424F4954
    return state.satisfiable(
        extra_constraints = (state.regs.eip == print_good_addr, ))

def exp():
    path_to_file = './17_arbitrary_jump'
    project = angr.Project(path_to_file)

    class MySimScanfProcedure(angr.SimProcedure):
        def run(self, fmt_str, data_addr):
            data_bvs = claripy.BVS('data', 200 * 8)
            for ch in data_bvs.chop(bits = 8):
                self.state.add_constraints(ch >= '0', ch <= 'z')
            self.state.memory.store(data_addr, data_bvs)
            self.state.globals['password'] = data_bvs

    project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())
    # create simgr that can save unconstraints
    initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state,
                                  save_unconstrained = True,
                                  stashes = {
                                      'active':[initial_state],
                                      'unconstrained':[],
                                      'found':[]
                                  })
    
    # simulated execution by steps
    while not simulation.found:
        if (not simulation.active) and (not simulation.unconstrained):
            break
        # check for unconstrained
        simulation.move(from_stash = 'unconstrained',
                   to_stash = 'found',
                   filter_func = filter_func)
        # step to next basic block
        simulation.step()


    if simulation.found:
        print("found")
        solution_state = simulation.found[0]
        print(solution_state)
        print_good_addr = 0x424F4954
        solution_state.add_constraints(solution_state.regs.eip == print_good_addr)
        print("password_0 ==> {}".format(solution_state.solver.eval(
            solution_state.globals['password'], cast_to = bytes)))
    else:
        raise Exception("Could not found solution")    

if __name__ == '__main__':
    exp()

总结

感觉 angr 在漏洞利用方面表现得较为单薄，尽管可能是一个简单的栈溢出，我们依然有很多要考虑的因素。但由于 angr 拥有着丰富的 API，不得不重试其在某些方面的重要性，笔者之所以这里尝试学习 angr 是因为在一次比赛中，通过简单使用 angr 的几个调用，便可以直接将题目秒掉。通过做完这些练手入门题目，发现 angr 至少在逆向这块能发挥出不错的潜能，希望后期可以在多了解 angr 的其他功能。

angr 学习笔记