angr 学习笔记

henry Lv4

Reference:

https://arttnba3.cn/2022/11/24/ANGR-0X00-ANGR_CTF/

angr_ctf 笔记

一、环境搭建

安装依赖:

1
sudo apt-get install gcc-multilib

安装angr:

1
sudo pip3 install angr

题目环境:

1
git clone https://github.com/jakespringer/angr_ctf.git

使用说明:

1
2
3
4
cd 00_angr_find/
sudo python3 generate.py
Usage: ./generate.py [seed] [output_file]
sudo python3 generate.py 654123 00_angr_find

二、angr-基本用法

1
2
3
4
5
6
7
>>> project = angr.Project(path_to_file)	#第一步就是创建一个 angr.Project 类
>>> project.arch #返回程序使用的指令架构
# <Arch X86 (LE)>
>>> hex(project.entry) #返回程序入口点
'0x80490b0'
>>> project.filename #返回程序名
'./00_angr_find'

factory-实用类工厂

project.factory 提供了一些类的构造器

(1)block

angr 以基本块为单位分析代码,可以通过 project.factory.block(address) 获取给定地址所在的基本块,一个 Block 类实例:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
>>> block = project.factory.block(project.entry)
>>> block
<Block for 0x80490b0, 20 bytes>
>>> block.pp()
_start:
80490b0 endbr32
80490b4 xor ebp, ebp
80490b6 pop esi
80490b7 mov ecx, esp
80490b9 and esp, 0xfffffff0
80490bc push eax
80490bd push esp
80490be push edx
80490bf call 0x80490dd
>>> block.instructions #指令个数
9
>>> block.instruction_addrs
(134516912, 134516916, 134516918, 134516919, 134516921, 134516924, 134516925, 134516926, 134516927) #指令对应地址
>>>

(2)state - 模拟执行状态

angr 使用 SimState 类表示一个 模拟的程序状态 (simulated program state),各种操作实际上是由一个 state 步进到另一个 state 的过程

project.factory.entry_state() :获取一个程序的初始执行状态

project.factory.blank_state(addr) :获取一个程序从指定地址开始执行的空白状态

nipaste_2024-05-07_01-07-1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> state = project.factory.entry_state()	#获得程序初始执行状态
# 寄存器状态组,可以通过寄存器名称来访问对应的寄存器
>>> state.regs
<angr.state_plugins.view.SimRegNameView object at 0x7fe6a3cccd00>
>>> state.regs.esp # visit esp
<BV32 0x7ffeffac>
>>> state.regs.ebp # visit ebp
<BV32 0x0>
>>> state.regs.eip # visit eip
<BV32 0x80490b0>
>>> state.mem[0x80490B0].long # 通过 state.mem[addr].type 完成内存访问
<long (32 bits) <BV32 0xfb1e0ff3> at 0x80490b0>
>>> state.memory.load(0x80490b0, 8) # 获取该地址上指定大小的位向量
<BV64 0xf30f1efb31ed5e89>
>>> state.posix # POSIX 相关的环境接口
<angr.state_plugins.posix.SimSystemPosix object at 0x7fe6a3ccca60>

state.regs寄存器状态组,其中每个寄存器都为一个 位向量 (BitVector),可以通过寄存器名称来访问对应的寄存器(例如 state.regs.esp -= 12

state.mem该状态的内存访问接口,可以直接通过 state.mem[addr].type 完成内存访问(例如 state.mem[0x1000].long = 4 ,对于读而言还需指定 .resolved.concrete 表示位向量或是实际值,例如 state.mem[0x1000].long.concrete

state.memory另一种形式的内存访问接口

  • state.memory.load(addr, size_in_bytes)获取该地址上指定大小的位向量
  • state.memory.store(addr, bitvector) :将一个位向量存储到指定地址

state.posixPOSIX 相关的环境接口,例如 state.posix.dumps(fileno) 获取对应文件描述符上的流

(3)simulation_manager - 模拟执行器

angr 将一个状态的执行方法独立成一个 SimulationManager 类,以下两种写法等效:

1
2
>>> simgr = project.factory.simgr(state)
>>> simgr = project.factory.simulation_manager(state)

simgr.step()以基本块为单位的单步执行

simgr.explore(addr):路径探索,即执行到指定地址并进行约束求解,将执行完成的状态放在 simgr.found

simgr.found :若无法求解则该列表为空

00_angr_find

nipaste_2024-05-07_01-42-0

源程序中会对我们输入的s1进行简单的加密(complex_function),使得加密后的结果需要等于 “RIMNUAWT”,所以我们通过 angr 路径探索,直接将目的地址设置为 puts(“Good Job.”),然后求解即可。

最终效果如下所示:

nipaste_2024-05-07_01-40-5

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
import angr
import sys

path_to_file = './00_angr_find'

def main():
# Create an Angr project, subsequent operations are built upon it
project = angr.Project(path_to_file)

# these two options make angr's result more efficient and accurate
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# initial_state = project.factory.entry_state()

# Create a simulation manager initialized with the starting state.
simulation = project.factory.simgr(initial_state)

# input addr we want to reach
print_good_address = 0x80492F8

# This function will keep executing until it either finds a solution or it
# has explored every possible path through the executable.
simulation.explore(find = print_good_address)

# Check that we have found a solution
if simulation.found:
# The explore method stops after it finds a single state that arrives at the
# target address.
solution_state = simulation.found[0]
# Print the string that Angr wrote to stdin to follow solution_state. This
# is our solution.
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

01_angr_avoid

相较于前一题,这题增加了 avoid 参数的使用

通常使用模拟器的 .explore() 方法来进行路径探索,传入的默认参数为 find:

find 参数:一个/一组令模拟器终止运行的地址,符合的执行状态结果会被放到 .found 列表中

avoid 参数:模拟器运行中应当要避开的地址,当一个状态执行到这样的地址时,其会被放在 .avoided 列表中并不再往后执行

num_find 参数:指定需要寻找的解状态的数量,若未指定则会在 .found 列表中存储所有的解状态

nipaste_2024-05-07_02-02-2

main函数过大不能反编译,但可以通过汇编找到关键逻辑nipaste_2024-05-07_02-03-2

后面会发现有很多调用 avoid_me 的函数调用,

nipaste_2024-05-07_02-04-2

为使我们能够输出 Good Job,因此 should_succeed 应该为 1,而 avoid_me 会将该值置为 0,所以这里我们需要避免执行 avoid_me 函数。

nipaste_2024-05-07_02-04-3

在前一题的基础上加上如下代码即可:

1
2
avoid_path_address = 0x08049243
simulation.explore(find = print_good_address, avoid = avoid_path_address)

nipaste_2024-05-07_02-06-5

exp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
import angr
import sys

path_to_file = './01_avoid'

def main():
project = angr.Project(path_to_file)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)
print_good_address = 0x08049280
avoid_path_address = 0x08049243
simulation.explore(find = print_good_address, avoid = avoid_path_address)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

02_angr_find_condition

explore 中的参数 find 和 avoid 可以不只是目标地址,而且还可以是自定义函数,参数为模拟状态,返回值为判定条件的布尔值

若要寻找一条输出指定字符串的路径,可以选择通过判断该字符串是否在输出中的方式,可以通过 state.posix.dumps(文件描述符) 来获取对应文件描述符上的字符流:

1
2
3
4
def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

simulation.explore(find = find_path, avoid = avoid_path)

nipaste_2024-05-07_11-31-2

与第一题类似,但是从反汇编可以看到有很多的路径

nipaste_2024-05-07_11-38-0

这导致我们想要依靠指定一个目标地址,然后探索出对应的路径并不现实,所以这里我们就需要通过自定义函数来设置路径探索条件。

nipaste_2024-05-07_11-40-2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
import angr
import sys

path_to_file = './02_find_condition'

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
project = angr.Project(path_to_file)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)
# print_good_address = 0x08049280
simulation.explore(find = find_path, avoid = avoid_path)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

三、angr-符号化

Claripy - angr 的求解引擎

(1)bitvector - 位向量

位向量(bitvector)是 angr 求解引擎中的一个重要部分,其表示了 一组位 (a sequence of bits)

可以通过 claripy.BVV(int_value, size_in_bits)claripy.BVV(string_value) 创建带有具体值(concrete value)的指定长度的位向量值(bitvector value):

1
2
3
4
5
6
7
>>> import claripy
>>> bvv = claripy.BVV(b'bsd_crow')
>>> bvv
<BV64 0x6273645f63726f77>
>>> bvv = claripy.BVV(0xdeadbeef, 32)
>>> bvv
<BV32 0xdeadbeef>

相同长度的位向量可以进行运算,对于不同长度的位向量则可以通过

.zero_extend(extended_bits) :完成位扩展(0填充)后进行运算,位运算也会存在溢出

1
2
3
4
5
6
7
8
9
>>> bvv = claripy.BVV(b'bsd_crow')
>>> bvv2 = claripy.BVV(0xdeadbeef, 32)
>>> bvv2 = bvv2.zero_extend(32)
>>> bvv + bvv
<BV64 0xc4e6c8bec6e4deee>
>>> bvv + bvv2
<BV64 0x6273646042202e66>
>>> bvv * bvv2
<BV64 0x1aecf1db4bfb6219>

位向量除了代表具体值(concrete value)的 bitvector value 外,还有代表符号变量(symbolic variable)的 bitvector symbol

claripy.BVS(name, size_in_bits) :创建带名字的指定长度的位向量符号(bitvector symbol)

1
2
3
4
5
6
>>> bvs = claripy.BVS('x', 32)
>>> bvs2 = claripy.BVS('y', 32)
>>> bvs + bvs2
<BV32 x_0_32 + y_1_32>
>>> (bvs - bvs2)/bvs
<BV32 (x_0_32 - y_1_32) / x_0_32>

可以通过 .op.args 获得位向量的运算类型与参数:

1
2
3
4
5
6
7
8
>>> bvv.op
'BVV'
>>> bvs.op
'BVS'
>>> bvs.args
('x_0_32', None, None, None, False, False, None)
>>> bvv.args
(7094124198891777911, 64)

state.solver - 状态的求解接口

在对位向量符号进行具体值求解的时候,可以将位向量符号放到状态内存或寄存器中,在完成路径探索后,就可以使用 solution_state.solver.eval 来获取对应状态下的值。

1
2
3
4
5
6
7
# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers

# set the init_state's register to corresponding symbolic variables
initial_state.regs.eax = password_0

solution_0 = solution_state.solver.eval(password_0)

03_angr_symbolic_register

符号化寄存器

1
initial_state.regs.eax = password_0

nipaste_2024-05-07_16-48-1

创建一个从 get_user_input 之后开始执行的空白状态,将对应的寄存器设置为位向量符号,然后求解并获取对应符号变量值。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
import angr
import sys
import claripy

path_to_file = './03_symbolic_registers'

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
project = angr.Project(path_to_file)
start_addr = 0x0804958A # set addr after get_user_input
initial_state = project.factory.entry_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# initial_state = project.factory.blank_state(addr = start_addr)

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers
password_1 = claripy.BVS('password_1', 32) # 32-bit registers
password_2 = claripy.BVS('password_2', 32) # 32-bit registers

# set the init_state's register to corresponding symbolic variables
initial_state.regs.eax = password_0
initial_state.regs.ebx = password_1
initial_state.regs.edx = password_2

# solve question
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

# check result
if simulation.found:
solution_state = simulation.found[0]
# get answer from solver.eval
solution_0 = solution_state.solver.eval(password_0)
solution_1 = solution_state.solver.eval(password_1)
solution_2 = solution_state.solver.eval(password_2)

print("password_0 ==> {}".format(hex(solution_0)))
print("password_1 ==> {}".format(hex(solution_1)))
print("password_2 ==> {}".format(hex(solution_2)))
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

04_angr_symbolic_stack

如题目所示这道题需要我们将栈设置为符号变量进行求解

1
initial_state.stack_push(password_0)

nipaste_2024-05-07_17-32-4

程序还是很简单的,输入两个数进行约束求解,虽然用前面的方法也能进行解决,但是这道题我们主要学习如何将栈空间设置为符号变量,需要注意的是当我们通过下面命令创建一个空白状态时,在必要时我们需要自行设置当前状态环境,以便符号执行。

1
initial_state = project.factory.blank_state(addr = start_addr)

nipaste_2024-05-07_17-37-1

假设我们这里需要从 scanf 后的 add esp, 10h 开始符号执行,那我们就相当于跳过了 scanf 的执行,需要我们设置一下当前的栈空间。

关于这一点源文件 scanfold04 中做了详细介绍,我这里就简单说明一下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# We are jumping into the middle of a function! Therefore, we need to account
# for how the function constructs the stack. The second instruction of the
# function is:
# mov %esp,%ebp
# At which point it allocates the part of the stack frame we plan to target:
# sub $0x18,%esp
# Note the value of esp relative to ebp. The space between them is (usually)
# the stack space. Since esp was decreased by 0x18
#
# /-------- The stack --------\
# ebp -> | |
# |---------------------------|
# | |
# |---------------------------|
# . . . (total of 0x18 bytes)
# . . . Somewhere in here is
# . . . the data that stores
# . . . the result of scanf.
# esp -> | |
# \---------------------------/

因为我们将 start_addr 直接设置到了 main 函数中间,所以我们需要从分析 main 开始设置正确的栈空间排布,

1
2
.text:08049385 89 E5                         mov     ebp, esp
.text:08049387 83 EC 18 sub esp, 18h

从 start_addr 开始时,对应的栈排布为

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#            /-------- The stack --------\
# ebp -> | padding |
# |---------------------------|
# ebp - 0x01 | more padding |
# |---------------------------|
# ebp - 0x02 | even more padding |
# |---------------------------|
# . . . <- How much padding? Hint: how
# |---------------------------| many bytes is password0?
# ebp - 0x0b | password0, second byte |
# |---------------------------|
# ebp - 0x0c | password0, first byte |
# |---------------------------|
# ebp - 0x0d | password1, last byte |
# |---------------------------|
# . . .
# |---------------------------|
# ebp - 0x10 | password1, first byte |
# |---------------------------|
# . . .
# |---------------------------|
# esp -> | |
# \---------------------------/

我们这里需要准确找到两个变量的正确位置,从而来设置符号变量

nipaste_2024-05-07_17-52-4

反应在 angr 中的代码为:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Since we are starting after scanf, we are skipping this stack construction
# step. To make up for this, we need to construct the stack ourselves.
initial_state.regs.ebp = initial_state.regs.esp

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers
password_1 = claripy.BVS('password_1', 32) # 32-bit registers
# As you can see, the call to scanf looks like this:
# scanf( 0x80489c3, ebp - 0xc, ebp - 0x10 )
# format_string password0 password1
# From this, we can construct our new
# that's why we decrease the esp 0x8
initial_state.regs.esp -= 0x8

# set the init_state's stack to correspond symbolic variables
initial_state.stack_push(password_0)
initial_state.stack_push(password_1)
# restore esp
initial_state.regs.esp -= 0x18 # equal sub rsp, 18h after the instruction of add esp, 10h

nipaste_2024-05-07_17-55-3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import angr
import sys
import claripy

path_to_file = './04_symbolic_stack'

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
project = angr.Project(path_to_file)
start_addr = 0x0804939F
initial_state = project.factory.blank_state(addr = start_addr)

# Since we are starting after scanf,we are skipping this stack construction
# step. To make up for this, we need to construct the stack ourselves.
initial_state.regs.ebp = initial_state.regs.esp

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 32) # 32-bit registers
password_1 = claripy.BVS('password_1', 32) # 32-bit registers

# As you can see, the call to scanf looks like this:
# scanf( 0x80489c3, ebp - 0xc, ebp - 0x10 )
# format_string password0 password1
# From this, we can construct our new
# that's why we decrease the esp 0x8
initial_state.regs.esp -= 0x8

# set the init_state's stack to correspond symbolic variables
initial_state.stack_push(password_0)
initial_state.stack_push(password_1)

# restore esp
initial_state.regs.esp -= 0x18

# solve question
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

# check result
if simulation.found:
solution_state = simulation.found[0]
# get answer from solver.eval
solution_0 = solution_state.solver.eval(password_0)
solution_1 = solution_state.solver.eval(password_1)

print("password_0 ==> {}".format(solution_0))
print("password_1 ==> {}".format(solution_1))
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

05_angr_symbolic_memory

顾名思义即符号化内存,可以使用 state.memory 的对应接口进行操作:

  • state.memory.load(addr, size_in_bytes) :获取该地址上指定大小的位向量
  • state.memory.store(addr, bitvector)将一个位向量存储到指定地址
  • solution_state.solver.eval(bitvector, cast_to=bytes).decode():获取指定符号的值

nipaste_2024-05-08_01-00-5

要求输入四个值,每个值占 8 个字节。

nipaste_2024-05-08_01-14-4

这里唯一需要注意的一点是,要将 start_addr 设置为紧接着 scanf 之后的一条指令的地址

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import angr
import sys
import claripy

path_to_file = './05_symbolic_memory'

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
project = angr.Project(path_to_file)
start_addr = 0x08049299 # first insn after scanf()
initial_state = project.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# initial_state = project.factory.entry_state(addr = start_addr)

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 64) # 32-bit registers
password_1 = claripy.BVS('password_1', 64) # 32-bit registers
password_2 = claripy.BVS('password_2', 64) # 32-bit registers
password_3 = claripy.BVS('password_3', 64) # 32-bit registers

# set symbolic memory variable
initial_state.memory.store(0x0957F640, password_0)
initial_state.memory.store(0x0957F648, password_1)
initial_state.memory.store(0x0957F650, password_2)
initial_state.memory.store(0x0957F658, password_3)

# solve question
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

# check result
if simulation.found:
solution_state = simulation.found[0]
# get answer from solver.eval
solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()
solution_1 = solution_state.solver.eval(password_1, cast_to=bytes).decode()
solution_2 = solution_state.solver.eval(password_2, cast_to=bytes).decode()
solution_3 = solution_state.solver.eval(password_3, cast_to=bytes).decode()

print("password_0 ==> {}".format(solution_0))
print("password_1 ==> {}".format(solution_1))
print("password_2 ==> {}".format(solution_2))
print("password_3 ==> {}".format(solution_3))
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

06_angr_symbolic_dynamic_memory

nipaste_2024-05-08_01-28-1

与上一题类似但是这题将值保存在了堆中,虽然我们不知道堆地址但是无妨,因为符号执行压根不会去执行程序,因此我们这里直接设置一个 fake 地址,也没有任何关系,buffer0 指针依旧是在 bss 上的,我们链接起来就好

1
2
3
4
5
6
7
8
9
buf0_addr = 0x9CECEA8
fake_chunk0 = 0xdeadbee0
# set buf_addr ==> fake_chunk0, because the program will not excute really
initial_state.memory.store(buf0_addr, fake_chunk0, endness=project.arch.memory_endness)
# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 64) # 32-bit registers

# set symbolic memory variable
initial_state.memory.store(fake_chunk0, password_0)

注意事项:试想如果我们这里 buffer0 是在栈上怎么办,那么 04 就可以派上用场,依旧能解决问题。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
import angr
import sys
import claripy


def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
path_to_file = './06_symbolic_dynamic_memory'
project = angr.Project(path_to_file)

start_addr = 0x08049299 # first insn after scanf()
initial_state = project.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# initial_state = project.factory.entry_state(addr = start_addr)
buf0_addr = 0x9CECEA8
buf1_addr = 0x9CECEAC
fake_chunk0 = 0xdeadbee0
fake_chunk1 = 0xdeadbef0
# set buf_addr ==> fake_chunk0, because the program will not excute really
initial_state.memory.store(buf0_addr, fake_chunk0, endness=project.arch.memory_endness)
initial_state.memory.store(buf1_addr, fake_chunk1, endness=project.arch.memory_endness)


# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', 64) # 32-bit registers
password_1 = claripy.BVS('password_1', 64) # 32-bit registers

# set symbolic memory variable
initial_state.memory.store(fake_chunk0, password_0)
initial_state.memory.store(fake_chunk1, password_1)

# solve question
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

# check result
if simulation.found:
solution_state = simulation.found[0]
# get answer from solver.eval
solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()
solution_1 = solution_state.solver.eval(password_1, cast_to=bytes).decode()

print("password_0 ==> {}".format(solution_0))
print("password_1 ==> {}".format(solution_1))
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

07_angr_symbolic_file

即符号化文件,关键代码如下:

angr.SimFile():来创建一个模拟文件

state.fs.insert(sim_file) :将 SimFile 插入到一个状态的文件系统中

1
2
3
4
5
6
7
8
9
symbolic_file_size_bytes = 0x40

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', symbolic_file_size_bytes * 8) # 0x40 file_size
filename = 'EZJXEZBK.txt'
password_file = angr.storage.SimFile(filename, password_0, size = symbolic_file_size_bytes)

# load the SimFile
initial_state.fs.insert(filename, password_file)

通过上面的代码可以通过 angr 创建一个模拟文件,从而用来符号执行

nipaste_2024-05-08_02-01-0

ignore_me 会将输入的 buffer 内容读入到文件中,我们这里所做的就是需要模拟这个文件,start_addr 我们设置在 ignore_me 的后一条汇编指令。

nipaste_2024-05-08_01-59-5

exp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
import angr
import sys
import claripy


def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def main():
path_to_file = './07_symbolic_file'
project = angr.Project(path_to_file)

start_addr = 0x0804946C # first insn returned from ignoreme()
initial_state = project.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# initial_state = project.factory.entry_state(addr = start_addr)
symbolic_file_size_bytes = 0x40

# create symbolic variables with claripy
password_0 = claripy.BVS('password_0', symbolic_file_size_bytes * 8) # 0x40 file_size
filename = 'EZJXEZBK.txt'
password_file = angr.storage.SimFile(filename, password_0, size = symbolic_file_size_bytes)

# load the SimFile
initial_state.fs.insert(filename, password_file)

# solve question
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

# check result
if simulation.found:
solution_state = simulation.found[0]
# get answer from solver.eval
solution_0 = solution_state.solver.eval(password_0, cast_to=bytes).decode()

print("password_0 ==> {}".format(solution_0))
else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

四、angr-约束条件

08_angr_constraints

nipaste_2024-05-10_15-17-3

乍一看这道题跟前面的题目都差不多(实际上确实差不多),但是如果使用前面路径探索的代码会发现长时间跑不出结果,原因就在于 check_equals 中暗藏玄机

nipaste_2024-05-10_15-19-5

这里逐个判断每个字符是否相等,若相同时会让计数器 v3 加 1,但这里实际存在一个分支路径爆炸的问题,来看看汇编进行理解。

nipaste_2024-05-10_15-23-4

由于总共有16个字符,所以要进行16分支判断,也就是说我们的符号执行过程中需要保存 2^16 = 65536 个分支,所以这会导致执行起来的效果变得异常慢,由于每次判断时都是与固定字符串进行比较,所以我们这里可以通过添加约束条件,从而在执行 check_equals 函数前,就把我们需要的正确的输入值 buffer 给拿到。

1
solution_state.add_constraints(constrain_parameter_bitvector == b'BOIKVOJYEZJXEZBK')

注意事项:

  • 需要知道的是,在添加完约束条件后,angr 会调用 z3 完成对满足约束条件的值的求解
  • 探索路径 addr_to_check_constraint 设置在 check_equals 之前的一行指令地址

nipaste_2024-05-10_15-32-1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
import angr
import sys
import claripy

def solve():
path_to_file = './08_constraints'
proj = angr.Project(path_to_file)

start_addr = 0x080492ED # first instruction after scanf
init_state = proj.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# create symbolic varialble
buffer_addr = 0x0804C034
buffer_size = 0x10
buffer = claripy.BVS('buffer', buffer_size*8)

# set addr to correspond var
init_state.memory.store(buffer_addr, buffer)

addr_to_check_constraint = 0x08049339 # first instruction before check_equals
simulation = proj.factory.simgr(init_state)
simulation.explore(find = addr_to_check_constraint)

# check result
if simulation.found:
solution_state = simulation.found[0]

# get encoded buffer content
constrain_parameter_bitvector = solution_state.memory.load(buffer_addr, buffer_size)
# add constrain condition
solution_state.add_constraints(constrain_parameter_bitvector == b'BOIKVOJYEZJXEZBK')
solution = solution_state.solver.eval(buffer, cast_to=bytes).decode()
print(solution)
else:
raise Exception("Could not find the solution")

if __name__ == '__main__':
solve()

五、angr-函数操作

project.hook(): 函数钩子

在实际需求中,angr 还可以使用 project.hook(addr = call_insn_addr, hook = my_function, length = n) 来 hook 掉对应的 call 指令,其中

call_insn_addr :为 call 指令的地址

my_function: 为我们的自定义函数

length: 为 call 指令的长度:

1
2
3
4
5
6
7
8
9
10
11
# method 1
@project.hook(0x1234, length=5)
def my_hook_func(state):
# do something, this is an example
state.regs.eax = 0xdeadbeef

# method 2
def my_hook_func2(state):
# do something, this is an example
state.regs.eax = 0xdeadbeef
proj.hook(addr = 0x5678, hook = my_hook_func2, length = 5)

09_angr_hooks

nipaste_2024-05-10_17-38-1

这题相当于第 8 题,多了一个将 password 进行加密后,在输入一次,判断输入是否和加密后的 password 相同,可以发现用第 8 题的脚本,仍然可以跑出第一部分结果,但是这道题我们要用hook来模拟 check_equals

但在这道题中我们可以学习到连续状态的路径探索,以往我们都是一个状态 explore 一次,在这道题中我们会 explore 两次。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
simulation.explore(find = addr_to_check_constraint) # 第一次
# step.1 compute first solution
if simulation.found == 0:
raise Exception("could not find solution")

check_state = simulation.found[0]
# add constrain condition
check_state.add_constraints(check_state.regs.eax == 1)
solution = check_state.solver.eval(buffer, cast_to=bytes).decode()
print("solution ==> {}".format(solution))

# step.2 compute second solution
simulation = project.factory.simgr(check_state)
simulation.explore(find = 0x080493A3) # # 第二次
if simulation.found == 0:
raise Exception("could not find solution")

由于这里依然存在路径爆炸的问题,所以我们这里使用 hook 来解决这一问题。

注意事项:这里使用 claripy.If() 创建一个比较并将值给到 eax 寄存器作为返回值,最后就是常规的内存符号化后求解即可

nipaste_2024-05-10_17-48-1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
import angr
import sys
import claripy

def solve():
path_to_file = './09_hooks'
project = angr.Project(path_to_file)
password_str = b'BOIKVOJYEZJXEZBK'

start_addr = 0x080492FD # first instruction after scanf
init_state = project.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
buffer_addr = 0x0804C038
buffer_size = 0x10
buffer = claripy.BVS('buffer', buffer_size * 8)
# store symbolic var
init_state.memory.store(buffer_addr, buffer)

# since we skip qmemcpy for password, we have to write origianl content into its addr
password_addr = 0x0804C04C
password_size = 0x10
password = claripy.BVV(int.from_bytes(password_str, "big"), password_size * 8)
init_state.memory.store(password_addr, password)

# set hook function to replace check_equals
first_check_equals_addr = 0x804934E
@project.hook(first_check_equals_addr, length=5)
def my_hook_func(state):
buffer = state.memory.load(buffer_addr, buffer_size)
state.regs.eax = claripy.If(buffer == password_str, #constraint
# if success return 1, else return 0
claripy.BVV(1, 32), claripy.BVV(0, 32))

# explore path
addr_to_check_constraint = 0x08049353 # Attention:first instruction after check_equals
simulation = project.factory.simgr(init_state)
simulation.explore(find = addr_to_check_constraint)

# step.1 compute first solution
if simulation.found == 0:
raise Exception("could not find solution")

check_state = simulation.found[0]
# add constrain condition
check_state.add_constraints(check_state.regs.eax == 1)
solution = check_state.solver.eval(buffer, cast_to=bytes).decode()
print("solution ==> {}".format(solution))

# step.2 compute second solution
simulation = project.factory.simgr(check_state)
simulation.explore(find = 0x080493A3) # last instruction before scanf
if simulation.found == 0:
raise Exception("could not find solution")

solution_state = simulation.found[0]
password = solution_state.memory.load(password_addr, password_size)
solution2 = solution_state.solver.eval(password, cast_to=bytes).decode()
print("solution2 ==> {}".format(solution2))

if __name__ == '__main__':
solve()

angr.SimProcedure - 模拟函数(过程)

在 angr 中 angr.SimProcedure 类用来表示在一个状态上的一个运行过程——即函数实际上是一个 SimPrecedure

可以通过创建一个继承自 angr.SimProcedure 的类并重写 run() 方法的方式来表示一个自定义函数,其中 run() 方法的参数为该函数所接收的参数

1
2
3
4
class MyProcedure(angr.SimProcedure):
def run(self, arg1, arg2):
# do something, this's an example
return self.state.memory.load(arg1, arg2)

自定义函数会对文件中原有函数进行替换,angr 默认会用内置的一些 SimProcedure 去替换一些库函数。

在已经有该二进制文件的符号表,可以直接使用 project.hook_symbol(symbol_str, sim_procedure_instance) 来自动 hook 掉文件中所有的对应符号,其中 run() 方法的参数为被替换函数所接收的参数

1
2
3
4
5
6
7
8
9
10
import angr
import claripy

class MyProcedure(angr.SimProcedure):
def run(self, arg1, arg2):
# do something, this's an example
return self.state.memory.load(arg1, arg2)

proj = angr.Project('./test')
proj.hook_symbol('func_to_hook', MyProcedure())

在 SimProcedure 的 run() 过程中也可以使用一些有用的成员函数:

  • ret(expr): 函数返回
  • jump(addr): 跳转到指定地址
  • exit(code): 终止程序
  • call(addr, args, continue_at): 调用文件中的函数
  • inline_call(procedure, *args): 内联地调用另一个 SimProcedure

10_angr_simprocedures

这道题有点反编译出来的效果不是很好,可以看汇编代码来进行理解。

nipaste_2024-05-10_22-50-5

在执行完 complex_function 之后,下面会有非常多的伪路径,其中每个都包含 check_equals 函数调用,而且我们前面也知道每个 check_equals 中都会存在路径爆炸,尤其在有这么多伪路径的情况下,我们不可能在像第9题那样挨个去为每个调用点设置 hook 函数

所以这时候我们可以通过 simprocedures 为所有函数 check_equals 设置 hook。

nipaste_2024-05-10_22-51-5

方法跟上一题 hook 函数类似,但是这里我们用类代替写 hook。

1
2
3
4
5
6
class ReplacementCheckEquals(angr.SimProcedure):
def run(self, buffer_addr, buffer_len):
buffer = self.state.memory.load(buffer_addr, buffer_len)
compared_str = b'BOIKVOJYEZJXEZBK'
return claripy.If(buffer == compared_str, claripy.BVV(1, 32), claripy.BVV(0, 32))
project.hook_symbol('check_equals_BOIKVOJYEZJXEZBK', ReplacementCheckEquals())

nipaste_2024-05-10_22-58-3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
import angr
import sys
import claripy

class ReplacementCheckEquals(angr.SimProcedure):
def run(self, buffer_addr, buffer_len):
buffer = self.state.memory.load(buffer_addr, buffer_len)
compared_str = b'BOIKVOJYEZJXEZBK'
return claripy.If(buffer == compared_str, claripy.BVV(1, 32), claripy.BVV(0, 32))

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
path_to_file = './10_simprocedures'
project = angr.Project(path_to_file)

# set context environment
initial_state = project.factory. entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# hook the check_equals()
project.hook_symbol('check_equals_BOIKVOJYEZJXEZBK', ReplacementCheckEquals())

# begin to explore path
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception("Could not find the solution")


if __name__ == '__main__':
exp()

11_angr_sim_scanf

nipaste_2024-05-10_23-27-4

反编译出来的效果依然不尽人意,程序中存在很多 scanf 的伪路径,这一题我们可以直接用第2题的脚本跑通,但是我们需要通过这一题来对前面学到的 simprocedurescanf 函数进行模拟,关键代码如下。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
class MyScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, buffer0_addr, buffer1_addr):
buffer0 = claripy.BVS('buffer0', 4 * 8)
buffer1 = claripy.BVS('buffer1', 4 * 8)
self.state.memory.store(buffer0_addr, buffer0)
self.state.memory.store(buffer1_addr, buffer1)
self.state.globals['buffer_0'] = buffer0
self.state.globals['buffer_1'] = buffer1

# hook the check_equals()
project.hook_symbol('__isoc99_scanf', MyScanfProcedure())
buffer0 = solution_state.globals['buffer_0']
password0 = solution_state.solver.eval(buffer0, cast_to = bytes)
print("password0: {}".format(int.from_bytes(password0, 'little')))

注意事项:若是在 run() 方法内创建 BVS ,则可以通过将其储存到 state.globals 列表的方式以便后续取用

nipaste_2024-05-10_23-32-1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
import angr
import sys
import claripy

class MyScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, buffer0_addr, buffer1_addr):
buffer0 = claripy.BVS('buffer0', 4 * 8)
buffer1 = claripy.BVS('buffer1', 4 * 8)
self.state.memory.store(buffer0_addr, buffer0)
self.state.memory.store(buffer1_addr, buffer1)
self.state.globals['buffer_0'] = buffer0
self.state.globals['buffer_1'] = buffer1

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
path_to_file = './11_sim_scanf'
project = angr.Project(path_to_file)

# set context environment
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

# hook the check_equals()
project.hook_symbol('__isoc99_scanf', MyScanfProcedure())

# begin to explore path
simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

if simulation.found:
solution_state = simulation.found[0]
# get value method1
# buffer0 = solution_state.memory.load(0x80605C4, 4)
# buffer1 = solution_state.memory.load(0x8057274, 4)

# get value method2
buffer0 = solution_state.globals['buffer_0']
buffer1 = solution_state.globals['buffer_1']
password0 = solution_state.solver.eval(buffer0, cast_to = bytes)
password1 = solution_state.solver.eval(buffer1, cast_to = bytes)
print("password0: {}".format(int.from_bytes(password0, 'little')))
print("password1: {}".format(int.from_bytes(password1, 'little')))

else:
raise Exception("Could not find the solution")

if __name__ == '__main__':
exp()

六、angr-路径合并

12_angr_veritesting

nipaste_2024-05-11_13-32-5

这道题与前面不一样的是,在边执行 complex_function 的时候,边进行 逐个字符检查,会导致路径爆炸。

在angr中可以通过在创建 simgr 时指定参数 veritesting=True,这样 angr 就会在运行过程中进行路径合并,缓解路径爆炸的问题。

nipaste_2024-05-11_13-52-5

不知道这道题的成功率为什么忽高忽低,偶尔会很快,偶尔又比较慢。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
import angr
import sys
import claripy

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
path_to_file = './12_veritesting'
project = angr.Project(path_to_file)

# set context environment
initial_state = project.factory.entry_state()

# begin to explore path
simulation = project.factory.simgr(initial_state, veritesting = True)
simulation.explore(find = 0x80492FF, avoid = 0x8049311)
# simulation.explore(find = find_path, avoid = avoid_path)

if simulation.found:
solution_state = simulation.found[0]
print("solution ==> {}".format(solution_state.posix.dumps(sys.stdin.fileno())))

else:
raise Exception("Could not find the solution")


if __name__ == '__main__':
exp()

七、angr-库操作

13_angr_static_binary

nipaste_2024-05-11_15-07-2

与之前的题目不一样的是,这道题采用的是静态编译,我们之前都做的是动态编译的题目,这两者之间的区别是后者在执行时,angr 会自动通过 SimProcedures 去替换标准库函数,因为这样的效率会更快,而在静态编译的题目中,angr 并不会自动去替换,而是需要我们手动设置 hook 函数。样例如下:

1
2
3
4
5
angr.SIM_PROCEDURES['libc']['strcmp']
angr.SIM_PROCEDURES['libc']['scanf']
angr.SIM_PROCEDURES['libc']['printf']
angr.SIM_PROCEDURES['libc']['puts']
angr.SIM_PROCEDURES['libc']['exit']

同样的我们还是可以使用前面的 project.hook 的方法去进行设置 hook 函数:

1
project.hook(malloc_address, angr.SIM_PROCEDURES['libc']['malloc']())

需要注意的是 __libc_start_main 需要采用下面的方法去进行替换,如果不用 SimProcedures hook 掉这个函数,angr 的执行时间将会变得很长。

1
2
project.hook_symbol('__libc_start_main', 
angr.SIM_PROCEDURES['glibc']['__libc_start_main']())

注意事项:在上面比如说我们要 hook 掉 puts 函数,不像前面的题那样,在 call 处进行 hook ,而是在该函数静态编译后的代码实现处

nipaste_2024-05-11_15-20-0

最后效果如下:

nipaste_2024-05-11_15-20-2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
import angr
import sys
import claripy

def find_path(state):
return b'Good Job.' in state.posix.dumps(sys.stdout.fileno())

def avoid_path(state):
return b'Try again.' in state.posix.dumps(sys.stdout.fileno())

def exp():
path_to_file = './13_static_binary'
project = angr.Project(path_to_file)

# set context environment
initial_state = project.factory.entry_state()

# hook function
project.hook(0x08051F00, angr.SIM_PROCEDURES['libc']['scanf']())
project.hook(0x08051EB0, angr.SIM_PROCEDURES['libc']['printf']())
project.hook(0x0805EED0, angr.SIM_PROCEDURES['libc']['puts']())
project.hook(0x0806D8C0, angr.SIM_PROCEDURES['libc']['strcmp']())
project.hook_symbol('__libc_start_main',
angr.SIM_PROCEDURES['glibc']['__libc_start_main']())

simulation = project.factory.simgr(initial_state)
simulation.explore(find = find_path, avoid = avoid_path)

if simulation.found:
solution_state = simulation.found[0]
print("solution ==> {}".format(solution_state.posix.dumps(sys.stdin.fileno())))
else:
raise Exception("Could not find the solution")

if __name__ == '__main__':
exp()

14_angr_shared_library

1
2
3
4
$ sudo python3 generate.py 123654 14__shared_library
/usr/bin/ld: cannot find 14__shared_library: No such file or directory
/usr/bin/ld: cannot find -l14__shared_library: No such file or directory
collect2: error: ld returned 1 exit status

编译的时候会报错,修改为下面的命令即可

1
$ sudo python3 generate.py 123654 ./14__shared_library

虽然这题我们可以通过前面第一题的脚本直接暴力出,但是这一题我们需要学习的是模拟动态库的符号执行,也就是对 .so 文件进行符号执行,源程序中会调用 lib14__shared_library.so 中的 validate 函数。

nipaste_2024-05-11_15-43-5

这是具体的实现

nipaste_2024-05-11_15-45-0

需要注意的是动态库开了 PIE,所以我们需要为程序手动随机设置一个基地址,设置方法如下:

1
2
3
4
5
6
base_addr = 0x400000
project = angr.Project(path_to_file, load_options = {
'main_opts': {
'custom_base_addr': base_addr
}
})

将符号执行模拟地址设置于 validate 函数开始,并我们提前向栈里面传递参数(32位),来模拟调用 validate 时的状态。

1
2
3
4
5
# emulate validate func ==> validate(password, 8)
initial_state.regs.ebp = initial_state.regs.esp
initial_state.stack_push(8)
initial_state.stack_push(password_addr)
initial_state.stack_push(0xdeadbeef) # ret addr

最后效果如下:

nipaste_2024-05-11_16-12-3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import angr
import sys
import claripy

def main():
path_to_file = './lib14__shared_library.so'
base_addr = 0x400000
project = angr.Project(path_to_file, load_options = {
'main_opts': {
'custom_base_addr': base_addr
}
})

start_addr = base_addr + 0x0001244 # addr of validate
initial_state = project.factory.blank_state(
addr = start_addr,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
password_addr = base_addr + 0x100000
password = claripy.BVS('password', 8 * 8)
initial_state.memory.store(password_addr, password)

# emulate validate func ==> validate(password, 8)
initial_state.regs.ebp = initial_state.regs.esp
initial_state.stack_push(8)
initial_state.stack_push(password_addr)
initial_state.stack_push(0xdeadbeef) # ret addr


# initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)
find_path = base_addr + 0x000012ED
simulation.explore(find = find_path)

if simulation.found:
solution_state = simulation.found[0]
solution_state.add_constraints(solution_state.regs.eax == 1)
solution = solution_state.solver.eval(password, cast_to=bytes).decode()
print(solution)

else:
raise Exception("Could not find the solution!")

if __name__ == '__main__':
main()

八、angr-漏洞利用

15_angr_arbitrary_read

nipaste_2024-05-11_17-15-4

程序中存在一个溢出,我们的任务是让程序输出 Good Job.,从pwn的角度来说这个实现非常简单,但是我们这里要学习如何使用 angr 来进行漏洞利用。

nipaste_2024-05-11_17-17-0

源程序中调用的关键函数主要有两个,一个是 scanf 用来接受字符串,一个是 puts 用于输出字符串。

(1)为了获取到我们输入到 scanf 中的字符串,我们将该函数进行 hook,下面代码参考 a3 师傅。

1
2
3
4
5
6
7
8
9
10
11
class MySimScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, key_addr, chr_arr_addr):
key_bvs = claripy.BVS('key', 4 * 8)
chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
for ch in chr_arr_bvs.chop(bits = 8):
self.state.add_constraints(ch >= '0', ch <= 'z')
self.state.memory.store(key_addr, key_bvs,
endness = project.arch.memory_endness)
self.state.memory.store(chr_arr_addr, chr_arr_bvs)
self.state.globals['password_0'] = key_bvs
self.state.globals['password_1'] = chr_arr_bvs

注意事项:上面代码使用 BVS.chop(bits=n) 来将符号位向量按照一定尺寸进行分割,将字符限制在了 ‘0’-‘z’ 之间。

(2)为了检查 puts 输出的字符串是否是 ‘Good Job.’,我们在设置一个自定义函数进行判断。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
def is_success(state):
puts_plt_addr = 0x08049060
if state.addr != puts_plt_addr:
return False
good_str_addr = 0x424F494B
puts_param = state.memory.load(state.regs.esp + 4, 4,
endness = project.arch.memory_endness)
if state.solver.symbolic(puts_param):
copy_state = state.copy()
copy_state.add_constraints(puts_param == good_str_addr)
if copy_state.satisfiable():
state.add_constraints(puts_param == good_str_addr)
return True
else:
return False
else:
return False

其中检查的地址设在了 puts_plt 处的位置,并通过获取字符串地址,在设置约束条件之后即可,求解出正确的输入,下面是对上面代码中的一些关键函数进行说明。

  • 使用 state.memory.load()puts() 的参数提取出来
  • 使用 state.solver.symbolic() 判断 puts() 的参数是否是我们的 BVS
  • 使用 state.copy() 获取当前状态的副本,之后在该副本上添加约束以避免影响到原状态
  • 使用 state.satisfiable() 判断是否满足约束,若是则添加到原状态中求解即可

nipaste_2024-05-11_17-30-3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import angr
import sys
import claripy

def exp():
path_to_file = './15_arbitrary_read'
project = angr.Project(path_to_file)
class MySimScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, key_addr, chr_arr_addr):
key_bvs = claripy.BVS('key', 4 * 8)
chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
for ch in chr_arr_bvs.chop(bits = 8):
self.state.add_constraints(ch >= '0', ch <= 'z')
self.state.memory.store(key_addr, key_bvs,
endness = project.arch.memory_endness)
self.state.memory.store(chr_arr_addr, chr_arr_bvs)
self.state.globals['password_0'] = key_bvs
self.state.globals['password_1'] = chr_arr_bvs

def is_success(state):
puts_plt_addr = 0x08049060
if state.addr != puts_plt_addr:
return False
good_str_addr = 0x424F494B
puts_param = state.memory.load(state.regs.esp + 4, 4,
endness = project.arch.memory_endness)
if state.solver.symbolic(puts_param):
copy_state = state.copy()
copy_state.add_constraints(puts_param == good_str_addr)
if copy_state.satisfiable():
state.add_constraints(puts_param == good_str_addr)
return True
else:
return False
else:
return False

initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# hook scanf func
project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())

simulation = project.factory.simgr(initial_state)
simulation.explore(find = is_success)

if simulation.found:
print("found")
solution_state = simulation.found[0]
print("password_0 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_0'])))
print("password_1 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_1'], cast_to = bytes)))
else:
raise Exception("Could not found solution")

if __name__ == '__main__':
exp()

16_angr_arbitrary_write

跟上一题类似,不过这一题要做的就是修改把路径探索条件设置在 strcpy 那里进行条件检查:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
def is_success(state):
strcpy_plt_addr = 0x08049080
if state.addr != strcpy_plt_addr:
return False
good_str_addr = 0x0804A031
dest_param1 = state.memory.load(state.regs.esp + 4, 4,
endness = project.arch.memory_endness)
s_param2 = state.memory.load(state.regs.esp + 8, 4,
endness = project.arch.memory_endness)
compared_str = state.memory.load(s_param2, 8)
password_addr = 0x44414D48

if state.solver.symbolic(dest_param1) and state.solver.symbolic(compared_str):
copy_state = state.copy()
copy_state.add_constraints(dest_param1 == password_addr)
copy_state.add_constraints(compared_str == b"VMHNNFKL")
if copy_state.satisfiable():
state.add_constraints(dest_param1 == password_addr)
state.add_constraints(compared_str == b"VMHNNFKL")
return True
else:
return False
else:
return False

nipaste_2024-05-11_17-59-1

一开始这题我没有设置 is_success 检查,直接暴力探索,但是好像跑不出来结果

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
import angr
import sys
import claripy

def exp():
path_to_file = './16_arbitrary_write'
project = angr.Project(path_to_file)
class MySimScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, key_addr, chr_arr_addr):
key_bvs = claripy.BVS('key', 4 * 8)
chr_arr_bvs = claripy.BVS('chr_arr', 20 * 8)
for ch in chr_arr_bvs.chop(bits = 8):
self.state.add_constraints(ch >= '0', ch <= 'z')
self.state.memory.store(key_addr, key_bvs,
endness = project.arch.memory_endness)
self.state.memory.store(chr_arr_addr, chr_arr_bvs)
self.state.globals['password_0'] = key_bvs
self.state.globals['password_1'] = chr_arr_bvs

def is_success(state):
strcpy_plt_addr = 0x08049080
if state.addr != strcpy_plt_addr:
return False
good_str_addr = 0x0804A031
dest_param1 = state.memory.load(state.regs.esp + 4, 4,
endness = project.arch.memory_endness)
s_param2 = state.memory.load(state.regs.esp + 8, 4,
endness = project.arch.memory_endness)
compared_str = state.memory.load(s_param2, 8)
password_addr = 0x44414D48

if state.solver.symbolic(dest_param1) and state.solver.symbolic(compared_str):
copy_state = state.copy()
copy_state.add_constraints(dest_param1 == password_addr)
copy_state.add_constraints(compared_str == b"VMHNNFKL")
if copy_state.satisfiable():
state.add_constraints(dest_param1 == password_addr)
state.add_constraints(compared_str == b"VMHNNFKL")
return True
else:
return False
else:
return False

initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# hook scanf func
project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())

simulation = project.factory.simgr(initial_state)
simulation.explore(find = is_success)

if simulation.found:
print("found")
solution_state = simulation.found[0]
print("password_0 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_0'])))
print("password_1 ==> {}".format(solution_state.solver.eval(solution_state.globals['password_1'], cast_to = bytes)))
else:
raise Exception("Could not found solution")

if __name__ == '__main__':
exp()

17_angr_arbitrary_jump

nipaste_2024-05-11_21-03-3

上面的 read_input 存在栈溢出

stash 类型

在 angr 有以下几种 stash:

  • simgr.active:活跃的状态列表。在未指定替代的情况下会被模拟器默认执行
  • simgr.deadended:死亡的状态列表。当一个状态无法再被继续执行时(例如没有有效指令、无效的指令指针、不满足其所有的后继(successors))便会被归入该列表
  • simgr.pruned:被剪枝的状态列表。在指定了 LAZY_SOLVES 时,状态仅在必要时检查可满足性,当一个状态在指定了 LAZY_SOLVES 时被发现是不可满足的(unsat),状态层(state hierarchy)将会被遍历以确认在其历史中最初变为不满足的时间,该点及其所有后代都会被 剪枝 (pruned)并放入该列表
  • simgr.unconstrained:不受约束的状态列表。当创建 SimulationManager 时指定了 save_unconstrained=True,则被认为不受约束的(unconstrained,即指令指针被用户数据或其他来源的符号化数据控制)状态会被归入该列表
  • simgr.unsat:不可满足的状态列表。当创建 SimulationManager 时指定了 save_unsat=True,则被认为无法被满足的(unsatisfiable,即存在约束冲突的状态,例如在同一时刻要求输入既是"AAAA" 又是 "BBBB")状态会被归入该列表

还有一种不是 stash 的状态列表——errored,若在执行中产生了错误,则状态与其产生的错误会被包裹在一个 ErrorRecord 实例中(可通过 record.staterecord.error 访问),该 record 会被插入到 errored 中,我们可以通过 record.debug() 启动一个调试窗口

stash 操作

我们可以使用 stash.move() 来在 stash 之间转移放置状态,用法如下:

1
simgr.move(from_stash = 'unconstrained', to_stash = 'active')

在转移当中我们还可以通过指定 filter_func 参数来进行过滤:

1
2
3
4
5
6
def filter_func(state):
print_good_addr = 0x424F4954
return state.satisfiable(
extra_constraints = (state.regs.eip == print_good_addr, ))

simulation.move(from_stash = 'unconstrained', to_stash = 'found', filter_func = filter_func)

stash 本质上是个 list,因此在初始化时可以通过字典的方式指定每个 stash 的初始内容:

1
2
3
4
5
6
7
simulation = project.factory.simgr(initial_state,
save_unconstrained = True,
stashes = {
'active':[initial_state],
'unconstrained':[],
'found':[]
})

nipaste_2024-05-11_22-24-4

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import angr
import angr.factory
import claripy

def filter_func(state):
print_good_addr = 0x424F4954
return state.satisfiable(
extra_constraints = (state.regs.eip == print_good_addr, ))

def exp():
path_to_file = './17_arbitrary_jump'
project = angr.Project(path_to_file)

class MySimScanfProcedure(angr.SimProcedure):
def run(self, fmt_str, data_addr):
data_bvs = claripy.BVS('data', 200 * 8)
for ch in data_bvs.chop(bits = 8):
self.state.add_constraints(ch >= '0', ch <= 'z')
self.state.memory.store(data_addr, data_bvs)
self.state.globals['password'] = data_bvs

project.hook_symbol('__isoc99_scanf', MySimScanfProcedure())
# create simgr that can save unconstraints
initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state,
save_unconstrained = True,
stashes = {
'active':[initial_state],
'unconstrained':[],
'found':[]
})

# simulated execution by steps
while not simulation.found:
if (not simulation.active) and (not simulation.unconstrained):
break
# check for unconstrained
simulation.move(from_stash = 'unconstrained',
to_stash = 'found',
filter_func = filter_func)
# step to next basic block
simulation.step()


if simulation.found:
print("found")
solution_state = simulation.found[0]
print(solution_state)
print_good_addr = 0x424F4954
solution_state.add_constraints(solution_state.regs.eip == print_good_addr)
print("password_0 ==> {}".format(solution_state.solver.eval(
solution_state.globals['password'], cast_to = bytes)))
else:
raise Exception("Could not found solution")

if __name__ == '__main__':
exp()

总结

​ 感觉 angr 在漏洞利用方面表现得较为单薄,尽管可能是一个简单的栈溢出,我们依然有很多要考虑的因素。但由于 angr 拥有着丰富的 API,不得不重试其在某些方面的重要性,笔者之所以这里尝试学习 angr 是因为在一次比赛中,通过简单使用 angr 的几个调用,便可以直接将题目秒掉。通过做完这些练手入门题目,发现 angr 至少在逆向这块能发挥出不错的潜能,希望后期可以在多了解 angr 的其他功能。

  • Title: angr 学习笔记
  • Author: henry
  • Created at : 2024-05-11 22:42:25
  • Updated at : 2024-05-11 22:47:59
  • Link: https://henrymartin262.github.io/2024/05/11/angr-学习笔记/
  • License: This work is licensed under CC BY-NC-SA 4.0.
 Comments