How to debug crash in CFS scheduler in linux kernel - oop

Recently, I encountered a kernel oops on an embedded Linux system based on the i.MX6ULL, running kernel version 5.10.9. This issue has bothered me for more than a week. It's really painful.
First of all, here is the kernel oops for your reference.
<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
pgd = f6125cdd
[0000001c] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in: lp(O) lrdmwl_sdio(O) lrdmwl(O) mac80211(O) cfg80211(O) compat(O) g_serial usb_f_serial u_serial rfid(O) industrial(O) applicator(O) libcomposite tm(PO) cutter(O) sensor(O) motor(O) pe(O) ui_leds(O) doorbell(O) psu(O) ui_buttons(O) modules(O)
CPU: 0 PID: 1107 Comm: kworker/0:0 Tainted: P O 5.10.9 #10
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
Workqueue: events sdio_irq_work
PC is at set_next_entity+0x8/0x244
LR is at pick_next_task_fair+0xc0/0x3d0
pc : [<c01496b0>] lr : [<c0149b00>] psr: 20000093
sp : c105bc04 ip : 00000000 fp : c105bc8c
r10: c30de198 r9 : 0000000a r8 : cbfe6bc9
r7 : 00000000 r6 : c0c0df18 r5 : 00000000 r4 : 00000000
r3 : c105a000 r2 : 00000002 r1 : 00000000 r0 : c0c0df18
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c53c7d Table: 81cd4059 DAC: 00000051
Process kworker/0:0 (pid: 1107, stack limit = 0x4d6559d7)
Stack: (0xc105bc04 to 0xc105c000)
bc00: c0c0df00 00000000 c0c0df18 00000000 cbfe6bc9 0000000a c30de198
bc20: c105bc8c c0149b00 c30dde00 c0c0df00 cbfe6bc9 0000000a cbfe6bc9 c07274c8
bc40: 40000093 c057d8e8 c18cf380 c057e284 00000000 00000000 0000353a c0727acc
bc60: c105bd6c 00000000 00000000 c30dde00 ffffe000 c105bdb4 c105bdb8 00000002
bc80: 00000001 00000000 c105bc9c c0727acc 7fffffff ffffe000 10000100 c072b4ac
bca0: c1155210 00000013 c18cf380 0000000b c105bd3c 00000000 ffffe000 00000000
bcc0: 10000100 7fffffff ffffe000 c105bdb4 c105bdb8 00000002 00000001 c0728ebc
bce0: 10000100 c30dde00 c105bdb8 c105bdb8 c18cf000 c105bda4 c105bdb4 00000000
bd00: c19aa800 c056669c 00000100 0000ffff 00000000 00000001 c19aa800 c0571cb4
bd20: c0d09494 c09e2bd8 c09e2bb0 c7f34162 00000900 00000100 81e0b900 00000035
bd40: 10000100 00000000 00000000 00000000 00000000 000001b5 00000000 00000000
bd60: 00000000 c105bd6c c105bda4 3b9aca00 00000000 00000100 00000001 00000000
bd80: 00000000 00000200 00000000 00000000 c105bda4 00000001 00000001 c105bd2c
bda0: 00000001 00000000 c105bd3c c105bd6c 00000000 00000000 c105bce8 c105bce8
bdc0: 00000000 c105bdc4 c105bdc4 c0565bac 00000000 00000000 00000000 00000000
bde0: c30de19c 00000100 00000000 c1155400 c1e0ba00 00000000 00000000 00000000
be00: 00000000 c057328c 00000000 c1e0b900 00000000 00000100 c30dde00 00000000
be20: 00000100 00000001 c1e0a800 c1e0b900 c1dc1da0 c1e0a800 c0c0d4d8 c0573480
be40: c1e0b900 00000100 c105beb4 bf1b705c c0c0df00 c11a4000 00000000 c193a8c0
be60: c105beb4 c18cf000 c19aa800 00000001 c7edc200 00000000 c18cf2d8 bf1b8388
be80: 00000000 c0727bd0 c1dc1da0 00000000 00000000 c1dc2da0 c014e830 00000000
bea0: 00000003 c18cf274 c1dc0ca0 00000004 60000013 00000001 00000000 00000000
bec0: 00000002 c046cc94 00000000 c18cf000 c19aa800 00000001 c7edc200 00000000
bee0: c18cf2d8 00000000 c0c0d4d8 c05736a0 00000100 00000122 c7edc200 c18cf000
bf00: c18cf2d4 c18cf000 c18cf2d4 00000000 c7edc200 00000000 c18cf2d8 c0573cb8
bf20: c18cf2d4 c4d14880 00000000 c0137088 c0cdb0a0 ffffe000 c4d14880 c0c0d4d8
bf40: c4d14894 c0cdb0a0 ffffe000 c0c0d4ec 00000008 c0137354 c0d09426 c09dfd28
bf60: c3085300 c3085300 c3085bc0 c105a000 00000000 c0137310 c4d14880 c10dfed4
bf80: c3085320 c013c7ec 00000000 c3085bc0 c013c6a8 00000000 00000000 00000000
bfa0: 00000000 00000000 00000000 c0100148 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c01496b0>] (set_next_entity) from [<cbfe6bc9>] (0xcbfe6bc9)
Code: c0c0df18 c0c0df48 e92d4ff0 e1a04001 (e591301c)
---[ end trace b08b1ab27e2a6927 ]---
note: kworker/0:0[1107] exited with preempt_count 2
Using gdb and addr2line, I found that the NULL pointer dereference happens at the if statement in set_next_entity(), shown in the following code sample; the parameter se is a NULL pointer.
set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
/* 'current' is not kept within the tree. */
if (se->on_rq) {
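The faulting address 0000001c fits a NULL se. The instruction in parentheses on the Code: line, e591301c, decodes to ldr r3, [r1, #28], and r1 (the second argument, se) is 00000000 in the register dump. The sketch below is a userspace model, not the kernel's definition. It assumes that on a 32-bit build struct sched_entity starts with load, run_node, group_node and on_rq, and it uses fixed 32-bit fields so the offset is the same on any host. The *_model32 names and on_rq_fault_address() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 32-bit layout of the leading fields of struct sched_entity. */
struct load_weight_model32 { uint32_t weight; uint32_t inv_weight; };
struct rb_node_model32 { uint32_t parent_color; uint32_t rb_right; uint32_t rb_left; };
struct list_head_model32 { uint32_t next; uint32_t prev; };

struct sched_entity_model32 {
    struct load_weight_model32 load;      /* 8 bytes */
    struct rb_node_model32 run_node;      /* 12 bytes */
    struct list_head_model32 group_node;  /* 8 bytes */
    uint32_t on_rq;                       /* expected at offset 28 */
};

/* Address that a load of se->on_rq would touch for a given value of se. */
static uint32_t on_rq_fault_address(uint32_t se)
{
    /* The model only holds if the compiler inserted no padding. */
    assert(sizeof(struct sched_entity_model32) == 32);
    return se + (uint32_t)offsetof(struct sched_entity_model32, on_rq);
}
```

Under these assumptions on_rq_fault_address(0) is 0x1c, the address reported in the oops.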
Digging a little deeper, it turns out that the NULL pointer se is returned by pick_next_entity(cfs_rq, NULL), which is called by pick_next_task_fair(). The sample code is shown below. Thus, something goes wrong in pick_next_entity(cfs_rq, NULL) just before the call to set_next_entity(cfs_rq, se).
pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
……
do {
se = pick_next_entity(cfs_rq, NULL);
set_next_entity(cfs_rq, se);
cfs_rq = group_cfs_rq(se);
} while (cfs_rq);
Then I dumped some members of cfs_rq in pick_next_entity(cfs_rq, NULL); the output is shown below. As you can see, the root node and the leftmost node of the rb tree are both 0/NULL, even though nr_running is 1. So I guess the rb tree was corrupted some time before this call to pick_next_entity().
(pick_next_entity:4505)cfs_rq[0xC0C0DF18]
(pick_next_entity:4507)cfs_rq->curr[0x0]
(pick_next_entity:4508)cfs_rq->next[0x0]
(pick_next_entity:4509)cfs_rq->last[0x0]
(pick_next_entity:4510)cfs_rq->skip[0x0]
(pick_next_entity:4511)cfs_rq->nr_running[1]
(pick_next_entity:4512)cfs_rq->h_nr_running[1]
(pick_next_entity:4513)cfs_rq->idle_h_nr_running[0]
(pick_next_entity:4514)cfs_rq->tasks_timeline.rb_leftmost[0x0]
(pick_next_entity:4515)cfs_rq->tasks_timeline.rb_root.rb_node[0x0]
(pick_next_task_fair:7131) picked next entity.
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
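The inconsistency in this dump can be written down as a check: the running entity is not kept in the tree, so with curr NULL and nr_running 1 the tree should hold one node, yet both tree pointers are NULL. The sketch below is a userspace model of that reasoning, not kernel code. struct cfs_rq_snapshot and both functions are hypothetical names, and the model assumes curr is counted in nr_running whenever it is non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the fields printed in the dump. */
struct cfs_rq_snapshot {
    const void *curr;         /* cfs_rq->curr */
    unsigned int nr_running;  /* cfs_rq->nr_running */
    const void *rb_leftmost;  /* tasks_timeline.rb_leftmost */
    const void *rb_root;      /* tasks_timeline.rb_root.rb_node */
};

/* Entities expected in the tree: all runnable ones except the current one. */
static unsigned int expected_queued(const struct cfs_rq_snapshot *s)
{
    unsigned int running = s->curr ? 1u : 0u;
    return s->nr_running > running ? s->nr_running - running : 0u;
}

static bool snapshot_is_consistent(const struct cfs_rq_snapshot *s)
{
    assert(s != NULL);
    bool tree_empty = (s->rb_root == NULL);
    /* The cached leftmost must be set exactly when the tree has nodes. */
    if (tree_empty != (s->rb_leftmost == NULL))
        return false;
    /* An empty tree is only legal when nothing is expected to be queued. */
    return tree_empty == (expected_queued(s) == 0);
}
```

Fed the values from the dump (curr NULL, nr_running 1, both tree pointers NULL), the check reports the state as inconsistent. A similar check in the kernel at enqueue and dequeue time might narrow down where the counters and the tree first diverge.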
As I'm not familiar with the fair scheduler and the kernel's rb tree, I have no idea how to detect when the rb tree gets corrupted or what could cause it. But I found a discussion of a similar issue, which provides the following patch to detect when the rb tree gets corrupted.
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 1fd61a9af45c..b4b4df3ad0fc 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -130,7 +130,28 @@ struct rb_root_cached {
#define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
/* Same as rb_first(), but O(1) */
-#define rb_first_cached(root) (root)->rb_leftmost
+#define __rb_first_cached(root) (root)->rb_leftmost
+
+#ifndef CONFIG_RBTREE_DEBUG
+# define rb_first_cached(root) __rb_first_cached(root)
+# define rbtree_cached_debug(root) do { } while(0)
+
+#else
+static inline struct rb_node *rb_first_cached(struct rb_root_cached *root)
+{
+ struct rb_node *leftmost = __rb_first_cached(root);
+
+ WARN_ON(leftmost != rb_first(&root->rb_root));
+ return leftmost;
+}
+
+#define rbtree_cached_debug(root) \
+do { \
+ WARN_ON(rb_first(&(root)->rb_root) != __rb_first_cached((root))); \
+ WARN_ON(!RB_EMPTY_ROOT(&(root)->rb_root) && !__rb_first_cached((root))); \
+ WARN_ON(RB_EMPTY_ROOT(&(root)->rb_root) && __rb_first_cached((root))); \
+} while (0)
+#endif /* CONFIG_RBTREE_DEBUG */
static inline void rb_insert_color_cached(struct rb_node *node,
struct rb_root_cached *root,
@@ -139,6 +160,8 @@ static inline void rb_insert_color_cached(struct rb_node *node,
if (leftmost)
root->rb_leftmost = node;
rb_insert_color(node, &root->rb_root);
+
+ rbtree_cached_debug(root);
}
static inline void rb_erase_cached(struct rb_node *node,
@@ -147,6 +170,8 @@ static inline void rb_erase_cached(struct rb_node *node,
if (root->rb_leftmost == node)
root->rb_leftmost = rb_next(node);
rb_erase(node, &root->rb_root);
+
+ rbtree_cached_debug(root);
}
static inline void rb_replace_node_cached(struct rb_node *victim,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2f6fb96405af..62ab9f978bc6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1727,6 +1727,16 @@ config BACKTRACE_SELF_TEST
Say N if you are unsure.
+config RBTREE_DEBUG
+ bool "Red-Black tree sanity tests"
+ depends on DEBUG_KERNEL
+ help
+ This option enables runtime sanity checks on all variants
+ of the rbtree library. Doing so can cause significant overhead,
+ so only enable it in non-production environments.
+
+ Say N if you are unsure.
+
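The invariant this patch enforces is that the cached leftmost pointer always equals the result of walking the tree to its leftmost node. The second and third WARN_ON conditions in rbtree_cached_debug() are special cases of the first. The sketch below models that invariant in userspace with a plain binary tree rather than the kernel's rbtree. struct node, struct root_cached, tree_first() and cached_leftmost_ok() are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain binary-tree stand-ins for struct rb_node and struct rb_root_cached. */
struct node {
    struct node *left, *right;
};

struct root_cached {
    struct node *root;      /* models rb_root.rb_node */
    struct node *leftmost;  /* models rb_leftmost */
};

/* Walk down the left spine, like rb_first(). */
static struct node *tree_first(const struct root_cached *t)
{
    struct node *n = t->root;

    if (!n)
        return NULL;
    while (n->left)
        n = n->left;
    return n;
}

/* True when the cached pointer agrees with the tree. */
static bool cached_leftmost_ok(const struct root_cached *t)
{
    assert(t != NULL);
    return tree_first(t) == t->leftmost;
}
```

In this model, a tree whose root is NULL while leftmost still points at a node fails the check, and so does a non-empty tree with a NULL leftmost.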
After applying this patch, multiple warning messages appear before the kernel oops; the first one is shown below. Analysis of this message shows that the first corruption of the rb tree shows up in rb_first_cached(), which is called by __dequeue_entity(), which is called by set_next_entity().
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at ./include/linux/rbtree.h:175 set_next_entity+0x1f4/0x2dc
Modules linked in: lp(O) g_serial usb_f_serial u_serial rfid(O) industrial(O) applicator(O) libcomposite tm(PO) cutter(O) sensor(O) motor(O) pe(O) ui_leds(O) doorbell(O) psu(O) ui_buttons(O) modules(O)
CPU: 0 PID: 0 Comm: swapper Tainted: P O 5.10.9 #11
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[<c010bb64>] (unwind_backtrace) from [<c0109ed8>] (show_stack+0x10/0x14)
[<c0109ed8>] (show_stack) from [<c0120ba4>] (__warn+0xe4/0xe8)
[<c0120ba4>] (__warn) from [<c0120c38>] (warn_slowpath_fmt+0x90/0xa0)
[<c0120c38>] (warn_slowpath_fmt) from [<c014987c>] (set_next_entity+0x1f4/0x2dc)
[<c014987c>] (set_next_entity) from [<c0149d40>] (pick_next_task_fair+0xbc/0x3c4)
[<c0149d40>] (pick_next_task_fair) from [<c07286a8>] (__schedule+0x200/0x7b0)
[<c07286a8>] (__schedule) from [<c072905c>] (schedule_idle+0x38/0x7c)
[<c072905c>] (schedule_idle) from [<c01478a8>] (do_idle+0x14c/0x214)
[<c01478a8>] (do_idle) from [<c0147c40>] (cpu_startup_entry+0xc/0x10)
[<c0147c40>] (cpu_startup_entry) from [<c0b00d44>] (start_kernel+0x42c/0x444)
---[ end trace 5b768ae127781c97 ]---
That's all I have so far. I still have no idea why the rb tree gets corrupted, and I have not found the root cause. Any suggestions on how to debug this kind of issue, or how to gather more information for deeper analysis, would be greatly appreciated. Could it be caused by a defect in a device driver, or might it be a kernel bug?

How to read value of a member of Process Environment Block in Windbg

I'm new to debugging and I'm trying to understand what values are being populated in the members of a Process Environment Block (PEB). For example, what does the value 0x7fb80000 of AnsiCodePageData mean? How do I read it?
+0x050 SharedData : (null)
+0x054 ReadOnlyStaticServerData : 0x7fa504b0 -> (null)
+0x058 AnsiCodePageData : 0x7fb80000 Void
+0x05c OemCodePageData : 0x7fb80000 Void
+0x060 UnicodeCaseTableData : 0x7fba7c24 Void
+0x064 NumberOfProcessors : 2
+0x068 NtGlobalFlag : 0x70
0:000> dt ntdll!_PEB -y Ansi @$peb
+0x058 AnsiCodePageData : 0x7ffb0000 Void
using !address
0:000> !address 7ffb0000
Usage: Other
Base Address: 7ffb0000
End Address: 7ffd3000
Region Size: 00023000 ( 140.000 kB)
State: 00001000 MEM_COMMIT
Protect: 00000002 PAGE_READONLY
Type: 00040000 MEM_MAPPED
Allocation Base: 7ffb0000
Allocation Protect: 00000002 PAGE_READONLY
Additional info: NLS Tables
Content source: 1 (target), length: 23000
dumping raw contents
0:000> dc 7ffb0000
7ffb0000 04e4000d 003f0001 003f003f 0000003f ......?.?.?.?...
7ffb0010 00000000 00000000 01030000 00010000 ................
7ffb0020 00030002 00050004 00070006 00090008 ................
7ffb0030 000b000a 000d000c 000f000e 00110010 ................
7ffb0040 00130012 00150014 00170016 00190018 ................
7ffb0050 001b001a 001d001c 001f001e 00210020 ............ .!.
7ffb0060 00230022 00250024 00270026 00290028 ".#.$.%.&.'.(.).
7ffb0070 002b002a 002d002c 002f002e 00310030 *.+.,.-.../.0.1.
0:000>
using !vprot
0:000> !vprot 7ffb0000
BaseAddress: 7ffb0000
AllocationBase: 7ffb0000
AllocationProtect: 00000002 PAGE_READONLY
RegionSize: 00023000
State: 00001000 MEM_COMMIT
Protect: 00000002 PAGE_READONLY
Type: 00040000 MEM_MAPPED
0:000>
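The first few words of the raw dump can be read by hand. The sketch below assumes the region starts with a code page table header in the layout described by the documented CPTABLEINFO structure: a header-size word followed by CodePage, MaximumCharacterSize and DefaultChar, all 16-bit little-endian. That layout is an assumption about what !address labels "NLS Tables", and struct nls_header, le16_at() and parse_nls_header() are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names follow the assumed header layout; see the text above. */
struct nls_header {
    uint16_t header_size_words;
    uint16_t code_page;
    uint16_t max_char_size;
    uint16_t default_char;
};

/* Read one little-endian 16-bit word at the given word index. */
static uint16_t le16_at(const uint8_t *p, size_t word_index)
{
    return (uint16_t)(p[2 * word_index] | (p[2 * word_index + 1] << 8));
}

/* Decode the first four words of the table. */
static struct nls_header parse_nls_header(const uint8_t *bytes, size_t len)
{
    struct nls_header h;

    assert(bytes != NULL && len >= 8);
    h.header_size_words = le16_at(bytes, 0);
    h.code_page = le16_at(bytes, 1);
    h.max_char_size = le16_at(bytes, 2);
    h.default_char = le16_at(bytes, 3);
    return h;
}
```

The dc output shows 04e4000d 003f0001 as dwords, which is the byte sequence 0d 00 e4 04 01 00 3f 00 in memory. Under the assumed layout that reads as a 13-word header, code page 0x04e4 (1252), single-byte characters, and '?' (0x3f) as the default character.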

NSData with different content have the same hash

I was using the hash method on NSData when I noticed that the hash was the same for two different data contents.
This was my code:
NSDictionary * d1 = @{@"sums":@[@{@"label":@"Work", @"value":@30}, @{@"label":@"Transport", @"value":@50}, @{@"label":@"Material", @"value":@300}]};
NSData * dd1 = [NSKeyedArchiver archivedDataWithRootObject:d1];
NSDictionary * d2 = @{@"sums":@[@{@"label":@"Work", @"value":@30}, @{@"label":@"Transport", @"value":@50}, @{@"label":@"Material", @"value":@9}]};
NSData * dd2 = [NSKeyedArchiver archivedDataWithRootObject:d2];
NSInteger hashd1 = [dd1 hash];
NSInteger hashd2 = [dd2 hash];
The last value is different, but the hash values are the same.
I was wondering how the hash for NSData is calculated in Objective-C, as there is no clear guidance in the docs.
Printing description of hashd1:
(NSInteger) hashd1 = 211676908
Printing description of hashd2:
(NSInteger) hashd2 = 211676908
Printing description of dd2:
<62706c69 73743030 d4010203 04050641 42582476 65727369 6f6e5824 6f626a65 63747359 24617263 68697665 72542474 6f701200 0186a0af 10110708 11121820 21222324 2a323334 3c3d3e55 246e756c 6cd3090a 0b0c0e10 574e532e 6b657973 5a4e532e 6f626a65 63747356 24636c61 7373a10d 8002a10f 80038009 5473756d 73d20a0b 1317a314 15168004 800a800d 8010d309 0a0b191c 10a21a1b 80058006 a21d1e80 07800880 09556c61 62656c55 76616c75 6554576f 726b101e d2252627 285a2463 6c617373 6e616d65 5824636c 61737365 735c4e53 44696374 696f6e61 7279a227 29584e53 4f626a65 6374d309 0a0b2b2e 10a21a1b 80058006 a22f3080 0b800c80 09595472 616e7370 6f727410 32d3090a 0b353810 a21a1b80 058006a2 393a800e 800f8009 584d6174 65726961 6c1205f5 e100d225 263f4057 4e534172 726179a2 3f295f10 0f4e534b 65796564 41726368 69766572 d1434454 726f6f74 80010008 0011001a 0023002d 00320037 004b0051 00580060 006b0072 00740076 0078007a 007c0081 0086008a 008c008e 00900092 0099009c 009e00a0 00a300a5 00a700a9 00af00b5 00ba00bc 00c100cc 00d500e2 00e500ee 00f500f8 00fa00fc 00ff0101 01030105 010f0111 0118011b 011d011f 01220124 01260128 01310136 013b0143 01460158 015b0160 00000000 00000201 00000000 00000045 00000000 00000000 00000000 00000162>
Printing description of dd1:
<62706c69 73743030 d4010203 04050641 42582476 65727369 6f6e5824 6f626a65 63747359 24617263 68697665 72542474 6f701200 0186a0af 10110708 11121820 21222324 2a323334 3c3d3e55 246e756c 6cd3090a 0b0c0e10 574e532e 6b657973 5a4e532e 6f626a65 63747356 24636c61 7373a10d 8002a10f 80038009 5473756d 73d20a0b 1317a314 15168004 800a800d 8010d309 0a0b191c 10a21a1b 80058006 a21d1e80 07800880 09556c61 62656c55 76616c75 6554576f 726b101e d2252627 285a2463 6c617373 6e616d65 5824636c 61737365 735c4e53 44696374 696f6e61 7279a227 29584e53 4f626a65 6374d309 0a0b2b2e 10a21a1b 80058006 a22f3080 0b800c80 09595472 616e7370 6f727410 32d3090a 0b353810 a21a1b80 058006a2 393a800e 800f8009 584d6174 65726961 6c11012c d225263f 40574e53 41727261 79a23f29 5f100f4e 534b6579 65644172 63686976 6572d143 4454726f 6f748001 00080011 001a0023 002d0032 0037004b 00510058 0060006b 00720074 00760078 007a007c 00810086 008a008c 008e0090 00920099 009c009e 00a000a3 00a500a7 00a900af 00b500ba 00bc00c1 00cc00d5 00e200e5 00ee00f5 00f800fa 00fc00ff 01010103 0105010f 01110118 011b011d 011f0122 01240126 01280131 01340139 01410144 01560159 015e0000 00000000 02010000 00000000 00450000 00000000 00000000 00000000 0160>
Even the byte length is different:
(lldb) po [dd1 length]
522
(lldb) po [dd2 length]
524
It's an implementation detail of NSData, but it only uses the first 80 bytes of data to compute the hash, viz: https://opensource.apple.com/source/CF/CF-635.21/CFData.c:
static CFHashCode __CFDataHash(CFTypeRef cf) {
CFDataRef data = (CFDataRef)cf;
return CFHashBytes((uint8_t *)CFDataGetBytePtr(data), __CFMin(__CFDataLength(data), 80));
}
The keyed archiver adds enough preamble that the two results are the same up to that length.
You might like the additional detail available at How does NSData's implementation of the hash method work?
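The effect is easy to reproduce outside Foundation. The sketch below hashes only the first 80 bytes of a buffer, as the __CFDataHash excerpt does, but it uses FNV-1a as a stand-in because the body of CFHashBytes is not shown here. fnv1a() and prefix_hash() are illustrative names, and the actual hash values will not match NSData's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_PREFIX_LEN 80 /* the limit passed to CFHashBytes in the excerpt */

/* FNV-1a, used here only as a stand-in for CFHashBytes. */
static uint32_t fnv1a(const uint8_t *bytes, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= bytes[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash at most the first HASH_PREFIX_LEN bytes of the buffer. */
static uint32_t prefix_hash(const uint8_t *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    size_t n = len < HASH_PREFIX_LEN ? len : HASH_PREFIX_LEN;
    return fnv1a(bytes, n);
}
```

Two buffers of different lengths that differ only after byte 80 get the same prefix_hash(), which is the situation with the two archives in the question. Equal hashes do not imply equal data, so compare with isEqualToData: when you need certainty.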

Retrieving IOCTL Input Buffer Content From Crash Dump + Windbg[BSOD]

We know that user-mode applications can pass an IOCTL code and a data buffer to kernel device drivers by calling the DeviceIoControl() API.
BOOL WINAPI DeviceIoControl(
_In_ HANDLE hDevice,
_In_ DWORD dwIoControlCode, <--Control Code
_In_opt_ LPVOID lpInBuffer, <- Input buffer pointer
_In_ DWORD nInBufferSize, <- Input buffer size
_Out_opt_ LPVOID lpOutBuffer,
_In_ DWORD nOutBufferSize,
_Out_opt_ LPDWORD lpBytesReturned,
_Inout_opt_ LPOVERLAPPED lpOverlapped
);
I have a situation where a user-mode application sometimes passes an IOCTL buffer to a kernel driver, and this causes a BSOD again and again. Every time, I get a kernel memory dump for the BSOD.
So my question is: is it possible to find the exact malformed input buffer and IOCTL code that cause the BSOD from the kernel memory dump, so that I can reproduce the BSOD with a simple C program?
As you can see from the stack trace, it crashes just after the NtDeviceIoControlFile call.
kd> kb
ChildEBP RetAddr Args to Child
b8048798 805246fb 00000050 ffff0000 00000001 nt!KeBugCheckEx+0x1b
b80487e4 804e1ff1 00000001 ffff0000 00000000 nt!MmAccessFault+0x6f5
b80487e4 804ed0db 00000001 ffff0000 00000000 nt!KiTrap0E+0xcc
b80488b4 804ed15a 88e23a38 b8048900 b80488f4 nt!IopCompleteRequest+0x92
b8048904 806f2c0a 00000000 00000000 b804891c nt!KiDeliverApc+0xb3
b8048904 806ed0b3 00000000 00000000 b804891c hal!HalpApcInterrupt2ndEntry+0x31
b8048990 804e59ec 88e23a38 88e239f8 00000000 hal!KfLowerIrql+0x43
b80489b0 804ed174 88e23a38 896864c8 00000000 nt!KeInsertQueueApc+0x4b
b80489e4 f7432123 8960e9d8 8980b300 00000000 nt!IopfCompleteRequest+0x1d8
WARNING: Stack unwind information not available. Following frames may be wrong.
b80489f8 804e3d77 0000001c 0000001c 806ed070 NinjaDriver+0x1123
b8048a08 8056a9ab 88e23a8c 896864c8 88e239f8 nt!IopfCallDriver+0x31
b8048a1c 8057d9f7 89817030 88e239f8 896864c8 nt!IopSynchronousServiceTail+0x60
b8048ac4 8057fbfa 00000090 00000000 00000000 nt!IopXxxControlFile+0x611
b8048af8 b6e6a06f 00000090 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
b8048b8c b6e6a5c3 00000001 00000090 00000000 Ninja+0x506f
b8048c80 b6e6ab9b 00000001 88da9898 00000090 Ninja+0x55c3
b8048d34 804df06b 00000090 00000000 00000000 Ninja+0x5b9b
b8048d34 7c90ebab 00000090 00000000 00000000 nt!KiFastCallEntry+0xf8
00f8fd7c 00000000 00000000 00000000 00000000 0x7c90ebab
Thanks in Advance,
You would need the function signature for nt!NtDeviceIoControlFile. With that info unassemble backwards from nt!NtDeviceIoControlFile's return address with ub b6e6a06f. This will show you how Ninja sets up the arguments for its call to nt!NtDeviceIoControlFile. Find the args that correspond to the ioctl code and buffer and then dump their contents.
Note that registers will have been reused so you may need to dig back further in the disassembly to get the correct values from the non-volatile registers which will have been saved on the stack before the function call.
In the windbg help file (debugger.chm) there is a very useful page titled "x86 Architecture". In this case, you may want to read the sections titled "Registers" and "Calling Conventions".
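As a complement to reading the disassembly, the argument slots can also be located by arithmetic on the frame address. The sketch below assumes a standard x86 stdcall frame for nt!NtDeviceIoControlFile (saved EBP at [ebp], return address at [ebp+4], arguments from [ebp+8]) and the documented ten-parameter order, in which IoControlCode, InputBuffer and InputBufferLength are the sixth, seventh and eighth parameters. stack_arg_address(), struct ioctl_fields and decode_ioctl() are names made up for this example; decode_ioctl() splits a code into the fields packed by the CTL_CODE macro.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-based positions in the NtDeviceIoControlFile parameter list. */
enum {
    ARG_IO_CONTROL_CODE = 5,
    ARG_INPUT_BUFFER = 6,
    ARG_INPUT_BUFFER_LENGTH = 7
};

/* Address of argument `index`, assuming a standard EBP-based frame. */
static uint32_t stack_arg_address(uint32_t child_ebp, unsigned int index)
{
    assert(index < 10); /* NtDeviceIoControlFile takes ten parameters */
    return child_ebp + 8u + 4u * index;
}

struct ioctl_fields {
    uint32_t device_type, access, function, method;
};

/* Split an IOCTL code into the fields packed by CTL_CODE. */
static struct ioctl_fields decode_ioctl(uint32_t code)
{
    struct ioctl_fields f;

    f.device_type = code >> 16;
    f.access = (code >> 14) & 0x3u;
    f.function = (code >> 2) & 0xfffu;
    f.method = code & 0x3u;
    return f;
}
```

In the trace above the ChildEBP of the nt!NtDeviceIoControlFile frame is b8048af8, so under these assumptions the IOCTL code, input buffer pointer and input length would sit at b8048b14, b8048b18 and b8048b1c, and dd b8048b14 L3 would show them. Whether the user-mode buffer is still readable depends on what the dump captured.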

ring0 APC DLL injection crash target process on win7

I am trying to implement a ring0 DLL injector driver using APC injection. The code works perfectly on Windows XP, but on Windows 7 it keeps crashing the target process.
Here is the code:
NTSTATUS InjectDll( IN ULONG ulProcessId, HANDLE hEvent, IN TCHAR * pszDllPath )
{
NTSTATUS ntStatus;
HANDLE hProcess = NULL;
OBJECT_ATTRIBUTES ObjectAttributes;
CLIENT_ID ClientId;
PKEVENT pevtWait;
PKPROCESS pPcb = NULL;
PVOID pArg = NULL;
ULONG ulSize = 0;
KAPC_STATE ApcState;
PKAPC pApc;
PVOID pTcb;
PVOID pfnLoadLibrary;
KPROCESSOR_MODE PreviousMode = ExGetPreviousMode();
LARGE_INTEGER nTimeout;
nTimeout.QuadPart = (LONGLONG) -300000000;
do
{
InitializeObjectAttributes(
&ObjectAttributes,
NULL,
0,
NULL,
NULL);
ClientId.UniqueProcess = (HANDLE)ulProcessId;
ClientId.UniqueThread = 0;
ntStatus = ZwOpenProcess(
&hProcess,
PROCESS_ALL_ACCESS,
&ObjectAttributes,
&ClientId
);
if ( NT_ERROR(ntStatus) )
{
DbgPrint("Open process %d failed - error 0x%X\r\n", ulProcessId, ntStatus);
break;
}
ulSize = MAX_FILENAME_LEN * sizeof(WCHAR);
ntStatus = ZwAllocateVirtualMemory(
hProcess,
&pArg,
0,
&ulSize,
MEM_RESERVE|MEM_COMMIT,
PAGE_READWRITE);
if ( NT_ERROR(ntStatus) )
{
DbgPrint("ZwAllocateVirtualMemory failed - error 0x%X\r\n", ntStatus);
// thus, make it leak in target process space!
break;
}
ntStatus = ObReferenceObjectByHandle(
hProcess,
PROCESS_ALL_ACCESS,
0,//should be PsProcessType,
PreviousMode,
(PVOID*)&pPcb,
NULL);
if ( NT_ERROR(ntStatus) )
{
DbgPrint("ObReferenceObjectByHandle 0x%X failed - error 0x%X\r\n", hProcess, ntStatus);
break;
}
//enter target process space
KeStackAttachProcess(pPcb,&ApcState);
pfnLoadLibrary = GetLoadLibraryAddress(pPcb);
if (!pfnLoadLibrary)
{
DbgPrint("Failed to get address of LoadLibrary\r\n");
// leave target process space
KeUnstackDetachProcess(&ApcState);
break;
}
else
{
DbgPrint("Get address of LoadLibrary : 0x%X\r\n", pfnLoadLibrary);
}
RtlCopyMemory( pArg, pszDllPath, MAX_FILENAME_LEN * sizeof(TCHAR));
// leave target process space
KeUnstackDetachProcess(&ApcState);
// get target thread
pTcb = GetThreadByProcess(pPcb);
if (!pTcb)
{
DbgPrint("Get thread failed!\r\n");
ntStatus = STATUS_UNSUCCESSFUL;
break;
}
// start inject
pApc = (PKAPC)ExAllocatePoolWithTag(NonPagedPool,sizeof(KAPC),POOL_TAG);
if (!pApc)
{
DbgPrint("Failed to allocate memory for the APC structure\r\n");
ntStatus = STATUS_INSUFFICIENT_RESOURCES;
break;
}
KeInitializeApc(
pApc,
(PETHREAD)pTcb,
OriginalApcEnvironment,
&ApcKernelRoutine,
NULL,
(PKNORMAL_ROUTINE)pfnLoadLibrary,
UserMode,
pArg);
if (!KeInsertQueueApc(pApc,NULL,NULL,0))
{
DbgPrint("Failed to insert APC\r\n");
ntStatus = STATUS_UNSUCCESSFUL;
ExFreePool(pApc);
break;
}
} while (0);
if ( pPcb )
{
ObDereferenceObject(pPcb);
}
if ( hProcess)
{
ZwClose(hProcess);
}
DbgPrint("InjectDll end - status = 0x%X\r\n", ntStatus);
return ntStatus;
}
InjectDll is called by the DeviceIoControl service routine.
DRIVER_DISPATCH InjectorDeviceControl;
NTSTATUS NTAPI InjectorDeviceControl(
IN PDEVICE_OBJECT pDeviceObject,
PIRP pIrp)
{
PIO_STACK_LOCATION pStack;
PINJECT_DLL_PARAM pParam;
NTSTATUS ntStatus;
DbgPrint("call InjectorDeviceControl...\r\n");
/* Get the stack location and parameters */
pStack = IoGetCurrentIrpStackLocation(pIrp);
pParam = (PINJECT_DLL_PARAM)pIrp->AssociatedIrp.SystemBuffer;
if (pStack->Parameters.DeviceIoControl.IoControlCode != IOCTL_INJECTDLL)
{
/* Unsupported command */
ntStatus = STATUS_NOT_IMPLEMENTED;
}
else
{
/* Validate the input buffer length */
if (pStack->Parameters.DeviceIoControl.InputBufferLength < sizeof(INJECT_DLL_PARAM))
{
/* Invalid buffer */
ntStatus = STATUS_INVALID_PARAMETER;
}
else
{
/* Inject dll */
ntStatus = InjectDll(pParam->ulProcessId, pParam->hEvent, pParam->DllName);
}
}
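/* Report the result in the IRP before completing it */
pIrp->IoStatus.Status = ntStatus;
pIrp->IoStatus.Information = 0;
IoCompleteRequest(pIrp, IO_NO_INCREMENT);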
return ntStatus;
}
Here is the crash stack trace of the target process.
FAULTING_IP:
ntdll!RtlActivateActivationContextUnsafeFast+9c
7770f59a 8933 mov dword ptr [ebx],esi
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 7770f59a (ntdll!RtlActivateActivationContextUnsafeFast+0x0000009c)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00000000
Attempt to write to address 00000000
DEFAULT_BUCKET_ID: NULL_POINTER_WRITE
PROCESS_NAME: QQProtect.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - 0x%08lx
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - 0x%08lx
EXCEPTION_PARAMETER1: 00000001
EXCEPTION_PARAMETER2: 00000000
WRITE_ADDRESS: 00000000
FOLLOWUP_IP:
ntdll!RtlActivateActivationContextUnsafeFast+9c
7770f59a 8933 mov dword ptr [ebx],esi
NTGLOBALFLAG: 0
FAULTING_THREAD: 00000300
PRIMARY_PROBLEM_CLASS: NULL_POINTER_WRITE
BUGCHECK_STR: APPLICATION_FAULT_NULL_POINTER_WRITE
LAST_CONTROL_TRANSFER: from 77740baa to 7770f59a
STACK_TEXT:
021df8a0 77740baa 75655a2c 00000000 02790000 ntdll!RtlActivateActivationContextUnsafeFast+0x9c
021df91c 77740461 002b5cd0 021dfab8 756559bc ntdll!LdrpProcessStaticImports+0x1b8
021dfa8c 7774232c 021dfaec 021dfab8 00000000 ntdll!LdrpLoadDll+0x314
021dfac0 759088ee 0026f074 021dfb00 021dfaec ntdll!LdrLoadDll+0x92
021dfaf8 75dc3c12 00000000 00000000 00000001 KERNELBASE!LoadLibraryExW+0x15a
021dfb0c 77726f7d 02790000 00000000 00000000 kernel32!LoadLibraryW+0x11
021dff88 75dc3c45 00000000 021dffd4 777437f5 ntdll!KiUserApcDispatcher+0x25
021dff94 777437f5 00295fe8 75655ce4 00000000 kernel32!BaseThreadInitThunk+0xe
021dffd4 777437c8 7770fd0f 00295fe8 00000000 ntdll!__RtlUserThreadStart+0x70
021dffec 00000000 7770fd0f 00295fe8 00000000 ntdll!_RtlUserThreadStart+0x1b
STACK_COMMAND: ~1s; .ecxr ; kb
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: ntdll!RtlActivateActivationContextUnsafeFast+9c
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: ntdll
IMAGE_NAME: ntdll.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 4ce7b96e
FAILURE_BUCKET_ID: NULL_POINTER_WRITE_c0000005_ntdll.dll!RtlActivateActivationContextUnsafeFast
BUCKET_ID: APPLICATION_FAULT_NULL_POINTER_WRITE_ntdll!RtlActivateActivationContextUnsafeFast+9c
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/QQProtect_exe/3_0_1_3629/50a06369/ntdll_dll/6_1_7601_17514/4ce7b96e/c0000005/0002f59a.htm?Retriage=1
Followup: MachineOwner
Any ideas?
I've been working on DLL injection by APC and got this exact same problem.
After some reverse engineering, I found why it occurred.
It happens when the ActivationContextStackPointer member of the current thread's TEB is NULL:
ntdll!_TEB
+0x000 NtTib : _NT_TIB
+0x01c EnvironmentPointer : (null)
+0x020 ClientId : _CLIENT_ID
+0x028 ActiveRpcHandle : (null)
+0x02c ThreadLocalStoragePointer : (null)
+0x030 ProcessEnvironmentBlock : 0x7ffd8000 _PEB
+0x034 LastErrorValue : 0
+0x038 CountOfOwnedCriticalSections : 0
+0x03c CsrClientThread : (null)
+0x040 Win32ThreadInfo : (null)
+0x044 User32Reserved : [26] 0
+0x0ac UserReserved : [5] 0
+0x0c0 WOW32Reserved : (null)
+0x0c4 CurrentLocale : 0x804
+0x0c8 FpSoftwareStatusRegister : 0
+0x0cc SystemReserved1 : [54] (null)
+0x1a4 ExceptionCode : 0n0
+0x1a8 ActivationContextStackPointer : (null) <---it's null
The RtlActivateActivationContextUnsafeFast function tries to insert a node into the activation context list, but ActivationContextStackPointer is NULL, so an access violation is raised.
mov dword ptr [ebx],esi //ebx = teb->ActivationContextStackPointer, which is 00000000; esi = node
The activation context is related to the DLL's manifest.
You can get more information about this in the Microsoft documentation.
When a DLL with a manifest is loaded, RtlActivateActivationContextUnsafeFast is called.
In my case, I solved the problem by disabling manifest generation for my DLL:
linker -> (/MANIFEST:NO)
Alternatively, you can try to force the system to allocate an activation context stack for the target thread by making the target thread call the ActivateActCtx function (I'm not sure whether that is possible).
Hope this helps you.
I stumbled upon the same problem, with a slightly different crash on Windows 10 (invalid address access, but not NULL pointer dereference). I didn't like 王云路's solution since I wanted to be able to load any DLL. What worked for me is calling NtQueueApcThread directly instead of using QueueUserAPC. In this case, RtlDispatchAPC is bypassed and the activation functions are not called.
Reference:
https://repnz.github.io/posts/apc/user-apc/#queueuserapc-kernelbase-dll-layer

Linux Kernel programming: trying to get vm_area_struct->vm_start crashes kernel

This is for an assignment at school, where I need to determine the size of the processes on the system using a system call. My code is as follows:
...
struct task_struct *p;
struct vm_area_struct *v;
struct mm_struct *m;
read_lock(&tasklist_lock);
for_each_process(p) {
printk("%ld\n", p->pid);
m = p->mm;
v = m->mmap;
long start = v->vm_start;
printk("vm_start is %ld\n", start);
}
read_unlock(&tasklist_lock);
...
When I run a user level program that calls this system call, the output that I get is:
1
vm_start is 134512640
2
EIP: 0073:[<0806e352>] CPU: 0 Not tainted ESP: 007b:0f7ecf04 EFLAGS: 00010246
Not tainted
EAX: 00000000 EBX: 0fc587c0 ECX: 081fbb58 EDX: 00000000
ESI: bf88efe0 EDI: 0f482284 EBP: 0f7ecf10 DS: 007b ES: 007b
081f9bc0: [<08069ae8>] show_regs+0xb4/0xb9
081f9bec: [<080587ac>] segv+0x225/0x23d
081f9c8c: [<08058582>] segv_handler+0x4f/0x54
081f9cac: [<08067453>] sig_handler_common_skas+0xb7/0xd4
081f9cd4: [<08064748>] sig_handler+0x34/0x44
081f9cec: [<080648b5>] handle_signal+0x4c/0x7a
081f9d0c: [<08066227>] hard_handler+0xf/0x14
081f9d1c: [<00776420>] 0x776420
Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x806e352
EIP: 0073:[<400ea0f2>] CPU: 0 Not tainted ESP: 007b:bf88ef9c EFLAGS: 00000246
Not tainted
EAX: ffffffda EBX: 00000000 ECX: bf88efc8 EDX: 080483c8
ESI: 00000000 EDI: bf88efe0 EBP: bf88f038 DS: 007b ES: 007b
081f9b28: [<08069ae8>] show_regs+0xb4/0xb9
081f9b54: [<08058a1a>] panic_exit+0x25/0x3f
081f9b68: [<08084f54>] notifier_call_chain+0x21/0x46
081f9b88: [<08084fef>] __atomic_notifier_call_chain+0x17/0x19
081f9ba4: [<08085006>] atomic_notifier_call_chain+0x15/0x17
081f9bc0: [<0807039a>] panic+0x52/0xd8
081f9be0: [<080587ba>] segv+0x233/0x23d
081f9c8c: [<08058582>] segv_handler+0x4f/0x54
081f9cac: [<08067453>] sig_handler_common_skas+0xb7/0xd4
081f9cd4: [<08064748>] sig_handler+0x34/0x44
081f9cec: [<080648b5>] handle_signal+0x4c/0x7a
081f9d0c: [<08066227>] hard_handler+0xf/0x14
081f9d1c: [<00776420>] 0x776420
The first process (pid = 1) gave me the vm_start without any problems, but when I try to access the second process, the kernel crashes. Can anyone tell me what's wrong, and maybe how to fix it as well? Thanks a lot!
(sorry for the bad formatting....)
Edit: This is done on a Fedora 2.6 kernel in a UML environment.
Some kernel threads might not have mm filled - check p->mm for NULL.
Changed the code to check for null pointers:
m = p->mm;
if (m != 0) {
v = m->mmap;
if (v != 0) {
long start = v->vm_start;
printk("vm_start is %ld\n", start);
}
}
All process-related information can be found in the /proc filesystem at the userspace level. Inside the kernel, this information is generated by fs/proc/*.c
http://lxr.linux.no/linux+v3.2.4/fs/proc/
Looking at the file task_mmu.c, which prints all the vm_start information, you can observe that every access to the vm_start field requires mmap_sem to be held:
down_read(&mm->mmap_sem);
for (vma = mm->mmap; vma; vma = vma->vm_next) {
clear_refs_walk.private = vma;
...
walk_page_range(vma->vm_start, vma->vm_end,
&clear_refs_walk);
For kernel threads mm will be NULL, so check the pointer before you take the lock or read through it:
mm = p->mm;
if (mm) {
down_read(&mm->mmap_sem);
/* read the contents of mm */
up_read(&mm->mmap_sem);
}
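You may also use get_task_mm(). It takes the task lock internally, returns NULL for kernel threads, and takes a reference on the mm so that it cannot be freed while you use it. You still need mmap_sem to walk the VMA list, and you must drop the reference with mmput() when you are done. Here is how you use it:
struct mm_struct *mm;
mm = get_task_mm(p);
if (mm) {
/* read the mm contents */
mmput(mm); /* drop the reference taken by get_task_mm() */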
}
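For the original goal of reporting a process size, the NULL check combines naturally with a walk over the whole VMA list instead of reading only the first vm_start. The sketch below is a userspace model of that loop, with no locking and no real kernel types. struct vma_model, struct mm_model, struct task_model and task_vm_size() are stand-ins invented here that carry only the fields the question touches.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins with only the fields the question touches. */
struct vma_model {
    unsigned long vm_start, vm_end;
    struct vma_model *vm_next;
};

struct mm_model {
    struct vma_model *mmap; /* head of the VMA list */
};

struct task_model {
    int pid;
    struct mm_model *mm; /* NULL for kernel threads */
};

/* Sum of (vm_end - vm_start) over all mappings; 0 when there is no mm. */
static unsigned long task_vm_size(const struct task_model *p)
{
    unsigned long total = 0;

    assert(p != NULL);
    if (!p->mm)
        return 0;
    for (const struct vma_model *v = p->mm->mmap; v; v = v->vm_next)
        total += v->vm_end - v->vm_start;
    return total;
}
```

In the kernel the same loop would run with mmap_sem held for reading, as in the task_mmu.c excerpt above.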