How to read the value of a member of the Process Environment Block in WinDbg

I'm new to debugging and I'm trying to understand the values stored in the members of a Process Environment Block (PEB). For example, what does the value 0x7fb80000 of AnsiCodePageData mean, and how do I read it?
+0x050 SharedData : (null)
+0x054 ReadOnlyStaticServerData : 0x7fa504b0 -> (null)
+0x058 AnsiCodePageData : 0x7fb80000 Void
+0x05c OemCodePageData : 0x7fb80000 Void
+0x060 UnicodeCaseTableData : 0x7fba7c24 Void
+0x064 NumberOfProcessors : 2
+0x068 NtGlobalFlag : 0x70

0:000> dt ntdll!_PEB -y Ansi @$peb
+0x058 AnsiCodePageData : 0x7ffb0000 Void
Using !address:
0:000> !address 7ffb0000
Usage: Other
Base Address: 7ffb0000
End Address: 7ffd3000
Region Size: 00023000 ( 140.000 kB)
State: 00001000 MEM_COMMIT
Protect: 00000002 PAGE_READONLY
Type: 00040000 MEM_MAPPED
Allocation Base: 7ffb0000
Allocation Protect: 00000002 PAGE_READONLY
Additional info: NLS Tables
Content source: 1 (target), length: 23000
Dumping the raw contents:
0:000> dc 7ffb0000
7ffb0000 04e4000d 003f0001 003f003f 0000003f ......?.?.?.?...
7ffb0010 00000000 00000000 01030000 00010000 ................
7ffb0020 00030002 00050004 00070006 00090008 ................
7ffb0030 000b000a 000d000c 000f000e 00110010 ................
7ffb0040 00130012 00150014 00170016 00190018 ................
7ffb0050 001b001a 001d001c 001f001e 00210020 ............ .!.
7ffb0060 00230022 00250024 00270026 00290028 ".#.$.%.&.'.(.).
7ffb0070 002b002a 002d002c 002f002e 00310030 *.+.,.-.../.0.1.
0:000>
Using !vprot:
0:000> !vprot 7ffb0000
BaseAddress: 7ffb0000
AllocationBase: 7ffb0000
AllocationProtect: 00000002 PAGE_READONLY
RegionSize: 00023000
State: 00001000 MEM_COMMIT
Protect: 00000002 PAGE_READONLY
Type: 00040000 MEM_MAPPED
0:000>
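One way to make sense of the first few dwords in the dc output is to decode them as the header of an NLS code page table. The sketch below assumes the commonly described layout of 16-bit fields (header size in words, code page, maximum character size, default characters); the struct and function names are made up for illustration, and the layout should be checked against your Windows version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical view of the first 16-bit fields of an NLS code page table.
 * The layout is an assumption, it is not taken from the question. */
struct nls_header {
    uint16_t header_size;      /* size of the header, in 16-bit words */
    uint16_t code_page;        /* for example 1252 */
    uint16_t max_char_size;    /* 1 = single byte, 2 = double byte */
    uint16_t default_char;     /* replacement character, for example '?' */
    uint16_t uni_default_char; /* Unicode replacement character */
};

/* Split little-endian dwords, as printed by `dc`, into 16-bit fields. */
struct nls_header decode_nls_header(const uint32_t *dwords, size_t count)
{
    struct nls_header h;

    assert(dwords != NULL && count >= 3);
    h.header_size      = (uint16_t)(dwords[0] & 0xffffu);
    h.code_page        = (uint16_t)(dwords[0] >> 16);
    h.max_char_size    = (uint16_t)(dwords[1] & 0xffffu);
    h.default_char     = (uint16_t)(dwords[1] >> 16);
    h.uni_default_char = (uint16_t)(dwords[2] & 0xffffu);
    return h;
}

/* Print the decoded fields for a quick look. */
void print_nls_header(const struct nls_header *h)
{
    assert(h != NULL);
    printf("header: %u words, code page: %u, max char size: %u, default char: 0x%02x\n",
           (unsigned)h->header_size, (unsigned)h->code_page,
           (unsigned)h->max_char_size, (unsigned)h->default_char);
}
```

Fed the first row of the dump above (0x04e4000d, 0x003f0001, 0x003f003f), this gives code page 0x04e4 = 1252 and default character 0x3f ('?'), which is consistent with !address labelling the region as NLS tables.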

Related

How to debug crash in CFS scheduler in linux kernel

Recently, I encountered a kernel oops on an embedded Linux system based on the i.MX6ULL running kernel version 5.10.9. This issue has bothered me for more than a week. It's really painful.
First of all, let me show the kernel oops for your reference.
<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
pgd = f6125cdd
[0000001c] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in: lp(O) lrdmwl_sdio(O) lrdmwl(O) mac80211(O) cfg80211(O) compat(O) g_serial usb_f_serial u_serial rfid(O) industrial(O) applicator(O) libcomposite tm(PO) cutter(O) sensor(O) motor(O) pe(O) ui_leds(O) doorbell(O) psu(O) ui_buttons(O) modules(O)
CPU: 0 PID: 1107 Comm: kworker/0:0 Tainted: P O 5.10.9 #10
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
Workqueue: events sdio_irq_work
PC is at set_next_entity+0x8/0x244
LR is at pick_next_task_fair+0xc0/0x3d0
pc : [<c01496b0>] lr : [<c0149b00>] psr: 20000093
sp : c105bc04 ip : 00000000 fp : c105bc8c
r10: c30de198 r9 : 0000000a r8 : cbfe6bc9
r7 : 00000000 r6 : c0c0df18 r5 : 00000000 r4 : 00000000
r3 : c105a000 r2 : 00000002 r1 : 00000000 r0 : c0c0df18
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c53c7d Table: 81cd4059 DAC: 00000051
Process kworker/0:0 (pid: 1107, stack limit = 0x4d6559d7)
Stack: (0xc105bc04 to 0xc105c000)
bc00: c0c0df00 00000000 c0c0df18 00000000 cbfe6bc9 0000000a c30de198
bc20: c105bc8c c0149b00 c30dde00 c0c0df00 cbfe6bc9 0000000a cbfe6bc9 c07274c8
bc40: 40000093 c057d8e8 c18cf380 c057e284 00000000 00000000 0000353a c0727acc
bc60: c105bd6c 00000000 00000000 c30dde00 ffffe000 c105bdb4 c105bdb8 00000002
bc80: 00000001 00000000 c105bc9c c0727acc 7fffffff ffffe000 10000100 c072b4ac
bca0: c1155210 00000013 c18cf380 0000000b c105bd3c 00000000 ffffe000 00000000
bcc0: 10000100 7fffffff ffffe000 c105bdb4 c105bdb8 00000002 00000001 c0728ebc
bce0: 10000100 c30dde00 c105bdb8 c105bdb8 c18cf000 c105bda4 c105bdb4 00000000
bd00: c19aa800 c056669c 00000100 0000ffff 00000000 00000001 c19aa800 c0571cb4
bd20: c0d09494 c09e2bd8 c09e2bb0 c7f34162 00000900 00000100 81e0b900 00000035
bd40: 10000100 00000000 00000000 00000000 00000000 000001b5 00000000 00000000
bd60: 00000000 c105bd6c c105bda4 3b9aca00 00000000 00000100 00000001 00000000
bd80: 00000000 00000200 00000000 00000000 c105bda4 00000001 00000001 c105bd2c
bda0: 00000001 00000000 c105bd3c c105bd6c 00000000 00000000 c105bce8 c105bce8
bdc0: 00000000 c105bdc4 c105bdc4 c0565bac 00000000 00000000 00000000 00000000
bde0: c30de19c 00000100 00000000 c1155400 c1e0ba00 00000000 00000000 00000000
be00: 00000000 c057328c 00000000 c1e0b900 00000000 00000100 c30dde00 00000000
be20: 00000100 00000001 c1e0a800 c1e0b900 c1dc1da0 c1e0a800 c0c0d4d8 c0573480
be40: c1e0b900 00000100 c105beb4 bf1b705c c0c0df00 c11a4000 00000000 c193a8c0
be60: c105beb4 c18cf000 c19aa800 00000001 c7edc200 00000000 c18cf2d8 bf1b8388
be80: 00000000 c0727bd0 c1dc1da0 00000000 00000000 c1dc2da0 c014e830 00000000
bea0: 00000003 c18cf274 c1dc0ca0 00000004 60000013 00000001 00000000 00000000
bec0: 00000002 c046cc94 00000000 c18cf000 c19aa800 00000001 c7edc200 00000000
bee0: c18cf2d8 00000000 c0c0d4d8 c05736a0 00000100 00000122 c7edc200 c18cf000
bf00: c18cf2d4 c18cf000 c18cf2d4 00000000 c7edc200 00000000 c18cf2d8 c0573cb8
bf20: c18cf2d4 c4d14880 00000000 c0137088 c0cdb0a0 ffffe000 c4d14880 c0c0d4d8
bf40: c4d14894 c0cdb0a0 ffffe000 c0c0d4ec 00000008 c0137354 c0d09426 c09dfd28
bf60: c3085300 c3085300 c3085bc0 c105a000 00000000 c0137310 c4d14880 c10dfed4
bf80: c3085320 c013c7ec 00000000 c3085bc0 c013c6a8 00000000 00000000 00000000
bfa0: 00000000 00000000 00000000 c0100148 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c01496b0>] (set_next_entity) from [<cbfe6bc9>] (0xcbfe6bc9)
Code: c0c0df18 c0c0df48 e92d4ff0 e1a04001 (e591301c)
---[ end trace b08b1ab27e2a6927 ]---
note: kworker/0:0[1107] exited with preempt_count 2
Using gdb and addr2line, I found that the NULL pointer dereference happens at the if statement in set_next_entity(), shown in the following code sample, because the parameter se is a NULL pointer.
set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
/* 'current' is not kept within the tree. */
if (se->on_rq) {
Digging a little deeper, it turns out that the NULL pointer "se" is returned by pick_next_entity(cfs_rq, NULL), which is called by pick_next_task_fair(). The sample code is shown below. So something goes wrong in pick_next_entity(cfs_rq, NULL) just before the call to set_next_entity(cfs_rq, se).
pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
……
do {
se = pick_next_entity(cfs_rq, NULL);
set_next_entity(cfs_rq, se);
cfs_rq = group_cfs_rq(se);
} while (cfs_rq);
I then dumped some members of cfs_rq in pick_next_entity(cfs_rq, NULL); the output is shown below. As you can see, the root node and the leftmost node of the rb tree are both 0/NULL. So I guess the rb tree was corrupted some time before this call to pick_next_entity().
(pick_next_entity:4505)cfs_rq[0xC0C0DF18]
(pick_next_entity:4507)cfs_rq->curr[0x0]
(pick_next_entity:4508)cfs_rq->next[0x0]
(pick_next_entity:4509)cfs_rq->last[0x0]
(pick_next_entity:4510)cfs_rq->skip[0x0]
(pick_next_entity:4511)cfs_rq->nr_running[1]
(pick_next_entity:4512)cfs_rq->h_nr_running[1]
(pick_next_entity:4513)cfs_rq->idle_h_nr_running[0]
(pick_next_entity:4514)cfs_rq->tasks_timeline.rb_leftmost[0x0]
(pick_next_entity:4515)cfs_rq->tasks_timeline.rb_root.rb_node[0x0]
(pick_next_task_fair:7131) picked next entity.
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
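The values printed above are already inconsistent with each other: nr_running is 1 while curr is NULL and the tree is empty. A minimal userspace sketch of that check follows. It assumes that every entity counted in nr_running is either curr or linked into tasks_timeline; the struct is a simplified stand-in for the printed fields, not the kernel's struct cfs_rq, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields printed above (not the kernel struct). */
struct cfs_rq_snapshot {
    const void *curr;         /* cfs_rq->curr */
    const void *rb_leftmost;  /* cfs_rq->tasks_timeline.rb_leftmost */
    const void *rb_root_node; /* cfs_rq->tasks_timeline.rb_root.rb_node */
    unsigned int nr_running;  /* cfs_rq->nr_running */
};

/* Returns true when the snapshot passes both sanity checks below. */
bool cfs_rq_snapshot_consistent(const struct cfs_rq_snapshot *s)
{
    assert(s != NULL);
    bool tree_empty = (s->rb_root_node == NULL);

    /* The cached leftmost pointer must be set exactly when the tree is non-empty. */
    if (tree_empty != (s->rb_leftmost == NULL))
        return false;

    /* Assumption: a runnable entity is either curr or sits in the tree,
     * so a non-zero count with neither present means lost bookkeeping. */
    if (s->nr_running > 0 && s->curr == NULL && tree_empty)
        return false;

    return true;
}
```

With curr = NULL, both tree pointers NULL and nr_running = 1, the check fails. That suggests the entity was lost somewhere in the enqueue/dequeue bookkeeping before pick_next_entity() ran, rather than inside pick_next_entity() itself.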
As I'm not familiar with the fair scheduler or the kernel's rb tree, I have no idea how to detect when the rb tree gets corrupted or what could cause it. But I found a discussion of a similar issue on this URL, which provides the following patch to detect when the rb tree becomes corrupted.
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 1fd61a9af45c..b4b4df3ad0fc 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -130,7 +130,28 @@ struct rb_root_cached {
#define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
/* Same as rb_first(), but O(1) */
-#define rb_first_cached(root) (root)->rb_leftmost
+#define __rb_first_cached(root) (root)->rb_leftmost
+
+#ifndef CONFIG_RBTREE_DEBUG
+# define rb_first_cached(root) __rb_first_cached(root)
+# define rbtree_cached_debug(root) do { } while(0)
+
+#else
+static inline struct rb_node *rb_first_cached(struct rb_root_cached *root)
+{
+ struct rb_node *leftmost = __rb_first_cached(root);
+
+ WARN_ON(leftmost != rb_first(&root->rb_root));
+ return leftmost;
+}
+
+#define rbtree_cached_debug(root) \
+do { \
+ WARN_ON(rb_first(&(root)->rb_root) != __rb_first_cached((root))); \
+ WARN_ON(!RB_EMPTY_ROOT(&(root)->rb_root) && !__rb_first_cached((root))); \
+ WARN_ON(RB_EMPTY_ROOT(&(root)->rb_root) && __rb_first_cached((root))); \
+} while (0)
+#endif /* CONFIG_RBTREE_DEBUG */
static inline void rb_insert_color_cached(struct rb_node *node,
struct rb_root_cached *root,
@@ -139,6 +160,8 @@ static inline void rb_insert_color_cached(struct rb_node *node,
if (leftmost)
root->rb_leftmost = node;
rb_insert_color(node, &root->rb_root);
+
+ rbtree_cached_debug(root);
}
static inline void rb_erase_cached(struct rb_node *node,
@@ -147,6 +170,8 @@ static inline void rb_erase_cached(struct rb_node *node,
if (root->rb_leftmost == node)
root->rb_leftmost = rb_next(node);
rb_erase(node, &root->rb_root);
+
+ rbtree_cached_debug(root);
}
static inline void rb_replace_node_cached(struct rb_node *victim,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2f6fb96405af..62ab9f978bc6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1727,6 +1727,16 @@ config BACKTRACE_SELF_TEST
Say N if you are unsure.
+config RBTREE_DEBUG
+ bool "Red-Black tree sanity tests"
+ depends on DEBUG_KERNEL
+ help
+ This option enables runtime sanity checks on all variants
+ of the rbtree library. Doing so can cause significant overhead,
+ so only enable it in non-production environments.
+
+ Say N if you are unsure.
+
After applying this patch, multiple warning messages appear before the kernel oops; the first one is shown below. Analysis of this message shows that the first corruption of the rb tree is detected in rb_first_cached(), which is called by __dequeue_entity(), which in turn is called by set_next_entity().
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at ./include/linux/rbtree.h:175 set_next_entity+0x1f4/0x2dc
Modules linked in: lp(O) g_serial usb_f_serial u_serial rfid(O) industrial(O) applicator(O) libcomposite tm(PO) cutter(O) sensor(O) motor(O) pe(O) ui_leds(O) doorbell(O) psu(O) ui_buttons(O) modules(O)
CPU: 0 PID: 0 Comm: swapper Tainted: P O 5.10.9 #11
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[<c010bb64>] (unwind_backtrace) from [<c0109ed8>] (show_stack+0x10/0x14)
[<c0109ed8>] (show_stack) from [<c0120ba4>] (__warn+0xe4/0xe8)
[<c0120ba4>] (__warn) from [<c0120c38>] (warn_slowpath_fmt+0x90/0xa0)
[<c0120c38>] (warn_slowpath_fmt) from [<c014987c>] (set_next_entity+0x1f4/0x2dc)
[<c014987c>] (set_next_entity) from [<c0149d40>] (pick_next_task_fair+0xbc/0x3c4)
[<c0149d40>] (pick_next_task_fair) from [<c07286a8>] (__schedule+0x200/0x7b0)
[<c07286a8>] (__schedule) from [<c072905c>] (schedule_idle+0x38/0x7c)
[<c072905c>] (schedule_idle) from [<c01478a8>] (do_idle+0x14c/0x214)
[<c01478a8>] (do_idle) from [<c0147c40>] (cpu_startup_entry+0xc/0x10)
[<c0147c40>] (cpu_startup_entry) from [<c0b00d44>] (start_kernel+0x42c/0x444)
---[ end trace 5b768ae127781c97 ]---
That's all I have at this time. I still have no idea why the rb tree gets corrupted, and I have not been able to find the root cause. Any suggestions on how to debug this kind of issue, or how to collect more information for deeper analysis, will be greatly appreciated. Could it be caused by a defect in a device driver, or might it be a kernel bug?

ESP32 crashes when I call WiFi.mode(WIFI_MODE_APSTA) from within a Interrupt ISR

Summary:
In my ISR I call a function that tries to set the WiFi mode to APSTA on an ESP32, and it crashes with the following dump:
Scanning for Modules...
OK1
Guru Meditation Error: Core 1 panic'ed (Interrupt wdt timeout on CPU1)
Core 1 register dump:
PC : 0x4008b7cc PS : 0x00060d34 A0 : 0x8008a9a7 A1 : 0x3ffbe530
A2 : 0x3ffb595c A3 : 0x3ffb8074 A4 : 0x00000001 A5 : 0x00000001
A6 : 0x00060d23 A7 : 0x00000000 A8 : 0x3ffb8074 A9 : 0x3ffb8074
A10 : 0x00000018 A11 : 0x00000018 A12 : 0x00000001 A13 : 0x00000001
A14 : 0x00060d21 A15 : 0x00000000 SAR : 0x00000020 EXCCAUSE: 0x00000006
EXCVADDR: 0x00000000 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0xffffffff
Core 1 was running in ISR context:
EPC1 : 0x400e176c EPC2 : 0x00000000 EPC3 : 0x00000000 EPC4 : 0x4008b7cc
Backtrace: 0x4008b7cc:0x3ffbe530 0x4008a9a4:0x3ffbe550 0x40088c3f:0x3ffbe570 0x400d504d:0x3ffbe5b0 0x400ed002:0x3ffbe5d0 0x400e1adf:0x3ffbe5f0 0x400d1991:0x3ffbe610 0x40081002:0x3ffbe710 0x400811c5:0x3ffbe790 0x4008125d:0x3ffbe7b0 0x40084a2d:0x3ffbe7d0 0x400d223a:0x3ffb1fb0 0x40088e39:0x3ffb1fd0
Core 0 register dump:
PC : 0x40089e2e PS : 0x00060034 A0 : 0x8008b005 A1 : 0x3ffb5500
A2 : 0x3ffbf300 A3 : 0x0000cdcd A4 : 0xb33fffff A5 : 0x00000001
A6 : 0x00060023 A7 : 0x0000abab A8 : 0x0000abab A9 : 0x3ffb5500
A10 : 0x00000000 A11 : 0x3ffba744 A12 : 0x000002cc A13 : 0x3ffbbc44
A14 : 0x3ffba8e9 A15 : 0x00000000 SAR : 0x00000001 EXCCAUSE: 0x00000006
EXCVADDR: 0x00000000 LBEG : 0x4000c2e0 LEND : 0x4000c2f6 LCOUNT : 0x00000000
Backtrace: 0x40089e2e:0x3ffb5500 0x4008b002:0x3ffb5530 0x40088b9f:0x3ffb5550 0x40088ca9:0x3ffb5590 0x400819ad:0x3ffb55b0 0x400e27c1:0x3ffb55d0 0x400f157d:0x3ffb55f0 0x400e2b3e:0x3ffb5620 0x400e2dc5:0x3ffb5650 0x400ecb31:0x3ffb56a0 0x400e9136:0x3ffb56d0 0x4008f83f:0x3ffb56f0 0x40088e39:0x3ffb5730
Rebooting...
ets Jun 8 2016 00:22:57
Detailed info:
In my project I have 2 buttons, each of which triggers a different ISR (void IRAM_ATTR ScanForModule1() and void IRAM_ATTR ScanForModule2()), and within either ISR I call the same function with a different argument (void IRAM_ATTR ScanForModules(uint8_t moduleNumber)).
This function sets the WiFi mode with WiFi.mode(WIFI_MODE_APSTA), and that makes the ESP32 crash.
I added some tracing "OK"s before and after the WiFi call for debugging, and only the one before is printed.
Any idea that may help me?
Thank you
(non compilable snippet)
// ISR Scan for modules 1 in AP mode
void IRAM_ATTR ScanForModule1()
{
Serial.println("Enrolling Module 1...");
module1_addr = NULL;
ScanForModules(1);
}
// ISR Scan for modules 2 in AP mode
void IRAM_ATTR ScanForModule2()
{
Serial.println("Enrolling Module 2...");
module2_addr = NULL;
ScanForModules(2);
}
// Scan for modules in AP mode
void IRAM_ATTR ScanForModules(uint8_t moduleNumber)
{
detachInterrupt(ENROLL_BUTTON1_PIN);
detachInterrupt(ENROLL_BUTTON2_PIN);
Serial.println("Scanning for Modules...");
// Switch the ESP to AP and start broadcasting
//configDeviceAP();
Serial.println("OK1");
WiFi.mode(WIFI_MODE_APSTA); //Crashes here <---------------------
Serial.println("OK2");
The crash happens because the WiFi mode change takes longer than an interrupt service routine is allowed to run (technically, the watchdog is not reset because the task responsible for resetting it is blocked by the ISR). This is the expected outcome: interrupt handlers are not meant to do any serious work, only to register an event and notify the task responsible for processing it.
In this case you should have a task that simply sits and waits for a button press. When a button is pressed, the ISR does nothing but wake up the processing task and tell it which button was pressed. You can build your own "notification system" with a simple shared bool and busy-waiting, or use FreeRTOS event groups to implement this. I find Mastering the FreeRTOS Real Time Kernel – a Hands On Tutorial Guide to be an excellent resource for beginners.
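The shared-variable variant can be sketched in portable C as follows. This is a simplified model rather than ESP32 code: button_isr(), take_pending_button() and poll_buttons() are hypothetical names. On real hardware the ISR would carry IRAM_ATTR, and poll_buttons() would be called from loop() or a dedicated task, where the slow WiFi call no longer blocks interrupt handling.

```c
#include <assert.h>
#include <stdatomic.h>

/* 0 means "no button pending"; otherwise the number of the pressed button. */
static atomic_int pending_button = 0;

/* What the ISR would do: record the event and return immediately.
 * No serial output, no WiFi calls, nothing that can block. */
void button_isr(int button_number)
{
    assert(button_number > 0);
    atomic_store(&pending_button, button_number);
}

/* Called from normal task context. Returns the pending button and clears
 * it, or returns 0 if nothing was pressed since the last call. */
int take_pending_button(void)
{
    return atomic_exchange(&pending_button, 0);
}

/* Hypothetical loop body: the slow work runs here, outside the ISR.
 * scan_for_modules stands in for the function that changes the WiFi mode. */
int poll_buttons(void (*scan_for_modules)(int))
{
    int button = take_pending_button();

    if (button != 0 && scan_for_modules != 0)
        scan_for_modules(button);
    return button;
}
```

A single pending slot means a second press that arrives before the first one is processed overwrites it; a FreeRTOS queue or event group avoids that and also removes the busy-waiting.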

objcopy: bloated binary output file

While working on my bootloader project, I noticed that the resulting binary file is larger than the sum of the sizes of each section in the original ELF file.
After linking the bootloader image via ld, the ELF file is structured as follows:
$ readelf -S elfboot.elf
There are 10 section headers, starting at offset 0xa708:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00007e00 000e00 004bc7 00 AX 0 0 16
[ 2] .rodata PROGBITS 0000c9d0 0059d0 000638 00 A 0 0 32
[ 3] .initcalls PROGBITS 0000ec20 007c20 00000c 00 WA 0 0 4
[ 4] .exitcalls PROGBITS 0000ec30 007c30 00000c 00 WA 0 0 4
[ 5] .data PROGBITS 0000ec40 007c40 0001e0 00 WA 0 0 32
[ 6] .bss NOBITS 0000ee20 007e20 00025c 00 WA 0 0 32
[ 7] .symtab SYMTAB 00000000 007e20 0016e0 10 8 166 4
[ 8] .strtab STRTAB 00000000 009500 0011bb 00 0 0 1
[ 9] .shstrtab STRTAB 00000000 00a6bb 00004a 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
p (processor specific)
I noticed that there is a rather big gap (~7KB) between the sections .rodata and .initcalls. I also checked out the content of each section:
$ objdump -s elfboot.elf
[...]
Contents of section .rodata:
[...]
cfc0 23cf0000 80000000 2fcf0000 00010000 #......./.......
cfd0 3bcf0000 00020000 47cf0000 00040000 ;.......G.......
cfe0 54cf0000 00080000 61cf0000 00100000 T.......a.......
cff0 6ecf0000 00200000 7bcf0000 00400000 n.... ..{....@..
d000 89cf0000 00800000 ........
Contents of section .initcalls:
ec20 02b50000 6db50000 feac0000 ....m.......
Contents of section .exitcalls:
[...]
My linker script aligns every section on a 16-byte (0x10) boundary. After running
objcopy -O binary elfboot.elf elfboot.bin
the resulting binary contains a huge chunk of zero filled bytes at offset 0x5200 (0xd000 - 0x7e00) all the way to offset 0x6e20 (0xec20 - 0x7e00). Now that I know where the zero filled bytes come from, how do I remove them? For reference I added the verbose output of the linker step including the linker script used:
i686-elfboot-gcc -o elfboot.elf -T elfboot.ld ./src/arch/x86/bootstrap.o ./src/arch/x86/a20.o ./src/arch/x86/bda.o ./src/arch/x86/bios.o ./src/arch/x86/copy.o ./src/arch/x86/e820.o ./src/arch/x86/entry.o ./src/arch/x86/idt.o ./src/arch/x86/realmode_jmp.o ./src/arch/x86/setup.o ./src/arch/x86/opmode.o ./src/arch/x86/pic.o ./src/arch/x86/ptrace.o ./src/arch/x86/video.o ./src/core/bdev.o ./src/core/cdev.o ./src/core/elf.o ./src/core/input.o ./src/core/interrupt.o ./src/core/loader.o ./src/core/main.o ./src/core/module.o ./src/core/pci.o ./src/core/printf.o ./src/core/string.o ./src/core/symbol.o ./src/crypto/crc32.o ./src/drivers/ide/ide.o ./src/fs/fs.o ./src/fs/file.o ./src/fs/super.o ./src/fs/ramfs/ramfs.o ./src/fs/isofs/isofs.o ./src/lib/ata/libata.o ./src/lib/tmg/libtmg.o ./src/mm/memblock.o ./src/mm/page_alloc.o ./src/mm/slub.o ./src/mm/util.o -O2 -Wl,--verbose -nostdlib -lgcc
GNU ld (GNU Binutils) 2.31.1
Supported emulations:
elf_i386
elf_iamcu
opened script file elfboot.ld
using external linker script:
==================================================
OUTPUT_FORMAT(elf32-i386)
ENTRY(_arch_start)
SECTIONS
{
/*
* The boot stack starts at 0x6000 and grows towards lower addresses.
* The stack is designed to be only one page in size, which should be
* sufficient for the bootstrap stage.
*/
. = 0x6000;
__stack_start = .;
/*
* Buffer for reading files from the boot device. Usually boot devices
* are designed to read data in chunks of 0x200 (512) or 0x800 (2048)
* bytes. The buffer is large enough to read 4 512 or 2 2048 chunks of
* data from the disk.
*/
__buffer_start = .;
. = 0x7000;
__buffer_end = .;
/*
* Actual bootstrap stage code and data.
*/
. = 0x7E00;
__bootstrap_start = .;
.text ALIGN(0x10) : {
__text_start = .;
*(.text*)
__text_end = .;
}
.rodata ALIGN(0x10) : {
__rodata_start = .;
*(.rodata*)
__rodata_end = .;
}
/*
* This section is dedicated to all built-in modules. If a module will
* be included in the elfboot binary, the module_init function pointer
* of that module is placed in this section.
*
* Make sure to initialize built-in filesystems first as we need it to
* setup the root file system node.
*/
.initcalls ALIGN(0x10) : {
__initcalls_vfs_start = .;
/* Built-in file systems */
*(.initcalls_vfs*)
__initcalls_vfs_end = .;
__initcalls_dev_start = .;
/* Built-in devices */
*(.initcalls_dev*)
__initcalls_dev_end = .;
__initcalls_start = .;
/* Built-in modules */
*(.initcalls*)
__initcalls_end = .;
}
.exitcalls ALIGN(0x10) : {
__exitcalls_start = .;
*(.exitcalls*)
__exitcalls_end = .;
}
.data ALIGN(0x10) : {
__data_start = .;
*(.data*)
__data_end = .;
}
.bss ALIGN(0x10) : {
__bss_start = .;
*(.bss*)
__bss_end = .;
}
__bootstrap_end = .;
}
==================================================
attempt to open ./src/arch/x86/bootstrap.o succeeded
./src/arch/x86/bootstrap.o
attempt to open ./src/arch/x86/a20.o succeeded
./src/arch/x86/a20.o
attempt to open ./src/arch/x86/bda.o succeeded
./src/arch/x86/bda.o
attempt to open ./src/arch/x86/bios.o succeeded
./src/arch/x86/bios.o
attempt to open ./src/arch/x86/copy.o succeeded
./src/arch/x86/copy.o
attempt to open ./src/arch/x86/e820.o succeeded
./src/arch/x86/e820.o
attempt to open ./src/arch/x86/entry.o succeeded
./src/arch/x86/entry.o
attempt to open ./src/arch/x86/idt.o succeeded
./src/arch/x86/idt.o
attempt to open ./src/arch/x86/realmode_jmp.o succeeded
./src/arch/x86/realmode_jmp.o
attempt to open ./src/arch/x86/setup.o succeeded
./src/arch/x86/setup.o
attempt to open ./src/arch/x86/opmode.o succeeded
./src/arch/x86/opmode.o
attempt to open ./src/arch/x86/pic.o succeeded
./src/arch/x86/pic.o
attempt to open ./src/arch/x86/ptrace.o succeeded
./src/arch/x86/ptrace.o
attempt to open ./src/arch/x86/video.o succeeded
./src/arch/x86/video.o
attempt to open ./src/core/bdev.o succeeded
./src/core/bdev.o
attempt to open ./src/core/cdev.o succeeded
./src/core/cdev.o
attempt to open ./src/core/elf.o succeeded
./src/core/elf.o
attempt to open ./src/core/input.o succeeded
./src/core/input.o
attempt to open ./src/core/interrupt.o succeeded
./src/core/interrupt.o
attempt to open ./src/core/loader.o succeeded
./src/core/loader.o
attempt to open ./src/core/main.o succeeded
./src/core/main.o
attempt to open ./src/core/module.o succeeded
./src/core/module.o
attempt to open ./src/core/pci.o succeeded
./src/core/pci.o
attempt to open ./src/core/printf.o succeeded
./src/core/printf.o
attempt to open ./src/core/string.o succeeded
./src/core/string.o
attempt to open ./src/core/symbol.o succeeded
./src/core/symbol.o
attempt to open ./src/crypto/crc32.o succeeded
./src/crypto/crc32.o
attempt to open ./src/drivers/ide/ide.o succeeded
./src/drivers/ide/ide.o
attempt to open ./src/fs/fs.o succeeded
./src/fs/fs.o
attempt to open ./src/fs/file.o succeeded
./src/fs/file.o
attempt to open ./src/fs/super.o succeeded
./src/fs/super.o
attempt to open ./src/fs/ramfs/ramfs.o succeeded
./src/fs/ramfs/ramfs.o
attempt to open ./src/fs/isofs/isofs.o succeeded
./src/fs/isofs/isofs.o
attempt to open ./src/lib/ata/libata.o succeeded
./src/lib/ata/libata.o
attempt to open ./src/lib/tmg/libtmg.o succeeded
./src/lib/tmg/libtmg.o
attempt to open ./src/mm/memblock.o succeeded
./src/mm/memblock.o
attempt to open ./src/mm/page_alloc.o succeeded
./src/mm/page_alloc.o
attempt to open ./src/mm/slub.o succeeded
./src/mm/slub.o
attempt to open ./src/mm/util.o succeeded
./src/mm/util.o
attempt to open /home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.so failed
attempt to open /home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a succeeded
(/home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a)_umoddi3.o
(/home/croemheld/Repositories/elfboot/elfboot-toolchain/lib/gcc/i686-elfboot/8.2.0/libgcc.a)_udivmoddi4.o
Please note that I am not able to load the sections individually into memory, since my first-stage bootloader (boot sector) simply loads the entire file at a specified offset from the boot device. To reduce the size of the second-stage bootloader image, I want to remove the gap at link time or, if that is possible at all, after linking (objcopy, ...).
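The offsets quoted earlier follow directly from the section table: objcopy -O binary lays the loadable sections out by address, starting at the lowest one (0x7e00 here), and fills the space between them. A small sketch of that arithmetic, with hypothetical helper names and the addresses and sizes taken from the readelf output above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One loadable section: start address and size, as listed by readelf -S. */
struct section {
    const char *name;
    uint32_t addr;
    uint32_t size;
};

/* Offset of a section inside the flat binary (address minus image base). */
uint32_t flat_offset(const struct section *s, uint32_t base)
{
    assert(s != NULL && s->addr >= base);
    return s->addr - base;
}

/* Padding bytes between the end of `prev` and the start of `next`. */
uint32_t gap_between(const struct section *prev, const struct section *next)
{
    assert(prev != NULL && next != NULL);
    uint32_t prev_end = prev->addr + prev->size;

    assert(next->addr >= prev_end);
    return next->addr - prev_end;
}

/* Sum of all padding in a table that is sorted by address. */
uint32_t total_padding(const struct section *table, size_t count)
{
    uint32_t total = 0;

    assert(table != NULL);
    for (size_t i = 1; i < count; i++)
        total += gap_between(&table[i - 1], &table[i]);
    return total;
}
```

.rodata (0xc9d0, size 0x638) ends at 0xd008 and .initcalls starts at 0xec20, so gap_between() gives 0x1c18 (7192) bytes, the roughly 7 KB of zeros seen in the binary. flat_offset() for .initcalls with base 0x7e00 is 0x6e20, matching the offset observed in elfboot.bin.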

Retrieving IOCTL Input Buffer Content From Crash Dump + Windbg[BSOD]

We know that user-mode applications can pass an IOCTL code and a data buffer to kernel device drivers by calling the DeviceIoControl() API.
BOOL WINAPI DeviceIoControl(
_In_ HANDLE hDevice,
_In_ DWORD dwIoControlCode, <--Control Code
_In_opt_ LPVOID lpInBuffer, <- Input buffer pointer
_In_ DWORD nInBufferSize, <- Input buffer size
_Out_opt_ LPVOID lpOutBuffer,
_In_ DWORD nOutBufferSize,
_Out_opt_ LPDWORD lpBytesReturned,
_Inout_opt_ LPOVERLAPPED lpOverlapped
);
I have a situation where a user-mode application sometimes passes an IOCTL buffer to a kernel driver, and this causes a BSOD again and again. Every time, I get a kernel memory dump for the BSOD.
So my question is: is it possible to find the exact malformed input buffer and IOCTL code that cause the BSOD from the kernel memory dump, so that I can reproduce the BSOD with a simple C program?
As you can see from the stack trace, it crashes just after the NtDeviceIoControlFile call.
kd> kb
ChildEBP RetAddr Args to Child
b8048798 805246fb 00000050 ffff0000 00000001 nt!KeBugCheckEx+0x1b
b80487e4 804e1ff1 00000001 ffff0000 00000000 nt!MmAccessFault+0x6f5
b80487e4 804ed0db 00000001 ffff0000 00000000 nt!KiTrap0E+0xcc
b80488b4 804ed15a 88e23a38 b8048900 b80488f4 nt!IopCompleteRequest+0x92
b8048904 806f2c0a 00000000 00000000 b804891c nt!KiDeliverApc+0xb3
b8048904 806ed0b3 00000000 00000000 b804891c hal!HalpApcInterrupt2ndEntry+0x31
b8048990 804e59ec 88e23a38 88e239f8 00000000 hal!KfLowerIrql+0x43
b80489b0 804ed174 88e23a38 896864c8 00000000 nt!KeInsertQueueApc+0x4b
b80489e4 f7432123 8960e9d8 8980b300 00000000 nt!IopfCompleteRequest+0x1d8
WARNING: Stack unwind information not available. Following frames may be wrong.
b80489f8 804e3d77 0000001c 0000001c 806ed070 NinjaDriver+0x1123
b8048a08 8056a9ab 88e23a8c 896864c8 88e239f8 nt!IopfCallDriver+0x31
b8048a1c 8057d9f7 89817030 88e239f8 896864c8 nt!IopSynchronousServiceTail+0x60
b8048ac4 8057fbfa 00000090 00000000 00000000 nt!IopXxxControlFile+0x611
b8048af8 b6e6a06f 00000090 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
b8048b8c b6e6a5c3 00000001 00000090 00000000 Ninja+0x506f
b8048c80 b6e6ab9b 00000001 88da9898 00000090 Ninja+0x55c3
b8048d34 804df06b 00000090 00000000 00000000 Ninja+0x5b9b
b8048d34 7c90ebab 00000090 00000000 00000000 nt!KiFastCallEntry+0xf8
00f8fd7c 00000000 00000000 00000000 00000000 0x7c90ebab
Thanks in Advance,
You would need the function signature for nt!NtDeviceIoControlFile. With that info unassemble backwards from nt!NtDeviceIoControlFile's return address with ub b6e6a06f. This will show you how Ninja sets up the arguments for its call to nt!NtDeviceIoControlFile. Find the args that correspond to the ioctl code and buffer and then dump their contents.
Note that registers will have been reused so you may need to dig back further in the disassembly to get the correct values from the non-volatile registers which will have been saved on the stack before the function call.
In the windbg help file (debugger.chm) there is a very useful page titled "x86 Architecture". In this case, you may want to read the sections titled "Registers" and "Calling Conventions".
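For reference, the documented prototype of NtDeviceIoControlFile takes ten parameters in this order: FileHandle, Event, ApcRoutine, ApcContext, IoStatusBlock, IoControlCode, InputBuffer, InputBufferLength, OutputBuffer, OutputBufferLength. The sketch below shows the arithmetic for locating them. It assumes a standard 32-bit stdcall frame (saved EBP at [ebp], return address at [ebp+4], first argument at [ebp+8]) and that the ChildEBP shown by kb for the nt!NtDeviceIoControlFile frame is reliable; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-based positions of the parameters of interest in the prototype. */
enum {
    ARG_IO_CONTROL_CODE     = 5,
    ARG_INPUT_BUFFER        = 6,
    ARG_INPUT_BUFFER_LENGTH = 7
};

/* Address of the index-th stack argument in an x86 stdcall frame. */
uint32_t stack_arg_address(uint32_t child_ebp, unsigned index)
{
    assert(index < 10); /* NtDeviceIoControlFile has ten parameters */
    return child_ebp + 8u + 4u * index;
}

/* Fields packed into an IOCTL code by the CTL_CODE macro. */
struct ioctl_fields {
    uint32_t device_type;
    uint32_t access;
    uint32_t function;
    uint32_t method;
};

/* Reverse of CTL_CODE(DeviceType, Function, Method, Access). */
struct ioctl_fields decode_ioctl(uint32_t code)
{
    struct ioctl_fields f;

    f.device_type = (code >> 16) & 0xffffu;
    f.access      = (code >> 14) & 0x3u;
    f.function    = (code >> 2) & 0xfffu;
    f.method      = code & 0x3u;
    return f;
}
```

For the frame with ChildEBP b8048af8 this puts IoControlCode at b8048b14, InputBuffer at b8048b18 and InputBufferLength at b8048b1c, so dumping ten dwords from b8048b00 covers all the arguments. Keep in mind that a kernel memory dump does not contain user-mode pages, so a user-mode input buffer may not be readable even when its pointer and length are.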

Linux Kernel programming: trying to get vm_area_struct->vm_start crashes kernel

This is for a school assignment in which I need to determine the size of the processes on the system using a system call. My code is as follows:
...
struct task_struct *p;
struct vm_area_struct *v;
struct mm_struct *m;
read_lock(&tasklist_lock);
for_each_process(p) {
printk("%ld\n", p->pid);
m = p->mm;
v = m->mmap;
long start = v->vm_start;
printk("vm_start is %ld\n", start);
}
read_unlock(&tasklist_lock);
...
When I run a user level program that calls this system call, the output that I get is:
1
vm_start is 134512640
2
EIP: 0073:[<0806e352>] CPU: 0 Not tainted ESP: 007b:0f7ecf04 EFLAGS: 00010246
Not tainted
EAX: 00000000 EBX: 0fc587c0 ECX: 081fbb58 EDX: 00000000
ESI: bf88efe0 EDI: 0f482284 EBP: 0f7ecf10 DS: 007b ES: 007b
081f9bc0: [<08069ae8>] show_regs+0xb4/0xb9
081f9bec: [<080587ac>] segv+0x225/0x23d
081f9c8c: [<08058582>] segv_handler+0x4f/0x54
081f9cac: [<08067453>] sig_handler_common_skas+0xb7/0xd4
081f9cd4: [<08064748>] sig_handler+0x34/0x44
081f9cec: [<080648b5>] handle_signal+0x4c/0x7a
081f9d0c: [<08066227>] hard_handler+0xf/0x14
081f9d1c: [<00776420>] 0x776420
Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x806e352
EIP: 0073:[<400ea0f2>] CPU: 0 Not tainted ESP: 007b:bf88ef9c EFLAGS: 00000246
Not tainted
EAX: ffffffda EBX: 00000000 ECX: bf88efc8 EDX: 080483c8
ESI: 00000000 EDI: bf88efe0 EBP: bf88f038 DS: 007b ES: 007b
081f9b28: [<08069ae8>] show_regs+0xb4/0xb9
081f9b54: [<08058a1a>] panic_exit+0x25/0x3f
081f9b68: [<08084f54>] notifier_call_chain+0x21/0x46
081f9b88: [<08084fef>] __atomic_notifier_call_chain+0x17/0x19
081f9ba4: [<08085006>] atomic_notifier_call_chain+0x15/0x17
081f9bc0: [<0807039a>] panic+0x52/0xd8
081f9be0: [<080587ba>] segv+0x233/0x23d
081f9c8c: [<08058582>] segv_handler+0x4f/0x54
081f9cac: [<08067453>] sig_handler_common_skas+0xb7/0xd4
081f9cd4: [<08064748>] sig_handler+0x34/0x44
081f9cec: [<080648b5>] handle_signal+0x4c/0x7a
081f9d0c: [<08066227>] hard_handler+0xf/0x14
081f9d1c: [<00776420>] 0x776420
The first process (pid = 1) gave me the vm_start without any problems, but when I try to access the second process, the kernel crashes. Can anyone tell me what's wrong, and maybe how to fix it as well? Thanks a lot!
(sorry for the bad formatting....)
edit: This is done in a Fedora 2.6 core in an uml environment.
Some kernel threads might not have mm filled - check p->mm for NULL.
Changed the code to check for null pointers:
m = p->mm;
if (m != 0) {
v = m->mmap;
if (v != 0) {
long start = v->vm_start;
printk("vm_start is %ld\n", start);
}
}
All process-related information is available from user space through the /proc filesystem. Inside the kernel, this information is generated by the code in fs/proc/*.c:
http://lxr.linux.no/linux+v3.2.4/fs/proc/
Looking at task_mmu.c, which prints all the vm_start information, you can see that every access to the vm_start field requires mmap_sem to be held:
down_read(&mm->mmap_sem);
for (vma = mm->mmap; vma; vma = vma->vm_next) {
clear_refs_walk.private = vma;
...
walk_page_range(vma->vm_start, vma->vm_end,
&clear_refs_walk);
For kernel threads mm will be NULL, so check the pointer before taking the lock. Whenever you read the mm, do it in the following manner:
mm = p->mm;
if (mm) {
down_read(&mm->mmap_sem);
/* read the contents of mm */
up_read(&mm->mmap_sem);
}
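You can also use get_task_mm(). It takes the task lock internally, so you do not need to lock the task yourself; it returns NULL for kernel threads and otherwise returns the mm with its reference count raised, so release it with mmput() when you are done. You still need mmap_sem while walking the VMA list. Here is how you use it:
struct mm_struct *mm;
mm = get_task_mm(p);
if (mm) {
/* read the mm contents */
mmput(mm); /* drop the reference taken by get_task_mm() */
}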
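As a cross-check from user space, the /proc route mentioned above exposes the same start address as the first field of each line in /proc/<pid>/maps. A small sketch that parses such a line; parse_maps_line() and first_vm_start() are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "start-end perms ..." from one line of /proc/<pid>/maps.
 * Returns 0 on success, -1 if the line does not match. */
int parse_maps_line(const char *line, unsigned long *start, unsigned long *end)
{
    assert(line != NULL && start != NULL && end != NULL);
    return sscanf(line, "%lx-%lx", start, end) == 2 ? 0 : -1;
}

/* Read the start address of the first mapping of a process.
 * Returns 0 on success, -1 if the file cannot be read or is empty. */
int first_vm_start(int pid, unsigned long *start)
{
    char path[64];
    char line[512];
    unsigned long end;
    int rc = -1;
    FILE *f;

    assert(start != NULL);
    snprintf(path, sizeof path, "/proc/%d/maps", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1; /* no such process, or no permission */
    if (fgets(line, sizeof line, f) != NULL)
        rc = parse_maps_line(line, start, &end);
    fclose(f);
    return rc;
}
```

A line beginning with 08048000-08056000 parses to a start of 0x08048000, which is the 134512640 printed for pid 1 in the question. For kernel threads the maps file is empty, so first_vm_start() returns -1, mirroring the NULL mm seen inside the kernel.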