[[TOC(Debugging/HeapProtectionError, depth=2)]] = Debugging a Heap Protection Error = The RTEMS Heap Allocator has a heap protector to help track find corruptions. The most common type of heap corruption errors are using a pointer that has been returned to the heap or writing to memory outside the block allocated. These errors are difficult to track down and you need hardware watchpoint support to find them quickly. Hardware watchpoint support is something GDB can provide if your hardware is capable of providing it. Processors such as ARM, i386, and Coldfire have such support. A hardware watchpoint configures special hardware support in the processor to monitor a location for a write access and if it happens a breakpoint it is triggered and executable it stopped at the specific line of code. To find a heap corruption you need to locate the address of the heap block header and the block marker being checked and then set a hardware watchpoint. This example is using the PC BSP (pc686) and the build-in GDB stub which has hardware watchpoint support. I have a bug in the LibBSD `rc.conf` support I need to track down. The `rc.conf` uses Newlib's regex support and code that runs before the Newlib code is called has corrupted the heap. The Newlib call just happens to make a heap allocation request that detects the corruption. I always have a breakpoint set to `bsp_fatal_extension`. If something like a heap corruption is detected execution stops when that function is called. A backtrace is: {{{ (gdb) bt #0 bsp_fatal_extension (source=INTERNAL_ERROR_CORE, is_internal=false, code=3735928559) at /opt/work/chris/rtems/kernel/rtems.master/c/src/lib/libbsp/i386/pc386/../../shared/bspclean.c:19 #1 0x001b4100 in _User_extensions_Iterate (arg=arg@entry=0x2a4c94, visitor=0x1b4050 <_User_extensions_Fatal_visitor>, direction=, direction@entry=CHAIN_ITERATOR_BACKWARD) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/userextiterate.c:171 #2 0x001ae842 in _User_extensions_Fatal (error=3735928559, is_internal=false, source=INTERNAL_ERROR_CORE) at ../../cpukit/../../../pc686/lib/include/rtems/score/userextimpl.h:298 #3 _Terminate (the_source=the_source@entry=INTERNAL_ERROR_CORE, is_internal=is_internal@entry=false, the_error=the_error@entry=3735928559) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/interr.c:44 #4 0x001ad99e in _Heap_Protection_block_error_default (heap=0x2154a0 <_Workspace_Area>, block=0x2a2510) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heap.c:160 #5 0x001ae3cd in _Heap_Protection_block_error (block=0x2a2510, heap=0x2154a0 <_Workspace_Area>) at ../../cpukit/../../../pc686/lib/include/rtems/score/heapimpl.h:336 #6 _Heap_Protection_determine_block_free (block=0x2a2510, heap=0x2154a0 <_Workspace_Area>) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapfree.c:92 #7 _Heap_Free (heap=heap@entry=0x2154a0 <_Workspace_Area>, alloc_begin_ptr=0x2a2534) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapfree.c:141 #8 0x001ae02a in _Heap_Protection_free_delayed_blocks (alloc_begin=0, heap=0x2154a0 <_Workspace_Area>) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapallocate.c:52 #9 _Heap_Allocate_aligned_with_boundary (heap=, heap@entry=0x2154a0 <_Workspace_Area>, alloc_size=, alloc_size@entry=4294967255, alignment=alignment@entry=0, boundary=boundary@entry=0) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapallocate.c:253 #10 0x001ae6da in _Heap_Allocate (size=4294967255, heap=0x2154a0 <_Workspace_Area>) at ../../cpukit/../../../pc686/lib/include/rtems/score/heapimpl.h:137 #11 _Heap_Protection_free_all_delayed_blocks (heap=0x2154a0 <_Workspace_Area>) at ../../cpukit/../../../pc686/lib/include/rtems/score/heapimpl.h:345 #12 _Heap_Resize_block (heap=heap@entry=0x2154a0 <_Workspace_Area>, alloc_begin_ptr=alloc_begin_ptr@entry=0x2a576c, new_alloc_size=new_alloc_size@entry=72, old_size=old_size@entry=0x2a4dd8, new_size=new_size@entry=0x2a4ddc) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapresizeblock.c:108 #13 0x001a5b52 in realloc (ptr=0x2a576c, size=72) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/libcsupport/src/realloc.c:71 #14 0x001bb30e in stripsnug (g=0x2a55e8, g=0x2a55e8, p=0x2a4e30) at ../../../../../../gcc-6-20160609/newlib/libc/posix/regcomp.c:1668 #15 regcomp (preg=preg@entry=0x2a4ee0, pattern=0x2a5554 "test_regex_.*", cflags=cflags@entry=1) at ../../../../../../gcc-6-20160609/newlib/libc/posix/regcomp.c:271 #16 0x00139d9b in rtems_bsd_rc_conf_find_next (rc_conf=0x22e354, argc_argv=0x2a5520) at ../../rtemsbsd/rtems/rtems-bsd-rc-conf.c:395 #17 0x0010026f in test_service (rc_conf=0x22e354) at ../../testsuite/rcconf01/test_main.c:71 #18 0x00139ae9 in rc_conf_worker (task_argument=2286420) at ../../rtemsbsd/rtems/rtems-bsd-rc-conf.c:720 #19 0x001b6249 in _Thread_Handler () at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/threadhandler.c:93 #20 0xa5a5a5a5 in ?? () }}} It is easy to see from the backtrace the heap has detected a corruption. Move up the stack frame to find the corruption check using the GDB `up` command. In this case it is frame `#6`: {{{ (gdb) up #6 _Heap_Protection_determine_block_free (block=0x2a2510, heap=0x2154a0 <_Workspace_Area>) at /opt/work/chris/rtems/kernel/rtems.master/c/src/../../cpukit/score/src/heapfree.c:92 92 _Heap_Protection_block_error( heap, block ); }}} The code for `_Heap_Protection_determine_block_free` is: {{{ (gdb) l _Heap_Protection_determine_block_free 79 Heap_Control *heap, 80 Heap_Block *block 81 ) 82 { 83 bool do_free = true; 84 Heap_Block *const next = block->Protection_begin.next_delayed_free_block; 85 86 if ( next == NULL ) { 87 _Heap_Protection_delay_block_free( heap, block ); 88 do_free = false; (gdb) l 89 } else if ( next == HEAP_PROTECTION_OBOLUS ) { 90 _Heap_Protection_check_free_block( heap, block ); 91 } else { 92 _Heap_Protection_block_error( heap, block ); 93 } 94 95 return do_free; 96 } 97 #endif 98 }}} The heap is made up of blocks of type `Heap_Block` and a corruption has been detected. We need to look carefully because the compiler can optimize paths. Lets have a look at the value for `next`: {{{ (gdb) p next $4 = (gdb) p /x block->Protection_begin.next_delayed_free_block $5 = 0x1 }}} We cannot see it directly because it is not available however we can see it's value in the `block`. It is `1`. A quick search of the RTEMS source code find the value of `HEAP_PROTECTION_OBOLUS` as being `1`: {{{ $ grep -r HEAP_PROTECTION_OBOLUS cpukit/score/ cpukit/score/include/rtems/score/heap.h: #define HEAP_PROTECTION_OBOLUS ((Heap_Block *) 1) cpukit/score/src/heapallocate.c: HEAP_PROTECTION_OBOLUS; cpukit/score/src/heapfree.c: } else if ( next == HEAP_PROTECTION_OBOLUS ) { }}} This means the execution path is `_Heap_Protection_check_free_block` and a check of that code shows it to contain a call to `_Heap_Protection_block_error1: {{{ static void _Heap_Protection_check_free_block( Heap_Control *heap, Heap_Block *block ) { uintptr_t *const pattern_begin = (uintptr_t *) _Heap_Alloc_area_of_block( block ); uintptr_t *const pattern_end = (uintptr_t *) ((uintptr_t) block + _Heap_Block_size( block ) + HEAP_ALLOC_BONUS); uintptr_t *current = NULL; for ( current = pattern_begin; current != pattern_end; ++current ) { if ( *current != HEAP_FREE_PATTERN ) { _Heap_Protection_block_error( heap, block ); break; } } } }}} on the i386 things become complicated because the compiler has inlined all the calls in `_Heap_Free` and this makes it difficult to isolate the specific line of code. Setting a break on the call to `_Heap_Protection_block_error` in `_Heap_Protection_check_free_block` does not end up at the correct instruction. using `objdump` I inspected the code in `_Heap_Free` and found the specific instruction: {{{ $ /opt/work/rtems/4.12/bin/i386-rtems4.12-objdump -d --source build/i386-rtems4.12-pc686/rcconf01.exe | less }}} Using `less` I searched for `<_Heap_Free>`. To search hit `/` then enter `<_Heap_Free>` and repeat the search with just `/` until you see: {{{ 001ae320 <_Heap_Free>: }}} The instruction location will be different. Scroll down looking for the source code for `if ( *current != HEAP_FREE_PATTERN )`: {{{ if ( *current != HEAP_FREE_PATTERN ) { 1ae4ba: 81 3b df 3c 09 e7 cmpl $0xe7093cdf,(%ebx) 1ae4c0: 0f 85 f6 fe ff ff jne 1ae3bc <_Heap_Free+0x9c> 1ae4c6: 89 4c 24 1c mov %ecx,0x1c(%esp) 1ae4ca: 89 da mov %ebx,%edx 1ae4cc: 8b 4c 24 18 mov 0x18(%esp),%ecx 1ae4d0: eb 1a jmp 1ae4ec <_Heap_Free+0x1cc> 1ae4d2: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi 1ae4d9: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi 1ae4e0: 81 3a df 3c 09 e7 cmpl $0xe7093cdf,(%edx) 1ae4e6: 0f 85 d0 fe ff ff jne 1ae3bc <_Heap_Free+0x9c> }}} The place to set the breakpoint is `1ae4e6` which happens to the location the call frame ends up at. I set a breakpoint on that instruction: {{{ (gdb) b *0x1ae4e6 (gdb) c }}} The debugger continues then hits the breakpoint and I was able to see the `current` pointer: {{{ (gdb) p current $3 = (uintptr_t *) 0x216788 }}} I now have the location that is detected as corrupt I can set the watchpoint: {{{ (gdb) awatch *0x216788 }}} It pays to set to set the watchpoint as close to the access that corrupts it. This may take a few runs to find. The watchpoint can be triggered by valid accesses to the location. You need to skip these.