
Version 19 (modified by Gedare, on Apr 18, 2013 at 7:44:20 PM)

Debugging

Symbolic Debug Information

RTEMS and the RTEMS tool chains support symbolic debugging. The application, RTEMS, and any libraries you wish to debug need to be compiled with the gcc option -g. This option causes the compiler to emit debugging information; the generated code should not change. The GNU gcc debugging options are documented here -

http://gcc.gnu.org/onlinedocs/gcc-3.4.3/gcc/Debugging-Options.html#Debugging-Options

Using the -g option results in larger object and executable files, and for C++ the increase can be substantial. For example, an M68K C++ application can have an ELF executable file size of 19M bytes while the code size is only 1.6M bytes. The actual memory footprint can be seen by running the size tool on an object file or the final executable -

$ m68k-rtems-size myfile.o

For RTEMS you typically do not need to strip the executable. Loading the executable into target memory, or converting it to S-records, Intel Hex, or binary format, automatically strips the debug information. Keeping the executable with the debugging information is recommended. If a problem appears with code in the field and you receive some sort of dump or trace, you can use the objdump tool to help locate the problem -

$ m68k-rtems-objdump -D --line-numbers --source --demangle  myapp.elf | less

Hardware Assisted Debugging

Embedded processors these days provide hardware assisted debugging. Typically the processor provides an interface that allows an external device to take control of the processor. In the past, hardware assisted debugging required an emulator. Emulators were expensive, often difficult and fragile to connect to the target hardware, and often so few in number on a project that debugging became a time-share operation. Any newer or faster processor usually required a new emulator. Today's microprocessors implement a range of emulator functions in the processor itself, allowing every target to be used for hardware assisted debugging.

Different microprocessors have different ways of implementing hardware assisted debugging.

  • Freescale Coldfire and M683xx processors use Background Debug Mode (BDM).
  • ARM uses JTAG.
  • PowerPC MPC5xx and MPC8xx use BDM, while the rest of the PowerPC family uses various kinds of JTAG interfaces.

Coldfire and M683xx BDM

The BDM interface is a synchronous serial bus interface. The physical interface (connector) varies between the M683xx and Coldfire processors, yet the way BDM works is similar. BDM is supported by an open source project which you can find at http://bdm.sourceforge.net/. The BDM project's software uses a low cost pod that connects your PC to your target hardware. You can use the older parallel port pods that connect to your PC's parallel port, or you can use the newer USB pods.

The USB pod is an open source design called the Turbo BDM Light Coldfire (TBLCF) created by Daniel Malik. You can download the design and build it yourself http://forums.freescale.com/freescale/board/message?board.id=CFCOMM&thread.id=624 or you can obtain a manufactured pod from Axiom Manufacturing http://www.axman.com/?q=node/303.

If using a parallel port pod, make sure you get the correct pod. Different processor speeds and core voltages require different pods. Newer pods should be able to handle the faster processors and different core voltages.

The latest version of the BDM software uses a GDB server program and allows the use of the standard M68K GDB tool provided in the RTEMS binary tools packages.

Debugging a BSP

To test a BSP you need an application. The samples and tests provide a proven set of applications that allow you to test your BSP. Select a sample application such as the Hello World to try first, then move to an application that uses more interrupt sources.

The following is a list of debugging tools and setups that we recommend. Time spent on these will be rewarded further into the project:

  • Try to create a working debugger environment for the target hardware. The GNU debugger, GDB, supports a number of different ways to connect to a target. The interface can be a simple serial port using a GdbSerialMonitor, or an expensive TCP/IP type hardware probe.
  • If a simulator is available for your target processor, it is recommended you learn how to use it. It will give you a stable debugging environment.
  • Implement a printk interface in your BSP. For target hardware that does not have a video interface the printk outputs to a serial port. The driver is normally a simple polled UART driver.
  • Use the Capture Engine to aid the debugging and verification of your real-time design.
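
The polled UART driver behind printk can be very small. The following is a minimal sketch only: the register layout, the TX-ready bit, and the demo_uart_* names are hypothetical stand-ins for whatever your UART provides, and how the character output routine is hooked up to printk is BSP specific and not shown.

```c
#include <stdint.h>

/* Hypothetical memory-mapped UART; the layout and bit value are assumptions. */
typedef struct {
  volatile uint8_t status;  /* bit 0 set when the transmitter can accept a byte */
  volatile uint8_t data;    /* write a byte here to transmit it */
} demo_uart_regs;

#define DEMO_UART_TX_READY 0x01u

/* Busy-wait until the transmitter is ready, then send one character. */
void demo_uart_putc(demo_uart_regs *uart, char c)
{
  while ((uart->status & DEMO_UART_TX_READY) == 0) {
    /* poll: no interrupts are needed, so this works very early in boot */
  }
  uart->data = (uint8_t)c;
}

/* Send a NUL-terminated string, expanding "\n" to "\r\n" for terminals. */
void demo_uart_puts(demo_uart_regs *uart, const char *s)
{
  while (*s != '\0') {
    if (*s == '\n') {
      demo_uart_putc(uart, '\r');
    }
    demo_uart_putc(uart, *s++);
  }
}
```

Because the routine never depends on interrupts or on RTEMS services, it can be called before the executive is initialized and from inside exception handlers.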

GDB and RTEMS

Currently GDB is not RTEMS aware. GDB scripts exist that can help by presenting kernel structures in a user friendly manner. These can be found here: GdbScripts.

The issue with making GDB aware of RTEMS is where the knowledge of the kernel structures is located. A version of GDB that contains a specific kernel structure layout will break if RTEMS changes. The ideal would be for GDB to find the structure elements using the application's debug information.

Eclipse Plug-in

Eclipse can be used as a GUI for GDB. General Eclipse related information can be found at RTEMS Eclipse Information. A more specific example of how a serial port is used for remote debugging can be found at RTEMS Eclipse Plug-in.

How much memory is left in the C Program Heap?

The C heap is a region so this should work:

  (gdb) p ((Region_Control *)_Region_Information->local_table[1])->Memory->first
  $9 = {back_flag=1, front_flag=8058280, next=0x7ea5b4, previous=0x7ea5b0}

Let's look at that gdb command in more detail.

  • _Region_Information contains the information used to manage Region objects.
  • _Region_Information->local_table is the object pointer table for Regions. It is indexed by the object index portion of the object ID.
  • _Region_Information->local_table[1] points to the first Region object. It is of type (Region_Control *).
  • ((Region_Control *)_Region_Information->local_table[1])->Memory points to the Heap control portion of this Region's control block.
  • ((Region_Control *)_Region_Information->local_table[1])->Memory->first references the contents of the first heap block on this Heap.

Notice that the front_flag is displayed as 8058280. This is in decimal since we used p rather than p/x in gdb. Since this number is even, we know the in-use bit is 0 and the block is free. Thus the first block on the heap is 8,058,280 bytes and there are at least that many bytes left.

NOTE: This is really a crude estimate.
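
The reading of front_flag described above can be written down as a small C sketch. It assumes, as the text does, that bit 0 of front_flag is the in-use bit and the remaining bits hold the block size in bytes. The function and macro names are hypothetical and are not part of the RTEMS API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the text above: bit 0 of front_flag marks a block in use. */
#define DEMO_HEAP_BLOCK_USED 1u

/* True when the in-use bit is set, that is, when front_flag is odd. */
bool demo_heap_block_is_used(uint32_t front_flag)
{
  return (front_flag & DEMO_HEAP_BLOCK_USED) != 0;
}

/* Block size in bytes: front_flag with the in-use bit cleared. */
uint32_t demo_heap_block_size(uint32_t front_flag)
{
  return front_flag & ~DEMO_HEAP_BLOCK_USED;
}
```

Applied to the value printed by gdb, 8058280 is even, so the block is free and its size is 8058280 bytes.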

If you have compiled RTEMS libraries with -DRTEMS_DEBUG, malloc will maintain statistics. From an RTEMS application:

#include <libcsupport.h>
    .....
  malloc_dump ();

will print malloc's heap statistics to stdout. Example usage can be found in c/src/tests/frye/libcleak and c/src/tests/libtests/malloctest.

The following GDB macro may also be of use. Pass an argument of 0 for a summary, or 1 to explicitly list each block.

 define rtems-mallocheap-walk
   printf "walking the heap:\n"
   set $heapstart = ((Region_Control *)_Region_Information->local_table[RTEMS_Malloc_Heap&0xffff])->Memory->start
   set $currentblock = $heapstart
   set $used = 0
   set $numused = 0
   set $free = 0
   set $numfree = 0
   while $currentblock->front_flag != 1
     if $currentblock->front_flag & 1
       if $arg0 != 0
         printf "USED: %d\n", $currentblock->front_flag & ~1
       else
         printf "*"
       end
       set $used = $used + ($currentblock->front_flag & ~1)
       set $numused = $numused + 1
     else
       if $arg0 != 0
         printf "FREE: %d\n", $currentblock->front_flag & ~1
       else
         printf "."
       end
       set $free = $free + ($currentblock->front_flag & ~1)
       set $numfree = $numfree + 1
     end
     set $currentblock = (Heap_Block *)((char *)$currentblock + ($currentblock->front_flag&~1))
   end
   if $arg0 == 0
     printf "\n"
   end
   printf "TOTAL: %d (%d)\tUSED: %d (%d) \tFREE: %d (%d)\n", \
     $used + $free, $numused + $numfree, \
     $used, $numused, \
     $free, $numfree
 end
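
For readers who find the GDB macro language hard to follow, the same walk can be sketched in C. This is an illustration under assumptions, not RTEMS code: the block header is taken to be a 32-bit back_flag followed by a 32-bit front_flag, the walk ends at a block whose front_flag equals 1 (as in the macro), and the demo_* names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Totals gathered by the walk; these mirror the macro's counters. */
typedef struct {
  uint32_t used_bytes;
  uint32_t used_blocks;
  uint32_t free_bytes;
  uint32_t free_blocks;
} demo_heap_totals;

/* Assumed header layout: 32-bit back_flag, then 32-bit front_flag. */
#define DEMO_FRONT_FLAG_OFFSET 4u
#define DEMO_USED_BIT          1u

/* Walk the blocks from 'start' until the terminating block is reached. */
demo_heap_totals demo_heap_walk(const unsigned char *start)
{
  demo_heap_totals totals = { 0, 0, 0, 0 };
  const unsigned char *block = start;

  for (;;) {
    uint32_t front_flag;
    uint32_t size;

    memcpy(&front_flag, block + DEMO_FRONT_FLAG_OFFSET, sizeof front_flag);
    if (front_flag == DEMO_USED_BIT) {
      break;                      /* terminator: in-use bit set, size zero */
    }
    size = front_flag & ~DEMO_USED_BIT;
    if (size == 0) {
      break;                      /* corrupt header: stop, do not loop forever */
    }
    if ((front_flag & DEMO_USED_BIT) != 0) {
      totals.used_bytes += size;
      totals.used_blocks++;
    } else {
      totals.free_bytes += size;
      totals.free_blocks++;
    }
    block += size;                /* the next header follows this block */
  }
  return totals;
}
```

The extra check for a zero size is not in the macro; it is added here so that a corrupted header ends the walk instead of spinning on the same block.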

How much memory is left in the RTEMS Workspace?

An RTEMS workspace overage can be fairly easily spotted with a debugger. Look at _Workspace_Area. If first == last, then there is only one free block of memory in the workspace (very likely if no tasks or message queues have been deleted). Then do this:

  (gdb) p (Heap_Block *) _Workspace_Area->first
  $3 = {back_flag=1, front_flag=68552, next=0x1e260, previous=0x1e25c}

  • _Workspace_Area is the variable name of the RTEMS Workspace Heap control block.
  • (Heap_Block *) _Workspace_Area->first references the contents of the first heap block.

Just as with the C Program Heap, the number is even, indicating the block is free. In this case, I had 68552 bytes left in the workspace.

NOTE: This is really a crude estimate. GDB 5.0 and newer support a macro language that provides the features necessary to write a function which would walk a heap structure and print out accurate statistics. If you write this, submit it. :)

BSP rtems_initialize_executive_late call dies

Your target is booting and the BSP is initialising but a call to rtems_initialize_executive_late results in an exception, or the target locking up. This can be due to a few reasons that you will need to work through.

  • The memory map is not correct. RTEMS uses the Workspace to create the initialization task, its stack, the interrupt stack, and more. If the Workspace is wrong the target will die. Exactly how depends on the specific processor and mapping error. If you have implemented printk it may be a good idea to add some code to bsp_pretasking_hook to show you the base and size:
          printk ("Heap : %5d KB @ 0x%08x\n   ", heap_size >> 10, heap_start);
  • The initialization process encountered an error. The set of these is documented in the Classic API C User's Guide. For RTEMS 4.6.1, this was documented in the Initialization Manager Failure section.
  • Interrupts are pending and no interrupt handler is present, or a bug exists in the interrupt handler or vector table layout. RTEMS enables interrupts when it switches to the initialization task that rtems_initialize_executive_late creates. You can identify this problem by stepping to the context switch call, _CPU_Context_Switch, and seeing the crash or lockup occur there. The fix is to make sure the BSP has masked all sources of interrupts in the hardware around the processor. This gives RTEMS and the application an ordered and controlled initialization of drivers and therefore interrupts.

The initialization task is a real task in RTEMS. The call to rtems_initialize_executive_late creates it and switches context to it. This means your BSP environment is another context that RTEMS will switch back to when RTEMS is shut down. This allows your BSP to take control again, then perform any specific functions it needs. A typical thing to do is to reboot, as embedded targets should not stop.

GDB Cannot Find My Source Files

If you find the source code paths in your executable as seen by GDB are missing, invoking the RTEMS configure script with an absolute path may help. When RTEMS libraries get built, nested Makefiles are executed that walk through the build directory structure. Therefore, each file is compiled from a certain point in the build directory structure that lies in parallel to the source directory structure.

As a consequence, you get a varying number of "dots", depending on how deep the corresponding directory is inside the build tree. There is no common location from which the source file pointers in the debug info are correct for all object files.

If you call the initial configure in an absolute way such as:

/usr/local/src/rtems/tools/rtems-4.6.0/configure

rather than a relative way:

../rtems-4.6.0/configure

GDB should find the source.

Starting With Hello World

Congratulations! You are a new RTEMS user and you just got the hello world example to run on either a simulator or target hardware. You are on top of the world. So you modify hello world -- wouldn't it be cool to put a sleep between some prints like this:

printf( "Hello world -- line 1\n");
sleep(1);
printf( "Hello world -- line 2\n");

That sleep() could be any other call which blocks the caller while time passes. But when you run this program, it only prints "Hello world -- line 1" and appears to lock up. What is happening?

The answer is simpler than you think. RTEMS is always custom configured to meet the requirements of an application. This means that the number and types of objects and device drivers available are tailored. The hello world application does not require a clock device driver and thus it is not configured. When you added the sleep(), you added a call which needs the clock device driver configured in order to work. All you have to do is add this line to the configuration section of the application BEFORE including confdefs.h.

#define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER

As you add to your program, you may have to increase the number of objects configured as well.
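
For context, a complete configuration section for the modified hello world might look like the sketch below. The values are illustrative. It assumes a release that uses the CONFIGURE_APPLICATION_NEEDS_* names and the confdefs.h header name used on this page; newer releases install the header as rtems/confdefs.h.

```c
/* Configuration section of the application: a sketch, not a template. */
#include <rtems.h>

rtems_task Init(rtems_task_argument argument);

#define CONFIGURE_APPLICATION_NEEDS_CONSOLE_DRIVER  /* needed for printf() */
#define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER    /* needed for sleep() */

#define CONFIGURE_MAXIMUM_TASKS 1                   /* raise as tasks are added */
#define CONFIGURE_RTEMS_INIT_TASKS_TABLE            /* Init is the only init task */

/* Every CONFIGURE_* define must come before this include. */
#define CONFIGURE_INIT
#include <confdefs.h>
```

Any define placed after the include has no effect, which is why the order matters.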

Standard IO and File Issues

Newlib's Stdio Functions return -1/EOF

The stdio functions in newlib depend on both initialised and uninitialised data. If you find they are returning -1, ensure your .bss and .data sections are correctly set up. Check that your linkcmds file is creating the correct memory map and that your BSP boot process is copying or zeroing all appropriate sections in RAM. It's also worth double checking that your RAM and other hardware are working correctly!

open, fopen, and socket creation fail

RTEMS has very tight default configuration limits. Not being able to open a file or create a socket is a common error which indicates that you need to configure enough open file descriptors. By default, the constant CONFIGURE_LIBIO_MAXIMUM_DESCRIPTORS is set to 3 for stdin, stdout, and stderr. You will need to set it to the appropriate value for your application.
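
As a sketch, the limit is raised in the same configuration section as the other CONFIGURE_* defines. The value 20 is only an illustration; count stdin, stdout, and stderr plus every file and socket your application can have open at once.

```c
/* Must appear before confdefs.h is included. */
#define CONFIGURE_LIBIO_MAXIMUM_DESCRIPTORS 20  /* 3 for stdio + 17 for files and sockets */
```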

Optional Debugging Tools Provided by RTEMS

There are a number of optional debugging tools available with RTEMS that are not too well documented (yet!). These tools (and other goodies) can be found under cpukit/libmisc, and are well worth investigating.

Note: Most of what follows has been gleaned from the associated README files; I haven't added much original content, partly due to sloth, but more justifiably because I haven't yet used these tools extensively enough to be able to do any better! My primary purpose, at this point, in including these is so others, especially "nuBees" are aware of them.

RTEMS Monitor

The RTEMS Monitor is run as a high-priority task, and provides a useful window into the operation of your system. From the README:

monitor task

The monitor task is an optional task that knows about RTEMS
data structures and can print out information about them.
It is a work-in-progress and needs many more commands, but
is useful now.

The monitor works best when it is the highest priority task,
so all your other tasks should ideally be at some priority
greater than 1.

To use the monitor:
-------------------

    #include <rtems/monitor.h>

    ...

    rtems_monitor_init(0);

    The parameter to rtems_monitor_init() tells the monitor whether
    to suspend itself on startup.  A value of 0 causes the monitor
    to immediately enter command mode; a non-zero value causes the
    monitor to suspend itself after creation and wait for explicit
    wakeup.


    rtems_monitor_wakeup();
    
    wakes up a suspended monitor and causes it to reenter command mode.

Monitor commands
----------------

    The monitor prompt is 'rtems> '.
    Can abbreviate commands to "uniquity"
    There is a 'help' command.  Here is the output from various
    help commands:

        Commands (may be abbreviated)

          help      -- get this message or command specific help
          task      -- show task information
          queue     -- show message queue information
          symbol    -- show entries from symbol table
          pause     -- pause monitor for a specified number of ticks
          fatal     -- invoke a fatal RTEMS error

        task [id [id ...] ]
          display information about the specified tasks.
          Default is to display information about all tasks on this node

        queue [id [id ... ] ]
          display information about the specified message queues
          Default is to display information about all queues on this node

        symbol [ symbolname [symbolname ... ] ]
          display value associated with specified symbol.
          Defaults to displaying all known symbols.

        pause [ticks]
          monitor goes to "sleep" for specified ticks (default is 1)
          monitor will resume at end of period or if explicitly awakened

        fatal [status]
          Invoke 'rtems_fatal_error_occurred' with 'status'
          (default is RTEMS_INTERNAL_ERROR)

        continue
          put the monitor to sleep waiting for an explicit wakeup from the
          program running.


Sample output from 'task' command
---------------------------------

    rtems> task
      ID       NAME   PRIO   STAT   MODES  EVENTS   WAITID  WAITARG  NOTES
    ------------------------------------------------------------------------
    00010001   UI1     2    READY    P:T:nA    NONE15: 0x40606348
    00010002   RMON    1    READY    nP    NONE15: 0x40604110

    'RMON' is the monitor itself, so we have 1 "user" task.
    Its modes are P:T:nA which translate to:

        preemptable
        timesliced
        no ASRS

    It has no events.
    It has a notepad value for notepad 15 which is 0x40606348
    (this is the libc thread state)

Note that this README is quite dated - it hasn't been changed in 11 years! In fact, the monitor now provides many more commands; here is the output from the help command on a recent (July 2007 HEAD) session:

rtems $ help
config     - Show the system configuration.
itask      - List init tasks for the system
mpci       - Show the MPCI system configuration, if configured.
pause      - Monitor goes to "sleep" for specified ticks (default is 1).
             Monitor will resume at end of period or if explicitly
             awakened
              pause [ticks]
continue   - Put the monitor to sleep waiting for an explicit wakeup from
             the program running.
go         - Alias for 'continue'
node       - Specify default node number for commands that take id's.
              node [ node number ]
symbol     - Display value associated with specified symbol. Defaults to
             displaying all known symbols.
              symbol [ symbolname [symbolname ... ] ]
extension  - Display information about specified extensions. Default is to
             display information about all extensions on this node.
              extension [id [id ...] ]
task       - Display information about the specified tasks. Default is to
             display information about all tasks on this node.
              task [id [id ...] ]
queue      - Display information about the specified message queues.
             Default is to display information about all queues on this
             node.
              queue [id [id ... ] ]
object     - Display information about specified RTEMS objects. Object id's
             must include 'type' information. (which may normally be
             defaulted)
              object [id [id ...] ]
driver     - Display the RTEMS device driver table.
              driver [ major [ major ... ] ]
dname      - Displays information about named drivers.
exit       - Invoke 'rtems_fatal_error_occurred' with 'status' (default is
             RTEMS_SUCCESSFUL)
              exit [status]
fatal      - 'exit' with fatal error; default error is RTEMS_TASK_EXITTED
              fatal [status]
quit       - Alias for 'exit'
help       - Provide information about commands. Default is show basic
             command summary.
              help [ command [ command ] ]
rtems $

Capture Engine

The Capture Engine is another neat tool. Unlike the RTEMS Monitor, this one is already documented on the Wiki, though you have to know to search for it. Check it out!

CPU Usage Monitoring

There is a CPU usage monitoring facility available in cpukit/libmisc/cpuuse. Again, from the README:

This directory contains code to report and reset per-task CPU usage. If the BSP supports nanosecond timestamp granularity, this information is very accurate. Otherwise, it is dependent on the tick granularity.

It provides two primary features:

  • Generate a CPU Usage Report
  • Reset CPU Usage Information

NOTES

  1. If configured for tick granularity, CPU usage is "docked" by a clock tick at each context switch.
  2. If configured for nanosecond granularity, no work is done at each clock tick. All bookkeeping is done as part of a context switch.

Stack Checker

Introduction

This directory contains a stack bounds checker. It provides two primary features:

  1. Check for stack overflow at each context switch
  2. Provide an educated guess at each task's stack usage

Enabling

Add the stack checker extension to the initial user extension set. If using confdefs.h to build your configuration table, this is as simple as adding -DSTACK_CHECKER_ON to the gcc command line which compiles the file defining the configuration table. In the RTEMS test suites and samples, this is always init.c. Another way to enable it is to include the following prior to including confdefs.h:

#define STACK_CHECKER_ON

Once you've enabled the stack checker when building your application, the stack checker runs automatically as part of a context switch. Additionally, you can call

boolean rtems_stack_checker_is_blown(void);

at any time to check yourself; it returns FALSE if the stack appears okay, or TRUE if the stack pointer is out of range or the pattern marker has been corrupted.

Background

The stack overflow check at context switch works by checking whether a 16 byte pattern at the logical end of the stack has been corrupted. The "guesser" assumes that the entire stack was prefilled with a known pattern and assumes that the pattern is still in place if the memory has not been used as a stack.

Both of these can be fooled by pushing large holes onto the stack and not writing to them... or (much more unlikely) writing the magic patterns into memory.
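
Both mechanisms can be modelled in a few lines of C. This is a simplified sketch and not the stack checker's actual code: it assumes a stack that grows toward lower addresses (so the logical end is the start of the array), a 32-bit fill value chosen arbitrarily, and hypothetical demo_* names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FILL_PATTERN 0xA5A5A5A5u  /* arbitrary fill value (assumption) */
#define DEMO_GUARD_WORDS  4u           /* 4 x 32 bits = the 16 byte pattern */

/* Prefill the whole stack area before the task runs for the first time. */
void demo_stack_prefill(uint32_t *stack, size_t words)
{
  for (size_t i = 0; i < words; i++) {
    stack[i] = DEMO_FILL_PATTERN;
  }
}

/* Overflow check: has the pattern at the logical end been overwritten? */
bool demo_stack_is_blown(const uint32_t *stack)
{
  for (size_t i = 0; i < DEMO_GUARD_WORDS; i++) {
    if (stack[i] != DEMO_FILL_PATTERN) {
      return true;
    }
  }
  return false;
}

/* Usage guess: everything above the first overwritten word counts as used. */
size_t demo_stack_words_used(const uint32_t *stack, size_t words)
{
  size_t untouched = 0;

  while (untouched < words && stack[untouched] == DEMO_FILL_PATTERN) {
    untouched++;
  }
  return words - untouched;
}
```

The sketch also shows why holes fool the checker: a frame that reserves space past the guard words without writing to it leaves the pattern intact, so neither function notices.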

This code is provided as a tool for RTEMS users to catch the most common mistake in multitasking systems ... too little stack space. Suggestions and comments are appreciated.

Optional Compile-time Selections

If someone ever gets VERY VERY desperate, Joel recently added some conditionals which can turn on walking the heap and checking the stack EVERY time you enter an RTEMS dispatching disabled critical section. You have to recompile but since this is such a heavy handed thing to have on, that seemed a fair trade off. The "heavy stack check" feature is enabled by defining RTEMS_HEAVY_STACK_DEBUG, and the "heavy malloc check" by defining RTEMS_HEAVY_MALLOC_DEBUG; in both cases, do this before you build and install RTEMS.

NOTES

  1. Stack usage information is questionable on CPUs which push large holes on stack.
  2. Prior to 4.7.99.2, the stack checker used printf() instead of printk(). This often resulted in a fault when trying to print the helpful diagnostic message. If using the older printf() stack checker and the message comes out, congratulations. If not, then the variable Stack_check_Blown_task contains a pointer to the TCB of the offending task. This is usually enough to go on. Now that it is using printk(), it should be able to get messages out.

FUTURE

  1. Determine how/if gcc will generate stack probe calls and support that.