
Version 28 (modified by GlennHumphrey, on Sep 9, 2009 at 1:34:24 AM)


RTEMS Coverage Analysis

RTEMS is used in many critical systems. It is important that the RTEMS Project ensure that the RTEMS product is tested as thoroughly as possible. With this goal in mind, we have set out to expand the RTEMS test suite so that 100% of the RTEMS executive is tested. There are numerous industry- and country-specific standards for safety, including FAA DO-178B for flight software in the United States. There are similar aviation standards in other countries as well as in domains such as medical devices, trains, and military applications. As a free software project, the RTEMS Project will never have a complete set of certification paperwork available for download. But we would like to ensure that RTEMS meets the technical requirements that are shared across these safety and quality oriented standards.

We encourage members of the community to help out. If you are in a domain where a safety or certification standard applies, work with us to understand that standard and guide us to providing a polished RTEMS product that helps meet its criteria. Provide funding to augment tests, test procedures, or documentation that would aid you in using RTEMS in your domain. Once an artifact is merged into the project, it becomes a community asset that will be easier to maintain. Plus, the increased level of testing helps ensure that submissions to RTEMS do not negatively impact you.

Be active and help us meet your application domain requirements while improving the product for all!

Applying Coverage Analysis to RTEMS

In order to achieve the 100% tested goal, it is important to define what constitutes 100% tested. A lot of information exists about how to completely test a software application. In general, the term Code Coverage is used to refer to the analysis that is performed to determine what portions of the software are tested by the test suite and what portions are not tested. For some background information on Code Coverage Analysis, see Coverage Analysis Theory.

Traditionally, Code Coverage Analysis has been performed by instrumenting the source code or object code or by using special hardware to monitor the instructions executed. An objective of the RTEMS code coverage effort is to use existing tools and to avoid altering the code to be analyzed. This can be accomplished by using a processor simulator that provides coverage analysis information. The information can be processed to determine which instructions are executed. We call this object code coverage. Initially, we set out to achieve 100% object code coverage of the RTEMS executive.

It is also important to define what is actually being tested. The RTEMS executive can contain a significant amount of code. The concept of profiles was introduced to provide boundaries for what is actually tested.

How it was Done

Automated coverage testing is performed using a processor simulator in conjunction with a set of RTEMS-specific support scripts. The code to be analyzed is linked together as a single relocatable object with special start (COVERAGE_START) and end (COVERAGE_END) symbols. The relocatable is then linked to the same address in every test from the test suite. Each test is then executed on a processor simulator that gathers information about which instructions were executed and produces a coverage map for the test. After all tests have finished, the support script covmerge is used to merge all coverage maps into a unified coverage map for the entire test suite and to produce reports that identify the uncovered code. The picture shown provides the general flow of the process.
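The merge step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory map holding one byte per address between COVERAGE_START and COVERAGE_END, nonzero if the instruction at that offset was executed. The actual map format depends on the simulator, and coverage_merge is an illustrative name, not a function from covmerge.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical coverage map: one byte per address in the analyzed
 * range, nonzero if the simulator saw that address executed.
 * Because the relocatable is linked at the same address in every
 * test, offset i means the same instruction in every map.
 */

/*
 * Fold the map from one test into the unified map for the suite.
 * An address is covered overall if any single test covered it.
 * Returns the unified map so calls can be chained.
 */
uint8_t *coverage_merge(uint8_t *unified, const uint8_t *test_map, size_t size)
{
  for (size_t i = 0; i < size; i++) {
    unified[i] = (uint8_t) (unified[i] | test_map[i]);
  }
  return unified;
}
```

Calling coverage_merge once per test, starting from an all-zero unified map, leaves zero bytes only at addresses that no test executed.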

[Figure: general flow of the coverage analysis process]

What was Discovered

There are multiple ways to measure progress on this task. We primarily use two metrics. The first is the reduction in the number of uncovered binary code ranges from the number identified initially. The second is the amount of untested binary object code as a percentage of the total code size under analysis. Together the metrics provide useful information. Some uncovered ranges may be a single instruction, so eliminating such a range improves the first metric more than the second.
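A small sketch of how the two metrics could be computed from a unified coverage map. It assumes the same hypothetical one-byte-per-address map (nonzero means executed); the function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Metric 1: number of maximal runs of unexecuted bytes in the map. */
size_t uncovered_range_count(const uint8_t *map, size_t size)
{
  size_t ranges = 0;
  int in_gap = 0;

  for (size_t i = 0; i < size; i++) {
    if (map[i] == 0) {
      if (!in_gap) {
        ranges++;          /* a new uncovered range starts here */
        in_gap = 1;
      }
    } else {
      in_gap = 0;
    }
  }
  return ranges;
}

/* Helper: total number of unexecuted bytes in the map. */
size_t uncovered_byte_count(const uint8_t *map, size_t size)
{
  size_t bytes = 0;

  for (size_t i = 0; i < size; i++) {
    if (map[i] == 0) {
      bytes++;
    }
  }
  return bytes;
}

/* Metric 2: unexecuted bytes as a percentage of the code analyzed. */
double uncovered_percent(const uint8_t *map, size_t size)
{
  if (size == 0) {
    return 0.0;
  }
  return 100.0 * (double) uncovered_byte_count(map, size) / (double) size;
}
```

For the 8-byte map {1,0,1,0,0,0,1,1} this gives 2 ranges and 50% uncovered. Covering only the single-byte range leaves 1 range and 37.5% uncovered: the range count halves while the percentage drops by a quarter, which is the effect described above.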

Beyond Object Code Coverage

Statement Coverage

This requires knowing which source files are involved (which we do) and which lines in those files can produce assembly code (which I don't think we know completely). We can easily determine which lines are comments or blank, but going beyond that will require some thought.

The current object coverage utility covmerge can be modified to generate a report of which source lines were covered. It could generate a bitmap per source file in which each bit indicates whether the corresponding source line in that file was executed. If we can generate a similar bitmap from the source code which marks comments and other non-executable source lines as covered, then the union of the two bitmaps can be used to generate a report showing which source lines are not covered or not represented in the object code. This may indicate dead code or weaknesses in the tests.
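The bitmap union described above might look like the following sketch. The layout is an assumption: bit (line - 1) of each bitmap corresponds to a 1-based source line, packed into 32-bit words. None of these names exist in covmerge.

```c
#include <stddef.h>
#include <stdint.h>

/* Test the bit for a 1-based source line number in a packed bitmap. */
static int line_bit_is_set(const uint32_t *bitmap, size_t line)
{
  size_t index = line - 1;

  return (int) ((bitmap[index / 32] >> (index % 32)) & 1u);
}

/*
 * A line is reported only if it is clear in the union of both
 * bitmaps: it was never executed AND it is not a comment, blank,
 * or other non-executable line.
 */
int line_is_uncovered(
  const uint32_t *executed,
  const uint32_t *non_executable,
  size_t          line
)
{
  return !line_bit_is_set(executed, line) &&
         !line_bit_is_set(non_executable, line);
}

/* Count the lines in a file of line_count lines that need attention. */
size_t count_uncovered_lines(
  const uint32_t *executed,
  const uint32_t *non_executable,
  size_t          line_count
)
{
  size_t uncovered = 0;

  for (size_t line = 1; line <= line_count; line++) {
    if (line_is_uncovered(executed, non_executable, line)) {
      uncovered++;
    }
  }
  return uncovered;
}
```

For an 8-line file with lines 2, 3, and 5 executed (0x16) and lines 1 and 8 marked non-executable (0x81), lines 4, 6, and 7 remain clear in the union and would appear in the report.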

This is definitely an open project at this point.

MC/DC

From the RTEMS testing perspective, this is to verify that every branch instruction in the generated object has been both taken and not taken. We cannot determine this without help from a simulator or hardware debugger which gathers this information.
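As a sketch of the taken/not-taken check, assume a simulator or hardware debugger could export one record per branch instruction with two flags. The branch_record layout and function names here are hypothetical; no existing tool output format is implied.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-branch record exported by a simulator. */
typedef struct {
  uint32_t address;    /* address of the branch instruction */
  uint8_t  taken;      /* nonzero if the branch was ever taken */
  uint8_t  not_taken;  /* nonzero if it ever fell through */
} branch_record;

/* A branch meets the criterion only if both outcomes were observed. */
int branch_fully_covered(const branch_record *branch)
{
  return branch->taken != 0 && branch->not_taken != 0;
}

/* Count the branches that still need a test for at least one outcome. */
size_t branches_needing_tests(const branch_record *records, size_t count)
{
  size_t missing = 0;

  for (size_t i = 0; i < count; i++) {
    if (!branch_fully_covered(&records[i])) {
      missing++;
    }
  }
  return missing;
}
```

Note that a branch can be fully covered at the instruction level (it executed) and still fail this check, for example when every test happens to take it.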

QEMU -- project to do MC/DC .. update here

Coding Advice

Reasons Code is Not Executed

The coverage analysis provides a report on the ranges of assembly instructions within RTEMS subsystems which are not currently exercised by the tests in the current configuration. Each case has to be individually analysed and addressed. Historically, we have identified multiple categories for code being uncovered:

  • Needs a new test case
  • Unreachable in current RTEMS configuration. For example, the SuperCore could have a feature only exercised by a POSIX API object. It could be disabled when POSIX is not configured.
  • Debug or sanity checking code which can be placed inside an RTEMS_DEBUG conditional.
  • Unreachable paths generated by gcc for switches. Sometimes you have to restructure switches to avoid unreachable object code.
  • Critical sections which are synchronizing actions with ISRs. Most of these are very hard to hit and may require very specific support from a simulator environment. OAR has used tsim to exercise these paths but this is not reproducible in a BSP independent manner. Worse, there is often no external way to know the case in question has been hit and no way to do it in a one shot test. The spintrcriticalXX and psxintrcriticalXX tests attempt to reproduce these cases.
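Two of the categories above, debug checks and switch statements, can be illustrated with a short sketch. The functions and the enum are invented for illustration and are not RTEMS code. Whether a given switch restructuring actually removes unreachable object code depends on the compiler version and options, so the result should be confirmed in the coverage report.

```c
#include <stddef.h>

/*
 * Sanity check placed inside an RTEMS_DEBUG conditional. In a normal
 * build the check produces no object code, so it cannot show up as an
 * uncovered range. Callers are then required to pass a valid index.
 */
int table_lookup(const int *table, size_t size, size_t index)
{
#if defined(RTEMS_DEBUG)
  if (index >= size) {
    return -1;
  }
#else
  (void) size;
#endif
  return table[index];
}

/* Hypothetical three-state enum used only for this example. */
typedef enum { STATE_READY, STATE_BLOCKED, STATE_DORMANT } example_state;

/*
 * One possible restructuring: fold the last enumerator into the
 * default label. Every label is then reachable with a valid value,
 * and there is no separate "none of the above" path after the
 * switch that the tests can never reach.
 */
const char *state_name(example_state state)
{
  switch (state) {
    case STATE_READY:
      return "ready";
    case STATE_BLOCKED:
      return "blocked";
    default: /* STATE_DORMANT */
      return "dormant";
  }
}
```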

Impact of Optimization Level

Discuss impact of -O2 versus -Os with example from code.

Impact of RTEMS Configuration Options

Inlining _Thread_Dispatch_enable, etc.

Test Procedure

The scripts, tools, and patches are currently in the CVS module gcc-testing in the subdirectory rtems-coverage in the RTEMS CVS Repository.

Compilation and Configuration Options

TBD

Coverage Profiles

RTEMS includes a lot of source code and the coverage analysis should focus on improving the test coverage of well-defined code subsets with a trend over time of increasing both the level of coverage (e.g. object to statement to decision to MC/DC) and the amount of source code covered.

As other support libraries in cpukit are covered, they will be moved from the Developmental profile and added to the POSIX Enabled and Classic API Only profiles.

POSIX Enabled

This is the first profile we tested. This initially focused on the score, sapi, rtems, and posix directories in the cpukit directory. This profile represents a full tasking and synchronization feature set.

Classic API Only (POSIX Disabled)

In this profile, we disable POSIX and focus on the contents of the score, sapi, and rtems directories in the cpukit directory. The POSIX API and tests are disabled. In this profile, we expect to identify:

  • features in score only exercised by POSIX
  • features in score available via Classic API but only tested via POSIX
  • POSIX features like sleep() which remain enabled even when POSIX threads are disabled.

The first case will allow us to disable score features in this configuration and reduce the code size.

The second case allows us to approach 100% coverage in every RTEMS configuration.

The third case is similar to the second and indicates the need for tests in this configuration for features that are technically part of the POSIX API support.
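The first case could be handled with conditional compilation, as in the sketch below. It assumes the RTEMS_POSIX_API macro is defined when the POSIX API is enabled in the build. The enum and function are invented for illustration and do not correspond to actual score code.

```c
/* Hypothetical feature set; only the last entry is POSIX specific. */
typedef enum {
  EXAMPLE_DISCIPLINE_FIFO,
  EXAMPLE_DISCIPLINE_PRIORITY,
  EXAMPLE_DISCIPLINE_POSIX_ONLY
} example_discipline;

/*
 * The POSIX-only case is compiled in only when the POSIX API is
 * enabled. In a Classic API Only build it generates no object code,
 * so it neither adds to the code size nor appears as an uncovered
 * range in that profile.
 */
int example_discipline_is_supported(example_discipline discipline)
{
  switch (discipline) {
    case EXAMPLE_DISCIPLINE_FIFO:
    case EXAMPLE_DISCIPLINE_PRIORITY:
      return 1;
#if defined(RTEMS_POSIX_API)
    case EXAMPLE_DISCIPLINE_POSIX_ONLY:
      return 1;
#endif
    default:
      return 0;
  }
}
```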

Developmental

This is an experimental/developmental coverage configuration and adds almost all of the CPUKit contents that are non-networked. It nearly doubles the size of the code being covered. We are aiming for the entire contents of libcsupport, libmisc, and various filesystems. This is a large body of code and components like Termios and the file systems will require creativity to get automated coverage near 100%.

We have done initial tests on this profile. There is work to be done improving the test coverage. As components are covered 100%, they will be moved from experimental/developmental status to be included in the official coverage run.

We welcome your contributions.

BSPs Analyzed

If you know of a simulator that includes coverage analysis, please let us know.

ARM

The SkyEye project has added coverage analysis capabilities per our specifications. We are currently using it on the following ARM targets to generate coverage reports:

Blackfin

Since SkyEye supports this target architecture, we hope to one day get coverage results on the following BSPs:

  • eZKit553

Coldfire

SkyEye supports the Coldfire but is currently unable to run any RTEMS Coldfire BSP. Work to improve SkyEye's Coldfire support is welcomed. We look forward to being able to use it to perform coverage testing on the following BSPs:

  • mcf5206elite

i386

We have identified Qemu as a simulator that could provide the coverage information. This project (http://libre.adacore.com/libre/tools/coverage/) aims to add the necessary capabilities to that simulator. The source code for this project is available from http://forge.open-do.org/scm/?group_id=8. Now it is up to you.

We anticipate that someday we will be able to do coverage testing using Qemu on the following BSPs:

  • pc386

SPARC

We are using TSIM from Gaisler Research on the following BSPs:

  • ERC32
  • LEON2
  • LEON3

References

General Coverage Testing

Standards and Certifications

  • FAA DO-178B - United States Aviation Standard
