Version 3 (modified by ChrisJohns, on Feb 19, 2009 at 9:01:11 AM)

Add TOC and fix The Task heading.

Dynamic Object File Loading

Dynamic loading of code into a running RTEMS target has been a long-term wish for many RTEMS users. Dynamic loading is not for all systems, but systems with the resources to support it could make use of the advantages it offers.

Dynamic loading means a single binary release of RTEMS can be used by developers and production release teams. The binary executable the application is developed and tested against can be the same binary executable placed into production. A single checksum can verify the image. When linking to create a single executable, the verification and validation of the chain from source to library to executable needs to be carefully managed so the application and RTEMS match. A verified and released RTEMS image can reduce this overhead and therefore provide cost savings over the life cycle of a project. Dynamic loading also gives developers the ability to load debugging and support code into a running system to help locate a problem. This is particularly useful when a project is in an integration or testing phase and something has gone wrong.

Dynamic loading for RTEMS does not provide some of the advantages seen with virtual memory operating systems such as Linux. On these systems dynamic loading allows code to be shared between separate processes. RTEMS is a single process operating system so there is nothing to share code with. Dynamic loading also uses more resources. Memory is needed by the dynamic linker, and space is needed on the target for the libraries if they are held locally. You need a working file system to read the code from into memory, plus there is the management of symbols. A potential downside of dynamic linking, if not handled efficiently, is the possible loading of a complete library. Virtual memory operating systems such as Linux avoid this issue by code sharing and demand loading executable files.

In recent years Till Straumann has provided a separate package called cexp to allow loading of modules of code in RTEMS. His excellent package provides a model similar to that used in some commercial real-time operating systems. It does not, however, follow any of the standard APIs that RTEMS currently follows. It also provides custom solutions for some of the more complex issues that arise with dynamic loading of code. This code and his efforts provide an important base for this and future work related to dynamic loading in RTEMS. An important area he has solved is the management of targeted linking of libraries such as libc.

The IEEE Std 1003.1-2004 standard defines <dlfcn.h>. This is a small API that makes an executable object file available to the calling program. The API calls are:

int dlclose(void *);
char *dlerror(void);
void *dlopen(const char *, int);
void *dlsym(void *restrict, const char *restrict);

The functions provide an interface to the run-time linker and allow executable object files or shared object files to be loaded into a process's address space. RTEMS is a single process or single address space operating system so there is a close mapping to the needs of RTEMS.
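The four calls are typically used together in an open, look up, call, close sequence. The following is a minimal sketch of that sequence as it behaves on a POSIX host. It is not RTEMS code; call_by_name() is a hypothetical helper introduced only for illustration, and it assumes the looked-up symbol has the signature int (*)(const char *).

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Look up `symbol` in the object file `path` and call it as an
 * int (*)(const char *) function. A NULL `path` asks dlopen() for a
 * handle to the main program. Returns 0 on success, -1 on failure.
 * call_by_name() is a hypothetical helper, not part of <dlfcn.h>.
 */
int call_by_name(const char *path, const char *symbol,
                 const char *arg, int *result)
{
  int (*fn)(const char *) = NULL;
  void *handle;
  int status = -1;

  handle = dlopen(path, RTLD_NOW);
  if (handle == NULL)
    return -1;                 /* dlerror() would describe the failure */

  (void) dlerror();            /* clear any stale error state */
  /* POSIX idiom for storing the void * result in a function pointer. */
  *(void **) (&fn) = dlsym(handle, symbol);
  if (dlerror() == NULL && fn != NULL) {
    *result = fn(arg);
    status = 0;
  }

  dlclose(handle);
  return status;
}
```

For example, call_by_name(NULL, "atoi", "42", &value) looks up atoi through the main program's handle and calls it. On some hosts the program must be linked with -ldl.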

Some operating systems provide extra interfaces to help manage dynamically loaded object files. For example, FreeBSD provides dlinfo. RTEMS can provide these types of interfaces as the implementation demands.

The central component is the dynamic linker. The dynamic linker provides run-time loading and link-editing of object files. The linker loads the object file code for all shared libraries into the process's address space, performing any relocation, then proceeds to resolve external references from both the main program and all object files loaded. The linker calls initialisation routines for each object file loaded, giving a shared object an opportunity to perform any extra set-up before execution of the program proper begins. C++ libraries that contain static constructors require this type of initialisation. The dynamic linker is specific to the ELF file format. This means the RTEMS object file format is ELF for targets that require dynamic loading of object files.
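The resolve, relocate and initialise sequence can be illustrated with a toy model. The sketch below is not an ELF loader: every toy_ name is hypothetical, the records are simplified stand-ins for ELF symbol and relocation entries, and the only relocation modelled is "store symbol value plus addend in a slot".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for ELF symbol and relocation records. */
typedef struct { const char *name; uintptr_t value; } toy_symbol;
typedef struct { size_t slot; const char *symbol; uintptr_t addend; } toy_reloc;
typedef void (*toy_init_fn)(void);

typedef struct {
  uintptr_t         *image;       /* loaded object image, as word-sized slots */
  size_t             slots;
  const toy_reloc   *relocs;
  size_t             reloc_count;
  const toy_init_fn *inits;       /* initialisation routines, in call order */
  size_t             init_count;
} toy_object;

static const toy_symbol *toy_find(const toy_symbol *table, size_t count,
                                  const char *name)
{
  for (size_t i = 0; i < count; ++i)
    if (strcmp(table[i].name, name) == 0)
      return &table[i];
  return NULL;
}

/*
 * Resolve every external reference and patch the image, then call the
 * initialisers. Returns 0 on success. Returns -1 on an unresolved symbol
 * or a bad slot; in that case no initialiser has been called, although
 * earlier slots may already have been patched.
 */
int toy_link(toy_object *obj, const toy_symbol *globals, size_t global_count)
{
  for (size_t i = 0; i < obj->reloc_count; ++i) {
    const toy_reloc *r = &obj->relocs[i];
    const toy_symbol *s = toy_find(globals, global_count, r->symbol);
    if (s == NULL || r->slot >= obj->slots)
      return -1;
    obj->image[r->slot] = s->value + r->addend;
  }
  for (size_t i = 0; i < obj->init_count; ++i)
    obj->inits[i]();
  return 0;
}

static int toy_init_calls;
static void toy_count_init(void) { ++toy_init_calls; }

/* Usage: one object with a single reference to "base" plus 4. */
int toy_demo(void)
{
  uintptr_t image[2] = { 0, 0 };
  const toy_symbol globals[] = { { "base", 0x1000 } };
  const toy_reloc relocs[] = { { 1, "base", 4 } };
  const toy_init_fn inits[] = { toy_count_init };
  toy_object obj = { image, 2, relocs, 1, inits, 1 };

  toy_init_calls = 0;
  if (toy_link(&obj, globals, 1) != 0)
    return -1;
  return (image[1] == 0x1004 && toy_init_calls == 1) ? 0 : -1;
}
```

Running the initialisers only after all relocations succeed mirrors the ordering described above: a static constructor must not run while the object still holds unresolved references.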

The main application can be viewed as an object file that is required to be loaded, relocated and initialised before being started. It can be considered the root of a tree of dynamically referenced object files loaded at run-time.

The Task

The task for RTEMS can be split in two: the host side and the target side. The two sides need to agree on the specific interfaces, formats and services performed. One example of an issue that needs to be considered is libgcc and RTEMS itself. Are these object files also considered dynamic and loaded by the dynamic linker?

The issue of version numbers for the various components needs consideration.

On the host side the aim is to leverage as much existing code as possible. Dynamic linking is well understood and stable on a number of operating systems using the GNU Compiler Collection. RTEMS also uses this tool set, so the use of the dynamic linking features of these tools is to be investigated and reported. The task starts with attempting to link a simple application such as "Hello World" dynamically. This will require the current static RTEMS libraries be made to appear as dynamic libraries. For an evaluation, how this is done is not important. The static libraries may need to be unpacked and repacked as dynamic libraries, or rebuilt from source. Not all libraries need to be dynamic. One may be enough to report suitable findings.

The target side needs to handle ELF files in an efficient manner. The license for the code must be compatible with RTEMS. This rules out the BFD library in the binutils package. It is GPL code and too generic to meet the specific needs of RTEMS target code. Efficient handling of symbols is also needed. Compression can be useful here as it provides for smaller libraries, plus it provides a run-time checksum on the object file or library. Java handles its class libraries this way.
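As an indication of how little code a purpose-built ELF reader needs for a first sanity check, the sketch below validates the identification bytes at the start of a file. The byte offsets and values come from the ELF specification's e_ident layout; the toy_ names are hypothetical, and the constants are defined locally on the assumption that the target has no <elf.h>.

```c
#include <stddef.h>

/* Offsets and values from the ELF specification's e_ident layout. */
enum {
  TOY_EI_CLASS = 4,      /* index of the file class byte */
  TOY_EI_DATA = 5,       /* index of the data encoding byte */
  TOY_EI_NIDENT = 16,    /* size of e_ident */
  TOY_ELFCLASS32 = 1,
  TOY_ELFCLASS64 = 2,
  TOY_ELFDATA2LSB = 1,   /* little endian */
  TOY_ELFDATA2MSB = 2    /* big endian */
};

typedef struct {
  int bits;              /* 32 or 64 */
  int big_endian;        /* 1 for big endian, 0 for little endian */
} toy_elf_ident;

/*
 * Check the ELF magic, class and data encoding in the first bytes of a
 * file. Returns 0 and fills `out` on success, -1 if the buffer is too
 * short or is not a recognisable ELF file.
 */
int toy_elf_identify(const unsigned char *buf, size_t len, toy_elf_ident *out)
{
  if (buf == NULL || out == NULL || len < (size_t) TOY_EI_NIDENT)
    return -1;
  if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
    return -1;

  switch (buf[TOY_EI_CLASS]) {
    case TOY_ELFCLASS32: out->bits = 32; break;
    case TOY_ELFCLASS64: out->bits = 64; break;
    default: return -1;
  }

  switch (buf[TOY_EI_DATA]) {
    case TOY_ELFDATA2LSB: out->big_endian = 0; break;
    case TOY_ELFDATA2MSB: out->big_endian = 1; break;
    default: return -1;
  }

  return 0;
}
```

A target loader built for one architecture could go further and reject any class or byte order other than its own, which is the kind of specialisation a generic library such as BFD cannot assume.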

A complexity raised earlier, which Till handles with a special host side tool, is selective linking against a library. Consider the "Hello World" example where a call to 'printf' is made. The linking phase sees that the printf symbol is held in the dynamic library libc and fills the ELF file with the information required to allow the dynamic linker to perform its task. Our target run-time linker sees the reference to printf and libc, and opens and loads the whole library. It has to do this because libc has no smaller dynamic library component. Let us now consider a very clever run-time linker that loads only the object files from libc that are used. That is, we dynamically link everything based on demand at the object file level. The loading and relocating component can be handled, however the initialisation needs consideration. How do you handle adding a dlopen call to our example to load an object file that also references libc, but a part that is not currently loaded and needs loading? How is this initialisation handled? A possible solution is the breaking down of libraries into a smaller collection of dynamic libraries. RTEMS could be packaged this way with a certain effort, however libc, libm and other libraries also need consideration.
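The bookkeeping behind demand loading at the object file level can be sketched as follows. This is one possible approach under simplifying assumptions, not a proposed design: every toy_ name is hypothetical, each library member defines exactly one symbol, and a counter stands in for running a member's initialisation routine.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NEEDS 4

/* Hypothetical model of one object file inside a library such as libc. */
typedef struct {
  const char *defines;               /* symbol this object file provides */
  const char *needs[TOY_MAX_NEEDS];  /* symbols it references; NULL ends
                                        the list when it is not full */
  int         loaded;                /* non-zero once loading has started */
  int         init_runs;             /* times the initialiser has run */
} toy_member;

static toy_member *toy_member_for(toy_member *lib, size_t count,
                                  const char *symbol)
{
  for (size_t i = 0; i < count; ++i)
    if (strcmp(lib[i].defines, symbol) == 0)
      return &lib[i];
  return NULL;
}

/*
 * Load the member that defines `symbol` plus anything it references that
 * is not already loaded. Returns the number of members newly loaded, or
 * -1 if a symbol is not defined by the library. A real loader would also
 * have to unwind a partial load on failure; this sketch does not.
 */
int toy_demand_load(toy_member *lib, size_t count, const char *symbol)
{
  toy_member *m = toy_member_for(lib, count, symbol);
  int loaded_now = 0;

  if (m == NULL)
    return -1;
  if (m->loaded)
    return 0;                /* already present: do not initialise again */

  m->loaded = 1;             /* mark first so circular references stop */
  for (size_t i = 0; i < TOY_MAX_NEEDS && m->needs[i] != NULL; ++i) {
    int n = toy_demand_load(lib, count, m->needs[i]);
    if (n < 0)
      return -1;
    loaded_now += n;
  }

  ++m->init_runs;            /* runs after the members it references,
                                except where references are circular */
  return loaded_now + 1;
}
```

With an invented dependency chain in which printf needs vfprintf and vfprintf needs write, a first request for printf loads three members. A later request, for example from an object file opened with dlopen, for a member that also needs write loads only the new member and leaves write initialised once. The sketch leaves open the harder question raised above of what an initialiser may assume about state set up by earlier loads.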

References:

1. FreeBSD Man Pages - dlopen: http://www.freebsd.org/cgi/man.cgi?query=dlopen&format=html
2. Open Group Single Unix Specification: http://www.opengroup.org/onlinepubs/009695399/functions/dlopen.html
3. FreeBSD Man Pages - ld-elf.so.1, ld.so, rtld: http://www.freebsd.org/cgi/man.cgi?query=rtld&sektion=1