source: rtems/doc/porting/taskcontext.t @ 451a46f

4.11
Last change on this file since 451a46f was 451a46f, checked in by Joel Sherrill <joel.sherrill@…>, on Apr 22, 2011 at 5:54:38 PM

2011-04-22 Joel Sherrill <joel.sherrill@…>

PR 1782/cpukit

  • porting/taskcontext.t: Disable deferred FPU context switches when SMP is enabled. Per code tracking of deferred contexts is not implemented.
  • Property mode set to 100644
File size: 21.9 KB
@c
@c  COPYRIGHT (c) 1988-2002.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c
@c  $Id$
@c

@chapter Task Context Management

@section Introduction

XXX

@section Task Stacks

XXX

@subsection Direction of Stack Growth

The CPU_STACK_GROWS_UP macro is set based upon the answer to the following
question: Does the stack grow up (toward higher addresses) or down (toward
lower addresses)?  If the stack grows upward in memory, then this macro
should be set to TRUE.  Otherwise, it should be set to FALSE to indicate
that the stack grows downward toward smaller addresses.

The following illustrates how the CPU_STACK_GROWS_UP macro is set:

@example
#define CPU_STACK_GROWS_UP               TRUE
@end example

@subsection Minimum Task Stack Size

The CPU_STACK_MINIMUM_SIZE macro should be set to the minimum size of each
task stack.  This size is specified in bytes.  The minimum stack size
should be large enough to run all RTEMS tests and is chosen such that a
"reasonably" small application should not have any problems.  Choosing a
minimum stack size that is too small will result in the RTEMS tests
overflowing their stacks and not executing properly.

There are many reasons a task could require a stack size larger than the
minimum.  For example, a task could have a very deep call path or declare
large data structures on the stack.  Tasks which utilize C++ exceptions
tend to require larger stacks, as do Ada tasks.

The following illustrates setting the minimum stack size to 4 kilobytes
per task.

@example
#define CPU_STACK_MINIMUM_SIZE          (1024*4)
@end example
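As a minimal sketch of how such a minimum might be applied, the helper
below raises a too-small stack request to the minimum.  The names
@code{SKETCH_STACK_MINIMUM_SIZE} and @code{sketch_effective_stack_size}
are hypothetical, and the assumption that requests are clamped rather than
rejected is not stated in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the example value above; a real port sets CPU_STACK_MINIMUM_SIZE. */
#define SKETCH_STACK_MINIMUM_SIZE (1024 * 4)

/* Hypothetical helper: raise a too-small request to the minimum. */
static size_t sketch_effective_stack_size(size_t requested)
{
    return requested < SKETCH_STACK_MINIMUM_SIZE
        ? (size_t) SKETCH_STACK_MINIMUM_SIZE
        : requested;
}
```

A request of 512 bytes would be raised to 4096 bytes under this sketch,
while a request of 8192 bytes would be left alone.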

@subsection Stack Alignment Requirements

The CPU_STACK_ALIGNMENT macro is set to indicate the byte alignment
requirement for the stack.  This alignment requirement may be stricter
than the data type alignment specified by CPU_ALIGNMENT.  If
CPU_ALIGNMENT is strict enough for the stack, then this should be set to
0.

The following illustrates how the CPU_STACK_ALIGNMENT macro should be set
when there are no special requirements:

@example
#define CPU_STACK_ALIGNMENT        0
@end example

NOTE:  This must be either 0 or a power of 2 greater than CPU_ALIGNMENT.
[XXX is this true?]
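One reading of the note can be sketched as a small validity check.  The
helper names @code{is_power_of_2} and @code{stack_alignment_is_valid} are
hypothetical and not part of RTEMS, and the rule encoded here is only the
one the note states, which the note itself marks as unverified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when v is a nonzero power of two. */
static bool is_power_of_2(size_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * One reading of the note: 0 means "CPU_ALIGNMENT is sufficient";
 * otherwise the value is a power of two greater than CPU_ALIGNMENT.
 */
static bool stack_alignment_is_valid(size_t stack_alignment,
                                     size_t cpu_alignment)
{
    if (stack_alignment == 0)
        return true;
    return is_power_of_2(stack_alignment) && stack_alignment > cpu_alignment;
}
```

With a CPU alignment of 8, this check accepts 0 and 16 and rejects 4 and 12.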

@section Task Context

Associated with each task is a context that distinguishes it from other
tasks in the system and logically gives it its own scratch pad area for
computations.  In addition, when an interrupt occurs some processor
context information must be saved and restored.  This is managed in RTEMS
as three items:

@itemize @bullet

@item Basic task level context (e.g. the Context_Control structure)

@item Floating point task context (e.g. Context_Control_fp structure)

@item Interrupt level context (e.g.  the Context_Control_interrupt
structure)

@end itemize

The integer and floating point context structures and the routines that
manipulate them are discussed in detail in this section, while the
interrupt level context structure is discussed in the XXX.

Additionally, if the GNU debugger gdb is to be made aware of RTEMS tasks
for this CPU, then care should be used in designing the context area.

@example
typedef struct @{
    unsigned32 special_interrupt_register;
@} CPU_Interrupt_frame;
@end example

@subsection Basic Context Data Structure

The Context_Control data structure contains the basic integer context of a
task.  In addition, this context area contains stack and frame pointers,
processor status register(s), and any other registers that are normally
altered by compiler generated code.  This context must also contain the
processor interrupt level since the processor interrupt level is
maintained on a per-task basis.  This is necessary to support the
interrupt level portion of the task mode as provided by the Classic RTEMS
API.

On some processors, it is cost-effective to save only the callee preserved
registers during a task context switch.  This means that the ISR code
needs to save those registers which do not persist across function calls.
It is not mandatory to make this distinction between caller-saved and
callee-saved registers for the purpose of minimizing the context saved
during task switches and on interrupts.  If the cost of saving extra
registers is minimal, simplicity is the better choice.  In that case, save
the same context on interrupt entry as for tasks.

The Context_Control data structure should be defined such that the order
of elements results in the simplest, most efficient implementation of XXX.
A typical implementation starts with a definition such as the following:

@example
typedef struct @{
    unsigned32 some_integer_register;
    unsigned32 another_integer_register;
    unsigned32 some_system_register;
@} Context_Control;
@end example

@subsection Initializing a Context

The _CPU_Context_Initialize routine initializes the context to a state
suitable for starting a task after a context restore operation.
Generally, this involves:

@itemize @bullet

@item  setting a starting address,

@item  preparing the stack,

@item  preparing the stack and frame pointers,

@item  setting the proper interrupt level in the context, and

@item  initializing the floating point context

@end itemize

This routine generally does not set any unnecessary register in the
context.  The state of the "general data" registers is undefined at task
start time. The _CPU_Context_Initialize routine is prototyped as follows:

@example
void _CPU_Context_Initialize(
    Context_Control *_the_context,
    void            *_stack_base,
    unsigned32       _size,
    unsigned32       _isr,
    void            *_entry_point,
    unsigned32       _is_fp
);
@end example

The @code{_is_fp} parameter is TRUE if the thread is to be a floating point
thread.  This is typically only used on CPUs where the FPU may be easily
disabled by software such as on the SPARC where the PSR contains an enable
FPU bit.  The use of an FPU enable bit allows RTEMS to ensure that a
non-floating point task is unable to access the FPU.  This guarantees that
a deferred floating point context switch is safe.

The @code{_stack_base} parameter is the base address of the memory area
allocated for use as the task stack.  It is critical to understand that
@code{_stack_base} may not be the starting stack pointer for this task.
On CPU families where the stack grows from high addresses to lower ones
(i.e. @code{CPU_STACK_GROWS_UP} is FALSE), the starting stack pointer
will be near the end of the stack memory area or close to
@code{_stack_base} + @code{_size}.  Even on CPU families where the stack
grows from low to higher addresses, there may be some required
outermost stack frame that must be put at the address @code{_stack_base}.

The @code{_size} parameter is the requested size in bytes of the stack for
this task.  It is assumed that the memory area at @code{_stack_base}
is of this size.
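The relationship between @code{_stack_base}, @code{_size}, and the starting
stack pointer can be sketched as follows.  This is an illustration only:
@code{SKETCH_STACK_GROWS_UP}, @code{SKETCH_STACK_ALIGNMENT}, and
@code{sketch_initial_stack_pointer} are hypothetical names, the 16 byte
alignment is an assumed value, and a real port may also need to reserve an
outermost stack frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed port settings for this sketch only. */
#define SKETCH_STACK_GROWS_UP   0     /* stands in for CPU_STACK_GROWS_UP */
#define SKETCH_STACK_ALIGNMENT  16u   /* hypothetical alignment in bytes */

/* Compute a starting stack pointer inside [stack_base, stack_base + size]. */
static uintptr_t sketch_initial_stack_pointer(void *stack_base, size_t size)
{
    uintptr_t base = (uintptr_t) stack_base;
    uintptr_t mask = (uintptr_t) SKETCH_STACK_ALIGNMENT - 1u;

#if SKETCH_STACK_GROWS_UP
    /* Stack grows toward higher addresses: start at the base, rounded up. */
    (void) size;
    return (base + mask) & ~mask;
#else
    /* Stack grows toward lower addresses: start at the end, rounded down. */
    return (base + size) & ~mask;
#endif
}
```

With the settings assumed above, a 0x400 byte stack area at 0x1000 yields a
starting stack pointer of 0x1400, that is, @code{_stack_base} +
@code{_size}.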

XXX explain other parameters and check prototype

@subsection Performing a Context Switch

The _CPU_Context_switch routine performs a normal non-FP context switch
from the context of the currently executing thread to the context of the
heir thread.

@example
void _CPU_Context_switch(
  Context_Control  *run,
  Context_Control  *heir
);
@end example

This routine begins by saving the current state of the
CPU (i.e. the context) in the context area at @code{run}.
Then the routine should load the CPU context pointed to
by @code{heir}.  Loading the new context will cause a
branch to its task code, so the task that invoked
@code{_CPU_Context_switch} will not run for a while.
When, eventually, a context switch is made to load
context from @code{*run} again, this task will resume
and @code{_CPU_Context_switch} will return to its caller.

Care should be exercised when writing this routine.  All
registers assumed to be preserved across subroutine calls
must be preserved.  These registers may be saved in
the task's context area or on its stack.  However, the
stack pointer and the address at which to resume executing
the task must be included in the context (normally the subroutine
return address to the caller of @code{_Thread_Dispatch}).
The decision of where to store the task's context is based
on numerous factors including the capabilities of
the CPU architecture itself and simplicity as well
as external considerations such as debuggers wishing
to examine a task's context.  When a debugger is a
consideration, it is often simpler to save all data in
the context area.

Also there may be special considerations
when loading the stack pointers or interrupt level of the
incoming task.  Independent of CPU specific considerations,
if some context is saved on the task stack, then the porter
must ensure that the stack pointer is adjusted to make room
for this context information @b{BEFORE} the
information is written.  Otherwise, an interrupt could
occur and overwrite the context data.  The following is
an example of an @b{INCORRECT} sequence:

@example
save part of context beyond current top of stack
interrupt pushes context -- overwriting written context
interrupt returns
adjust stack pointer
@end example
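The hazard can be illustrated with a toy simulation in C.  Everything in
this sketch is hypothetical: the array @code{sim_stack} stands in for a
downward-growing task stack, @code{sim_sp} for the stack pointer, and
@code{sim_interrupt} for an interrupt that pushes one word below the
current stack pointer.  It is not port code, only a model of the ordering
problem.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_STACK_WORDS 16

/* A toy downward-growing stack; sim_sp indexes the current top of stack. */
static uint32_t sim_stack[SIM_STACK_WORDS];
static int      sim_sp = SIM_STACK_WORDS;

/* A simulated interrupt writes one word just below the stack pointer. */
static void sim_interrupt(void)
{
    sim_stack[sim_sp - 1] = 0xDEADBEEFu;
}

/* Correct order: reserve room first, then write the context word. */
static uint32_t sim_save_correct(uint32_t context_word)
{
    sim_sp -= 1;                       /* adjust stack pointer BEFORE writing */
    sim_stack[sim_sp] = context_word;  /* now inside the reserved area */
    sim_interrupt();                   /* interrupt uses space below sim_sp */
    return sim_stack[sim_sp];
}

/* Incorrect order: write beyond the top of stack, adjust afterwards. */
static uint32_t sim_save_incorrect(uint32_t context_word)
{
    sim_stack[sim_sp - 1] = context_word;  /* beyond current top of stack */
    sim_interrupt();                       /* overwrites the saved word */
    sim_sp -= 1;
    return sim_stack[sim_sp];
}
```

In this model @code{sim_save_correct} reads back the word it saved, while
@code{sim_save_incorrect} reads back the value written by the simulated
interrupt.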

@subsection Restoring a Context

The _CPU_Context_restore routine is generally used only to restart the
currently executing thread (i.e. self) in an efficient manner.  In many
ports, it can simply be a label in _CPU_Context_switch. It may be
unnecessary to reload some registers.

@example
void _CPU_Context_restore(
  Context_Control *new_context
);
@end example

@subsection Restarting the Currently Executing Task

The _CPU_Context_Restart_self routine is responsible for restarting the
currently executing task.  If you are lucky when porting RTEMS, then all
that is necessary is restoring the context.  Otherwise, there will need to
be a routine that does something special in this case.  Performing a
_CPU_Context_restore on the currently executing task after reinitializing
that context should work on most ports.  It will not work if restarting
self conflicts with the stack frame assumptions of restoring a context.

The following is an implementation of _CPU_Context_Restart_self that can
be used when no special handling is required for this case.

@example
#define _CPU_Context_Restart_self( _the_context ) \
   _CPU_Context_restore( (_the_context) )
@end example

XXX find a port which does not do it this way and include it here

@section Floating Point Context

@subsection CPU_HAS_FPU Macro Definition

The CPU_HAS_FPU macro is set based on the answer to the question: Does the
CPU have hardware floating point?  If the CPU has an FPU, then this should
be set to TRUE.  Otherwise, it should be set to FALSE.  The primary
implication of setting this macro to TRUE is that it indicates that tasks
may have floating point contexts.  In the Classic API, this means that the
RTEMS_FLOATING_POINT task attribute specified as part of rtems_task_create
is supported on this CPU.  If CPU_HAS_FPU is set to FALSE, then no tasks
or threads may be floating point and the RTEMS_FLOATING_POINT task
attribute is ignored.  On an API such as POSIX where all threads
implicitly have a floating point context, the setting of this macro
determines whether every POSIX thread has a floating point context.

The following example illustrates how the CPU_HARDWARE_FP (XXX macro name
is varying) macro is set based on the CPU family dependent macro.

@example
#if ( THIS_CPU_FAMILY_HAS_FPU == 1 ) /* where THIS_CPU_FAMILY */
                                     /* might be M68K */
#define CPU_HARDWARE_FP     TRUE
#else
#define CPU_HARDWARE_FP     FALSE
#endif
@end example

The macro name THIS_CPU_FAMILY_HAS_FPU should be made CPU specific.  It
indicates whether or not this CPU model has FP support.  For example, the
definition of the i386ex and i386sx CPU models would set I386_HAS_FPU to
FALSE to indicate that these CPU models are i386's without an i387 and
wish to leave floating point support out of RTEMS when built for the
i386_nofp processor model.  On a CPU with a built-in FPU like the i486,
this would be defined as TRUE.

On some processor families, the setting of the THIS_CPU_FAMILY_HAS_FPU
macro may be derived from compiler predefinitions.  This can be used when
the compiler distinguishes the individual CPU models for this CPU family
as distinctly as RTEMS requires.  Often RTEMS needs to know more about the
CPU model than the compiler because of differences at the system level
such as caching and interrupt structure.

@subsection CPU_ALL_TASKS_ARE_FP Macro Setting

The CPU_ALL_TASKS_ARE_FP macro is set to TRUE or FALSE based upon the
answer to the following question: Are all tasks RTEMS_FLOATING_POINT tasks
implicitly?  If this macro is set to TRUE, then all tasks and threads are
assumed to have a floating point context.  In the Classic API, this is
equivalent to setting the RTEMS_FLOATING_POINT task attribute on all
rtems_task_create calls.  If the CPU_ALL_TASKS_ARE_FP macro is set to
FALSE, then the RTEMS_FLOATING_POINT task attribute in the Classic API is
honored.

The rationale for this macro is that if a function that an application
developer would not expect to utilize the FP unit DOES use it, then one
cannot easily predict which tasks will use the FP hardware. In this case,
this option should be TRUE.  So far, the only CPU families for which this
macro has been set to TRUE are the HP PA-RISC and PowerPC.  For the HP
PA-RISC, the HP C compiler and gcc both implicitly use the floating point
registers to perform integer multiplies.  For the PowerPC, this feature
macro is set to TRUE because the printf routine saves a floating point
register whether or not a floating point number is actually printed.  If
the newlib implementation of printf were restructured to avoid this, then
the PowerPC port would not have to have this option set to TRUE.

The following example illustrates how the CPU_ALL_TASKS_ARE_FP is set on
the PowerPC.  On this CPU family, this macro is set to TRUE if the CPU
model has hardware floating point.

@example
#if (CPU_HARDWARE_FP == TRUE)
#define CPU_ALL_TASKS_ARE_FP     TRUE
#else
#define CPU_ALL_TASKS_ARE_FP     FALSE
#endif
@end example

NOTE: If CPU_HARDWARE_FP is FALSE, then this should be FALSE as well.

@subsection CPU_USE_DEFERRED_FP_SWITCH Macro Setting

The CPU_USE_DEFERRED_FP_SWITCH macro is set based upon the answer to the
following question:  Should the saving of the floating point registers be
deferred until a context switch is made to a different floating
point task?  If the floating point context will not be stored until
necessary, then this macro should be set to TRUE.  When set to TRUE, the
floating point context of a task will remain in the floating point
registers and is not disturbed until another floating point task is
switched to.

If the CPU_USE_DEFERRED_FP_SWITCH is set to FALSE, then the floating point
context is saved each time a floating point task is switched out and
restored when the next floating point task is switched in.  The state of
the floating point registers between those two operations is not specified.

There are a couple of known cases where the port should not defer saving
the floating point context.  The first case is when the compiler generates
instructions that use the FPU when floating point is not actually used.
This occurs on the HP PA-RISC for example when an integer multiply is
performed.  On the PowerPC, the printf routine includes a save of a
floating point register to support printing floating point numbers even if
the path that actually prints the floating point number is not invoked.
In both of these situations, deferred floating point context switches
cannot be used.  The second case is interrupt dispatching: if the floating
point context has to be saved as part of interrupt dispatching, then it
may also be necessary to disable deferred context switches.

Setting this flag to TRUE results in using a different algorithm for
deciding when to save and restore the floating point context.  The
deferred FP switch algorithm minimizes the number of times the FP context
is saved and restored.  The FP context is not saved until a context switch
is made to another, different FP task.  Thus in a system with only one FP
task, the FP context will never be saved or restored.
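The deferred policy can be sketched as a small simulation that counts FP
context saves.  The type @code{sketch_task}, the variable
@code{sketch_fp_owner}, and the functions below are hypothetical names for
this illustration and are not the RTEMS implementation.  The sketch also
assumes that the first FP task to run has its initial context loaded once,
which it counts separately from saves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool is_fp;     /* task has a floating point context */
    int  fp_saves;  /* times this task's FP context was saved */
    int  fp_loads;  /* times this task's FP context was loaded */
} sketch_task;

/* Task whose FP context currently lives in the simulated FPU registers. */
static sketch_task *sketch_fp_owner = NULL;

/* Deferred policy: touch the FP context only when a different FP task runs. */
static void sketch_switch_to(sketch_task *heir)
{
    if (!heir->is_fp || heir == sketch_fp_owner)
        return;                       /* FPU registers stay as they are */
    if (sketch_fp_owner != NULL)
        sketch_fp_owner->fp_saves++;  /* save the previous owner's context */
    heir->fp_loads++;                 /* load the heir's context */
    sketch_fp_owner = heir;
}

/* Scenario: one FP task alternating with one non-FP task. */
static int sketch_saves_with_one_fp_task(void)
{
    sketch_task fp_task  = { true,  0, 0 };
    sketch_task int_task = { false, 0, 0 };
    int i;

    sketch_fp_owner = NULL;
    for (i = 0; i < 4; i++) {
        sketch_switch_to(&fp_task);
        sketch_switch_to(&int_task);
    }
    sketch_fp_owner = NULL;           /* do not keep a dangling pointer */
    return fp_task.fp_saves;
}

/* Scenario: two FP tasks alternating three times. */
static int sketch_saves_with_two_fp_tasks(void)
{
    sketch_task a = { true, 0, 0 };
    sketch_task b = { true, 0, 0 };
    int i;

    sketch_fp_owner = NULL;
    for (i = 0; i < 3; i++) {
        sketch_switch_to(&a);
        sketch_switch_to(&b);
    }
    sketch_fp_owner = NULL;
    return a.fp_saves + b.fp_saves;
}
```

In the first scenario no FP context is ever saved, which matches the
single FP task case described above.  In the second scenario every switch
after the first one saves the outgoing FP task's context.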

The following illustrates setting the CPU_USE_DEFERRED_FP_SWITCH macro on
a processor family such as the M68K or i386 which can use deferred
floating point context switches.

@example
#define CPU_USE_DEFERRED_FP_SWITCH       TRUE
@end example

Note that currently, when in SMP configuration, deferred floating point
context switching is unavailable.

@subsection Floating Point Context Data Structure

The Context_Control_fp structure contains the per task information for the
floating point unit.  The organization of this structure may be a list of
floating point registers along with any floating point control and status
registers, or it may simply consist of an array of a fixed number of bytes.
Defining the floating point context area as an array of bytes is done when
the floating point context is dumped by a "FP save context" type
instruction and the format is either not completely defined by the CPU
documentation or the format is not critical for the implementation of the
floating point context switch routines.  In this case, there is no need to
figure out the exact format -- only the size.  Of course, although this is
enough information for RTEMS, it is probably not enough for a debugger
such as gdb.  But that is another problem.

@example
typedef struct @{
    double      some_float_register;
@} Context_Control_fp;
@end example

On some CPUs with hardware floating point support, the Context_Control_fp
structure will not be used.

@subsection Size of Floating Point Context Macro

The CPU_CONTEXT_FP_SIZE macro is set to the size of the floating point
context area. On some CPUs this will not be a "sizeof" because the format
of the floating point area is not defined -- only the size is.  This is
usually on CPUs with a "floating point save context" instruction.  In
general, though, it is easier to define the macro as a "sizeof"
operation and define the Context_Control_fp structure to be an area of
bytes of the required size in this case.

@example
#define CPU_CONTEXT_FP_SIZE sizeof( Context_Control_fp )
@end example
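A minimal sketch of the "area of bytes" approach follows.  The size of 108
bytes and the names @code{SKETCH_FP_SAVE_AREA_BYTES},
@code{Sketch_Opaque_fp_context}, and @code{SKETCH_CONTEXT_FP_SIZE} are
assumptions for illustration; a real port would take the size from its CPU
documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical size of the area written by an "FP save context" instruction. */
#define SKETCH_FP_SAVE_AREA_BYTES 108

/* Opaque FP context: only the size is known, not the layout. */
typedef struct {
    uint8_t fp_save_area[SKETCH_FP_SAVE_AREA_BYTES];
} Sketch_Opaque_fp_context;

/* The size macro can then remain a plain sizeof. */
#define SKETCH_CONTEXT_FP_SIZE sizeof( Sketch_Opaque_fp_context )
```

Because the structure holds only a byte array, its size equals the assumed
save area size and the "sizeof" form stays usable.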

@subsection Start of Floating Point Context Area Macro

The _CPU_Context_Fp_start macro is used in the XXX routine and allows the
initial pointer into a floating point context area (used to save the
floating point context) to be at an arbitrary place in the floating point
context area.  This is necessary because some FP units are designed to
have their context saved as a stack which grows into lower addresses.
Other FP units can be saved by simply moving registers into offsets from
the base of the context area.  Finally some FP units provide a "dump
context" instruction which could fill in from high to low or low to high
based on the whim of the CPU designers.  Regardless, the address at which
that floating point context area pointer should start within the actual
floating point context area varies between ports and this macro provides
a clean way of addressing this.

This is a common implementation of the _CPU_Context_Fp_start macro which
is suitable for many processors.  In particular, RISC processors tend to
use this implementation since the floating point context is saved as a
sequence of store operations.

@example
#define _CPU_Context_Fp_start( _base, _offset ) \
   ( (void *) _Addresses_Add_offset( (_base), (_offset) ) )
@end example
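The effect of the base plus offset form can be tried outside RTEMS with a
stand-in for the address helper.  In this sketch
@code{sketch_addresses_add_offset}, @code{SKETCH_CONTEXT_FP_START}, and
@code{sketch_fp_area} are hypothetical names, and the helper is assumed to
do plain byte address arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the RTEMS address helper, assumed to add a byte offset. */
static void *sketch_addresses_add_offset(void *base, size_t offset)
{
    return (void *) ((char *) base + offset);
}

/* Same shape as the macro above, built on the stand-in helper. */
#define SKETCH_CONTEXT_FP_START( _base, _offset ) \
   ( (void *) sketch_addresses_add_offset( (_base), (_offset) ) )

/* A hypothetical 64 byte floating point context area. */
static char sketch_fp_area[64];
```

With an offset of 0 the resulting pointer is the base of the area, and a
nonzero offset moves the starting pointer further into the area.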

In contrast, the m68k treats the floating point context area as a stack
which grows downward in memory.  Thus the following implementation of
_CPU_Context_Fp_start is used in that port:

@example
XXX insert m68k version here
@end example

@subsection Initializing a Floating Point Context

The _CPU_Context_Initialize_fp routine initializes the floating point
context area passed to it.  There are a few standard ways in which to
initialize the floating point context.  The simplest, and least
deterministic behaviorally, is to do nothing.  This leaves the FPU in a
random state and is generally not a suitable way to implement this
routine.  The second common implementation is to place a "null FP status
word" into some status/control register in the FPU.  This mechanism is
simple and works on many FPUs.  Another common way is to initialize the
FPU to a known state during _CPU_Initialize and save the context (using
_CPU_Context_save_fp) into the special floating point context
_CPU_Null_fp_context.  Then all that is required to initialize a floating
point context is to copy _CPU_Null_fp_context to the destination floating
point context passed to it.  The following example implementation shows
how to accomplish this:

@example
#define _CPU_Context_Initialize_fp( _destination ) \
  @{ \
   *((Context_Control_fp *) *((void **) _destination)) = \
       _CPU_Null_fp_context; \
  @}
@end example

The _CPU_Null_fp_context is optional.  A port need only include this
variable when it uses the above mechanism to initialize a floating point
context.  This is typically done on CPUs where it is difficult to generate
an "uninitialized" FP context.  If the port requires this variable, then
it is declared as follows:

@example
Context_Control_fp  _CPU_Null_fp_context;
@end example
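The null context copy, including the double indirection through
@code{_destination}, can be exercised with the following sketch.  The
names @code{Sketch_Fp_context}, @code{sketch_null_fp_context},
@code{SKETCH_CONTEXT_INITIALIZE_FP}, and
@code{sketch_initialize_and_read} are hypothetical, and the null context
is simply zero here instead of being captured from a real FPU.

```c
#include <assert.h>

typedef struct {
    double some_float_register;
} Sketch_Fp_context;

/* Known-good FP state; a real port would capture it during _CPU_Initialize. */
static Sketch_Fp_context sketch_null_fp_context = { 0.0 };

/* _destination is a pointer to the pointer to the context area. */
#define SKETCH_CONTEXT_INITIALIZE_FP( _destination ) \
  do { \
    *((Sketch_Fp_context *) *((void **) (_destination))) = \
        sketch_null_fp_context; \
  } while (0)

/* Initialize a context that holds a stale value and return the result. */
static double sketch_initialize_and_read(double stale_value)
{
    Sketch_Fp_context  area = { stale_value };
    void              *area_ptr = &area;

    SKETCH_CONTEXT_INITIALIZE_FP( &area_ptr );
    return area.some_float_register;
}
```

After initialization the stale value is gone and the context matches the
null context.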

@subsection Saving a Floating Point Context

The _CPU_Context_save_fp routine is responsible for saving the FP
context at *fp_context_ptr.  If the location from which the FP context is
to be loaded changes, then the pointer is modified by this routine.

Sometimes a macro implementation of this is in cpu.h which dereferences
the ** and a similarly named routine in this file is passed something like
a (Context_Control_fp *).  The general rule on making this decision is to
avoid writing assembly language.

@example
void _CPU_Context_save_fp(
  void **fp_context_ptr
);
@end example

@subsection Restoring a Floating Point Context

The _CPU_Context_restore_fp routine is responsible for restoring the FP
context at *fp_context_ptr.  If the location from which the FP context is
to be loaded changes, then the pointer is modified by this routine.

Sometimes a macro implementation of this is in cpu.h which dereferences
the ** and a similarly named routine in this file is passed something like
a (Context_Control_fp *).  The general rule on making this decision is to
avoid writing assembly language.

@example
void _CPU_Context_restore_fp(
  void **fp_context_ptr
);
@end example