source: rtems/doc/user/smp.t @ 18847ac

Last change on this file since 18847ac was 18847ac, checked in by Sebastian Huber <sebastian.huber@…>, on 12/16/15 at 07:23:44

doc: SMP status of RTEMS

@c
@c  COPYRIGHT (c) 2014.
@c  On-Line Applications Research Corporation (OAR).
@c  All rights reserved.
@c

@chapter Symmetric Multiprocessing Services

@section Introduction

The Symmetric Multiprocessing (SMP) support of RTEMS @value{VERSION} is
available on

@itemize @bullet
@item ARM,
@item PowerPC, and
@item SPARC.
@end itemize

It must be explicitly enabled via the @code{--enable-smp} configure command
line option.  To enable SMP in the application configuration see
@ref{Configuring a System Enable SMP Support for Applications}.  The default
scheduler for SMP applications supports up to 32 processors and is a global
fixed priority scheduler, see also @ref{Configuring a System Configuring
Clustered Schedulers}.  For example applications see
@file{testsuites/smptests}.

@strong{WARNING: The SMP support in RTEMS is work in progress.  Before you
start using this RTEMS version for SMP ask on the RTEMS mailing list.}

This chapter describes the services related to Symmetric Multiprocessing
provided by RTEMS.

The application level services currently provided are:

@itemize @bullet
@item @code{rtems_get_processor_count} - Get processor count
@item @code{rtems_get_current_processor} - Get current processor index
@item @code{rtems_scheduler_ident} - Get ID of a scheduler
@item @code{rtems_scheduler_get_processor_set} - Get processor set of a scheduler
@item @code{rtems_task_get_scheduler} - Get scheduler of a task
@item @code{rtems_task_set_scheduler} - Set scheduler of a task
@item @code{rtems_task_get_affinity} - Get task processor affinity
@item @code{rtems_task_set_affinity} - Set task processor affinity
@end itemize

@c
@c
@c
@section Background

@subsection Uniprocessor versus SMP Parallelism

Uniprocessor systems have long been used in embedded systems.  In this
hardware model, there are some system execution characteristics which have
long been taken for granted:

@itemize @bullet
@item one task executes at a time
@item hardware events result in interrupts
@end itemize

There is no true parallelism.  Even when interrupts appear to occur at the
same time, they are processed in a largely serial fashion.  This is true even
when the interrupt service routines are allowed to nest.  From a tasking
viewpoint, it is the responsibility of the real-time operating system to
simulate parallelism by switching between tasks.  These task switches occur
in response to hardware interrupt events and explicit application events such
as blocking for a resource or delaying.

With symmetric multiprocessing, the presence of multiple processors allows
for true concurrency and provides for cost-effective performance
improvements.  Uniprocessors tend to increase performance by increasing clock
speed and complexity.  This tends to lead to hot, power hungry
microprocessors which are poorly suited for many embedded applications.

The true concurrency is in sharp contrast to the single task and interrupt
model of uniprocessor systems.  This results in a fundamental change to the
uniprocessor system characteristics listed above.  Developers are faced with
a different set of characteristics which, in turn, break some existing
assumptions and result in new challenges.  In an SMP system with N
processors, these are the new execution characteristics.

@itemize @bullet
@item N tasks execute in parallel
@item hardware events result in interrupts
@end itemize

There is true parallelism with a task executing on each processor and the
possibility of interrupts occurring on each processor.  Thus in contrast to
there being one task and one interrupt to consider on a uniprocessor, there
are N tasks and potentially N simultaneous interrupts to consider on an SMP
system.

This increase in hardware complexity and presence of true parallelism results
in the application developer needing to be even more cautious about mutual
exclusion and shared data access than in a uniprocessor embedded system.
Race conditions that never or rarely happened when an application executed on
a uniprocessor system become much more likely due to multiple threads
executing in parallel.  On a uniprocessor system, these race conditions would
only happen when a task switch occurred at just the wrong moment.  Now there
are N-1 other tasks executing in parallel all the time and this results in
many more opportunities for small windows in critical sections to be hit.

@subsection Task Affinity

@cindex task affinity
@cindex thread affinity

RTEMS provides services to manipulate the affinity of a task.  Affinity is
used to specify the subset of processors in an SMP system on which a
particular task can execute.

By default, tasks have an affinity which allows them to execute on any
available processor.

Task affinity is an optional feature of SMP-aware schedulers.  Only a subset
of the available schedulers support affinity.  Although the behavior is
scheduler specific, if the scheduler does not support affinity, it is likely
to ignore all attempts to set affinity.

@subsection Task Migration

@cindex task migration
@cindex thread migration

With more than one processor in the system, tasks can migrate from one
processor to another.  There are three reasons why tasks migrate in RTEMS.

@itemize @bullet
@item The scheduler of a task is changed explicitly via
@code{rtems_task_set_scheduler()} or similar directives.
@item The task resumes execution after a blocking operation.  On a priority
based scheduler it will evict the lowest priority task currently assigned to
a processor in the processor set managed by the scheduler instance.
@item The task moves temporarily to another scheduler instance due to locking
protocols like @cite{Migratory Priority Inheritance} or the
@cite{Multiprocessor Resource Sharing Protocol}.
@end itemize

Task migration should be avoided so that the working set of a task can stay
on the most local cache level.

The current implementation of task migration in RTEMS has some implications
with respect to interrupt latency.  It is crucial to preserve the system
invariant that a task can execute on at most one processor in the system at a
time.  This is accomplished with a boolean indicator in the task context.
The processor architecture specific low-level task context switch code will
mark that a task context is no longer executing and wait until the heir
context has stopped execution before it restores the heir context and
resumes execution of the heir task.  So there is one point in time in which a
processor is without a task.  This is essential to avoid cyclic dependencies
in case multiple tasks migrate at once.  Otherwise some supervising entity
would be necessary to prevent livelocks.  Such a global supervisor would lead
to scalability problems, so this approach is not used.  Currently the thread
dispatch is performed with interrupts disabled.  So, if the heir task is
currently executing on another processor, the time with interrupts disabled
is prolonged since one processor has to wait for another processor to make
progress.

It is difficult to avoid this issue with the interrupt latency since
interrupts normally store the context of the interrupted task on its stack.
If a task is marked as not executing, we must not use its task stack to store
such an interrupt context.  We cannot use the heir stack before it has
stopped execution on another processor.  So if we enable interrupts during
this transition, we have to provide an alternative task-independent stack for
this time frame.  This issue needs further investigation.

@subsection Clustered Scheduling

Clustered scheduling is used in case the set of processors of a system is
partitioned into non-empty pairwise-disjoint subsets.  These subsets are
called clusters.  Clusters with a cardinality of one are partitions.  Each
cluster is owned by exactly one scheduler instance.

Clustered scheduling helps to control the worst-case latencies in
multi-processor systems, see @cite{Brandenburg, Björn B.: Scheduling and
Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, 2011.
@uref{http://www.cs.unc.edu/~bbb/diss/brandenburg-diss.pdf}}.  The goal is to
reduce the amount of shared state in the system and thus prevent lock
contention.  Modern multi-processor systems tend to have several layers of
data and instruction caches.  With clustered scheduling it is possible to
honour the cache topology of a system and thus avoid expensive cache
synchronization traffic.  Clustered scheduling is easy to implement.  The
problem is to provide synchronization primitives for inter-cluster
synchronization (more than one cluster is involved in the synchronization
process).  In RTEMS there are currently four means available

@itemize @bullet
@item events,
@item message queues,
@item semaphores using the @ref{Semaphore Manager Priority Inheritance}
protocol (priority boosting), and
@item semaphores using the @ref{Semaphore Manager Multiprocessor Resource
Sharing Protocol} (MrsP).
@end itemize

The clustered scheduling approach enables separation of functions with
real-time requirements and functions that profit from fairness and high
throughput, provided the scheduler instances are fully decoupled and adequate
inter-cluster synchronization primitives are used.  This is work in progress.

For the configuration of clustered schedulers see @ref{Configuring a System
Configuring Clustered Schedulers}.

To set the scheduler of a task see @ref{Symmetric Multiprocessing Services
SCHEDULER_IDENT - Get ID of a scheduler} and @ref{Symmetric Multiprocessing
Services TASK_SET_SCHEDULER - Set scheduler of a task}.

@subsection Task Priority Queues

Due to the support for clustered scheduling, the task priority queues need
special attention.  It makes no sense to compare the priority values of two
different scheduler instances.  Thus, it is not possible to simply use one
plain priority queue for tasks of different scheduler instances.

One solution to this problem is to use two levels of queues.  The top level
queue provides FIFO ordering and contains priority queues.  Each priority
queue is associated with a scheduler instance and contains only tasks of this
scheduler instance.  Tasks are enqueued in the priority queue corresponding
to their scheduler instance.  If this priority queue was empty, it is
appended to the FIFO.  To dequeue a task, the highest priority task of the
first priority queue in the FIFO is selected.  Then the first priority queue
is removed from the FIFO.  If the previously first priority queue is not
empty, it is appended to the FIFO again.  So there is FIFO fairness with
respect to the highest priority task of each scheduler instance.  See also
@cite{Brandenburg, Björn B.: A fully preemptive multiprocessor semaphore
protocol for latency-sensitive real-time applications. In Proceedings of the
25th Euromicro Conference on Real-Time Systems (ECRTS 2013), pages 292–302,
2013. @uref{http://www.mpi-sws.org/~bbb/papers/pdf/ecrts13b.pdf}}.

Such a two level queue may need a considerable amount of memory if fast
enqueue and dequeue operations are desired (depends on the scheduler instance
count).  To mitigate this problem, an approach of the FreeBSD kernel was
implemented in RTEMS.  We have the invariant that a task can be enqueued on
at most one task queue.  Thus, we need only as many queues as we have tasks.
Each task is equipped with a spare task queue which it can give to an object
on demand.  The task queue uses a dedicated memory space independent of the
other memory used for the task itself.  If a task needs to block, there are
two options:

@itemize @bullet
@item the object already has a task queue; in this case the task enqueues
itself to this already present queue and the spare task queue of the task is
added to a list of free queues for this object, or
@item otherwise the queue of the task is given to the object and the task
enqueues itself to this queue.
@end itemize

When the task is dequeued, there are two options:

@itemize @bullet
@item the task is the last task on the queue; in this case it removes this
queue from the object and reclaims it for its own purpose, or
@item otherwise the task removes one queue from the free list of the object
and reclaims it for its own purpose.
@end itemize

Since there are usually more objects than tasks, this actually reduces the
memory demands.  In addition, the objects contain only a pointer to the task
queue structure.  This helps to hide implementation details and makes it
possible to use self-contained synchronization objects in Newlib and GCC (C++
and OpenMP run-time support).

@subsection Scheduler Helping Protocol

The scheduler provides a helping protocol to support locking protocols like
@cite{Migratory Priority Inheritance} or the @cite{Multiprocessor Resource
Sharing Protocol}.  Each ready task can use at least one scheduler node at a
time to gain access to a processor.  Each scheduler node has an owner, a user
and an optional idle task.  The owner of a scheduler node is determined at
task creation and never changes during the lifetime of a scheduler node.  The
user of a scheduler node may change due to the scheduler helping protocol.  A
scheduler node is in one of the four scheduler help states:

@table @dfn

@item help yourself

This scheduler node is solely used by the owner task.  This task owns no
resources using a helping protocol and thus does not take part in the
scheduler helping protocol.  No help will be provided for other tasks.

@item help active owner

This scheduler node is owned by a task actively owning a resource and can be
used to help out tasks.

In case this scheduler node changes its state from ready to scheduled and the
task executes using another node, an idle task will be provided as a user of
this node to temporarily execute on behalf of the owner task.  Thus lower
priority tasks are denied access to the processors of this scheduler
instance.

In case a task actively owning a resource performs a blocking operation, an
idle task will be used even if this node is in the scheduled state.

@item help active rival

This scheduler node is owned by a task actively obtaining a resource
currently owned by another task and can be used to help out tasks.

The task owning this node is ready and will give away its processor in case
the task owning the resource asks for help.

@item help passive

This scheduler node is owned by a task obtaining a resource currently owned
by another task and can be used to help out tasks.

The task owning this node is blocked.

@end table

The following scheduler operations return a task in need of help

@itemize @bullet
@item unblock,
@item change priority,
@item yield, and
@item ask for help.
@end itemize

A task in need of help is a task that encounters a scheduler state change
from scheduled to ready (this is a preemption by a higher priority task) or a
task that cannot be scheduled in an unblock operation.  Such a task can ask
the tasks which depend on resources owned by this task for help.

In case it is not possible to schedule a task in need of help, the scheduler
nodes available for the task will be placed into the set of ready scheduler
nodes of the corresponding scheduler instances.  Once a state change from
ready to scheduled happens for one of these scheduler nodes, it will be used
to schedule the task in need of help.

The ask for help scheduler operation is used to help tasks in need of help
returned by the operations mentioned above.  This operation is also used in
case the root of a resource sub-tree owned by a task changes.

The run-time of the ask for help procedures depends on the size of the
resource tree of the task needing help and of other resource trees in case
tasks in need of help are produced during this operation.  Thus the
worst-case latency in the system depends on the maximum resource tree size of
the application.

@subsection Critical Section Techniques and SMP

As discussed earlier, SMP systems have opportunities for true parallelism
which is not possible on uniprocessor systems.  Consequently, several
techniques that provided adequate critical sections on uniprocessor systems
are unsafe on SMP systems.  In this section, some of these unsafe techniques
will be discussed.

In general, applications must use proper operating system provided mutual
exclusion mechanisms to ensure correct behavior.  This primarily means the
use of binary semaphores or mutexes to implement critical sections.

@subsubsection Disable Interrupts and Interrupt Locks

A low overhead means to ensure mutual exclusion in uni-processor
configurations is to disable interrupts around a critical section.  This is
commonly used in device driver code and throughout the operating system core.
On SMP configurations, however, disabling the interrupts on one processor has
no effect on other processors.  So, this is insufficient to ensure system
wide mutual exclusion.  The macros
@itemize @bullet
@item @code{rtems_interrupt_disable()},
@item @code{rtems_interrupt_enable()}, and
@item @code{rtems_interrupt_flush()}
@end itemize
are disabled on SMP configurations and their use will lead to compiler
warnings and linker errors.  In the unlikely case that interrupts must be
disabled on the current processor, the
@itemize @bullet
@item @code{rtems_interrupt_local_disable()}, and
@item @code{rtems_interrupt_local_enable()}
@end itemize
macros are now available in all configurations.

Since disabling of interrupts is not enough to ensure system wide mutual
exclusion on SMP, a new low-level synchronization primitive was added: the
interrupt locks.  They are a simple API layer on top of the SMP locks used
for low-level synchronization in the operating system core.  Currently they
are implemented as a ticket lock.  On uni-processor configurations they
degenerate to simple interrupt disable/enable sequences.  It is disallowed to
acquire a single interrupt lock in a nested way; this will result in an
infinite loop with interrupts disabled.  While converting legacy code to
interrupt locks, care must be taken to avoid this situation.

@example
@group
void legacy_code_with_interrupt_disable_enable( void )
@{
  rtems_interrupt_level level;

  rtems_interrupt_disable( level );
  /* Some critical stuff */
  rtems_interrupt_enable( level );
@}

RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )

void smp_ready_code_with_interrupt_lock( void )
@{
  rtems_interrupt_lock_context lock_context;

  rtems_interrupt_lock_acquire( &lock, &lock_context );
  /* Some critical stuff */
  rtems_interrupt_lock_release( &lock, &lock_context );
@}
@end group
@end example

The @code{rtems_interrupt_lock} structure is empty on uni-processor
configurations.  Empty structures have a different size in C
(implementation-defined, zero in case of GCC) and C++ (implementation-defined
non-zero value, one in case of GCC).  Thus the
@code{RTEMS_INTERRUPT_LOCK_DECLARE()}, @code{RTEMS_INTERRUPT_LOCK_DEFINE()},
@code{RTEMS_INTERRUPT_LOCK_MEMBER()}, and
@code{RTEMS_INTERRUPT_LOCK_REFERENCE()} macros are provided to ensure ABI
compatibility.

@subsubsection Highest Priority Task Assumption

On a uniprocessor system, it is safe to assume that when the highest priority
task in an application executes, it will execute without being preempted
until it voluntarily blocks.  Interrupts may occur while it is executing, but
there will be no context switch to another task unless the highest priority
task voluntarily initiates it.

Given the assumption that no other tasks will have their execution
interleaved with the highest priority task, it is possible for this task to
be constructed such that it does not need to acquire a binary semaphore or
mutex for protected access to shared data.

In an SMP system, it cannot be assumed that only a single task is executing.
It should be assumed that every processor is executing another application
task.  Further, those tasks will be ones which would not have been executed
in a uniprocessor configuration and should be assumed to have data
synchronization conflicts with what was formerly the highest priority task
which executed without conflict.

@subsubsection Disable Preemption

On a uniprocessor system, disabling preemption in a task is very similar to
making the highest priority task assumption.  While preemption is disabled,
no task context switches will occur unless the task initiates them
voluntarily.  But, just as with the highest priority task assumption, there
are N-1 other processors also running tasks.  Thus the assumption that no
other tasks will run while the task has preemption disabled is violated.

@subsection Task Unique Data and SMP

Per task variables are a service commonly provided by real-time operating
systems for application use.  They work by allowing the application to
specify a location in memory (typically a @code{void *}) which is logically
added to the context of a task.  On each task switch, the value at this
location in memory is saved and restored, so each task can have a unique
value in the same memory location.  This memory location is directly accessed
as a variable in a program.

This works well in a uniprocessor environment because there is one task
executing and one memory location containing a task-specific value.  But it
is fundamentally broken on an SMP system because there are always N tasks
executing.  With only one location in memory, N-1 tasks will not have the
correct value.

This paradigm for providing task unique data values is fundamentally broken
on SMP systems.

@subsubsection Classic API Per Task Variables

The Classic API provides three directives to support per task variables.
These are:

@itemize @bullet
@item @code{@value{DIRPREFIX}task_variable_add} - Associate per task variable
@item @code{@value{DIRPREFIX}task_variable_get} - Obtain value of a per task variable
@item @code{@value{DIRPREFIX}task_variable_delete} - Remove per task variable
@end itemize

As task variables are unsafe for use on SMP systems, the use of these
services must be eliminated in all software that is to be used in an SMP
environment.  The task variables API is disabled on SMP.  Its use will lead
to compile-time and link-time errors.  It is recommended that the application
developer consider the use of POSIX Keys or Thread Local Storage (TLS).
POSIX Keys are available in all RTEMS configurations.  For the availability
of TLS on a particular architecture please consult the @cite{RTEMS CPU
Architecture Supplement}.

The only remaining user of task variables in the RTEMS code base is the Ada
support.  Thus, Ada is currently not available on RTEMS SMP.

@subsection OpenMP

OpenMP support for RTEMS is available via the GCC provided libgomp.  There is
libgomp support for RTEMS in the POSIX configuration of libgomp since GCC 4.9
(requires a Newlib snapshot after 2015-03-12).  In GCC 6.1 or later (requires
a Newlib snapshot after 2015-07-30 for @file{<sys/lock.h>} provided
self-contained synchronization objects) there is a specialized libgomp
configuration for RTEMS which offers significantly better performance
compared to the POSIX configuration of libgomp.  In addition, application
configurable thread pools for each scheduler instance are available in GCC
6.1 or later.

The run-time configuration of libgomp is done via environment variables
documented in the @uref{https://gcc.gnu.org/onlinedocs/libgomp/, libgomp
manual}.  The environment variables are evaluated in a constructor function
which executes in the context of the first initialization task before the
actual initialization task function is called (just like a global C++
constructor).  To set application specific values, a higher priority
constructor function must be used to set up the environment variables.

@example
@group
#include <stdlib.h>

void __attribute__((constructor(1000))) config_libgomp( void )
@{
  setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 );
  setenv( "GOMP_SPINCOUNT", "30000", 1 );
  setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@@SCHD", 1 );
@}
@end group
@end example

The environment variable @env{GOMP_RTEMS_THREAD_POOLS} is RTEMS-specific.  It
determines the thread pools for each scheduler instance.  The format for
@env{GOMP_RTEMS_THREAD_POOLS} is a list of optional
@code{<thread-pool-count>[$<priority>]@@<scheduler-name>} configurations
separated by @code{:} where:

@itemize @bullet
@item @code{<thread-pool-count>} is the thread pool count for this scheduler
instance.
@item @code{$<priority>} is an optional priority for the worker threads of a
thread pool according to @code{pthread_setschedparam}.  In case a priority
value is omitted, then a worker thread will inherit the priority of the
OpenMP master thread that created it.  The priority of the worker thread is
not changed by libgomp after creation, even if a new OpenMP master thread
using the worker has a different priority.
@item @code{@@<scheduler-name>} is the scheduler instance name according to
the RTEMS application configuration.
@end itemize

In case no thread pool configuration is specified for a scheduler instance,
each OpenMP master thread of this scheduler instance will use its own
dynamically allocated thread pool.  To limit the worker thread count of the
thread pools, each OpenMP master thread must call @code{omp_set_num_threads}.

Let's suppose we have three scheduler instances @code{IO}, @code{WRK0}, and
@code{WRK1} with @env{GOMP_RTEMS_THREAD_POOLS} set to
@code{"1@@WRK0:3$4@@WRK1"}.  Then there are no thread pool restrictions for
scheduler instance @code{IO}.  In the scheduler instance @code{WRK0} there is
one thread pool available.  Since no priority is specified for this scheduler
instance, the worker thread inherits the priority of the OpenMP master thread
that created it.  In the scheduler instance @code{WRK1} there are three
thread pools available and their worker threads run at priority four.

@subsection Thread Dispatch Details

This section gives background information to developers interested in the
interrupt latencies introduced by thread dispatching.  A thread dispatch
consists of all work which must be done to stop the currently executing
thread on a processor and hand over this processor to an heir thread.

On SMP systems, scheduling decisions on one processor must be propagated to
other processors through inter-processor interrupts.  So, a thread dispatch
which must be carried out on another processor does not happen
instantaneously.  Thus several thread dispatch requests might be in flight
and it is possible that some of them may be out of date before the
corresponding processor has time to deal with them.  The thread dispatch
mechanism uses three per-processor variables,
@itemize @bullet
@item the executing thread,
@item the heir thread, and
@item a boolean flag indicating if a thread dispatch is necessary or not.
@end itemize
Updates of the heir thread and the thread dispatch necessary indicator are
synchronized via explicit memory barriers without the use of locks.  A thread
can be an heir thread on at most one processor in the system.  The thread
context is protected by a TTAS lock embedded in the context to ensure that it
is used on at most one processor at a time.  The thread post-switch actions
use a per-processor lock.  This implementation turned out to be quite
efficient and no lock contention was observed in the test suite.

The current implementation of thread dispatching has some implications with
respect to interrupt latency.  It is crucial to preserve the system invariant
that a thread can execute on at most one processor in the system at a time.
This is accomplished with a boolean indicator in the thread context.  The
processor architecture specific context switch code will mark that a thread
context is no longer executing and wait until the heir context has stopped
execution before it restores the heir context and resumes execution of the
heir thread (the boolean indicator is basically a TTAS lock).  So, there is
one point in time in which a processor is without a thread.  This is
essential to avoid cyclic dependencies in case multiple threads migrate at
once.  Otherwise some supervising entity would be necessary to prevent
deadlocks.  Such a global supervisor would lead to scalability problems so
this approach is not used.  Currently the context switch is performed with
interrupts disabled.  Thus, in case the heir thread is currently executing on
another processor, the time of disabled interrupts is prolonged since one
processor has to wait for another processor to make progress.

It is difficult to avoid this issue with the interrupt latency since
interrupts normally store the context of the interrupted thread on its stack.
In case a thread is marked as not executing, we must not use its thread stack
to store such an interrupt context.  We cannot use the heir stack before it
has stopped execution on another processor.  If we enable interrupts during
this transition, then we have to provide an alternative thread-independent
stack for interrupts in this time frame.  This issue needs further
investigation.

The problematic situation occurs in case we have a thread which executes with
thread dispatching disabled and should execute on another processor (e.g. it
is an heir thread on another processor).  In this case the interrupts on this
other processor are disabled until the thread enables thread dispatching and
starts the thread dispatch sequence.  The scheduler (an exception is the
scheduler with thread processor affinity support) tries to avoid such a
situation and checks if a newly scheduled thread already executes on a
processor.  In case the assigned processor differs from the processor on
which the thread already executes and this processor is a member of the
processor set managed by this scheduler instance, it will reassign the
processors to keep the already executing thread in place.  Therefore normal
scheduler requests will not lead to such a situation.  Explicit thread
migration requests, however, can lead to this situation.  Explicit thread
migrations may occur due to the scheduler helping protocol or explicit
scheduler instance changes.  The situation can also be provoked by interrupts
which suspend and resume threads multiple times and thus produce stale
asynchronous thread dispatch requests in the system.
622
@c
@c
@c
@section Operations

@subsection Setting Affinity to a Single Processor

On some embedded applications targeting SMP systems, it may be beneficial to
lock individual tasks to specific processors.  In this way, one can designate a
processor for I/O tasks, another for computation, and so on.  The following
example illustrates the code sequence necessary to assign a task an affinity
for the processor with index @code{processor_index}.

@example
@group
#include <rtems.h>
#include <assert.h>

void pin_to_processor(rtems_id task_id, int processor_index)
@{
  rtems_status_code sc;
  cpu_set_t         cpuset;

  CPU_ZERO(&cpuset);
  CPU_SET(processor_index, &cpuset);

  sc = rtems_task_set_affinity(task_id, sizeof(cpuset), &cpuset);
  assert(sc == RTEMS_SUCCESSFUL);
@}
@end group
@end example

It is important to note that the @code{cpuset} is not validated until the
@code{@value{DIRPREFIX}task_set_affinity} call is made.  At that point,
it is validated against the current system configuration.

@c
@c
@c
@section Directives

This section details the symmetric multiprocessing services.  A subsection
is dedicated to each of these services and describes the calling sequence,
related constants, usage, and status codes.

@c
@c rtems_get_processor_count
@c
@page
@subsection GET_PROCESSOR_COUNT - Get processor count

@subheading CALLING SEQUENCE:

@ifset is-C
@example
uint32_t rtems_get_processor_count(void);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

The count of processors in the system.

@subheading DESCRIPTION:

On uni-processor configurations a value of one will be returned.

On SMP configurations this returns the value of a global variable set during
system initialization to indicate the count of utilized processors.  The
processor count depends on the physically or virtually available processors and
application configuration.  The value will always be less than or equal to the
maximum count of application configured processors.

@subheading NOTES:

None.

@c
@c rtems_get_current_processor
@c
@page
@subsection GET_CURRENT_PROCESSOR - Get current processor index

@subheading CALLING SEQUENCE:

@ifset is-C
@example
uint32_t rtems_get_current_processor(void);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

The index of the current processor.

@subheading DESCRIPTION:

On uni-processor configurations a value of zero will be returned.

On SMP configurations an architecture specific method is used to obtain the
index of the current processor in the system.  The set of processor indices is
the range of integers starting with zero up to the processor count minus one.
Outside of sections with thread dispatching disabled the current processor
index may change after every instruction since the thread may migrate from one
processor to another.  Sections with interrupts disabled are a subset of the
sections with thread dispatching disabled.

@subheading NOTES:

None.

@c
@c rtems_scheduler_ident
@c
@page
@subsection SCHEDULER_IDENT - Get ID of a scheduler

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_scheduler_ident(
  rtems_name  name,
  rtems_id   *id
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{id} is NULL@*
@code{@value{RPREFIX}INVALID_NAME} - invalid scheduler name@*
@code{@value{RPREFIX}UNSATISFIED} - a scheduler with this name exists, but
the processor set of this scheduler is empty

@subheading DESCRIPTION:

Identifies a scheduler by its name.  The scheduler name is determined by the
scheduler configuration.  @xref{Configuring a System Configuring Clustered
Schedulers}.

@subheading NOTES:

None.

@c
@c rtems_scheduler_get_processor_set
@c
@page
@subsection SCHEDULER_GET_PROCESSOR_SET - Get processor set of a scheduler

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_scheduler_get_processor_set(
  rtems_id   scheduler_id,
  size_t     cpusetsize,
  cpu_set_t *cpuset
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{cpuset} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid scheduler id@*
@code{@value{RPREFIX}INVALID_NUMBER} - the affinity set buffer is too small for
the set of processors owned by the scheduler

@subheading DESCRIPTION:

Returns the processor set owned by the scheduler in @code{cpuset}.  A set bit
in the processor set means that this processor is owned by the scheduler and a
cleared bit means the opposite.

@subheading NOTES:

None.

@c
@c rtems_task_get_scheduler
@c
@page
@subsection TASK_GET_SCHEDULER - Get scheduler of a task

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_get_scheduler(
  rtems_id  task_id,
  rtems_id *scheduler_id
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{scheduler_id} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid task id

@subheading DESCRIPTION:

Returns the scheduler identifier of a task identified by @code{task_id} in
@code{scheduler_id}.

@subheading NOTES:

None.

@c
@c rtems_task_set_scheduler
@c
@page
@subsection TASK_SET_SCHEDULER - Set scheduler of a task

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_set_scheduler(
  rtems_id task_id,
  rtems_id scheduler_id
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ID} - invalid task or scheduler id@*
@code{@value{RPREFIX}INCORRECT_STATE} - the task is in the wrong state to
perform a scheduler change

@subheading DESCRIPTION:

Sets the scheduler of a task identified by @code{task_id} to the scheduler
identified by @code{scheduler_id}.  The scheduler of a task is initialized to
the scheduler of the task that created it.

@subheading NOTES:

None.

@subheading EXAMPLE:

@example
@group
#include <rtems.h>
#include <assert.h>

void task(rtems_task_argument arg);

void example(void)
@{
  rtems_status_code sc;
  rtems_id          task_id;
  rtems_id          scheduler_id;
  rtems_name        scheduler_name;

  scheduler_name = rtems_build_name('W', 'O', 'R', 'K');

  sc = rtems_scheduler_ident(scheduler_name, &scheduler_id);
  assert(sc == RTEMS_SUCCESSFUL);

  sc = rtems_task_create(
    rtems_build_name('T', 'A', 'S', 'K'),
    1,
    RTEMS_MINIMUM_STACK_SIZE,
    RTEMS_DEFAULT_MODES,
    RTEMS_DEFAULT_ATTRIBUTES,
    &task_id
  );
  assert(sc == RTEMS_SUCCESSFUL);

  sc = rtems_task_set_scheduler(task_id, scheduler_id);
  assert(sc == RTEMS_SUCCESSFUL);

  sc = rtems_task_start(task_id, task, 0);
  assert(sc == RTEMS_SUCCESSFUL);
@}
@end group
@end example

@c
@c rtems_task_get_affinity
@c
@page
@subsection TASK_GET_AFFINITY - Get task processor affinity

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_get_affinity(
  rtems_id   id,
  size_t     cpusetsize,
  cpu_set_t *cpuset
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{cpuset} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid task id@*
@code{@value{RPREFIX}INVALID_NUMBER} - the affinity set buffer is too small for
the current processor affinity set of the task

@subheading DESCRIPTION:

Returns the current processor affinity set of the task in @code{cpuset}.  A set
bit in the affinity set means that the task can execute on this processor and a
cleared bit means the opposite.

@subheading NOTES:

None.

@c
@c rtems_task_set_affinity
@c
@page
@subsection TASK_SET_AFFINITY - Set task processor affinity

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_set_affinity(
  rtems_id         id,
  size_t           cpusetsize,
  const cpu_set_t *cpuset
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{cpuset} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid task id@*
@code{@value{RPREFIX}INVALID_NUMBER} - invalid processor affinity set

@subheading DESCRIPTION:

Sets the processor affinity set of the task identified by @code{id} to the set
specified by @code{cpuset}.  A set bit in the affinity set means that the task
can execute on this processor and a cleared bit means the opposite.

@subheading NOTES:

This function will not change the scheduler of the task.  The intersection of
the processor affinity set and the set of processors owned by the scheduler of
the task must be non-empty.  It is not an error if the processor affinity set
contains processors that are not part of the set of processors owned by the
scheduler instance of the task.  A task will simply not run under normal
circumstances on these processors since the scheduler ignores them.  Some
locking protocols may temporarily use processors that are not included in the
processor affinity set of the task.  It is also not an error if the processor
affinity set contains processors that are not part of the system.