source: rtems-docs/c-user/symmetric_multiprocessing_services.rst @ 90a3c41
Last change on this file since 90a3c41 was 90a3c41, checked in by Sebastian Huber <sebastian.huber@…>, on 02/03/17 at 10:55:29: c-user: Add TAS and TTAS terms
.. comment SPDX-License-Identifier: CC-BY-SA-4.0

.. COMMENT: COPYRIGHT (c) 2014.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.

Symmetric Multiprocessing Services
**********************************

Introduction
============

The Symmetric Multiprocessing (SMP) support of RTEMS 4.11.0 and later is
available on

- ARM,

- PowerPC, and

- SPARC.

It must be explicitly enabled via the ``--enable-smp`` configure command line
option.  To enable SMP in the application configuration see :ref:`Enable SMP
Support for Applications`.  The default scheduler for SMP applications supports
up to 32 processors and is a global fixed priority scheduler, see also
:ref:`Configuring Clustered Schedulers`.  For example applications see
:file:`testsuites/smptests`.

.. warning::

   The SMP support in this release of RTEMS is a work in progress.  Before you
   start using this RTEMS version for SMP, ask on the RTEMS mailing list.

This chapter describes the services related to Symmetric Multiprocessing
provided by RTEMS.

The application level services currently provided are:

- rtems_get_processor_count_ - Get processor count

- rtems_get_current_processor_ - Get current processor index

Background
==========

Uniprocessor versus SMP Parallelism
-----------------------------------

Uniprocessor systems have long been used in embedded systems.  In this hardware
model, there are some system execution characteristics which have long been
taken for granted:

- one task executes at a time

- hardware events result in interrupts

There is no true parallelism.  Even when interrupts appear to occur at the same
time, they are processed in a largely serial fashion.  This is true even when
the interrupt service routines are allowed to nest.  From a tasking viewpoint,
it is the responsibility of the real-time operating system to simulate
parallelism by switching between tasks.  These task switches occur in response
to hardware interrupt events and explicit application events such as blocking
for a resource or delaying.

With symmetric multiprocessing, the presence of multiple processors allows for
true concurrency and provides for cost-effective performance improvements.
Uniprocessors tend to increase performance by increasing clock speed and
complexity.  This tends to lead to hot, power hungry microprocessors which are
poorly suited for many embedded applications.

The true concurrency is in sharp contrast to the single task and interrupt
model of uniprocessor systems.  This results in a fundamental change to the
uniprocessor system characteristics listed above.  Developers are faced with a
different set of characteristics which, in turn, break some existing
assumptions and result in new challenges.  In an SMP system with N processors,
these are the new execution characteristics:

- N tasks execute in parallel

- hardware events result in interrupts

There is true parallelism with a task executing on each processor and the
possibility of interrupts occurring on each processor.  Thus, in contrast to
there being one task and one interrupt to consider on a uniprocessor, there are
N tasks and potentially N simultaneous interrupts to consider on an SMP system.

This increase in hardware complexity and presence of true parallelism results
in the application developer needing to be even more cautious about mutual
exclusion and shared data access than in a uniprocessor embedded system.  Race
conditions that never or rarely happened when an application executed on a
uniprocessor system become much more likely due to multiple threads executing
in parallel.  On a uniprocessor system, these race conditions would only happen
when a task switch occurred at just the wrong moment.  Now there are N-1 other
tasks executing in parallel all the time and this results in many more
opportunities for small windows in critical sections to be hit.

Task Affinity
-------------
.. index:: task affinity
.. index:: thread affinity

RTEMS provides services to manipulate the affinity of a task.  Affinity is used
to specify the subset of processors in an SMP system on which a particular task
can execute.

By default, tasks have an affinity which allows them to execute on any
available processor.

Task affinity is a possible feature to be supported by SMP-aware schedulers.
However, only a subset of the available schedulers support affinity.  Although
the behavior is scheduler specific, if the scheduler does not support affinity,
it is likely to ignore all attempts to set affinity.

The scheduler with support for arbitrary processor affinities uses a proof of
concept implementation.  See https://devel.rtems.org/ticket/2510.

Task Migration
--------------
.. index:: task migration
.. index:: thread migration

With more than one processor in the system, tasks can migrate from one
processor to another.  There are four reasons why tasks migrate in RTEMS.

- The scheduler changes explicitly via
  :ref:`rtems_task_set_scheduler() <rtems_task_set_scheduler>` or similar
  directives.

- The task processor affinity changes explicitly via
  :ref:`rtems_task_set_affinity() <rtems_task_set_affinity>` or similar
  directives.

- The task resumes execution after a blocking operation.  On a priority-based
  scheduler it will evict the lowest priority task currently assigned to a
  processor in the processor set managed by the scheduler instance.

- The task moves temporarily to another scheduler instance due to locking
  protocols like the :ref:`MrsP` or the :ref:`OMIP`.

Task migration should be avoided so that the working set of a task can stay on
the most local cache level.

Clustered Scheduling
--------------------

The scheduler is responsible for assigning processors to some of the threads
which are ready to execute.  Trouble starts if more ready threads than
processors exist at the same time.  There are various rules for how the
processor assignment can be performed, attempting to fulfill additional
constraints or yield some overall system properties.  As a matter of fact, it
is impossible to meet all requirements at the same time.  The way a scheduler
works distinguishes real-time operating systems from general purpose operating
systems.

Clustered scheduling is in use when the set of processors of a system is
partitioned into non-empty pairwise-disjoint subsets of processors.  These
subsets are called clusters.  Clusters with a cardinality of one are
partitions.  Each cluster is owned by exactly one scheduler instance.  In case
the cluster size equals the processor count, it is called global scheduling.

Modern SMP systems have multi-layer caches.  An operating system which neglects
cache constraints in the scheduler will not yield good performance.  Real-time
operating systems usually provide priority (fixed or job-level) based
schedulers so that each of the highest priority threads is assigned to a
processor.  Priority based schedulers have difficulties in providing cache
locality for threads and may suffer from excessive thread migrations
:cite:`Brandenburg:2011:SL` :cite:`Compagnin:2014:RUN`.  Schedulers that use
local run queues and some sort of load-balancing to improve the cache
utilization may not fulfill global constraints :cite:`Gujarati:2013:LPP` and
are more difficult to implement than one would normally expect
:cite:`Lozi:2016:LSDWC`.

Clustered scheduling was implemented for RTEMS SMP to best use the cache
topology of a system and to keep the worst-case latencies under control.  The
low-level SMP locks use FIFO ordering.  So, the worst-case run-time of
operations increases with each processor involved.  The scheduler configuration
is quite flexible and done at link-time, see :ref:`Configuring Clustered
Schedulers`.  It is possible to re-assign processors to schedulers during
run-time via :ref:`rtems_scheduler_add_processor()
<rtems_scheduler_add_processor>` and :ref:`rtems_scheduler_remove_processor()
<rtems_scheduler_remove_processor>`.  The schedulers are implemented in an
object-oriented fashion.

The problem is to provide synchronization primitives for inter-cluster
synchronization (where more than one cluster is involved in the
synchronization process).  In RTEMS, the following means are currently
available:

- events,

- message queues,

- mutexes using the :ref:`OMIP`,

- mutexes using the :ref:`MrsP`, and

- binary and counting semaphores.

The clustered scheduling approach enables separation of functions with
real-time requirements and functions that profit from fairness and high
throughput, provided the scheduler instances are fully decoupled and adequate
inter-cluster synchronization primitives are used.

To set the scheduler of a task see :ref:`rtems_scheduler_ident()
<rtems_scheduler_ident>` and :ref:`rtems_task_set_scheduler()
<rtems_task_set_scheduler>`.

OpenMP
------

OpenMP support for RTEMS is available via the GCC provided libgomp.  There is
libgomp support for RTEMS in the POSIX configuration of libgomp since GCC 4.9
(requires a Newlib snapshot after 2015-03-12).  In GCC 6.1 or later (requires a
Newlib snapshot after 2015-07-30 for ``<sys/lock.h>`` provided self-contained
synchronization objects) there is a specialized libgomp configuration for RTEMS
which offers significantly better performance compared to the POSIX
configuration of libgomp.  In addition, application configurable thread pools
for each scheduler instance are available in GCC 6.1 or later.

The run-time configuration of libgomp is done via environment variables
documented in the `libgomp manual <https://gcc.gnu.org/onlinedocs/libgomp/>`_.
The environment variables are evaluated in a constructor function which
executes in the context of the first initialization task before the actual
initialization task function is called (just like a global C++ constructor).
To set application specific values, a higher priority constructor function must
be used to set up the environment variables.

.. code-block:: c

    #include <stdlib.h>

    void __attribute__((constructor(1000))) config_libgomp( void )
    {
        setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 );
        setenv( "GOMP_SPINCOUNT", "30000", 1 );
        setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@SCHD", 1 );
    }

The environment variable ``GOMP_RTEMS_THREAD_POOLS`` is RTEMS-specific.  It
determines the thread pools for each scheduler instance.  The format for
``GOMP_RTEMS_THREAD_POOLS`` is a list of optional
``<thread-pool-count>[$<priority>]@<scheduler-name>`` configurations separated
by ``:`` where:

- ``<thread-pool-count>`` is the thread pool count for this scheduler instance.

- ``$<priority>`` is an optional priority for the worker threads of a thread
  pool according to ``pthread_setschedparam``.  In case a priority value is
  omitted, then a worker thread will inherit the priority of the OpenMP master
  thread that created it.  The priority of the worker thread is not changed by
  libgomp after creation, even if a new OpenMP master thread using the worker
  has a different priority.

- ``@<scheduler-name>`` is the scheduler instance name according to the RTEMS
  application configuration.

In case no thread pool configuration is specified for a scheduler instance,
then each OpenMP master thread of this scheduler instance will use its own
dynamically allocated thread pool.  To limit the worker thread count of the
thread pools, each OpenMP master thread must call ``omp_set_num_threads``.

Let us suppose we have three scheduler instances ``IO``, ``WRK0``, and ``WRK1``
with ``GOMP_RTEMS_THREAD_POOLS`` set to ``"1@WRK0:3$4@WRK1"``.  Then there are
no thread pool restrictions for scheduler instance ``IO``.  In the scheduler
instance ``WRK0`` there is one thread pool available.  Since no priority is
specified for this scheduler instance, the worker thread inherits the priority
of the OpenMP master thread that created it.  In the scheduler instance
``WRK1`` there are three thread pools available and their worker threads run at
priority four.

Application Issues
==================

Most operating system services provided by the uni-processor RTEMS are
available in SMP configurations as well.  However, applications designed for a
uni-processor environment may need some changes to correctly run in an SMP
configuration.

As discussed earlier, SMP systems have opportunities for true parallelism which
were not possible on uni-processor systems.  Consequently, multiple techniques
that provided adequate critical sections on uni-processor systems are unsafe on
SMP systems.  In this section, some of these unsafe techniques will be
discussed.

In general, applications must use proper operating system provided mutual
exclusion mechanisms to ensure correct behavior.

Task variables
--------------

Task variables are ordinary global variables with a dedicated value for each
thread.  During a context switch from the executing thread to the heir thread,
the value of each task variable is saved to the thread control block of the
executing thread and restored from the thread control block of the heir thread.
This is inherently broken if more than one executing thread exists.
Alternatives to task variables are POSIX keys and :ref:`TLS <TLS>`.  All use
cases of task variables in the RTEMS code base were replaced with alternatives.
The task variable API has been removed in RTEMS 4.12.

Highest Priority Thread Never Walks Alone
-----------------------------------------

On a uni-processor system, it is safe to assume that when the highest priority
task in an application executes, it will execute without being preempted until
it voluntarily blocks.  Interrupts may occur while it is executing, but there
will be no context switch to another task unless the highest priority task
voluntarily initiates it.

Given the assumption that no other tasks will have their execution interleaved
with the highest priority task, it is possible for this task to be constructed
such that it does not need to acquire a mutex for protected access to shared
data.

In an SMP system, it cannot be assumed that only a single task is executing.
It should be assumed that every processor is executing another application
task.  Further, those tasks will be ones which would not have been executed in
a uni-processor configuration and should be assumed to have data
synchronization conflicts with what was formerly the highest priority task
which executed without conflict.

Disabling of Thread Pre-Emption
-------------------------------

A thread which disables pre-emption prevents a higher priority thread from
involuntarily taking over its processor.  In uni-processor configurations, this
can be used to ensure mutual exclusion at thread level.  In SMP configurations,
however, more than one executing thread may exist.  Thus, it is impossible to
ensure mutual exclusion using this mechanism.  To prevent applications that use
pre-emption for this purpose from showing inappropriate behaviour, this feature
is disabled in SMP configurations and its use causes run-time errors.

Disabling of Interrupts
-----------------------

A low-overhead means to ensure mutual exclusion in uni-processor configurations
is the disabling of interrupts around a critical section.  This is commonly
used in device driver code.  In SMP configurations, however, disabling the
interrupts on one processor has no effect on other processors.  So, this is
insufficient to ensure system-wide mutual exclusion.  The macros

* :ref:`rtems_interrupt_disable() <rtems_interrupt_disable>`,

* :ref:`rtems_interrupt_enable() <rtems_interrupt_enable>`, and

* :ref:`rtems_interrupt_flash() <rtems_interrupt_flash>`

are disabled in SMP configurations and their use will cause compile-time
warnings and link-time errors.  In the unlikely case that interrupts must be
disabled on the current processor, the

* :ref:`rtems_interrupt_local_disable() <rtems_interrupt_local_disable>`, and

* :ref:`rtems_interrupt_local_enable() <rtems_interrupt_local_enable>`

macros are available in all configurations.

Since disabling of interrupts is insufficient to ensure system-wide mutual
exclusion on SMP, a new low-level synchronization primitive was added --
interrupt locks.  The interrupt locks are a simple API layer on top of the SMP
locks used for low-level synchronization in the operating system core.
Currently, they are implemented as a ticket lock.  In uni-processor
configurations, they degenerate to simple interrupt disable/enable sequences by
means of the C pre-processor.  It is disallowed to acquire a single interrupt
lock in a nested way.  This will result in an infinite loop with interrupts
disabled.  While converting legacy code to interrupt locks, care must be taken
to avoid this situation.

.. code-block:: c
    :linenos:

    #include <rtems.h>

    void legacy_code_with_interrupt_disable_enable( void )
    {
      rtems_interrupt_level level;

      rtems_interrupt_disable( level );
      /* Critical section */
      rtems_interrupt_enable( level );
    }

    RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )

    void smp_ready_code_with_interrupt_lock( void )
    {
      rtems_interrupt_lock_context lock_context;

      rtems_interrupt_lock_acquire( &lock, &lock_context );
      /* Critical section */
      rtems_interrupt_lock_release( &lock, &lock_context );
    }

An alternative to the RTEMS-specific interrupt locks are POSIX spinlocks.  The
:c:type:`pthread_spinlock_t` is defined as a self-contained object, i.e. the
user must provide the storage for this synchronization object.

.. code-block:: c
    :linenos:

    #include <assert.h>
    #include <pthread.h>

    pthread_spinlock_t lock;

    void smp_ready_code_with_posix_spinlock( void )
    {
      int error;

      error = pthread_spin_lock( &lock );
      assert( error == 0 );
      /* Critical section */
      error = pthread_spin_unlock( &lock );
      assert( error == 0 );
    }

In contrast to the POSIX spinlock implementations on Linux or FreeBSD, it is
not allowed to call blocking operating system services inside the critical
section.  A recursive lock attempt is a severe usage error resulting in an
infinite loop with interrupts disabled.  Nesting of different locks is allowed.
The user must ensure that no deadlock can occur.  As a non-portable feature the
locks are zero-initialized, i.e. statically initialized global locks reside in
the ``.bss`` section and there is no need to call :c:func:`pthread_spin_init`.

Interrupt Service Routines Execute in Parallel With Threads
-----------------------------------------------------------

On a machine with more than one processor, interrupt service routines (this
includes timer service routines installed via :ref:`rtems_timer_fire_after()
<rtems_timer_fire_after>`) and threads can execute in parallel.  Interrupt
service routines must take this into account and use proper locking mechanisms
to protect critical sections from interference by threads (interrupt locks or
POSIX spinlocks).  This likely requires code modifications in legacy device
drivers.

Timers Do Not Stop Immediately
------------------------------

Timer service routines run in the context of the clock interrupt.  On
uni-processor configurations, it is sufficient to disable interrupts and remove
a timer from the set of active timers to stop it.  In SMP configurations,
however, the timer service routine may already run and wait on an SMP lock
owned by the thread which is about to stop the timer.  This opens the door to
subtle synchronization issues.  During destruction of objects, special care
must be taken to ensure that timer service routines cannot access (partly or
fully) destroyed objects.

False Sharing of Cache Lines Due to Objects Table
-------------------------------------------------

The Classic API and most POSIX API objects are indirectly accessed via an
object identifier.  The user-level functions validate the object identifier and
map it to the actual object structure which resides in a global objects table
for each object class.  So, unrelated objects are packed together in a table.
This may result in false sharing of cache lines.  The effect of false sharing
of cache lines can be observed with the `TMFINE 1
<https://git.rtems.org/rtems/tree/testsuites/tmtests/tmfine01>`_ test program
on a suitable platform, e.g. QorIQ T4240.  High-performance SMP applications
need full control of the object storage :cite:`Drepper:2007:Memory`.
Therefore, self-contained synchronization objects are now available for RTEMS.

Directives
==========

This section details the symmetric multiprocessing services.  A subsection is
dedicated to each of these services and describes the calling sequence, related
constants, usage, and status codes.

.. raw:: latex

   \clearpage

.. _rtems_get_processor_count:

GET_PROCESSOR_COUNT - Get processor count
-----------------------------------------

CALLING SEQUENCE:
    .. code-block:: c

        uint32_t rtems_get_processor_count(void);

DIRECTIVE STATUS CODES:
    The count of processors in the system.

DESCRIPTION:
    In uni-processor configurations, a value of one will be returned.

    In SMP configurations, this returns the value of a global variable set
    during system initialization to indicate the count of utilized processors.
    The processor count depends on the physically or virtually available
    processors and application configuration.  The value will always be less
    than or equal to the maximum count of application configured processors.

NOTES:
    None.

.. raw:: latex

   \clearpage

.. _rtems_get_current_processor:

GET_CURRENT_PROCESSOR - Get current processor index
---------------------------------------------------

CALLING SEQUENCE:
    .. code-block:: c

        uint32_t rtems_get_current_processor(void);

DIRECTIVE STATUS CODES:
    The index of the current processor.

DESCRIPTION:
    In uni-processor configurations, a value of zero will be returned.

    In SMP configurations, an architecture specific method is used to obtain the
    index of the current processor in the system.  The set of processor indices
    is the range of integers starting with zero up to the processor count minus
    one.

    Outside of sections with disabled thread dispatching, the current processor
    index may change after every instruction since the thread may migrate from
    one processor to another.  Sections with disabled interrupts are sections
    with thread dispatching disabled.

NOTES:
    None.

Implementation Details
======================

This section covers some implementation details of the RTEMS SMP support.

Low-Level Synchronization
-------------------------

All low-level synchronization primitives are implemented using :term:`C11`
atomic operations, so no target-specific hand-written assembler code is
necessary.  Four synchronization primitives are currently available:

* ticket locks (mutual exclusion),

* :term:`MCS` locks (mutual exclusion),

* barriers, implemented as a sense barrier, and

* sequence locks :cite:`Boehm:2012:Seqlock`.

A vital requirement for low-level mutual exclusion is :term:`FIFO` fairness
since we are interested in a predictable system and not maximum throughput.
With this requirement, there are only a few options to resolve this problem.
For reasons of simplicity, the ticket lock algorithm was chosen to implement
the SMP locks.  However, the API is capable of supporting MCS locks, which may
be interesting in the future for systems with a processor count in the range of
32 or more, e.g. :term:`NUMA`, many-core systems.

The test program `SMPLOCK 1
<https://git.rtems.org/rtems/tree/testsuites/smptests/smplock01>`_ can be used
to gather performance and fairness data for several scenarios.  The SMP lock
performance and fairness measured on the QorIQ T4240 follows as an example.
This chip contains three L2 caches.  Each L2 cache is shared by eight
processors.

.. image:: ../images/c_user/smplock01perf-t4240.*
   :width: 400
   :align: center

.. image:: ../images/c_user/smplock01fair-t4240.*
   :width: 400
   :align: center

Scheduler Helping Protocol
--------------------------

The scheduler provides a helping protocol to support locking protocols like the
:ref:`OMIP` or the :ref:`MrsP`.  Each thread has a scheduler node for each
scheduler instance in the system; these nodes are located in its :term:`TCB`.
A thread has exactly one home scheduler instance which is set during thread
creation.  The home scheduler instance can be changed with
:ref:`rtems_task_set_scheduler() <rtems_task_set_scheduler>`.  Due to the
locking protocols a thread may gain access to scheduler nodes of other
scheduler instances.  This allows the thread to temporarily migrate to another
scheduler instance in case of pre-emption.

The scheduler infrastructure is based on an object-oriented design.  The
scheduler operations for a thread are defined as virtual functions.  For the
scheduler helping protocol the following operations must be implemented by an
SMP-aware scheduler:

* ask a scheduler node for help,
* reconsider the help request of a scheduler node,
* withdraw a scheduler node.

All currently available SMP-aware schedulers use a framework which is
customized via inline functions.  This eases the implementation of scheduler
variants.  Up to now, only priority-based schedulers are implemented.

In case a thread is allowed to use more than one scheduler node, it will ask
these nodes for help

* in case of pre-emption, or
* in case an unblock did not schedule the thread, or
* in case a yield was successful.

The actual ask for help scheduler operations are carried out as a side-effect
of the thread dispatch procedure.  Once a need for help is recognized, a help
request is registered in one of the processors related to the thread and a
thread dispatch is issued.  This indirection leads to a better decoupling of
scheduler instances.  Unrelated processors are not burdened with extra work for
threads which participate in resource sharing.  Each ask for help operation
indicates if it could help or not.  The procedure stops after the first
successful ask for help.  Unsuccessful ask for help operations will register
this need in the scheduler context.

After a thread dispatch the reconsider help request operation is used to clean
up stale help registrations in the scheduler contexts.

The withdraw operation takes away scheduler nodes once the thread is no longer
allowed to use them, e.g. it released a mutex.  The availability of scheduler
nodes for a thread is controlled by the thread queues.

Thread Dispatch Details
-----------------------

This section gives background information to developers interested in the
interrupt latencies introduced by thread dispatching.  A thread dispatch
consists of all work which must be done to stop the currently executing thread
on a processor and hand over this processor to an heir thread.

In SMP systems, scheduling decisions on one processor must be propagated to
other processors through inter-processor interrupts.  A thread dispatch which
must be carried out on another processor does not happen instantaneously.
Thus, several thread dispatch requests might be in flight and it is possible
that some of them may be out of date before the corresponding processor has
time to deal with them.  The thread dispatch mechanism uses three per-processor
variables:

- the executing thread,

- the heir thread, and

- a boolean flag indicating if a thread dispatch is necessary or not.

Updates of the heir thread are done via a normal store operation.  The thread
dispatch necessary indicator of another processor is set as a side-effect of an
inter-processor interrupt.  So, this change notification works without the use
of locks.  The thread context is protected by a :term:`TTAS` lock embedded in
the context to ensure that it is used on at most one processor at a time.
Normally, only thread-specific or per-processor locks are used during a thread
dispatch.  This implementation turned out to be quite efficient and no lock
contention was observed in the testsuite.  The heavy-weight thread dispatch
sequence is only entered in case the thread dispatch indicator is set.

The context-switch is performed with interrupts enabled.  During the transition
from the executing to the heir thread, neither the stack of the executing nor
the heir thread may be used during interrupt processing.  For this purpose a
temporary per-processor stack is set up which may be used by the interrupt
prologue before the stack is switched to the interrupt stack.