1 | .. comment SPDX-License-Identifier: CC-BY-SA-4.0 |
---|
2 | |
---|
3 | .. COMMENT: COPYRIGHT (c) 2014. |
---|
4 | .. COMMENT: On-Line Applications Research Corporation (OAR). |
---|
5 | .. COMMENT: All rights reserved. |
---|
6 | |
---|
7 | Symmetric Multiprocessing Services |
---|
8 | ********************************** |
---|
9 | |
---|
10 | Introduction |
---|
11 | ============ |
---|
12 | |
---|
13 | The Symmetric Multiprocessing (SMP) support of RTEMS 4.11.0 and later is available |
---|
14 | on |
---|
15 | |
---|
16 | - ARM, |
---|
17 | |
---|
18 | - PowerPC, and |
---|
19 | |
---|
20 | - SPARC. |
---|
21 | |
---|
22 | It must be explicitly enabled via the ``--enable-smp`` configure command line |
---|
23 | option. To enable SMP in the application configuration see :ref:`Enable SMP |
---|
24 | Support for Applications`. The default scheduler for SMP applications supports |
---|
25 | up to 32 processors and is a global fixed priority scheduler, see also |
---|
26 | :ref:`Configuring Clustered Schedulers`. For example applications |
---|
27 | see :file:`testsuites/smptests`. |
---|
28 | |
---|
29 | .. warning:: |
---|
30 | |
---|
31 | The SMP support in this release of RTEMS is a work in progress. Before you |
---|
32 | start using this RTEMS version for SMP ask on the RTEMS mailing list. |
---|
33 | |
---|
34 | This chapter describes the services related to Symmetric Multiprocessing |
---|
35 | provided by RTEMS. |
---|
36 | |
---|
37 | The application level services currently provided are: |
---|
38 | |
---|
39 | - rtems_get_processor_count_ - Get processor count |
---|
40 | |
---|
41 | - rtems_get_current_processor_ - Get current processor index |
---|
42 | |
---|
43 | Background |
---|
44 | ========== |
---|
45 | |
---|
46 | Uniprocessor versus SMP Parallelism |
---|
47 | ----------------------------------- |
---|
48 | |
---|
49 | Uniprocessor systems have long been used in embedded systems. In this hardware |
---|
50 | model, there are some system execution characteristics which have long been |
---|
51 | taken for granted: |
---|
52 | |
---|
53 | - one task executes at a time |
---|
54 | |
---|
55 | - hardware events result in interrupts |
---|
56 | |
---|
57 | There is no true parallelism. Even when interrupts appear to occur at the same |
---|
58 | time, they are processed in largely a serial fashion. This is true even when |
---|
59 | the interrupt service routines are allowed to nest. From a tasking viewpoint, |
---|
60 | it is the responsibility of the real-time operating system to simulate |
---|
61 | parallelism by switching between tasks. These task switches occur in response |
---|
62 | to hardware interrupt events and explicit application events such as blocking |
---|
63 | for a resource or delaying. |
---|
64 | |
---|
65 | With symmetric multiprocessing, the presence of multiple processors allows for |
---|
66 | true concurrency and provides for cost-effective performance |
---|
67 | improvements. Uniprocessors tend to increase performance by increasing clock |
---|
68 | speed and complexity. This tends to lead to hot, power hungry microprocessors |
---|
69 | which are poorly suited for many embedded applications. |
---|
70 | |
---|
71 | The true concurrency is in sharp contrast to the single task and interrupt |
---|
72 | model of uniprocessor systems. This results in a fundamental change to |
---|
73 | uniprocessor system characteristics listed above. Developers are faced with a |
---|
74 | different set of characteristics which, in turn, break some existing |
---|
75 | assumptions and result in new challenges. In an SMP system with N processors, |
---|
76 | these are the new execution characteristics: |
---|
77 | |
---|
78 | - N tasks execute in parallel |
---|
79 | |
---|
80 | - hardware events result in interrupts |
---|
81 | |
---|
82 | There is true parallelism with a task executing on each processor and the |
---|
83 | possibility of interrupts occurring on each processor. Thus in contrast to |
---|
84 | there being one task and one interrupt to consider on a uniprocessor, there are |
---|
85 | N tasks and potentially N simultaneous interrupts to consider on an SMP system. |
---|
86 | |
---|
87 | This increase in hardware complexity and presence of true parallelism results |
---|
88 | in the application developer needing to be even more cautious about mutual |
---|
89 | exclusion and shared data access than in a uniprocessor embedded system. Race |
---|
90 | conditions that never or rarely happened when an application executed on a |
---|
91 | uniprocessor system, become much more likely due to multiple threads executing |
---|
92 | in parallel. On a uniprocessor system, these race conditions would only happen |
---|
93 | when a task switch occurred at just the wrong moment. Now there are N-1 other tasks |
---|
94 | executing in parallel all the time and this results in many more opportunities |
---|
95 | for small windows in critical sections to be hit. |
---|
96 | |
---|
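As a minimal host-side sketch of this point, the following example uses POSIX threads rather than RTEMS directives. The names ``shared_counter``, ``worker``, and ``run_workers`` are hypothetical and chosen for illustration only. Each worker performs a read-modify-write sequence on shared data under a mutex; without the lock, the final count could fall short of the expected value once the workers truly execute in parallel.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define WORKER_COUNT 4
#define INCREMENTS_PER_WORKER 100000

static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Each worker increments the shared counter inside a critical section. */
static void *worker( void *arg )
{
  (void) arg;

  for ( int i = 0; i < INCREMENTS_PER_WORKER; ++i ) {
    pthread_mutex_lock( &counter_mutex );
    ++shared_counter; /* Read-modify-write on shared data */
    pthread_mutex_unlock( &counter_mutex );
  }

  return NULL;
}

/* Runs all workers to completion and returns the final counter value. */
long run_workers( void )
{
  pthread_t threads[ WORKER_COUNT ];

  shared_counter = 0;

  for ( size_t i = 0; i < WORKER_COUNT; ++i ) {
    int error = pthread_create( &threads[ i ], NULL, worker, NULL );
    assert( error == 0 );
  }

  for ( size_t i = 0; i < WORKER_COUNT; ++i ) {
    int error = pthread_join( threads[ i ], NULL );
    assert( error == 0 );
  }

  return shared_counter;
}
```

With the mutex in place, ``run_workers()`` returns ``WORKER_COUNT * INCREMENTS_PER_WORKER`` regardless of how the workers are interleaved or distributed over processors.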
97 | Task Affinity |
---|
98 | ------------- |
---|
99 | .. index:: task affinity |
---|
100 | .. index:: thread affinity |
---|
101 | |
---|
102 | RTEMS provides services to manipulate the affinity of a task. Affinity is used |
---|
103 | to specify the subset of processors in an SMP system on which a particular task |
---|
104 | can execute. |
---|
105 | |
---|
106 | By default, tasks have an affinity which allows them to execute on any |
---|
107 | available processor. |
---|
108 | |
---|
109 | Task affinity is a possible feature to be supported by SMP-aware |
---|
110 | schedulers. However, only a subset of the available schedulers support |
---|
111 | affinity. Although the behavior is scheduler specific, if the scheduler does |
---|
112 | not support affinity, it is likely to ignore all attempts to set affinity. |
---|
113 | |
---|
114 | The scheduler with support for arbitrary processor affinities uses a proof of |
---|
115 | concept implementation. See https://devel.rtems.org/ticket/2510. |
---|
116 | |
---|
117 | Task Migration |
---|
118 | -------------- |
---|
119 | .. index:: task migration |
---|
120 | .. index:: thread migration |
---|
121 | |
---|
122 | With more than one processor in the system, tasks can migrate from one processor |
---|
123 | to another. There are four reasons why tasks migrate in RTEMS. |
---|
124 | |
---|
125 | - The scheduler changes explicitly via |
---|
126 | :ref:`rtems_task_set_scheduler() <rtems_task_set_scheduler>` or similar |
---|
127 | directives. |
---|
128 | |
---|
129 | - The task processor affinity changes explicitly via |
---|
130 | :ref:`rtems_task_set_affinity() <rtems_task_set_affinity>` or similar |
---|
131 | directives. |
---|
132 | |
---|
133 | - The task resumes execution after a blocking operation. On a priority based |
---|
134 | scheduler it will evict the lowest priority task currently assigned to a |
---|
135 | processor in the processor set managed by the scheduler instance. |
---|
136 | |
---|
137 | - The task moves temporarily to another scheduler instance due to locking |
---|
138 | protocols like the :ref:`MrsP` or the :ref:`OMIP`. |
---|
139 | |
---|
140 | Task migration should be avoided so that the working set of a task can stay on |
---|
141 | the most local cache level. |
---|
142 | |
---|
143 | Clustered Scheduling |
---|
144 | -------------------- |
---|
145 | |
---|
146 | The scheduler is responsible for assigning processors to some of the threads which |
---|
147 | are ready to execute. Trouble starts if more ready threads than processors |
---|
148 | exist at the same time. There are various rules how the processor assignment |
---|
149 | can be performed attempting to fulfill additional constraints or yield some |
---|
150 | overall system properties. As a matter of fact it is impossible to meet all |
---|
151 | requirements at the same time. The way a scheduler works distinguishes |
---|
152 | real-time operating systems from general purpose operating systems. |
---|
153 | |
---|
154 | We have clustered scheduling in case the set of processors of a system is |
---|
155 | partitioned into non-empty pairwise-disjoint subsets of processors. These |
---|
156 | subsets are called clusters. Clusters with a cardinality of one are |
---|
157 | partitions. Each cluster is owned by exactly one scheduler instance. In case |
---|
158 | the cluster size equals the processor count, it is called global scheduling. |
---|
159 | |
---|
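The definition above can be sketched as a small check. This is an illustration under stated assumptions, not RTEMS code: each cluster is encoded as a 32-bit mask in which bit ``i`` stands for processor index ``i``, and the names ``processor_mask``, ``is_valid_clustering``, and ``is_partition`` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit i set means processor index i belongs to the cluster. */
typedef uint32_t processor_mask;

/*
 * Returns true if the clusters are non-empty, pairwise disjoint, and
 * together cover exactly the processors in all_processors.
 */
bool is_valid_clustering(
  const processor_mask *clusters,
  size_t                cluster_count,
  processor_mask        all_processors
)
{
  processor_mask seen = 0;

  assert( clusters != NULL || cluster_count == 0 );

  for ( size_t i = 0; i < cluster_count; ++i ) {
    if ( clusters[ i ] == 0 ) {
      return false; /* Clusters must be non-empty */
    }

    if ( ( clusters[ i ] & seen ) != 0 ) {
      return false; /* Clusters must be pairwise disjoint */
    }

    seen |= clusters[ i ];
  }

  return seen == all_processors; /* Every processor belongs to a cluster */
}

/* A cluster with a cardinality of one (exactly one bit set) is a partition. */
bool is_partition( processor_mask cluster )
{
  return cluster != 0 && ( cluster & ( cluster - 1 ) ) == 0;
}
```

In this encoding, global scheduling corresponds to a single cluster whose mask equals ``all_processors``.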
160 | Modern SMP systems have multi-layer caches. An operating system which neglects |
---|
161 | cache constraints in the scheduler will not yield good performance. Real-time |
---|
162 | operating systems usually provide priority (fixed or job-level) based |
---|
163 | schedulers so that each of the highest priority threads is assigned to a |
---|
164 | processor. Priority based schedulers have difficulties in providing cache |
---|
165 | locality for threads and may suffer from excessive thread migrations |
---|
166 | :cite:`Brandenburg:2011:SL` :cite:`Compagnin:2014:RUN`. Schedulers that use local run |
---|
167 | queues and some sort of load-balancing to improve the cache utilization may not |
---|
168 | fulfill global constraints :cite:`Gujarati:2013:LPP` and are more difficult to |
---|
169 | implement than one would normally expect :cite:`Lozi:2016:LSDWC`. |
---|
170 | |
---|
171 | Clustered scheduling was implemented for RTEMS SMP to best use the cache |
---|
172 | topology of a system and to keep the worst-case latencies under control. The |
---|
173 | low-level SMP locks use FIFO ordering. So, the worst-case run-time of |
---|
174 | operations increases with each processor involved. The scheduler configuration |
---|
175 | is quite flexible and done at link-time, see :ref:`Configuring Clustered |
---|
176 | Schedulers`. It is possible to re-assign processors to schedulers during |
---|
177 | run-time via :ref:`rtems_scheduler_add_processor() |
---|
178 | <rtems_scheduler_add_processor>` and :ref:`rtems_scheduler_remove_processor() |
---|
179 | <rtems_scheduler_remove_processor>`. The schedulers are implemented in an |
---|
180 | object-oriented fashion. |
---|
181 | |
---|
182 | The problem is to provide synchronization |
---|
183 | primitives for inter-cluster synchronization (more than one cluster is involved |
---|
184 | in the synchronization process). In RTEMS there are currently some means |
---|
185 | available: |
---|
186 | |
---|
187 | - events, |
---|
188 | |
---|
189 | - message queues, |
---|
190 | |
---|
191 | - mutexes using the :ref:`OMIP`, |
---|
192 | |
---|
193 | - mutexes using the :ref:`MrsP`, and |
---|
194 | |
---|
195 | - binary and counting semaphores. |
---|
196 | |
---|
197 | The clustered scheduling approach enables separation of functions with |
---|
198 | real-time requirements and functions that profit from fairness and high |
---|
199 | throughput provided the scheduler instances are fully decoupled and adequate |
---|
200 | inter-cluster synchronization primitives are used. |
---|
201 | |
---|
202 | To set the scheduler of a task see :ref:`rtems_scheduler_ident() |
---|
203 | <rtems_scheduler_ident>` and :ref:`rtems_task_set_scheduler() |
---|
204 | <rtems_task_set_scheduler>`. |
---|
205 | |
---|
206 | Scheduler Helping Protocol |
---|
207 | -------------------------- |
---|
208 | |
---|
209 | The scheduler provides a helping protocol to support locking protocols like |
---|
210 | *Migratory Priority Inheritance* or the *Multiprocessor Resource Sharing |
---|
211 | Protocol*. Each ready task can use at least one scheduler node at a time to |
---|
212 | gain access to a processor. Each scheduler node has an owner, a user and an |
---|
213 | optional idle task. The owner of a scheduler node is determined at task |
---|
214 | creation and never changes during the life time of a scheduler node. The user |
---|
215 | of a scheduler node may change due to the scheduler helping protocol. A |
---|
216 | scheduler node is in one of the four scheduler help states: |
---|
217 | |
---|
218 | :dfn:`help yourself` |
---|
219 | This scheduler node is solely used by the owner task. This task owns no |
---|
220 | resources using a helping protocol and thus does not take part in the |
---|
221 | scheduler helping protocol. No help will be provided for other tasks. |
---|
222 | |
---|
223 | :dfn:`help active owner` |
---|
224 | This scheduler node is owned by a task actively owning a resource and can |
---|
225 | be used to help out tasks. In case this scheduler node changes its state |
---|
226 | from ready to scheduled and the task executes using another node, then an |
---|
227 | idle task will be provided as a user of this node to temporarily execute on |
---|
228 | behalf of the owner task. Thus lower priority tasks are denied access to |
---|
229 | the processors of this scheduler instance. In case a task actively owning |
---|
230 | a resource performs a blocking operation, then an idle task will be used |
---|
231 | also in case this node is in the scheduled state. |
---|
232 | |
---|
233 | :dfn:`help active rival` |
---|
234 | This scheduler node is owned by a task actively obtaining a resource |
---|
235 | currently owned by another task and can be used to help out tasks. The |
---|
236 | task owning this node is ready and will give away its processor in case the |
---|
237 | task owning the resource asks for help. |
---|
238 | |
---|
239 | :dfn:`help passive` |
---|
240 | This scheduler node is owned by a task obtaining a resource currently owned |
---|
241 | by another task and can be used to help out tasks. The task owning this |
---|
242 | node is blocked. |
---|
243 | |
---|
244 | The following scheduler operations return a task in need for help: |
---|
245 | |
---|
246 | - unblock, |
---|
247 | |
---|
248 | - change priority, |
---|
249 | |
---|
250 | - yield, and |
---|
251 | |
---|
252 | - ask for help. |
---|
253 | |
---|
254 | A task in need for help is a task that encounters a scheduler state change from |
---|
255 | scheduled to ready (this is a pre-emption by a higher priority task) or a task |
---|
256 | that cannot be scheduled in an unblock operation. Such a task can ask tasks |
---|
257 | which depend on resources owned by this task for help. |
---|
258 | |
---|
259 | In case it is not possible to schedule a task in need for help, then the |
---|
260 | scheduler nodes available for the task will be placed into the set of ready |
---|
261 | scheduler nodes of the corresponding scheduler instances. Once a state change |
---|
262 | from ready to scheduled happens for one of the scheduler nodes, it will be used to |
---|
263 | schedule the task in need for help. |
---|
264 | |
---|
265 | The ask for help scheduler operation is used to help tasks in need for help |
---|
266 | returned by the operations mentioned above. This operation is also used in |
---|
267 | case the root of a resource sub-tree owned by a task changes. |
---|
268 | |
---|
269 | The run-time of the ask for help procedures depends on the size of the resource |
---|
270 | tree of the task needing help and other resource trees in case tasks in need |
---|
271 | for help are produced during this operation. Thus the worst-case latency in |
---|
272 | the system depends on the maximum resource tree size of the application. |
---|
273 | |
---|
274 | OpenMP |
---|
275 | ------ |
---|
276 | |
---|
277 | OpenMP support for RTEMS is available via the GCC provided libgomp. There is |
---|
278 | libgomp support for RTEMS in the POSIX configuration of libgomp since GCC 4.9 |
---|
279 | (requires a Newlib snapshot after 2015-03-12). In GCC 6.1 or later (requires a |
---|
280 | Newlib snapshot after 2015-07-30 for <sys/lock.h> provided self-contained |
---|
281 | synchronization objects) there is a specialized libgomp configuration for RTEMS |
---|
282 | which offers a significantly better performance compared to the POSIX |
---|
283 | configuration of libgomp. In addition application configurable thread pools |
---|
284 | for each scheduler instance are available in GCC 6.1 or later. |
---|
285 | |
---|
286 | The run-time configuration of libgomp is done via environment variables |
---|
287 | documented in the `libgomp manual <https://gcc.gnu.org/onlinedocs/libgomp/>`_. |
---|
288 | The environment variables are evaluated in a constructor function which |
---|
289 | executes in the context of the first initialization task before the actual |
---|
290 | initialization task function is called (just like a global C++ constructor). |
---|
291 | To set application specific values, a higher priority constructor function must |
---|
292 | be used to set up the environment variables. |
---|
293 | |
---|
294 | .. code-block:: c |
---|
295 | |
---|
296 | #include <stdlib.h> |
---|
297 | void __attribute__((constructor(1000))) config_libgomp( void ) |
---|
298 | { |
---|
299 | setenv( "OMP_DISPLAY_ENV", "VERBOSE", 1 ); |
---|
300 | setenv( "GOMP_SPINCOUNT", "30000", 1 ); |
---|
301 | setenv( "GOMP_RTEMS_THREAD_POOLS", "1$2@SCHD", 1 ); |
---|
302 | } |
---|
303 | |
---|
304 | The environment variable ``GOMP_RTEMS_THREAD_POOLS`` is RTEMS-specific. It |
---|
305 | determines the thread pools for each scheduler instance. The format for |
---|
306 | ``GOMP_RTEMS_THREAD_POOLS`` is a list of optional |
---|
307 | ``<thread-pool-count>[$<priority>]@<scheduler-name>`` configurations separated |
---|
308 | by ``:`` where: |
---|
309 | |
---|
310 | - ``<thread-pool-count>`` is the thread pool count for this scheduler instance. |
---|
311 | |
---|
312 | - ``$<priority>`` is an optional priority for the worker threads of a thread |
---|
313 | pool according to ``pthread_setschedparam``. In case a priority value is |
---|
314 | omitted, then a worker thread will inherit the priority of the OpenMP master |
---|
315 | thread that created it. The priority of the worker thread is not changed by |
---|
316 | libgomp after creation, even if a new OpenMP master thread using the worker |
---|
317 | has a different priority. |
---|
318 | |
---|
319 | - ``@<scheduler-name>`` is the scheduler instance name according to the RTEMS |
---|
320 | application configuration. |
---|
321 | |
---|
322 | In case no thread pool configuration is specified for a scheduler instance, |
---|
323 | then each OpenMP master thread of this scheduler instance will use its own |
---|
324 | dynamically allocated thread pool. To limit the worker thread count of the |
---|
325 | thread pools, each OpenMP master thread must call ``omp_set_num_threads``. |
---|
326 | |
---|
327 | Let's suppose we have three scheduler instances ``IO``, ``WRK0``, and ``WRK1`` |
---|
328 | with ``GOMP_RTEMS_THREAD_POOLS`` set to ``"1@WRK0:3$4@WRK1"``. Then there are |
---|
329 | no thread pool restrictions for scheduler instance ``IO``. In the scheduler |
---|
330 | instance ``WRK0`` there is one thread pool available. Since no priority is |
---|
331 | specified for this scheduler instance, the worker thread inherits the priority |
---|
332 | of the OpenMP master thread that created it. In the scheduler instance |
---|
333 | ``WRK1`` there are three thread pools available and their worker threads run at |
---|
334 | priority four. |
---|
335 | |
---|
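To make the format concrete, the following sketch parses a ``GOMP_RTEMS_THREAD_POOLS`` value into its components. It is not the parser used by libgomp; the names ``pool_config``, ``parse_entry``, and ``parse_thread_pools`` and the 15 character limit for scheduler names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for one configuration entry */
typedef struct {
  unsigned long pool_count;      /* <thread-pool-count> */
  bool          has_priority;    /* True if $<priority> was given */
  unsigned long priority;        /* Valid only if has_priority is true */
  char          scheduler[ 16 ]; /* <scheduler-name>, NUL-terminated */
} pool_config;

/*
 * Parses one "<count>[$<priority>]@<name>" entry starting at *cursor and
 * advances *cursor past it.  Returns false on malformed input.
 */
static bool parse_entry( const char **cursor, pool_config *config )
{
  char  *end;
  size_t name_length;

  config->pool_count = strtoul( *cursor, &end, 10 );
  if ( end == *cursor ) {
    return false; /* Missing thread pool count */
  }

  config->has_priority = ( *end == '$' );
  config->priority = 0;
  if ( config->has_priority ) {
    const char *start = end + 1;

    config->priority = strtoul( start, &end, 10 );
    if ( end == start ) {
      return false; /* '$' without a priority value */
    }
  }

  if ( *end != '@' ) {
    return false; /* Missing scheduler name separator */
  }
  ++end;

  name_length = strcspn( end, ":" );
  if ( name_length == 0 || name_length >= sizeof( config->scheduler ) ) {
    return false;
  }

  memcpy( config->scheduler, end, name_length );
  config->scheduler[ name_length ] = '\0';
  *cursor = end + name_length;
  return true;
}

/* Parses a ':'-separated list; returns the entry count or -1 on error. */
int parse_thread_pools( const char *value, pool_config *configs, size_t max )
{
  size_t count = 0;

  assert( value != NULL && configs != NULL );

  while ( *value != '\0' ) {
    if ( count == max || !parse_entry( &value, &configs[ count ] ) ) {
      return -1;
    }

    ++count;

    if ( *value == ':' ) {
      ++value; /* Skip the entry separator */
    }
  }

  return (int) count;
}
```

Applied to ``"1@WRK0:3$4@WRK1"``, this sketch yields two entries: one thread pool without an explicit priority for ``WRK0`` and three thread pools with priority four for ``WRK1``.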
336 | Thread Dispatch Details |
---|
337 | ----------------------- |
---|
338 | |
---|
339 | This section gives background information to developers interested in the |
---|
340 | interrupt latencies introduced by thread dispatching. A thread dispatch |
---|
341 | consists of all work which must be done to stop the currently executing thread |
---|
342 | on a processor and hand over this processor to an heir thread. |
---|
343 | |
---|
344 | In SMP systems, scheduling decisions on one processor must be propagated |
---|
345 | to other processors through inter-processor interrupts. A thread dispatch |
---|
346 | which must be carried out on another processor does not happen instantaneously. |
---|
347 | Thus, several thread dispatch requests might be in the air and it is possible |
---|
348 | that some of them may be out of date before the corresponding processor has |
---|
349 | time to deal with them. The thread dispatch mechanism uses three per-processor |
---|
350 | variables, |
---|
351 | |
---|
352 | - the executing thread, |
---|
353 | |
---|
354 | - the heir thread, and |
---|
355 | |
---|
356 | - a boolean flag indicating if a thread dispatch is necessary or not. |
---|
357 | |
---|
358 | Updates of the heir thread are done via a normal store operation. The thread |
---|
359 | dispatch necessary indicator of another processor is set as a side-effect of an |
---|
360 | inter-processor interrupt. So, this change notification works without the use |
---|
361 | of locks. The thread context is protected by a TTAS lock embedded in the |
---|
362 | context to ensure that it is used on at most one processor at a time. |
---|
363 | Normally, only thread-specific or per-processor locks are used during a thread |
---|
364 | dispatch. This implementation turned out to be quite efficient and no lock |
---|
365 | contention was observed in the testsuite. The heavy-weight thread dispatch |
---|
366 | sequence is only entered in case the thread dispatch indicator is set. |
---|
367 | |
---|
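The interplay of the three per-processor variables can be modelled as follows. This is a simplified host-side sketch using C11 atomics, not the RTEMS implementation: the types ``thread_control`` and ``per_cpu_control`` and all function names are hypothetical, the inter-processor interrupt is reduced to a function call, and the thread context lock is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread control block */
typedef struct {
  int id;
} thread_control;

/* Hypothetical per-processor state with the three variables from the text */
typedef struct {
  thread_control             *executing;
  _Atomic( thread_control * ) heir;
  atomic_bool                 dispatch_necessary;
} per_cpu_control;

void per_cpu_init( per_cpu_control *cpu, thread_control *executing )
{
  assert( cpu != NULL && executing != NULL );
  cpu->executing = executing;
  atomic_init( &cpu->heir, executing );
  atomic_init( &cpu->dispatch_necessary, false );
}

/* Scheduler side: publish a new heir for the target processor. */
void set_heir( per_cpu_control *cpu, thread_control *heir )
{
  atomic_store_explicit( &cpu->heir, heir, memory_order_relaxed );
}

/* Models the side-effect of an inter-processor interrupt on the target. */
void inter_processor_interrupt( per_cpu_control *cpu )
{
  atomic_store_explicit(
    &cpu->dispatch_necessary,
    true,
    memory_order_release
  );
}

/*
 * Target side: the heavy-weight path is entered only if the indicator is
 * set.  Returns true if the executing thread changed.
 */
bool thread_dispatch( per_cpu_control *cpu )
{
  bool switched = false;

  while (
    atomic_exchange_explicit(
      &cpu->dispatch_necessary,
      false,
      memory_order_acquire
    )
  ) {
    thread_control *heir =
      atomic_load_explicit( &cpu->heir, memory_order_relaxed );

    if ( heir != cpu->executing ) {
      cpu->executing = heir; /* Stands in for the context switch */
      switched = true;
    }
  }

  return switched;
}
```

If the heir is updated several times before the target processor reacts, only the most recent heir is used; the earlier requests are simply out of date.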
368 | The context-switch is performed with interrupts enabled. During the transition |
---|
369 | from the executing to the heir thread neither the stack of the executing nor |
---|
370 | the heir thread must be used during interrupt processing. For this purpose a |
---|
371 | temporary per-processor stack is set up which may be used by the interrupt |
---|
372 | prologue before the stack is switched to the interrupt stack. |
---|
373 | |
---|
374 | Application Issues |
---|
375 | ================== |
---|
376 | |
---|
377 | Most operating system services provided by the uni-processor RTEMS are |
---|
378 | available in SMP configurations as well. However, applications designed for a |
---|
379 | uni-processor environment may need some changes to correctly run in an SMP |
---|
380 | configuration. |
---|
381 | |
---|
382 | As discussed earlier, SMP systems have opportunities for true parallelism which |
---|
383 | was not possible on uni-processor systems. Consequently, multiple techniques |
---|
384 | that provided adequate critical sections on uni-processor systems are unsafe on |
---|
385 | SMP systems. In this section, some of these unsafe techniques will be |
---|
386 | discussed. |
---|
387 | |
---|
388 | In general, applications must use proper operating system provided mutual |
---|
389 | exclusion mechanisms to ensure correct behavior. |
---|
390 | |
---|
391 | Task variables |
---|
392 | -------------- |
---|
393 | |
---|
394 | Task variables are ordinary global variables with a dedicated value for each |
---|
395 | thread. During a context switch from the executing thread to the heir thread, |
---|
396 | the value of each task variable is saved to the thread control block of the |
---|
397 | executing thread and restored from the thread control block of the heir thread. |
---|
398 | This is inherently broken if more than one executing thread exists. |
---|
399 | Alternatives to task variables are POSIX keys and :ref:`TLS <TLS>`. All use |
---|
400 | cases of task variables in the RTEMS code base were replaced with alternatives. |
---|
401 | The task variable API has been removed in RTEMS 4.12. |
---|
402 | |
---|
403 | Highest Priority Thread Never Walks Alone |
---|
404 | ----------------------------------------- |
---|
405 | |
---|
406 | On a uni-processor system, it is safe to assume that when the highest priority |
---|
407 | task in an application executes, it will execute without being preempted until |
---|
408 | it voluntarily blocks. Interrupts may occur while it is executing, but there |
---|
409 | will be no context switch to another task unless the highest priority task |
---|
410 | voluntarily initiates it. |
---|
411 | |
---|
412 | Given the assumption that no other tasks will have their execution interleaved |
---|
413 | with the highest priority task, it is possible for this task to be constructed |
---|
414 | such that it does not need to acquire a mutex for protected access to shared |
---|
415 | data. |
---|
416 | |
---|
417 | In an SMP system, it cannot be assumed that only a single task will be |
---|
418 | executing. It should be assumed that every processor is executing another |
---|
419 | application task. Further, those tasks will be ones which would not have been |
---|
420 | executed in a uni-processor configuration and should be assumed to have data |
---|
421 | synchronization conflicts with what was formerly the highest priority task |
---|
422 | which executed without conflict. |
---|
423 | |
---|
424 | Disabling of Thread Pre-Emption |
---|
425 | ------------------------------- |
---|
426 | |
---|
427 | A thread which disables pre-emption prevents a higher priority thread from |
---|
428 | taking its processor involuntarily. In uni-processor configurations, this can |
---|
429 | be used to ensure mutual exclusion at thread level. In SMP configurations, |
---|
430 | however, more than one executing thread may exist. Thus, it is impossible to |
---|
431 | ensure mutual exclusion using this mechanism. To prevent applications that |
---|
432 | disable pre-emption for this purpose from showing inappropriate |
---|
433 | behaviour, this feature is disabled in SMP configurations and its use would |
---|
434 | cause run-time errors. |
---|
435 | |
---|
436 | Disabling of Interrupts |
---|
437 | ----------------------- |
---|
438 | |
---|
439 | A low overhead means to ensure mutual exclusion in uni-processor |
---|
440 | configurations is the disabling of interrupts around a critical section. This |
---|
441 | is commonly used in device driver code. In SMP configurations, however, |
---|
442 | disabling the interrupts on one processor has no effect on other processors. |
---|
443 | So, this is insufficient to ensure system-wide mutual exclusion. The macros |
---|
444 | |
---|
445 | * :ref:`rtems_interrupt_disable() <rtems_interrupt_disable>`, |
---|
446 | |
---|
447 | * :ref:`rtems_interrupt_enable() <rtems_interrupt_enable>`, and |
---|
448 | |
---|
449 | * :ref:`rtems_interrupt_flash() <rtems_interrupt_flash>`. |
---|
450 | |
---|
451 | are disabled in SMP configurations and their use will cause compile-time warnings |
---|
452 | and link-time errors. In the unlikely case that interrupts must be disabled on |
---|
453 | the current processor, the |
---|
454 | |
---|
455 | * :ref:`rtems_interrupt_local_disable() <rtems_interrupt_local_disable>`, and |
---|
456 | |
---|
457 | * :ref:`rtems_interrupt_local_enable() <rtems_interrupt_local_enable>`. |
---|
458 | |
---|
459 | macros are now available in all configurations. |
---|
460 | |
---|
461 | Since disabling of interrupts is insufficient to ensure system-wide mutual |
---|
462 | exclusion on SMP a new low-level synchronization primitive was added -- |
---|
463 | interrupt locks. The interrupt locks are a simple API layer on top of the SMP |
---|
464 | locks used for low-level synchronization in the operating system core. |
---|
465 | Currently, they are implemented as a ticket lock. In uni-processor |
---|
466 | configurations, they degenerate to simple interrupt disable/enable sequences by |
---|
467 | means of the C pre-processor. It is disallowed to acquire a single interrupt |
---|
468 | lock in a nested way. This will result in an infinite loop with interrupts |
---|
469 | disabled. While converting legacy code to interrupt locks, care must be taken |
---|
470 | to prevent this situation from happening. |
---|
471 | |
---|
472 | .. code-block:: c |
---|
473 | :linenos: |
---|
474 | |
---|
475 | #include <rtems.h> |
---|
476 | |
---|
477 | void legacy_code_with_interrupt_disable_enable( void ) |
---|
478 | { |
---|
479 | rtems_interrupt_level level; |
---|
480 | |
---|
481 | rtems_interrupt_disable( level ); |
---|
482 | /* Critical section */ |
---|
483 | rtems_interrupt_enable( level ); |
---|
484 | } |
---|
485 | |
---|
486 | RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" ) |
---|
487 | |
---|
488 | void smp_ready_code_with_interrupt_lock( void ) |
---|
489 | { |
---|
490 | rtems_interrupt_lock_context lock_context; |
---|
491 | |
---|
492 | rtems_interrupt_lock_acquire( &lock, &lock_context ); |
---|
493 | /* Critical section */ |
---|
494 | rtems_interrupt_lock_release( &lock, &lock_context ); |
---|
495 | } |
---|
496 | |
---|
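The text above notes that interrupt locks are currently implemented as a ticket lock. As a rough illustration of that technique only, here is a minimal ticket lock using C11 atomics. It is not the RTEMS SMP lock: the names ``ticket_lock``, ``ticket_lock_init``, ``ticket_lock_acquire``, and ``ticket_lock_release`` are hypothetical, and the disabling of interrupts is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical ticket lock; waiters are served in FIFO order */
typedef struct {
  atomic_uint next_ticket; /* Ticket handed to the next arriving waiter */
  atomic_uint now_serving; /* Ticket of the current lock owner */
} ticket_lock;

void ticket_lock_init( ticket_lock *lock )
{
  assert( lock != NULL );
  atomic_init( &lock->next_ticket, 0 );
  atomic_init( &lock->now_serving, 0 );
}

/* Takes a ticket and spins until it is served; returns the ticket. */
unsigned int ticket_lock_acquire( ticket_lock *lock )
{
  unsigned int my_ticket;

  assert( lock != NULL );
  my_ticket = atomic_fetch_add_explicit(
    &lock->next_ticket,
    1,
    memory_order_relaxed
  );

  while (
    atomic_load_explicit( &lock->now_serving, memory_order_acquire )
      != my_ticket
  ) {
    /* Busy wait; a nested acquire by the owner would spin here forever */
  }

  return my_ticket;
}

/* Passes the lock on to the holder of the next ticket. */
void ticket_lock_release( ticket_lock *lock )
{
  unsigned int current;

  assert( lock != NULL );
  current = atomic_load_explicit( &lock->now_serving, memory_order_relaxed );
  atomic_store_explicit(
    &lock->now_serving,
    current + 1,
    memory_order_release
  );
}
```

The busy wait loop also shows why a nested acquire of the same lock cannot succeed: the owner would wait for a ticket that is only served after its own release.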
497 | An alternative to the RTEMS-specific interrupt locks are POSIX spinlocks. The |
---|
498 | :c:type:`pthread_spinlock_t` is defined as a self-contained object, i.e. the |
---|
499 | user must provide the storage for this synchronization object. |
---|
500 | |
---|
501 | .. code-block:: c |
---|
502 | :linenos: |
---|
503 | |
---|
504 | #include <assert.h> |
---|
505 | #include <pthread.h> |
---|
506 | |
---|
507 | pthread_spinlock_t lock; |
---|
508 | |
---|
509 | void smp_ready_code_with_posix_spinlock( void ) |
---|
510 | { |
---|
511 | int error; |
---|
512 | |
---|
513 | error = pthread_spin_lock( &lock ); |
---|
514 | assert( error == 0 ); |
---|
515 | /* Critical section */ |
---|
516 | error = pthread_spin_unlock( &lock ); |
---|
517 | assert( error == 0 ); |
---|
518 | } |
---|
519 | |
---|
520 | In contrast to the POSIX spinlock implementations on Linux or FreeBSD, it is not |
---|
521 | allowed to call blocking operating system services inside the critical section. |
---|
522 | A recursive lock attempt is a severe usage error resulting in an infinite loop |
---|
523 | with interrupts disabled. Nesting of different locks is allowed. The user |
---|
524 | must ensure that no deadlock can occur. As a non-portable feature the locks |
---|
525 | are zero-initialized, e.g. statically initialized global locks reside in the |
---|
526 | ``.bss`` section and there is no need to call :c:func:`pthread_spin_init`. |
---|
527 | |
---|
528 | Interrupt Service Routines Execute in Parallel With Threads |
---|
529 | ----------------------------------------------------------- |
---|
530 | |
---|
531 | On a machine with more than one processor, interrupt service routines (this |
---|
532 | includes timer service routines installed via :ref:`rtems_timer_fire_after() |
---|
533 | <rtems_timer_fire_after>`) and threads can execute in parallel. Interrupt |
---|
534 | service routines must take this into account and use proper locking mechanisms |
---|
535 | to protect critical sections from interference by threads (interrupt locks or |
---|
536 | POSIX spinlocks). This likely requires code modifications in legacy device |
---|
537 | drivers. |
---|
538 | |
---|
539 | Timers Do Not Stop Immediately |
---|
540 | ------------------------------ |
---|
541 | |
---|
542 | Timer service routines run in the context of the clock interrupt. On |
---|
543 | uni-processor configurations, it is sufficient to disable interrupts and remove |
---|
544 | a timer from the set of active timers to stop it. In SMP configurations, |
---|
545 | however, the timer service routine may already run and wait on an SMP lock |
---|
546 | owned by the thread which is about to stop the timer. This opens the door to |
---|
547 | subtle synchronization issues. During destruction of objects, special care |
---|
548 | must be taken to ensure that timer service routines cannot access (partly or |
---|
549 | fully) destroyed objects. |
---|
550 | |
---|
551 | False Sharing of Cache Lines Due to Objects Table |
---|
552 | ------------------------------------------------- |
---|
553 | |
---|
554 | The Classic API and most POSIX API objects are indirectly accessed via an |
---|
555 | object identifier. The user-level functions validate the object identifier and |
---|
556 | map it to the actual object structure which resides in a global objects table |
---|
557 | for each object class. So, unrelated objects are packed together in a table. |
---|
558 | This may result in false sharing of cache lines. The effect of false sharing |
---|
559 | of cache lines can be observed with the `TMFINE 1 |
---|
560 | <https://git.rtems.org/rtems/tree/testsuites/tmtests/tmfine01>`_ test program |
---|
561 | on a suitable platform, e.g. QorIQ T4240. High-performance SMP applications |
---|
562 | need full control of the object storage :cite:`Drepper:2007:Memory`. |
---|
563 | Therefore, self-contained synchronization objects are now available for RTEMS. |
---|
564 | |
---|
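With full control of the object storage, an application can place frequently written objects on separate cache lines. The following sketch illustrates the idea with plain counters; the cache line size of 64 bytes is an assumption which must be adapted to the target, and the names ``packed_counter``, ``padded_counter``, and ``same_cache_line`` are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64 bytes is a common value; the real size is platform-specific */
#define CACHE_LINE_SIZE 64

/* Packed: neighbouring array elements may share one cache line. */
typedef struct {
  unsigned long value;
} packed_counter;

/* Padded: the alignment forces each element onto its own cache line. */
typedef struct {
  alignas( CACHE_LINE_SIZE ) unsigned long value;
} padded_counter;

static_assert(
  sizeof( padded_counter ) == CACHE_LINE_SIZE,
  "a padded counter occupies exactly one cache line"
);

/* Returns true if both addresses fall into the same cache line. */
bool same_cache_line( const void *a, const void *b )
{
  return ( (uintptr_t) a / CACHE_LINE_SIZE )
    == ( (uintptr_t) b / CACHE_LINE_SIZE );
}
```

The price of the padded layout is memory: each counter now consumes a full cache line instead of a few bytes.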
565 | Directives |
---|
566 | ========== |
---|
567 | |
---|
568 | This section details the symmetric multiprocessing services. A subsection is |
---|
569 | dedicated to each of these services and describes the calling sequence, related |
---|
570 | constants, usage, and status codes. |
---|
571 | |
---|
572 | .. raw:: latex |
---|
573 | |
---|
574 | \clearpage |
---|
575 | |
---|
576 | .. _rtems_get_processor_count: |
---|
577 | |
---|
578 | GET_PROCESSOR_COUNT - Get processor count |
---|
579 | ----------------------------------------- |
---|
580 | |
---|
581 | CALLING SEQUENCE: |
---|
582 | .. code-block:: c |
---|
583 | |
---|
584 | uint32_t rtems_get_processor_count(void); |
---|
585 | |
---|
586 | DIRECTIVE STATUS CODES: |
---|
587 | The count of processors in the system. |
---|
588 | |
---|
589 | DESCRIPTION: |
---|
590 | In uni-processor configurations, a value of one will be returned. |
---|
591 | |
---|
592 | In SMP configurations, this returns the value of a global variable set |
---|
593 | during system initialization to indicate the count of utilized processors. |
---|
594 | The processor count depends on the physically or virtually available |
---|
595 | processors and application configuration. The value will always be less |
---|
596 | than or equal to the maximum count of application configured processors. |
---|
597 | |
---|
598 | NOTES: |
---|
599 | None. |
---|
600 | |
---|
601 | .. raw:: latex |
---|
602 | |
---|
603 | \clearpage |
---|
604 | |
---|
605 | .. _rtems_get_current_processor: |
---|
606 | |
---|
607 | GET_CURRENT_PROCESSOR - Get current processor index |
---|
608 | --------------------------------------------------- |
---|
609 | |
---|
610 | CALLING SEQUENCE: |
---|
611 | .. code-block:: c |
---|
612 | |
---|
613 | uint32_t rtems_get_current_processor(void); |
---|
614 | |
---|
615 | DIRECTIVE STATUS CODES: |
---|
616 | The index of the current processor. |
---|
617 | |
---|
618 | DESCRIPTION: |
---|
619 | In uni-processor configurations, a value of zero will be returned. |
---|
620 | |
---|
621 | In SMP configurations, an architecture specific method is used to obtain the |
---|
622 | index of the current processor in the system. The set of processor indices |
---|
623 | is the range of integers starting with zero up to the processor count minus |
---|
624 | one. |
---|
625 | |
---|
626 | Outside of sections with disabled thread dispatching the current processor |
---|
627 | index may change after every instruction since the thread may migrate from |
---|
628 | one processor to another. Sections with disabled interrupts are sections |
---|
629 | with thread dispatching disabled. |
---|
630 | |
---|
631 | NOTES: |
---|
632 | None. |
---|