@c
@c COPYRIGHT (c) 2014.
@c On-Line Applications Research Corporation (OAR).
@c All rights reserved.
@c

@chapter Symmetric Multiprocessing Services

@section Introduction

This chapter describes the services related to Symmetric Multiprocessing
provided by RTEMS.

The application level services currently provided are:

@itemize @bullet
@item @code{rtems_get_processor_count} - Get processor count
@item @code{rtems_get_current_processor} - Get current processor index
@item @code{rtems_task_get_affinity} - Get task processor affinity
@item @code{rtems_task_set_affinity} - Set task processor affinity
@end itemize

@c
@c
@c
@section Background

@subsection Uniprocessor versus SMP Parallelism

Uniprocessor systems have long been used in embedded systems. In this hardware
model, there are some system execution characteristics which have long been
taken for granted:

@itemize @bullet
@item one task executes at a time
@item hardware events result in interrupts
@end itemize

There is no true parallelism. Even when interrupts appear to occur
at the same time, they are processed in a largely serial fashion.
This is true even when the interrupt service routines are allowed to
nest. From a tasking viewpoint, it is the responsibility of the real-time
operating system to simulate parallelism by switching between tasks.
These task switches occur in response to hardware interrupt events and explicit
application events such as blocking for a resource or delaying.

With symmetric multiprocessing, the presence of multiple processors
allows for true concurrency and provides for cost-effective performance
improvements. Uniprocessors tend to increase performance by increasing
clock speed and complexity. This tends to lead to hot, power-hungry
microprocessors which are poorly suited for many embedded applications.

The true concurrency is in sharp contrast to the single task and
interrupt model of uniprocessor systems. This results in a fundamental
change to the uniprocessor system characteristics listed above. Developers
are faced with a different set of characteristics which, in turn, break
some existing assumptions and result in new challenges. In an SMP system
with N processors, these are the new execution characteristics:

@itemize @bullet
@item N tasks execute in parallel
@item hardware events result in interrupts
@end itemize

There is true parallelism with a task executing on each processor and
the possibility of interrupts occurring on each processor. Thus, in contrast
to there being one task and one interrupt to consider on a uniprocessor,
there are N tasks and potentially N simultaneous interrupts to consider
on an SMP system.

This increase in hardware complexity and the presence of true parallelism
mean that the application developer must be even more cautious
about mutual exclusion and shared data access than in a uniprocessor
embedded system. Race conditions that never or rarely happened when an
application executed on a uniprocessor system become much more likely
due to multiple threads executing in parallel. On a uniprocessor system,
these race conditions would only happen when a task switch occurred at
just the wrong moment. Now there are N-1 other tasks executing in parallel
all the time and this results in many more opportunities for small
windows in critical sections to be hit.

@subsection Task Affinity

@cindex task affinity
@cindex thread affinity

RTEMS provides services to manipulate the affinity of a task. Affinity
is used to specify the subset of processors in an SMP system on which
a particular task can execute.

By default, tasks have an affinity which allows them to execute on any
available processor.

Task affinity is an optional feature that SMP-aware schedulers may
support, and only a subset of the available schedulers do so. Although
the behavior is scheduler specific, a scheduler that does not support
affinity is likely to ignore all attempts to set it.

@subsection Critical Section Techniques and SMP

As discussed earlier, SMP systems have opportunities for true parallelism
which were not possible on uniprocessor systems. Consequently, several
techniques that provided adequate critical sections on uniprocessor
systems are unsafe on SMP systems. In this section, some of these
unsafe techniques will be discussed.

In general, applications must use proper operating system provided mutual
exclusion mechanisms to ensure correct behavior. This primarily means
the use of binary semaphores or mutexes to implement critical sections.

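
The following is a minimal sketch of a mutex-based critical section. It
assumes the POSIX threads API is available in the RTEMS configuration; the
Classic API semaphore directives would serve the same purpose. The names
@code{counter_lock}, @code{shared_counter}, @code{worker}, and
@code{run_workers} are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8  /* illustrative upper bound, not an RTEMS limit */

/* Shared state guarded by a mutex (hypothetical names). */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

struct worker_args {
  long iterations;
};

/* Each worker increments the shared counter inside a critical section. */
static void *worker(void *arg)
{
  const struct worker_args *args = arg;

  for (long i = 0; i < args->iterations; ++i) {
    pthread_mutex_lock(&counter_lock);
    ++shared_counter;  /* the read-modify-write is exclusive here */
    pthread_mutex_unlock(&counter_lock);
  }
  return NULL;
}

/*
 * Start `workers` threads, wait for them, and return the final counter.
 * Returns -1 on a bad argument or if a thread could not be created.
 */
static long run_workers(int workers, long iterations)
{
  pthread_t threads[MAX_WORKERS];
  struct worker_args args = { iterations };
  int started = 0;

  if (workers < 1 || workers > MAX_WORKERS) {
    return -1;
  }

  pthread_mutex_lock(&counter_lock);
  shared_counter = 0;
  pthread_mutex_unlock(&counter_lock);

  for (; started < workers; ++started) {
    if (pthread_create(&threads[started], NULL, worker, &args) != 0) {
      break;
    }
  }
  for (int i = 0; i < started; ++i) {
    pthread_join(threads[i], NULL);
  }

  /* All workers have been joined, so reading the counter is safe. */
  return started == workers ? shared_counter : -1;
}
```

With the lock in place, @code{run_workers(4, 10000)} yields 40000 regardless
of how the workers are spread across processors. Without it, increments can
be lost whenever two processors update the counter at the same time.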
@subsubsection Disable Interrupts

Again, on a uniprocessor system, there is only a single processor which
logically executes a single task and takes interrupts. On an SMP system,
each processor may take an interrupt. When the application disables
interrupts, it generally does so by altering a processor register to
mask interrupts and later to re-enable them. On a uniprocessor system,
changing this in the single processor is sufficient. However, on an SMP
system, this register in @strong{ALL} processors would have to be changed,
and there is no comparable capability in an SMP system to disable all
interrupts across all processors.

@subsubsection Highest Priority Task Assumption

On a uniprocessor system, it is safe to assume that when the highest
priority task in an application executes, it will execute without being
preempted until it voluntarily blocks. Interrupts may occur while it is
executing, but there will be no context switch to another task unless
the highest priority task voluntarily initiates it.

Given the assumption that no other tasks will have their execution
interleaved with the highest priority task, it is possible for this
task to be constructed such that it does not need to acquire a binary
semaphore or mutex for protected access to shared data.

In an SMP system, it cannot be assumed that only a single task will
ever be executing. It should be assumed that every processor is executing
another application task. Further, those tasks will be ones which would not
have been executed in a uniprocessor configuration and should be assumed to
have data synchronization conflicts with what was formerly the highest
priority task which executed without conflict.

@subsubsection Disable Preemption

On a uniprocessor system, disabling preemption in a task is very similar
to making the highest priority task assumption. While preemption is
disabled, no task context switches will occur unless the task initiates
them voluntarily. On an SMP system, however, just as with the highest
priority task assumption, there are N-1 other processors also running tasks.
Thus the assumption that no other tasks will run while the task has
preemption disabled is violated.

@subsection Task Unique Data and SMP

Per task variables are a service commonly provided by real-time operating
systems for application use. They work by allowing the application
to specify a location in memory (typically a @code{void *}) which is
logically added to the context of a task. On each task switch, the
value at that location is saved for the outgoing task and restored for
the incoming task, so each task can have a unique value in the same
memory location. This memory location is directly accessed as a
variable in a program.

This works well in a uniprocessor environment because there is one task
executing and one memory location containing a task-specific value. On
an SMP system, however, there are always N tasks executing. With only
one location in memory, N-1 tasks will not have the correct value.

This paradigm for providing task unique data values is fundamentally
broken on SMP systems.

@subsubsection Classic API Per Task Variables

The Classic API provides three directives to support per task variables.
These are:

@itemize @bullet
@item @code{@value{DIRPREFIX}task_variable_add} - Associate per task variable
@item @code{@value{DIRPREFIX}task_variable_get} - Obtain value of a per task variable
@item @code{@value{DIRPREFIX}task_variable_delete} - Remove per task variable
@end itemize

As task variables are unsafe for use on SMP systems, the use of these
services should be eliminated in all software that is to be used in
an SMP environment. It is recommended that the application developer
consider the use of POSIX Keys or Thread Local Storage (TLS). POSIX Keys
are not enabled in all RTEMS configurations.

@b{STATUS}: As of March 2014, some support services in the
@code{rtems/cpukit} use per task variables. When these uses are
eliminated, the per task variable directives will be disabled when
building RTEMS in SMP configuration.

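
As an illustration of the POSIX Keys alternative recommended above, the
sketch below gives each thread its own value under a single key. It assumes
a configuration in which POSIX Keys are enabled. The names
@code{task_value_key}, @code{task_body}, and @code{check_per_thread_values}
are hypothetical; only the @code{pthread_*} calls are standard POSIX.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static pthread_key_t task_value_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static int key_status = -1;

/* Create the key exactly once, no matter which thread gets here first. */
static void create_key(void)
{
  /* No destructor: the stored values are plain integers, not heap memory. */
  key_status = pthread_key_create(&task_value_key, NULL);
}

struct task_result {
  intptr_t assigned;  /* value this thread stores under the key */
  intptr_t observed;  /* value this thread reads back afterwards */
};

static void *task_body(void *arg)
{
  struct task_result *result = arg;

  pthread_once(&key_once, create_key);
  if (key_status != 0) {
    return NULL;  /* key creation failed; observed stays at its default */
  }

  pthread_setspecific(task_value_key, (void *) result->assigned);
  /* Another thread may store its own value concurrently; ours is separate. */
  result->observed = (intptr_t) pthread_getspecific(task_value_key);
  return NULL;
}

/* Run two threads with different values; return 0 if each saw its own. */
static int check_per_thread_values(void)
{
  struct task_result a = { 1, 0 };
  struct task_result b = { 2, 0 };
  pthread_t ta;
  pthread_t tb;

  if (pthread_create(&ta, NULL, task_body, &a) != 0) {
    return -1;
  }
  if (pthread_create(&tb, NULL, task_body, &b) != 0) {
    pthread_join(ta, NULL);
    return -1;
  }
  pthread_join(ta, NULL);
  pthread_join(tb, NULL);

  return (a.observed == 1 && b.observed == 2) ? 0 : 1;
}
```

Unlike a per task variable, the key does not rely on one shared memory
location being swapped at each task switch, so the values stay distinct even
when both threads run at the same time on different processors.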
@c
@c
@c
@section Operations

@subsection Setting Affinity to a Single Processor

In many embedded applications targeting SMP systems, it is common to lock
individual tasks to specific cores. In this way, one can designate a core for
I/O tasks, another for computation, etc. The following illustrates the code
sequence necessary to assign a task an affinity for processor zero (0).

@example
rtems_status_code sc;
cpu_set_t         set;

/* Clear the set, then allow only processor 0. */
CPU_ZERO( &set );
CPU_SET( 0, &set );

sc = rtems_task_set_affinity(rtems_task_self(), sizeof(set), &set);
assert(sc == RTEMS_SUCCESSFUL);
@end example

It is important to note that the @code{cpu_set_t} is not validated until the
@code{@value{DIRPREFIX}task_set_affinity} call is made. At that point,
it is validated against the current system configuration.

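
Because the set is only validated when the directive is called, an
application may prefer to check a processor index before building the set.
The sketch below shows one way to do that. It is written against the glibc
@code{<sched.h>} macros so that it can be compiled on a host; this is an
assumption, and an RTEMS build would obtain @code{cpu_set_t} and the
@code{CPU_*} macros from its own headers. The helper
@code{make_single_processor_set} is hypothetical, and its check is an
application-side guard, not a description of the validation RTEMS performs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* glibc exposes the CPU_* macros only with this set */
#endif
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a set that allows only processor `cpu`.
 * `processor_count` would come from rtems_get_processor_count() on RTEMS.
 * Returns false and leaves the set empty when `cpu` is out of range.
 */
static bool make_single_processor_set(
  uint32_t   cpu,
  uint32_t   processor_count,
  cpu_set_t *set
)
{
  CPU_ZERO(set);

  if (cpu >= processor_count || cpu >= (uint32_t) CPU_SETSIZE) {
    return false;
  }

  CPU_SET(cpu, set);
  return true;
}
```

On success the resulting set can be passed, together with
@code{sizeof(cpu_set_t)}, to @code{rtems_task_set_affinity}.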
@c
@c
@c
@section Directives

This section details the symmetric multiprocessing services. A subsection
is dedicated to each of these services and describes the calling sequence,
related constants, usage, and status codes.

@c
@c rtems_get_processor_count
@c
@page
@subsection GET_PROCESSOR_COUNT - Get processor count

@subheading CALLING SEQUENCE:

@ifset is-C
@example
uint32_t rtems_get_processor_count(void);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

The count of processors in the system.

@subheading DESCRIPTION:

On uniprocessor configurations, a value of one will be returned.

On SMP configurations, this returns the value of a global variable set during
system initialization to indicate the count of utilized processors. The
processor count depends on the physically or virtually available processors and
application configuration. The value will always be less than or equal to the
maximum count of application configured processors.

@subheading NOTES:

None.

@c
@c rtems_get_current_processor
@c
@page
@subsection GET_CURRENT_PROCESSOR - Get current processor index

@subheading CALLING SEQUENCE:

@ifset is-C
@example
uint32_t rtems_get_current_processor(void);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

The index of the current processor.

@subheading DESCRIPTION:

On uniprocessor configurations, a value of zero will be returned.

On SMP configurations, an architecture specific method is used to obtain the
index of the current processor in the system. The set of processor indices is
the range of integers starting with zero up to the processor count minus one.

Outside of sections with disabled thread dispatching, the current processor
index may change after every instruction since the thread may migrate from one
processor to another. Sections with disabled interrupts are sections with
thread dispatching disabled.

@subheading NOTES:

None.

@c
@c rtems_task_get_affinity
@c
@page
@subsection TASK_GET_AFFINITY - Get task processor affinity

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_get_affinity(
  rtems_id   id,
  size_t     cpusetsize,
  cpu_set_t *cpuset
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{cpuset} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid task id@*
@code{@value{RPREFIX}INVALID_NUMBER} - the affinity set buffer is too small for
the current processor affinity set of the task

@subheading DESCRIPTION:

Returns the current processor affinity set of the task in @code{cpuset}. A set
bit in the affinity set means that the task can execute on this processor and a
cleared bit means the opposite.

@subheading NOTES:

None.

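
To make the meaning of set and cleared bits concrete, the sketch below walks
an affinity set and collects the indices of the processors on which a task
may execute. It uses the glibc @code{<sched.h>} macros so that it compiles on
a host, which is an assumption; an RTEMS build would use its own
@code{cpu_set_t} headers. The helper @code{list_allowed_processors} is
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* glibc exposes the CPU_* macros only with this set */
#endif
#include <assert.h>
#include <sched.h>
#include <stdint.h>

/*
 * Hypothetical helper: store the indices of the processors allowed by `set`
 * in `out` (at most `out_len` entries) and return how many were found among
 * the first `processor_count` processors.
 */
static uint32_t list_allowed_processors(
  const cpu_set_t *set,
  uint32_t         processor_count,
  uint32_t        *out,
  uint32_t         out_len
)
{
  uint32_t found = 0;

  for (
    uint32_t cpu = 0;
    cpu < processor_count && cpu < (uint32_t) CPU_SETSIZE;
    ++cpu
  ) {
    if (CPU_ISSET(cpu, set)) {  /* set bit: the task may run here */
      if (found < out_len) {
        out[found] = cpu;
      }
      ++found;
    }
  }

  return found;
}
```

On RTEMS, the set passed to such a helper would be the one filled in by
@code{rtems_task_get_affinity}, and @code{processor_count} would come from
@code{rtems_get_processor_count}.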
@c
@c rtems_task_set_affinity
@c
@page
@subsection TASK_SET_AFFINITY - Set task processor affinity

@subheading CALLING SEQUENCE:

@ifset is-C
@example
rtems_status_code rtems_task_set_affinity(
  rtems_id         id,
  size_t           cpusetsize,
  const cpu_set_t *cpuset
);
@end example
@end ifset

@ifset is-Ada
@end ifset

@subheading DIRECTIVE STATUS CODES:

@code{@value{RPREFIX}SUCCESSFUL} - successful operation@*
@code{@value{RPREFIX}INVALID_ADDRESS} - @code{cpuset} is NULL@*
@code{@value{RPREFIX}INVALID_ID} - invalid task id@*
@code{@value{RPREFIX}INVALID_NUMBER} - invalid processor affinity set

@subheading DESCRIPTION:

Sets the processor affinity set of the task identified by @code{id} to the
set specified by @code{cpuset}. A set bit in the affinity set means that the
task can execute on this processor and a cleared bit means the opposite.

@subheading NOTES:

None.