1 | @c |
---|
2 | @c COPYRIGHT (c) 1988-2002. |
---|
3 | @c On-Line Applications Research Corporation (OAR). |
---|
4 | @c All rights reserved. |
---|
5 | @c |
---|
6 | @c $Id$ |
---|
7 | @c |
---|
8 | |
---|
9 | @chapter Timing Specification |
---|
10 | |
---|
11 | @section Introduction |
---|
12 | |
---|
13 | This chapter provides information pertaining to the |
---|
14 | measurement of the performance of RTEMS, the methods of |
---|
15 | gathering the timing data, and the usefulness of the data. Also |
---|
16 | discussed are other time critical aspects of RTEMS that affect |
---|
17 | an application's design and ultimate throughput. These aspects |
---|
18 | include determinancy, interrupt latency and context switch times. |
---|
19 | |
---|
20 | @section Philosophy |
---|
21 | |
---|
22 | Benchmarks are commonly used to evaluate the |
---|
23 | performance of software and hardware. Benchmarks can be an |
---|
24 | effective tool when comparing systems. Unfortunately, |
---|
25 | benchmarks can also be manipulated to justify virtually any |
---|
26 | claim. Benchmarks of real-time executives are difficult to |
---|
27 | evaluate for a variety of reasons. Executives vary in the |
---|
28 | robustness of features and options provided. Even when |
---|
29 | executives compare favorably in functionality, it is quite |
---|
30 | likely that different methodologies were used to obtain the |
---|
31 | timing data. Another problem is that some executives provide |
---|
32 | times for only a small subset of directives. This is typically |
---|
33 | justified by claiming that these are the only time-critical |
---|
34 | directives. The performance of some executives is also very |
---|
35 | sensitive to the number of objects in the system. To obtain any |
---|
36 | measure of usefulness, the performance information provided for |
---|
37 | an executive should address each of these issues. |
---|
38 | |
---|
39 | When evaluating the performance of a real-time |
---|
40 | executive, one typically considers the following areas: |
---|
41 | determinancy, directive times, worst case interrupt latency, and |
---|
42 | context switch time. Unfortunately, these areas do not have |
---|
43 | standard measurement methodologies. This allows vendors to |
---|
44 | manipulate the results such that their product is favorably |
---|
45 | represented. We have attempted to provide useful and meaningful |
---|
46 | timing information for RTEMS. To ensure the usefulness of our |
---|
47 | data, the methodology and definitions used to obtain and |
---|
48 | describe the data are also documented. |
---|
49 | |
---|
50 | @subsection Determinancy |
---|
51 | |
---|
52 | The correctness of data in a real-time system must |
---|
53 | always be judged by its timeliness. In many real-time systems, |
---|
54 | obtaining the correct answer does not necessarily solve the |
---|
55 | problem. For example, in a nuclear reactor it is not enough to |
---|
56 | determine that the core is overheating. This situation must be |
---|
57 | detected and acknowledged early enough that corrective action |
---|
58 | can be taken and a meltdown avoided. |
---|
59 | |
---|
60 | Consequently, a system designer must be able to |
---|
61 | predict the worst-case behavior of the application running under |
---|
62 | the selected executive. In this light, it is important that a |
---|
63 | real-time system perform consistently regardless of the number |
---|
64 | of tasks, semaphores, or other resources allocated. An |
---|
65 | important design goal of a real-time executive is that all |
---|
66 | internal algorithms be fixed-cost. Unfortunately, this goal is |
---|
67 | difficult to completely meet without sacrificing the robustness |
---|
68 | of the executive's feature set. |
---|
69 | |
---|
70 | Many executives use the term deterministic to mean |
---|
71 | that the execution times of their services can be predicted. |
---|
72 | However, they often provide formulas to modify execution times |
---|
73 | based upon the number of objects in the system. This usage is |
---|
74 | in sharp contrast to the notion of deterministic meaning fixed |
---|
75 | cost. |
---|
76 | |
---|
77 | Almost all RTEMS directives execute in a fixed amount |
---|
78 | of time regardless of the number of objects present in the |
---|
79 | system. The primary exception occurs when a task blocks while |
---|
80 | acquiring a resource and specifies a non-zero timeout interval. |
---|
81 | |
---|
82 | Other exceptions are message queue broadcast, |
---|
83 | obtaining a variable length memory block, object name to ID |
---|
84 | translation, and deleting a resource upon which tasks are |
---|
85 | waiting. In addition, the time required to service a clock tick |
---|
86 | interrupt is based upon the number of timeouts and other |
---|
87 | "events" which must be processed at that tick. This second |
---|
88 | group is composed primarily of capabilities which are inherently |
---|
89 | non-deterministic but are infrequently used in time critical |
---|
90 | situations. The major exception is that of servicing a clock |
---|
91 | tick. However, most applications have a very small number of |
---|
92 | timeouts which expire at exactly the same millisecond (usually |
---|
93 | none, but occasionally two or three). |
---|
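
The difference between a fixed-cost operation and one whose cost depends on the number of objects can be sketched in C. The following is a minimal illustration only: the table layout and the names @code{object_t}, @code{lookup_by_id}, and @code{lookup_by_name} are hypothetical and do not describe the RTEMS implementation. It assumes that an ID can be used directly as a table index, while a name must be searched for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 8

/* Hypothetical object table entry; not the RTEMS internal layout. */
typedef struct {
    const char *name;   /* NULL marks an unused slot */
    int         value;
} object_t;

/* Fixed cost: the ID is used directly as an index, so the work done
 * does not depend on how many objects exist. */
object_t *lookup_by_id(object_t *table, size_t id)
{
    assert(table != NULL);
    return (id < MAX_OBJECTS && table[id].name != NULL) ? &table[id] : NULL;
}

/* Variable cost: the table is scanned until the name matches, so the
 * number of comparisons grows with the number of objects in use.
 * *comparisons reports how many names were compared. */
object_t *lookup_by_name(object_t *table, const char *name,
                         size_t *comparisons)
{
    assert(table != NULL && name != NULL && comparisons != NULL);
    *comparisons = 0;
    for (size_t i = 0; i < MAX_OBJECTS; i++) {
        if (table[i].name == NULL)
            continue;               /* skip unused slots */
        (*comparisons)++;
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}
```

In this sketch, a search for a name that is not present compares against every object in the table, which is the kind of object-count dependence that a fixed-cost design tries to avoid.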
94 | |
---|
95 | @subsection Interrupt Latency |
---|
96 | |
---|
97 | Interrupt latency is the delay between the CPU's |
---|
98 | receipt of an interrupt request and the execution of the first |
---|
99 | application-specific instruction in an interrupt service |
---|
100 | routine. Interrupts are a critical component of most real-time |
---|
101 | applications and it is critical that they be acted upon as |
---|
102 | quickly as possible. |
---|
103 | |
---|
104 | Knowledge of the worst case interrupt latency of an |
---|
105 | executive aids the application designer in determining the |
---|
106 | maximum period of time between the generation of an interrupt |
---|
107 | and an interrupt handler responding to that interrupt. The |
---|
108 | interrupt latency of a system is the greater of the executive's |
---|
109 | and the application's interrupt latency. If the application |
---|
110 | disables interrupts longer than the executive, then the |
---|
111 | application's interrupt latency is the system's worst case |
---|
112 | interrupt disable period. |
---|
113 | |
---|
114 | The worst case interrupt latency for a real-time |
---|
115 | executive is based upon the following components: |
---|
116 | |
---|
117 | @itemize @bullet |
---|
118 | @item the longest period of time interrupts are disabled |
---|
119 | by the executive, |
---|
120 | |
---|
121 | @item the overhead required by the executive at the |
---|
122 | beginning of each ISR, |
---|
123 | |
---|
124 | @item the time required for the CPU to vector the |
---|
125 | interrupt, and |
---|
126 | |
---|
127 | @item for some microprocessors, the length of the longest |
---|
128 | instruction. |
---|
129 | @end itemize |
---|
130 | |
---|
131 | The first component is irrelevant if an interrupt |
---|
132 | occurs when interrupts are enabled, although it must be included |
---|
133 | in a worst case analysis. The third and fourth components are |
---|
134 | particular to a CPU implementation and are not dependent on the |
---|
135 | executive. The fourth component is ignored by this document |
---|
136 | because most applications use only a subset of a |
---|
136 | microprocessor's instruction set. Because of this, the longest |
---|
138 | instruction actually executed is application dependent. The |
---|
139 | worst case interrupt latency of an executive is typically |
---|
140 | defined as the sum of components (1) and (2). The second |
---|
141 | component includes the time necessary for RTEMS to save registers |
---|
142 | and vector to the user-defined handler. RTEMS includes the |
---|
143 | third component, the time required for the CPU to vector the |
---|
144 | interrupt, because it is a required part of any interrupt. |
---|
145 | |
---|
146 | Many executives report the maximum interrupt disable |
---|
147 | period as their interrupt latency and ignore the other |
---|
148 | components. This results in very low worst-case interrupt |
---|
149 | latency times which are not indicative of actual application |
---|
150 | performance. The definition used by RTEMS results in a higher |
---|
151 | interrupt latency being reported, but accurately reflects the |
---|
152 | longest delay between the CPU's receipt of an interrupt request |
---|
153 | and the execution of the first application-specific instruction |
---|
154 | in an interrupt service routine. |
---|
155 | |
---|
156 | The actual interrupt latency times are reported in |
---|
157 | the Timing Data chapter of this supplement. |
---|
158 | |
---|
159 | @subsection Context Switch Time |
---|
160 | |
---|
161 | An RTEMS context switch is defined as the act of |
---|
162 | taking the CPU from the currently executing task and giving it |
---|
163 | to another task. This process involves the following components: |
---|
164 | |
---|
165 | @itemize @bullet |
---|
166 | @item Saving the hardware state of the current task. |
---|
167 | |
---|
168 | @item Optionally, invoking the TASK_SWITCH user extension. |
---|
169 | |
---|
170 | @item Restoring the hardware state of the new task. |
---|
171 | @end itemize |
---|
172 | |
---|
173 | RTEMS defines the hardware state of a task to include |
---|
174 | the CPU's data registers, address registers, and, optionally, |
---|
175 | floating point registers. |
---|
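
The three steps listed above can be modeled in portable C. This is a simulation under stated assumptions, not the RTEMS context switch, which is processor-specific code: the register counts, the types @code{cpu_state_t} and @code{task_t}, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_INT_REGS 8   /* arbitrary sizes chosen for the simulation */
#define NUM_FP_REGS  4

/* Simulated hardware state: integer and floating point registers. */
typedef struct {
    long   integer[NUM_INT_REGS];
    double fp[NUM_FP_REGS];
} cpu_state_t;

typedef struct {
    cpu_state_t saved;   /* state saved while the task is not running */
    bool        uses_fp; /* floating point state is optional per task */
} task_t;

/* Hypothetical stand-in for a task switch user extension. */
typedef void (*switch_extension_t)(task_t *executing, task_t *heir);

unsigned switch_count;

/* Example extension: counts how many switches have occurred. */
void count_switch(task_t *executing, task_t *heir)
{
    (void)executing;
    (void)heir;
    switch_count++;
}

void context_switch(cpu_state_t *cpu, task_t *executing, task_t *heir,
                    switch_extension_t extension)
{
    assert(cpu != NULL && executing != NULL && heir != NULL);

    /* 1. Save the hardware state of the current task. */
    memcpy(executing->saved.integer, cpu->integer, sizeof cpu->integer);
    if (executing->uses_fp)
        memcpy(executing->saved.fp, cpu->fp, sizeof cpu->fp);

    /* 2. Optionally invoke the user extension. */
    if (extension != NULL)
        extension(executing, heir);

    /* 3. Restore the hardware state of the new task. */
    memcpy(cpu->integer, heir->saved.integer, sizeof cpu->integer);
    if (heir->uses_fp)
        memcpy(cpu->fp, heir->saved.fp, sizeof cpu->fp);
}
```

The sketch makes two costs visible: a task that uses floating point moves more state on every switch, and a configured extension adds its own execution time to each switch.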
176 | |
---|
177 | Context switch time is often touted as a performance |
---|
178 | measure of real-time executives. However, a context switch is |
---|
179 | performed as part of a directive's actions and should be viewed |
---|
180 | as such when designing an application. For example, if a task |
---|
181 | is unable to acquire a semaphore and blocks, a context switch is |
---|
182 | required to transfer control from the blocking task to a new |
---|
183 | task. From the application's perspective, the context switch is |
---|
184 | a direct result of not acquiring the semaphore. In this light, |
---|
185 | the context switch time is no more relevant than the performance |
---|
186 | of any other of the executive's subroutines which are not |
---|
187 | directly accessible by the application. |
---|
188 | |
---|
189 | In spite of the inappropriateness of using the |
---|
190 | context switch time as a performance metric, RTEMS context |
---|
191 | switch times for floating point and non-floating point tasks |
---|
192 | are provided for comparison purposes. Of the executives which |
---|
193 | actually support floating point operations, many do not report |
---|
194 | floating point context switch times. |
---|
195 | This results in a reported context switch time which is |
---|
196 | meaningless for an application with floating point tasks. |
---|
197 | |
---|
198 | The actual context switch times are reported in the |
---|
199 | Timing Data chapter of this supplement. |
---|
200 | |
---|
201 | @subsection Directive Times |
---|
202 | |
---|
203 | Directives are the application's interface to the |
---|
204 | executive, and as such their execution times are critical in |
---|
205 | determining the performance of the application. For example, an |
---|
206 | application using a semaphore to protect a critical data |
---|
207 | structure should be aware of the time required to acquire and |
---|
208 | release a semaphore. In addition, the application designer can |
---|
209 | utilize the directive execution times to evaluate the |
---|
210 | performance of different synchronization and communication |
---|
211 | mechanisms. |
---|
212 | |
---|
213 | The actual directive execution times are reported in |
---|
214 | the Timing Data chapter of this supplement. |
---|
215 | |
---|
216 | @section Methodology |
---|
217 | |
---|
218 | @subsection Software Platform |
---|
219 | |
---|
220 | The RTEMS timing suite is written in C. The overhead |
---|
221 | of passing arguments to RTEMS by C is not timed. The times |
---|
222 | reported represent the amount of time from entering to exiting |
---|
223 | RTEMS. |
---|
224 | |
---|
225 | The tests are based upon one of two execution models: |
---|
226 | (1) single invocation times, and (2) average times of repeated |
---|
227 | invocations. Single invocation times are provided for |
---|
228 | directives which cannot easily be invoked multiple times in the |
---|
229 | same scenario. For example, the times reported for entering and |
---|
230 | exiting an interrupt service routine are single invocation |
---|
231 | times. The second model is used for directives which can easily |
---|
232 | be invoked multiple times in the same scenario. For example, |
---|
233 | the times reported for semaphore obtain and semaphore release |
---|
234 | are averages of multiple invocations. At least 100 invocations |
---|
235 | are used to obtain the average. |
---|
236 | |
---|
237 | @subsection Hardware Platform |
---|
238 | |
---|
239 | Since RTEMS supports a variety of processors, the |
---|
240 | hardware platform used to gather the benchmark times must also |
---|
241 | vary. Therefore, for each processor supported the hardware |
---|
242 | platform must be defined. Each definition will include a brief |
---|
243 | description of the target hardware platform including the clock |
---|
244 | speed, memory wait states encountered, and any other pertinent |
---|
245 | information. This definition may be found in the processor |
---|
246 | dependent timing data chapter within this supplement. |
---|
247 | |
---|
248 | @subsection What is measured? |
---|
249 | |
---|
250 | An effort was made to provide execution times for a |
---|
251 | large portion of RTEMS. Times were provided for most directives |
---|
252 | regardless of whether or not they are typically used in time |
---|
253 | critical code. For example, execution times are provided for |
---|
254 | all object create and delete directives, even though these are |
---|
255 | typically part of application initialization. |
---|
256 | |
---|
257 | The times include all RTEMS actions necessary in a |
---|
258 | particular scenario. For example, all times for blocking |
---|
259 | directives include the context switch necessary to transfer |
---|
260 | control to a new task. Under no circumstances is it necessary |
---|
261 | to add context switch time to the reported times. |
---|
262 | |
---|
263 | The following list describes the objects created by |
---|
264 | the timing suite: |
---|
265 | |
---|
266 | @itemize @bullet |
---|
267 | @item All tasks are non-floating point. |
---|
268 | |
---|
269 | @item All tasks are created as local objects. |
---|
270 | |
---|
271 | @item No timeouts are used on blocking directives. |
---|
272 | |
---|
273 | @item All tasks wait for objects in FIFO order. |
---|
274 | |
---|
275 | @end itemize |
---|
276 | |
---|
277 | In addition, no user extensions are configured. |
---|
278 | |
---|
279 | @subsection What is not measured? |
---|
280 | |
---|
281 | The times presented in this document are not intended |
---|
282 | to represent best or worst case times, nor are all directives |
---|
283 | included. For example, no times are provided for the initialize |
---|
284 | executive and fatal_error_occurred directives. Other than the |
---|
285 | exceptions detailed in the Determinancy section, all directives |
---|
286 | will execute in the fixed length of time given. |
---|
287 | |
---|
288 | Other than entering and exiting an interrupt service |
---|
289 | routine, all directives were executed from tasks and not from |
---|
290 | interrupt service routines. Directives invoked from ISRs, when |
---|
291 | allowable, will execute in slightly less time than when invoked |
---|
292 | from a task because rescheduling is delayed until the interrupt |
---|
293 | exits. |
---|
294 | |
---|
295 | @subsection Terminology |
---|
296 | |
---|
297 | The following is a list of phrases which are used to |
---|
298 | distinguish individual execution paths of the directives taken |
---|
299 | during the RTEMS performance analysis: |
---|
300 | |
---|
301 | @table @b |
---|
302 | @item another task |
---|
303 | The directive was performed |
---|
304 | on a task other than the calling task. |
---|
305 | |
---|
306 | @item available |
---|
307 | A task attempted to obtain a resource and |
---|
308 | immediately acquired it. |
---|
309 | |
---|
310 | @item blocked task |
---|
311 | The task operated upon by the |
---|
312 | directive was blocked waiting for a resource. |
---|
313 | |
---|
314 | @item caller blocks |
---|
315 | The requested resource was not |
---|
316 | immediately available and the calling task chose to wait. |
---|
317 | |
---|
318 | @item calling task |
---|
319 | The task invoking the directive. |
---|
320 | |
---|
321 | @item messages flushed |
---|
322 | One or more messages were flushed |
---|
323 | from the message queue. |
---|
324 | |
---|
325 | @item no messages flushed |
---|
326 | No messages were flushed from |
---|
327 | the message queue. |
---|
328 | |
---|
329 | @item not available |
---|
330 | A task attempted to obtain a resource |
---|
331 | and could not immediately acquire it. |
---|
332 | |
---|
333 | @item no reschedule |
---|
334 | The directive did not require a |
---|
335 | rescheduling operation. |
---|
336 | |
---|
337 | @item NO_WAIT |
---|
338 | A resource was not available and the |
---|
339 | calling task chose to return immediately via the NO_WAIT option |
---|
340 | with an error. |
---|
341 | |
---|
342 | @item obtain current |
---|
343 | The current value of something was |
---|
344 | requested by the calling task. |
---|
345 | |
---|
346 | @item preempts caller |
---|
347 | The release of a resource caused a |
---|
348 | task of higher priority than the calling task to be readied and it |
---|
349 | became the executing task. |
---|
350 | |
---|
351 | @item ready task |
---|
352 | The task operated upon by the directive |
---|
353 | was in the ready state. |
---|
354 | |
---|
355 | @item reschedule |
---|
356 | The actions of the directive |
---|
357 | necessitated a rescheduling operation. |
---|
358 | |
---|
359 | @item returns to caller |
---|
360 | The directive succeeded and |
---|
361 | immediately returned to the calling task. |
---|
362 | |
---|
363 | @item returns to interrupted task |
---|
364 | The instructions |
---|
365 | executed immediately following this interrupt will be in the |
---|
366 | interrupted task. |
---|
367 | |
---|
368 | @item returns to nested interrupt |
---|
369 | The instructions |
---|
370 | executed immediately following this interrupt will be in a |
---|
371 | previously interrupted ISR. |
---|
372 | |
---|
373 | @item returns to preempting task |
---|
374 | The instructions |
---|
375 | executed immediately following this interrupt or signal handler |
---|
376 | will be in a task other than the interrupted task. |
---|
377 | |
---|
378 | @item signal to self |
---|
379 | The signal set was sent to the |
---|
380 | calling task and signal processing was enabled. |
---|
381 | |
---|
382 | @item suspended task |
---|
383 | The task operated upon by the |
---|
384 | directive was in the suspended state. |
---|
385 | |
---|
386 | @item task readied |
---|
387 | The release of a resource caused a |
---|
388 | task of lower or equal priority to be readied and the calling |
---|
389 | task remained the executing task. |
---|
390 | |
---|
391 | @item yield |
---|
392 | The act of attempting to voluntarily release |
---|
393 | the CPU. |
---|
394 | |
---|
395 | @end table |
---|
396 | |
---|