$Id$

BSP support middleware for 'new-exception' style PPC.

T. Straumann, 12/2007

EXPLANATION OF SOME TERMS
=========================

In this README we refer to exceptions and sometimes
to 'interrupts'. Interrupts are simply asynchronous
exceptions, such as 'external' exceptions or
'decrementer'/'timer' exceptions.

Traditionally (in the libbsp/powerpc/shared implementation),
synchronous exceptions are handled entirely in the context
of the interrupted task: the exception handlers use
the task's stack and leave thread-dispatching enabled,
so scheduling is allowed to happen 'in the middle'
of an exception handler.

Asynchronous exceptions/interrupts, on the other hand, use
a dedicated interrupt stack and defer scheduling until after
the last nested ISR has finished.

RATIONALE
=========

The 'new-exception' processing API works at a rather
low level. It provides functions for installing
low-level code (which must be written in assembly)
directly into the PPC vector area.
It is left entirely to the BSP to implement low-level
exception handlers, an API for C-level exception
handlers, and the RTEMS interrupt API defined in
cpukit/include/rtems/irq.h.

The result has been a Darwinian evolution of variants
of this code, which is very hard to maintain. Mostly,
the four files

libbsp/powerpc/shared/vectors/vectors.S
    (low-level handlers for 'normal' or 'synchronous'
    exceptions. This code saves all registers on
    the interrupted task's stack and calls a
    'global' C (high-level) exception handler.)

libbsp/powerpc/shared/vectors/vectors_init.c
    (default implementation of the 'global' C
    exception handler and initialization of the
    vector table with trampoline code that ends up
    calling the 'global' handler.)

libbsp/powerpc/shared/irq/irq_asm.S
    (low-level handlers for 'IRQ'-type or 'asynchronous'
    exceptions. This code is very similar to vectors.S
    but does slightly more: after saving (only
    the minimal set of) registers on the interrupted
    task's stack, it disables thread-dispatching, switches
    to a dedicated ISR stack (if not already there, which is
    possible for nested interrupts) and then executes the
    high-level (C) interrupt dispatcher 'C_dispatch_irq_handler()'.
    After 'C_dispatch_irq_handler()' returns, the stack
    is switched back (if not a nested IRQ), thread-dispatching
    is re-enabled, signals are delivered and a context
    switch is initiated if necessary.)

libbsp/powerpc/shared/irq/irq.c
    (implementation of the RTEMS ('new') IRQ API defined
    in cpukit/include/rtems/irq.h.)

have been copied and modified by a myriad of BSPs, leading
to many slightly different variants.

THE BSP-SUPPORT MIDDLEWARE
==========================

The code in this directory is an attempt to provide the
functionality implemented by the aforementioned files
in a more generic way so that it can be shared by more
BSPs rather than being copied and modified.

Another important goal was to eliminate all conditional
compilation that tests for specific CPU models by means
of C-preprocessor symbols (#ifdef ppcXYZ).
Instead, appropriate run-time checks for features defined
in cpuIdent.h are used.

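As a rough illustration of the difference, the following C sketch
picks an exception flavor at run time instead of at compile time.
The 'cpu_features' record, the 'exc_flavor' enum and the function
name are hypothetical stand-ins invented for this example; the real
feature queries are the ones declared in cpuIdent.h.

```c
#include <stdbool.h>

/* Hypothetical feature record; stands in for the run-time
 * feature queries that cpuIdent.h provides. */
struct cpu_features {
    bool is_booke;   /* CPU follows the BookE architecture */
    bool is_405;     /* CPU is a member of the ppc405 family */
};

/* Hypothetical flavor names, loosely following the list of
 * flavors given in the STRUCTURE section. */
enum exc_flavor {
    FLAVOR_CLASSIC,
    FLAVOR_405_CRITICAL,
    FLAVOR_BOOKE_CRITICAL
};

/*
 * Old style: one build per CPU model.
 *
 *   #ifdef ppc405
 *       ... 405-specific code ...
 *   #endif
 *
 * New style: a single build that decides when it runs.
 */
enum exc_flavor critical_input_flavor(const struct cpu_features *f)
{
    if (f->is_booke)
        return FLAVOR_BOOKE_CRITICAL;
    if (f->is_405)
        return FLAVOR_405_CRITICAL;
    return FLAVOR_CLASSIC;
}
```

The point of the design is that the same object code can serve
several CPU models; only the feature record differs.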
The assembly code has been (almost completely) rewritten.
It tries to address a few problems while deliberately
living with the existing APIs and semantics
(how these could be improved is beyond the scope but
that they could is beyond doubt...):

 - some PPCs don't fit into the classic scheme where
   the exception vector addresses are all multiples of
   0x100 (some are spaced as closely as 0x10).
   The API should not expose vector offsets but only
   vector numbers, which can be considered an abstract
   entity. The mapping from vector numbers to actual
   address offsets is performed inside 'raw_exception.c'.
 - having to provide assembly prologue code in order to
   hook an exception is cumbersome. The middleware
   tries to free users and BSP writers from this issue
   by dealing with assembly prologues entirely inside
   the middleware. The user can hook ordinary C routines.
 - the advent of BookE CPUs brought interrupts with
   multiple priorities: non-critical and critical
   interrupts. Unfortunately, these are not entirely
   trivial to deal with (unless critical interrupts
   are permanently disabled [which is still the case:
   ATM rtems_interrupt_enable()/rtems_interrupt_disable()
   only deal with EE]). See the separate section titled
   'RACE CONDITION WHEN DEALING WITH CRITICAL INTERRUPTS'
   below for a detailed explanation.

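The idea of abstract vector numbers can be sketched in C as a
per-family lookup table. Everything named here (the 'vec' and
'cpu_family' enums, 'vector_offset()', 'NO_VECTOR') is hypothetical
and only illustrates the principle; it is not the table used by
'raw_exception.c'. The offsets shown are the usual architectural
ones for the listed exceptions.

```c
#include <stdint.h>

/* Hypothetical abstract vector numbers. */
enum vec {
    VEC_MACHINE_CHECK,
    VEC_EXTERNAL,
    VEC_DECREMENTER,
    VEC_PIT,
    VEC_FIT,
    VEC_COUNT
};

enum cpu_family { FAMILY_CLASSIC, FAMILY_405, FAMILY_COUNT };

/* Marks a vector that does not exist on a given family. */
#define NO_VECTOR UINT32_MAX

static const uint32_t offsets[FAMILY_COUNT][VEC_COUNT] = {
    [FAMILY_CLASSIC] = {
        [VEC_MACHINE_CHECK] = 0x0200,
        [VEC_EXTERNAL]      = 0x0500,
        [VEC_DECREMENTER]   = 0x0900,
        [VEC_PIT]           = NO_VECTOR,
        [VEC_FIT]           = NO_VECTOR
    },
    [FAMILY_405] = {
        [VEC_MACHINE_CHECK] = 0x0200,
        [VEC_EXTERNAL]      = 0x0500,
        [VEC_DECREMENTER]   = NO_VECTOR,
        [VEC_PIT]           = 0x1000,  /* only 0x10 below the FIT */
        [VEC_FIT]           = 0x1010
    }
};

/* Translate an abstract vector number into an address offset.
 * Callers never see offsets unless they ask for them here. */
uint32_t vector_offset(enum cpu_family fam, enum vec v)
{
    if ((unsigned)fam >= FAMILY_COUNT || (unsigned)v >= VEC_COUNT)
        return NO_VECTOR;
    return offsets[fam][v];
}
```

Because user code only passes vector numbers, a CPU with closely
spaced vectors needs a new table row rather than a new API.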
STRUCTURE
=========

The middleware uses exception 'categories' or
'flavors' as defined in raw_exception.h.

The middleware consists of the following parts:

  1 small 'prologue' snippets that encode the
    vector information and jump to appropriate
    'flavored-wrapper' code for further handling.
    Some PPC exceptions are spaced only
    16 bytes apart, so the generic
    prologue snippets are only 16 bytes long.
    Prologues for synchronous and asynchronous
    exceptions differ.

  2 flavored-wrappers which set up a stack frame
    and do things that are specific to the
    different 'flavors' of exceptions, which
    currently are
      - classic PPC exception
      - ppc405 critical exception
      - bookE critical exception
      - e500 machine check exception

    Assembler macros are provided and they can be
    expanded to generate prologue templates and
    flavored-wrappers for different flavors
    of exceptions. Currently, there are two prologues
    for each of the aforementioned flavors: one for
    synchronous exceptions, the other for interrupts.

  3 generic assembly-level code that does the bulk
    of saving register context and calling C-code.

  4 C-code (ppc_exc_hdl.c) for dispatching BSP/user
    handlers.

  5 Initialization code (vectors_init.c). All valid
    exceptions for the detected CPU are determined.
    For each of them, a fitting prologue snippet for
    the exception category (classic, critical,
    synchronous or IRQ, ...) is generated from a
    template and the vector number and then installed
    in the vector area.

    The user/BSP only has to deal with installing
    high-level handlers. By default, however, the
    standard 'C_dispatch_irq_handler' routine is hooked
    to the external and 'decrementer' exceptions.

  6 RTEMS IRQ API is implemented by 'irq.c'. It
    relies on a few routines to be provided by
    the BSP.

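Generating a prologue 'from a template and the vector number'
(parts 1 and 5 above) can be pictured as copying four instruction
words and patching the vector number into one of them. The C sketch
below is illustrative only: the template contents and all names are
assumptions made for this example and do NOT reproduce the
middleware's actual prologue, which also distinguishes synchronous
from asynchronous exceptions.

```c
#include <stdint.h>
#include <string.h>

/* 4 instructions * 4 bytes = 16 bytes, so a snippet fits even
 * where vectors are spaced only 0x10 apart. */
#define PROLOGUE_WORDS 4

/*
 * Hypothetical template:
 *   word 0     'li r3, 0'; the low 16 bits are patched with
 *              the vector number
 *   words 1-3  'nop' placeholders standing in for the real
 *              "save a register, jump to the wrapper" code
 */
static const uint32_t prologue_template[PROLOGUE_WORDS] = {
    0x38600000u,  /* li  r3, 0 */
    0x60000000u,  /* nop       */
    0x60000000u,  /* nop       */
    0x60000000u   /* nop       */
};

/* Build the snippet for one vector; returns 0 on success and
 * -1 if the number does not fit the positive 'li' immediate. */
int make_prologue(uint32_t out[PROLOGUE_WORDS], unsigned vector)
{
    if (vector > 0x7FFFu)
        return -1;
    memcpy(out, prologue_template, sizeof prologue_template);
    out[0] |= (uint32_t)vector;
    return 0;
}
```

An initialization routine would call such a function once per valid
vector and copy the result to the corresponding vector address.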
USAGE
=====

BSP writers must provide the following routines
(declared in irq_supp.h):

Interrupt controller (PIC) support:

    BSP_setup_the_pic()        - initialize PIC hardware
    BSP_enable_irq_at_pic()    - enable/disable given irq at PIC; IGNORE if
    BSP_disable_irq_at_pic()     irq number out of range!
    C_dispatch_irq_handler()   - handle irqs and dispatch user handlers;
                                 this routine SHOULD use the inline
                                 fragment

                                   bsp_irq_dispatch_list()

                                 provided by irq_supp.h
                                 for calling user handlers.

BSP initialization: call

    initialize_exceptions();
    BSP_rtems_irq_mngt_set();

Note that BSP_rtems_irq_mngt_set() hooks C_dispatch_irq_handler()
to the external and decrementer exceptions (PIT exception for bookE; a
decrementer emulation is activated) for backwards compatibility reasons.
C_dispatch_irq_handler() must therefore be able to support these two
exceptions.
However, the BSP implementor is free to disconnect
C_dispatch_irq_handler() from either of these exceptions, to connect
other handlers (e.g., for SYSMGMT exceptions) or to hook
C_dispatch_irq_handler() to yet more exceptions etc. *after*
BSP_rtems_irq_mngt_set() has executed.

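The 'ignore if out of range' rule for the PIC routines and the
dispatch step can be sketched with a mock interrupt controller.
All identifiers carry an 'example_' prefix because they are
hypothetical: the real prototypes are the ones in irq_supp.h, and
the mask variable merely stands in for a hardware register.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_IRQ_COUNT 32u

typedef void (*example_isr)(void *arg);

static uint32_t    example_pic_mask;  /* mock PIC enable register */
static example_isr example_hdl[EXAMPLE_IRQ_COUNT];
static void       *example_arg[EXAMPLE_IRQ_COUNT];

void example_enable_irq_at_pic(unsigned irq)
{
    if (irq >= EXAMPLE_IRQ_COUNT)
        return;                       /* out of range: ignore */
    example_pic_mask |= UINT32_C(1) << irq;
}

void example_disable_irq_at_pic(unsigned irq)
{
    if (irq >= EXAMPLE_IRQ_COUNT)
        return;                       /* out of range: ignore */
    example_pic_mask &= ~(UINT32_C(1) << irq);
}

uint32_t example_pic_enabled(void)
{
    return example_pic_mask;
}

/* Register a user handler; returns -1 for an invalid irq. */
int example_connect(unsigned irq, example_isr hdl, void *arg)
{
    if (irq >= EXAMPLE_IRQ_COUNT)
        return -1;
    example_hdl[irq] = hdl;
    example_arg[irq] = arg;
    return 0;
}

/* Call the handler of every pending irq that is also enabled;
 * returns the number of handlers called. */
unsigned example_dispatch(uint32_t pending)
{
    unsigned called = 0;

    pending &= example_pic_mask;
    for (unsigned irq = 0; irq < EXAMPLE_IRQ_COUNT; irq++) {
        if ((pending & (UINT32_C(1) << irq)) && example_hdl[irq] != NULL) {
            example_hdl[irq](example_arg[irq]);
            called++;
        }
    }
    return called;
}

/* Trivial handler for trying the sketch: counts invocations. */
void example_count_isr(void *arg)
{
    ++*(int *)arg;
}
```

A real dispatcher would read the pending sources from the PIC and
use bsp_irq_dispatch_list() to run the user handlers.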
202 | |
---|
203 | Hooking exceptions: |
---|
204 | |
---|
205 | The API defined in ppc_exc_bspsupp.h declares routines for connecting |
---|
206 | a C-handler to any exception. Note that the execution environment |
---|
207 | of the C-handler depends on the exception being synchronous or |
---|
208 | asynchronous: |
---|
209 | |
---|
210 | - synchronous exceptions use the task stack and do not |
---|
211 | disable thread dispatching scheduling. |
---|
212 | - asynchronous exceptions use a dedicated stack and do |
---|
213 | defer thread dispatching until handling has (almost) finished. |
---|
214 | |
---|
215 | By inspecting the vector number stored in the exception frame |
---|
216 | the nature of the exception can be determined: asynchronous |
---|
217 | exceptions have the most significant bit(s) set. |
---|
218 | |
---|
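A handler can apply this test with a simple mask. The sketch below
assumes a 32-bit vector field in which only the top bit marks an
asynchronous exception; the mask value and the function names are
assumptions made for illustration, and the real definition should
be taken from the middleware headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag position for this sketch only. */
#define EXAMPLE_ASYNC_FLAG UINT32_C(0x80000000)

/* True if the stored vector number denotes an interrupt. */
bool example_is_async(uint32_t stored_vector)
{
    return (stored_vector & EXAMPLE_ASYNC_FLAG) != 0;
}

/* Vector number with the asynchronous marker stripped. */
uint32_t example_plain_vector(uint32_t stored_vector)
{
    return stored_vector & ~EXAMPLE_ASYNC_FLAG;
}
```

A handler that runs for both kinds of exception can use such a test
to decide whether it is on the task stack or the interrupt stack.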
Any exception for which no dedicated handler is registered
ends up being handled by the routine addressed by the
(traditional) 'globalExcHdl' function pointer.

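The fallback rule amounts to a table lookup with a default. In the
following sketch the table, its size, the handler type and every
'example_' name are hypothetical; only the behaviour (use the
dedicated handler if there is one, else the global one) is taken
from the text above.

```c
#include <stddef.h>

#define EXAMPLE_VECTORS 8u

typedef int (*example_exc_hdl)(unsigned vector);

static example_exc_hdl example_exc_table[EXAMPLE_VECTORS];
static example_exc_hdl example_global_hdl;  /* plays 'globalExcHdl' */

void example_set_global(example_exc_hdl hdl)
{
    example_global_hdl = hdl;
}

/* Register a dedicated handler; returns -1 for a bad vector. */
int example_set_handler(unsigned vector, example_exc_hdl hdl)
{
    if (vector >= EXAMPLE_VECTORS)
        return -1;
    example_exc_table[vector] = hdl;
    return 0;
}

/* Run the dedicated handler if one is registered, otherwise the
 * global one; returns -1 if neither exists. */
int example_dispatch_exc(unsigned vector)
{
    example_exc_hdl hdl = NULL;

    if (vector < EXAMPLE_VECTORS)
        hdl = example_exc_table[vector];
    if (hdl == NULL)
        hdl = example_global_hdl;
    return hdl != NULL ? hdl(vector) : -1;
}

/* Two trivial handlers for trying the sketch. */
int example_dedicated(unsigned vector) { (void)vector; return 1; }
int example_global(unsigned vector)    { (void)vector; return 2; }
```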
Makefile.am:

  - make sure the Makefile.am does NOT use any of the files
      vectors.S, vectors.h, vectors_init.c, irq_asm.S, irq.c
    from 'libbsp/powerpc/shared'. The BSP must NOT implement
    any functionality that is provided by those files (and
    now by the middleware) either.

  - (probably) remove 'vectors.rel' and anything related

  - add

      include_bsp_HEADERS += \
        ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/vectors.h \
        ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/irq_supp.h \
        ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/ppc_exc_bspsupp.h

  - add

      ../../../libcpu/@RTEMS_CPU@/@exceptions@/exc_bspsupport.rel \

    to 'libbsp_a_LIBADD'

RACE CONDITION WHEN DEALING WITH CRITICAL INTERRUPTS
====================================================

The problematic race condition is as follows:

Usually, ISRs are allowed to use certain OS
primitives, e.g., releasing a semaphore.
In order to prevent a context switch from happening
immediately (this would result in the ISR being
suspended), thread-dispatching must be disabled
around execution of the ISR. However, on the
PPC architecture it is possible neither to
atomically disable ALL interrupts nor to
atomically increment a variable
(the thread-dispatch-disable level).
Hence, the following sequence of events could
occur:

  1) low-priority interrupt (LPI) is taken
  2) before the LPI can increase the
     thread-dispatch-disable level or disable
     high-priority interrupts, a high-priority
     interrupt (HPI) happens
  3) HPI increases dispatch-disable level
  4) HPI executes high-priority ISR which, e.g.,
     posts a semaphore
  5) HPI decreases dispatch-disable level and
     realizes that a context switch is necessary
  6) context switch is performed since LPI had
     not gotten to the point where it could
     increase the dispatch-disable level.

At this point, the LPI has been effectively
suspended, which means that the low-priority
ISR will not be executed until the task
interrupted in 1) is scheduled again!

The solution to this problem is letting the
first machine instruction of the low-priority
exception handler write a non-zero value to
a variable in memory:

    ee_vector_offset:

        stw r1, ee_lock@sdarel(r13)
        .. save some registers etc..
        .. increase thread-dispatch-disable-level
        .. clear 'ee_lock' variable

The earliest a critical exception could interrupt
the 'external' exception handler is after the
'stw r1, ee_lock@sdarel(r13)' instruction.

After the HPI decrements the dispatch-disable level
it checks 'ee_lock' and refrains from performing
a context switch if 'ee_lock' is nonzero. Since
the LPI will complete execution subsequently, it
will eventually do the context switch.

For the single-instruction write operation we must

  a) write a register that is guaranteed to be
     non-zero (e.g., R1 (stack pointer) or R13
     (SVR4 short-data area)).
  b) use an addressing mode that doesn't require
     loading any registers. The short-data area
     pointer R13 is appropriate.

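The effect of 'ee_lock' can be replayed with a small C model of the
sequence above. This is only a single-threaded simulation written
for this README: the 'sim' record and the function names are
hypothetical, and the real code is assembly operating on the actual
dispatch-disable level.

```c
#include <stdbool.h>
#include <stdint.h>

struct sim {
    uintptr_t ee_lock;        /* non-zero while LPI entry is running */
    unsigned  dispatch_level; /* thread-dispatch-disable level       */
    bool      switch_needed;  /* an ISR made another task ready      */
    unsigned  switches;       /* context switches performed so far   */
};

/* First instruction of the LPI handler: store the (non-zero)
 * stack pointer into ee_lock. */
void lpi_enter_lock(struct sim *s, uintptr_t stack_pointer)
{
    s->ee_lock = stack_pointer;
}

/* Remainder of the LPI entry: raise the level, clear the lock. */
void lpi_enter_finish(struct sim *s)
{
    s->dispatch_level++;
    s->ee_lock = 0;
}

/* A complete HPI: enter, run the ISR, leave. Returns true if
 * the HPI itself performed a context switch. */
bool hpi_run(struct sim *s, bool isr_posts_semaphore)
{
    s->dispatch_level++;
    if (isr_posts_semaphore)
        s->switch_needed = true;
    s->dispatch_level--;

    /* The extra check: never switch while an LPI is mid-entry. */
    if (s->dispatch_level == 0 && s->switch_needed && s->ee_lock == 0) {
        s->switch_needed = false;
        s->switches++;
        return true;
    }
    return false;
}

/* LPI exit: performs the switch that the HPI left pending. */
bool lpi_exit(struct sim *s)
{
    s->dispatch_level--;
    if (s->dispatch_level == 0 && s->switch_needed) {
        s->switch_needed = false;
        s->switches++;
        return true;
    }
    return false;
}
```

Calling lpi_enter_lock(), then hpi_run() with a posting ISR, then
lpi_enter_finish() and lpi_exit() models steps 1) to 6): the HPI
leaves the switch pending and the LPI performs it on exit, after the
low-priority ISR has had its chance to run.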