source: rtems/c/src/lib/libcpu/powerpc/new-exceptions/bspsupport/README @ c499856

Last change on this file since c499856 was 65c6425, checked in by Joel Sherrill <joel.sherrill@…>, on 05/03/12 at 17:24:46

BSP support middleware for 'new-exception' style PPC.

T. Straumann, 12/2007

EXPLANATION OF SOME TERMS
=========================

In this README we refer to exceptions and sometimes
to 'interrupts'. Interrupts simply are asynchronous
exceptions such as 'external' exceptions or
'decrementer'/'timer' exceptions.

Traditionally (in the libbsp/powerpc/shared implementation),
synchronous exceptions are handled entirely in the context
of the interrupted task, i.e., the exception handlers use
the task's stack and leave thread-dispatching enabled,
i.e., scheduling is allowed to happen 'in the middle'
of an exception handler.

Asynchronous exceptions/interrupts, OTOH, use a dedicated
interrupt stack and defer scheduling until after the last
nested ISR has finished.

RATIONALE
=========

The 'new-exception' processing API works at a rather
low level. It provides functions for
installing low-level code (which must be written in
assembly code) directly into the PPC vector area.
It is entirely left to the BSP to implement low-level
exception handlers, to implement an API for
C-level exception handlers and to implement the
RTEMS interrupt API defined in cpukit/include/rtems/irq.h.

The result has been a Darwinian evolution of variants
of this code which is very hard to maintain. Mostly,
the four files

libbsp/powerpc/shared/vectors/vectors.S
  (low-level handlers for 'normal' or 'synchronous'
  exceptions. This code saves all registers on
  the interrupted task's stack and calls a
  'global' C (high-level) exception handler.)

libbsp/powerpc/shared/vectors/vectors_init.c
  (default implementation of the 'global' C
  exception handler and initialization of the
  vector table with trampoline code that ends up
  calling the 'global' handler.)

libbsp/powerpc/shared/irq/irq_asm.S
  (low-level handlers for 'IRQ'-type or 'asynchronous'
  exceptions. This code is very similar to vectors.S
  but does slightly more: after saving (only
  the minimal set of) registers on the interrupted
  task's stack it disables thread-dispatching, switches
  to a dedicated ISR stack (if not already there, which is
  possible for nested interrupts) and then executes the high
  level (C) interrupt dispatcher 'C_dispatch_irq_handler()'.
  After 'C_dispatch_irq_handler()' returns, the stack
  is switched back (if not a nested IRQ), thread-dispatching
  is re-enabled, signals are delivered and a context
  switch is initiated if necessary.)

libbsp/powerpc/shared/irq/irq.c
  (implementation of the RTEMS ('new') IRQ API defined
  in cpukit/include/rtems/irq.h.)

have been copied and modified by a myriad of BSPs leading
to many slightly different variants.

THE BSP-SUPPORT MIDDLEWARE
==========================

The code in this directory is an attempt to provide the
functionality implemented by the aforementioned files
in a more generic way so that it can be shared by more
BSPs rather than being copied and modified.

Another important goal was eliminating all conditional
compilation which tested for specific CPU models by means
of C-preprocessor symbols (#ifdef ppcXYZ).
Instead, appropriate run-time checks for features defined
in cpuIdent.h are used.

The assembly code has been (almost completely) rewritten
and it tries to address a few problems while deliberately
trying to live with the existing APIs and semantics
(how these could be improved is beyond the scope of this
README but that they could is beyond doubt...):

 - some PPCs don't fit into the classic scheme where
   the exception vector addresses all were multiples of
   0x100 (some vectors are spaced as closely as 0x10).
   The API should not expose vector offsets but only
   vector numbers which can be considered an abstract
   entity. The mapping from vector numbers to actual
   address offsets is performed inside 'raw_exception.c'.
 - having to provide assembly prologue code in order to
   hook an exception is cumbersome. The middleware
   tries to free users and BSP writers from this issue
   by dealing with assembly prologues entirely inside
   the middleware. The user can hook ordinary C routines.
 - the advent of BookE CPUs brought interrupts with
   multiple priorities: non-critical and critical
   interrupts. Unfortunately, these are not entirely
   trivial to deal with (unless critical interrupts
   are permanently disabled [which is still the case:
   ATM rtems_interrupt_enable()/rtems_interrupt_disable()
   only deal with EE]). See separate section titled
   'race condition...' below for a detailed explanation.
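
The first point above (abstract vector numbers that are mapped to
address offsets internally) can be sketched with a small lookup table.
This is only an illustration: the enum, the table and the function
name are hypothetical and do NOT reproduce the contents of
'raw_exception.c'. The offsets are the architectural ones for three
classic vectors plus the two closely spaced PPC405 timer vectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical abstract vector numbers; callers only ever see these. */
enum demo_vector {
  DEMO_VEC_MACHINE_CHECK,
  DEMO_VEC_EXTERNAL,
  DEMO_VEC_DECREMENTER,
  DEMO_VEC_405_PIT,
  DEMO_VEC_405_FIT,
  DEMO_VEC_COUNT
};

/* Address offsets inside the vector area, indexed by vector number. */
static const uint32_t demo_vector_offset[DEMO_VEC_COUNT] = {
  [DEMO_VEC_MACHINE_CHECK] = 0x0200,
  [DEMO_VEC_EXTERNAL]      = 0x0500,
  [DEMO_VEC_DECREMENTER]   = 0x0900,
  [DEMO_VEC_405_PIT]       = 0x1000,
  [DEMO_VEC_405_FIT]       = 0x1010  /* only 0x10 past the PIT vector */
};

/* Stores the offset and returns 0, or returns -1 for a bad argument. */
int demo_vector_to_offset(int vector, uint32_t *offset)
{
  if (vector < 0 || vector >= DEMO_VEC_COUNT || offset == NULL) {
    return -1;
  }
  *offset = demo_vector_offset[vector];
  return 0;
}
```

Because callers never see the offsets, a CPU with a different vector
layout only needs a different table.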

STRUCTURE
=========

The middleware uses exception 'categories' or
'flavors' as defined in raw_exception.h.

The middleware consists of the following parts:

   1 small 'prologue' snippets that encode the
     vector information and jump to appropriate
     'flavored-wrapper' code for further handling.
     Some PPC exceptions are spaced only
     16 bytes apart, so the generic
     prologue snippets are only 16 bytes long.
     Prologues for synchronous and asynchronous
     exceptions differ.

   2 flavored-wrappers which set up a stack frame
     and do things that are specific for
     different 'flavors' of exceptions which
     currently are
       - classic PPC exception
       - ppc405 critical exception
       - bookE critical exception
       - e500 machine check exception

   Assembler macros are provided and they can be
   expanded to generate prologue templates and
   flavored-wrappers for different flavors
   of exceptions. Currently, there are two prologues
   for all aforementioned flavors: one for synchronous
   exceptions, the other for interrupts.

   3 generic assembly-level code that does the bulk
     of saving register context and calling C-code.

   4 C-code (ppc_exc_hdl.c) for dispatching BSP/user
     handlers.

   5 Initialization code (vectors_init.c). All valid
     exceptions for the detected CPU are determined
     and a fitting prologue snippet for the exception
     category (classic, critical, synchronous or IRQ, ...)
     is generated from a template and the vector number
     and then installed in the vector area.

     The user/BSP only has to deal with installing
     high-level handlers but by default, the standard
     'C_dispatch_irq_handler' routine is hooked to
     the external and 'decrementer' exceptions.

   6 RTEMS IRQ API is implemented by 'irq.c'. It
     relies on a few routines to be provided by
     the BSP.
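
The idea behind parts 1 and 5 (a 16-byte prologue generated from a
template plus the vector number) can be sketched as follows. The
template below is an ASSUMPTION made up for illustration; the real
templates are produced by the assembler macros and look different.
It uses two 'nop' words as stand-ins for real code, 'li r3, 0'
(0x38600000) whose 16-bit immediate receives the vector number, and
an absolute branch 'ba' (0x48000002) whose target field receives the
wrapper address. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_PROLOGUE_WORDS 4  /* 4 instructions = 16 bytes */

/* Hypothetical template: nop, nop, li r3,0, ba 0 */
static const uint32_t demo_template[DEMO_PROLOGUE_WORDS] = {
  0x60000000u, 0x60000000u, 0x38600000u, 0x48000002u
};

/*
 * Fill 'out' with a copy of the template, patched with the vector
 * number and the address of the flavored wrapper. Returns 0 on
 * success and -1 if a value does not fit into its instruction field.
 */
int demo_make_prologue(uint32_t out[DEMO_PROLOGUE_WORDS],
                       uint32_t vector, uint32_t wrapper_addr)
{
  /* 'li' takes a signed 16-bit immediate; keep the value positive. */
  if (vector > 0x7fffu) {
    return -1;
  }
  /* 'ba' needs a word-aligned target below 32MB to avoid sign extension. */
  if ((wrapper_addr & 3u) != 0 || wrapper_addr >= 0x02000000u) {
    return -1;
  }
  memcpy(out, demo_template, sizeof demo_template);
  out[2] |= vector;        /* li r3, <vector> */
  out[3] |= wrapper_addr;  /* ba <wrapper>    */
  return 0;
}
```

On a real target the resulting words would still have to be copied to
the vector's address offset and the caches synchronized; that part is
omitted here.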

USAGE
=====

        BSP writers must provide the following routines
        (declared in irq_supp.h):

        Interrupt controller (PIC) support:
                BSP_setup_the_pic()        - initialize PIC hardware
                BSP_enable_irq_at_pic()    - enable/disable given irq at PIC; IGNORE if
                BSP_disable_irq_at_pic()     irq number out of range!
                C_dispatch_irq_handler()   - handle irqs and dispatch user handlers;
                                             this routine SHOULD use the inline
                                             fragment

                                               bsp_irq_dispatch_list()

                                             provided by irq_supp.h
                                             for calling user handlers.
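
What 'dispatching user handlers' for one irq amounts to can be
modelled with a per-irq list of handlers that is walked by the
dispatcher. The sketch below is a stand-alone model under assumed
names (demo_*); it does NOT reproduce the signature or the data
structures of the real bsp_irq_dispatch_list() in irq_supp.h.

```c
#include <stddef.h>

#define DEMO_IRQ_COUNT 4

typedef void (*demo_irq_hdl)(void *arg);

struct demo_irq_entry {
  demo_irq_hdl           hdl;
  void                  *arg;
  struct demo_irq_entry *next;  /* further handlers sharing the irq */
};

static struct demo_irq_entry *demo_irq_table[DEMO_IRQ_COUNT];

/* Prepend a handler; out-of-range irq numbers are rejected. */
int demo_irq_connect(unsigned irq, struct demo_irq_entry *e)
{
  if (irq >= DEMO_IRQ_COUNT || e == NULL || e->hdl == NULL) {
    return -1;
  }
  e->next = demo_irq_table[irq];
  demo_irq_table[irq] = e;
  return 0;
}

/* Call every handler connected to 'irq'; returns how many were called. */
int demo_irq_dispatch_list(unsigned irq)
{
  int called = 0;
  struct demo_irq_entry *e;

  if (irq >= DEMO_IRQ_COUNT) {
    return 0;
  }
  for (e = demo_irq_table[irq]; e != NULL; e = e->next) {
    e->hdl(e->arg);
    ++called;
  }
  return called;
}

/* Example handler: counts its invocations in the int behind 'arg'. */
void demo_count_handler(void *arg)
{
  ++*(int *)arg;
}
```

A C_dispatch_irq_handler()-like routine would first ask the PIC which
irq is pending and then walk the list for that irq.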

        BSP initialization; call

        rtems_status_code sc = ppc_exc_initialize(
          PPC_INTERRUPT_DISABLE_MASK_DEFAULT,
          interrupt_stack_begin,
          interrupt_stack_size
        );
        if (sc != RTEMS_SUCCESSFUL) {
          BSP_panic("cannot initialize exceptions");
        }
        BSP_rtems_irq_mngt_set();

        Note that BSP_rtems_irq_mngt_set() hooks the C_dispatch_irq_handler()
        to the external and decrementer (PIT exception for bookE; a decrementer
        emulation is activated) exceptions for backwards compatibility reasons.
        C_dispatch_irq_handler() must therefore be able to support these two
        exceptions.
        However, the BSP implementor is free to disconnect
        C_dispatch_irq_handler() from either of these exceptions, to connect
        other handlers (e.g., for SYSMGMT exceptions) or to hook
        C_dispatch_irq_handler() to yet more exceptions etc. *after*
        BSP_rtems_irq_mngt_set() has executed.

        Hooking exceptions:

        The API defined in vectors.h declares routines for connecting
        a C-handler to any exception. Note that the execution environment
        of the C-handler depends on the exception being synchronous or
        asynchronous:

                - synchronous exceptions use the task stack and do not
                  disable thread dispatching.
                - asynchronous exceptions use a dedicated stack and
                  defer thread dispatching until handling has (almost) finished.

        By inspecting the vector number stored in the exception frame
        the nature of the exception can be determined: asynchronous
        exceptions have the most significant bit(s) set.
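
A handler that is hooked to several exceptions could classify them
along these lines. The flag value below is an ASSUMPTION chosen for
illustration (bit 31 of a 32-bit vector word); the actual bit(s) are
defined by the middleware headers, and the helper names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED position of the 'asynchronous' marker; check the real headers. */
#define DEMO_EXC_ASYNC_FLAG 0x80000000u

/* True if the vector word from the exception frame marks an interrupt. */
static inline bool demo_exc_is_async(uint32_t frame_vector)
{
  return (frame_vector & DEMO_EXC_ASYNC_FLAG) != 0;
}

/* The plain vector number with the marker stripped off. */
static inline uint32_t demo_exc_vector_number(uint32_t frame_vector)
{
  return frame_vector & ~DEMO_EXC_ASYNC_FLAG;
}
```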

        Any exception for which no dedicated handler is registered
        ends up being handled by the routine addressed by the
        (traditional) 'globalExcHdl' function pointer.
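
The fallback rule can be modelled with a handler table and a global
function pointer. This is a minimal sketch with hypothetical types
and names (demo_*); 'demo_global_handler' merely plays the role that
'globalExcHdl' has in the text and the real frame layout and handler
signature are not reproduced here.

```c
#include <stddef.h>

#define DEMO_VECTOR_COUNT 16

struct demo_exc_frame {
  unsigned vector;  /* stand-in for the real exception frame */
};

typedef int (*demo_exc_handler)(struct demo_exc_frame *f);

static demo_exc_handler demo_handler_table[DEMO_VECTOR_COUNT];
static int demo_default_calls;

/* Catch-all used when no dedicated handler is registered. */
static int demo_default_handler(struct demo_exc_frame *f)
{
  (void)f;
  ++demo_default_calls;
  return 0;
}

/* Plays the role of the 'globalExcHdl' function pointer. */
demo_exc_handler demo_global_handler = demo_default_handler;

/* Example of a dedicated handler. */
static int demo_sample_handler(struct demo_exc_frame *f)
{
  return (int)f->vector + 100;
}

/* Register (or, with h == NULL, remove) a dedicated handler. */
int demo_exc_set_handler(unsigned vector, demo_exc_handler h)
{
  if (vector >= DEMO_VECTOR_COUNT) {
    return -1;
  }
  demo_handler_table[vector] = h;
  return 0;
}

/* Call the dedicated handler if there is one, else the global one. */
int demo_exc_dispatch(struct demo_exc_frame *f)
{
  demo_exc_handler h = NULL;

  if (f->vector < DEMO_VECTOR_COUNT) {
    h = demo_handler_table[f->vector];
  }
  if (h == NULL) {
    h = demo_global_handler;
  }
  return h(f);
}
```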

        Makefile.am:
                - make sure the Makefile.am does NOT use any of the files
                        vectors.S, vectors.h, vectors_init.c, irq_asm.S, irq.c
                  from 'libbsp/powerpc/shared'. NOR must the BSP implement
                  any functionality that is provided by those files (and
                  now the middleware).

                - (probably) remove 'vectors.rel' and anything related

                - add

                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/vectors.h
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/irq_supp.h

                  to 'include_bsp_HEADERS'

                - add

                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/exc_bspsupport.rel
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/irq_bspsupport.rel

                  to 'libbsp_a_LIBADD'

                  (irq.c is in a separate '.rel' so that you can get support
                  for exceptions only).

CAVEATS
=======

On classic PPCs, early (and late) parts of the low-level
exception handling code run with the MMU disabled which means
that the default caching attributes (write-back) are in effect
(thanks to Thomas Doerfler for bringing this up).
The code currently assumes that the MMU translations
for the task and interrupt stacks as well as some
variables in the data-area MATCH THE DEFAULT CACHING
ATTRIBUTES (this assumption also holds for the old code
in libbsp/powerpc/shared/vectors and ../irq).

During initialization of exception handling, a crude test
is performed to check if memory seems to have the write-back
attribute. The 'dcbz' instruction should - on most PPCs - cause
an alignment exception if the tested cache-line does not
have this attribute.

BSPs which entirely disable caching (e.g., by physically
disabling the cache(s)) should set the variable
  ppc_exc_cache_wb_check = 0
prior to calling initialize_exceptions().
Note that this check does not catch all possible
misconfigurations (e.g., on the 860, the default attribute
is AFAIK [libcpu/powerpc/mpc8xx/mmu/mmu_init.c] set to
'caching-disabled' which is potentially harmful but
this situation is not detected).


RACE CONDITION WHEN DEALING WITH CRITICAL INTERRUPTS
====================================================

   The problematic race condition is as follows:

   Usually, ISRs are allowed to use certain OS
   primitives such as, e.g., releasing a semaphore.
   In order to prevent a context switch from happening
   immediately (this would result in the ISR being
   suspended), thread-dispatching must be disabled
   around execution of the ISR. However, on the
   PPC architecture it is neither possible to
   atomically disable ALL interrupts nor is it
   possible to atomically increment a variable
   (the thread-dispatch-disable level).
   Hence, the following sequence of events could
   occur:
    1) low-priority interrupt (LPI) is taken
    2) before the LPI can increase the
       thread-dispatch-disable level or disable
       high-priority interrupts, a high-priority
       interrupt (HPI) happens
    3) HPI increases dispatch-disable level
    4) HPI executes high-priority ISR which, e.g.,
       posts a semaphore
    5) HPI decreases dispatch-disable level and
       realizes that a context switch is necessary
    6) context switch is performed since LPI had
       not gotten to the point where it could
       increase the dispatch-disable level.
   At this point, the LPI has been effectively
   suspended which means that the low-priority
   ISR will not be executed until the task
   interrupted in 1) is scheduled again!

   The solution to this problem is letting the
   first machine instruction of the low-priority
   exception handler write a non-zero value to
   a variable in memory:

        ee_vector_offset:

                stw r1, ee_lock@sdarel(r13)
                .. save some registers etc..
                .. increase thread-dispatch-disable-level
                .. clear 'ee_lock' variable

        After the HPI decrements the dispatch-disable level
        it checks 'ee_lock' and refrains from performing
        a context switch if 'ee_lock' is nonzero. Since
        the LPI will complete execution subsequently it
        will eventually do the context switch.

        For the single-instruction write operation we must
          a) write a register that is guaranteed to be
             non-zero (e.g., R1 (stack pointer) or R13
             (SVR4 short-data area)).
          b) use an addressing mode that doesn't require
             loading any registers. The short-data area
             pointer R13 is appropriate.
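
Requirements a) and b) mean that the whole locking operation is one
fixed 32-bit instruction word. As an illustration, the helper below
(a hypothetical name, not part of the middleware) encodes a D-form
'stw rS, d(rA)' using the architectural primary opcode 36. For
'stw r1, ee_lock@sdarel(r13)' the displacement d is the signed 16-bit
offset of 'ee_lock' from the short-data base, which is only known
after linking; the offsets used in the note below are arbitrary.

```c
#include <stdint.h>

/*
 * Encode 'stw rS, d(rA)'.
 * D-form layout: opcode (6 bits) | rS (5) | rA (5) | d (16, signed).
 */
uint32_t demo_encode_stw(unsigned rs, unsigned ra, int16_t d)
{
  return (36u << 26)
       | ((rs & 31u) << 21)
       | ((ra & 31u) << 16)
       | (uint16_t)d;
}
```

With rS = r1 and rA = r13 every such word has the form 0x902Dxxxx,
where xxxx is the displacement of 'ee_lock'.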

        CAVEAT: unfortunately, this method by itself
        is *NOT* enough because raising a low-priority
        exception and executing the first instruction
        of the handler is *NOT* atomic. Hence, the following
        could occur:

         1) LPI is taken
         2) PC is saved in SRR0, PC is loaded with
            address of 'locking instruction'
                  stw r1, ee_lock@sdarel(r13)
         3) ==> critical interrupt happens
         4) PC (containing address of locking instruction)
            is saved in CSRR0
         5) HPI is dispatched

        For the HPI to correctly handle this situation
        it does the following:

                a) increase thread-dispatch disable level
                b) do interrupt work
                c) decrease thread-dispatch disable level
                d) if ( dispatch-disable level == 0 )
                     d1) check ee_lock
                     d2) check instruction at *CSRR0
                     d3) do a context switch if necessary ONLY IF
                         ee_lock is NOT set AND *CSRR0 is NOT the
                         'locking instruction'

        This works because the address of 'ee_lock'
        is embedded in the locking instruction
        'stw r1, ee_lock@sdarel(r13)' and because the
        registers r1/r13 have a special purpose
        (stack-pointer, SDA-pointer). Hence it is safe
        to assume that the particular instruction
        'stw r1, ee_lock@sdarel(r13)' never occurs
        anywhere else.
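
Steps d) to d3) boil down to a single predicate. The following C
model is a sketch only: the real check is done in assembly, and the
structure, its field names and the function name are hypothetical.
The locking instruction word is passed in as a parameter because its
displacement field depends on where 'ee_lock' ends up after linking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Snapshot of what the HPI sees after step c). */
struct demo_hpi_state {
  uint32_t dispatch_disable_level;  /* already decremented by the HPI */
  uint32_t ee_lock;                 /* nonzero while the LPI holds it */
  uint32_t insn_at_csrr0;           /* instruction word at *CSRR0     */
  bool     dispatch_necessary;      /* a context switch is pending    */
};

/* Returns true if the HPI may perform the context switch itself. */
bool demo_hpi_may_switch(const struct demo_hpi_state *s,
                         uint32_t locking_insn)
{
  if (!s->dispatch_necessary) {
    return false;                   /* nothing to do */
  }
  if (s->dispatch_disable_level != 0) {
    return false;                   /* d)  nested, outer ISR will check */
  }
  if (s->ee_lock != 0) {
    return false;                   /* d1) LPI already took the lock */
  }
  if (s->insn_at_csrr0 == locking_insn) {
    return false;                   /* d2) LPI is about to take the lock */
  }
  return true;                      /* d3) safe to switch now */
}
```

In the two cases where the switch is refused because of the LPI, the
LPI completes afterwards and performs the deferred context switch.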

        Another note: this algorithm also makes sure
        that ONLY nested ASYNCHRONOUS interrupts which
        enable/disable thread-dispatching and check if
        thread-dispatching is required before returning
        control engage in this locking protocol. It is
        important that when a critical, asynchronous
        interrupt interrupts a 'synchronous' exception
        (which does not disable thread-dispatching)
        the thread-dispatching operation upon return of
        the HPI is NOT deferred (because the synchronous
        handler would not, eventually, check for a
        dispatch requirement).

        And one more note: We never want to disable
        machine-check exceptions to avoid a checkstop.
        This means that we cannot use enabling/disabling
        this type of exception for protection of critical
        OS data structures.
        Therefore, calling OS primitives from an asynchronous
        machine-check handler is ILLEGAL and not supported.
        Since machine-checks can happen anytime it is not
        legal to test if a deferred context switch should
        be performed when the asynchronous machine-check
        handler returns (since _Context_Switch_is_necessary
        could have been set by an IRQ-protected section of
        code that was hit by the machine-check).
        Note that synchronous machine-checks can legally
        use OS primitives and currently there are no
        asynchronous machine-checks defined.

   Epilogue:

   You have to disable all asynchronous exceptions which may cause a context
   switch before restoring the SRRs and executing the RFI.  Reason:

      Suppose we are in the epilogue code of an EE between the move to SRRs and
      the RFI. Here EE is disabled but CE is enabled. Now a CE happens.  The
      handler decides that a thread dispatch is necessary. The CE checks if
      this is possible:

         o The thread dispatch disable level is 0, because the EE has already
           decremented it.
         o The EE lock variable is cleared.
         o The EE is not executing its first instruction.

      Hence a thread dispatch is allowed. The CE issues a context switch to a
      task with EE enabled (for example a task waiting for a semaphore). Now an
      EE happens and the current content of the SRRs is lost.