source: rtems/c/src/lib/libcpu/powerpc/new-exceptions/bspsupport/README @ 25a92bc1

Last change on this file since 25a92bc1 was 25a92bc1, checked in by Thomas Doerfler <Thomas.Doerfler@…>, on 07/11/08 at 10:02:12

adapted powerpc exception code

$Id$

BSP support middleware for 'new-exception' style PPC.

T. Straumann, 12/2007

EXPLANATION OF SOME TERMS
=========================

In this README we refer to exceptions and sometimes
to 'interrupts'. Interrupts simply are asynchronous
exceptions such as 'external' exceptions or 'decrementer'
/'timer' exceptions.

Traditionally (in the libbsp/powerpc/shared implementation),
synchronous exceptions are handled entirely in the context
of the interrupted task, i.e., the exception handlers use
the task's stack and leave thread-dispatching enabled,
i.e., scheduling is allowed to happen 'in the middle'
of an exception handler.

Asynchronous exceptions/interrupts, OTOH, use a dedicated
interrupt stack and defer scheduling until after the last
nested ISR has finished.

RATIONALE
=========
The 'new-exception' processing API works at a rather
low level. It provides functions for
installing low-level code (which must be written in
assembly) directly into the PPC vector area.
It is left entirely to the BSP to implement low-level
exception handlers, to implement an API for
C-level exception handlers and to implement the
RTEMS interrupt API defined in cpukit/include/rtems/irq.h.

The result has been a Darwinian evolution of variants
of this code which is very hard to maintain. Mostly,
the four files

libbsp/powerpc/shared/vectors/vectors.S
  (low-level handlers for 'normal' or 'synchronous'
  exceptions. This code saves all registers on
  the interrupted task's stack and calls a
  'global' C (high-level) exception handler.)

libbsp/powerpc/shared/vectors/vectors_init.c
  (default implementation of the 'global' C
  exception handler and initialization of the
  vector table with trampoline code that ends up
  calling the 'global' handler.)

libbsp/powerpc/shared/irq/irq_asm.S
  (low-level handlers for 'IRQ'-type or 'asynchronous'
  exceptions. This code is very similar to vectors.S
  but does slightly more: after saving (only
  the minimal set of) registers on the interrupted
  task's stack it disables thread-dispatching, switches
  to a dedicated ISR stack (if not already there, which is
  possible for nested interrupts) and then executes the high
  level (C) interrupt dispatcher 'C_dispatch_irq_handler()'.
  After 'C_dispatch_irq_handler()' returns, the stack
  is switched back (if not a nested IRQ), thread-dispatching
  is re-enabled, signals are delivered and a context
  switch is initiated if necessary.)

libbsp/powerpc/shared/irq/irq.c
  (implementation of the RTEMS ('new') IRQ API defined
  in cpukit/include/rtems/irq.h.)

have been copied and modified by a myriad of BSPs leading
to many slightly different variants.

THE BSP-SUPPORT MIDDLEWARE
==========================

The code in this directory is an attempt to provide the
functionality implemented by the aforementioned files
in a more generic way so that it can be shared by more
BSPs rather than being copied and modified.

Another important goal was eliminating all conditional
compilation which tested for specific CPU models by means
of C-preprocessor symbols (#ifdef ppcXYZ).
Instead, appropriate run-time checks for features defined
in cpuIdent.h are used.

The assembly code has been (almost completely) rewritten
and it tries to address a few problems while deliberately
trying to live with the existing APIs and semantics
(how these could be improved is beyond the scope but
that they could is beyond doubt...):

 - some PPCs don't fit into the classic scheme where
   the exception vector addresses all were multiples of
   0x100 (some vectors are spaced as closely as 0x10).
   The API should not expose vector offsets but only
   vector numbers which can be considered an abstract
   entity. The mapping from vector numbers to actual
   address offsets is performed inside 'raw_exception.c'.
 - having to provide assembly prologue code in order to
   hook an exception is cumbersome. The middleware
   tries to free users and BSP writers from this issue
   by dealing with assembly prologues entirely inside
   the middleware. The user can hook ordinary C routines.
 - the advent of BookE CPUs brought interrupts with
   multiple priorities: non-critical and critical
   interrupts. Unfortunately, these are not entirely
   trivial to deal with (unless critical interrupts
   are permanently disabled [which is still the case:
   ATM rtems_interrupt_enable()/rtems_interrupt_disable()
   only deal with EE]). See the separate section titled
   'RACE CONDITION...' below for a detailed explanation.

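The idea of exposing only abstract vector numbers and keeping the
offsets private can be sketched as a small lookup table. This is an
illustration only, not the contents of 'raw_exception.c': all names
prefixed with 'example_'/'EX_' are hypothetical, the 'dense' layout
offsets are made up, and only the classic offsets (reset 0x100,
external 0x500, decrementer 0x900) are real PPC values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstract vector numbers (not the middleware's own). */
enum example_vector {
    EX_VEC_RESET = 0,
    EX_VEC_EXTERNAL,
    EX_VEC_DECREMENTER,
    EX_VEC_CRITICAL,
    EX_VEC_COUNT
};

/* Hypothetical vector layouts, selected at run time per detected CPU. */
enum example_layout { EX_LAYOUT_CLASSIC = 0, EX_LAYOUT_DENSE, EX_LAYOUT_COUNT };

/* Marks a vector that does not exist in a given layout. */
#define EX_NO_VECTOR UINT32_MAX

static const uint32_t example_offsets[EX_LAYOUT_COUNT][EX_VEC_COUNT] = {
    /* classic scheme: offsets are multiples of 0x100, no critical input */
    { 0x100u, 0x500u, 0x900u, EX_NO_VECTOR },
    /* made-up layout with vectors spaced as closely as 0x10 */
    { 0x000u, 0x040u, 0x0a0u, 0x020u }
};

/* Translate an abstract vector number into an address offset. */
uint32_t example_vector_offset(enum example_layout layout,
                               enum example_vector vec)
{
    assert((unsigned)layout < EX_LAYOUT_COUNT);
    assert((unsigned)vec < EX_VEC_COUNT);
    return example_offsets[layout][vec];
}
```

Callers only ever name a vector; whether it lands at 0x500 or 0x040
is decided by the table row chosen for the CPU.
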
STRUCTURE
=========

The middleware uses exception 'categories' or
'flavors' as defined in raw_exception.h.

The middleware consists of the following parts:

   1 small 'prologue' snippets that encode the
     vector information and jump to appropriate
     'flavored-wrapper' code for further handling.
     Some PPC exceptions are spaced only
     16 bytes apart, so the generic
     prologue snippets are only 16 bytes long.
     Prologues for synchronous and asynchronous
     exceptions differ.

   2 flavored-wrappers which set up a stack frame
     and do things that are specific for
     different 'flavors' of exceptions which
     currently are
       - classic PPC exception
       - ppc405 critical exception
       - bookE critical exception
       - e500 machine check exception

   Assembler macros are provided and they can be
   expanded to generate prologue templates and
   flavored-wrappers for different flavors
   of exceptions. Currently, there are two prologues
   for all aforementioned flavors: one for synchronous
   exceptions, the other for interrupts.

   3 generic assembly-level code that does the bulk
     of saving register context and calling C-code.

   4 C-code (ppc_exc_hdl.c) for dispatching BSP/user
     handlers.

   5 Initialization code (vectors_init.c). All valid
     exceptions for the detected CPU are determined
     and a fitting prologue snippet for the exception
     category (classic, critical, synchronous or IRQ, ...)
     is generated from a template and the vector number
     and then installed in the vector area.

     The user/BSP only has to deal with installing
     high-level handlers but by default, the standard
     'C_dispatch_irq_handler' routine is hooked to
     the external and 'decrementer' exceptions.

   6 RTEMS IRQ API is implemented by 'irq.c'. It
     relies on a few routines to be provided by
     the BSP.

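Generating a 16-byte prologue 'from a template and the vector number'
(part 5) can be pictured as copying four instruction words and
patching a 16-bit immediate. The sketch below is an assumption about
how such patching could look, not the code in vectors_init.c: the
template contents, the patched word index and all 'example_'/'EX_'
names are hypothetical. The two opcodes used are real encodings
('ori 0,0,0' is the PPC nop, 0x38600000 is 'li r3,0'); a real
template would hold a branch to the flavored-wrapper instead of nops.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_PROLOGUE_WORDS 4  /* 16 bytes = four 32-bit PPC instructions */
#define EX_PATCH_INDEX    1  /* hypothetical: word carrying the vector */

/* Hypothetical template: nop, 'li r3,0', nop, nop. */
static const uint32_t example_template[EX_PROLOGUE_WORDS] = {
    0x60000000u, 0x38600000u, 0x60000000u, 0x60000000u
};

/* Copy the template and encode 'vector' into the immediate of 'li'. */
void example_make_prologue(uint32_t out[EX_PROLOGUE_WORDS], uint16_t vector)
{
    /* 'li' sign-extends its immediate, so keep the value positive */
    assert(vector <= 0x7fffu);
    memcpy(out, example_template, sizeof(example_template));
    out[EX_PATCH_INDEX] = (out[EX_PATCH_INDEX] & 0xffff0000u) | vector;
}
```

On real hardware the generated words would additionally have to be
copied to the vector address and the caches synchronized; that part
is omitted here.
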
USAGE
=====
        BSP writers must provide the following routines
        (declared in irq_supp.h):
        Interrupt controller (PIC) support:
                BSP_setup_the_pic()        - initialize PIC hardware
                BSP_enable_irq_at_pic()    - enable/disable given irq at PIC; IGNORE if
                BSP_disable_irq_at_pic()     irq number out of range!
                C_dispatch_irq_handler()   - handle irqs and dispatch user handlers;
                                             this routine SHOULD use the inline
                                             fragment

                                               bsp_irq_dispatch_list()

                                             provided by irq_supp.h
                                             for calling user handlers.

        BSP initialization; call

                initialize_exceptions();
                BSP_rtems_irq_mngt_set();

        Note that BSP_rtems_irq_mngt_set() hooks the C_dispatch_irq_handler()
        to the external and decrementer (PIT exception for bookE; a decrementer
        emulation is activated) exceptions for backwards compatibility reasons.
        C_dispatch_irq_handler() must therefore be able to support these two
        exceptions.
        However, the BSP implementor is free to disconnect
        C_dispatch_irq_handler() from either of these exceptions, to connect
        other handlers (e.g., for SYSMGMT exceptions) or to hook
        C_dispatch_irq_handler() to yet more exceptions etc. *after*
        BSP_rtems_irq_mngt_set() has executed.

        Hooking exceptions:

        The API defined in ppc_exc_bspsupp.h declares routines for connecting
        a C-handler to any exception. Note that the execution environment
        of the C-handler depends on the exception being synchronous or
        asynchronous:

                - synchronous exceptions use the task stack and do not
                  disable thread dispatching (scheduling).
                - asynchronous exceptions use a dedicated stack and do
                  defer thread dispatching until handling has (almost) finished.

        By inspecting the vector number stored in the exception frame
        the nature of the exception can be determined: asynchronous
        exceptions have the most significant bit(s) set.

        Any exception for which no dedicated handler is registered
        ends up being handled by the routine addressed by the
        (traditional) 'globalExcHdl' function pointer.

        Makefile.am:
                - make sure the Makefile.am does NOT use any of the files
                        vectors.S, vectors.h, vectors_init.c, irq_asm.S, irq.c
                  from 'libbsp/powerpc/shared' NOR must the BSP implement
                  any functionality that is provided by those files (and
                  now the middleware).

                - (probably) remove 'vectors.rel' and anything related

                - add

                  include_bsp_HEADERS += \
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/vectors.h   \
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/irq_supp.h  \
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/bspsupport/ppc_exc_bspsupp.h

                - add

                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/exc_bspsupport.rel \
                    ../../../libcpu/@RTEMS_CPU@/@exceptions@/irq_bspsupport.rel \

                  to 'libbsp_a_LIBADD'

                  (irq.c is in a separate '.rel' so that you can get support
                  for exceptions only).

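The hooking rules above (per-vector C handlers, a fallback for
unhooked vectors, and an 'asynchronous' marker in the upper bits of
the vector number) can be modelled in a few lines of C. This is a
stand-alone sketch and not the API of ppc_exc_bspsupp.h: the frame
layout, the handler signature, the choice of exactly one flag bit and
every 'example_'/'EX_' name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_ASYNC_FLAG  0x80000000u /* assumption: top bit marks async */
#define EX_MAX_VECTORS 32u         /* hypothetical table size */

/* Stand-in for the exception frame; only the vector field is modelled. */
typedef struct { uint32_t vector; } example_frame;
typedef int (*example_handler)(example_frame *frame);

static example_handler example_table[EX_MAX_VECTORS];
static example_handler example_global; /* plays the role of 'globalExcHdl' */

/* Asynchronous exceptions carry the flag in the stored vector number. */
int example_is_async(const example_frame *f)
{
    return (f->vector & EX_ASYNC_FLAG) != 0;
}

void example_set_handler(uint32_t vector, example_handler h)
{
    assert(vector < EX_MAX_VECTORS);
    example_table[vector] = h;
}

void example_set_global(example_handler h) { example_global = h; }

/* Call the dedicated handler, else the global one; -1 if neither exists. */
int example_dispatch(example_frame *f)
{
    uint32_t n = f->vector & ~EX_ASYNC_FLAG;
    example_handler h;

    assert(n < EX_MAX_VECTORS);
    h = example_table[n] != NULL ? example_table[n] : example_global;
    return h != NULL ? h(f) : -1;
}

/* Two sample handlers so the dispatch path can be exercised. */
int example_dedicated(example_frame *f) { (void)f; return 1; }
int example_fallback(example_frame *f)  { (void)f; return 0; }
```

With example_fallback installed as the global handler and
example_dedicated hooked to vector 5, a frame for vector 3 reaches
the fallback while a frame for (EX_ASYNC_FLAG | 5) reaches the
dedicated handler and is reported as asynchronous.
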
CAVEATS
=======

On classic PPCs, early (and late) parts of the low-level
exception handling code run with the MMU disabled which means
that the default caching attributes (write-back) are in effect
(thanks to Thomas Doerfler for bringing this up).
The code currently assumes that the MMU translations
for the task and interrupt stacks as well as some
variables in the data-area MATCH THE DEFAULT CACHING
ATTRIBUTES (this assumption also holds for the old code
in libbsp/powerpc/shared/vectors, ../irq).

During initialization of exception handling, a crude test
is performed to check if memory seems to have the write-back
attribute. The 'dcbz' instruction should - on most PPCs - cause
an alignment exception if the tested cache-line does not
have this attribute.

BSPs which entirely disable caching (e.g., by physically
disabling the cache(s)) should set the variable
  ppc_exc_cache_wb_check = 0
prior to calling initialize_exceptions().
Note that this check does not catch all possible
misconfigurations (e.g., on the 860, the default attribute
is AFAIK [libcpu/powerpc/mpc8xx/mmu/mmu_init.c] set to
'caching-disabled' which is potentially harmful but
this situation is not detected).


RACE CONDITION WHEN DEALING WITH CRITICAL INTERRUPTS
====================================================

   The problematic race condition is as follows:

   Usually, ISRs are allowed to use certain OS
   primitives such as, e.g., releasing a semaphore.
   In order to prevent a context switch from happening
   immediately (this would result in the ISR being
   suspended), thread-dispatching must be disabled
   around execution of the ISR. However, on the
   PPC architecture it is neither possible to
   atomically disable ALL interrupts nor is it
   possible to atomically increment a variable
   (the thread-dispatch-disable level).
   Hence, the following sequence of events could
   occur:
    1) low-priority interrupt (LPI) is taken
    2) before the LPI can increase the
       thread-dispatch-disable level or disable
       high-priority interrupts, a high-priority
       interrupt (HPI) happens
    3) HPI increases dispatch-disable level
    4) HPI executes high-priority ISR which, e.g.,
       posts a semaphore
    5) HPI decreases dispatch-disable level and
       realizes that a context switch is necessary
    6) context switch is performed since LPI had
       not gotten to the point where it could
       increase the dispatch-disable level.
   At this point, the LPI has been effectively
   suspended which means that the low-priority
   ISR will not be executed until the task
   interrupted in 1) is scheduled again!

   The solution to this problem is letting the
   first machine instruction of the low-priority
   exception handler write a non-zero value to
   a variable in memory:

        ee_vector_offset:

           stw r1, ee_lock@sdarel(r13)
           .. save some registers etc..
           .. increase thread-dispatch-disable-level
           .. clear 'ee_lock' variable

   After the HPI decrements the dispatch-disable level
   it checks 'ee_lock' and refrains from performing
   a context switch if 'ee_lock' is nonzero. Since
   the LPI will complete execution subsequently it
   will eventually do the context switch.

   For the single-instruction write operation we must
     a) write a register that is guaranteed to be
        non-zero (e.g., R1 (stack pointer) or R13
        (SVR4 short-data area pointer)).
     b) use an addressing mode that doesn't require
        loading any registers. The short-data area
        pointer R13 is appropriate.

   CAVEAT: unfortunately, this method by itself
   is *NOT* enough because raising a low-priority
   exception and executing the first instruction
   of the handler is *NOT* atomic. Hence, the following
   could occur:

    1) LPI is taken
    2) PC is saved in SRR0, PC is loaded with
       address of 'locking instruction'
          stw r1, ee_lock@sdarel(r13)
    3) ==> critical interrupt happens
    4) PC (containing address of locking instruction)
       is saved in CSRR0
    5) HPI is dispatched

   For the HPI to correctly handle this situation
   it does the following:

    a) increase thread-dispatch disable level
    b) do interrupt work
    c) decrease thread-dispatch disable level
    d) if ( dispatch-disable level == 0 )
         d1) check ee_lock
         d2) check instruction at *CSRR0
         d3) do a context switch if necessary ONLY IF
             ee_lock is NOT set AND *CSRR0 is NOT the
             'locking instruction'

   This works because the address of 'ee_lock'
   is embedded in the locking instruction
   'stw r1, ee_lock@sdarel(r13)' and because the
   registers r1/r13 have a special purpose
   (stack-pointer, SDA-pointer). Hence it is safe
   to assume that the particular instruction
   'stw r1, ee_lock@sdarel(r13)' never occurs
   anywhere else.

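Steps d) through d3) amount to a pure decision function, which can be
modelled in C for clarity. The real check lives in assembly; this is
only a sketch under assumptions. The struct, the function and the SDA
displacement 0x0010 inside EX_LOCKING_INSN are hypothetical; the
upper half 0x902d is the actual encoding of 'stw r1, D(r13)'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 'stw r1, ee_lock@sdarel(r13)' with a hypothetical displacement 0x0010. */
#define EX_LOCKING_INSN 0x902d0010u

/* Snapshot of what the HPI sees after step c). */
typedef struct {
    uint32_t dispatch_disable_level; /* value after the HPI decremented it */
    uint32_t ee_lock;                /* non-zero once the LPI took the lock */
    uint32_t insn_at_csrr0;          /* instruction word found at *CSRR0 */
    bool     switch_needed;          /* a context switch was requested */
} example_hpi_state;

/* Return true only if the HPI may perform the context switch itself. */
bool example_hpi_may_dispatch(const example_hpi_state *s)
{
    assert(s != NULL);
    if (s->dispatch_disable_level != 0)
        return false;               /* d) still nested */
    if (s->ee_lock != 0)
        return false;               /* d1) LPI already holds the lock */
    if (s->insn_at_csrr0 == EX_LOCKING_INSN)
        return false;               /* d2) LPI was about to take the lock */
    return s->switch_needed;        /* d3) */
}
```

In the two 'false' cases caused by the LPI, the switch is not lost:
the LPI completes afterwards and performs it.
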
   Another note: this algorithm also makes sure
   that ONLY nested ASYNCHRONOUS interrupts which
   enable/disable thread-dispatching and check if
   thread-dispatching is required before returning
   control engage in this locking protocol. It is
   important that when a critical, asynchronous
   interrupt interrupts a 'synchronous' exception
   (which does not disable thread-dispatching)
   the thread-dispatching operation upon return of
   the HPI is NOT deferred (because the synchronous
   handler would not, eventually, check for a
   dispatch requirement).

   And one more note: We never want to disable
   machine-check exceptions to avoid a checkstop.
   This means that we cannot use enabling/disabling
   this type of exception for protection of critical
   OS data structures.
   Therefore, calling OS primitives from an asynchronous
   machine-check handler is ILLEGAL and not supported.
   Since machine-checks can happen anytime it is not
   legal to test if a deferred context switch should
   be performed when the asynchronous machine-check
   handler returns (since _Context_Switch_is_necessary
   could have been set by an IRQ-protected section of
   code that was hit by the machine-check).
   Note that synchronous machine-checks can legally
   use OS primitives and currently there are no
   asynchronous machine-checks defined.

   Epilogue:

   You have to disable all asynchronous exceptions which may cause a context
   switch before restoring the SRRs and executing the RFI.  Reason:

      Suppose we are in the epilogue code of an EE between the move to SRRs and
      the RFI. Here EE is disabled but CE is enabled. Now a CE happens.  The
      handler decides that a thread dispatch is necessary. The CE checks if
      this is possible:

         o The thread dispatch disable level is 0, because the EE has already
           decremented it.
         o The EE lock variable is cleared.
         o The EE is not executing its first instruction.

      Hence a thread dispatch is allowed. The CE issues a context switch to a
      task with EE enabled (for example a task waiting for a semaphore). Now an
      EE happens and the current content of the SRRs is lost.