1 | .. comment SPDX-License-Identifier: CC-BY-SA-4.0 |
---|
2 | |
---|
3 | .. COMMENT: COPYRIGHT (c) 1988-2002. |
---|
4 | .. COMMENT: On-Line Applications Research Corporation (OAR). |
---|
5 | .. COMMENT: All rights reserved. |
---|
6 | |
---|
7 | SPARC Specific Information |
---|
8 | ************************** |
---|
9 | |
---|
10 | The Real Time Executive for Multiprocessor Systems (RTEMS) is designed to be |
---|
11 | portable across multiple processor architectures. However, the nature of |
---|
12 | real-time systems makes it essential that the application designer understand |
---|
13 | certain processor dependent implementation details. These processor |
---|
14 | dependencies include calling convention, board support package issues, |
---|
15 | interrupt processing, exact RTEMS memory requirements, performance data, header |
---|
16 | files, and the assembly language interface to the executive. |
---|
17 | |
---|
18 | This document discusses the SPARC architecture dependencies in this port of |
---|
19 | RTEMS. This architectural port is for SPARC Version 7 and |
---|
20 | 8. Implementations for SPARC V9 are in the sparc64 target. |
---|
21 | |
---|
22 | It is highly recommended that the SPARC RTEMS application developer obtain and |
---|
23 | become familiar with the documentation for the processor being used as well as |
---|
24 | the specification for the revision of the SPARC architecture which corresponds |
---|
25 | to that processor. |
---|
26 | |
---|
27 | **SPARC Architecture Documents** |
---|
28 | |
---|
29 | For information on the SPARC architecture, refer to the following documents |
---|
30 | available from SPARC International, Inc. (http://www.sparc.com): |
---|
31 | |
---|
32 | - SPARC Standard Version 7. |
---|
33 | |
---|
34 | - SPARC Standard Version 8. |
---|
35 | |
---|
36 | **ERC32 Specific Information** |
---|
37 | |
---|
38 | The European Space Agency's ERC32 is a microprocessor implementing a |
---|
39 | SPARC V7 processor and associated support circuitry for embedded space |
---|
40 | applications. The integer and floating-point units (90C601E & 90C602E) are |
---|
41 | based on the Cypress 7C601 and 7C602, with additional error-detection and |
---|
42 | recovery functions. The memory controller (MEC) implements system support |
---|
43 | functions such as address decoding, memory interface, DMA interface, UARTs, |
---|
44 | timers, interrupt control, write-protection, memory reconfiguration and |
---|
45 | error-detection. The core is designed to work at 25 MHz, but using space |
---|
46 | qualified memories limits the system frequency to around 15 MHz, resulting in a |
---|
47 | performance of 10 MIPS and 2 MFLOPS. |
---|
48 | |
---|
49 | The ERC32 is available from Atmel as the TSC695F. |
---|
50 | |
---|
51 | The RTEMS configuration of GDB enables the SPARC Instruction Simulator (SIS) |
---|
52 | which can simulate the ERC32 as well as the follow-up LEON2 and LEON3 |
---|
53 | microprocessors. |
---|
54 | |
---|
55 | CPU Model Dependent Features |
---|
56 | ============================ |
---|
57 | |
---|
58 | Microprocessors are generally classified into families with a variety of CPU |
---|
59 | models or implementations within that family. Within a processor family, there |
---|
60 | is a high level of binary compatibility. This family may be based on either an |
---|
61 | architectural specification or on maintaining compatibility with a popular |
---|
62 | processor. Recent microprocessor families such as the SPARC or PowerPC are |
---|
63 | based on an architectural specification which is independent of any particular |
---|
64 | CPU model or implementation. Older families such as the M68xxx and the iX86 |
---|
65 | evolved as the manufacturer strived to produce higher performance processor |
---|
66 | models which maintained binary compatibility with older models. |
---|
67 | |
---|
68 | RTEMS takes advantage of the similarity of the various models within a CPU |
---|
69 | family. Although the models do vary in significant ways, the high level of |
---|
70 | compatibility makes it possible to share the bulk of the CPU dependent |
---|
71 | executive code across the entire family. |
---|
72 | |
---|
73 | CPU Model Feature Flags |
---|
74 | ----------------------- |
---|
75 | |
---|
76 | Each processor family supported by RTEMS has a list of features which vary |
---|
77 | between CPU models within a family. For example, the most common model |
---|
78 | dependent feature regardless of CPU family is the presence or absence of a |
---|
79 | floating point unit or coprocessor. When defining the list of features present |
---|
80 | on a particular CPU model, one simply notes that floating point hardware is or |
---|
81 | is not present and defines a single constant appropriately. Conditional |
---|
82 | compilation is utilized to include the appropriate source code for this CPU |
---|
83 | model's feature set. It is important to note that RTEMS is therefore compiled |
---|
84 | with the feature set and compilation flags best suited to the CPU model in |
---|
85 | use. The alternative would be to generate a binary which would execute on all |
---|
86 | family members using only the features which are always present on every |
---|
87 | model. |
---|
88 | |
---|
89 | This section presents the set of features which vary across SPARC |
---|
90 | implementations and are of importance to RTEMS. The set of CPU model feature |
---|
91 | macros are defined in the file cpukit/score/cpu/sparc/sparc.h based upon the |
---|
92 | particular CPU model defined on the compilation command line. |
---|
93 | |
---|
94 | CPU Model Name |
---|
95 | ~~~~~~~~~~~~~~ |
---|
96 | |
---|
97 | The macro CPU_MODEL_NAME is a string which designates the name of this CPU |
---|
98 | model. For example, for the European Space Agency's ERC32 SPARC model, this |
---|
99 | macro is set to the string "erc32". |
---|
100 | |
---|
101 | Floating Point Unit |
---|
102 | ~~~~~~~~~~~~~~~~~~~ |
---|
103 | |
---|
104 | The macro SPARC_HAS_FPU is set to 1 to indicate that this CPU model has a |
---|
105 | hardware floating point unit and 0 otherwise. |
---|
106 | |
---|
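As a minimal sketch of how conditional compilation on such a feature macro might look, the fragment below selects one of two code paths at compile time. The macro is given a default value here only so that the fragment is self-contained; in RTEMS the value comes from sparc.h. The helper ``demo_fpu_support`` is a hypothetical name used for illustration and is not part of RTEMS.

```c
/* Assumption: a default is supplied here only to keep the sketch
 * self-contained. In RTEMS the value is derived in sparc.h from the
 * CPU model selected on the compilation command line. */
#ifndef SPARC_HAS_FPU
#define SPARC_HAS_FPU 1
#endif

/* Hypothetical helper, not an RTEMS API: reports which floating point
 * variant this translation unit was compiled for. */
static const char *demo_fpu_support(void)
{
#if SPARC_HAS_FPU == 1
  return "hardware floating point";
#else
  return "software floating point";
#endif
}
```

Because the choice is made by the preprocessor, the code for the unused variant is not present in the resulting binary at all.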
107 | Bitscan Instruction |
---|
108 | ~~~~~~~~~~~~~~~~~~~ |
---|
109 | |
---|
110 | The macro SPARC_HAS_BITSCAN is set to 1 to indicate that this CPU model has the |
---|
111 | bitscan instruction. For example, this instruction is supported by the Fujitsu |
---|
112 | SPARClite family. |
---|
113 | |
---|
114 | Number of Register Windows |
---|
115 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
---|
116 | |
---|
117 | The macro SPARC_NUMBER_OF_REGISTER_WINDOWS is set to indicate the number of |
---|
118 | register window sets implemented by this CPU model. The SPARC architecture |
---|
119 | allows for a maximum of thirty-two register window sets although most |
---|
120 | implementations only include eight. |
---|
121 | |
---|
122 | Low Power Mode |
---|
123 | ~~~~~~~~~~~~~~ |
---|
124 | |
---|
125 | The macro SPARC_HAS_LOW_POWER_MODE is set to one to indicate that this CPU |
---|
126 | model has a low power mode. If low power is enabled, then there must be a CPU |
---|
127 | model specific implementation of the IDLE task in cpukit/score/cpu/sparc/cpu.c. |
---|
128 | The low power mode IDLE task should be of the form: |
---|
129 | |
---|
130 | .. code-block:: c |
---|
131 | |
---|
132 | while ( TRUE ) { |
---|
133 | /* enter low power mode (CPU model specific) */ |
---|
134 | } |
---|
135 | |
---|
136 | The code required to enter low power mode is CPU model specific. |
---|
137 | |
---|
138 | CPU Model Implementation Notes |
---|
139 | ------------------------------ |
---|
140 | |
---|
141 | The ERC32 is a custom SPARC V7 implementation based on the Cypress 601/602 |
---|
142 | chipset. This CPU has a number of on-board peripherals and was developed by |
---|
143 | the European Space Agency to target space applications. RTEMS currently |
---|
144 | provides support for the following peripherals: |
---|
145 | |
---|
146 | - UART Channels A and B |
---|
147 | |
---|
148 | - General Purpose Timer |
---|
149 | |
---|
150 | - Real Time Clock |
---|
151 | |
---|
152 | - Watchdog Timer (so it can be disabled) |
---|
153 | |
---|
154 | - Control Register (so powerdown mode can be enabled) |
---|
155 | |
---|
156 | - Memory Control Register |
---|
157 | |
---|
158 | - Interrupt Control |
---|
159 | |
---|
160 | The General Purpose Timer and Real Time Clock Timer provided with the ERC32 |
---|
161 | share the Timer Control Register. Because the Timer Control Register is write |
---|
162 | only, we must mirror it in software and ensure that writes to one timer do not |
---|
163 | alter the current settings and status of the other timer. Routines are |
---|
164 | provided in erc32.h which promote the view that the two timers are completely |
---|
165 | independent. By exclusively using these routines to access the Timer Control |
---|
166 | Register, the application can view the system as having a General Purpose Timer |
---|
167 | Control Register and a Real Time Clock Timer Control Register rather than the |
---|
168 | single shared value. |
---|
169 | |
---|
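The mirroring technique described above can be sketched as follows. This is an illustration under stated assumptions, not the code from erc32.h: the field masks, the variable standing in for the hardware register, and all ``demo_`` names are hypothetical. A real implementation would write to the memory-mapped register and would also have to protect the read-modify-write sequence against interrupts.

```c
#include <stdint.h>

/* Hypothetical field masks; consult the MEC documentation for the
 * actual layout of the Timer Control Register. */
#define DEMO_GPT_MASK 0x0000000fu
#define DEMO_RTC_MASK 0x00000f00u

/* Stand-in for the write-only hardware register (a plain variable
 * here so the sketch can run anywhere). */
static volatile uint32_t demo_timer_control_hw;

/* Software mirror holding the last value written to the register. */
static uint32_t demo_timer_control_shadow;

/* Update only the General Purpose Timer bits. */
static void demo_set_gpt_control(uint32_t value)
{
  demo_timer_control_shadow =
    (demo_timer_control_shadow & ~DEMO_GPT_MASK) | (value & DEMO_GPT_MASK);
  demo_timer_control_hw = demo_timer_control_shadow;
}

/* Update only the Real Time Clock Timer bits. */
static void demo_set_rtc_control(uint32_t value)
{
  demo_timer_control_shadow =
    (demo_timer_control_shadow & ~DEMO_RTC_MASK) | (value & DEMO_RTC_MASK);
  demo_timer_control_hw = demo_timer_control_shadow;
}
```

Since the hardware register is never read, the mirror is the only record of the other timer's settings, which is why every write must go through these routines.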
170 | The RTEMS Idle thread takes advantage of the low power mode provided by the |
---|
171 | ERC32. Low power mode is entered during idle loops and is enabled at |
---|
172 | initialization time. |
---|
173 | |
---|
174 | Calling Conventions |
---|
175 | =================== |
---|
176 | |
---|
177 | Each high-level language compiler generates subroutine entry and exit code |
---|
178 | based upon a set of rules known as the application binary interface (ABI) |
---|
179 | calling convention. These rules address the following issues: |
---|
180 | |
---|
181 | - register preservation and usage |
---|
182 | |
---|
183 | - parameter passing |
---|
184 | |
---|
185 | - call and return mechanism |
---|
186 | |
---|
187 | An ABI calling convention is of importance when interfacing to subroutines |
---|
188 | written in another language, either assembly or high-level. It also determines |
---|
189 | the set of registers to be saved or restored during a context switch and |
---|
190 | interrupt processing. |
---|
191 | |
---|
192 | The ABI relevant for RTEMS on SPARC is defined by SYSTEM V APPLICATION BINARY |
---|
193 | INTERFACE, SPARC Processor Supplement, Third Edition. |
---|
194 | |
---|
195 | Programming Model |
---|
196 | ----------------- |
---|
197 | |
---|
198 | This section discusses the programming model for the SPARC architecture. |
---|
199 | |
---|
200 | Non-Floating Point Registers |
---|
201 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
---|
202 | |
---|
203 | The SPARC architecture defines thirty-two non-floating point registers directly |
---|
204 | visible to the programmer. These are divided into four sets: |
---|
205 | |
---|
206 | - input registers |
---|
207 | |
---|
208 | - local registers |
---|
209 | |
---|
210 | - output registers |
---|
211 | |
---|
212 | - global registers |
---|
213 | |
---|
214 | Each register is referred to by either two or three names in the SPARC |
---|
215 | reference manuals. First, the registers are referred to as r0 through r31 or |
---|
216 | with the alternate notation r[0] through r[31]. Second, each register is a |
---|
217 | member of one of the four sets listed above. Finally, some registers have an |
---|
218 | architecturally defined role in the programming model which provides an |
---|
219 | alternate name. The following table describes the mapping between the 32 |
---|
220 | registers and the register sets: |
---|
221 | |
---|
222 | ================ ================ =================== |
---|
223 | Register Number Register Names Description |
---|
224 | ================ ================ =================== |
---|
225 | 0 - 7 g0 - g7 Global Registers |
---|
226 | 8 - 15 o0 - o7 Output Registers |
---|
227 | 16 - 23 l0 - l7 Local Registers |
---|
228 | 24 - 31 i0 - i7 Input Registers |
---|
229 | ================ ================ =================== |
---|
230 | |
---|
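The mapping in the table above is regular enough to compute. The following sketch derives the set-relative name from a register number; ``demo_register_name`` is a hypothetical helper introduced only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes the set-relative name of r[number]
 * (for example "o6" for r14) into buf. Returns -1 for an invalid
 * register number, otherwise the result of snprintf(). */
static int demo_register_name(unsigned number, char *buf, size_t size)
{
  /* Sets in register-number order: global, output, local, input. */
  static const char set_letter[4] = { 'g', 'o', 'l', 'i' };

  if (number > 31u) {
    return -1;
  }
  return snprintf(buf, size, "%c%u", set_letter[number / 8u], number % 8u);
}
```

For example, register number 14 maps to o6 and register number 30 maps to i6.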
231 | As mentioned above, some of the registers serve defined roles in the |
---|
232 | programming model. The following table describes the role of each of these |
---|
233 | registers: |
---|
234 | |
---|
235 | ============== ================ ================================== |
---|
236 | Register Name Alternate Name Description |
---|
237 | ============== ================ ================================== |
---|
238 | g0 na reads return 0, writes are ignored |
---|
239 | o6 sp stack pointer |
---|
240 | i6 fp frame pointer |
---|
241 | i7 na return address |
---|
242 | ============== ================ ================================== |
---|
243 | |
---|
244 | The registers g2 through g4 are reserved for applications. GCC uses them as |
---|
245 | volatile registers by default, so they are treated like volatile registers in |
---|
246 | RTEMS as well. |
---|
247 | |
---|
248 | The register g6 is reserved for the operating system and contains the address |
---|
249 | of the per-CPU control block of the current processor. This register is |
---|
250 | initialized during system start and then remains unchanged. It is not |
---|
251 | saved/restored by the context switch or interrupt processing code. |
---|
252 | |
---|
253 | The register g7 is reserved for the operating system and contains the thread |
---|
254 | pointer used for thread-local storage (TLS) as mandated by the SPARC ABI. |
---|
255 | |
---|
256 | Floating Point Registers |
---|
257 | ~~~~~~~~~~~~~~~~~~~~~~~~ |
---|
258 | |
---|
259 | The SPARC V7 architecture includes thirty-two 32-bit floating point registers. These |
---|
260 | registers may be viewed as follows: |
---|
261 | |
---|
262 | - 32 single precision floating point or integer registers (f0, f1, ... f31) |
---|
263 | |
---|
264 | - 16 double precision floating point registers (f0, f2, f4, ... f30) |
---|
265 | |
---|
266 | - 8 extended precision floating point registers (f0, f4, f8, ... f28) |
---|
267 | |
---|
268 | The floating point status register (FSR) specifies the behavior of the floating |
---|
269 | point unit for rounding, contains its condition codes, version specification, |
---|
270 | and trap information. |
---|
271 | |
---|
272 | According to the ABI all floating point registers and the floating point status |
---|
273 | register (FSR) are volatile. Thus the floating point context of a thread is |
---|
274 | the empty set. The rounding direction is a system global state and must not be |
---|
275 | modified by threads. |
---|
276 | |
---|
277 | A queue of the floating point instructions which have started execution but not |
---|
278 | yet completed is maintained. This queue is needed to support the multiple |
---|
279 | cycle nature of floating point operations and to aid floating point exception |
---|
280 | trap handlers. Once a floating point exception has been encountered, the queue |
---|
281 | is frozen until it is emptied by the trap handler. The floating point queue is |
---|
282 | loaded by launching instructions. It is emptied normally when the floating |
---|
283 | point unit completes all outstanding instructions and by floating point exception |
---|
284 | handlers with the store double floating point queue (stdfq) instruction. |
---|
285 | |
---|
286 | Special Registers |
---|
287 | ~~~~~~~~~~~~~~~~~ |
---|
288 | |
---|
289 | The SPARC architecture includes two special registers which are critical to the |
---|
290 | programming model: the Processor State Register (psr) and the Window Invalid |
---|
291 | Mask (wim). The psr contains the condition codes, processor interrupt level, |
---|
292 | trap enable bit, supervisor mode and previous supervisor mode bits, version |
---|
293 | information, floating point unit and coprocessor enable bits, and the current |
---|
294 | window pointer (cwp). The cwp field of the psr and wim register are used to |
---|
295 | manage the register windows in the SPARC architecture. The register windows |
---|
296 | are discussed in more detail below. |
---|
297 | |
---|
298 | Register Windows |
---|
299 | ---------------- |
---|
300 | |
---|
301 | The SPARC architecture includes the concept of register windows. An overly |
---|
302 | simplistic way to think of these windows is to imagine them as being an |
---|
303 | infinite supply of "fresh" register sets available for each subroutine to use. |
---|
304 | In reality, they are much more complicated. |
---|
305 | |
---|
306 | The save instruction is used to obtain a new register window. This instruction |
---|
307 | decrements the current window pointer, thus providing a new set of registers |
---|
308 | for use. This register set includes eight fresh local registers for use |
---|
309 | exclusively by this subroutine. When done with a register set, the restore |
---|
310 | instruction increments the current window pointer and the previous register set |
---|
311 | is once again available. |
---|
312 | |
---|
313 | The two primary issues complicating the use of register windows are that (1) |
---|
314 | the set of register windows is finite, and (2) some registers are shared |
---|
315 | between adjacent register windows. |
---|
316 | |
---|
317 | Because the set of register windows is finite, it is possible to execute enough |
---|
318 | save instructions without corresponding restores to consume all of the |
---|
319 | register windows. This is easily accomplished in a high level language because |
---|
320 | each subroutine typically performs a save instruction upon entry. Thus having |
---|
321 | a subroutine call depth greater than the number of register windows will result |
---|
322 | in a window overflow condition. The window overflow condition generates a trap |
---|
323 | which must be handled in software. The window overflow trap handler is |
---|
324 | responsible for saving the contents of the oldest register window on the |
---|
325 | program stack. |
---|
326 | |
---|
327 | Similarly, the subroutines will eventually complete and begin to perform |
---|
328 | restores. If the restore results in the need for a register window which has |
---|
329 | previously been written to memory as part of an overflow, then a window |
---|
330 | underflow condition results. Just like the window overflow, the window |
---|
331 | underflow condition must be handled in software by a trap handler. The window |
---|
332 | underflow trap handler is responsible for reloading the contents of the |
---|
333 | register window requested by the restore instruction from the program stack. |
---|
334 | |
---|
335 | The Window Invalid Mask (wim) and the Current Window Pointer (cwp) field in the |
---|
336 | psr are used in conjunction to manage the finite set of register windows and |
---|
337 | detect the window overflow and underflow conditions. The cwp contains the |
---|
338 | index of the register window currently in use. The save instruction decrements |
---|
339 | the cwp modulo the number of register windows. Similarly, the restore |
---|
340 | instruction increments the cwp modulo the number of register windows. Each bit |
---|
341 | in the wim represents whether a register window contains valid |
---|
342 | information. The value of 0 indicates the register window is valid and 1 |
---|
343 | indicates it is invalid. When a save instruction causes the cwp to point to a |
---|
344 | register window which is marked as invalid, a window overflow condition |
---|
345 | results. Conversely, the restore instruction may result in a window underflow |
---|
346 | condition. |
---|
347 | |
---|
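The interaction of the cwp and the wim can be sketched with a small software model. This is a simplified illustration, not RTEMS code: it assumes eight register windows, models only the overflow and underflow checks, and does not model the trap entry itself. All ``demo_`` names are hypothetical.

```c
#include <stdint.h>

#define DEMO_NWINDOWS 8u /* assumption: the common 8-window configuration */

typedef enum {
  DEMO_OK,
  DEMO_WINDOW_OVERFLOW,
  DEMO_WINDOW_UNDERFLOW
} demo_result;

typedef struct {
  uint32_t cwp; /* current window pointer, 0 .. DEMO_NWINDOWS - 1 */
  uint32_t wim; /* bit n set means register window n is invalid */
} demo_window_state;

/* Model of save: decrement the cwp modulo the number of windows,
 * unless the target window is marked invalid in the wim. */
static demo_result demo_save(demo_window_state *s)
{
  uint32_t next = (s->cwp + DEMO_NWINDOWS - 1u) % DEMO_NWINDOWS;

  if ((s->wim & (1u << next)) != 0u) {
    return DEMO_WINDOW_OVERFLOW; /* the save itself does not complete */
  }
  s->cwp = next;
  return DEMO_OK;
}

/* Model of restore: increment the cwp modulo the number of windows,
 * unless the target window is marked invalid in the wim. */
static demo_result demo_restore(demo_window_state *s)
{
  uint32_t next = (s->cwp + 1u) % DEMO_NWINDOWS;

  if ((s->wim & (1u << next)) != 0u) {
    return DEMO_WINDOW_UNDERFLOW;
  }
  s->cwp = next;
  return DEMO_OK;
}
```

With eight windows and a single invalid window, starting from the window just below the invalid one, six saves succeed before the seventh reports an overflow.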
348 | Other than the assumption that a register window is always available for trap |
---|
349 | (i.e. interrupt) handlers, the SPARC architecture places no limits on the |
---|
350 | number of register windows simultaneously marked as invalid (i.e. number of |
---|
351 | bits set in the wim). However, RTEMS assumes that only one register window is |
---|
352 | marked invalid at a time (i.e. only one bit set in the wim). This makes the |
---|
353 | maximum possible number of register windows available to the user while still |
---|
354 | meeting the requirement that window overflow and underflow conditions can be |
---|
355 | detected. |
---|
356 | |
---|
357 | The window overflow and window underflow trap handlers are a critical part of |
---|
358 | the run-time environment for a SPARC application. The SPARC architectural |
---|
359 | specification allows for the number of register windows to be any value from |
---|
360 | 2 to 32. The most common choice for SPARC implementations |
---|
361 | appears to be 8 register windows. This results in the cwp ranging in value |
---|
362 | from 0 to 7 on most implementations. |
---|
363 | |
---|
364 | The second complicating factor is the sharing of registers between adjacent |
---|
365 | register windows. While each register window has its own set of local |
---|
366 | registers, the input and output registers are shared between adjacent windows. |
---|
367 | The output registers for register window N are the same as the input registers |
---|
368 | for register window ((N - 1) modulo RW) where RW is the number of register |
---|
369 | windows. An alternative way to think of this is to remember how parameters are |
---|
370 | passed to a subroutine on the SPARC. The caller loads values into what are its |
---|
371 | output registers. Then after the callee executes a save instruction, those |
---|
372 | parameters are available in its input registers. This is a very efficient way |
---|
373 | to pass parameters as no data is actually moved by the save or restore |
---|
374 | instructions. |
---|
375 | |
---|
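One way to picture the sharing is to treat the windowed registers as a flat array and compute which element a given register of a given window refers to. The sketch below is a conceptual model under the assumption of eight windows; the array layout and the names ``DEMO_RW`` and ``demo_physical_index`` are hypothetical and say nothing about how a real implementation organizes its register file.

```c
#define DEMO_RW 8u /* assumption: number of register windows */

/* Hypothetical model: index into a flat array of DEMO_RW * 16 windowed
 * registers. r is the register number 8..31 as seen by the program
 * (the globals 0..7 are not windowed and are not handled here). */
static unsigned demo_physical_index(unsigned window, unsigned r)
{
  if (r >= 24u) {
    /* The input registers of this window are the output registers of
     * window ((window + 1) modulo DEMO_RW). */
    window = (window + 1u) % DEMO_RW;
    r -= 16u; /* i0..i7 become o0..o7 of that window */
  }
  /* o0..o7 occupy slots 0..7 and l0..l7 occupy slots 8..15. */
  return window * 16u + (r - 8u);
}
```

In this model the o0 of window 3 and the i0 of window 2 resolve to the same array element, while the local registers of the two windows do not overlap.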
376 | Call and Return Mechanism |
---|
377 | ------------------------- |
---|
378 | |
---|
379 | The SPARC architecture supports a simple yet effective call and return |
---|
380 | mechanism. A subroutine is invoked via the call (call) instruction. This |
---|
381 | instruction places the return address in the caller's output register 7 (o7). |
---|
382 | After the callee executes a save instruction, this value is available in input |
---|
383 | register 7 (i7) until the corresponding restore instruction is executed. |
---|
384 | |
---|
385 | The callee returns to the caller via a jmp to the return address. There is a |
---|
386 | delay slot following this instruction which is commonly used to execute a |
---|
387 | restore instruction - if a register window was allocated by this subroutine. |
---|
388 | |
---|
389 | It is important to note that the SPARC subroutine call and return mechanism |
---|
390 | does not automatically save and restore any registers. This is accomplished |
---|
391 | via the save and restore instructions which manage the set of register |
---|
392 | windows. |
---|
393 | |
---|
394 | In case a floating-point unit is supported, then floating-point return values |
---|
395 | appear in the floating-point registers. Single-precision values occupy %f0; |
---|
396 | double-precision values occupy %f0 and %f1. Otherwise, these are scratch |
---|
397 | registers. Because of this, the hardware and software floating-point ABIs are |
---|
398 | incompatible. |
---|
399 | |
---|
400 | Calling Mechanism |
---|
401 | ----------------- |
---|
402 | |
---|
403 | All RTEMS directives are invoked using the regular SPARC calling convention via |
---|
404 | the call instruction. |
---|
405 | |
---|
406 | Register Usage |
---|
407 | -------------- |
---|
408 | |
---|
409 | As discussed above, the call instruction does not automatically save any |
---|
410 | registers. The save and restore instructions are used to allocate and |
---|
411 | deallocate register windows. When a register window is allocated, the new set |
---|
412 | of local registers are available for the exclusive use of the subroutine which |
---|
413 | allocated this register set. |
---|
414 | |
---|
415 | Parameter Passing |
---|
416 | ----------------- |
---|
417 | |
---|
418 | RTEMS assumes that arguments are placed in the caller's output registers with |
---|
419 | the first argument in output register 0 (o0), the second argument in output |
---|
420 | register 1 (o1), and so forth. Until the callee executes a save instruction, |
---|
421 | the parameters are still visible in the output registers. After the callee |
---|
422 | executes a save instruction, the parameters are visible in the corresponding |
---|
423 | input registers. The following pseudo-code illustrates the typical sequence |
---|
424 | used to call an RTEMS directive with three (3) arguments: |
---|
425 | |
---|
426 | .. code-block:: c |
---|
427 | |
---|
428 | load third argument into o2 |
---|
429 | load second argument into o1 |
---|
430 | load first argument into o0 |
---|
431 | invoke directive |
---|
432 | |
---|
433 | User-Provided Routines |
---|
434 | ---------------------- |
---|
435 | |
---|
436 | All user-provided routines invoked by RTEMS, such as user extensions, device |
---|
437 | drivers, and MPCI routines, must also adhere to these calling conventions. |
---|
438 | |
---|
439 | Memory Model |
---|
440 | ============ |
---|
441 | |
---|
442 | A processor may support any combination of memory models ranging from pure |
---|
443 | physical addressing to complex demand paged virtual memory systems. RTEMS |
---|
444 | supports a flat memory model which ranges contiguously over the processor's |
---|
445 | allowable address space. RTEMS does not support segmentation or virtual memory |
---|
446 | of any kind. The appropriate memory model for RTEMS provided by the targeted |
---|
447 | processor and related characteristics of that model are described in this |
---|
448 | chapter. |
---|
449 | |
---|
450 | Flat Memory Model |
---|
451 | ----------------- |
---|
452 | |
---|
453 | The SPARC architecture supports a flat 32-bit address space with addresses |
---|
454 | ranging from 0x00000000 to 0xFFFFFFFF (4 gigabytes). Each address is |
---|
455 | represented by a 32-bit value and is byte addressable. The address may be used |
---|
456 | to reference a single byte, half-word (2-bytes), word (4 bytes), or doubleword |
---|
457 | (8 bytes). Memory accesses within this address space are performed in big |
---|
458 | endian fashion by the SPARC. Memory accesses which are not properly aligned |
---|
459 | generate a "memory address not aligned" trap (type number 7). The following |
---|
460 | table lists the alignment requirements for a variety of data accesses: |
---|
461 | |
---|
462 | ============== ====================== |
---|
463 | Data Type Alignment Requirement |
---|
464 | ============== ====================== |
---|
465 | byte 1 |
---|
466 | half-word 2 |
---|
467 | word 4 |
---|
468 | doubleword 8 |
---|
469 | ============== ====================== |
---|
470 | |
---|
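The alignment requirements in the table above can be checked with a single mask operation, since every access size is a power of two. The helper name ``demo_is_aligned`` is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is suitably aligned for an access of the given size.
 * The size must be 1, 2, 4, or 8 bytes (byte, half-word, word, or
 * doubleword). */
static bool demo_is_aligned(uintptr_t addr, size_t size)
{
  return (addr & (uintptr_t) (size - 1u)) == 0u;
}
```

For example, address 0x1004 is acceptable for a word access but not for a doubleword access.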
471 | Doubleword load and store operations must use a pair of registers as their |
---|
472 | source or destination. This pair of registers must be an adjacent pair of |
---|
473 | registers with the first of the pair being even numbered. For example, a valid |
---|
474 | destination for a doubleword load might be input registers 0 and 1 (i0 and i1). |
---|
475 | The pair i1 and i2 would be invalid. [NOTE: Some assemblers for the SPARC do |
---|
476 | not generate an error if an odd numbered register is specified as the beginning |
---|
477 | register of the pair. In this case, the assembler assumes that what the |
---|
478 | programmer meant was to use the even-odd pair which ends at the specified |
---|
479 | register. This may or may not have been a correct assumption.] |
---|
480 | |
---|
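The register pair rule, and the silent correction some assemblers apply, can be expressed in terms of register numbers. Both helpers below are hypothetical and only illustrate the arithmetic.

```c
#include <stdbool.h>

/* True if register number r (0..31) may begin a doubleword pair,
 * that is, if it is even numbered. */
static bool demo_is_valid_pair_start(unsigned r)
{
  return r < 32u && (r % 2u) == 0u;
}

/* First register of the even-odd pair that contains r. For an odd r
 * this is the pair which ends at r, which is what some assemblers
 * silently assume. */
static unsigned demo_pair_containing(unsigned r)
{
  return r & ~1u;
}
```

With i0 being register number 24, the pair (i0, i1) starts at an even number and is valid, while a pair starting at i1 (register number 25) is not.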
481 | RTEMS does not support any SPARC Memory Management Units; therefore, virtual |
---|
482 | memory or segmentation systems involving the SPARC are not supported. |
---|
483 | |
---|
484 | Interrupt Processing |
---|
485 | ==================== |
---|
486 | |
---|
487 | Each type of processor responds to the occurrence of an interrupt in its |
---|
488 | own unique fashion. In addition, each processor type provides a control |
---|
489 | mechanism to allow for the proper handling of an interrupt. The processor |
---|
490 | dependent response to the interrupt modifies the current execution state and |
---|
491 | results in a change in the execution stream. Most processors require that an |
---|
492 | interrupt handler utilize some special control mechanisms to return to the |
---|
493 | normal processing stream. Although RTEMS hides many of the processor dependent |
---|
494 | details of interrupt processing, it is important to understand how the RTEMS |
---|
495 | interrupt manager is mapped onto the processor's unique architecture. Discussed |
---|
496 | in this chapter are the SPARC's interrupt response and control mechanisms as |
---|
497 | they pertain to RTEMS. |
---|
498 | |
---|
499 | RTEMS and associated documentation use the terms interrupt and vector. In the |
---|
500 | SPARC architecture, these terms correspond to traps and trap type, |
---|
501 | respectively. The terms will be used interchangeably in this manual. |
---|
502 | |
---|
503 | Synchronous Versus Asynchronous Traps |
---|
504 | ------------------------------------- |
---|
505 | |
---|
506 | The SPARC architecture includes two classes of traps: synchronous and |
---|
507 | asynchronous. Asynchronous traps occur when an external event interrupts the |
---|
508 | processor. These traps are not associated with any instruction executed by the |
---|
509 | processor and logically occur between instructions. The instruction currently |
---|
510 | in the execute stage of the processor is allowed to complete although |
---|
511 | subsequent instructions are annulled. The return address reported by the |
---|
512 | processor for asynchronous traps is the pair of instructions following the |
---|
513 | current instruction. |
---|
514 | |
---|
515 | Synchronous traps are caused by the actions of an instruction. The trap |
---|
516 | stimulus in this case either occurs internally to the processor or is from an |
---|
517 | external signal that was provoked by the instruction. These traps are taken |
---|
518 | immediately and the instruction that caused the trap is aborted before any |
---|
519 | state changes occur in the processor itself. The return address reported by |
---|
520 | the processor for synchronous traps is the instruction which caused the trap |
---|
521 | and the following instruction. |
---|
522 | |
---|
523 | Vectoring of Interrupt Handler |
---|
524 | ------------------------------ |
---|
525 | |
---|
526 | Upon receipt of an interrupt the SPARC automatically performs the following |
---|
527 | actions: |
---|
528 | |
---|
529 | - disables traps (sets the ET bit of the psr to 0), |
---|
530 | |
---|
531 | - the S bit of the psr is copied into the Previous Supervisor Mode (PS) bit of |
---|
532 | the psr, |
---|
533 | |
---|
534 | - the cwp is decremented by one (modulo the number of register windows) to |
---|
535 | activate a trap window, |
---|
536 | |
---|
537 | - the PC and nPC are loaded into local registers 1 and 2 (l1 and l2), |
---|
538 | |
---|
539 | - the trap type (tt) field of the Trap Base Register (TBR) is set to the |
---|
540 | appropriate value, and |
---|
541 | |
---|
542 | - if the trap is not a reset, then the PC is written with the contents of the |
---|
543 | TBR and the nPC is written with TBR + 4. If the trap is a reset, then the PC |
---|
544 | is set to zero and the nPC is set to 4. |
---|
545 | |
---|
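The last step in the list above amounts to a simple address computation. In the SPARC V8 layout of the TBR, the trap base address occupies bits 31 to 12, the trap type occupies bits 11 to 4, and the low four bits are zero, so each trap type selects a 16 byte (four instruction) slot in the trap table. The function name ``demo_trap_vector_address`` is hypothetical.

```c
#include <stdint.h>

/* Address the PC is loaded with for a non-reset trap: trap base
 * address in bits 31:12, trap type (tt) in bits 11:4, zero in 3:0. */
static uint32_t demo_trap_vector_address(uint32_t tbr_base, uint8_t tt)
{
  return (tbr_base & 0xfffff000u) | ((uint32_t) tt << 4);
}
```

With a trap table at 0x40000000 (an arbitrary example address), the "memory address not aligned" trap (type 7) would vector to 0x40000070.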
546 | Trap processing on the SPARC has two features which are noticeably different |
---|
547 | than interrupt processing on other architectures. First, the value of the psr |
---|
548 | register in effect immediately before the trap occurred is not explicitly |
---|
549 | saved. Instead only reversible alterations are made to it. Second, the |
---|
550 | Processor Interrupt Level (pil) is not set to correspond to that of the |
---|
551 | interrupt being processed. When a trap occurs, ALL subsequent traps are |
---|
552 | disabled. In order to safely invoke a subroutine during trap handling, traps |
---|
553 | must be enabled to allow for the possibility of register window overflow and |
---|
554 | underflow traps. |
---|
555 | |
---|
556 | If the interrupt handler was installed as an RTEMS interrupt handler, then upon |
---|
557 | receipt of the interrupt, the processor passes control to the RTEMS interrupt |
---|
558 | handler which performs the following actions: |
---|
559 | |
---|
560 | - saves the state of the interrupted task on its stack, |
---|
561 | |
---|
562 | - ensures that a register window is available for subsequent traps, |
---|
563 | |
---|
564 | - if this is the outermost (i.e. non-nested) interrupt, then the RTEMS |
---|
565 | interrupt handler switches from the current stack to the interrupt stack, |
---|
566 | |
---|
567 | - enables traps, |
---|
568 | |
---|
569 | - vectors to the user interrupt service routine (ISR). |
---|
570 | |
---|
571 | Asynchronous interrupts are ignored while traps are disabled. Synchronous |
---|
572 | traps which occur while traps are disabled result in the CPU being forced into |
---|
573 | an error mode. |
---|
574 | |
---|
575 | A nested interrupt is processed similarly with the exception that the current |
---|
576 | stack need not be switched to the interrupt stack. |
---|
577 | |
---|
578 | Traps and Register Windows |
---|
579 | -------------------------- |
---|
580 | |
---|
581 | One of the register windows must be reserved at all times for trap processing. |
---|
582 | This is critical to the proper operation of the trap mechanism in the SPARC |
---|
583 | architecture. It is the responsibility of the trap handler to ensure that |
---|
584 | there is a register window available for a subsequent trap before re-enabling |
---|
585 | traps. It is likely that any high level language routines invoked by the trap |
---|
586 | handler (such as a user-provided RTEMS interrupt handler) will allocate a new |
---|
587 | register window. The save operation could result in a window overflow trap. |
---|
588 | This trap cannot be correctly processed unless (1) traps are enabled and (2) a |
---|
589 | register window is reserved for traps. Thus, the RTEMS interrupt handler |
---|
590 | ensures that a register window is available for subsequent traps before |
---|
591 | enabling traps and invoking the user's interrupt handler. |
---|
592 | |
---|
593 | Interrupt Levels |
---|
594 | ---------------- |
---|
595 | |
---|
596 | Sixteen levels (0-15) of interrupt priorities are supported by the SPARC |
---|
597 | architecture with level fifteen (15) being the highest priority. Level |
---|
598 | zero (0) indicates that interrupts are fully enabled. Interrupt requests for |
---|
599 | interrupts with priorities less than or equal to the current interrupt mask |
---|
600 | level are ignored. Level fifteen (15) is a non-maskable interrupt (NMI), which |
---|
601 | makes it unsuitable for standard usage since it can affect the real-time |
---|
602 | behaviour by interrupting critical sections and spinlocks. Disabling traps |
---|
603 | also stops the NMI from occurring. It can, however, be used for |
---|
604 | power-down or other critical events. |
---|
605 | |
---|
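
The masking rule described above can be written as a small predicate. This is a host-side sketch under the assumption that the psr uses the SPARC V8 layout (ET in bit 5, pil in bits 11:8); the function name ``interrupt_is_taken`` is illustrative and not part of RTEMS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_ET        0x00000020u  /* enable traps, bit 5 */
#define PSR_PIL_MASK  0x00000F00u  /* processor interrupt level, bits 11:8 */
#define PSR_PIL_SHIFT 8

/* Return true if an interrupt request at `level` (1..15) would be taken. */
bool interrupt_is_taken(uint32_t psr, unsigned level)
{
  unsigned pil = (psr & PSR_PIL_MASK) >> PSR_PIL_SHIFT;

  assert(level >= 1u && level <= 15u);

  if ((psr & PSR_ET) == 0) {
    return false;            /* traps disabled: even level 15 is held off */
  }
  /* Level 15 is non-maskable; other levels must exceed the current pil. */
  return level == 15u || level > pil;
}
```

With pil set to 5, for example, a level 5 request is ignored while a level 6 request is taken.
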
606 | Although RTEMS supports 256 interrupt levels, the SPARC only supports sixteen. |
---|
607 | RTEMS interrupt levels 0 through 15 directly correspond to SPARC processor |
---|
608 | interrupt levels. All other RTEMS interrupt levels are undefined and their |
---|
609 | behavior is unpredictable. |
---|
610 | |
---|
611 | Many LEON SPARC v7/v8 systems feature an extended interrupt controller which |
---|
612 | adds an extra step of interrupt decoding to allow handling of interrupts |
---|
613 | 16-31. When such an extended interrupt is generated the CPU traps into a |
---|
614 | specific interrupt trap level 1-14 and software reads out from the interrupt |
---|
615 | controller which extended interrupt source actually caused the interrupt. |
---|
616 | |
---|
617 | Disabling of Interrupts by RTEMS |
---|
618 | -------------------------------- |
---|
619 | |
---|
620 | During the execution of directive calls, critical sections of code may be |
---|
621 | executed. When these sections are encountered, RTEMS disables interrupts to |
---|
622 | level fifteen (15) before the execution of the section and restores them to the |
---|
623 | previous level upon completion of the section. RTEMS has been optimized to |
---|
624 | ensure that interrupts are disabled for less than RTEMS_MAXIMUM_DISABLE_PERIOD |
---|
625 | microseconds on a RTEMS_MAXIMUM_DISABLE_PERIOD_MHZ MHz ERC32 with zero wait |
---|
626 | states. These numbers will vary based on the number of wait states and processor |
---|
627 | speed present on the target board. [NOTE: The maximum period with interrupts |
---|
628 | disabled is hand calculated. This calculation was last performed for Release |
---|
629 | RTEMS_RELEASE_FOR_MAXIMUM_DISABLE_PERIOD.] |
---|
630 | |
---|
631 | [NOTE: It is thought that the length of time at which the processor interrupt |
---|
632 | level is elevated to fifteen by RTEMS is not anywhere near as long as the |
---|
633 | length of time ALL traps are disabled as part of the "flush all register |
---|
634 | windows" operation.] |
---|
635 | |
---|
636 | Non-maskable interrupts (NMI) cannot be disabled, and ISRs which execute at |
---|
637 | this level MUST NEVER issue RTEMS system calls. If a directive is invoked, |
---|
638 | unpredictable results may occur due to the inability of RTEMS to protect its |
---|
639 | critical sections. However, ISRs that make no system calls may safely execute |
---|
640 | as non-maskable interrupts. |
---|
641 | |
---|
642 | Interrupts are disabled or enabled by performing a system call to the Operating |
---|
643 | System reserved software traps 9 (SPARC_SWTRAP_IRQDIS) or 10 |
---|
644 | (SPARC_SWTRAP_IRQEN). The trap is generated by the software trap (Ticc) |
---|
645 | instruction or indirectly by calling the sparc_disable_interrupts() or |
---|
646 | sparc_enable_interrupts() functions. Disabling interrupts returns the previous |
---|
647 | interrupt level (on trap entry) in register G1 and sets PSR.PIL to 15 to |
---|
648 | disable all maskable interrupts. The interrupt level can be restored by |
---|
649 | trapping into the enable interrupt handler with G1 containing the new interrupt |
---|
650 | level. |
---|
651 | |
---|
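
The effect of the two software traps on PSR.PIL can be illustrated with a host-side model. This is a sketch of the behaviour described above, not the RTEMS trap handlers: ``model_irq_disable`` and ``model_irq_enable`` are hypothetical names, and the psr is passed as an ordinary variable instead of being read from the processor.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_PIL_MASK  0x00000F00u  /* processor interrupt level, bits 11:8 */
#define PSR_PIL_SHIFT 8

/* Model of trap 9: raise PSR.PIL to 15 and return the level seen on entry. */
uint32_t model_irq_disable(uint32_t *psr)
{
  uint32_t previous = (*psr & PSR_PIL_MASK) >> PSR_PIL_SHIFT;

  *psr |= PSR_PIL_MASK;      /* PIL = 15 masks levels 1 through 14 */
  return previous;
}

/* Model of trap 10: install the interrupt level supplied by the caller. */
void model_irq_enable(uint32_t *psr, uint32_t level)
{
  assert(level <= 15u);
  *psr = (*psr & ~PSR_PIL_MASK) | (level << PSR_PIL_SHIFT);
}
```

A critical section is bracketed by the pair: the value returned by the disable step is kept and handed back to the enable step, so nested sections restore whatever level was in effect before them.
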
652 | Interrupt Stack |
---|
653 | --------------- |
---|
654 | |
---|
655 | The SPARC architecture does not provide for a dedicated interrupt stack. Thus |
---|
656 | by default, trap handlers would execute on the stack of the RTEMS task which |
---|
657 | they interrupted. This artificially inflates the stack requirements for each |
---|
658 | task since EVERY task stack would have to include enough space to account for |
---|
659 | the worst case interrupt stack requirements in addition to its own worst case |
---|
660 | usage. RTEMS addresses this problem on the SPARC by providing a dedicated |
---|
661 | interrupt stack managed by software. |
---|
662 | |
---|
663 | During system initialization, RTEMS allocates the interrupt stack from the |
---|
664 | Workspace Area. The amount of memory allocated for the interrupt stack is |
---|
665 | determined by the interrupt_stack_size field in the CPU Configuration Table. |
---|
666 | As part of processing a non-nested interrupt, RTEMS will switch to the |
---|
667 | interrupt stack before invoking the installed handler. |
---|
668 | |
---|
669 | Default Fatal Error Processing |
---|
670 | ============================== |
---|
671 | |
---|
672 | Upon detection of a fatal error by either the application or RTEMS the fatal |
---|
673 | error manager is invoked. The fatal error manager will invoke the |
---|
674 | user-supplied fatal error handlers. If no user-supplied handlers are |
---|
675 | configured, the RTEMS provided default fatal error handler is invoked. If the |
---|
676 | user-supplied fatal error handlers return to the executive the default fatal |
---|
677 | error handler is then invoked. This chapter describes the precise operations |
---|
678 | of the default fatal error handler. |
---|
679 | |
---|
680 | Default Fatal Error Handler Operations |
---|
681 | -------------------------------------- |
---|
682 | |
---|
683 | The default fatal error handler is invoked by the fatal_error_occurred |
---|
684 | directive when there is no user handler configured or the user handler returns |
---|
685 | control to RTEMS. |
---|
686 | |
---|
687 | If the BSP has been configured with ``BSP_POWER_DOWN_AT_FATAL_HALT`` set to |
---|
688 | true, the default handler will disable interrupts and enter power down mode. If |
---|
689 | power down mode is not available, it goes into an infinite loop to simulate a |
---|
690 | halt processor instruction. |
---|
691 | |
---|
692 | If ``BSP_POWER_DOWN_AT_FATAL_HALT`` is set to false, the default handler will |
---|
693 | place the value ``1`` in register ``g1``, the error source in register ``g2``, |
---|
694 | and the error code in register ``g3``. It will then generate a system error |
---|
695 | which will hand over control to the debugger, simulator, etc. |
---|
696 | |
---|
697 | Symmetric Multiprocessing |
---|
698 | ========================= |
---|
699 | |
---|
700 | SMP is supported. Available platforms are the Cobham Gaisler GR712RC and |
---|
701 | GR740. |
---|
702 | |
---|
703 | Thread-Local Storage |
---|
704 | ==================== |
---|
705 | |
---|
706 | Thread-local storage is supported. |
---|
707 | |
---|
708 | Board Support Packages |
---|
709 | ====================== |
---|
710 | |
---|
711 | An RTEMS Board Support Package (BSP) must be designed to support a particular |
---|
712 | processor and target board combination. This chapter presents a discussion of |
---|
713 | SPARC specific BSP issues. For more information on developing a BSP, refer to |
---|
714 | the chapter titled Board Support Packages in the RTEMS Applications User's |
---|
715 | Guide. |
---|
716 | |
---|
717 | System Reset |
---|
718 | ------------ |
---|
719 | |
---|
720 | An RTEMS based application is initiated or re-initiated when the SPARC |
---|
721 | processor is reset. When the SPARC is reset, the processor performs the |
---|
722 | following actions: |
---|
723 | |
---|
724 | - the enable trap (ET) of the psr is set to 0 to disable traps, |
---|
725 | |
---|
726 | - the supervisor bit (S) of the psr is set to 1 to enter supervisor mode, and |
---|
727 | |
---|
728 | - the PC is set to 0 and the nPC is set to 4. |
---|
729 | |
---|
730 | The processor then begins to execute the code at location 0. It is important |
---|
731 | to note that the remaining psr fields are not explicitly set by the above steps |
---|
732 | and all other registers retain their value from the previous execution mode. |
---|
733 | This is true even of the Trap Base Register (TBR) whose contents reflect the |
---|
734 | last trap which occurred before the reset. |
---|
735 | |
---|
736 | Processor Initialization |
---|
737 | ------------------------ |
---|
738 | |
---|
739 | It is the responsibility of the application's initialization code to initialize |
---|
740 | the TBR and install trap handlers for at least the register window overflow and |
---|
741 | register window underflow conditions. Traps should be enabled before invoking |
---|
742 | any subroutines to allow for register window management. However, interrupts |
---|
743 | should be disabled by setting the Processor Interrupt Level (pil) field of the |
---|
744 | psr to 15. RTEMS installs its own Trap Table as part of initialization which |
---|
745 | is initialized with the contents of the Trap Table in place when the |
---|
746 | ``rtems_initialize_executive`` directive was invoked. Upon completion of |
---|
747 | executive initialization, interrupts are enabled. |
---|
748 | |
---|
749 | If this SPARC implementation supports on-chip caching and this is to be |
---|
750 | utilized, then it should be enabled during the reset application initialization |
---|
751 | code. |
---|
752 | |
---|
753 | In addition to the requirements described in the Board Support Packages chapter |
---|
754 | of the C Applications Users Manual for the reset code which is executed before |
---|
755 | the call to ``rtems_initialize_executive``, the SPARC version has the following |
---|
756 | specific requirements: |
---|
757 | |
---|
758 | - Must leave the S bit of the status register set so that the SPARC remains in |
---|
759 | the supervisor state. |
---|
760 | |
---|
761 | - Must set stack pointer (sp) such that a minimum stack size of |
---|
762 | MINIMUM_STACK_SIZE bytes is provided for the ``rtems_initialize_executive`` |
---|
763 | directive. |
---|
764 | |
---|
765 | - Must disable all external interrupts (i.e. set the pil to 15). |
---|
766 | |
---|
767 | - Must enable traps so window overflow and underflow conditions can be properly |
---|
768 | handled. |
---|
769 | |
---|
770 | - Must initialize the SPARC's initial trap table with at least trap handlers |
---|
771 | for register window overflow and register window underflow. |
---|
772 | |
---|
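
The psr-related requirements in the list above (S set, pil at 15, ET set) can be checked with a few bit tests. The following is a minimal host-side sketch assuming the SPARC V8 psr layout; the function names are illustrative and are not RTEMS interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_ET        0x00000020u  /* enable traps, bit 5 */
#define PSR_S         0x00000080u  /* supervisor, bit 7 */
#define PSR_PIL_MASK  0x00000F00u  /* processor interrupt level, bits 11:8 */

/* True if psr satisfies the S, pil, and ET requirements listed above. */
bool psr_ready_for_rtems_init(uint32_t psr)
{
  bool supervisor  = (psr & PSR_S) != 0;
  bool irqs_masked = (psr & PSR_PIL_MASK) == PSR_PIL_MASK;  /* pil == 15 */
  bool traps_on    = (psr & PSR_ET) != 0;

  return supervisor && irqs_masked && traps_on;
}

/* Return psr adjusted so that it meets those three requirements. */
uint32_t psr_for_rtems_init(uint32_t psr)
{
  psr |= PSR_S | PSR_PIL_MASK | PSR_ET;
  assert(psr_ready_for_rtems_init(psr));
  return psr;
}
```

Traps must only be enabled once the window overflow and underflow handlers are installed, so on real hardware the ET bit would be set last.
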
773 | .................................... |
---|
774 | .... |
---|
775 | |
---|
776 | Understanding stacks and registers in the SPARC architecture(s) |
---|
777 | =============================================================== |
---|
778 | |
---|
779 | The content in this section originally appeared at |
---|
780 | https://www.sics.se/~psm/sparcstack.html. It appears here with the |
---|
781 | gracious permission of the author Peter Magnusson. |
---|
782 | |
---|
783 | |
---|
784 | The SPARC architecture from Sun Microsystems has some "interesting" |
---|
785 | characteristics. After having to deal with both compiler, interpreter, OS |
---|
786 | emulator, and OS porting issues for the SPARC, I decided to gather notes |
---|
787 | and documentation in one place. If there are any issues you don't find |
---|
788 | addressed by this page, or if you know of any similar Net resources, let |
---|
789 | me know. This document is limited to the V8 version of the architecture. |
---|
790 | |
---|
791 | General Structure |
---|
792 | ----------------- |
---|
793 | |
---|
794 | SPARC has 32 general purpose integer registers visible to the program |
---|
795 | at any given time. Of these, 8 registers are global registers and 24 |
---|
796 | registers are in a register window. A window consists of three groups |
---|
797 | of 8 registers, the out, local, and in registers. See table 1. A SPARC |
---|
798 | implementation can have from 2 to 32 windows, thus varying the number |
---|
799 | of registers from 40 to 520. Most implementations have 7 or 8 windows. The |
---|
800 | variable number of registers is the principal reason for the SPARC being |
---|
801 | "scalable". |
---|
802 | |
---|
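
The range of 40 to 520 registers follows from the overlap between windows: each window shows 24 registers, but 8 of them are shared with a neighbor, so every window contributes 16 registers of its own on top of the 8 globals. A small sketch of the arithmetic (the function name is illustrative):

```c
#include <assert.h>

/* Total integer registers: 8 globals plus 16 unique registers per window. */
unsigned sparc_total_registers(unsigned nwindows)
{
  assert(nwindows >= 2u && nwindows <= 32u);
  return 8u + 16u * nwindows;
}
```

An implementation with 8 windows therefore has 136 integer registers.
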
803 | At any given time, only one window is visible, as determined by the |
---|
804 | current window pointer (CWP) which is part of the processor status |
---|
805 | register (PSR). This is a five bit value that can be decremented or |
---|
806 | incremented by the SAVE and RESTORE instructions, respectively. These |
---|
807 | instructions are generally executed on procedure call and return |
---|
808 | (respectively). The idea is that the in registers contain incoming |
---|
809 | parameters, the local registers constitute scratch registers, the out |
---|
810 | registers contain outgoing parameters, and the global registers contain |
---|
811 | values that vary little between executions. The register windows overlap |
---|
812 | partially, thus the out registers become renamed by SAVE to become the in |
---|
813 | registers of the called procedure. Thus, the memory traffic is reduced |
---|
814 | when going up and down the procedure call. Since this is a frequent |
---|
815 | operation, performance is improved. |
---|
816 | |
---|
817 | (That was the idea, anyway. The drawback is that upon interactions |
---|
818 | with the system the registers need to be flushed to the stack, |
---|
819 | necessitating a long sequence of writes to memory of data that is |
---|
820 | often mostly garbage. Register windows was a bad idea that was caused |
---|
821 | by simulation studies that considered only programs in isolation, as |
---|
822 | opposed to multitasking workloads, and by considering compilers with |
---|
823 | poor optimization. It also caused considerable problems in implementing |
---|
824 | high-end SPARC processors such as the SuperSPARC, although more recent |
---|
825 | implementations have dealt effectively with the obstacles. Register |
---|
826 | windows is now part of the compatibility legacy and not easily removed |
---|
827 | from the architecture.) |
---|
828 | |
---|
829 | ================ ======== ================ |
---|
830 | Register Group Mnemonic Register Address |
---|
831 | ================ ======== ================ |
---|
832 | global %g0-%g7 r[0]-r[7] |
---|
833 | out %o0-%o7 r[8]-r[15] |
---|
834 | local %l0-%l7 r[16]-r[23] |
---|
835 | in %i0-%i7 r[24]-r[31] |
---|
836 | ================ ======== ================ |
---|
837 | |
---|
838 | .. Table 1 - Visible Registers |
---|
839 | |
---|
840 | The overlap of the registers is illustrated in figure 1. The figure |
---|
841 | shows an implementation with 8 windows, numbered 0 to 7 (labeled w0 to |
---|
842 | w7 in the figure). Each window corresponds to 24 registers, 16 of which |
---|
843 | are shared with "neighboring" windows. The windows are arranged in a |
---|
844 | wrap-around manner, thus window number 0 borders window number 7. The |
---|
845 | common cause of changing the current window, as pointed to by CWP, is |
---|
846 | the RESTORE and SAVE instructions, shown in the middle. Less common is |
---|
847 | the supervisor RETT instruction (return from trap) and the trap event |
---|
848 | (interrupt, exception, or TRAP instruction). |
---|
849 | |
---|
850 | |
---|
851 | .. image:: sparcwin.gif |
---|
852 | |
---|
853 | Figure 1 - Windowed Registers |
---|
854 | |
---|
855 | The "WIM" register is also indicated in the top left of figure 1. The |
---|
856 | window invalid mask is a bit map of invalid windows. It is generally used |
---|
857 | as a pointer, i.e. exactly one bit is set in the WIM register indicating |
---|
858 | which window is invalid (in the figure it's window 7). Register windows |
---|
859 | are generally used to support procedure calls, so they can be viewed |
---|
860 | as a cache of the stack contents. The WIM "pointer" indicates how |
---|
861 | many procedure calls in a row can be taken without writing out data to |
---|
862 | memory. In the figure, the capacity of the register windows is fully |
---|
863 | utilized. An additional call will thus exceed capacity, triggering a |
---|
864 | window overflow trap. At the other end, a window underflow trap occurs |
---|
865 | when the register window "cache" is empty and more data needs to be |
---|
866 | fetched from memory. |
---|
867 | |
---|
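
The interplay of CWP and WIM can be sketched in C. This is a host-side model, assuming the 8-window implementation of figure 1; the function names are illustrative. SAVE moves to window CWP - 1 and RESTORE to window CWP + 1, both modulo the number of windows, and the corresponding trap is raised when the target window's bit is set in the WIM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NWINDOWS 8u   /* assumed window count, as in figure 1 */

/* Window that SAVE (or a trap) would switch to. */
unsigned cwp_after_save(unsigned cwp)
{
  return (cwp + NWINDOWS - 1u) % NWINDOWS;
}

/* Window that RESTORE (or RETT) would switch to. */
unsigned cwp_after_restore(unsigned cwp)
{
  return (cwp + 1u) % NWINDOWS;
}

/* True if SAVE from window cwp would raise a window overflow trap. */
bool save_overflows(unsigned cwp, uint32_t wim)
{
  assert(cwp < NWINDOWS);
  return ((wim >> cwp_after_save(cwp)) & 1u) != 0u;
}

/* True if RESTORE from window cwp would raise a window underflow trap. */
bool restore_underflows(unsigned cwp, uint32_t wim)
{
  assert(cwp < NWINDOWS);
  return ((wim >> cwp_after_restore(cwp)) & 1u) != 0u;
}
```

With window 7 marked invalid, a SAVE executed in window 0 overflows and a RESTORE executed in window 6 underflows.
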
868 | Register Semantics |
---|
869 | ------------------ |
---|
870 | |
---|
871 | The SPARC Architecture includes recommended software semantics. These are |
---|
872 | described in the architecture manual, the SPARC ABI (application binary |
---|
873 | interface) standard, and, unfortunately, in various other locations as |
---|
874 | well (including header files and compiler documentation). |
---|
875 | |
---|
876 | Figure 2 shows a summary of register contents at any given time. |
---|
877 | |
---|
878 | .. code-block:: asm |
---|
879 | |
---|
880 | %g0 (r00) always zero |
---|
881 | %g1 (r01) [1] temporary value |
---|
882 | %g2 (r02) [2] global 2 |
---|
883 | global %g3 (r03) [2] global 3 |
---|
884 | %g4 (r04) [2] global 4 |
---|
885 | %g5 (r05) reserved for SPARC ABI |
---|
886 | %g6 (r06) reserved for SPARC ABI |
---|
887 | %g7 (r07) reserved for SPARC ABI |
---|
888 | |
---|
889 | %o0 (r08) [1] outgoing parameter 0 / return value from callee |
---|
890 | %o1 (r09) [1] outgoing parameter 1 |
---|
891 | %o2 (r10) [1] outgoing parameter 2 |
---|
892 | out %o3 (r11) [1] outgoing parameter 3 |
---|
893 | %o4 (r12) [1] outgoing parameter 4 |
---|
894 | %o5 (r13) [1] outgoing parameter 5 |
---|
895 | %sp, %o6 (r14) [1] stack pointer |
---|
896 | %o7 (r15) [1] temporary value / address of CALL instruction |
---|
897 | |
---|
898 | %l0 (r16) [3] local 0 |
---|
899 | %l1 (r17) [3] local 1 |
---|
900 | %l2 (r18) [3] local 2 |
---|
901 | local %l3 (r19) [3] local 3 |
---|
902 | %l4 (r20) [3] local 4 |
---|
903 | %l5 (r21) [3] local 5 |
---|
904 | %l6 (r22) [3] local 6 |
---|
905 | %l7 (r23) [3] local 7 |
---|
906 | |
---|
907 | %i0 (r24) [3] incoming parameter 0 / return value to caller |
---|
908 | %i1 (r25) [3] incoming parameter 1 |
---|
909 | %i2 (r26) [3] incoming parameter 2 |
---|
910 | in %i3 (r27) [3] incoming parameter 3 |
---|
911 | %i4 (r28) [3] incoming parameter 4 |
---|
912 | %i5 (r29) [3] incoming parameter 5 |
---|
913 | %fp, %i6 (r30) [3] frame pointer |
---|
914 | %i7 (r31) [3] return address - 8 |
---|
915 | |
---|
916 | Notes: |
---|
917 | |
---|
918 | [1] assumed by caller to be destroyed (volatile) across a procedure call |
---|
919 | |
---|
920 | [2] should not be used by SPARC ABI library code |
---|
921 | |
---|
922 | [3] assumed by caller to be preserved across a procedure call |
---|
923 | |
---|
924 | .. Above was Figure 2 - SPARC register semantics |
---|
925 | |
---|
926 | Particular compilers are likely to vary slightly. |
---|
927 | |
---|
928 | Note that globals %g2-%g4 are reserved for the "application", which |
---|
929 | includes libraries and compiler. Thus, for example, libraries may |
---|
930 | overwrite these registers unless they've been compiled with suitable |
---|
931 | flags. Also, the "reserved" registers are presumed to be allocated |
---|
932 | (in the future) bottom-up, i.e. %g7 is currently the "safest" to use. |
---|
933 | |
---|
934 | Optimizing linkers and interpreters are examples that use global registers. |
---|
935 | |
---|
936 | Register Windows and the Stack |
---|
937 | ------------------------------ |
---|
938 | |
---|
939 | The SPARC register windows are, naturally, intimately related to the |
---|
940 | stack. In particular, the stack pointer (%sp or %o6) must always point |
---|
941 | to a free block of 64 bytes. This area is used by the operating system |
---|
942 | (Solaris, SunOS, and Linux at least) to save the current local and in |
---|
943 | registers upon a system interrupt, exception, or trap instruction. (Note |
---|
944 | that this can occur at any time.) |
---|
945 | |
---|
946 | Other aspects of register relations with memory are programming |
---|
947 | convention. The typical, and recommended, layout of the stack is shown |
---|
948 | in figure 3. The figure shows a stack frame. |
---|
949 | |
---|
950 | .. code-block:: asm |
---|
951 | low addresses |
---|
952 | +-------------------------+ |
---|
953 | %sp --> | 16 words for storing | |
---|
954 | | LOCAL and IN registers | |
---|
955 | +-------------------------+ |
---|
956 | | one-word pointer to | |
---|
957 | | aggregate return value | |
---|
958 | +-------------------------+ |
---|
959 | | 6 words for callee | |
---|
960 | | to store register | |
---|
961 | | arguments | |
---|
962 | +-------------------------+ |
---|
963 | | outgoing parameters | |
---|
964 | | past the 6th, if any | |
---|
965 | +-------------------------+ |
---|
966 | | space, if needed, for | |
---|
967 | | compiler temporaries | |
---|
968 | | and saved floating- | |
---|
969 | | point registers | |
---|
970 | +-------------------------+ |
---|
971 | ................. |
---|
972 | +-------------------------+ |
---|
973 | | space dynamically | |
---|
974 | | allocated via the | |
---|
975 | | alloca() library call | |
---|
976 | +-------------------------+ |
---|
977 | | space, if needed, for | |
---|
978 | | automatic arrays, | |
---|
979 | | aggregates, and | |
---|
980 | | addressable scalar | |
---|
981 | | automatics | |
---|
982 | +-------------------------+ |
---|
983 | %fp --> |
---|
984 | high addresses |
---|
985 | |
---|
986 | .. Figure 3 - Stack frame contents |
---|
987 | |
---|
988 | Note that the top boxes of figure 3 are addressed via the stack pointer |
---|
989 | (%sp), as positive offsets (including zero), and the bottom boxes are |
---|
990 | accessed over the frame pointer using negative offsets (excluding zero), |
---|
991 | and that the frame pointer is the old stack pointer. This scheme allows |
---|
992 | the separation of information known at compile time (number and size |
---|
993 | of local parameters, etc) from run-time information (size of blocks |
---|
994 | allocated by alloca()). |
---|
995 | |
---|
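
The fixed part of the frame in figure 3 can be turned into a size calculation. The following is a sketch, assuming 4-byte words and a stack pointer kept aligned to 8 bytes; the macro and function names are illustrative. The three fixed areas add up to 23 words, or 92 bytes, which rounds up to 96.

```c
#include <assert.h>

#define WINDOW_SAVE_WORDS 16u  /* LOCAL and IN registers */
#define STRUCT_RET_WORDS   1u  /* pointer to aggregate return value */
#define ARG_DUMP_WORDS     6u  /* callee's copy of register arguments */
#define WORD_BYTES         4u
#define STACK_ALIGN        8u  /* assumed doubleword alignment of %sp */

/* Bytes to reserve with SAVE for a frame needing extra_bytes beyond the
 * fixed areas at the top of figure 3. */
unsigned sparc_frame_size(unsigned extra_bytes)
{
  unsigned fixed = (WINDOW_SAVE_WORDS + STRUCT_RET_WORDS + ARG_DUMP_WORDS)
                   * WORD_BYTES;                     /* 92 bytes */
  unsigned total = fixed + extra_bytes;

  assert(total >= fixed);                            /* no wrap-around */
  return (total + STACK_ALIGN - 1u) & ~(STACK_ALIGN - 1u);
}
```

A result of 96 bytes (0x60) for a frame without locals agrees with the distance between the stack pointer (0xdffff9a0) and the frame pointer (0xdffffa00) in the sample stack listing of figure 4.
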
996 | "addressable scalar automatics" is a fancy name for local variables. |
---|
997 | |
---|
998 | The clever nature of the stack and frame pointers is that they are always |
---|
999 | 16 registers apart in the register windows. Thus, a SAVE instruction will |
---|
1000 | make the current stack pointer into the frame pointer and, since the SAVE |
---|
1001 | instruction also doubles as an ADD, create a new stack pointer. Figure 4 |
---|
1002 | illustrates what the top of a stack might look like during execution. (The |
---|
1003 | listing is from the "pwin" command in the SimICS simulator.) |
---|
1004 | |
---|
1005 | .. code-block:: asm |
---|
1006 | |
---|
1007 | REGISTER WINDOWS |
---|
1008 | +--+---+----------+ |
---|
1009 | |g0|r00|0x00000000| global |
---|
1010 | |g1|r01|0x00000006| registers |
---|
1011 | |g2|r02|0x00091278| |
---|
1012 | g0-g7 |g3|r03|0x0008ebd0| |
---|
1013 | |g4|r04|0x00000000| (note: 'save' and 'trap' decrements CWP, |
---|
1014 | |g5|r05|0x00000000| i.e. moves it up on this diagram. 'restore' |
---|
1015 | |g6|r06|0x00000000| and 'rett' increments CWP, i.e. down) |
---|
1016 | |g7|r07|0x00000000| |
---|
1017 | +--+---+----------+ |
---|
1018 | CWP (2) |o0|r08|0x00000002| |
---|
1019 | |o1|r09|0x00000000| MEMORY |
---|
1020 | |o2|r10|0x00000001| |
---|
1021 | o0-o7 |o3|r11|0x00000001| stack growth |
---|
1022 | |o4|r12|0x000943d0| |
---|
1023 | |o5|r13|0x0008b400| ^ |
---|
1024 | |sp|r14|0xdffff9a0| ----\ /|\ |
---|
1025 | |o7|r15|0x00062abc| | | addresses |
---|
1026 | +--+---+----------+ | +--+----------+ virtual physical |
---|
1027 | |l0|r16|0x00087c00| \---> |l0|0x00000000| 0xdffff9a0 0x000039a0 top of frame 0 |
---|
1028 | |l1|r17|0x00027fd4| |l1|0x00000000| 0xdffff9a4 0x000039a4 |
---|
1029 | |l2|r18|0x00000000| |l2|0x0009df80| 0xdffff9a8 0x000039a8 |
---|
1030 | l0-l7 |l3|r19|0x00000000| |l3|0x00097660| 0xdffff9ac 0x000039ac |
---|
1031 | |l4|r20|0x00000000| |l4|0x00000014| 0xdffff9b0 0x000039b0 |
---|
1032 | |l5|r21|0x00097678| |l5|0x00000001| 0xdffff9b4 0x000039b4 |
---|
1033 | |l6|r22|0x0008b400| |l6|0x00000004| 0xdffff9b8 0x000039b8 |
---|
1034 | |l7|r23|0x0008b800| |l7|0x0008dd60| 0xdffff9bc 0x000039bc |
---|
1035 | +--+--+---+----------+ +--+----------+ |
---|
1036 | CWP+1 (3) |o0|i0|r24|0x00000002| |i0|0x00091048| 0xdffff9c0 0x000039c0 |
---|
1037 | |o1|i1|r25|0x00000000| |i1|0x00000011| 0xdffff9c4 0x000039c4 |
---|
1038 | |o2|i2|r26|0x0008b7c0| |i2|0x00091158| 0xdffff9c8 0x000039c8 |
---|
1039 | i0-i7 |o3|i3|r27|0x00000019| |i3|0x0008d370| 0xdffff9cc 0x000039cc |
---|
1040 | |o4|i4|r28|0x0000006c| |i4|0x0008eac4| 0xdffff9d0 0x000039d0 |
---|
1041 | |o5|i5|r29|0x00000000| |i5|0x00000000| 0xdffff9d4 0x000039d4 |
---|
1042 | |o6|fp|r30|0xdffffa00| ----\ |fp|0x00097660| 0xdffff9d8 0x000039d8 |
---|
1043 | |o7|i7|r31|0x00040468| | |i7|0x00000000| 0xdffff9dc 0x000039dc |
---|
1044 | +--+--+---+----------+ | +--+----------+ |
---|
1045 | | |0x00000001| 0xdffff9e0 0x000039e0 parameters |
---|
1046 | | |0x00000002| 0xdffff9e4 0x000039e4 |
---|
1047 | | |0x00000040| 0xdffff9e8 0x000039e8 |
---|
1048 | | |0x00097671| 0xdffff9ec 0x000039ec |
---|
1049 | | |0xdffffa68| 0xdffff9f0 0x000039f0 |
---|
1050 | | |0x00024078| 0xdffff9f4 0x000039f4 |
---|
1051 | | |0x00000004| 0xdffff9f8 0x000039f8 |
---|
1052 | | |0x0008dd60| 0xdffff9fc 0x000039fc |
---|
1053 | +--+------+----------+ | +--+----------+ |
---|
1054 | |l0| |0x00087c00| \---> |l0|0x00091048| 0xdffffa00 0x00003a00 top of frame 1 |
---|
1055 | |l1| |0x000c8d48| |l1|0x0000000b| 0xdffffa04 0x00003a04 |
---|
1056 | |l2| |0x000007ff| |l2|0x00091158| 0xdffffa08 0x00003a08 |
---|
1057 | |l3| |0x00000400| |l3|0x000c6f10| 0xdffffa0c 0x00003a0c |
---|
1058 | |l4| |0x00000000| |l4|0x0008eac4| 0xdffffa10 0x00003a10 |
---|
1059 | |l5| |0x00088000| |l5|0x00000000| 0xdffffa14 0x00003a14 |
---|
1060 | |l6| |0x0008d5e0| |l6|0x000c6f10| 0xdffffa18 0x00003a18 |
---|
1061 | |l7| |0x00088000| |l7|0x0008cd00| 0xdffffa1c 0x00003a1c |
---|
1062 | +--+--+---+----------+ +--+----------+ |
---|
1063 | CWP+2 (4) |i0|o0| |0x00000002| |i0|0x0008cb00| 0xdffffa20 0x00003a20 |
---|
1064 | |i1|o1| |0x00000011| |i1|0x00000003| 0xdffffa24 0x00003a24 |
---|
1065 | |i2|o2| |0xffffffff| |i2|0x00000040| 0xdffffa28 0x00003a28 |
---|
1066 | |i3|o3| |0x00000000| |i3|0x0009766b| 0xdffffa2c 0x00003a2c |
---|
1067 | |i4|o4| |0x00000000| |i4|0xdffffa68| 0xdffffa30 0x00003a30 |
---|
1068 | |i5|o5| |0x00064c00| |i5|0x000253d8| 0xdffffa34 0x00003a34 |
---|
1069 | |i6|o6| |0xdffffa70| ----\ |i6|0xffffffff| 0xdffffa38 0x00003a38 |
---|
1070 | |i7|o7| |0x000340e8| | |i7|0x00000000| 0xdffffa3c 0x00003a3c |
---|
1071 | +--+--+---+----------+ | +--+----------+ |
---|
1072 | | |0x00000001| 0xdffffa40 0x00003a40 parameters |
---|
1073 | | |0x00000000| 0xdffffa44 0x00003a44 |
---|
1074 | | |0x00000000| 0xdffffa48 0x00003a48 |
---|
1075 | | |0x00000000| 0xdffffa4c 0x00003a4c |
---|
1076 | | |0x00000000| 0xdffffa50 0x00003a50 |
---|
1077 | | |0x00000000| 0xdffffa54 0x00003a54 |
---|
1078 | | |0x00000002| 0xdffffa58 0x00003a58 |
---|
1079 | | |0x00000002| 0xdffffa5c 0x00003a5c |
---|
1080 | | | . | |
---|
1081 | | | . | .. etc (another 16 bytes) |
---|
1082 | | | . | |
---|
1083 | |
---|
1084 | .. Figure 4 - Sample stack contents |
---|
1085 | |
---|
1086 | Note how the stack contents are not necessarily synchronized with the |
---|
1087 | registers. Various events can cause the register windows to be "flushed" |
---|
1088 | to memory, including most system calls. A programmer can force this |
---|
1089 | update by using the ST_FLUSH_WINDOWS trap, which also reduces the number of |
---|
1090 | valid windows to the minimum of 1. |
---|
1091 | |
---|
1092 | Writing a library for multithreaded execution is an example that requires |
---|
1093 | explicit flushing, as is longjmp(). |
---|
1094 | |
---|
1095 | Procedure epilogue and prologue |
---|
1096 | ------------------------------- |
---|
1097 | |
---|
1098 | The stack frame described in the previous section leads to the standard |
---|
1099 | entry/exit mechanisms listed in figure 5. |
---|
1100 | |
---|
1101 | .. code-block:: asm |
---|
1102 | |
---|
1103 | function: |
---|
1104 | save %sp, -C, %sp |
---|
1105 | |
---|
1106 | ; perform function, leave return value, |
---|
1107 | ; if any, in register %i0 upon exit |
---|
1108 | |
---|
1109 | ret ; jmpl %i7+8, %g0 |
---|
1110 | restore ; restore %g0,%g0,%g0 |
---|
1111 | |
---|
1112 | .. Figure 5 - Epilogue/prologue in procedures |
---|
1113 | The SAVE instruction decrements the CWP, as discussed earlier, and also |
---|
1114 | performs an addition. The constant "C" that is used in the figure to |
---|
1115 | indicate the amount of space to make on the stack, and thus corresponds |
---|
1116 | to the frame contents in Figure 3. The minimum is therefore the 16 words |
---|
1117 | for the LOCAL and IN registers, i.e. (hex) 0x40 bytes. |
---|
1118 | |
---|
A confusing element of the SAVE instruction is that the source operands
(the first two parameters) are read from the old register window, and
the destination operand (the rightmost parameter) is written to the new
window. Thus, although "%sp" is indicated as both source and destination,
the result is actually written into the stack pointer of the new window
(the source stack pointer is renamed and becomes the frame pointer).

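The renaming can be made concrete with a small software model of the
register file. The following C sketch is only an illustration under
simplifying assumptions: eight windows, no WIM checking, and invented names
(struct regfile, outs(), ins(), save_sp()). It shows that the %sp written by
SAVE lands in the new window, while the old %sp becomes visible there as
%fp (%i6).

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 8   /* assumed number of register windows */

/* Simplified model: 16 physical registers per window (8 OUT + 8 LOCAL). */
struct regfile {
    uint32_t phys[NWINDOWS * 16];
    unsigned cwp;    /* current window pointer */
};

/* OUT registers of window w; they double as the IN registers of window w-1. */
uint32_t *outs(struct regfile *rf, unsigned w)
{
    return &rf->phys[(w % NWINDOWS) * 16];
}

/* IN registers of window w are the OUT registers of window w+1. */
uint32_t *ins(struct regfile *rf, unsigned w)
{
    return outs(rf, w + 1);
}

/* Model of "save %sp, -C, %sp" (window overflow checking omitted). */
void save_sp(struct regfile *rf, uint32_t c)
{
    assert(rf->cwp < NWINDOWS);

    uint32_t old_sp = outs(rf, rf->cwp)[6];          /* read %sp (%o6) in the old window */
    rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS;   /* SAVE decrements the CWP */
    outs(rf, rf->cwp)[6] = old_sp - c;               /* write %sp in the new window */
}
```

After save_sp(), ins(rf, rf->cwp)[6] still holds the caller's stack pointer,
which is exactly the frame pointer of the new procedure.
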
The return instructions are also a bit particular. ret is a synthetic
instruction, corresponding to jmpl (jump and link). This instruction
jumps to the address resulting from adding 8 to the %i7 register, which
skips the call instruction and its delay slot. The source instruction
address (the address of the ret instruction itself) is written to the
%g0 register, i.e. it is discarded.

The restore instruction is similarly a synthetic instruction, and is
just a short form for a restore that chooses not to perform an addition.

The calling instruction, in turn, typically looks as follows:

.. code-block:: asm

    call    <function>      ! jmpl <address>, %o7
    mov     0, %o0

Again, the call instruction can be synthetic: when the target is given as
a register-indirect address, the assembler emits the same jmpl instruction
that performs the return (a call to a label uses the dedicated CALL
instruction instead). This time, however, the return address is saved,
in register %o7. Note that the delay slot is often filled with an
instruction related to the parameters; in this example it sets the first
parameter to zero. Note also that the return value is generally passed
back in %o0 as well, since the callee's %i0 is the caller's %o0.

Leaf procedures are different. A leaf procedure is an optimization that
reduces unnecessary work by taking advantage of the fact that many
procedures contain no call instructions. Thus, the save/restore pair
can be eliminated. The downside is that such a procedure may only use
the out registers (since the in and local registers actually belong to
the caller). See Figure 6.

.. code-block:: asm

    function:
        ! no save instruction needed upon entry

        ! perform function, leave return value,
        ! if any, in register %o0 upon exit

        retl                ! jmpl %o7+8, %g0
        nop                 ! the delay slot can be used for something else

.. Figure 6 - Epilogue/prologue in leaf procedures

Note in the figure that there is only one instruction of overhead, namely
the retl instruction. retl (return from leaf subroutine) is also synthetic
and is again a variant of the jmpl instruction, this time with %o7+8 as
the target.

Yet another variation of the epilogue is caused by tail call elimination,
an optimization supported by some compilers (including Sun's C compiler
but not GCC). If the compiler detects that a call is the last action of
the calling function, so that the called function's result is returned
unchanged, it can let the called function take over the caller's place on
the stack. Figure 7 contains an example.

.. code-block:: asm

    /*
     * int
     * foo(int n)
     * {
     *   if (n == 0)
     *     return 0;
     *   else
     *     return bar(n);
     * }
     */
            cmp     %o0,0
            bne     .L1
            or      %g0,%o7,%g1     ! delay slot: save return address
            retl
            or      %g0,0,%o0       ! delay slot: return 0
    .L1:    call    bar
            or      %g0,%g1,%o7     ! delay slot: restore return address

.. Figure 7 - Example of tail call elimination

Note that the call instruction overwrites register %o7 with the program
counter. Therefore the above code saves the old value of %o7, and restores
it in the delay slot of the call instruction. If the function call is
register indirect, this twiddling with %o7 can be avoided, but of course
that form of call is slower on modern processors.

The benefit of tail call elimination is to remove an indirection upon
return. It also reduces register window usage, since otherwise
the foo() function in Figure 7 would need to allocate a stack frame to
save the return address.

A special form of tail call elimination is tail recursion elimination,
which detects functions calling themselves, and replaces the call with a
simple branch. Figure 8 contains an example.

.. code-block:: asm

    /*
     * int
     * foo(int n)
     * {
     *   if (n == 0)
     *     return 1;
     *   else
     *     return (foo(n - 1));
     * }
     */
            cmp     %o0,0
            be      .L1
            or      %g0,%o0,%g1     ! delay slot: copy n to %g1
            subcc   %g1,1,%g1
    .L2:    bne     .L2
            subcc   %g1,1,%g1       ! delay slot: count down
    .L1:    retl
            or      %g0,1,%o0       ! delay slot: return 1

.. comment Figure 8 - Example of tail recursion elimination

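The effect of tail recursion elimination can also be expressed at the C
level. In the sketch below, foo_recursive() is the function from Figure 8
and foo_loop() is a hand-written equivalent of what the optimized assembly
does; both names are invented for this comparison, and n is assumed to be
non-negative.

```c
#include <assert.h>

/* Recursive form, as in Figure 8. */
int foo_recursive(int n)
{
    assert(n >= 0);
    if (n == 0)
        return 1;
    else
        return foo_recursive(n - 1);
}

/*
 * Roughly what tail recursion elimination produces: the self-call
 * becomes a branch back to the top, so no new frame or register
 * window is needed per iteration.
 */
int foo_loop(int n)
{
    assert(n >= 0);
    while (n != 0)
        n = n - 1;
    return 1;
}
```

Both functions return the same value for every non-negative n; only the
stack and register window usage differs.
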
Needless to say, these optimizations produce code that is difficult to debug.

Procedures, stacks, and debuggers
---------------------------------

When debugging an application, your debugger will be parsing the binary
and consulting the symbol table to determine procedure entry points. It
will also walk the stack frames "upward" to determine the current
call chain.

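Walking the frames "upward" amounts to following the saved frame pointers.
The C sketch below does this over a simulated stack, assuming the register
windows have already been flushed to memory, that each frame starts with the
16-word window save area (so %i6 is at offset 56 and %i7 at offset 60 from
%sp), and that a frame pointer of zero marks the outermost frame. The
function name backtrace() and the mem array standing in for target memory
are invented for the example; real debuggers also have to cope with leaf
procedures, which this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FP_OFFSET 56u   /* %i6 (frame pointer) slot in the window save area */
#define I7_OFFSET 60u   /* %i7 (address of the call instruction) slot */

/* Read one 32-bit word from the simulated stack memory. */
static uint32_t read_word(const uint32_t *mem, uint32_t addr)
{
    return mem[addr / 4u];
}

/*
 * Follow the chain of saved frame pointers starting at stack pointer sp.
 * Stores the saved %i7 of each frame in calls[] and returns the number
 * of frames visited (at most max).
 */
size_t backtrace(const uint32_t *mem, uint32_t sp, uint32_t *calls, size_t max)
{
    assert(mem != NULL && calls != NULL);

    size_t depth = 0;
    while (sp != 0 && depth < max) {
        calls[depth++] = read_word(mem, sp + I7_OFFSET);
        sp = read_word(mem, sp + FP_OFFSET);   /* the caller's %sp */
    }
    return depth;
}
```
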
When compiling for debugging, compilers will generate additional code
as well as avoid some optimizations in order to allow reconstructing
situations during execution. For example, GCC/GDB makes sure original
parameter values are kept intact somewhere for future parsing of
the procedure call stack. The live-in registers other than %i0 are
not touched. %i0 itself is copied into a free local register, and its
location is noted in the symbol file. (You can find out where variables
reside by using the "info address" command in GDB.)

Given that much of the semantics relating to stack handling and procedure
call entry/exit code is only recommended, debuggers will sometimes
be fooled. For example, the decision as to whether or not the current
procedure is a leaf procedure can be incorrect. In this case a spurious
procedure will be inserted between the current procedure and its "real"
parent. Another example is when the application maintains its own implicit
call hierarchy, such as jumping to function pointers. In this case the
debugger can easily become totally confused.

The window overflow and underflow traps
---------------------------------------

When the SAVE instruction decrements the current window pointer (CWP)
so that it coincides with the invalid window in the window invalid mask
(WIM), a window overflow trap occurs. Conversely, when the RESTORE or
RETT instructions increment the CWP to coincide with the invalid window,
a window underflow trap occurs.

1271 | |
---|
1272 | Either trap is handled by the operating system. Generally, data is |
---|
1273 | written out to memory and/or read from memory, and the WIM register |
---|
1274 | suitably altered. |
---|
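The two trap conditions can be written down as a pair of predicates. This is
a sketch only, assuming eight implemented windows and a WIM in which bit n
set marks window n as invalid; the function names are invented for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NWINDOWS 8u   /* assumed number of implemented windows */

/* True if a SAVE executed in window cwp would raise window_overflow. */
bool save_would_overflow(uint32_t cwp, uint32_t wim)
{
    assert(cwp < NWINDOWS);
    uint32_t new_cwp = (cwp + NWINDOWS - 1u) % NWINDOWS;   /* SAVE decrements */
    return ((wim >> new_cwp) & 1u) != 0;
}

/* True if a RESTORE executed in window cwp would raise window_underflow. */
bool restore_would_underflow(uint32_t cwp, uint32_t wim)
{
    assert(cwp < NWINDOWS);
    uint32_t new_cwp = (cwp + 1u) % NWINDOWS;              /* RESTORE increments */
    return ((wim >> new_cwp) & 1u) != 0;
}
```

With window 0 marked invalid (WIM = 0x01), a SAVE from window 1 overflows
and a RESTORE from window 7 underflows, while a SAVE from window 2 proceeds
normally.
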
1275 | |
---|
1276 | The code in Figure 9 and Figure 10 below are bare-bones handlers for |
---|
1277 | the two traps. The text is directly from the source code, and sort of |
---|
1278 | works. (As far as I know, these are minimalistic handlers for SPARC |
---|
1279 | V8). Note that there is no way to directly access window registers |
---|
1280 | other than the current one, hence the code does additional save/restore |
---|
1281 | instructions. It's pretty tricky to understand the code, but figure 1 |
---|
1282 | should be of help. |
---|
1283 | |
---|
.. code-block:: asm

    /* a SAVE instruction caused a trap */
    window_overflow:
        /* rotate WIM one bit right, we have 8 windows */
        mov     %wim,%l3
        sll     %l3,7,%l4
        srl     %l3,1,%l3
        or      %l3,%l4,%l3
        and     %l3,0xff,%l3

        /* disable WIM traps */
        mov     %g0,%wim
        nop; nop; nop

        /* point to correct window */
        save

        /* dump registers to stack */
        std     %l0, [%sp +  0]
        std     %l2, [%sp +  8]
        std     %l4, [%sp + 16]
        std     %l6, [%sp + 24]
        std     %i0, [%sp + 32]
        std     %i2, [%sp + 40]
        std     %i4, [%sp + 48]
        std     %i6, [%sp + 56]

        /* back to where we should be */
        restore

        /* set new value of window */
        mov     %l3,%wim
        nop; nop; nop

        /* go home */
        jmp     %l1
        rett    %l2

.. comment Figure 9 - window_overflow trap handler

.. code-block:: asm

    /* a RESTORE instruction caused a trap */
    window_underflow:

        /* rotate WIM one bit left, we have 8 windows */
        mov     %wim,%l3
        srl     %l3,7,%l4
        sll     %l3,1,%l3
        or      %l3,%l4,%l3
        and     %l3,0xff,%l3

        /* disable WIM traps */
        mov     %g0,%wim
        nop; nop; nop

        /* point to correct window */
        restore
        restore

        /* load registers from stack */
        ldd     [%sp +  0], %l0
        ldd     [%sp +  8], %l2
        ldd     [%sp + 16], %l4
        ldd     [%sp + 24], %l6
        ldd     [%sp + 32], %i0
        ldd     [%sp + 40], %i2
        ldd     [%sp + 48], %i4
        ldd     [%sp + 56], %i6

        /* back to where we should be */
        save
        save

        /* set new value of window */
        mov     %l3,%wim
        nop; nop; nop

        /* go home */
        jmp     %l1
        rett    %l2

.. comment Figure 10 - window_underflow trap handler

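The shift-and-mask sequences at the top of each handler are 8-bit rotations
of the WIM. For readers more comfortable with C, an equivalent sketch
follows; the function names are made up, and NWINDOWS is assumed to be 8 as
in the handlers above.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 8u                          /* assumed, as in the handlers */
#define WIM_MASK ((1u << NWINDOWS) - 1u)     /* 0xff for 8 windows */

/* Rotate the WIM one bit right, as the window_overflow handler does. */
uint32_t wim_rotate_right(uint32_t wim)
{
    assert(wim <= WIM_MASK);
    return ((wim >> 1) | (wim << (NWINDOWS - 1u))) & WIM_MASK;
}

/* Rotate the WIM one bit left, as the window_underflow handler does. */
uint32_t wim_rotate_left(uint32_t wim)
{
    assert(wim <= WIM_MASK);
    return ((wim << 1) | (wim >> (NWINDOWS - 1u))) & WIM_MASK;
}
```

Rotating right moves the invalid window in the direction SAVE travels, and
rotating left undoes it, so an overflow followed by the matching underflow
leaves the WIM unchanged.
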