source: rtems-docs/cpu_supplement/sparc64.rst @ f233256

4.115
Last change on this file since f233256 was f233256, checked in by Chris Johns <chrisj@…>, on 10/06/16 at 22:13:16

Clean up the CPU Supplement.

[489740f]1.. comment SPDX-License-Identifier: CC-BY-SA-4.0
2
[f233256]3.. COMMENT: COPYRIGHT (c) 1988-2002.
4.. COMMENT: On-Line Applications Research Corporation (OAR).
5.. COMMENT: All rights reserved.
6
[d755cbd]7SPARC-64 Specific Information
8#############################
9
10This document discusses the SPARC Version 9 (aka SPARC-64, SPARC64 or SPARC V9)
11architecture dependencies in this port of RTEMS.
12
The SPARC V9 architecture leaves many implementation dependencies undefined;
these are defined by the individual processor models. Consult the specific CPU
model section in this document for additional documents covering the
implementation dependent architectural features.
17
18**sun4u Specific Information**
19
20sun4u is the subset of the SPARC V9 implementations comprising the UltraSPARC I
21through UltraSPARC IV processors.
22
23The following documents were used in developing the SPARC-64 sun4u port:
24
- UltraSPARC User's Manual
  (http://www.sun.com/microelectronics/manuals/ultrasparc/802-7220-02.pdf)
27
[f233256]28- UltraSPARC IIIi Processor (http://datasheets.chipdb.org/Sun/UltraSparc-IIIi.pdf)
[d755cbd]29
30**sun4v Specific Information**
31
[f233256]32sun4v is the subset of the SPARC V9 implementations comprising the UltraSPARC
33T1 or T2 processors.
[d755cbd]34
35The following documents were used in developing the SPARC-64 sun4v port:
36
37- UltraSPARC Architecture 2005 Specification
38  (http://opensparc-t1.sunsource.net/specs/UA2005-current-draft-P-EXT.pdf)
39
40- UltraSPARC T1 supplement to UltraSPARC Architecture 2005 Specification
41  (http://opensparc-t1.sunsource.net/specs/UST1-UASuppl-current-draft-P-EXT.pdf)
42
[f233256]43The defining feature that separates the sun4v architecture from its predecessor
44is the existence of a super-privileged hypervisor that is responsible for
45providing virtualized execution environments.  The impact of the hypervisor on
46the real-time guarantees available with sun4v has not yet been determined.
[d755cbd]47
48CPU Model Dependent Features
49============================
50
51CPU Model Feature Flags
52-----------------------
53
[f233256]54This section presents the set of features which vary across SPARC-64
55implementations and are of importance to RTEMS. The set of CPU model feature
56macros are defined in the file cpukit/score/cpu/sparc64/sparc64.h based upon
57the particular CPU model defined on the compilation command line.
[d755cbd]58
59CPU Model Name
60~~~~~~~~~~~~~~
61
The macro CPU_MODEL_NAME is a string which designates the name of this CPU
model.  For example, for the UltraSPARC T1 SPARC V9 model, this macro is set to
the string "sun4v".
[d755cbd]65
66Floating Point Unit
67~~~~~~~~~~~~~~~~~~~
68
[f233256]69The macro SPARC_HAS_FPU is set to 1 to indicate that this CPU model has a
70hardware floating point unit and 0 otherwise.
[d755cbd]71
72Number of Register Windows
73~~~~~~~~~~~~~~~~~~~~~~~~~~
74
[f233256]75The macro SPARC_NUMBER_OF_REGISTER_WINDOWS is set to indicate the number of
76register window sets implemented by this CPU model.  The SPARC architecture
77allows for a maximum of thirty-two register window sets although most
78implementations only include eight.
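
As a rough illustration of how such feature macros might be consumed, the
following sketch sizes two hypothetical save areas at compile time.  The
fallback values (1 and 8) and the helper names are assumptions made only so
that the fragment compiles on its own; in a real build the values come from
sparc64.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fallbacks for illustration only; a real build takes these from sparc64.h. */
#ifndef SPARC_HAS_FPU
#define SPARC_HAS_FPU 1
#endif
#ifndef SPARC_NUMBER_OF_REGISTER_WINDOWS
#define SPARC_NUMBER_OF_REGISTER_WINDOWS 8
#endif

/*
 * Hypothetical helper: bytes needed to hold the 8 local and 8 input
 * registers (64 bits each) of every register window.
 */
static size_t example_window_save_area_size(void)
{
  /* The architecture allows at most thirty-two register window sets. */
  assert(SPARC_NUMBER_OF_REGISTER_WINDOWS <= 32);
  return (size_t) SPARC_NUMBER_OF_REGISTER_WINDOWS * 16u * sizeof(uint64_t);
}

/*
 * Hypothetical helper: bytes needed for the floating point registers,
 * or 0 when the CPU model has no hardware FPU.
 */
static size_t example_fp_save_area_size(void)
{
#if SPARC_HAS_FPU == 1
  return 64u * sizeof(uint32_t);
#else
  return 0u;
#endif
}
```

With the assumed fallback values, the window area is 8 * 16 * 8 bytes and the
floating point area is 64 * 4 bytes.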
[d755cbd]79
80CPU Model Implementation Notes
81------------------------------
82
This section describes the implementation dependencies of the CPU Models sun4u
and sun4v of the SPARC V9 architecture.
[d755cbd]85
86sun4u Notes
87~~~~~~~~~~~
88
89XXX
90
91sun4v Notes
~~~~~~~~~~~
93
94XXX
95
96Calling Conventions
97===================
98
[f233256]99Each high-level language compiler generates subroutine entry and exit code
100based upon a set of rules known as the compiler's calling convention.  These
101rules address the following issues:
[d755cbd]102
103- register preservation and usage
104
105- parameter passing
106
107- call and return mechanism
108
A compiler's calling convention is of importance when
interfacing to subroutines written in another language, either
assembly or high-level.  Even when the high-level language and
112target processor are the same, different compilers may use
113different calling conventions.  As a result, calling conventions
114are both processor and compiler dependent.
115
[f233256]116The following document also provides some conventions on the global register
117usage in SPARC V9: http://developers.sun.com/solaris/articles/sparcv9abi.html
[d755cbd]118
119Programming Model
120-----------------
121
[f233256]122This section discusses the programming model for the SPARC architecture.
[d755cbd]123
124Non-Floating Point Registers
125~~~~~~~~~~~~~~~~~~~~~~~~~~~~
126
[f233256]127The SPARC architecture defines thirty-two non-floating point registers directly
128visible to the programmer.  These are divided into four sets:
[d755cbd]129
130- input registers
131
132- local registers
133
134- output registers
135
136- global registers
137
[f233256]138Each register is referred to by either two or three names in the SPARC
139reference manuals.  First, the registers are referred to as r0 through r31 or
140with the alternate notation r[0] through r[31].  Second, each register is a
141member of one of the four sets listed above.  Finally, some registers have an
142architecturally defined role in the programming model which provides an
143alternate name.  The following table describes the mapping between the 32
144registers and the register sets:
[d755cbd]145
146    +-----------------+----------------+------------------+
147    | Register Number | Register Names |   Description    |
148    +-----------------+----------------+------------------+
149    |     0 - 7       |    g0 - g7     | Global Registers |
150    +-----------------+----------------+------------------+
151    |     8 - 15      |    o0 - o7     | Output Registers |
152    +-----------------+----------------+------------------+
153    |    16 - 23      |    l0 - l7     | Local Registers  |
154    +-----------------+----------------+------------------+
155    |    24 - 31      |    i0 - i7     | Input Registers  |
156    +-----------------+----------------+------------------+
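
The mapping in the table above is regular enough to compute.  The following
minimal sketch derives the set-based name from the register number; the
function name is hypothetical and not part of RTEMS.

```c
#include <assert.h>

/*
 * Hypothetical helper: writes the set-based name of register r (0..31)
 * into buf, for example "o6" for r14.  buf must hold at least 3 bytes.
 */
static void example_register_set_name(unsigned r, char buf[3])
{
  /* Registers 0-7 are globals, 8-15 outputs, 16-23 locals, 24-31 inputs. */
  static const char set_letter[4] = { 'g', 'o', 'l', 'i' };

  assert(r < 32);
  buf[0] = set_letter[r / 8];
  buf[1] = (char) ('0' + r % 8);
  buf[2] = '\0';
}
```

For instance, register 14 maps to o6 and register 30 maps to i6.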
157
[f233256]158As mentioned above, some of the registers serve defined roles in the
159programming model.  The following table describes the role of each of these
160registers:
[d755cbd]161
162    +---------------+----------------+----------------------+
163    | Register Name | Alternate Name |      Description     |
164    +---------------+----------------+----------------------+
165    |     g0        |      na        |    reads return 0    |
166    |               |                |  writes are ignored  |
167    +---------------+----------------+----------------------+
168    |     o6        |      sp        |     stack pointer    |
169    +---------------+----------------+----------------------+
170    |     i6        |      fp        |     frame pointer    |
171    +---------------+----------------+----------------------+
172    |     i7        |      na        |    return address    |
173    +---------------+----------------+----------------------+
174
175Floating Point Registers
176~~~~~~~~~~~~~~~~~~~~~~~~
177
The SPARC V9 floating point register file holds the equivalent of sixty-four
32-bit registers.  These registers may be viewed as follows:
[d755cbd]180
[f233256]181- 32 32-bit single precision floating point or integer registers (f0, f1,
182  ... f31)
[d755cbd]183
[f233256]184- 32 64-bit double precision floating point registers (f0, f2, f4, ... f62)
[d755cbd]185
[f233256]186- 16 128-bit extended precision floating point registers (f0, f4, f8, ... f60)
[d755cbd]187
[f233256]188The floating point state register (fsr) specifies the behavior of the floating
189point unit for rounding, contains its condition codes, version specification,
190and trap information.
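
The three overlapping views listed above can be checked mechanically.  The
sketch below assumes that register number n names the n-th 32-bit word of the
register file, so that a double or quad register covers the following one or
three words as well; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Width of a floating point register view, in 32-bit words. */
enum example_fp_width {
  EXAMPLE_FP_SINGLE = 1,
  EXAMPLE_FP_DOUBLE = 2,
  EXAMPLE_FP_QUAD = 4
};

/*
 * Returns true if f<n> is a valid register name for the given width and,
 * if so, reports which 32-bit words of the register file it covers.
 */
static bool example_fp_register_words(
  unsigned n,
  enum example_fp_width width,
  unsigned *first_word,
  unsigned *word_count
)
{
  /* Singles stop at f31; doubles and quads reach into the upper half. */
  unsigned limit = (width == EXAMPLE_FP_SINGLE) ? 32u : 64u;

  if (n >= limit || n % (unsigned) width != 0) {
    return false;
  }

  *first_word = n;
  *word_count = (unsigned) width;
  return true;
}
```

Under this model f62 is a valid double covering words 62 and 63, while f32 is
not a valid single and f62 is not a valid quad.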
[d755cbd]191
192Special Registers
193~~~~~~~~~~~~~~~~~
194
195The SPARC architecture includes a number of special registers:
196
197*``Ancillary State Registers (ASRs)``*
[f233256]198    The ancillary state registers (ASRs) are optional state registers that may
199    be privileged or nonprivileged. ASRs 16-31 are implementation-
200    dependent. The SPARC V9 ASRs include: y, ccr, asi, tick, pc, fprs.  The
201    sun4u ASRs include: pcr, pic, dcr, gsr, softint set, softint clr, softint,
202    and tick cmpr. The sun4v ASRs include: pcr, pic, gsr, soft- int set,
203    softint clr, softint, tick cmpr, stick, and stick cmpr.
[d755cbd]204
205*``Processor State Register (pstate)``*
    The privileged pstate register contains control fields for the processor's
    current state. Its flag fields include the interrupt enable, privileged
    mode, and enable FPU.
209
210*``Processor Interrupt Level (pil)``*
211    The PIL specifies the interrupt level above which interrupts will be
212    accepted.
213
214*``Trap Registers``*
[f233256]215    The trap handling mechanism of the SPARC V9 includes a number of registers,
216    including: trap program counter (tpc), trap next pc (tnpc), trap state
217    (tstate), trap type (tt), trap base address (tba), and trap level (tl).
[d755cbd]218
219*``Alternate Globals``*
[f233256]220    The AG bit of the pstate register provides access to an alternate set of
221    global registers. On sun4v, the AG bit is replaced by the global level (gl)
222    register, providing access to at least two and at most eight alternate sets
223    of globals.
[d755cbd]224
225*``Register Window registers``*
    A number of registers assist in register window management.  These include
    the current window pointer (cwp), savable windows (cansave), restorable
    windows (canrestore), clean windows (cleanwin), other windows (otherwin),
    and window state (wstate).
[d755cbd]230
231Register Windows
232----------------
233
[f233256]234The SPARC architecture includes the concept of register windows.  An overly
235simplistic way to think of these windows is to imagine them as being an
236infinite supply of "fresh" register sets available for each subroutine to use.
237In reality, they are much more complicated.
238
239The save instruction is used to obtain a new register window.  This instruction
240increments the current window pointer, thus providing a new set of registers
241for use. This register set includes eight fresh local registers for use
242exclusively by this subroutine. When done with a register set, the restore
243instruction decrements the current window pointer and the previous register set
244is once again available.
245
246The two primary issues complicating the use of register windows are that (1)
247the set of register windows is finite, and (2) some registers are shared
between adjacent register windows.
249
250Because the set of register windows is finite, it is possible to execute enough
251save instructions without corresponding restore's to consume all of the
252register windows.  This is easily accomplished in a high level language because
253each subroutine typically performs a save instruction upon entry.  Thus having
254a subroutine call depth greater than the number of register windows will result
255in a window overflow condition.  The window overflow condition generates a trap
256which must be handled in software.  The window overflow trap handler is
257responsible for saving the contents of the oldest register window on the
258program stack.
259
260Similarly, the subroutines will eventually complete and begin to perform
261restore's.  If the restore results in the need for a register window which has
262previously been written to memory as part of an overflow, then a window
263underflow condition results.  Just like the window overflow, the window
264underflow condition must be handled in software by a trap handler.  The window
265underflow trap handler is responsible for reloading the contents of the
266register window requested by the restore instruction from the program stack.
267
268The cansave, canrestore, otherwin, and cwp are used in conjunction to manage
269the finite set of register windows and detect the window overflow and underflow
270conditions. The first three of these registers must satisfy the invariant
271cansave + canrestore + otherwin = nwindow - 2, where nwindow is the number of
272register windows.  The cwp contains the index of the register window currently
273in use.  RTEMS does not use the cleanwin and otherwin registers.
274
275The save instruction increments the cwp modulo the number of register windows,
276and if cansave is 0 then it also generates a window overflow. Similarly, the
277restore instruction decrements the cwp modulo the number of register windows,
278and if canrestore is 0 then it also generates a window underflow.
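
The bookkeeping described above can be modelled in a few lines of C.  This is
a simplified sketch, not RTEMS code: all names are hypothetical, a trap is
reported by returning false with the state left unchanged, and the counter
updates (save decrements cansave and increments canrestore, restore does the
opposite) are stated here as an assumption about SPARC V9 behavior rather
than taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the register window management registers. */
struct example_windows {
  unsigned nwindow;
  unsigned cwp;
  unsigned cansave;
  unsigned canrestore;
  unsigned otherwin;
};

/* Checks cansave + canrestore + otherwin = nwindow - 2. */
static bool example_invariant_holds(const struct example_windows *w)
{
  return w->cansave + w->canrestore + w->otherwin == w->nwindow - 2;
}

/* Models save; returns false to signal a window overflow trap. */
static bool example_save(struct example_windows *w)
{
  if (w->cansave == 0) {
    return false;
  }
  w->cwp = (w->cwp + 1) % w->nwindow;
  w->cansave--;
  w->canrestore++;
  return true;
}

/* Models restore; returns false to signal a window underflow trap. */
static bool example_restore(struct example_windows *w)
{
  if (w->canrestore == 0) {
    return false;
  }
  w->cwp = (w->cwp + w->nwindow - 1) % w->nwindow;
  w->canrestore--;
  w->cansave++;
  return true;
}
```

Starting from eight windows with cansave equal to 6, six nested calls succeed
and the seventh save reports an overflow, while the invariant holds
throughout.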
279
280Unlike with the SPARC model, the SPARC-64 port does not assume that a register
281window is available for a trap. The window overflow and underflow conditions
282are not detected without hardware generating the trap. (These conditions can be
283detected by reading the register window registers and doing some simple
284arithmetic.)
285
286The window overflow and window underflow trap handlers are a critical part of
287the run-time environment for a SPARC application.  The SPARC architectural
288specification allows for the number of register windows to be any power of two
289less than or equal to 32.  The most common choice for SPARC implementations
290appears to be 8 register windows.  This results in the cwp ranging in value
291from 0 to 7 on most implementations.
292
293The second complicating factor is the sharing of registers between adjacent
294register windows.  While each register window has its own set of local
295registers, the input and output registers are shared between adjacent windows.
296The output registers for register window N are the same as the input registers
297for register window ((N + 1) modulo RW) where RW is the number of register
298windows.  An alternative way to think of this is to remember how parameters are
299passed to a subroutine on the SPARC.  The caller loads values into what are its
300output registers.  Then after the callee executes a save instruction, those
301parameters are available in its input registers.  This is a very efficient way
302to pass parameters as no data is actually moved by the save or restore
303instructions.
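
The sharing rule can be made concrete by mapping each windowed register onto
an index in a physical register file.  The layout below (sixteen registers
per window, inputs first, then locals) and the window count of eight are
assumptions chosen for illustration; only the aliasing of outputs onto the
next window's inputs comes from the description above.

```c
#include <assert.h>

#define EXAMPLE_RW 8u /* assumed number of register windows */

enum example_set { EXAMPLE_OUT, EXAMPLE_LOCAL, EXAMPLE_IN };

/*
 * Hypothetical layout: returns the index of register k (0..7) of the given
 * set in a physical file of EXAMPLE_RW * 16 windowed registers.
 */
static unsigned example_physical_index(
  unsigned window,
  enum example_set set,
  unsigned k
)
{
  assert(window < EXAMPLE_RW && k < 8);

  switch (set) {
    case EXAMPLE_IN:
      return 16u * window + k;
    case EXAMPLE_LOCAL:
      return 16u * window + 8u + k;
    default:
      /* Outputs of window N are the inputs of window (N + 1) modulo RW. */
      return 16u * ((window + 1u) % EXAMPLE_RW) + k;
  }
}
```

In this model o0 of window 3 and i0 of window 4 resolve to the same index,
which is why no data has to be moved when the callee executes save.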
[d755cbd]304
305Call and Return Mechanism
306-------------------------
307
[f233256]308The SPARC architecture supports a simple yet effective call and return
309mechanism.  A subroutine is invoked via the call (call) instruction.  This
310instruction places the return address in the caller's output register 7 (o7).
311After the callee executes a save instruction, this value is available in input
312register 7 (i7) until the corresponding restore instruction is executed.
313
314The callee returns to the caller via a jmp to the return address.  There is a
315delay slot following this instruction which is commonly used to execute a
316restore instruction - if a register window was allocated by this subroutine.
317
318It is important to note that the SPARC subroutine call and return mechanism
319does not automatically save and restore any registers.  This is accomplished
via the save and restore instructions which manage the set of register
windows.  This allows for the compiler to generate leaf-optimized functions
[d389819]322that utilize the caller's output registers without using save and restore.
[d755cbd]323
324Calling Mechanism
325-----------------
326
[f233256]327All RTEMS directives are invoked using the regular SPARC calling convention via
328the call instruction.
[d755cbd]329
330Register Usage
331--------------
332
[f233256]333As discussed above, the call instruction does not automatically save any
334registers.  The save and restore instructions are used to allocate and
335deallocate register windows.  When a register window is allocated, the new set
336of local registers are available for the exclusive use of the subroutine which
337allocated this register set.
[d755cbd]338
339Parameter Passing
340-----------------
341
[f233256]342RTEMS assumes that arguments are placed in the caller's output registers with
343the first argument in output register 0 (o0), the second argument in output
344register 1 (o1), and so forth.  Until the callee executes a save instruction,
345the parameters are still visible in the output registers.  After the callee
346executes a save instruction, the parameters are visible in the corresponding
347input registers.  The following pseudo-code illustrates the typical sequence
used to call an RTEMS directive with three (3) arguments:
349
350.. code-block:: c
[d755cbd]351
352    load third argument into o2
353    load second argument into o1
354    load first argument into o0
355    invoke directive
356
357User-Provided Routines
358----------------------
359
[f233256]360All user-provided routines invoked by RTEMS, such as user extensions, device
361drivers, and MPCI routines, must also adhere to these calling conventions.
[d755cbd]362
363Memory Model
364============
365
[f233256]366A processor may support any combination of memory models ranging from pure
367physical addressing to complex demand paged virtual memory systems.  RTEMS
368supports a flat memory model which ranges contiguously over the processor's
369allowable address space.  RTEMS does not support segmentation or virtual memory
370of any kind.  The appropriate memory model for RTEMS provided by the targeted
371processor and related characteristics of that model are described in this
372chapter.
[d755cbd]373
374Flat Memory Model
375-----------------
376
[f233256]377The SPARC-64 architecture supports a flat 64-bit address space with addresses
378ranging from 0x0000000000000000 to 0xFFFFFFFFFFFFFFFF.  Each address is
represented by a 64-bit value (and an 8-bit address space identifier or ASI)
and is byte addressable. The address may be used to reference a single byte,
half-word (2 bytes), word (4 bytes), doubleword (8 bytes), or quadword (16
382bytes).  Memory accesses within this address space are performed in big endian
383fashion by the SPARC. Memory accesses which are not properly aligned generate a
384"memory address not aligned" trap (type number 0x34). The following table lists
385the alignment requirements for a variety of data accesses:
[d755cbd]386
[f233256]387.. table::
[d755cbd]388
389    +--------------+-----------------------+
390    |   Data Type  | Alignment Requirement |
391    +--------------+-----------------------+
392    |     byte     |          1            |
393    |   half-word  |          2            |
394    |     word     |          4            |
395    |  doubleword  |          8            |
396    |   quadword   |          16           |
397    +--------------+-----------------------+
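
The alignment rule and the big endian byte order can both be expressed in
portable C.  The helpers below are a minimal sketch with hypothetical names;
they model the checks in software and do not touch the hardware trap
mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if an access of the given size (1, 2, 4, 8 or 16 bytes)
 * at this address meets the alignment requirement from the table.
 */
static bool example_is_aligned(uint64_t address, unsigned size)
{
  assert(size == 1 || size == 2 || size == 4 || size == 8 || size == 16);
  return (address & (uint64_t) (size - 1u)) == 0;
}

/* Assembles a 32-bit word from memory bytes in big endian order. */
static uint32_t example_load_be32(const uint8_t bytes[4])
{
  return ((uint32_t) bytes[0] << 24)
    | ((uint32_t) bytes[1] << 16)
    | ((uint32_t) bytes[2] << 8)
    | (uint32_t) bytes[3];
}
```

A doubleword access at 0x1004 fails the check, which corresponds to the case
in which the processor would raise the "memory address not aligned" trap.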
398
RTEMS currently does not support any SPARC Memory Management Units; therefore,
virtual memory or segmentation systems involving the SPARC are not supported.
[d755cbd]401
402Interrupt Processing
403====================
404
RTEMS and associated documentation use the terms interrupt and vector.  In the
406SPARC architecture, these terms correspond to traps and trap type,
407respectively.  The terms will be used interchangeably in this manual. Note that
408in the SPARC manuals, interrupts are a subset of the traps that are delivered
409to software interrupt handlers.
[d755cbd]410
411Synchronous Versus Asynchronous Traps
412-------------------------------------
413
[f233256]414The SPARC architecture includes two classes of traps: synchronous (precise) and
415asynchronous (deferred).  Asynchronous traps occur when an external event
416interrupts the processor.  These traps are not associated with any instruction
417executed by the processor and logically occur between instructions.  The
418instruction currently in the execute stage of the processor is allowed to
419complete although subsequent instructions are annulled.  The return address
420reported by the processor for asynchronous traps is the pair of instructions
421following the current instruction.
422
423Synchronous traps are caused by the actions of an instruction.  The trap
424stimulus in this case either occurs internally to the processor or is from an
425external signal that was provoked by the instruction.  These traps are taken
426immediately and the instruction that caused the trap is aborted before any
427state changes occur in the processor itself.  The return address reported by
428the processor for synchronous traps is the instruction which caused the trap
429and the following instruction.
[d755cbd]430
431Vectoring of Interrupt Handler
432------------------------------
433
[f233256]434Upon receipt of an interrupt the SPARC automatically performs the following
435actions:
[d755cbd]436
[f233256]437- The trap level is set. This provides access to a fresh set of privileged
438  trap-state registers used to save the current state, in effect, pushing a
439  frame on the trap stack.  TL <- TL + 1
[d755cbd]440
- Existing state is preserved:

  - TSTATE[TL].CCR <- CCR
  - TSTATE[TL].ASI <- ASI
  - TSTATE[TL].PSTATE <- PSTATE
  - TSTATE[TL].CWP <- CWP
  - TPC[TL] <- PC
  - TNPC[TL] <- nPC
448
449- The trap type is preserved. TT[TL] <- the trap type
450
- The PSTATE register is updated to a predefined state:

  - PSTATE.MM is unchanged
  - PSTATE.RED <- 0
  - PSTATE.PEF <- 1 if FPU is present, 0 otherwise
  - PSTATE.AM <- 0 (address masking is turned off)
  - PSTATE.PRIV <- 1 (the processor enters privileged mode)
  - PSTATE.IE <- 0 (interrupts are disabled)
  - PSTATE.AG <- 1 (global regs are replaced with alternate globals)
  - PSTATE.CLE <- PSTATE.TLE (set endian mode for traps)
460
461- For a register-window trap only, CWP is set to point to the register
462  window that must be accessed by the trap-handler software, that is:
463
464  - If TT[TL] = 0x24 (a clean window trap), then CWP <- CWP + 1.
465  - If (0x80 <= TT[TL] <= 0xBF) (window spill trap), then CWP <- CWP +
466    CANSAVE + 2.
  - If (0xC0 <= TT[TL] <= 0xFF) (window fill trap), then CWP <- CWP - 1.
468  - For non-register-window traps, CWP is not changed.
469
470- Control is transferred into the trap table:
471
  - PC <- TBA<63:15> || (TL>0) || TT[TL] || 0 0000 (bit fields concatenated)
  - nPC <- TBA<63:15> || (TL>0) || TT[TL] || 0 0100 (bit fields concatenated)
474  - where (TL>0) is 0 if TL = 0, and 1 if TL > 0.
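
The last two steps above amount to simple bit arithmetic.  The sketch below
computes the trap table entry address and the CWP adjustment; the function
names are hypothetical, the TT field is assumed to be nine bits wide so that
the fields fill bits 14 down to 0, the TL value is assumed to be the one in
effect before the trap was taken, and CWP arithmetic is assumed to wrap
modulo the number of windows.

```c
#include <assert.h>
#include <stdint.h>

/*
 * PC = TBA<63:15>, then the (TL>0) bit, then TT, then five zero bits.
 * The corresponding nPC is the returned value plus 4.
 */
static uint64_t example_trap_vector_pc(
  uint64_t tba,
  unsigned tl_before_trap,
  unsigned tt
)
{
  uint64_t pc;

  assert(tt <= 0x1FFu);
  pc = tba & ~UINT64_C(0x7FFF);
  pc |= (uint64_t) (tl_before_trap > 0 ? 1u : 0u) << 14;
  pc |= (uint64_t) tt << 5;
  return pc;
}

/* CWP seen by the trap handler, following the rules listed above. */
static unsigned example_trap_cwp(
  unsigned tt,
  unsigned cwp,
  unsigned cansave,
  unsigned nwindow
)
{
  assert(nwindow > 0 && cwp < nwindow);

  if (tt == 0x24u) {                 /* clean window trap */
    return (cwp + 1u) % nwindow;
  }
  if (tt >= 0x80u && tt <= 0xBFu) {  /* window spill trap */
    return (cwp + cansave + 2u) % nwindow;
  }
  if (tt >= 0xC0u && tt <= 0xFFu) {  /* window fill trap */
    return (cwp + nwindow - 1u) % nwindow;
  }
  return cwp;                        /* not a register-window trap */
}
```

Each trap table entry is 32 bytes (eight instructions), which is why the trap
type is shifted left by five bits.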
475
476In order to safely invoke a subroutine during trap handling, traps must be
[f233256]477enabled to allow for the possibility of register window overflow and underflow
478traps.
[d755cbd]479
[f233256]480If the interrupt handler was installed as an RTEMS interrupt handler, then upon
481receipt of the interrupt, the processor passes control to the RTEMS interrupt
482handler which performs the following actions:
[d755cbd]483
- saves the state of the interrupted task on its stack,
[d755cbd]485
486- switches the processor to trap level 0,
487
[f233256]488- if this is the outermost (i.e. non-nested) interrupt, then the RTEMS
489  interrupt handler switches from the current stack to the interrupt stack,
[d755cbd]490
491- enables traps,
492
- vectors to a user interrupt service routine (ISR).
494
[f233256]495Asynchronous interrupts are ignored while traps are disabled.  Synchronous
496traps which occur while traps are disabled may result in the CPU being forced
497into an error mode.
[d755cbd]498
[f233256]499A nested interrupt is processed similarly with the exception that the current
500stack need not be switched to the interrupt stack.
[d755cbd]501
502Traps and Register Windows
503--------------------------
504
505XXX
506
507Interrupt Levels
508----------------
509
[f233256]510Sixteen levels (0-15) of interrupt priorities are supported by the SPARC
511architecture with level fifteen (15) being the highest priority.  Level
512zero (0) indicates that interrupts are fully enabled.  Interrupt requests for
513interrupts with priorities less than or equal to the current interrupt mask
[d755cbd]514level are ignored.
515
[f233256]516Although RTEMS supports 256 interrupt levels, the SPARC only supports sixteen.
517RTEMS interrupt levels 0 through 15 directly correspond to SPARC processor
518interrupt levels.  All other RTEMS interrupt levels are undefined and their
519behavior is unpredictable.
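
The masking rule can be summarized as a pair of predicates.  This is an
illustrative sketch with hypothetical names; it assumes interrupt requests
carry a level from 1 to 15 and that the current mask level is the PIL value.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if an interrupt request at request_level is accepted while
 * the processor interrupt level is pil.  Requests at or below the mask
 * level are ignored.
 */
static bool example_interrupt_accepted(unsigned request_level, unsigned pil)
{
  assert(request_level >= 1 && request_level <= 15 && pil <= 15);
  return request_level > pil;
}

/* Only RTEMS interrupt levels 0 through 15 have a defined meaning here. */
static bool example_rtems_level_is_defined(unsigned rtems_level)
{
  return rtems_level <= 15;
}
```

With the mask level at 15 no request is accepted, and with the mask level at
0 every request level from 1 to 15 is accepted.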
[d755cbd]520
521Disabling of Interrupts by RTEMS
522--------------------------------
523
524XXX
525
526Interrupt Stack
527---------------
528
[f233256]529The SPARC architecture does not provide for a dedicated interrupt stack.  Thus
530by default, trap handlers would execute on the stack of the RTEMS task which
531they interrupted.  This artificially inflates the stack requirements for each
532task since EVERY task stack would have to include enough space to account for
the worst case interrupt stack requirements in addition to its own worst case
534usage.  RTEMS addresses this problem on the SPARC by providing a dedicated
535interrupt stack managed by software.
[d755cbd]536
[f233256]537During system initialization, RTEMS allocates the interrupt stack from the
538Workspace Area.  The amount of memory allocated for the interrupt stack is
539determined by the interrupt_stack_size field in the CPU Configuration Table.
540As part of processing a non-nested interrupt, RTEMS will switch to the
541interrupt stack before invoking the installed handler.
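
The stack selection for nested and non-nested interrupts can be sketched as
follows.  This is not the RTEMS implementation: the structure, the function
names, and the use of a simple nesting counter are assumptions made to
illustrate the rule that only the outermost interrupt switches stacks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the software-managed interrupt stack. */
struct example_isr_state {
  unsigned nest_level;            /* 0 when no interrupt is in progress */
  uintptr_t interrupt_stack_top;  /* initial stack pointer for interrupts */
};

/* Returns the stack pointer on which the handler should run. */
static uintptr_t example_isr_enter(
  struct example_isr_state *s,
  uintptr_t current_sp
)
{
  /* Only the outermost interrupt switches to the interrupt stack. */
  uintptr_t sp = (s->nest_level == 0) ? s->interrupt_stack_top : current_sp;

  s->nest_level++;
  return sp;
}

/* Called when a handler completes. */
static void example_isr_exit(struct example_isr_state *s)
{
  assert(s->nest_level > 0);
  s->nest_level--;
}
```

A nested interrupt arrives while the processor is already running on the
interrupt stack, so it keeps the current stack pointer.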
[d755cbd]542
543Default Fatal Error Processing
544==============================
545
[f233256]546Upon detection of a fatal error by either the application or RTEMS the fatal
547error manager is invoked.  The fatal error manager will invoke the
548user-supplied fatal error handlers.  If no user-supplied handlers are
549configured, the RTEMS provided default fatal error handler is invoked.  If the
550user-supplied fatal error handlers return to the executive the default fatal
551error handler is then invoked.  This chapter describes the precise operations
552of the default fatal error handler.
[d755cbd]553
554Default Fatal Error Handler Operations
555--------------------------------------
556
The default fatal error handler is invoked by the fatal_error_occurred
directive when there is no user handler configured or the user handler returns
control to RTEMS.  The default fatal error handler disables processor
interrupts by raising the interrupt level to 15, places the error code in g1,
and goes into an infinite loop to simulate a halt processor instruction.
562
563Symmetric Multiprocessing
564=========================
565
566SMP is not supported.
567
568Thread-Local Storage
569====================
570
571Thread-local storage is supported.
572
573Board Support Packages
574======================
575
[f233256]576An RTEMS Board Support Package (BSP) must be designed to support a particular
577processor and target board combination.  This chapter presents a discussion of
578SPARC specific BSP issues.  For more information on developing a BSP, refer to
579the chapter titled Board Support Packages in the RTEMS Applications User's
580Guide.
[d755cbd]581
582HelenOS and Open Firmware
583-------------------------
584
[f233256]585The provided BSPs make use of some bootstrap and low-level hardware code of the
586HelenOS operating system. These files can be found in the shared/helenos
587directory of the sparc64 bsp directory.  Consult the sources for more detailed
588information.
[d755cbd]589
590The shared BSP code also uses the Open Firmware interface to re-use firmware
591code, primarily for console support and default trap handlers.