Changeset ad2ca17 in rtems-docs


Timestamp:
Aug 16, 2018, 11:11:53 PM (11 months ago)
Author:
Joel Sherrill <joel@…>
Branches:
master
Children:
587d14f
Parents:
67195aa
git-author:
Joel Sherrill <joel@…> (08/16/18 23:11:53)
git-committer:
Joel Sherrill <joel@…> (08/20/18 16:07:47)
Message:

cpu-supplement/sparc.rst: Fix me

File:
1 edited

  • cpu-supplement/sparc.rst

    r67195aa rad2ca17  
    770770- Must initialize the SPARC's initial trap table with at least trap handlers
    771771  for register window overflow and register window underflow.
     772
     773....................................
     774....
     775
     776Understanding stacks and registers in the SPARC architecture(s)
     777===============================================================
     778
     779The content in this section originally appeared at
     780https://www.sics.se/~psm/sparcstack.html. It appears here with the
     781gracious permission of the author Peter Magnusson.
     782
     783
     784The SPARC architecture from Sun Microsystems has some "interesting"
     785characteristics. After having to deal with both compiler, interpreter, OS
     786emulator, and OS porting issues for the SPARC, I decided to gather notes
     787and documentation in one place. If there are any issues you don't find
     788addressed by this page, or if you know of any similar Net resources, let
     789me know. This document is limited to the V8 version of the architecture.
     790
     791General Structure
     792-----------------
     793
     794SPARC has 32 general purpose integer registers visible to the program
     795at any given time. Of these, 8 registers are global registers and 24
     796registers are in a register window. A window consists of three groups
     797of 8 registers, the out, local, and in registers. See table 1. A SPARC
     798implementation can have from 2 to 32 windows, thus varying the number
     799of registers from 40 to 520. Most implementations have 7 or 8 windows. The
     800variable number of registers is the principal reason for the SPARC being
     801"scalable".
     802
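The register-count arithmetic above can be checked with a small C sketch. The function name `sparc_total_registers` is hypothetical and not part of any SPARC or RTEMS API. It assumes 8 globals counted once plus 16 unshared registers per window (8 locals and 8 in/out registers shared with a neighbor), which gives the 40 to 520 range quoted in the text.

```c
/* Hypothetical helper (illustration only): total integer registers in a
 * SPARC V8 implementation with `nwindows` register windows.
 * The 8 globals are counted once; each window contributes 16 registers
 * that are not already counted by its neighbor (8 locals + 8 in/out).
 * Returns -1 if the window count is outside the architectural range. */
int sparc_total_registers(int nwindows)
{
    if (nwindows < 2 || nwindows > 32)
        return -1;
    return 8 + 16 * nwindows;
}
```

For example, an implementation with 8 windows would have 8 + 16 * 8 = 136 registers under this formula.
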
     803At any given time, only one window is visible, as determined by the
     804current window pointer (CWP) which is part of the processor status
     805register (PSR). This is a five bit value that can be decremented or
     806incremented by the SAVE and RESTORE instructions, respectively. These
     807instructions are generally executed on procedure call and return
     808(respectively). The idea is that the in registers contain incoming
     809parameters, the local registers constitute scratch registers, the out
     810registers contain outgoing parameters, and the global registers contain
     811values that vary little between executions. The register windows overlap
     812partially, thus the out registers become renamed by SAVE to become the in
     813registers of the called procedure. Thus, the memory traffic is reduced
     814when going up and down the procedure call. Since this is a frequent
     815operation, performance is improved.
     816
     817(That was the idea, anyway. The drawback is that upon interactions
     818with the system the registers need to be flushed to the stack,
     819necessitating a long sequence of writes to memory of data that is
     820often mostly garbage. Register windows were a bad idea that was caused
     821by simulation studies that considered only programs in isolation, as
     822opposed to multitasking workloads, and by considering compilers with
     823poor optimization. They also caused considerable problems in implementing
     824high-end SPARC processors such as the SuperSPARC, although more recent
     825implementations have dealt effectively with the obstacles. Register
     826windows are now part of the compatibility legacy and not easily removed
     827from the architecture.)
     828
     829================ ======== ================
     830Register  Group  Mnemonic Register Address
     831================ ======== ================
     832global           %g0-%g7  r[0]-r[7]
     833out              %o0-%o7  r[8]-r[15]
     834local            %l0-%l7  r[16]-r[23]
     835in               %i0-%i7  r[24]-r[31]
     836================ ======== ================
     837
     838.. Table 1 - Visible Registers
     839
     840The overlap of the registers is illustrated in figure 1. The figure
     841shows an implementation with 8 windows, numbered 0 to 7 (labeled w0 to
     842w7 in the figure). Each window corresponds to 24 registers, 16 of which
     843are shared with "neighboring" windows. The windows are arranged in a
     844wrap-around manner, thus window number 0 borders window number 7. The
     845common cause of changing the current window, as pointed to by CWP, is
     846the RESTORE and SAVE instructions, shown in the middle. Less common is
     847the supervisor RETT instruction (return from trap) and the trap event
     848(interrupt, exception, or TRAP instruction).
     849
     850
     851.. image:: sparcwin.gif
     852
     853Figure 1 - Windowed Registers
     854
     855The "WIM" register is also indicated in the top left of figure 1. The
     856window invalid mask is a bit map of invalid windows. It is generally used
     857as a pointer, i.e. exactly one bit is set in the WIM register indicating
     858which window is invalid (in the figure it's window 7). Register windows
     859are generally used to support procedure calls, so they can be viewed
     860as a cache of the stack contents. The WIM "pointer" indicates how
     861many procedure calls in a row can be taken without writing out data to
     862memory. In the figure, the capacity of the register windows is fully
     863utilized. An additional call will thus exceed capacity, triggering a
     864window overflow trap. At the other end, a window underflow trap occurs
     865when the register window "cache" is empty and more data needs to be
     866fetched from memory.
     867
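The interaction between CWP and WIM described above can be modeled in a few lines of C. This is only a sketch: the names `cwp_after_save`, `cwp_after_restore`, and `window_is_invalid` are hypothetical, and `NWINDOWS` is assumed to be 8 to match figure 1.

```c
#include <stdint.h>

#define NWINDOWS 8u  /* assumption: 8 windows, as in figure 1 */

/* SAVE (and a trap) decrements CWP, wrapping from window 0 to window 7. */
unsigned cwp_after_save(unsigned cwp)
{
    return (cwp + NWINDOWS - 1u) % NWINDOWS;
}

/* RESTORE (and RETT) increments CWP, wrapping from window 7 to window 0. */
unsigned cwp_after_restore(unsigned cwp)
{
    return (cwp + 1u) % NWINDOWS;
}

/* Returns 1 if the WIM bit for window `new_cwp` is set, i.e. if moving
 * to that window would raise a window overflow or underflow trap. */
int window_is_invalid(uint32_t wim, unsigned new_cwp)
{
    return (int)((wim >> new_cwp) & 1u);
}
```

With WIM set to 0x80 (window 7 invalid, as in the figure), a SAVE executed in window 0 would move to window 7 and therefore trap in this model.
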
     868Register Semantics
     869------------------
     870
     871The SPARC Architecture includes recommended software semantics. These are
     872described in the architecture manual, the SPARC ABI (application binary
     873interface) standard, and, unfortunately, in various other locations as
     874well (including header files and compiler documentation).
     875
     876Figure 2 shows a summary of register contents at any given time.
     877
     878.. code-block:: asm
     879
     880                 %g0  (r00)       always zero
     881                 %g1  (r01)  [1]  temporary value
     882                 %g2  (r02)  [2]  global 2
     883     global      %g3  (r03)  [2]  global 3
     884                 %g4  (r04)  [2]  global 4
     885                 %g5  (r05)       reserved for SPARC ABI
     886                 %g6  (r06)       reserved for SPARC ABI
     887                 %g7  (r07)       reserved for SPARC ABI
     888
     889                 %o0  (r08)  [3]  outgoing parameter 0 / return value from callee   
     890                 %o1  (r09)  [1]  outgoing parameter 1
     891                 %o2  (r10)  [1]  outgoing parameter 2
     892     out         %o3  (r11)  [1]  outgoing parameter 3
     893                 %o4  (r12)  [1]  outgoing parameter 4
     894                 %o5  (r13)  [1]  outgoing parameter 5
     895            %sp, %o6  (r14)  [1]  stack pointer
     896                 %o7  (r15)  [1]  temporary value / address of CALL instruction
     897
     898                 %l0  (r16)  [3]  local 0
     899                 %l1  (r17)  [3]  local 1
     900                 %l2  (r18)  [3]  local 2
     901     local       %l3  (r19)  [3]  local 3
     902                 %l4  (r20)  [3]  local 4
     903                 %l5  (r21)  [3]  local 5
     904                 %l6  (r22)  [3]  local 6
     905                 %l7  (r23)  [3]  local 7
     906
     907                 %i0  (r24)  [3]  incoming parameter 0 / return value to caller
     908                 %i1  (r25)  [3]  incoming parameter 1
     909                 %i2  (r26)  [3]  incoming parameter 2
     910     in          %i3  (r27)  [3]  incoming parameter 3
     911                 %i4  (r28)  [3]  incoming parameter 4
     912                 %i5  (r29)  [3]  incoming parameter 5
     913            %fp, %i6  (r30)  [3]  frame pointer
     914                 %i7  (r31)  [3]  return address - 8
     915
     916Notes:
     917
     918[1] assumed by caller to be destroyed (volatile) across a procedure call
     919
     920[2] should not be used by SPARC ABI library code
     921
     922[3] assumed by caller to be preserved across a procedure call
     923
     924.. Above was Figure 2 - SPARC register semantics
     925
     926Particular compilers are likely to vary slightly.
     927
     928Note that globals %g2-%g4 are reserved for the "application", which
     929includes libraries and compiler. Thus, for example, libraries may
     930overwrite these registers unless they've been compiled with suitable
     931flags. Also, the "reserved" registers are presumed to be allocated
     932(in the future) bottom-up, i.e. %g7 is currently the "safest" to use.
     933
     934Optimizing linkers and interpreters are examples that use global registers.
     935
     936Register Windows and the Stack
     937------------------------------
     938
     939The SPARC register windows are, naturally, intimately related to the
     940stack. In particular, the stack pointer (%sp or %o6) must always point
     941to a free block of 64 bytes. This area is used by the operating system
     942(Solaris, SunOS, and Linux at least) to save the current local and in
     943registers upon a system interrupt, exception, or trap instruction. (Note
     944that this can occur at any time.)
     945
     946Other aspects of register relations with memory are programming
     947convention. The typical, and recommended, layout of the stack is shown
     948in figure 3. The figure shows a stack frame.
     949
     950.. code-block:: asm

     951                    low addresses
     952               +-------------------------+         
     953     %sp  -->  | 16 words for storing    |
     954               | LOCAL and IN registers  |
     955               +-------------------------+
     956               |  one-word pointer to    |
     957               | aggregate return value  |
     958               +-------------------------+
     959               |   6 words for callee    |
     960               |   to store register     |
     961               |       arguments         |
     962               +-------------------------+
     963               |  outgoing parameters    |
     964               |  past the 6th, if any   |
     965               +-------------------------+
     966               |  space, if needed, for  |
     967               |  compiler temporaries   |
     968               |   and saved floating-   |
     969               |    point registers      |
     970               +-------------------------+
     971                    .................
     972               +-------------------------+
     973               |    space dynamically    |
     974               |    allocated via the    |
     975               |  alloca() library call  |
     976               +-------------------------+
     977               |  space, if needed, for  |
     978               |    automatic arrays,    |
     979               |    aggregates, and      |
     980               |   addressable scalar    |
     981               |       automatics        |
     982               +-------------------------+
     983    %fp  -->
     984                     high addresses
     985
     986.. Figure 3 - Stack frame contents
     987
     988Note that the top boxes of figure 3 are addressed via the stack pointer
     989(%sp), as positive offsets (including zero), and the bottom boxes are
     990accessed over the frame pointer using negative offsets (excluding zero),
     991and that the frame pointer is the old stack pointer. This scheme allows
     992the separation of information known at compile time (number and size
     993of local parameters, etc) from run-time information (size of blocks
     994allocated by alloca()).
     995
     996"addressable scalar automatics" is a fancy name for local variables.
     997
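The fixed part of the frame in figure 3 can be written down as byte offsets from %sp. The following C sketch is illustrative only: the constant and function names are hypothetical, 4-byte words are assumed (SPARC V8), and the rounding to an 8-byte boundary assumes the usual doubleword stack alignment.

```c
/* Byte offsets from %sp for the fixed areas of figure 3.
 * All names are hypothetical; a word is assumed to be 4 bytes. */
enum {
    WORD_BYTES      = 4,
    WINDOW_SAVE_OFF = 0,                               /* 16 words: locals and ins  */
    STRUCT_RET_OFF  = 16 * WORD_BYTES,                 /* aggregate return pointer  */
    ARG_DUMP_OFF    = STRUCT_RET_OFF + WORD_BYTES,     /* 6 words for register args */
    EXTRA_ARGS_OFF  = ARG_DUMP_OFF + 6 * WORD_BYTES    /* outgoing args past 6th    */
};

/* Frame size when all fixed areas are reserved plus `extra_bytes` of
 * additional space, rounded up to a multiple of 8 bytes (assumed
 * stack alignment). */
unsigned frame_size(unsigned extra_bytes)
{
    unsigned raw = (unsigned)EXTRA_ARGS_OFF + extra_bytes;
    return (raw + 7u) & ~7u;
}
```

Under these assumptions the fixed areas end at offset 92, so a frame with no extra space rounds up to 96 bytes.
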
     998The clever nature of the stack and frame pointers is that they are always
     99916 registers apart in the register windows. Thus, a SAVE instruction will
     1000make the current stack pointer into the frame pointer and, since the SAVE
     1001instruction also doubles as an ADD, create a new stack pointer. Figure 4
     1002illustrates what the top of a stack might look like during execution. (The
     1003listing is from the "pwin" command in the SimICS simulator.)
     1004
     1005.. code-block:: asm
     1006
     1007                  REGISTER WINDOWS
     1008                 +--+---+----------+
     1009                 |g0|r00|0x00000000| global
     1010                 |g1|r01|0x00000006| registers
     1011                 |g2|r02|0x00091278|
     1012      g0-g7      |g3|r03|0x0008ebd0|
     1013                 |g4|r04|0x00000000|        (note: 'save' and 'trap' decrement CWP,
     1014                 |g5|r05|0x00000000|        i.e. move it up on this diagram. 'restore'
     1015                 |g6|r06|0x00000000|        and 'rett' increment CWP, i.e. down)
     1016                 |g7|r07|0x00000000|
     1017                 +--+---+----------+
     1018 CWP (2)         |o0|r08|0x00000002|
     1019                 |o1|r09|0x00000000|                            MEMORY
     1020                 |o2|r10|0x00000001|
     1021      o0-o7      |o3|r11|0x00000001|             stack growth
     1022                 |o4|r12|0x000943d0|
     1023                 |o5|r13|0x0008b400|                  ^
     1024                 |sp|r14|0xdffff9a0| ----\           /|\
     1025                 |o7|r15|0x00062abc|     |            |                     addresses
     1026                 +--+---+----------+     |     +--+----------+         virtual     physical
     1027                 |l0|r16|0x00087c00|     \---> |l0|0x00000000|        0xdffff9a0  0x000039a0  top of frame 0   
     1028                 |l1|r17|0x00027fd4|           |l1|0x00000000|        0xdffff9a4  0x000039a4
     1029                 |l2|r18|0x00000000|           |l2|0x0009df80|        0xdffff9a8  0x000039a8
     1030      l0-l7      |l3|r19|0x00000000|           |l3|0x00097660|        0xdffff9ac  0x000039ac
     1031                 |l4|r20|0x00000000|           |l4|0x00000014|        0xdffff9b0  0x000039b0
     1032                 |l5|r21|0x00097678|           |l5|0x00000001|        0xdffff9b4  0x000039b4
     1033                 |l6|r22|0x0008b400|           |l6|0x00000004|        0xdffff9b8  0x000039b8
     1034                 |l7|r23|0x0008b800|           |l7|0x0008dd60|        0xdffff9bc  0x000039bc
     1035              +--+--+---+----------+           +--+----------+
     1036 CWP+1 (3)    |o0|i0|r24|0x00000002|           |i0|0x00091048|        0xdffff9c0  0x000039c0
     1037              |o1|i1|r25|0x00000000|           |i1|0x00000011|        0xdffff9c4  0x000039c4
     1038              |o2|i2|r26|0x0008b7c0|           |i2|0x00091158|        0xdffff9c8  0x000039c8
     1039      i0-i7   |o3|i3|r27|0x00000019|           |i3|0x0008d370|        0xdffff9cc  0x000039cc
     1040              |o4|i4|r28|0x0000006c|           |i4|0x0008eac4|        0xdffff9d0  0x000039d0
     1041              |o5|i5|r29|0x00000000|           |i5|0x00000000|        0xdffff9d4  0x000039d4
     1042              |o6|fp|r30|0xdffffa00| ----\     |fp|0x00097660|        0xdffff9d8  0x000039d8
     1043              |o7|i7|r31|0x00040468|     |     |i7|0x00000000|        0xdffff9dc  0x000039dc
     1044              +--+--+---+----------+     |     +--+----------+
     1045                                         |        |0x00000001|        0xdffff9e0  0x000039e0  parameters
     1046                                         |        |0x00000002|        0xdffff9e4  0x000039e4
     1047                                         |        |0x00000040|        0xdffff9e8  0x000039e8
     1048                                         |        |0x00097671|        0xdffff9ec  0x000039ec
     1049                                         |        |0xdffffa68|        0xdffff9f0  0x000039f0
     1050                                         |        |0x00024078|        0xdffff9f4  0x000039f4
     1051                                         |        |0x00000004|        0xdffff9f8  0x000039f8
     1052                                         |        |0x0008dd60|        0xdffff9fc  0x000039fc
     1053              +--+------+----------+     |     +--+----------+
     1054              |l0|      |0x00087c00|     \---> |l0|0x00091048|        0xdffffa00  0x00003a00  top of frame 1
     1055              |l1|      |0x000c8d48|           |l1|0x0000000b|        0xdffffa04  0x00003a04
     1056              |l2|      |0x000007ff|           |l2|0x00091158|        0xdffffa08  0x00003a08
     1057              |l3|      |0x00000400|           |l3|0x000c6f10|        0xdffffa0c  0x00003a0c
     1058              |l4|      |0x00000000|           |l4|0x0008eac4|        0xdffffa10  0x00003a10
     1059              |l5|      |0x00088000|           |l5|0x00000000|        0xdffffa14  0x00003a14
     1060              |l6|      |0x0008d5e0|           |l6|0x000c6f10|        0xdffffa18  0x00003a18
     1061              |l7|      |0x00088000|           |l7|0x0008cd00|        0xdffffa1c  0x00003a1c
     1062              +--+--+---+----------+           +--+----------+
     1063 CWP+2 (4)    |i0|o0|   |0x00000002|           |i0|0x0008cb00|        0xdffffa20  0x00003a20
     1064              |i1|o1|   |0x00000011|           |i1|0x00000003|        0xdffffa24  0x00003a24
     1065              |i2|o2|   |0xffffffff|           |i2|0x00000040|        0xdffffa28  0x00003a28
     1066              |i3|o3|   |0x00000000|           |i3|0x0009766b|        0xdffffa2c  0x00003a2c
     1067              |i4|o4|   |0x00000000|           |i4|0xdffffa68|        0xdffffa30  0x00003a30
     1068              |i5|o5|   |0x00064c00|           |i5|0x000253d8|        0xdffffa34  0x00003a34
     1069              |i6|o6|   |0xdffffa70| ----\     |i6|0xffffffff|        0xdffffa38  0x00003a38
     1070              |i7|o7|   |0x000340e8|     |     |i7|0x00000000|        0xdffffa3c  0x00003a3c
     1071              +--+--+---+----------+     |     +--+----------+
     1072                                         |        |0x00000001|        0xdffffa40  0x00003a40  parameters
     1073                                         |        |0x00000000|        0xdffffa44  0x00003a44
     1074                                         |        |0x00000000|        0xdffffa48  0x00003a48
     1075                                         |        |0x00000000|        0xdffffa4c  0x00003a4c
     1076                                         |        |0x00000000|        0xdffffa50  0x00003a50
     1077                                         |        |0x00000000|        0xdffffa54  0x00003a54
     1078                                         |        |0x00000002|        0xdffffa58  0x00003a58
     1079                                         |        |0x00000002|        0xdffffa5c  0x00003a5c
     1080                                         |        |    .     |
     1081                                         |        |    .     |        .. etc (another 16 bytes)
     1082                                         |        |    .     |
     1083
     1084.. Figure 4 - Sample stack contents
     1085
     1086Note how the stack contents are not necessarily synchronized with the
     1087registers. Various events can cause the register windows to be "flushed"
     1088to memory, including most system calls. A programmer can force this
     1089update by using the ST_FLUSH_WINDOWS trap, which also reduces the number of
     1090valid windows to the minimum of 1.
     1091
     1092Writing a library for multithreaded execution is an example that requires
     1093explicit flushing, as is longjmp().
     1094
     1095Procedure epilogue and prologue
     1096-------------------------------
     1097
     1098The stack frame described in the previous section leads to the standard
     1099entry/exit mechanisms listed in figure 5.
     1100
     1101.. code-block:: asm
     1102
     1103  function:
     1104    save  %sp, -C, %sp
     1105
     1106               ; perform function, leave return value,   
     1107               ; if any, in register %i0 upon exit
     1108
     1109    ret        ; jmpl %i7+8, %g0
     1110    restore    ; restore %g0,%g0,%g0
     1111
     1112.. Figure 5 - Epilogue/prologue in procedures

     1113The SAVE instruction decrements the CWP, as discussed earlier, and also
     1114performs an addition. The constant "C" used in the figure indicates
     1115the amount of space to make on the stack, and thus corresponds
     1116to the frame contents in Figure 3. The minimum is therefore the 16 words
     1117for the LOCAL and IN registers, i.e. (hex) 0x40 bytes.
     1118
     1119A confusing element of the SAVE instruction is that the source operands
     1120(the first two parameters) are read from the old register window, and
     1121the destination operand (the rightmost parameter) is written to the new
     1122window. Thus, although "%sp" is indicated as both source and destination,
     1123the result is actually written into the stack pointer of the new window
     1124(the source stack pointer becomes renamed and is now the frame pointer).
     1125
     1126The return instructions are also a bit particular. ret is a synthetic
     1127instruction, corresponding to jmpl (jump linked). This instruction
     1128jumps to the address resulting from adding 8 to the %i7 register. The
     1129source instruction address (the address of the ret instruction itself)
     1130is written to the %g0 register, i.e. it is discarded.
     1131
     1132The restore instruction is similarly a synthetic instruction, and is
     1133just a short form for a restore that chooses not to perform an addition.
     1134
     1135The calling instruction, in turn, typically looks as follows:
     1136
     1137.. code-block:: asm
     1138
     1139    call <function>    ; jmpl <address>, %o7
     1140    mov 0, %o0
     1141
     1142Again, the call instruction is synthetic, and is actually the same
     1143instruction that performs the return. This time, however, it is interested
     1144in saving the return address, into register %o7. Note that the delay
     1145slot is often filled with an instruction related to the parameters;
     1146in this example it sets the first parameter to zero.
     1147Note also that the return value is generally passed in %o0.
     1148
     1149Leaf procedures are different. A leaf procedure is an optimization that
     1150reduces unnecessary work by taking advantage of the knowledge that no
     1151call instructions exist in many procedures. Thus, the save/restore couple
     1152can be eliminated. The downside is that such a procedure may only use
     1153the out registers (since the in and local registers actually belong to
     1154the caller). See Figure 6.
     1155
     1156.. code-block:: asm
     1157
     1158  function:
     1159               ; no save instruction needed upon entry
     1160
     1161               ; perform function, leave return value,   
     1162               ; if any, in register %o0 upon exit
     1163
     1164    retl       ; jmpl %o7+8, %g0
     1165    nop        ; the delay slot can be used for something else   
     1166
     1167.. Figure 6 - Epilogue/prologue in leaf procedures
     1168
     1169Note in the figure that there is only one instruction overhead, namely the
     1170retl instruction. retl is also synthetic (return from leaf subroutine), is
     1171again a variant of the jmpl instruction, this time with %o7+8 as target.
     1172
     1173Yet another variation of epilogue is caused by tail call elimination,
     1174an optimization supported by some compilers (including Sun's C compiler
     1175but not GCC). If the compiler detects that a called function will return
     1176to the calling function, it can replace its place on the stack with the
     1177called function. Figure 7 contains an example.
     1178
     1179.. code-block:: asm
     1180
     1181       int
     1182        foo(int n)
     1183      {
     1184        if (n == 0)
     1185          return 0;
     1186        else
     1187          return bar(n);
     1188      }
     1189         cmp     %o0,0
     1190        bne     .L1
     1191        or      %g0,%o7,%g1
     1192        retl
     1193        or      %g0,0,%o0
     1194  .L1:  call    bar
     1195        or      %g0,%g1,%o7
     1196
     1197.. Figure 7 - Example of tail call elimination
     1198
     1199Note that the call instruction overwrites register %o7 with the program
     1200counter. Therefore the above code saves the old value of %o7, and restores
     1201it in the delay slot of the call instruction. If the function call is
     1202register indirect, this twiddling with %o7 can be avoided, but of course
     1203that form of call is slower on modern processors.
     1204
     1205The benefit of tail call elimination is to remove an indirection upon
     1206return. It is also needed to reduce register window usage, since otherwise
     1207the foo() function in Figure 7 would need to allocate a stack frame to
     1208save the program counter.
     1209
     1210A special form of tail call elimination is tail recursion elimination,
     1211which detects functions calling themselves, and replaces the call with a simple
     1212branch. Figure 8 contains an example.
     1213
     1214.. code-block:: asm
     1215
     1216         int
     1217          foo(int n)
     1218        {
     1219          if (n == 0)
     1220            return 1;
     1221          else
     1222            return (foo(n - 1));
     1223        }
     1224         cmp     %o0,0
     1225        be      .L1
     1226        or      %g0,%o0,%g1
     1227        subcc   %g1,1,%g1
     1228  .L2:  bne     .L2
     1229        subcc   %g1,1,%g1
     1230  .L1:  retl
     1231        or      %g0,1,%o0
     1232
     1233.. comment Figure 8 - Example of tail recursion elimination
     1234
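At the C level, the effect of tail recursion elimination on the function in Figure 8 can be pictured as rewriting the recursion into a loop. The sketch below is a hand-written illustration of that idea, not compiler output; the names `foo_recursive` and `foo_loop` are hypothetical, and both versions assume n >= 0.

```c
/* The recursive form, as in Figure 8 (assumes n >= 0). */
int foo_recursive(int n)
{
    if (n == 0)
        return 1;
    else
        return foo_recursive(n - 1);
}

/* What the optimized code effectively does: the self-call becomes a
 * branch back to the top, so no new stack frame or register window
 * is needed per iteration (assumes n >= 0). */
int foo_loop(int n)
{
    while (n != 0)
        n = n - 1;
    return 1;
}
```

Both functions return the same value for non-negative arguments; only the loop form runs in constant stack space.
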
     1235Needless to say, these optimizations produce code that is difficult to debug.
     1236
     1237Procedures, stacks, and debuggers
     1238----------------------------------
     1239
     1240When debugging an application, your debugger will be parsing the binary
     1241and consulting the symbol table to determine procedure entry points. It
     1242will also travel the stack frames "upward" to determine the current
     1243call chain.
     1244
     1245When compiling for debugging, compilers will generate additional code
     1246as well as avoid some optimizations in order to allow reconstructing
     1247situations during execution. For example, GCC/GDB makes sure original
     1248parameter values are kept intact somewhere for future parsing of
     1249the procedure call stack. The live in registers other than %i0 are
     1250not touched. %i0 itself is copied into a free local register, and its
     1251location is noted in the symbol file. (You can find out where variables
     1252reside by using the "info address" command in GDB.)
     1253
     1254Given that much of the semantics relating to stack handling and procedure
     1255call entry/exit code is only recommended, debuggers will sometimes
     1256be fooled. For example, the decision as to whether the current
     1257procedure is a leaf one can be incorrect. In this case a spurious
     1258procedure will be inserted between the current procedure and its "real"
     1259parent. Another example is when the application maintains its own implicit
     1260call hierarchy, such as jumping to function pointers. In this case the
     1261debugger can easily become totally confused.
     1262
     1263The window overflow and underflow traps
     1264---------------------------------------
     1265
     1266When the SAVE instruction decrements the current window pointer (CWP)
     1267so that it coincides with the invalid window in the window invalid mask
     1268(WIM), a window overflow trap occurs. Conversely, when the RESTORE or
     1269RETT instructions increment the CWP to coincide with the invalid window,
     1270a window underflow trap occurs.
     1271
     1272Either trap is handled by the operating system. Generally, data is
     1273written out to memory and/or read from memory, and the WIM register
     1274suitably altered.
     1275
     1276The code in Figure 9 and Figure 10 below are bare-bones handlers for
     1277the two traps. The text is directly from the source code, and sort of
     1278works. (As far as I know, these are minimalistic handlers for SPARC
     1279V8). Note that there is no way to directly access window registers
     1280other than the current one, hence the code does additional save/restore
     1281instructions. It's pretty tricky to understand the code, but figure 1
     1282should be of help.
     1283
     1284.. code-block:: asm
     1285
     1286        /* a SAVE instruction caused a trap */
     1287  window_overflow:
     1288        /* rotate WIM one bit right, we have 8 windows */
     1289        mov %wim,%l3
     1290        sll %l3,7,%l4
     1291        srl %l3,1,%l3
     1292        or  %l3,%l4,%l3
     1293        and %l3,0xff,%l3
     1294
     1295        /* disable WIM traps */
     1296        mov %g0,%wim
     1297        nop; nop; nop
     1298
     1299        /* point to correct window */
     1300        save
     1301
     1302        /* dump registers to stack */
     1303        std %l0, [%sp +  0]
     1304        std %l2, [%sp +  8]
     1305        std %l4, [%sp + 16]
     1306        std %l6, [%sp + 24]
     1307        std %i0, [%sp + 32]
     1308        std %i2, [%sp + 40]
     1309        std %i4, [%sp + 48]
     1310        std %i6, [%sp + 56]
     1311
     1312        /* back to where we should be */
     1313        restore
     1314
     1315        /* set new value of window */
     1316        mov %l3,%wim
     1317        nop; nop; nop
     1318
     1319        /* go home */
     1320        jmp %l1
     1321        rett %l2

     1322.. comment Figure 9 - window_overflow trap handler

         .. code-block:: asm

     1323        /* a RESTORE instruction caused a trap */
     1324  window_underflow:
     1325
     1326        /* rotate WIM one bit LEFT, we have 8 windows */
     1327        mov %wim,%l3
     1328        srl %l3,7,%l4
     1329        sll %l3,1,%l3
     1330        or  %l3,%l4,%l3
     1331        and %l3,0xff,%l3
     1332
     1333        /* disable WIM traps */
     1334        mov %g0,%wim
     1335        nop; nop; nop
     1336
     1337        /* point to correct window */
     1338        restore
     1339        restore
     1340
     1341        /* load registers from stack */
     1342        ldd [%sp +  0], %l0
     1343        ldd [%sp +  8], %l2
     1344        ldd [%sp + 16], %l4
     1345        ldd [%sp + 24], %l6
     1346        ldd [%sp + 32], %i0
     1347        ldd [%sp + 40], %i2
     1348        ldd [%sp + 48], %i4
     1349        ldd [%sp + 56], %i6
     1350
     1351        /* back to where we should be */
     1352        save
     1353        save
     1354
     1355        /* set new value of window */
     1356        mov %l3,%wim
     1357        nop; nop; nop
     1358
     1359        /* go home */
     1360        jmp %l1
     1361        rett %l2
     1362
     1363.. comment Figure 10 - window_underflow trap handler
     1364
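The WIM rotation performed at the start of each handler (the sll/srl/or/and sequence) can be expressed in C as follows. This is a sketch for illustration: the function names are hypothetical, and `NWINDOWS` is assumed to be 8, matching the handlers above.

```c
#include <stdint.h>

#define NWINDOWS 8u                       /* assumption: 8 windows */
#define WIM_MASK ((1u << NWINDOWS) - 1u)  /* 0xff for 8 windows    */

/* Window overflow: rotate the invalid-window bit one position right. */
uint32_t wim_rotate_right(uint32_t wim)
{
    return ((wim >> 1) | (wim << (NWINDOWS - 1u))) & WIM_MASK;
}

/* Window underflow: rotate the invalid-window bit one position left. */
uint32_t wim_rotate_left(uint32_t wim)
{
    return ((wim << 1) | (wim >> (NWINDOWS - 1u))) & WIM_MASK;
}
```

For instance, starting from a WIM of 0x80 (window 7 invalid), an overflow would yield 0x40 and an underflow would yield 0x01 in this model.
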