source: rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootloader/head.S @ 2a6a029f

Last change on this file since 2a6a029f was ba46ffa6, checked in by Joel Sherrill <joel.sherrill@…>, on 06/14/99 at 16:51:13

This is a large patch from Eric Valette <valette@…> that was
described in the message following this paragraph. This patch also includes
a mcp750 BSP.

From valette@… Mon Jun 14 10:03:08 1999
Date: Tue, 18 May 1999 01:30:14 +0200 (CEST)
From: VALETTE Eric <valette@…>
To: joel@…
Cc: raguet@…, rtems-snapshots@…, valette@…
Subject: Questions/Suggestion? regarding RTEMS PowerPC code (long)

Dear knowledgeable RTEMS powerpc users,

As some of you may know, I'm currently finalizing a port
of RTEMS on a MCP750 Motorola board. I have done most
of it but have some questions to ask before submitting
the port.

In order to understand some of the changes I have made
or would like to make, maybe it is worth describing the
MCP750 Motorola board.

The MCP750 is a COMPACT PCI powerpc board with :

1) a MPC750 233 MHz processor,
2) a raven bus bridge/PCI controller that
implements an OPENPIC compliant interrupt controller,
3) a VIA 82C586 PCI/ISA bridge that offers a PC
compliant IO for keyboard, serial line, IDE, and
the well known PC 8259 cascaded PIC interrupt
architecture model,
4) a DEC 21140 Ethernet controller,
5) the PPCBUG Motorola firmware in flash,
6) a DEC PCI bridge.

This architecture is common to most Motorola 60x/7xx
boards except that :

1) on VME boards, the DEC PCI bridge is replaced by
a VME chipset,
2) the VIA 82C586 PCI/ISA bridge is replaced by
another bridge that is almost fully compatible
with the VIA bridge...

So the port should be a rather close basis for many
60x/7xx Motorola boards...

On this board, I already have ported Linux 2.2.3 and
use it both as a development and target board.

Now the questions/suggestions I have :

1) EXCEPTION CODE


As far as I know exceptions on PPC are handled like
interrupts. I dislike this very much as :

a) Except for the decrementer exception (and
maybe some others on mpc8xx), exceptions are
not recoverable and the handler just needs to print
the full context and go to the firmware or debugger...
b) The interrupt switch is only necessary for the
decrementer and external interrupt (at least on
6xx,7xx).
c) The full context for an exception is never saved and
thus cannot be used by a debugger... I do understand that
the most important thing for low level interrupt code
is to save the minimal context that allows calling C
code, for performance reasons. On a non recoverable
exception, on the other hand, the most important thing is
to save the maximum information about the processor status
in order to analyze the reason for the fault. At
least we will need this in order to implement the
port of RGDB on PPC.

==> I wrote an API for connecting raw exceptions (and thus
raw interrupts) for mpc750. It should be valid for most
powerpc processors... I hope to find a way to make this coexist
with the current code layout. The code is currently located
in lib/libcpu/powerpc/mpc750 and is thus optional
(provided I write my own version of exec/score/cpu/powerpc/cpu.c ...)

See remark about files/directory layout organization in 4)

2) Current Implementation of ISR low level code


I do not understand why the MSR EE flag is cleared
again in exec/score/cpu/powerpc/irq_stubs.S

#if (PPC_USE_SPRG)

mfmsr r5
mfspr r6, sprg2

#else

lwz r6,msr_initial(r11)
lis r5,~PPC_MSR_DISABLE_MASK@ha
ori r5,r5,~PPC_MSR_DISABLE_MASK@l
and r6,r6,r5
mfmsr r5

#endif

Reading the doc, when a decrementer interrupt or an
external interrupt is active, the MSR EE flag is already
cleared. BTW if an exception/interrupt could occur, it would
trash SRR0 and SRR1. In fact the code may be useful to set
MSR[RI] that re-enables exception processing. BTW I will need
to set other values in MSR to handle interrupts :

a) I want the MSR[IR] and MSR[DR] to be set for
performance reasons and also because I need DBAT
support to have access to PCI memory space as the
interrupt controller is in the PCI space.
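
As a rough illustration of the MSR value discussed above, the following C sketch computes an MSR image with external interrupts kept masked, address translation enabled and MSR[RI] set. The masks are the usual 32-bit PowerPC MSR bit values; the helper name `isr_entry_msr` is hypothetical and is not part of RTEMS.

```c
#include <stdint.h>

/* Usual 32-bit PowerPC MSR bit masks. */
#define MSR_EE 0x8000u /* external interrupt enable */
#define MSR_IR 0x0020u /* instruction address translation */
#define MSR_DR 0x0010u /* data address translation */
#define MSR_RI 0x0002u /* recoverable interrupt */

/* Hypothetical helper: MSR image to load before calling a C level ISR. */
static uint32_t isr_entry_msr(uint32_t msr)
{
    msr &= ~MSR_EE;         /* external interrupts stay masked */
    msr |= MSR_IR | MSR_DR; /* translation on, so BAT mapped PCI space is reachable */
    msr |= MSR_RI;          /* mark the context as recoverable again */
    return msr;
}
```

Which bits a given BSP really needs depends on the board, which is the point made in the mail; this only shows the bit arithmetic.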

Reading the code, I see others have the same kind of request :

/* SCE 980217
 *
 * We need address translation ON when we call our ISR routine

mtmsr r5

 */

This is just another proof that even the lowest level
IRQ code is fundamentally board dependent and
not simply processor dependent, especially when
the processor uses an external interrupt controller
because it has a single interrupt request line...

Note that if you look at the PPC high level interrupt
handling code, as the "set_vector" routine that really connects
the interrupt is in the BSP/startup/genpvec.c,
the fact that IRQ handling is BSP specific is DE-FACTO
acknowledged.

I know I have already expressed this and understand that this
would require some heavy changes in the code but believe
me you will reach a point where you will not be able
to find a compatible yet optimal implementation for low level
interrupt handling code... In my case this is already true...

So please consider removing low level IRQ handling from
exec/score/cpu/* and leaving only exception handling code there...
Exceptions are usually only processor dependent and do
not depend on external hardware mechanisms to be masked or
acknowledged or re-enabled (there are probably exceptions but ...)

I have already done this for pc386 bsp but need to do it again.
This time I will even propose an API.

3) R2/R13 manipulation for EABI implementation


I do not understand the handling of r2 and r13 in the
EABI case. The specification for r2 says pointer to sdata2,
sbss2 section => constant. However I do not see -ffixed-r2
passed to any compilation system in make/custom/*
(for info linux does this on PPC).

So either this is a default compiler option when choosing
powerpc-rtems and thus we do not need to do anything with
this register as all the code is compiled with this compiler
and linked together OR this register may be used by rtems code
and then we do not need any special initialization or
handling.

The specification for r13 says pointer to the small data
area. The reasoning for r13 is the same, except that as far
as I know the usage of the small data area requires
specific compiler support so that access to variables is
compiled via loading the LSB in a register and then
using r13 to get the full address... It is like a small
memory model and it was present in IBM C compilers.

=> I propose to remove any specific code for r2 and
r13 in the EABI case.

4) Code layout organization (yes again :-))


I think there are a number of design flaws in the way
the ppc code is organized and I will try to point them out.
I have been bitten by this again on this new port, and
was bitten last year while modifying code for pc386.

a) exec/score/cpu/* vs lib/libcpu/cpu/*.

I think that too many things are put in exec/score/cpu that
have nothing to do with RTEMS internals but are rather
related to CPU features.

This includes at least :

a) register access routines (e.g. GET_MSR_Value),
b) interrupt masking/unmasking routines,
c) cache_mngt_routine,
d) mmu_mngt_routine,
e) routines to connect the raw_exception, raw_interrupt
handler,

b) lib/libcpu/cpu/powerpc/*

With a processor family as exuberant as the powerpc family,
with its well known differences, some subtle (604 vs 750) and
unfortunately some major (8xx vs 60x), the directory structure
is fine (except maybe the names that are not homogeneous)

powerpc

ppc421 mpc821 ...

I only needed to add mpc750. But the fact that libcpu.a was not
produced was a pain and the fact that this organization may
duplicate code is also problematic.

So, unless the support of automake provides a better solution,
I would like to propose something like this :

powerpc

mpc421 mpc821 ... mpc750 shared wrapup

with the following rules :

a) "shared" would act as a source container for sources that may
be shared among processors. Needed files would be compiled inside
the processor specific directory using the vpath Makefile
mechanism. "shared" may also contain compilation code
for routines that are really shared and not worth inlining...
(I have not found many so far, as register access routines
ARE WORTH INLINING)... If something is compiled there,
it should create libcpushared.a

b) layout under processor specific directory is free provided
that

1)the result of the compilation process exports :

libcpu/powerpc/"PROC"/*.h in $(PROJECT_INCLUDE)/libcpu

2) each processor specific directory creates
a library called libcpuspecific.a

Note that this organization makes it possible to have a file that
is nearly the same as the one in shared but that must differ
because of processor differences...

c) "wrapup" should create libcpu.a using libcpushared.a
libcpuspecific.a and export it to $(PROJECT_INCLUDE)/libcpu

The only thing I have no ideal solution for is how to put shared
definitions in "shared" and only processor specific definitions
in "proc". To give a concrete example, most MSR bit definitions
are shared among PPC processors and only some differ. If we create
a single msr.h in shared it will have ifdefs. If in msr.h we
include libcpu/msr_c.h we will need to have it in each powerpc
specific directory (even empty). Opinions are welcome...

Note that a similar mechanism exists in libbsp/i386 that also
contains a shared directory that is used by several BSPs
like pc386 and i386ex and a similar wrapup mechanism...

NB: I have done this for mpc750 and other processors could just use
similar Makefiles...

c) The exec/score/cpu/powerpc directory layout.

I think the directory layout should be the same as the
libcpu/powerpc one. As it is not, there are a lot of ifdefs
inside the code... And of course low level interrupt handling
code should be removed...

Besides that I do not understand why

1) things are compiled in the wrap directory,
2) some includes are moved to rtems/score,

I think the "preinstall" mechanism makes it possible to put
everything in the current directory (or better in a per processor
directory).

5) Interrupt handling API


Again :-). But I think that using all the features the PIC
offers is a MUST for an RT system. I already explained in the
prologue of this (long and probably boring) mail that the MCP750
board offers an OPENPIC compliant architecture and that
the VIA 82C586 PCI/ISA bridge offers a PC compatible IO and
PIC mapping. Here is a logical view of the RAVEN/VIA 82C586
interrupt mapping :


---------        ------
| OPEN  | <----- |8259|
| PIC   |        |    | 2     ------
|(RAVEN)|        |    | <-----|8259|
|       |        |    |       |    | 11
|       |        |    |       |    | <----
|       |        |    |       |    |
|       |        |    |       |    |
---------        ------       ------
        |          VIA PCI/ISA bridge
        | x
        -------- PCI interrupts

OPENPIC offers interrupt priorities among PCI interrupts
and selective interrupt masking. The 8259 offers the same kind
of feature. With the current powerpc interrupt code :

1) there is no way to specify priorities among
interrupt handlers. This is REALLY a bad thing.
For me it is as important as having priorities
for threads...
2) for my implementation, each ISR should
contain the code that acknowledges the RAVEN
and 8259 cascade, modifies the interrupt mask on both
chips, and reenables interrupts at processor level,
..., then restores them on interrupt return,.... This code
is actually similar to code located in some
genpvec.c powerpc files,
3) I must update _ISR_Nesting_level because
irq.inl uses it...
4) the libchip code connects the ISR via set_vector
but the libchip handler code does not contain any code to
manipulate external interrupt controller hardware
in order to acknowledge the interrupt or re-enable
them (except for the target hardware of course).
So this code is broken unless set_vector adds an
additional prologue/epilogue before calling/returning
from the handler in order to acknowledge/mask the raven and the
8259 PICS... => Anyway already EACH BSP MUST REWRITE
PART OF INTERRUPT HANDLING CODE TO CORRECTLY IMPLEMENT
SET_VECTOR.

I would rather offer an API similar to the one provided
in libbsp/i386/shared/irq/irq.h so that :

1) Once the driver supplied method is called the
only thing the ISR has to do is to worry about the
external hardware that triggered the interrupt.
Everything on openpic/VIA/processor would have been
done by the low levels (same thing as set-vector)
2) The caller will need to supply the on/off/isOn
routines that are fundamental to correctly implement
debuggers/performance monitoring in a portable way
3) A globally configurable interrupt priorities
mechanism...
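
To make the proposal more concrete, here is a minimal C sketch of what such a connection API could look like. It is only an illustration under assumptions: the names `irq_connect_data`, `irq_connect`, `irq_dispatch`, `IRQ_COUNT` and the demo handler are hypothetical, and priorities are left out. The idea is that the driver supplies a handler plus on/off/isOn hooks, and the BSP low level code deals with the interrupt controllers before dispatching.

```c
#include <stddef.h>

#define IRQ_COUNT 16 /* hypothetical number of interrupt lines */

typedef void (*irq_handler)(void);

/* Hypothetical connect descriptor: a handler plus on/off/isOn hooks. */
typedef struct {
    int         name;            /* interrupt line number */
    irq_handler hdl;             /* driver ISR: only touches its own device */
    void      (*on)(int name);   /* enable the interrupt at the device */
    void      (*off)(int name);  /* disable the interrupt at the device */
    int       (*isOn)(int name); /* report whether it is enabled */
} irq_connect_data;

static irq_connect_data irq_table[IRQ_COUNT];

/* Install a handler; returns 1 on success, 0 if the line is invalid or busy. */
static int irq_connect(const irq_connect_data *d)
{
    if (d == NULL || d->name < 0 || d->name >= IRQ_COUNT || d->hdl == NULL)
        return 0;
    if (irq_table[d->name].hdl != NULL)
        return 0;
    irq_table[d->name] = *d;
    if (d->on != NULL)
        d->on(d->name);
    return 1;
}

/* Called by the low level code once the interrupt controllers have been
 * acknowledged and masked; returns 1 if a handler ran. */
static int irq_dispatch(int name)
{
    if (name < 0 || name >= IRQ_COUNT || irq_table[name].hdl == NULL)
        return 0;
    irq_table[name].hdl();
    return 1;
}

/* Demo driver ISR used to exercise the sketch. */
static int demo_hits;
static void demo_isr(void) { demo_hits++; }
```

A driver would fill an `irq_connect_data` with its line number and `demo_isr` style handler and call `irq_connect()`; it never touches the OPENPIC or the 8259 itself.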

I have nothing against providing a compatible
set_vector just to make libchip happy but
as I have already explained in other
mails (months ago), I really think that the ISR
connection should be handled by the BSP and that no
code containing irq connection should exist in the
rtems generic layers... Thus I really dislike
libchip on this aspect because in the long term
it will force us to adopt the least rich API
for interrupt handling that exists (set_vector).

Additional note : I think the _ISR_Is_in_progress()
inline routine should :

1) be put in a processor specific section,
2) not rely on a global variable,

As :

a) on symmetric MP, there is one interrupt level
per CPU,
b) On processors that have an ISP (e.g. 68040),
this variable is useless (MSR bit testing could
be used)
c) On PPC, instead of using the address of the
variable via CPU_IRQ_info.Nest_level a dedicated
SPR could be used.
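
Point a) can be illustrated with a small C sketch that keeps one nesting counter per processor instead of a single global. The names `CPU_COUNT`, `isr_enter`, `isr_exit` and `isr_is_in_progress` are hypothetical; on real hardware the counter could live in a dedicated SPR as suggested in c), which plain C cannot show.

```c
#define CPU_COUNT 2 /* hypothetical number of processors */

/* One interrupt nesting level per CPU. */
static unsigned isr_nest_level[CPU_COUNT];

static void isr_enter(unsigned cpu) { isr_nest_level[cpu]++; }
static void isr_exit(unsigned cpu)  { isr_nest_level[cpu]--; }

/* Non zero while the given CPU is executing at interrupt level. */
static int isr_is_in_progress(unsigned cpu)
{
    return isr_nest_level[cpu] != 0;
}
```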

NOTE: most of this is also true for _Thread_Dispatch_disable_level

END NOTE


Please do not take what I said in the mail as a criticism of
anyone who submitted ppc code. Any code present helped me
a lot in understanding PPC behavior. I just wanted by this
mail to :

1) try to better understand the current code,
2) propose concrete ways of enhancing the current code
by providing an alternative implementation for MCP750. I
will make my best effort to try to break nothing but this
is actually hard due to the file layout organisation.
3) make understandable some changes I will probably make
if joel lets me do them :-)

Any comments/objections are welcome as usual.

--


/ ` Eric Valette

/-- o _. Canon CRF

(_, / (_(_( Rue de la touche lambert

35517 Cesson-Sevigne Cedex
FRANCE

Tel: +33 (0)2 99 87 68 91 Fax: +33 (0)2 99 84 11 30
E-mail: valette@…

  • Property mode set to 100644
File size: 8.3 KB
/*
 * $Id$
 *
 * This code is loaded by the ROM loader at some arbitrary location.
 * Move it to high memory so that it can load the kernel at 0x0000.
 *
 */

#include "bootldr.h"
#include <libcpu/cpu.h>
#include <rtems/score/targopts.h>
#include "asm.h"

#undef TEST_PPCBUG_CALLS
#define FRAME_SIZE 32
#define LOCK_CACHES (HID0_DLOCK|HID0_ILOCK)
#define INVL_CACHES (HID0_DCI|HID0_ICFI)
#define ENBL_CACHES (HID0_DCE|HID0_ICE)

#define USE_PPCBUG
#undef  USE_PPCBUG

        START_GOT
        GOT_ENTRY(_GOT2_TABLE_)
        GOT_ENTRY(_FIXUP_TABLE_)
        GOT_ENTRY(.bss)
        GOT_ENTRY(codemove)
        GOT_ENTRY(0)
        GOT_ENTRY(__bd)
        GOT_ENTRY(moved)
        GOT_ENTRY(_binary_rtems_gz_start)
        GOT_ENTRY(_binary_initrd_gz_start)
        GOT_ENTRY(_binary_initrd_gz_end)
#ifdef TEST_PPCBUG_CALLS
        GOT_ENTRY(banner_start)
        GOT_ENTRY(banner_end)
#endif
        END_GOT
        .globl  start
        .type   start,@function
/* Point the stack into the PreP partition header in the x86 reserved
 * code area, so that simple C routines can be called.
 */
start:  bl      1f
1:      mflr    r1
        li      r0,0
        stwu    r0,start-1b-0x400+0x1b0-FRAME_SIZE(r1)
        stmw    r26,FRAME_SIZE-24(r1)
        GET_GOT
        mfmsr   r28             /* Turn off interrupts */
        ori     r0,r28,MSR_EE
        xori    r0,r0,MSR_EE
        mtmsr   r0

/* Enable the caches, from now on cr2.eq set means processor is 601 */
        mfpvr   r0
        mfspr   r29,HID0
        srwi    r0,r0,16
        cmplwi  cr2,r0,1
        beq     2,2f
#ifndef USE_PPCBUG
        ori     r0,r29,ENBL_CACHES|INVL_CACHES|LOCK_CACHES
        xori    r0,r0,INVL_CACHES|LOCK_CACHES
        sync
        isync
        mtspr   HID0,r0
#endif
2:      bl      reloc

/* save all the parameters and the orginal msr/hid0/r31 */
        lwz     bd,GOT(__bd)
        stw     r3,0(bd)
        stw     r4,4(bd)
        stw     r5,8(bd)
        stw     r6,12(bd)
        lis     r3,__size@sectoff@ha
        stw     r7,16(bd)
        stw     r8,20(bd)
        addi    r3,r3,__size@sectoff@l
        stw     r9,24(bd)
        stw     r10,28(bd)
        stw     r28,o_msr(bd)
        stw     r29,o_hid0(bd)
        stw     r31,o_r31(bd)

/* Call the routine to fill boot_data structure from residual data.
 * And to find where the code has to be moved.
 */
        bl      early_setup

/* Now we need to relocate ourselves, where we are told to. First put a
 * copy of the codemove routine to some place in memory.
 * (which may be where the 0x41 partition was loaded, so size is critical).
 */
        lwz     r4,GOT(codemove)
        li      r5,_size_codemove
        lwz     r3,mover(bd)
        lwz     r6,cache_lsize(bd)
        bl      codemove
        mtctr   r3              # Where the temporary codemove is.
        lwz     r3,image(bd)
        lis     r5,_edata@sectoff@ha
        lwz     r4,GOT(0)       # Our own address
        addi    r5,r5,_edata@sectoff@l
        lwz     r6,cache_lsize(bd)
        lwz     r8,GOT(moved)
        sub     r7,r3,r4        # Difference to adjust pointers.
        add     r8,r8,r7
        add     r30,r30,r7
        add     bd,bd,r7
/* Call the copy routine but return to the new area. */
        mtlr    r8              # for the return address
        bctr                    # returns to the moved instruction
/* Establish the new top stack frame. */
moved:  lwz     r1,stack(bd)
        li      r0,0
        stwu    r0,-16(r1)

/* relocate again */
        bl      reloc
/* Clear all of BSS */
        lwz     r10,GOT(.bss)
        li      r0,__bss_words@sectoff@l
        subi    r10,r10,4
        cmpwi   r0,0
        mtctr   r0
        li      r0,0
        beq     4f
3:      stwu    r0,4(r10)
        bdnz    3b

/* Final memory initialization. First switch to unmapped mode
 * in case the FW had set the MMU on, and flush the TLB to avoid
 * stale entries from interfering. No I/O access is allowed
 * during this time!
 */
#ifndef USE_PPCBUG
4:      bl      MMUoff
#endif
        bl      flush_tlb
/* Some firmware versions leave stale values in the BATs, it's time
 * to invalidate them to avoid interferences with our own mappings.
 * But the 601 valid bit is in the BATL (IBAT only) and others are in
 * the [ID]BATU. Bloat, bloat.. fortunately thrown away later.
 */
        li      r3,0
        beq     cr2,5f
        mtdbatu 0,r3
        mtdbatu 1,r3
        mtdbatu 2,r3
        mtdbatu 3,r3
5:      mtibatu 0,r3
        mtibatl 0,r3
        mtibatu 1,r3
        mtibatl 1,r3
        mtibatu 2,r3
        mtibatl 2,r3
        mtibatu 3,r3
        mtibatl 3,r3
        lis     r3,__size@sectoff@ha
        addi    r3,r3,__size@sectoff@l
        sync                            # We are going to touch SDR1 !
        bl      mm_init
        bl      MMUon

/* Now we are mapped and can perform I/O if we want */
#ifdef TEST_PPCBUG_CALLS
/* Experience seems to show that PPCBug can only be called with the
 * data cache disabled and with MMU disabled. Bummer.
 */
        li      r10,0x22                # .OUTLN
        lwz     r3,GOT(banner_start)
        lwz     r4,GOT(banner_end)
        sc
#endif
        bl      setup_hw
        lwz     r4,GOT(_binary_rtems_gz_start)
        lis     r5,_rtems_gz_size@sectoff@ha
        lwz     r6,GOT(_binary_initrd_gz_start)
        lis     r3,_rtems_size@sectoff@ha
        lwz     r7,GOT(_binary_initrd_gz_end)
        addi    r5,r5,_rtems_gz_size@sectoff@l
        addi    r3,r3,_rtems_size@sectoff@l
        sub     r7,r7,r6
        bl      decompress_kernel

/* Back here we are unmapped and we start the kernel, passing up to eight
 * parameters just in case, only r3 to r7 used for now. Flush the tlb so
 * that the loaded image starts in a clean state.
 */
        bl      flush_tlb
        lwz     r3,0(bd)
        lwz     r4,4(bd)
        lwz     r5,8(bd)
        lwz     r6,12(bd)
        lwz     r7,16(bd)
        lwz     r8,20(bd)
        lwz     r9,24(bd)
        lwz     r10,28(bd)

        lwz     r30,0(0)
        mtctr   r30
/*
 *      Linux code again
        lis     r30,0xdeadc0de@ha
        addi    r30,r30,0xdeadc0de@l
        stw     r30,0(0)
        li      r30,0
*/
        dcbst   0,r30   /* Make sure it's in memory ! */
/* We just flash invalidate and disable the dcache, unless it's a 601,
 * critical areas have been flushed and we don't care about the stack
 * and other scratch areas.
 */
        beq     cr2,1f
        mfspr   r0,HID0
        ori     r0,r0,HID0_DCI|HID0_DCE
        sync
        mtspr   HID0,r0
        xori    r0,r0,HID0_DCI|HID0_DCE
        mtspr   HID0,r0
/* Provisional return to FW, works for PPCBug */
#if 0
1:      mfmsr   r10
        ori     r10,r10,MSR_IP
        mtmsr   r10
        li      r10,0x63
        sc
#else
1:      bctr
#endif



/* relocation function, r30 must point to got2+0x8000 */
reloc:
/* Adjust got2 pointers, no need to check for 0, this code already puts
 * a few entries in the table.
 */
        li      r0,__got2_entries@sectoff@l
        la      r12,GOT(_GOT2_TABLE_)
        lwz     r11,GOT(_GOT2_TABLE_)
        mtctr   r0
        sub     r11,r12,r11
        addi    r12,r12,-4
1:      lwzu    r0,4(r12)
        add     r0,r0,r11
        stw     r0,0(r12)
        bdnz    1b

/* Now adjust the fixups and the pointers to the fixups in case we need
 * to move ourselves again.
 */
2:      li      r0,__fixup_entries@sectoff@l
        lwz     r12,GOT(_FIXUP_TABLE_)
        cmpwi   r0,0
        mtctr   r0
        addi    r12,r12,-4
        beqlr
3:      lwzu    r10,4(r12)
        lwzux   r0,r10,r11
        add     r0,r0,r11
        stw     r10,0(r12)
        stw     r0,0(r10)
        bdnz    3b
        blr

/* Set the MMU on and off: code is always mapped 1:1 and does not need MMU,
 * but it does not cost so much to map it also and it catches calls through
 * NULL function pointers.
 */
        .globl  MMUon
        .type   MMUon,@function
MMUon:  mfmsr   r0
        ori     r0,r0,MSR_IR|MSR_DR|MSR_IP
        mflr    r11
        xori    r0,r0,MSR_IP
        mtsrr0  r11
        mtsrr1  r0
        rfi
        .globl  MMUoff
        .type   MMUoff,@function
MMUoff: mfmsr   r0
        ori     r0,r0,MSR_IR|MSR_DR|MSR_IP
        mflr    r11
        xori    r0,r0,MSR_IR|MSR_DR
        mtsrr0  r11
        mtsrr1  r0
        rfi

/* Due to the PPC architecture (and according to the specifications), a
 * series of tlbie which goes through a whole 256 MB segment always flushes
 * the whole TLB. This is obviously overkill and slow, but who cares ?
 * It takes about 1 ms on a 200 MHz 603e and works even if residual data
 * get the number of TLB entries wrong.
 */
flush_tlb:
        lis     r11,0x1000
1:      addic.  r11,r11,-0x1000
        tlbie   r11
        bnl     1b
/* tlbsync is not implemented on 601, so use sync which seems to be a superset
 * of tlbsync in all cases and do not bother with CPU dependant code
 */
        sync
        blr
/* A few utility functions, some copied from arch/ppc/lib/string.S */

#if 0
        .globl  strnlen
        .type   strnlen,@function
strnlen:
        addi    r4,r4,1
        mtctr   r4
        addi    r4,r3,-1
1:      lbzu    r0,1(r4)
        cmpwi   0,r0,0
        bdnzf   eq,1b
        subf    r3,r3,r4
        blr
#endif
        .globl  codemove
codemove:
        .type   codemove,@function
/* r3 dest, r4 src, r5 length in bytes, r6 cachelinesize */
        cmplw   cr1,r3,r4
        addi    r0,r5,3
        srwi.   r0,r0,2
        beq     cr1,4f  /* In place copy is not necessary */
        beq     7f      /* Protect against 0 count */
        mtctr   r0
        bge     cr1,2f

        la      r8,-4(r4)
        la      r7,-4(r3)
1:      lwzu    r0,4(r8)
        stwu    r0,4(r7)
        bdnz    1b
        b       4f

2:      slwi    r0,r0,2
        add     r8,r4,r0
        add     r7,r3,r0
3:      lwzu    r0,-4(r8)
        stwu    r0,-4(r7)
        bdnz    3b

/* Now flush the cache: note that we must start from a cache aligned
 * address. Otherwise we might miss one cache line.
 */
4:      cmpwi   r6,0
        add     r5,r3,r5
        beq     7f      /* Always flush prefetch queue in any case */
        subi    r0,r6,1
        andc    r3,r3,r0
        mr      r4,r3
5:      cmplw   r4,r5
        dcbst   0,r4
        add     r4,r4,r6
        blt     5b
        sync            /* Wait for all dcbst to complete on bus */
        mr      r4,r3
6:      cmplw   r4,r5
        icbi    0,r4
        add     r4,r4,r6
        blt     6b
7:      sync            /* Wait for all icbi to complete on bus */
        isync
        blr
        .size   codemove,.-codemove
_size_codemove=.-codemove

        .section        ".data" # .rodata
        .align 2
#ifdef TEST_PPCBUG_CALLS
banner_start:
        .ascii "This message was printed by PPCBug with MMU enabled"
banner_end:
#endif
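
For readers less familiar with PowerPC assembly, the copy step of codemove above can be modeled in C roughly as follows. This is only a sketch: the function name `word_move` is hypothetical, and the cache handling (dcbst/icbi/sync/isync) that follows the copy in the assembly is deliberately left out, since it has no portable C equivalent. The length is rounded up to whole words, and the copy direction is chosen so that overlapping source and destination regions are handled correctly.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the copy step only; the cache flush is not represented. */
static void word_move(uint32_t *dst, const uint32_t *src, size_t len_bytes)
{
    size_t words = (len_bytes + 3) / 4; /* addi r0,r5,3 ; srwi. r0,r0,2 */
    size_t i;

    if (dst == src || words == 0)
        return;                         /* nothing to copy */
    if (dst < src) {
        for (i = 0; i < words; i++)     /* ascending copy (label 1) */
            dst[i] = src[i];
    } else {
        for (i = words; i-- > 0; )      /* descending copy (label 3) */
            dst[i] = src[i];
    }
}
```

Copying downwards when the destination lies above the source is what lets the loader move itself onto a region that may overlap its current location.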