source: rtems/c/src/lib/libcpu/powerpc/shared/pgtable.h @ 817466c
Last change on this file since 817466c was a4f6b02, checked in by Joel Sherrill <joel.sherrill@…>, on Jun 14, 1999 at 5:54:21 PM

This is a large patch from Eric Valette <valette@…> that was
described in the message following this paragraph. This patch also includes
a mcp750 BSP.

From valette@… Mon Jun 14 10:03:08 1999
Date: Tue, 18 May 1999 01:30:14 +0200 (CEST)
From: VALETTE Eric <valette@…>
To: joel@…
Cc: raguet@…, rtems-snapshots@…, valette@…
Subject: Questions/Suggestion? regarding RTEMS PowerPC code (long)

Dear knowledgeable RTEMS powerpc users,

As some of you may know, I'm currently finalizing a port
of RTEMS on a MCP750 Motorola board. I have done most
of it but have some questions to ask before submitting
the port.

In order to understand some of the changes I have made
or would like to make, maybe it is worth describing the
MCP750 Motorola board.

The MCP750 is a CompactPCI PowerPC board with:

1) a MPC750 233 MHz processor,
2) a raven bus bridge/PCI controller that
implements an OPENPIC compliant interrupt controller,
3) a VIA 82C586 PCI/ISA bridge that offers a PC
compliant IO for keyboard, serial line, IDE, and
the well known PC 8259 cascaded PIC interrupt
architecture model,
4) a DEC 21140 Ethernet controller,
5) the PPCBUG Motorola firmware in flash,
6) a DEC PCI bridge.

This architecture is common to most Motorola 60x/7xx
boards except that:

1) on VME board, the DEC PCI bridge is replaced by
a VME chipset,
2) the VIA 82C586 PCI/ISA bridge is replaced by
another bridge that is almost fully compatible
with the via bridge...

So the port should be a rather close basis for many
60x/7xx Motorola boards...

On this board, I already have ported Linux 2.2.3 and
use it both as a development and target board.

Now the questions/suggestions I have :


As far as I know exceptions on PPC are handled like
interrupts. I dislike this very much as :

a) Except for the decrementer exception (and
maybe some other on mpc8xx), exceptions are
not recoverable and the handler just needs to print
the full context and go to the firmware or debugger...
b) The interrupt switch is only necessary for the
decrementer and external interrupt (at least on
c) The full context for exception is never saved and
thus cannot be used by debugger... I do understand
the most important for interrupts low level code
is to save the minimal context enabling to call C
code for performance reasons. On non recoverable
exception on the other hand, the most important is
to save the maximum information concerning proc status
in order to analyze the reason of the fault. At
least we will need this in order to implement the
port of RGDB on PPC

==> I wrote an API for connecting raw exceptions (and thus
raw interrupts) for mpc750. It should be valid for most
powerpc processors... I hope to find a way to make this coexist
with actual code layout. The code is actually located
in lib/libcpu/powerpc/mpc750 and is thus optional
(provided I write my own version of exec/score/cpu/powerpc/cpu.c ...)

See remark about files/directory layout organization in 4)

2) Current Implementation of ISR low level code

I do not understand why the MSR EE flag is cleared
again in exec/score/cpu/powerpc/irq_stubs.S


mfmsr r5
mfspr r6, sprg2


lwz r6,msr_initial(r11)
and r6,r6,r5
mfmsr r5


Reading the doc, when a decrementer interrupt or an
external interrupt is active, the MSR EE flag is already
cleared. BTW if exception/interrupt could occur, it would
trash SRR0 and SRR1. In fact the code may be useful to set
MSR[RI] that re-enables exception processing. BTW I will need
to set other value in MSR to handle interrupts :

a) I want the MSR[IR] and MSR[DR] to be set for
performance reasons and also because I need DBAT
support to have access to PCI memory space as the
interrupt controller is in the PCI space.

Reading the code, I see others have the same kind of request :

/* SCE 980217


 * We need address translation ON when we call our ISR routine

mtmsr r5


This is just another proof that even the lowest level
IRQ code is fundamentally board dependent and
not simply processor dependent, especially when
the processor uses an external interrupt controller
because it has a single interrupt request line...

Note that in the PPC high level interrupt handling code,
the "set_vector" routine that really connects
the interrupt lives in BSP/startup/genpvec.c, so
IRQ handling is already DE FACTO BSP specific.

I know I have already expressed this and understand that this
would require some heavy changes in the code, but believe
me, you will reach a point where you will not be able
to find a compatible yet optimal implementation for low level
interrupt handling code... In my case this is already true...

So please consider removing low level IRQ handling from
exec/score/cpu/* and leaving only exception handling code there...
Exceptions are usually only processor dependent and do
not depend on external hardware mechanisms to be masked,
acknowledged or re-enabled (there are probably exceptions but ...)

I have already done this for pc386 bsp but need to do it again.
This time I will even propose an API.

3) R2/R13 manipulation for EABI implementation

I do not understand the handling of r2 and r13 in the
EABI case. The specification for r2 says pointer to sdata2,
sbss2 section => constant. However I do not see -ffixed-r2
passed to any compilation system in make/custom/*
(for info linux does this on PPC).

So either this is a default compiler option when choosing
powerpc-rtems and thus we do not need to do anything with
this register as all the code is compiled with this compiler
and linked together OR this register may be used by rtems code
and then we do not need any special initialization or

The specification for r13 says pointer to the small data
area. The argument for r13 is the same except that as far
as I know the usage of the small data area requires
specific compiler support so that access to variables is
compiled via loading the LSB in a register and then
using r13 to get full address... It is like a small
memory model and it was present in IBM C compilers.

=> I propose to suppress any specific code for r2 and
r13 in the EABI case.

4) Code layout organization (yes again :-))

I think there are a number of design flaws in the way
the code is organized for ppc and I will try to point them out.
I have been bitten by this again on this new port, and
was bitten last year while modifying code for pc386.

a) exec/score/cpu/* vs lib/libcpu/cpu/*.

I think that too many things are put in exec/score/cpu that
have nothing to do with RTEMS internals but are rather
related to CPU features.

This includes at least:

a) register access routines (e.g. GET_MSR_Value),
b) interrupt masking/unmasking routines,
c) cache_mngt_routine,
d) mmu_mngt_routine,
e) Routines to connect the raw_exception, raw_interrupt

b) lib/libcpu/cpu/powerpc/*

With a processor family as exuberant as the powerpc family,
with its well known subtle differences (604 vs 750) and
unfortunately major ones (8xx vs 60x), the directory structure
is fine (except maybe the names, which are not homogeneous)


ppc421 mpc821 ...

I only needed to add mpc750. But the fact that libcpu.a was not
produced was a pain and the fact that this organization may
duplicate code is also problematic.

So, except if the support of automake provides a better solution
I would like to propose something like this :


mpc421 mpc821 ... mpc750 shared wrapup

with the following rules :

a) "shared" would act as a source container for sources that may
be shared among processors. Needed files would be compiled inside
the processor specific directory using the vpath Makefile
mechanism. "shared" may also contain compilation code
for routine that are really shared and not worth to inline...
(did not find many things so far, as register access routines
ARE WORTH INLINING)... In the case something is compiled there,
it should create libcpushared.a

b) layout under processor specific directory is free provided

1)the result of the compilation process exports :

libcpu/powerpc/"PROC"/*.h in $(PROJECT_INCLUDE)/libcpu

2) each processor specific directory creates
a library called libcpuspecific.a

Note that this organization makes it possible to have a file that
is nearly the same as in shared but that must differ
because of processor differences...

c) "wrapup" should create libcpu.a using libcpushared.a
libcpuspecific.a and export it to $(PROJECT_INCLUDE)/libcpu

The only thing for which I have no ideal solution is the way to put shared
definitions in "shared" and only processor specific definitions
in "proc". To give a concrete example, most MSR bit definitions
are shared among PPC processors and only some differ. If we create
a single msr.h in shared it will have ifdefs. If in msr.h we
include libcpu/msr_c.h we will need to have it in each powerpc
specific directory (even empty). Opinions are welcome ...

Note that a similar mechanism exists in libbsp/i386 that also
contains a shared directory that is used by several bsps
like pc386 and i386ex and a similar wrapup mechanism...

NB: I have done this for mpc750 and other processors could just use
similar Makefiles...

c) The exec/score/cpu/powerpc directory layout.

I think the directory layout should be the same as in
libcpu/powerpc. As it is not, there are a lot of ifdefs
inside the code... And of course low level interrupt handling
code should be removed...

Besides that I do not understand why

1) things are compiled in the wrap directory,
2) some includes are moved to rtems/score,

I think the "preinstall" mechanism enables to put
everything in the current directory (or better in a per processor

5) Interrupt handling API

Again :-). But I think that using all the features the PIC
offers is a MUST for an RT system. I already explained in the
prologue of this (long and probably boring) mail that the MCP750
board offers an OPENPIC compliant architecture and that
the VIA 82586 PCI/ISA bridge offers a PC compatible IO and
PIC mapping. Here is a logical view of the RAVEN/VIA 82586
interrupt mapping :

| OPEN | <-----|8259|
| PIC | | | 2 ------
|(RAVEN)| | | <-----|8259|
| | | | | | 11
| | | | | | <----
| | | | | |
| | | | | |

| VIA PCI/ISA bridge
| x
-------- PCI interrupts

OPENPIC offers interrupt priorities among PCI interrupts
and interrupt selective masking. The 8259 offers the same kind
of feature. With actual powerpc interrupt code :

1) there is no way to specify priorities among
interrupt handlers. This is REALLY a bad thing.
For me it is as important as having priorities
for threads...
2) for my implementation, each ISR should
contain the code that acknowledges the RAVEN
and 8259 cascade, modifies the interrupt mask on both
chips, and re-enables interrupts at processor level,
..., restores them on interrupt return,.... This code
is actually similar to code located in some
genpvec.c powerpc files,
3) I must update _ISR_Nesting_level because
irq.inl uses it...
4) the libchip code connects the ISR via set_vector
but the libchip handler code does not contain any code to
manipulate external interrupt controller hardware
in order to acknowledge the interrupt or re-enable
them (except for the target hardware of course)
So this code is broken unless set_vector adds an
additional prologue/epilogue before calling/returning
from the handler in order to acknowledge/mask the raven and the
8259 PICS... => Anyway already EACH BSP MUST REWRITE

I would rather offer an API similar to the one provided
in libbsp/i386/shared/irq/irq.h so that :

1) Once the driver supplied method is called, the
only thing the ISR has to do is to worry about the
external hardware that triggered the interrupt.
Everything on openpic/VIA/processor would have been
done by the low levels (same thing as set_vector)
2) The caller will need to supply the on/off/isOn
routines that are fundamental to correctly implement
debuggers/performance monitoring in a portable way
3) Globally configurable interrupt priorities

I have nothing against providing a compatible
set_vector just to make libchip happy but
as I have already explained in other
mails (months ago), I really think that the ISR
connection should be handled by the BSP and that no
code containing irq connection should exist in the
rtems generic layers... Thus I really dislike
libchip on this aspect because in the long term
it will force us to adopt the least rich API
for interrupt handling that exists (set_vector).

Additional note : I think the _ISR_Is_in_progress()
inline routine should be :

1) Put in a processor specific section,
2) Should not rely on a global variable,

As :

a) on symmetric MP, there is one interrupt level
per CPU,
b) On processors that have an ISP (e.g. 68040),
this variable is useless (MSR bit testing could
be used)
c) On PPC, instead of using the address of the
variable via CPU_IRQ_info.Nest_level a dedicated
SPR could be used.

NOTE: most of this is also true for _Thread_Dispatch_disable_level


Please do not take what I said in the mail as a criticism of
anyone who submitted ppc code. Any code present helped me
a lot in understanding PPC behavior. I just wanted by this
mail to :

1) try to better understand the actual code,
2) propose concrete ways of enhancing current code
by providing an alternative implementation for MCP750. I
will make my best effort to try to break nothing but this
is actually hard due to the file layout organisation.
3) make understandable some changes I will probably make
if Joel lets me do them :-)

Any comments/objections are welcome as usual.


/ ` Eric Valette

/-- o _. Canon CRF

(_, / (_(_( Rue de la touche lambert

35517 Cesson-Sevigne Cedex

Tel: +33 (0)2 99 87 68 91 Fax: +33 (0)2 99 84 11 30
E-mail: valette@…

#ifndef _PPC_PGTABLE_H
#define _PPC_PGTABLE_H

/*
 * The PowerPC MMU uses a hash table containing PTEs, together with
 * a set of 16 segment registers (on 32-bit implementations), to define
 * the virtual to physical address mapping.
 *
 * We use the hash table as an extended TLB, i.e. a cache of currently
 * active mappings.  We maintain a two-level page table tree, much like
 * that used by the i386, for the sake of the Linux memory management code.
 * Low-level assembler code in head.S (procedure hash_page) is responsible
 * for extracting ptes from the tree and putting them into the hash table
 * when necessary, and updating the accessed and modified bits in the
 * page table tree.
 *
 * The PowerPC MPC8xx uses a TLB with hardware assisted, software tablewalk.
 * We also use the two level tables, but we can put the real bits in them
 * needed for the TLB and tablewalk.  These definitions require Mx_CTR.PPM = 0,
 * Mx_CTR.PPCS = 0, and MD_CTR.TWAM = 1.  The level 2 descriptor has
 * additional page protection (when Mx_CTR.PPCS = 1) that allows TLB hit
 * based upon user/super access.  The TLB does not have accessed nor write
 * protect.  We assume that if the TLB get loaded with an entry it is
 * accessed, and overload the changed bit for write protect.  We use
 * two bits in the software pte that are supposed to be set to zero in
 * the TLB entry (24 and 25) for these indicators.  Although the level 1
 * descriptor contains the guarded and writethrough/copyback bits, we can
 * set these at the page level since they get copied from the Mx_TWC
 * register when the TLB entry is loaded.  We will use bit 27 for guard, since
 * that is where it exists in the MD_TWC, and bit 26 for writethrough.
 * These will get masked from the level 2 descriptor at TLB load time, and
 * copied to the MD_TWC before it gets loaded.
 */

/* PMD_SHIFT determines the size of the area mapped by the second-level page tables */
#define PMD_SHIFT       22
#define PMD_SIZE        (1UL << PMD_SHIFT)
#define PMD_MASK        (~(PMD_SIZE-1))

/* PGDIR_SHIFT determines what a third-level page table entry can map */
#define PGDIR_SHIFT     22
#define PGDIR_SIZE      (1UL << PGDIR_SHIFT)
#define PGDIR_MASK      (~(PGDIR_SIZE-1))

/*
 * entries per page directory level: our page-table tree is two-level, so
 * we don't really have any PMD directory.
 */
#define PTRS_PER_PTE    1024
#define PTRS_PER_PMD    1
#define PTRS_PER_PGD    1024

/* Just any arbitrary offset to the start of the vmalloc VM area: the
 * current 64MB value just means that there will be a 64MB "hole" after the
 * physical memory until the kernel virtual memory starts.  That means that
 * any out-of-bounds memory accesses will hopefully be caught.
 * The vmalloc() routines leaves a hole of 4kB between each vmalloced
 * area for the same reason. ;)
 *
 * We no longer map larger than phys RAM with the BATs so we don't have
 * to worry about the VMALLOC_OFFSET causing problems.  We do have to worry
 * about clashes between our early calls to ioremap() that start growing down
 * from ioremap_base being run into the VM area allocations (growing upwards
 * from VMALLOC_START).  For this reason we have ioremap_bot to check when
 * we actually run into our mappings setup in the early boot with the VM
 * system.  This really does become a problem for machines with good amounts
 * of RAM.  -- Cort
 */
#define VMALLOC_OFFSET (0x4000000) /* 64M */
#define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
#define VMALLOC_VMADDR(x) ((unsigned long)(x))
#define VMALLOC_END     ioremap_bot

/*
 * Bits in a linux-style PTE.  These match the bits in the
 * (hardware-defined) PowerPC PTE as closely as possible.
 */
#define _PAGE_PRESENT   0x001   /* software: pte contains a translation */
#define _PAGE_USER      0x002   /* matches one of the PP bits */
#define _PAGE_RW        0x004   /* software: user write access allowed */
#define _PAGE_GUARDED   0x008
#define _PAGE_COHERENT  0x010   /* M: enforce memory coherence (SMP systems) */
#define _PAGE_NO_CACHE  0x020   /* I: cache inhibit */
#define _PAGE_WRITETHRU 0x040   /* W: cache write-through */
#define _PAGE_DIRTY     0x080   /* C: page changed */
#define _PAGE_ACCESSED  0x100   /* R: page referenced */
#define _PAGE_HWWRITE   0x200   /* software: _PAGE_RW & _PAGE_DIRTY */
#define _PAGE_SHARED    0

#define PAGE_NONE       __pgprot(_PAGE_PRESENT | _PAGE_ACCESSED)

#define PAGE_SHARED     __pgprot(_PAGE_BASE | _PAGE_RW | _PAGE_USER | \
                                 _PAGE_SHARED)
#define PAGE_COPY       __pgprot(_PAGE_BASE | _PAGE_USER)
#define PAGE_READONLY   __pgprot(_PAGE_BASE | _PAGE_USER)

                                 _PAGE_NO_CACHE )

/*
 * The PowerPC can only do execute protection on a segment (256MB) basis,
 * not on a page basis.  So we consider execute permission the same as read.
 * Also, write permissions imply read permissions.
 * This is the closest we can get..
 */
#define __P000  PAGE_NONE
#define __P001  PAGE_READONLY
#define __P010  PAGE_COPY
#define __P011  PAGE_COPY
#define __P100  PAGE_READONLY
#define __P101  PAGE_READONLY
#define __P110  PAGE_COPY
#define __P111  PAGE_COPY

#define __S000  PAGE_NONE
#define __S001  PAGE_READONLY
#define __S010  PAGE_SHARED
#define __S011  PAGE_SHARED
#define __S100  PAGE_READONLY
#define __S101  PAGE_READONLY
#define __S110  PAGE_SHARED
#define __S111  PAGE_SHARED
#endif /* _PPC_PGTABLE_H */