07/26/99 21:35:15 (25 years ago)
Joel Sherrill <joel.sherrill@…>
4.10, 4.11, 4.8, 4.9, 5, master

Patch from Eric Valette <valette@…> based on a tremendous
bug report from David Decotigny <David.Decotigny@…>:

During the last few days, I've been back working on RTEMS. Let me
remind you that RTEMS didn't boot on our (old) Dell P90 machines (ref:
PC 590): we could only get a reboot out of them.

1/ The symptoms

Fortunately, the problem was rather deterministic. The stack couldn't be
written correctly: issuing one or more "push" would always push '0'
onto the stack. The way to solve this was to issue a "pop", such as
"pushl eax ; popl eax". After this "pop", the stack would be writable.

BUT, it would be writable for only 8 consecutive "push"s. After these 8
"push"s, further "push"s go wrong again, and a blank push/pop is
needed once more.

Considering that the L1 cache lines of this pentium are 32 bytes long,
and that 8 long int are 32 bytes long too, it came to us that there
was a problem with the cache.
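
The arithmetic behind that suspicion can be sketched in a few lines of C. This is only an illustration: the 32-byte line and the 4-byte "pushl" come from the report above, while the names pushes_per_line and line_index are hypothetical helpers invented here, not RTEMS code.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes from the report: 32-byte L1 cache lines, 4-byte "pushl". */
enum { CACHE_LINE_BYTES = 32, PUSH_BYTES = 4 };

/* Hypothetical helper: number of whole pushes that fit in one cache line. */
static size_t pushes_per_line(size_t line_bytes, size_t push_bytes)
{
    assert(push_bytes > 0);
    return line_bytes / push_bytes;
}

/* Hypothetical helper: index of the cache line that holds the byte at
 * `offset`, counted from a line-aligned base address. */
static size_t line_index(size_t offset, size_t line_bytes)
{
    assert(line_bytes > 0);
    return offset / line_bytes;
}
```

With these values, pushes_per_line(CACHE_LINE_BYTES, PUSH_BYTES) is 8, and a ninth 4-byte push (offset 32) lands in the next line, which matches the point at which the pushes went wrong again.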

In fact, the push bug could be reproduced with direct memory
accesses: writing to a memory location that is not in the cache would
store 0 until that location was accessed by a single "read". Then, the
whole cache line would be right again.

2/ The consequences

Of course, this was the first thing we were able to observe ;)
RTEMS could not boot. When a "call" pushed 0 onto the stack, the
matching "ret" could only lead to an exception a bit later. Since, in
the early stage, the interrupt vector points to 0, things couldn't
get any worse: triple fault + reboot.

3/ Explanation

This cache corruption only appeared after load_segment() returned
(through a jump). Further investigation showed that it is triggered
/sometimes/ during the PIC initialization.

"Sometimes" proved to be "when writing something with the 4th bit of
%al set", that is, "when writing 0x28 or 0xff" for example. Clearing
this bit was enough to make things work.
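
As a small illustration of that condition, the check below tests bit 3 of the byte held in %al. It is a sketch only: the mask value 0x08 and the example bytes 0x28 and 0xff come from the report, whereas the names PORT_ED_TRIGGER_MASK and triggers_cache_fault are hypothetical and do not exist in RTEMS.

```c
#include <stdint.h>

/* Bit 3 (the "4th bit") of the byte written to the port. */
#define PORT_ED_TRIGGER_MASK 0x08u

/* Hypothetical helper: returns 1 if the byte has bit 3 set, i.e. if it is
 * one of the values observed to upset the cache on the Dell P90. */
static int triggers_cache_fault(uint8_t al)
{
    return (al & PORT_ED_TRIGGER_MASK) != 0;
}
```

Both 0x28 (binary 0010 1000) and 0xff have bit 3 set, while a value such as 0x20 does not.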

In fact, this isn't a bug in the PIC initialization itself (which is
quite academic). It came from the "delay" routine, which in theory
does nothing but write to a "nonexistent" port (0xed) in order to
waste some time.

BUT, in the special case of our Dell P90, it appears that this 0xed
port does something cruel to the cache mechanism when the value
written to it has its 4th bit (aka bit 3 or 0x8) set.

I didn't investigate this non-standard behaviour of the P90 any
further: I don't know if this is documented, or if it is just another
(known?) bug of the early Pentiums. Just note that we have 5 such
machines, and the effect on the cache mechanism is the same on all of
them.

1 edited


  • c/src/lib/libbsp/i386/pc386/startup/ldsegs.S

    r29e68b75 r38bfb0d
    -       outb    al, $0xED       # about 1uS delay
    +       outb    al, $0x80       # about 1uS delay
            ret