#4013 assigned defect

Using the size of an object to deduce the alignment is broken on some architectures

Reported by: Sebastian Huber Owned by: Sebastian Huber
Priority: normal Milestone: 5.2
Component: score Version: 5
Severity: normal Keywords:
Cc: Blocked By:
Blocking:

Description

For example, code like this

RTEMS_INLINE_ROUTINE bool _Partition_Is_buffer_area_aligned(
  const void *starting_address
)
{
  return (((uintptr_t) starting_address) % CPU_SIZEOF_POINTER) == 0;
}

is broken on architectures with relaxed alignment requirements. For example, on m68k alignof(void *) == 2 while sizeof(void *) == 4. This may cause sporadic failures of psxconfig01.
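As a minimal sketch of the distinction (not necessarily the fix adopted for this ticket), the check can be based on the alignment the compiler actually reports instead of the object size. The function names below are hypothetical and are not part of RTEMS; C11 alignof is assumed to be available.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an RTEMS function: checks the address against
 * the alignment the compiler reports for a pointer object. */
static inline bool is_pointer_aligned(const void *address)
{
  return ((uintptr_t) address % alignof(void *)) == 0;
}

/* The size-based variant from the description, kept for comparison.  It
 * rejects valid addresses wherever sizeof(void *) > alignof(void *). */
static inline bool is_pointer_size_aligned(const void *address)
{
  return ((uintptr_t) address % sizeof(void *)) == 0;
}

/* The size of a type is always a multiple of its alignment, but the two
 * need not be equal. */
static_assert(
  sizeof(void *) % alignof(void *) == 0,
  "size must be a multiple of alignment"
);
```

On a target where alignof(void *) == 2 and sizeof(void *) == 4, an object placed at an address such as 0x1002 passes the first check and fails the second, which matches the sporadic behaviour described above.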

Change History (9)

comment:1 Changed on Jun 26, 2020 at 12:29:19 AM by Chris Johns

The test can be tagged intermittent for the traditional architectures it is sporadic on.

I do not see where the _Partition_Is_buffer_area_aligned() call and alignof(void *) == 2 come into the psxconfig01 test.

comment:2 Changed on Jun 26, 2020 at 5:09:18 AM by Sebastian Huber

The test passes a global data area to rtems_partition_create(), so the address map determines whether the test passes or fails.

comment:3 Changed on Jun 26, 2020 at 12:48:07 PM by Joel Sherrill

Why isn't the area or the structure declared with RTEMS_ALIGNED()? It has alignment requirements that are known at compile time and should be specified.

comment:4 Changed on Jun 26, 2020 at 2:18:39 PM by Sebastian Huber

Using RTEMS_ALIGNED() would work, but the real problem is that RTEMS imposes stricter alignment requirements than the architecture.
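For illustration, explicitly aligning the partition area could look like the following sketch. It uses C11 alignas as a stand-in for RTEMS_ALIGNED() so that it compiles outside an RTEMS tree. The 8-byte boundary, the macro names, and the array name are assumptions made for this example, not values taken from RTEMS or from the test.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define EXAMPLE_PARTITION_ALIGNMENT 8
#define EXAMPLE_PARTITION_COUNT 2

typedef struct {
  uint64_t data[16];
} area;

/* Without the alignment specifier an m68k compiler may place this array on
 * a 2-byte boundary; with it the start address is a multiple of 8. */
static alignas(EXAMPLE_PARTITION_ALIGNMENT) area
  example_partition_areas[EXAMPLE_PARTITION_COUNT];

/* Every element stays aligned only if the element size is a multiple of
 * the requested alignment. */
static_assert(
  sizeof(area) % EXAMPLE_PARTITION_ALIGNMENT == 0,
  "element size must preserve the alignment of later elements"
);
```

This works around the symptom in the test, but it leaves open the question raised in this comment: whether RTEMS should require more than the architecture does.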

comment:5 in reply to:  4 Changed on Jun 29, 2020 at 6:00:11 AM by Chris Johns

Replying to Sebastian Huber:

> Using RTEMS_ALIGNED() would work, but the real problem is that RTEMS imposes stricter alignment requirements than the architecture.

Yes, and that is a good thing. That the architecture supports a given alignment does not mean performance is optimal at that alignment. On an m68k, if the stack pointer is not aligned to a long word (4 bytes), performance suffers.

RTEMS has always placed extra constraints on architectures to extract the best performance possible.

comment:6 Changed on Jun 29, 2020 at 6:24:59 AM by Sebastian Huber

This ticket is not about the stack pointer. It is about the alignment of objects and in particular partitions. When the compiler places something like this on 2-byte alignment

typedef struct {
  uint64_t data [16];
} area;

#if CONFIGURE_MAXIMUM_PARTITIONS > 0
  static area partition_areas [CONFIGURE_MAXIMUM_PARTITIONS];
#endif

with -O2, then the performance cannot be that bad on this architecture.

The user always has the option to fine-tune this. Why should RTEMS enforce artificial alignment requirements?

comment:7 Changed on Jun 29, 2020 at 1:16:52 PM by Joel Sherrill

The goal was to ensure generically that the start of a partition was aligned the same on all architectures. Buffers allocated from one should meet the alignment requirements of a malloc'ed object, which means the alignment of a double. It is the user's responsibility to ensure that the memory passed in is aligned properly. I contend that this test is defective if it does not address alignment.

comment:8 Changed on Jun 29, 2020 at 1:55:57 PM by Sebastian Huber

On m68k I get for this test code

double d;
int i;
long long ll;

the following output

m68k-rtems5-gcc -S -O2 -o - test.c
#NO_APP
        .file   "test.c"
        .text
        .globl  ll
        .section        .bss
        .align  2
        .type   ll, @object
        .size   ll, 8
ll:
        .zero   8
        .globl  i
        .align  2
        .type   i, @object
        .size   i, 4
i:
        .zero   4
        .globl  d
        .align  2
        .type   d, @object
        .size   d, 8
d:
        .zero   8
        .ident  "GCC: (GNU) 10.1.0"

So the alignment is 2 bytes for int, double, and long long. GCC also provides this:

#define __BIGGEST_ALIGNMENT__ 2

There is no RTEMS API to align the buffer memory to some RTEMS-defined boundary.
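The same information can be queried from C without reading the assembler output. This is a small sketch assuming a C11 compiler; print_alignments() is a hypothetical helper name, and __BIGGEST_ALIGNMENT__ is only printed when the compiler predefines it, as GCC does.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdio.h>

/* The size is a multiple of the alignment, yet on targets such as m68k it
 * can be several times larger, so it is not a substitute for alignof. */
static_assert(
  sizeof(long long) % alignof(long long) == 0,
  "size must be a multiple of alignment"
);

/* Hypothetical helper: prints size and alignment of the types from the
 * test code above. */
static void print_alignments(void)
{
  printf("double:    size %zu, alignment %zu\n",
         sizeof(double), alignof(double));
  printf("int:       size %zu, alignment %zu\n",
         sizeof(int), alignof(int));
  printf("long long: size %zu, alignment %zu\n",
         sizeof(long long), alignof(long long));
#ifdef __BIGGEST_ALIGNMENT__
  /* Predefined by GCC; absent on some other compilers. */
  printf("__BIGGEST_ALIGNMENT__: %d\n", (int) __BIGGEST_ALIGNMENT__);
#endif
}
```

Calling print_alignments() on the target shows whether any of these types has an alignment smaller than its size. The values depend on the compiler and ABI, so no expected output is given here.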

comment:9 in reply to:  6 Changed on Jun 30, 2020 at 4:56:36 AM by Chris Johns

Replying to Sebastian Huber:

> This ticket is not about the stack pointer. It is about the alignment of objects and in particular partitions. When the compiler places something like this on 2-byte alignment
>
> typedef struct {
>   uint64_t data [16];
> } area;
>
> #if CONFIGURE_MAXIMUM_PARTITIONS > 0
>   static area partition_areas [CONFIGURE_MAXIMUM_PARTITIONS];
> #endif
>
> with -O2, then the performance cannot be that bad on this architecture.
>
> The user always has the option to fine-tune this. Why should RTEMS enforce artificial alignment requirements?

The stack pointer is provided as an example of the extreme end of the issue. What you show is simpler and the overheads might not be as apparent, but they are present. Also, please remember that the m68k does not have a pipeline or a smart bus interface that can join pipelined accesses into a single larger access, as other architectures can.

Consider a standard prologue for a function on a m68k. It will be something like:

movem.l a0-a7,-(sp)

If the m68k has a 32-bit bus and the SP is misaligned, it will silently use a bus cycle for each half of the 32-bit value written, that is, two bus cycles for each 32-bit word. If the SP were aligned on a long-word boundary, half the number of cycles would be needed. The m68k/ColdFire designers designed the device assuming the SP would be long-word aligned, even though it can support being misaligned. We need to capture and enforce this or we degrade the performance of the system.

The issue is the way the address bus is laid out: there is no A0 or A1. Instead, there are data strobes that gate the part of the 32-bit data that is active for the access.
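The cost argument above can be written down as a deliberately simplified model. It assumes a 32-bit data bus, counts one bus cycle for a long-word access at a 4-byte aligned address and two otherwise, and ignores caches, wait states, and everything else a real m68k or ColdFire part does. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "the model assumes 4-byte long words");

/* Simplified model, not cycle accurate: an aligned long-word access takes
 * one bus cycle, a misaligned one is split into two. */
static inline unsigned bus_cycles_for_long_access(uintptr_t address)
{
  return (address % sizeof(uint32_t) == 0) ? 1u : 2u;
}

/* Bus cycles for storing `count` consecutive long words starting at
 * `start`, as a multi-register store to the stack would. */
static inline unsigned bus_cycles_for_long_run(uintptr_t start, unsigned count)
{
  unsigned cycles = 0;

  for (unsigned i = 0; i < count; ++i) {
    cycles += bus_cycles_for_long_access(start + sizeof(uint32_t) * i);
  }

  return cycles;
}
```

Under this model, storing eight registers at an aligned address such as 0x1000 costs 8 bus cycles, while the same store at 0x1002 costs 16, which is the factor of two described in this comment.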
