#2957 closed defect (fixed)

Shared memory support internal locking is broken

Reported by: Sebastian Huber
Owned by: Gedare Bloom
Priority: normal
Milestone: 5.1
Component: score
Version: 5
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:

Description

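The top-level lock is an ISR lock (interrupt disable/enable, or an SMP lock) and the low-level lock is potentially a mutex. A mutex must not be obtained while an ISR lock is held, because obtaining it may require a thread dispatch. The problem is exposed by test psxshm02: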

#0  _Terminate (the_source=INTERNAL_ERROR_CORE, the_error=31) at ../../../../../../rtems/c/src/../../cpukit/score/src/interr.c:35
#1  0x00111654 in _Internal_error (core_error=INTERNAL_ERROR_BAD_THREAD_DISPATCH_ENVIRONMENT) at ../../../../../../rtems/c/src/../../cpukit/score/src/interr.c:52
#2  0x00117010 in _Thread_Do_dispatch (cpu_self=0x2035c0 <_Per_CPU_Information>, level=1611071955) at ../../../../../../rtems/c/src/../../cpukit/score/src/threaddispatch.c:190
#3  0x0011a568 in _Thread_Dispatch_enable (cpu_self=0x2035c0 <_Per_CPU_Information>) at ../../cpukit/../../../realview_pbx_a9_qemu/lib/include/rtems/score/threaddispatch.h:227
#4  0x0011b6c4 in _Thread_Change_life (clear=THREAD_LIFE_PROTECTED, set=THREAD_LIFE_PROTECTED, ignore=(unknown: 0)) at ../../../../../../rtems/c/src/../../cpukit/score/src/threadrestart.c:684
#5  0x0011b6ea in _Thread_Set_life_protection (state=THREAD_LIFE_PROTECTED) at ../../../../../../rtems/c/src/../../cpukit/score/src/threadrestart.c:691
#6  0x0010f3dc in _API_Mutex_Lock (the_mutex=0x2037d8) at ../../../../../../rtems/c/src/../../cpukit/score/src/apimutexlock.c:31
#7  0x001050a0 in _RTEMS_Lock_allocator () at ../../cpukit/../../../realview_pbx_a9_qemu/lib/include/rtems/score/apimutex.h:120
#8  0x00105442 in rtems_heap_allocate_aligned_with_boundary (size=10004, alignment=0, boundary=0) at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/malloc_deferred.c:89
#9  0x001055a6 in malloc (size=10004) at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/malloc.c:39
#10 0x0011e820 in realloc (ptr=0x0, size=10004) at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/realloc.c:62
#11 0x0010b1a2 in _POSIX_Shm_Object_resize_from_heap (shm_obj=0x204870, size=10004) at ../../../../../../rtems/c/src/../../cpukit/posix/src/shmheap.c:59
#12 0x0010b6ac in shm_ftruncate (iop=0x202cf8 <rtems_libio_iops+168>, length=10004) at ../../../../../../rtems/c/src/../../cpukit/posix/src/shmopen.c:83
#13 0x00104cfc in ftruncate (fd=3, length=10004) at ../../../../../../rtems/c/src/../../cpukit/libcsupport/src/ftruncate.c:37
#14 0x001008e0 in POSIX_Init (argument=0x0) at ../../../../../../../rtems/c/src/../../testsuites/psxtests/psxshm02/init.c:54
#15 0x001201ee in _Thread_Entry_adaptor_pointer (executing=0x2041a8) at ../../../../../../rtems/c/src/../../cpukit/score/src/threadentryadaptorpointer.c:25
#16 0x00120302 in _Thread_Handler () at ../../../../../../rtems/c/src/../../cpukit/score/src/threadhandler.c:88
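
The lock nesting behind this backtrace can be illustrated with a small stand-alone model. This is only a sketch: every model_* name is hypothetical and none of it is RTEMS code. It assumes that the check in frame #2 rejects a thread dispatch requested while interrupts are disabled, and that the allocator mutex path (frames #3 to #7) may request such a dispatch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not RTEMS variables. */
static bool interrupts_enabled = true;
static int  fatal_error_count  = 0;

/* Models an ISR lock on a uniprocessor: acquiring it disables interrupts. */
static void model_isr_lock_acquire(void)
{
  interrupts_enabled = false;
}

static void model_isr_lock_release(void)
{
  assert(!interrupts_enabled); /* must pair with an acquire */
  interrupts_enabled = true;
}

/*
 * Models obtaining a mutex: the path may perform a thread dispatch, and a
 * dispatch with interrupts disabled is counted as a fatal error here
 * (the real system terminates instead).
 */
static bool model_mutex_obtain(void)
{
  if (!interrupts_enabled) {
    ++fatal_error_count;
    return false;
  }
  return true;
}

/* Models malloc(): it takes the allocator mutex internally. */
static bool model_malloc(void)
{
  return model_mutex_obtain();
}

/* Broken pattern: call into the allocator while holding the ISR lock. */
static bool model_resize_broken(void)
{
  model_isr_lock_acquire();
  bool ok = model_malloc();
  model_isr_lock_release();
  return ok;
}

/* Working pattern: no ISR lock is held around the allocator call. */
static bool model_resize_ok(void)
{
  return model_malloc();
}
```

In this model, model_resize_ok() succeeds, while model_resize_broken() fails and increments fatal_error_count, which mirrors the ftruncate() call in frame #13 ending in _Terminate().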

Change History (5)

comment:1 Changed on Mar 28, 2017 at 7:39:49 PM by Gedare Bloom

Yes: the shm code has some rather questionable design choices in its locking patterns. Here it acquires a thread_queue and then calls a user plugin function; in this case that function calls malloc, which eventually leads to the error shown.

Is there written guidance somewhere on how to use the various lock constructs available in RTEMS, or on their constraints on use? I didn't even know this was a problem, and apparently did not spend much time thinking about it while writing the code.

The shm code uses 2 locking patterns. The one causing a problem here is the use of a thread_queue to protect the invocations of operations on the shared-memory object. This should instead be a fine-grained mutex that allows the invoked operation to also acquire and release internal locks, such as those needed by malloc.

The other lock used is the _Objects_Allocator_lock to protect the integrity of the POSIX_Shm_Control Object. I think this one is fine.

comment:2 in reply to:  1 Changed on Mar 29, 2017 at 5:34:52 AM by Sebastian Huber

Replying to Gedare:

> Yes: the shm code has some rather questionable design choices in its locking patterns. Here it acquires a thread_queue and then calls a user plugin function; in this case that function calls malloc, which eventually leads to the error shown.
>
> Is there written guidance somewhere on how to use the various lock constructs available in RTEMS, or on their constraints on use? I didn't even know this was a problem, and apparently did not spend much time thinking about it while writing the code.

Good question. Maybe we should add something to the manual, e.g. in Key Concepts, 3.3 Communication and Synchronization.

> The shm code uses 2 locking patterns. The one causing a problem here is the use of a thread_queue to protect the invocations of operations on the shared-memory object. This should instead be a fine-grained mutex that allows the invoked operation to also acquire and release internal locks, such as those needed by malloc.

The thread queue should be replaced with a mutex. For simplicity, maybe just the allocator mutex. We really need the self-contained mutexes for internal use.

> The other lock used is the _Objects_Allocator_lock to protect the integrity of the POSIX_Shm_Control Object. I think this one is fine.

Yes.
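
The replacement suggested in this comment, a mutex instead of the thread queue lock, can be sketched in portable C. This is an illustration only and not the change that closed the ticket: it assumes a POSIX pthread_mutex_t as a stand-in for an RTEMS-internal mutex, and demo_shm_object and the demo_shm_* functions are hypothetical names. The point is that a blocking mutex leaves interrupts enabled, so the operation it protects may call realloc(), which takes the allocator's own lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object; this is not the RTEMS POSIX_Shm_Control type. */
typedef struct {
  pthread_mutex_t lock;   /* blocking lock guarding object operations */
  void           *handle; /* backing memory, NULL while size is 0 */
  size_t          size;   /* current size in bytes */
} demo_shm_object;

static int demo_shm_init(demo_shm_object *obj)
{
  obj->handle = NULL;
  obj->size = 0;
  return pthread_mutex_init(&obj->lock, NULL);
}

/*
 * Resize the backing memory while holding the mutex. Calling realloc()
 * here is fine: the caller may block on the allocator's internal lock
 * because no interrupt-disabling lock is held.
 */
static int demo_shm_resize(demo_shm_object *obj, size_t size)
{
  int rc = pthread_mutex_lock(&obj->lock);
  if (rc != 0) {
    return rc;
  }

  if (size == 0) {
    free(obj->handle);
    obj->handle = NULL;
    obj->size = 0;
  } else {
    void *p = realloc(obj->handle, size);
    if (p == NULL) {
      rc = ENOMEM; /* the old block is still valid and left untouched */
    } else {
      if (size > obj->size) {
        /* Zero the newly added tail so readers never see stale data. */
        memset((char *) p + obj->size, 0, size - obj->size);
      }
      obj->handle = p;
      obj->size = size;
    }
  }

  pthread_mutex_unlock(&obj->lock);
  return rc;
}

static void demo_shm_destroy(demo_shm_object *obj)
{
  free(obj->handle);
  obj->handle = NULL;
  obj->size = 0;
  int rc = pthread_mutex_destroy(&obj->lock);
  assert(rc == 0);
  (void) rc;
}
```

A caller would initialize the object once with demo_shm_init(), call demo_shm_resize(&obj, 10004) where the test calls ftruncate(), and release everything with demo_shm_destroy().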

comment:3 Changed on May 11, 2017 at 7:31:02 AM by Sebastian Huber

Milestone: 4.12 → 4.12.0

comment:4 Changed on Jun 30, 2017 at 1:21:04 PM by Gedare Bloom

Resolution: fixed
Status: assigned → closed

comment:5 Changed on Nov 9, 2017 at 6:27:14 AM by Sebastian Huber

Milestone: 4.12.0 → 5.1

Milestone renamed
