#2124 closed defect (fixed)

Strict order mutex introduces unbounded priority inversion

Reported by: Gedare Owned by: Gedare
Priority: normal Milestone: 4.11.1
Component: score Version: 4.11
Severity: normal Keywords:
Cc: javamonn@… Blocked By:
Blocking:

Description

The option to ENABLE_STRICT_ORDER_MUTEX is not implemented correctly. It can introduce an unbounded priority inversion in certain circumstances. See http://www.rtems.com/ml/rtems-users/2009/may/msg00093.html and the spsem02 test case http://www.rtems.org/pipermail/rtems-devel/2013-May/003154.html

Change History (6)

comment:1 Changed on Dec 17, 2013 at 6:32:05 PM by Daniel Ramirez

Cc: Daniel Ramirez added

The current implementation of strict order mutexes is concerned only with that they are obtained/released in LIFO order, no attention is payed to correctly stepping down the inherited priority. LIFO order is checked by maintaining a list of Chain_nodes, and prepending a new node to this list every time a thread obtains a mutex. The thread has a pointer to the head of this list, as well as a count of the current mutexes held by it. When a mutex held by this thread is released, a Chain_node is popped off of the list and compared to the mutex we are attempting to release. If they match, we know it was in LIFO order. The specific line of code responsible for the bug can be found in the function responsible for these checks, _CORE_mutex_Pop_priority, line 60. In it, after the released mutex passes the LIFO check, we set the priority of the thread back to what it was before we had obtained that particular mutex. This is done by accessing the "queue" member of the CORE_mutex_control struct, which is a pointer to the CORE_mutex_order_list struct mentioned earlier. This CORE_mutex_order_list struct contains a Priority_Control member "priority_before" that is set as the priority before the task obtains the mutex.

The solution to this bug is complex because it requires some overhead management. Ideally, the Thread_Control_struct should contain a list of pointers to resources currently held by it. Right now, this struct contains only a count of the resources. Mutexes themselves contain a list of threads currently blocked on them. So, given a list of resources held by a thread, you could step through this list of held mutexes, and then the subsequent list of waiting threads for each mutex to find the thread with the highest priority. Then, you could set the priority of the mutex holding thread to match that. Note that this might not result in a change, as you might've released a mutex blocked on by threads of lesser priority. You could possibly implement checks to avoid stepping through the list in this case, as well as use a different data structure as opposed to unordered lists in order to reduce time complexity. As mutex operations should be kept as fast as possible, a better way to implement this feature would be to fetch the waiting threads on all mutexes and sort them in a list by priority, so the popping operation is kept at constant time. Keeping this list up to date would be complex (both time complex and code complex) however.

A ready made data structure that would allow for the bug to be fixed but would result in some major reworking of the existing code would be something modeled after the plist seen in the linux kernel here: http://lxr.free-electrons.com/source/include/linux/plist.h. This has been discussed by the RTEMS community previously.

Consider that any new structure would allow for the list of Chain_nodes to be removed, as you could check FIFO order in a way provided for by the new structure. This should be considered when weighing the cost/benefit analysis of determining whether the added overhead is worth it.

comment:2 Changed on Nov 23, 2014 at 4:37:54 PM by Joel Sherrill

Owner: changed from Joel Sherrill to Gedare
Status: newassigned

comment:3 Changed on Nov 24, 2014 at 6:58:28 PM by Gedare

Version: HEAD4.11

Replace Version=HEAD with Version=4.11 for the tickets with Milestone >= 4.11

comment:4 Changed on Dec 24, 2014 at 2:53:37 AM by Gedare

Milestone: 4.114.11.1

comment:5 Changed on Apr 29, 2016 at 7:17:54 AM by Sebastian Huber

This problem will be fixed as a side-effect of the OMIP implementation, see #2556.

comment:6 Changed on May 2, 2016 at 10:08:51 AM by Sebastian Huber <sebastian.huber@…>

Resolution: fixed
Status: assignedclosed

In 500a8e9c62dca9f62611ecca64857dadb2bc0557/rtems:

score: Delete RTEMS_STRICT_ORDER_MUTEX

Remove support for strict order mutexes.

Close #2124.

Note: See TracTickets for help on using tickets.