#4531 closed defect (fixed)

Data corruption in SMP schedulers

Reported by: Sebastian Huber
Owned by: Sebastian Huber
Priority: normal
Milestone: 6.1
Component: score
Version: 5
Severity: critical
Keywords: SMP, qualification
Cc:
Blocked By:


Certain operations involving sticky thread queues, thread-to-processor affinity changes, priority updates, or thread pinning lead to data corruption in SMP schedulers, in particular in the default SMP scheduler (SMP EDF).

Change History (21)

comment:1 Changed on 10/20/21 at 05:37:48 by Sebastian Huber

Severity: normal → critical

comment:2 Changed on 11/23/21 at 13:35:06 by Sebastian Huber <sebastian.huber@…>

In 577262a7/rtems:

score: Add red-black tree append/prepend

These functions are a faster alternative to _RBTree_Insert_inline() if
it is known that the new node is the maximum/minimum node.

Update #4531.
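
The idea behind append/prepend can be illustrated with a minimal sketch. It uses a plain, unbalanced binary search tree with hypothetical names (`node`, `tree`, `tree_insert`, `tree_append`, `tree_prepend`); it is not the RTEMS red-black tree code, and it leaves out the recolouring and rotations which a real red-black tree needs after linking the new node. It only shows that the key comparisons of the general insert can be skipped if the caller knows that the new node is the maximum or minimum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the RTEMS RBTree_Node. */
typedef struct node {
  struct node *left;
  struct node *right;
  int          key;
} node;

typedef struct {
  node  *root;
  size_t comparisons; /* key comparisons done by tree_insert() */
} tree;

static void node_init(node *n, int key)
{
  n->left = NULL;
  n->right = NULL;
  n->key = key;
}

/* General insert: descend from the root and compare keys on each level. */
static void tree_insert(tree *t, node *n)
{
  node **link = &t->root;

  while (*link != NULL) {
    ++t->comparisons;
    link = (n->key < (*link)->key) ? &(*link)->left : &(*link)->right;
  }

  *link = n;
}

/* Append: the caller guarantees that n is the new maximum. */
static void tree_append(tree *t, node *n)
{
  node **link = &t->root;

  while (*link != NULL) {
    /* Debug-only check of the precondition; compiled out with NDEBUG. */
    assert((*link)->key <= n->key);
    link = &(*link)->right; /* always go right, no ordering decision */
  }

  *link = n;
}

/* Prepend: the caller guarantees that n is the new minimum. */
static void tree_prepend(tree *t, node *n)
{
  node **link = &t->root;

  while (*link != NULL) {
    assert(n->key <= (*link)->key);
    link = &(*link)->left; /* always go left, no ordering decision */
  }

  *link = n;
}

static const node *tree_min(const tree *t)
{
  const node *n = t->root;

  while (n != NULL && n->left != NULL) {
    n = n->left;
  }

  return n;
}

static const node *tree_max(const tree *t)
{
  const node *n = t->root;

  while (n != NULL && n->right != NULL) {
    n = n->right;
  }

  return n;
}
```

In this sketch, inserting keys 20 and 10 with `tree_insert()` costs one comparison, while a following `tree_append()` of 30 and `tree_prepend()` of 5 leave the comparison counter unchanged.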

comment:3 Changed on 11/23/21 at 13:35:10 by Sebastian Huber <sebastian.huber@…>

In 45e942d/rtems:

score: Rename _Scheduler_Set_idle_thread()

Rename _Scheduler_Set_idle_thread() to _Scheduler_Node_set_idle_user() and move
it to <rtems/score/schedulernodeimpl.h>.

Update #4531.

comment:4 Changed on 11/23/21 at 13:35:13 by Sebastian Huber <sebastian.huber@…>

In 7ae4f569/rtems:

score: Not set CPU in _Scheduler_Use_idle_thread()

Do not set the CPU of the idle thread in _Scheduler_Use_idle_thread(). This
helps to use _Scheduler_Try_to_schedule_node() under more general conditions in
the future, for example in case the owner and user of a node are not the same.

Update #4531.

comment:5 Changed on 11/23/21 at 13:35:17 by Sebastian Huber <sebastian.huber@…>

In f0f60a1/rtems:

score: Change _Scheduler_Try_to_schedule_node()

Add the victim node as parameter instead of the idle thread.

Update #4531.

comment:6 Changed on 11/23/21 at 13:35:20 by Sebastian Huber <sebastian.huber@…>

In bd55f69/rtems:

score: Simplify _Scheduler_Exchange_idle_thread()

Remove superfluous idle parameter.

Update #4531.

comment:7 Changed on 11/23/21 at 13:35:24 by Sebastian Huber <sebastian.huber@…>

In e787091/rtems:

score: Add missing idle thread releases

Update #4531.

comment:8 Changed on 11/23/21 at 13:35:28 by Sebastian Huber <sebastian.huber@…>

In 81659420/rtems:

score: Add missing idle thread exchanges

Update #4531.

comment:9 Changed on 11/23/21 at 13:35:31 by Sebastian Huber <sebastian.huber@…>

In 6286a40/rtems:

score: Scheduler insert after move

Insert nodes after moving the second node to reduce the items in the
data structure for the insert operation. This also avoids having two
nodes for the same processor inserted into the scheduled chain.

Update #4531.

comment:10 Changed on 11/23/21 at 13:35:35 by Sebastian Huber <sebastian.huber@…>

In 757a1096/rtems:

score: Remove return value from enqueue scheduled

The return value was unused. Remove it.

Update #4531.

comment:11 Changed on 11/23/21 at 13:35:38 by Sebastian Huber <sebastian.huber@…>

In a53229bb/rtems:

score: Use extract from scheduled callbacks

Use the extract from scheduled callback provided by the scheduler
implementation in the SMP scheduler framework.

Update #4531.

comment:12 Changed on 11/23/21 at 13:35:42 by Sebastian Huber <sebastian.huber@…>

In 9d3e8212/rtems:

score: Rework affine ready queue handling

Rework the handling of the affine ready queue for the EDF SMP scheduler.
Do the queue handling in the node insert, move, and extract operations.
Remove the queue handling from _Scheduler_EDF_SMP_Allocate_processor().

Update #4531.

comment:13 Changed on 11/23/21 at 13:35:45 by Sebastian Huber <sebastian.huber@…>

In 75527ef3/rtems:

score: Optimize SMP EDF move to ready operation

If a node is moved from the scheduled chain to the ready queue, then we
know that it is the highest priority ready node. So, it can be
prepended to the ready queue without doing any comparisons.

Update #4531.

comment:14 Changed on 11/23/21 at 13:35:49 by Sebastian Huber <sebastian.huber@…>

In 3781709/rtems:

score: Add SMP scheduler idle exchange callback

Update #4531.

comment:15 Changed on 11/23/21 at 13:35:56 by Sebastian Huber <sebastian.huber@…>

In ff20bc9/rtems:

score: Rework idle handling in SMP schedulers

This patch fixes an issue with the idle thread handling in the SMP
scheduler framework used for the MrsP locking protocol. The approach to
use a simple chain of unused idle threads is broken for schedulers which
support thread to processor affinity. The reason is that the thread to
processor affinity introduces another ordering indicator which may under
certain conditions lead to a reordering of idle threads in the scheduled
chain. This reordering is not propagated to the chain of unused idle
threads. As a result, an idle thread which is already in use could be
selected for a sticky scheduler node. This locks up the system in
infinite loops in the thread context switch procedure.

To fix this, the SMP scheduler implementations must now provide
callbacks to get and release an unused idle thread.

Update #4531.
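
The callback approach can be sketched roughly as follows. All names here (`scheduler`, `idle_thread`, `get_idle_fn`, `release_idle_fn`, `framework_get_idle`, `framework_release_idle`, `CPU_COUNT`) are hypothetical and do not reflect the actual RTEMS types or signatures. The sketch assumes that the framework keeps no idle thread list of its own and that the scheduler implementation is the only place which tracks which idle threads are in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_COUNT 4 /* assumed processor count for this sketch */

typedef struct {
  int  cpu;
  bool in_use;
} idle_thread;

typedef struct scheduler scheduler;

/* Callbacks which a concrete scheduler implementation has to provide. */
typedef idle_thread *(*get_idle_fn)(scheduler *s);
typedef void (*release_idle_fn)(scheduler *s, idle_thread *idle);

struct scheduler {
  idle_thread     idle[CPU_COUNT];
  get_idle_fn     get_idle;
  release_idle_fn release_idle;
};

/* Example implementation: hand out the first idle thread not in use. */
static idle_thread *scan_get_idle(scheduler *s)
{
  for (size_t i = 0; i < CPU_COUNT; ++i) {
    if (!s->idle[i].in_use) {
      s->idle[i].in_use = true;
      return &s->idle[i];
    }
  }

  return NULL; /* no unused idle thread available */
}

static void scan_release_idle(scheduler *s, idle_thread *idle)
{
  (void) s;
  assert(idle->in_use); /* releasing an unused idle thread is a bug */
  idle->in_use = false;
}

static void scheduler_init(scheduler *s)
{
  for (size_t i = 0; i < CPU_COUNT; ++i) {
    s->idle[i].cpu = (int) i;
    s->idle[i].in_use = false;
  }

  s->get_idle = scan_get_idle;
  s->release_idle = scan_release_idle;
}

/* Framework side: delegate to the implementation, keep no own state. */
static idle_thread *framework_get_idle(scheduler *s)
{
  return s->get_idle(s);
}

static void framework_release_idle(scheduler *s, idle_thread *idle)
{
  s->release_idle(s, idle);
}
```

Since the in-use state lives in exactly one place in this sketch, a reordering inside the implementation cannot get out of sync with a separate framework list, which is the failure mode described above.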

comment:16 Changed on 11/23/21 at 13:36:09 by Sebastian Huber <sebastian.huber@…>

In dcd8b93/rtems:

score: Move _Scheduler_Block_node()

Move _Scheduler_Block_node() into _Scheduler_SMP_Block(). This simplifies the
code and makes it easier to review.

Update #4531.

comment:17 Changed on 11/23/21 at 13:36:13 by Sebastian Huber <sebastian.huber@…>

In c6362f6/rtems:

score: Move _Scheduler_Unblock_node()

Move _Scheduler_Unblock_node() into _Scheduler_SMP_Unblock(). This simplifies
the code and makes it easier to review.

Update #4531.

comment:18 Changed on 11/23/21 at 13:36:16 by Sebastian Huber <sebastian.huber@…>

In d0434b88/rtems:

score: Remove victim thread from CPU allocation

Update #4531.

comment:19 Changed on 11/23/21 at 13:36:20 by Sebastian Huber <sebastian.huber@…>

In 4d90289e/rtems:

score: _Scheduler_SMP_Schedule_highest_ready()

Simplify callers of _Scheduler_SMP_Schedule_highest_ready(). Move the node
state change and the extraction from scheduled into
_Scheduler_SMP_Schedule_highest_ready(). Move the idle thread release to the
callers, which have more information about the presence of an idle thread.

Update #4531.

comment:20 Changed on 11/23/21 at 13:36:23 by Sebastian Huber <sebastian.huber@…>

In fc64e837/rtems:

score: Rework ask for help requests

Process ask for help requests on the current processor. This avoids
using inter-processor interrupts to make the system behaviour a bit more

Update #4531.

comment:21 Changed on 11/23/21 at 13:36:27 by Sebastian Huber <sebastian.huber@…>

Resolution: fixed
Status: assigned → closed

In 6443c9d/rtems:

score: Fix assertion in SMP scheduler framework

Properly assert that the scheduled chain is not empty. Fix formatting.

Close #4531.
