Notice: We have migrated to GitLab launching 2024-05-01 see here: https://gitlab.rtems.org/

#1270 closed defect (fixed)

[libblock, bdbuf] bdbuf race conditions

Reported by: strauman Owned by: Joel Sherrill
Priority: normal Milestone: 4.7
Component: lib/block Version: 4.7
Severity: normal Keywords:
Cc: chrisj@…, victor.vengerov@… Blocked By:
Blocking:

Description

I believe I identified two race conditions in cpukit/libblock/src/bdbuf.c:

A) Assume two threads of different priorities try to read the same block

with rtems_bdbuf_read():

1) low-priority thread calls find_or_assign_buffer(). Assuming the target

block is not cached the low-priority thread obtains a fresh buffer
and proceeds starting a data-transfer from disk. Eventually, the task
blocks on the buffer's transfer_sem (in rtems_bdbuf_read()).

2) at this point, the low-priority task is blocked and the high-priority

task is scheduled. It calls find_or_assign_buffer() and finds the
buffer that was already allocated to the low-priority task.
Since a data-transfer is still in progress, the high-priority task
finds that the buffer's 'in_progress' flag is set and therefore
blocks on the transfer_sema.

3) both tasks are now blocked on the transfer_sema. The low-priority task

is waiting in rtems_bdbuf_read(), the high_priority task waits in
find_or_assign_buffer().

4) data transfer completes, ISR calls bdbuf_read_transfer_done() which

releases and also flushes the transfer_sema (I suspect this sequence
was chosen hoping that the task blocking inside rtems_bdbuf_read()
would unblock first and other tasks that possibly wait in
find_or_assign_buffer() would be scheduled later - however, if
CORE_mutex_surrender; CORE_mutex_flush are called from an ISR
and multiple tasks are unblocked then the CPU is eventually
dispatched to the highest-priority thread and not necessarily
to the one that was unblocked first).


5) ISR terminates and the scheduler picks the highest priority task to

run. This is the one blocking in find_or_assign_buffer(). The high-
priority task then loops and finds that the 'in_progress' flag is
still set and therefore again blocks on tranfer_sem.

6) at this point, the low-priority task is scheduled again. It finishes

'rtems_bdbuf_read()' by setting the 'actual' flag to TRUE and the
'in_progress' flag to FALSE.

==> second task is still blocked in find_or_assign_buffer()

PROPOSED FIX: set 'actual' and reset 'in_progress' flags from bdbuf_read_transfer_done().

B) Another race condition has already been described in an email

Assume two threads concurrently read the same block
(rtems_bdbuf_read()) which shall not be cached yet.
We also assume that enough free blocks are available.

Hence, the first thread runs w/o preemption through

1 obtaining a buffer (find_or_assign_buffer())
2 setting up a read request to disk
3 initializing bd_buf's transfer semaphore

until

4 blocking on the transfer semaphore

at this point, the first thread is suspended blocking
for the semaphore so that the second thread can
enter the critical section. The second thread enters
find_or_assign_buffer(), finds the buffer already allocated
by thread 1 and executes the code fragment

while ( bd_buf->in_progress != 0 )
{

/* Potential for race condition if IRQ happens here */
rtems_interrupt_disable(level)
_CORE_mutex_Seize(&bd_buf->transfer_sema, ...)

}

If an IRQ signalling completion of the read transfer
happens at the marked point above ('in_progress' has
been set to 1 by thread #1 when scheduling the disk read transfer
[step 2 above]) then the ISR releases and flushes the transfer_sema
(bdbuf_read_transfer_done()) which eventually results in thread #1
resuming execution.
Thread #2, however, which is not yet blocked on the transfer_sema
will subsequently do so and therefore hang.

IMO the loop should be modified to

rtems_interrupt_disable()
while (bd_buf->in_progress)
{

_CORE_mutex_Seize()
rtems_interrupt_disable()

}
rtems_interrupt_enable()

Accordant modifications are needed at similar locations in the
code.

Attachments (1)

bdbuf.c.fix_race_conds.diff (5.4 KB) - added by strauman on 01/06/08 at 04:38:04.
patch to fix mentioned race conditions

Download all attachments as: .zip

Change History (4)

Changed on 01/06/08 at 04:38:04 by strauman

Attachment: bdbuf.c.fix_race_conds.diff added

patch to fix mentioned race conditions

comment:1 Changed on 01/06/08 at 05:01:26 by strauman

Summary: [libblock] bdbuf race conditions[libblock, bdbuf] bdbuf race conditions

comment:2 Changed on 08/06/08 at 03:21:45 by Chris Johns

Cc: Chris Johns added
Resolution: fixed
Status: newclosed

Fixed with a rewrite the bdbuf code. The pre-emption disable and SCORE mutex access has been removed. These patches not longer apply.

comment:3 Changed on 10/10/17 at 06:49:19 by Sebastian Huber

Component: scorelib/block
Note: See TracTickets for help on using tickets.