#3334 closed defect (fixed)

deadlock in _once()

Reported by: Stavros Passas Owned by: Sebastian Huber
Priority: normal Milestone: 5.1
Component: posix Version: 5
Severity: normal Keywords:
Cc: Blocked By:
Blocking:

Description

RTEMS threads lock up when certain C++ functionality is used.
For example, the issue occurs when std::future is combined with std::async.

Investigating deeper, it seems this happens if std::async executes before std::future gets scheduled to run. Both of these create a pthread_once instance.

_once() uses a common semaphore for all calls. Thus the first function (usually async.get) gets the lock and calls its “init” function, which blocks until the second function has completed. After this, std::future also uses pthread_once to execute, but because the lock is already taken, it also blocks, causing a deadlock.

Attached you can find a test application that reproduces the deadlock.
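
The interlocking pattern described above can be sketched in plain C. This is not the attached test application; it is a minimal illustration that assumes two independent pthread_once_t objects, and all helper names (init_a, init_b, run_interlocked_once) are hypothetical. The init routine of the first object waits for the init routine of the second. With one lock per pthread_once_t object this completes; with a single lock held across the init call, thread_b would block on that lock and neither thread would make progress.

```c
#include <assert.h>
#include <pthread.h>

/* Two independent once controls, as created by std::async and std::future. */
static pthread_once_t once_a = PTHREAD_ONCE_INIT;
static pthread_once_t once_b = PTHREAD_ONCE_INIT;

/* Ordinary flag plus condition variable used by init_a to wait for init_b. */
static pthread_mutex_t flag_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t flag_cond = PTHREAD_COND_INITIALIZER;
static int a_finished;
static int b_finished;

/* Blocks until init_b has completed. */
static void init_a(void)
{
  pthread_mutex_lock(&flag_lock);
  while (!b_finished) {
    pthread_cond_wait(&flag_cond, &flag_lock);
  }
  a_finished = 1;
  pthread_mutex_unlock(&flag_lock);
}

/* Signals completion so that init_a can return. */
static void init_b(void)
{
  pthread_mutex_lock(&flag_lock);
  b_finished = 1;
  pthread_cond_broadcast(&flag_cond);
  pthread_mutex_unlock(&flag_lock);
}

static void *thread_a(void *arg)
{
  (void) arg;
  int rc = pthread_once(&once_a, init_a);
  assert(rc == 0);
  (void) rc;
  return NULL;
}

static void *thread_b(void *arg)
{
  (void) arg;
  /* With one global lock held across init_a, this call would never return. */
  int rc = pthread_once(&once_b, init_b);
  assert(rc == 0);
  (void) rc;
  return NULL;
}

/* Returns 1 if both init routines ran to completion. */
int run_interlocked_once(void)
{
  pthread_t ta;
  pthread_t tb;
  int rc;

  rc = pthread_create(&ta, NULL, thread_a, NULL);
  assert(rc == 0);
  rc = pthread_create(&tb, NULL, thread_b, NULL);
  assert(rc == 0);
  (void) rc;

  pthread_join(ta, NULL);
  pthread_join(tb, NULL);
  return a_finished && b_finished;
}
```

On an implementation that serializes all pthread_once() calls behind one lock, run_interlocked_once() hangs whenever thread_a enters first, which matches the behaviour reported here.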

Attachments (2)

Add-test-executing-interlocking-pthread_once.patch (11.8 KB) - added by Stavros Passas on Mar 13, 2018 at 1:19:22 PM.
Test Application
3334-Fix-pthread_once-deadlock.patch (3.7 KB) - added by Stavros Passas on Mar 13, 2018 at 1:28:56 PM.
Proposed fix


Change History (12)

Changed on Mar 13, 2018 at 1:19:22 PM by Stavros Passas

Test Application

comment:1 Changed on Mar 13, 2018 at 1:22:26 PM by Stavros Passas

Copying Sebastian's suggestion from the mailing list about this issue:

"Please open a ticket and provide a test case for the RTEMS test suite. Maybe we have to use dedicated mutexes for each pthread_once_t object. This is what Linux and FreeBSD do. This would require a Newlib update."

Changed on Mar 13, 2018 at 1:28:56 PM by Stavros Passas

Proposed fix

comment:2 Changed on Mar 13, 2018 at 1:34:02 PM by Stavros Passas

I agree with Sebastian that using one dedicated mutex for each pthread_once_t instance would be the more elegant long-term solution, but it would also add overhead for each pthread_t instance.

I am adding a different proposed solution, which requires neither newlib changes nor an increase in the pthread_t size:

The _once implementation uses a single mutex. Currently this mutex protects the whole function, while I believe we only need to protect reads and writes of the once_state variable. Concurrent tasks that find the state set to RUNNING could simply yield until the state becomes ONCE_STATE_COMPLETE.
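
A rough sketch of this idea, using POSIX primitives so it can be built outside RTEMS. It is not the attached patch: the type yield_once_state, the function yield_once() and the demo helpers are hypothetical names, and the state values only mirror the ones mentioned above. The single mutex is held only while the state is read or updated, never while the init routine runs.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

typedef enum {
  YIELD_ONCE_INIT,
  YIELD_ONCE_RUNNING,
  YIELD_ONCE_COMPLETE
} yield_once_state;

/* One lock shared by all once objects; it guards the state variable only. */
static pthread_mutex_t yield_once_lock = PTHREAD_MUTEX_INITIALIZER;

int yield_once(yield_once_state *state, void (*init)(void))
{
  assert(state != NULL && init != NULL);

  for (;;) {
    yield_once_state seen;

    pthread_mutex_lock(&yield_once_lock);
    seen = *state;
    if (seen == YIELD_ONCE_INIT) {
      *state = YIELD_ONCE_RUNNING; /* this caller will run init */
    }
    pthread_mutex_unlock(&yield_once_lock);

    if (seen == YIELD_ONCE_COMPLETE) {
      return 0;
    }

    if (seen == YIELD_ONCE_INIT) {
      init(); /* runs without the lock held */
      pthread_mutex_lock(&yield_once_lock);
      *state = YIELD_ONCE_COMPLETE;
      pthread_mutex_unlock(&yield_once_lock);
      return 0;
    }

    sched_yield(); /* another caller is running init; try again later */
  }
}

/* Demo: four threads race on one control; init must run exactly once. */
static yield_once_state demo_state = YIELD_ONCE_INIT;
static int demo_init_calls;

static void demo_init(void)
{
  ++demo_init_calls;
}

static void *demo_thread(void *arg)
{
  (void) arg;
  yield_once(&demo_state, demo_init);
  return NULL;
}

/* Returns the number of times demo_init was executed. */
int yield_once_demo(void)
{
  pthread_t threads[4];
  int i;

  for (i = 0; i < 4; ++i) {
    int rc = pthread_create(&threads[i], NULL, demo_thread, NULL);
    assert(rc == 0);
    (void) rc;
  }
  for (i = 0; i < 4; ++i) {
    pthread_join(threads[i], NULL);
  }
  return demo_init_calls;
}
```

Because the lock is released before init() is called, an init routine that waits on a different once object no longer blocks the callers of that object.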

comment:3 Changed on Mar 13, 2018 at 2:12:26 PM by Sebastian Huber

Milestone: 4.11.4 → 5.1

Please send patches to the mailing list.

The yield loop may fail if thread priorities come into play. It should be replaced with a condition variable. So, for the once implementation we need a mutex and a condition variable (#include <rtems/thread.h>). There is currently no condition variable with API mutex support. We need protection from asynchronous deletion. Maybe use _Thread_Set_life_protection() directly in _Once().

If we want to back port this fix to RTEMS 4.11, then we have to use <sys/lock.h> instead of <rtems/thread.h>.
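
The mutex plus condition variable approach suggested in this comment might look roughly like the following. This is a sketch under stated assumptions, not the code that was eventually committed: it uses POSIX pthread_mutex_t and pthread_cond_t as stand-ins for the primitives from <rtems/thread.h>, it omits the protection from asynchronous deletion, and the names cv_once_state, cv_once() and cv_once_demo() are hypothetical. Waiters sleep on the condition variable instead of yielding, so a high-priority waiter cannot starve the thread that runs the init routine.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

typedef enum { CV_ONCE_INIT, CV_ONCE_RUNNING, CV_ONCE_COMPLETE } cv_once_state;

/* Shared by all once objects; the lock is never held while init runs. */
static pthread_mutex_t cv_once_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv_once_done = PTHREAD_COND_INITIALIZER;

int cv_once(cv_once_state *state, void (*init)(void))
{
  assert(state != NULL && init != NULL);

  pthread_mutex_lock(&cv_once_lock);
  if (*state == CV_ONCE_INIT) {
    *state = CV_ONCE_RUNNING;
    pthread_mutex_unlock(&cv_once_lock);
    init();
    pthread_mutex_lock(&cv_once_lock);
    *state = CV_ONCE_COMPLETE;
    /* Wake all waiters; each one rechecks the state of its own object. */
    pthread_cond_broadcast(&cv_once_done);
  } else {
    while (*state != CV_ONCE_COMPLETE) {
      pthread_cond_wait(&cv_once_done, &cv_once_lock);
    }
  }
  pthread_mutex_unlock(&cv_once_lock);
  return 0;
}

/* Demo: the init routine of object A waits for the init routine of object B. */
static cv_once_state once_a = CV_ONCE_INIT;
static cv_once_state once_b = CV_ONCE_INIT;
static atomic_int a_done;
static atomic_int b_done;

static void init_a(void)
{
  while (!atomic_load(&b_done)) {
    sched_yield();
  }
  atomic_store(&a_done, 1);
}

static void init_b(void)
{
  atomic_store(&b_done, 1);
}

static void *run_a(void *arg)
{
  (void) arg;
  cv_once(&once_a, init_a);
  return NULL;
}

static void *run_b(void *arg)
{
  (void) arg;
  cv_once(&once_b, init_b);
  return NULL;
}

/* Returns 2 if both init routines completed. */
int cv_once_demo(void)
{
  pthread_t ta;
  pthread_t tb;
  int rc;

  rc = pthread_create(&ta, NULL, run_a, NULL);
  assert(rc == 0);
  rc = pthread_create(&tb, NULL, run_b, NULL);
  assert(rc == 0);
  (void) rc;

  pthread_join(ta, NULL);
  pthread_join(tb, NULL);
  return atomic_load(&a_done) + atomic_load(&b_done);
}
```

A thread that calls cv_once() on the same object from inside its own init routine would wait forever in this sketch, since the state stays at CV_ONCE_RUNNING.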

comment:4 Changed on Oct 14, 2018 at 1:12:03 AM by Chris Johns

What is happening with this ticket?

comment:5 Changed on Oct 14, 2018 at 1:13:12 AM by Chris Johns

Version: 4.11 → 5

comment:6 Changed on Oct 14, 2018 at 8:27:03 PM by Chris Johns

Milestone: 5.1 → Indefinite
Status: new → assigned
Version: 5 → 6

comment:7 Changed on Oct 15, 2018 at 5:05:17 AM by Sebastian Huber

Owner: set to Sebastian Huber

comment:8 Changed on Feb 12, 2019 at 12:20:52 PM by Sebastian Huber

Milestone: Indefinite → 5.1
Status: assigned → accepted
Version: 6 → 5

comment:9 Changed on Feb 18, 2019 at 6:26:41 AM by Sebastian Huber <sebastian.huber@…>

Resolution: fixed
Status: accepted → closed

In e4ad14cc/rtems:

score: Avoid some deadlocks in _Once()

Recursive usage of the same pthread_once_t results now in a deadlock.
Previously, an error of EINVAL was returned. This usage scenario is
invalid according to the POSIX pthread_once() specification.

Close #3334.

comment:10 Changed on Feb 18, 2019 at 7:34:12 AM by Sebastian Huber <sebastian.huber@…>

In 3d65f45/rtems:

psxtests/psxonce01: Fix typo

Update #3334.
