#2002 closed defect (wontfix)

ioctl recursive perimeter lock driver deadlock vulnerability

Reported by: Jeffrey Hill Owned by: Joel Sherrill
Priority: highest Milestone: 4.11.2
Component: network/legacy Version: 4.11
Severity: normal Keywords:
Cc: Blocked By:
Blocking:

Description (last modified by Chris Johns)

In summary, a generalized deadlock potential exists any time rtems_bsdnet_ioctl calls rtems_bsdnet_ifconfig which calls the driver, and the driver tries to release the bsd networking semaphore, but the lock count doesn't decrement to zero, so the lock is never released.

What happened to me (when writing an Altera Triple Speed Ethernet Driver for NIOS2) was as follows (names here are slightly different than reality). Of course other scenarios are possible.

user calls
rtems_bsdnet_ioctl which takes bsd stack lock, it calls
rtems_bsdnet_ifconfig which locks bsd stack recursively, it calls
driver_ioctl function when setting IF_UP flag to true, it calls
driver_begin_communicating and it discovers it is already communicating, it calls
driver_stop_communicating which iscovers that tx/rx threads are running, it calls
bsd_locking_semaphore_release while waiting for the tx/rx threads to shutdown
rip

I fixed this of by changing to a noop if they set IF_UP flag and the driver is already up and running, but sometimes that might be less than robust because we are not forcing a restart of the auxiliary threads. Furthermore, if the user sets the UP flag to false then we cant avoid this issue; we will definitely need to release the lock when the driver threads are forced to exit?

POTENTIAL FIX:
Usually what is done is to make a rtems_bsdnet_ifconfig_nolock_private function and then call it form both rtems_bsdnet_ioctl and rtems_bsdnet_ifconfig; presumably the perimeter functions must lock only once on the way in, or in any case thats a common convention with multi-threaded code.

On Jan 30, 2012, at 12:30 PM, Hill, Jeffrey O wrote:


From: Eric Norum
Sent: Monday, January 30, 2012 11:21 AM
To: Hill, Jeffrey O
Cc: Till Straumann
Subject: Re: rtems bsd network deadlock potential

The network mutex is to be taken whenever making the transition from
'user' code from 'kernel' code. I did this because the BSD kernel from
which the networking code was lifted was, like many (all?) old UNIXes,
non-reentrant. It's possible that over the years some code has been
added to the IOCTL support that ends up calling a 'user' level routine
from 'kernel' level which then calls some 'kernel' code again. This
should be fixed. kernel code should never call user code -- just to avoid
the nested mutex problem that Jeff is reporting. Perhaps some IOCTL
routine need to be split up with a user-level wrapper that takes the mutex
then calls the kernel level routine -- and that kernel level routine
should be what any other kernel level code invokes.

I'm afraid that I don't have time to look at this now.

On Jan 30, 2012, at 9:30 AM, Hill, Jeffrey O wrote:

It could well be that the intention is that rtems_bsdnet_ioctl()

executes

atomically w/o the driver temporarily releasing the lock and doing
communication. That could alter internal state in unintended ways.

Ok, maybe this is just part of the design, but I am left with some

doubts if this type of (taking the lock twice to prevent the state from
changing while in the driver) enforcement policy is applied uniformly. It
might even be that this is in place purely because of accidental
inconsistencies in the way the lock is acquired on the way in.

Considering this further, isn't it quite routine and normal for the

driver to shutdown auxiliary threads (which take the lock) when inside the
driver ioctl function if the user sets the UP flag to false? Presumably
this can't be done reliably w/o releasing the lock in the driver?

Of course the RTEMS designers, who know all of the consequences will

need to decide. I am only identifying what appear to be issues when I see
them.

Jeff


From: Till Straumann
Sent: Monday, January 30, 2012 10:07 AM
To: Hill, Jeffrey O
Cc: Eric Norum
Subject: Re: rtems bsd network deadlock potential

I see. However, I'm not sure if that is not a programming error in the
driver.
It could well be that the intention is that rtems_bsdnet_ioctl()

executes

atomically w/o the driver temporarily releasing the lock and doing
communication.
That could alter internal state in unintended ways.

T.

On 01/30/2012 10:58 AM, Hill, Jeffrey O wrote:

Hi Till,

What happened to me was as follows (names are slightly different than

reality), but of course other scenarios are possible.

rtems_bsdnet_ioctl calls (it locks), it calls
rtems_bsdnet_ifconfig calls (it locks recursively), it calls
driver_ioctl function (because IF_UP flag is being set to true), it

calls

driver_begin_communicating (which discovers that it is already

communicating), it calls

driver_stop_communicating (which discovers that tx/rx threads are

running), it calls

bsd_locking_semaphore_release (while waiting for the tx/rx threads to

shutdown)

rip

I fixed this of course by changing to a noop if they set IF_UP flag

and

the driver is already up and running, but sometimes that might be less
robust because we are not forcing a restart of the auxiliary threads.

In summary, a generalized deadlock potential exists any time

rtems_bsdnet_ioctl calls rtems_bsdnet_ifconfig which calls the driver,

and

the driver tries to release the semaphore, but the lock count doesn't
decrement to zero, so the lock is never released.

Usually what is done is to make a rtems_bsdnet_ifconfig_nolock_private

and then call it form both rtems_bsdnet_ioctl and

rtems_bsdnet_ifconfig;

the perimeter functions must lock only once on the way in.

Jeff


From: Till Straumann
Sent: Friday, January 27, 2012 3:36 PM
To: Hill, Jeffrey O
Cc: Eric Norum
Subject: Re: rtems bsd network deadlock potential

Maybe I'm missing something but AFAIK the networking semaphore is
basically
a mutex which you can take multiple times from the same thread.

Could you please explain in more detail?

T.

On 01/27/2012 04:28 PM, Hill, Jeffrey O wrote:

Hi Eric, Till,

FWIW, I noticed today that there is a situation where

rtems_bsdnet_ioctl

calls rtems_bsdnet_ifconfig, but both functions take the bsd

networking

semaphore resulting in a recursive reference counted lock. Therefore

if

the driver's implementation of ioctl calls rtems_bsdnet_event_receive
there will be a deadlock (because the internal attempt to unlock is
silently unsuccessful). I will no-doubt try to come up with a

workaround

but perhaps the situation is somewhat precarious.

Is this serious enough that I should report a bug to the RTEMS bug

tracking system?

#0 ( rtems_bsdnet_event_receive(event_in=8, option_set=0, ticks=0,

event_out=0xa7a9f4) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_glue.c:687)

#1 0x5f34 alt_tse_soft_tx_stop(pSoftSgdmaTx=0xb24084)

(/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:206)

#2 0x5fa8 alt_tse_soft_tx_destroy(pSoftSgdmaTx=0xb24084)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:216)

#3 0x8808 alt_tse_stop_comm(ifp=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1554)

#4 0x88a8 alt_tse_start_comm(pParm=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1576)

#5 0x8a90 alt_tse_start_comm_no_status(pParm=0xb23c3c)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1651)

#6 0xe5a8 ether_ioctl(ifp=0xb23c3c, command=1, data=<value

optimized

out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/net/if_ethersubr.c:838)

#7 0x8bc0 alt_tse_ioctl(ifp=0xb23c3c, cmmd=2149607692,

data=0xb24648

"\210F\262") (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1680)

#8 0x3272c in_ifinit(ifp=0xb23c3c, ia=0xb24648, sin=<value

optimized

out>, scrub=1) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/netinet/in.c:480)

#9 0x331a0 in_control(so=<value optimized out>, cmd=2149607692,

data=0xa7aba0 "tse0", ifp=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-

4.11.0-/cpukit/libnetworking/netinet/in.c:312)

#10 0x2632c old_control(so=0x0, cmd=10987900, data=0xa7a9f4

"\034\252\247", ifp=<value optimized out>) (/home/hill/nios2-
rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/kern/uipc_socket2.c:801)

#11 0xfcc8 ifioctl(so=0xb23e08, cmd=1, data=0xa7aba0 "tse0",

p=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/net/if.c:605)

#12 0x1c3e8 so_ioctl(iop=0xaf2544, command=1, buffer=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_syscall.c:713)

#13 ( rtems_bsdnet_ioctl(iop=0xaf2544, command=1, buffer=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_syscall.c:731)

#14 0x3093c ioctl(fd=<value optimized out>, command=1)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libcsupport/src/ioctl.c:50)

#15 0x194b8 rtems_bsdnet_ifconfig(ifname=0x4afb4 "tse0",

cmd=2149607692, param=0xa7abe0) (/home/hill/nios2-rtems/rtems/rtems-
4.11.0-/cpukit/libnetworking/rtems/rtems_glue.c:1114)

#16 0x19718 rtems_bsdnet_setup_interface(name=0x4afb4 "tse0",

ip_address=0x4afbc "128.165.34.102", ip_netmask=0x4afcc

"255.255.255.0")

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_glue.c:879)

#17 0x19d88 rtems_bsdnet_setup() (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/rtems/rtems_glue.c:959)

#18 ( rtems_bsdnet_initialize_network() (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/rtems/rtems_glue.c:1018)

#19 0x360 Init(ignored=336840) (init.c:51)
#20 0x3a268 _Thread_Handler() (/home/hill/nios2-rtems/rtems/rtems-

4.11.0-/cpukit/score/src/threadhandler.c:157)

#21 0x132c boot_card(cmdline=0xa74338 "DD\247") (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/../../shared/bootcard.c:268)

#22 ( 0x00000000 in ??() (??:??)

Jeff

--
Eric Norum

--
Eric Norum

Change History (7)

comment:1 Changed on 01/31/12 at 14:29:35 by Jeffrey Hill

In summary, a generalized deadlock potential exists any time rtems_bsdnet_ioctl calls rtems_bsdnet_ifconfig which calls the driver, and the driver tries to release the bsd networking semaphore, but the lock count doesn't decrement to zero, so the lock is never released.

What happened to me (when writing an Altera Triple Speed Ethernet Driver for NIOS2) was as follows (names here are slightly different than reality). Of course other scenarios are possible.

user calls
rtems_bsdnet_ioctl which takes bsd stack lock, it calls
rtems_bsdnet_ifconfig which locks bsd stack recursively, it calls
driver_ioctl function when setting IF_UP flag to true, it calls
driver_begin_communicating and it discovers it is already communicating, it calls
driver_stop_communicating which iscovers that tx/rx threads are running, it calls
bsd_locking_semaphore_release while waiting for the tx/rx threads to shutdown
rip

I fixed this of by changing to a noop if they set IF_UP flag and the driver is already up and running, but sometimes that might be less than robust because we are not forcing a restart of the auxiliary threads. Furthermore, if the user sets the UP flag to false then we cant avoid this issue; we will definitely need to release the lock when the driver threads are forced to exit?

POTENTIAL FIX:
Usually what is done is to make a rtems_bsdnet_ifconfig_nolock_private function and then call it form both rtems_bsdnet_ioctl and rtems_bsdnet_ifconfig; presumably the perimeter functions must lock only once on the way in, or in any case thats a common convention with multi-threaded code.

On Jan 30, 2012, at 12:30 PM, Hill, Jeffrey O wrote:


From: Eric Norum
Sent: Monday, January 30, 2012 11:21 AM
To: Hill, Jeffrey O
Cc: Till Straumann
Subject: Re: rtems bsd network deadlock potential

The network mutex is to be taken whenever making the transition from
'user' code from 'kernel' code. I did this because the BSD kernel from
which the networking code was lifted was, like many (all?) old UNIXes,
non-reentrant. It's possible that over the years some code has been
added to the IOCTL support that ends up calling a 'user' level routine
from 'kernel' level which then calls some 'kernel' code again. This
should be fixed. kernel code should never call user code -- just to avoid
the nested mutex problem that Jeff is reporting. Perhaps some IOCTL
routine need to be split up with a user-level wrapper that takes the mutex
then calls the kernel level routine -- and that kernel level routine
should be what any other kernel level code invokes.

I'm afraid that I don't have time to look at this now.

On Jan 30, 2012, at 9:30 AM, Hill, Jeffrey O wrote:

It could well be that the intention is that rtems_bsdnet_ioctl()

executes

atomically w/o the driver temporarily releasing the lock and doing
communication. That could alter internal state in unintended ways.

Ok, maybe this is just part of the design, but I am left with some

doubts if this type of (taking the lock twice to prevent the state from
changing while in the driver) enforcement policy is applied uniformly. It
might even be that this is in place purely because of accidental
inconsistencies in the way the lock is acquired on the way in.

Considering this further, isn't it quite routine and normal for the

driver to shutdown auxiliary threads (which take the lock) when inside the
driver ioctl function if the user sets the UP flag to false? Presumably
this can't be done reliably w/o releasing the lock in the driver?

Of course the RTEMS designers, who know all of the consequences will

need to decide. I am only identifying what appear to be issues when I see
them.

Jeff


From: Till Straumann
Sent: Monday, January 30, 2012 10:07 AM
To: Hill, Jeffrey O
Cc: Eric Norum
Subject: Re: rtems bsd network deadlock potential

I see. However, I'm not sure if that is not a programming error in the
driver.
It could well be that the intention is that rtems_bsdnet_ioctl()

executes

atomically w/o the driver temporarily releasing the lock and doing
communication.
That could alter internal state in unintended ways.

T.

On 01/30/2012 10:58 AM, Hill, Jeffrey O wrote:

Hi Till,

What happened to me was as follows (names are slightly different than

reality), but of course other scenarios are possible.

rtems_bsdnet_ioctl calls (it locks), it calls
rtems_bsdnet_ifconfig calls (it locks recursively), it calls
driver_ioctl function (because IF_UP flag is being set to true), it

calls

driver_begin_communicating (which discovers that it is already

communicating), it calls

driver_stop_communicating (which discovers that tx/rx threads are

running), it calls

bsd_locking_semaphore_release (while waiting for the tx/rx threads to

shutdown)

rip

I fixed this of course by changing to a noop if they set IF_UP flag

and

the driver is already up and running, but sometimes that might be less
robust because we are not forcing a restart of the auxiliary threads.

In summary, a generalized deadlock potential exists any time

rtems_bsdnet_ioctl calls rtems_bsdnet_ifconfig which calls the driver,

and

the driver tries to release the semaphore, but the lock count doesn't
decrement to zero, so the lock is never released.

Usually what is done is to make a rtems_bsdnet_ifconfig_nolock_private

and then call it form both rtems_bsdnet_ioctl and

rtems_bsdnet_ifconfig;

the perimeter functions must lock only once on the way in.

Jeff


From: Till Straumann
Sent: Friday, January 27, 2012 3:36 PM
To: Hill, Jeffrey O
Cc: Eric Norum
Subject: Re: rtems bsd network deadlock potential

Maybe I'm missing something but AFAIK the networking semaphore is
basically
a mutex which you can take multiple times from the same thread.

Could you please explain in more detail?

T.

On 01/27/2012 04:28 PM, Hill, Jeffrey O wrote:

Hi Eric, Till,

FWIW, I noticed today that there is a situation where

rtems_bsdnet_ioctl

calls rtems_bsdnet_ifconfig, but both functions take the bsd

networking

semaphore resulting in a recursive reference counted lock. Therefore

if

the driver's implementation of ioctl calls rtems_bsdnet_event_receive
there will be a deadlock (because the internal attempt to unlock is
silently unsuccessful). I will no-doubt try to come up with a

workaround

but perhaps the situation is somewhat precarious.

Is this serious enough that I should report a bug to the RTEMS bug

tracking system?

#0 ( rtems_bsdnet_event_receive(event_in=8, option_set=0, ticks=0,

event_out=0xa7a9f4) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_glue.c:687)

#1 0x5f34 alt_tse_soft_tx_stop(pSoftSgdmaTx=0xb24084)

(/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:206)

#2 0x5fa8 alt_tse_soft_tx_destroy(pSoftSgdmaTx=0xb24084)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:216)

#3 0x8808 alt_tse_stop_comm(ifp=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1554)

#4 0x88a8 alt_tse_start_comm(pParm=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1576)

#5 0x8a90 alt_tse_start_comm_no_status(pParm=0xb23c3c)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1651)

#6 0xe5a8 ether_ioctl(ifp=0xb23c3c, command=1, data=<value

optimized

out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/net/if_ethersubr.c:838)

#7 0x8bc0 alt_tse_ioctl(ifp=0xb23c3c, cmmd=2149607692,

data=0xb24648

"\210F\262") (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/network/if_alttse.c:1680)

#8 0x3272c in_ifinit(ifp=0xb23c3c, ia=0xb24648, sin=<value

optimized

out>, scrub=1) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/netinet/in.c:480)

#9 0x331a0 in_control(so=<value optimized out>, cmd=2149607692,

data=0xa7aba0 "tse0", ifp=0xb23c3c) (/home/hill/nios2-

rtems/rtems/rtems-

4.11.0-/cpukit/libnetworking/netinet/in.c:312)

#10 0x2632c old_control(so=0x0, cmd=10987900, data=0xa7a9f4

"\034\252\247", ifp=<value optimized out>) (/home/hill/nios2-
rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/kern/uipc_socket2.c:801)

#11 0xfcc8 ifioctl(so=0xb23e08, cmd=1, data=0xa7aba0 "tse0",

p=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/net/if.c:605)

#12 0x1c3e8 so_ioctl(iop=0xaf2544, command=1, buffer=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_syscall.c:713)

#13 ( rtems_bsdnet_ioctl(iop=0xaf2544, command=1, buffer=<value

optimized out>) (/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_syscall.c:731)

#14 0x3093c ioctl(fd=<value optimized out>, command=1)

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libcsupport/src/ioctl.c:50)

#15 0x194b8 rtems_bsdnet_ifconfig(ifname=0x4afb4 "tse0",

cmd=2149607692, param=0xa7abe0) (/home/hill/nios2-rtems/rtems/rtems-
4.11.0-/cpukit/libnetworking/rtems/rtems_glue.c:1114)

#16 0x19718 rtems_bsdnet_setup_interface(name=0x4afb4 "tse0",

ip_address=0x4afbc "128.165.34.102", ip_netmask=0x4afcc

"255.255.255.0")

(/home/hill/nios2-rtems/rtems/rtems-4.11.0-
/cpukit/libnetworking/rtems/rtems_glue.c:879)

#17 0x19d88 rtems_bsdnet_setup() (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/rtems/rtems_glue.c:959)

#18 ( rtems_bsdnet_initialize_network() (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-

/cpukit/libnetworking/rtems/rtems_glue.c:1018)

#19 0x360 Init(ignored=336840) (init.c:51)
#20 0x3a268 _Thread_Handler() (/home/hill/nios2-rtems/rtems/rtems-

4.11.0-/cpukit/score/src/threadhandler.c:157)

#21 0x132c boot_card(cmdline=0xa74338 "DD\247") (/home/hill/nios2-

rtems/rtems/rtems-4.11.0-
/c/src/lib/libbsp/nios2/neek/../../shared/bootcard.c:268)

#22 ( 0x00000000 in ??() (??:??)

Jeff

--
Eric Norum

--
Eric Norum

comment:2 Changed on 11/22/14 at 14:35:38 by Gedare Bloom

Description: modified (diff)
Owner: changed from Eric Norum to Joel Sherrill
Status: newassigned

comment:3 Changed on 11/24/14 at 18:58:28 by Gedare Bloom

Version: HEAD4.11

Replace Version=HEAD with Version=4.11 for the tickets with Milestone >= 4.11

comment:4 Changed on 12/19/14 at 05:06:18 by Gedare Bloom

Priority: normalhighest

Bump priority to highest for tickets with a fix attached or seemingly simple fix proposed in the description or comments.

comment:5 Changed on 03/02/15 at 20:42:45 by Gedare Bloom

Milestone: 4.114.11.1

bump milestone

comment:6 Changed on 01/26/17 at 07:16:00 by Sebastian Huber

Milestone: 4.11.14.11.2

comment:7 Changed on 03/21/17 at 04:03:35 by Chris Johns

Description: modified (diff)
Resolution: wontfix
Status: assignedclosed

Use 4.12 and libbsd.

Note: See TracTickets for help on using tickets.