Opened on 08/08/22 at 23:08:09
Last modified on 08/14/22 at 06:13:12
#4698 new defect
Potential Deadlock While Performing RFS Operations
Reported by: | Joel | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | arch/sparc | Version: | 4.11 |
Severity: | normal | Keywords: | RFS |
Cc: | | Blocked By: | |
Blocking: | | | |
Description
I'm hitting what appears to be a deadlock in an RTEMS 4.11 system running on SPARC V8. The file system is configured as RTEMS_FILESYSTEM_TYPE_RFS. The system creates, modifies, and deletes files asynchronously throughout execution. At some point, I started to see intermittent dropouts in system logging. I suspect the onset can be attributed to the volume of file system operations growing as the system grew.
As I investigated, it appeared that the system most commonly never returns from an fopen() call. I modified the system so that readdir() is called periodically (about 50 times per execution) while, in the meantime, files are repeatedly opened and closed. This seems to reproduce the issue every time.
To rule out anything in the system that may be causing or exacerbating the problem, I wrote a standalone application with 2 tasks. One task opens and closes files. The other task reads the directory. The file system type, disk size, device, mount path, etc. are the same as in the full system. I was able to reproduce the issue consistently with this application. That said, the deadlock takes anywhere from a few minutes to almost half an hour to occur. Attached is the implementation of the file system and task initialization.
GDB doesn't seem to give much useful information in either the full system case or the sample application. It is interesting, however, that the backtrace and thread info show the same thing. We seem to end up in the BSP idle thread.
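For readers without access to the attachment, the two-task reproduction described above could be sketched roughly as follows. This is not the attached code: it uses POSIX threads rather than RTEMS task directives, and the names `stress_arg`, `open_close_task`, `readdir_task`, `stress_run`, and the file name `stress.tmp` are hypothetical. The loops are bounded so that the sketch terminates; a deadlock like the one reported would show up as `stress_run()` never returning.

```c
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical per-task parameters; not taken from the ticket's attachment. */
typedef struct {
  const char *dir;   /* directory on the file system under test */
  int iterations;    /* loop count, bounded so the sketch terminates */
  int failures;      /* number of failed calls seen by this task */
} stress_arg;

/* Task 1: repeatedly create, close, and delete one file. */
static void *open_close_task(void *p)
{
  stress_arg *a = p;
  char path[256];

  snprintf(path, sizeof(path), "%s/stress.tmp", a->dir);
  for (int i = 0; i < a->iterations; ++i) {
    FILE *f = fopen(path, "w"); /* the call reported as not returning */
    if (f == NULL) {
      ++a->failures;
      continue;
    }
    fputs("x", f);
    if (fclose(f) != 0) {
      ++a->failures;
    }
    if (unlink(path) != 0) {
      ++a->failures;
    }
  }
  return NULL;
}

/* Task 2: repeatedly walk every entry of the same directory. */
static void *readdir_task(void *p)
{
  stress_arg *a = p;

  for (int i = 0; i < a->iterations; ++i) {
    DIR *d = opendir(a->dir);
    if (d == NULL) {
      ++a->failures;
      continue;
    }
    while (readdir(d) != NULL) {
      /* Entries are only traversed, not inspected. */
    }
    closedir(d);
  }
  return NULL;
}

/* Runs both tasks concurrently. Returns the total failure count,
   or -1 if a thread could not be created. */
int stress_run(const char *dir, int iterations)
{
  assert(dir != NULL && iterations > 0);

  stress_arg w = { dir, iterations, 0 };
  stress_arg r = { dir, iterations, 0 };
  pthread_t tw, tr;

  if (pthread_create(&tw, NULL, open_close_task, &w) != 0) {
    return -1;
  }
  if (pthread_create(&tr, NULL, readdir_task, &r) != 0) {
    pthread_join(tw, NULL);
    return -1;
  }
  pthread_join(tw, NULL);
  pthread_join(tr, NULL);
  return w.failures + r.failures;
}
```

Calling `stress_run()` with the RFS mount path and a large iteration count would approximate the scenario in the description. Mounting and formatting the RFS volume is outside the scope of this sketch.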
Attachments (2)
Change History (3)
Changed on 08/08/22 at 23:08:43 by Joel
Changed on 08/14/22 at 06:10:23 by Chris Johns
comment:1 Changed on 08/14/22 at 06:13:12 by Chris Johns
I have updated the code to run on RTEMS 6 and to test the locking, and it is working. I have tested it on sparc/erc32 and arm/xilinx_zynq_a9_qemu with SMP and I do not see an issue.
Updated to work on RTEMS 6