#2622 closed defect (fixed)
FAT file corruption when pre-empted while appending to a file
Reported by: | Stella Laurenzo | Owned by: | Sebastian Huber |
---|---|---|---|
Priority: | normal | Milestone: | 4.11.2 |
Component: | fs/fat | Version: | 4.11 |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: |
Description
We've been circling around some odd problems for a while where some of our files end up with garbage sequences in them. I'll save you the hand-wringing diagnostic steps, and jump to the conclusion: when opening and appending to an existing file, sometimes a cluster gets written that contains data from another concurrent write operation (to a different file). An isolated repro is hard to get, but we wedged our code into a state where we can repro it 100% of the time.
I traced the problem down to this sequence (introduced in commit 42a22f0824c4618b864582804ce1440b548a462f - 2012):
In fat_file_write_fat32_or_non_root_dir:
if (file_cln_initial < file_cln_cnt) overwrite_cluster = true;Triggers (in fat_block_write):
if ( overwrite_block || (bytes_to_write == fs_info->vol.bytes_per_block)) { rc = fat_buf_access(fs_info, sec_num, FAT_OP_TYPE_GET, &blk_buf); } else { rc = fat_buf_access(fs_info, sec_num, FAT_OP_TYPE_READ, &blk_buf); }
I have a task that wakes up every 5s, opens the file for append, and writes some hundreds of bytes. With a little bit of logging, we find that each operation that does not extend past the first cluster (4KiB) takes the FAT_OP_TYPE_READ branch. Then as soon as the first write to the second file cluster is made (which is usually an overflow from a user-level write that spanned the 4K boundary), all future writes take the FAT_OP_TYPE_GET branch.
I was convinced for a while that perhaps some proximate code of ours was corrupting some bit of accounting, but upon reading through what this is doing, I cannot wrap my head around how the intention was correct. The "if (file_cln_initial < file_cln_cnt)" condition could be unpacked to:
if (fat_fd->map.file_cln < (seek_disk_cln - start_disk_cln))
I don't see how this arithmetic is correct. We are comparing a file cln to the delta between two disk clns, which unless if I am missing something, is meaningless. Also, we are getting the file cln from the cache, the interpretation of which depends entirely on the operation that took place when it was queried (which is in fat_file_write).
I think the only way this makes sense is if this check were instead passing if we are writing to the last cluster of the file at offset 0 within the cluster. At any other time, this needs to be a read-modify-write because we can't just overwrite the cluster. I'm not sure how to express this, though.
It turns out that for many operations without considering pre-emption, the buffer you get back with fat_buf_access(FAT_OP_TYPE_GET) is populated with the cluster data. When writing sequentially to a file from a single task, this seems to hold together. However, being pre-empted by a higher priority writer may cause some buffer churn and will result in writing a cluster that has the beginning corrupted. We see this as periodic corruption, the beginning of which is always aligned to a 4KiB file offset boundary.
If we hard-code overwrite_cluster to always be false, we do not experience corruption (assuming some performance penalty in these corner cases).
Can someone either confirm or explain what this code is (supposed to be) doing? I'm not ruling out that we are causing a problem here, but right now I am leaning to a defect in the filesystem.
Attachments (1)
Change History (10)
Changed on 03/04/16 at 21:48:54 by Stella Laurenzo
Attachment: | fatbug.diff added |
---|
comment:1 Changed on 03/04/16 at 21:49:45 by Stella Laurenzo
comment:2 Changed on 03/04/16 at 22:29:21 by Gedare Bloom
Owner: | set to Sebastian Huber |
---|---|
Status: | new → assigned |
comment:3 Changed on 03/14/16 at 09:18:22 by Sebastian Huber
The mentioned commit is not a part of the RTEMS 4.10 series. Which RTEMS version do you actually use? It would be very good to have a self-contained test case. For example see testsuites/fstests/fsdosfswrite01/init.c in the latest RTEMS sources.
comment:4 Changed on 12/23/16 at 14:10:53 by Sebastian Huber
Priority: | normal → low |
---|
comment:5 Changed on 01/25/17 at 13:45:44 by Sebastian Huber
Milestone: | 4.10.3 → Indefinite |
---|---|
Owner: | changed from Sebastian Huber to Needs Funding |
comment:6 Changed on 03/14/17 at 13:44:37 by Sebastian Huber
Milestone: | Indefinite → 4.11.2 |
---|---|
Owner: | changed from Needs Funding to Sebastian Huber |
Priority: | low → normal |
Version: | 4.10 → 4.11 |
comment:8 Changed on 03/21/17 at 15:16:52 by Sebastian Huber <sebastian.huber@…>
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
In c38f1fcf/rtems:
comment:9 Changed on 10/10/17 at 06:50:58 by Sebastian Huber
Component: | fs → fs/fat |
---|
We are running locally with this patch and have not experienced concurrency issues.