#2622 closed defect (fixed)

FAT file corruption when pre-empted while appending to a file

Reported by: Stella Laurenzo Owned by: Sebastian Huber
Priority: normal Milestone: 4.11.2
Component: fs/fat Version: 4.11
Severity: normal Keywords:
Cc: Blocked By:


We've been circling around some odd problems for a while where some of our files end up with garbage sequences in them. I'll save you the hand-wringing diagnostic steps, and jump to the conclusion: when opening and appending to an existing file, sometimes a cluster gets written that contains data from another concurrent write operation (to a different file). An isolated repro is hard to get, but we wedged our code into a state where we can repro it 100% of the time.

I traced the problem down to this sequence (introduced in commit 42a22f0824c4618b864582804ce1440b548a462f - 2012):

In fat_file_write_fat32_or_non_root_dir:

        if (file_cln_initial < file_cln_cnt)
            overwrite_cluster = true;

Triggers (in fat_block_write):

        if (   overwrite_block
            || (bytes_to_write == fs_info->vol.bytes_per_block))
            rc = fat_buf_access(fs_info, sec_num, FAT_OP_TYPE_GET, &blk_buf);
        else {
            rc = fat_buf_access(fs_info, sec_num, FAT_OP_TYPE_READ, &blk_buf);

I have a task that wakes up every 5s, opens the file for append, and writes some hundreds of bytes. With a little bit of logging, we find that each operation that does not extend past the first cluster (4KiB) takes the FAT_OP_TYPE_READ branch. Then as soon as the first write to the second file cluster is made (which is usually an overflow from a user-level write that spanned the 4K boundary), all future writes take the FAT_OP_TYPE_GET branch.

I was convinced for a while that perhaps some proximate code of ours was corrupting some bit of accounting, but upon reading through what this is doing, I cannot wrap my head around how the intention was correct. The "if (file_cln_initial < file_cln_cnt)" condition could be unpacked to:

  if (fat_fd->map.file_cln < (seek_disk_cln - start_disk_cln))

I don't see how this arithmetic is correct. We are comparing a file cln to the delta between two disk clns, which unless if I am missing something, is meaningless. Also, we are getting the file cln from the cache, the interpretation of which depends entirely on the operation that took place when it was queried (which is in fat_file_write).

I think the only way this makes sense is if this check were instead passing if we are writing to the last cluster of the file at offset 0 within the cluster. At any other time, this needs to be a read-modify-write because we can't just overwrite the cluster. I'm not sure how to express this, though.

It turns out that for many operations without considering pre-emption, the buffer you get back with fat_buf_access(FAT_OP_TYPE_GET) is populated with the cluster data. When writing sequentially to a file from a single task, this seems to hold together. However, being pre-empted by a higher priority writer may cause some buffer churn and will result in writing a cluster that has the beginning corrupted. We see this as periodic corruption, the beginning of which is always aligned to a 4KiB file offset boundary.

If we hard-code overwrite_cluster to always be false, we do not experience corruption (assuming some performance penalty in these corner cases).

Can someone either confirm or explain what this code is (supposed to be) doing? I'm not ruling out that we are causing a problem here, but right now I am leaning to a defect in the filesystem.

Attachments (1)

fatbug.diff (2.7 KB) - added by Stella Laurenzo on 03/04/16 at 21:48:54.

Download all attachments as: .zip

Change History (10)

Changed on 03/04/16 at 21:48:54 by Stella Laurenzo

Attachment: fatbug.diff added

comment:1 Changed on 03/04/16 at 21:49:45 by Stella Laurenzo

We are running locally with this patch and have not experienced concurrency issues.

comment:2 Changed on 03/04/16 at 22:29:21 by Gedare Bloom

Owner: set to Sebastian Huber
Status: newassigned

comment:3 Changed on 03/14/16 at 09:18:22 by Sebastian Huber

The mentioned commit is not a part of the RTEMS 4.10 series. Which RTEMS version do you actually use? It would be very good to have a self-contained test case. For example see testsuites/fstests/fsdosfswrite01/init.c in the latest RTEMS sources.

comment:4 Changed on 12/23/16 at 14:10:53 by Sebastian Huber

Priority: normallow

comment:5 Changed on 01/25/17 at 13:45:44 by Sebastian Huber

Milestone: 4.10.3Indefinite
Owner: changed from Sebastian Huber to Needs Funding

comment:6 Changed on 03/14/17 at 13:44:37 by Sebastian Huber

Milestone: Indefinite4.11.2
Owner: changed from Needs Funding to Sebastian Huber
Priority: lownormal
Version: 4.104.11

comment:7 Changed on 03/16/17 at 14:34:50 by Sebastian Huber <sebastian.huber@…>

In e69ee36/rtems:

dosfs: Fix fat_file_write()

Remove forced overwrite which leads to file data corruption. The logic
to determine a forced overwrite was fundamentally broken. For simplity,
disable this feature.

Update #2622.

comment:8 Changed on 03/21/17 at 15:16:52 by Sebastian Huber <sebastian.huber@…>

Resolution: fixed
Status: assignedclosed

In c38f1fcf/rtems:

dosfs: Fix fat_file_write()

Remove forced overwrite which leads to file data corruption. The logic
to determine a forced overwrite was fundamentally broken. For simplity,
disable this feature.

Close #2622.

comment:9 Changed on 10/10/17 at 06:50:58 by Sebastian Huber

Component: fsfs/fat
Note: See TracTickets for help on using tickets.