Opened on 11/29/21 at 08:28:43
Closed on 12/09/21 at 07:21:20
#4552 closed defect (fixed)
untar: problems with existing directories
Reported by: | Christian Mauderer | Owned by: | Christian Mauderer |
---|---|---|---|
Priority: | normal | Milestone: | Indefinite |
Component: | lib | Version: | 5 |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | | |
Description
Our current implementation of untar in cpukit/libmisc/untar/untar.c has problems if a directory in the archive already exists. Note that this is not a problem if the archive contains only a file.
The problem exists on 5 and master.
Example: If I have a tar.gz file which contains the file l1/l2/x.txt together with its directories and call Untar_FromGzChunk_Print twice, the first attempt will print
untar: dir: l1
untar: dir: l1/l2
untar: file: l1/l2/x.txt (s:12,m:0644)
After that, the directory l1 already exists. So if I retry extracting the archive, I'll get the following:
untar: dir: l1
untar: mkdir: l1: (17) File exists
My expectation would have been that the files are just integrated into an existing directory structure. If a file exists, it should be overwritten.
We have multiple references for expected behavior: GNU or BSD tar, or POSIX pax. In my experience tar is the better known tool, so my suggestion would be to use the default behavior of tar as a reference.
GNU or BSD tar
I tested the default behavior of GNU tar and BSD tar. It seems to be the same for both:
- If a directory structure exists, the files from the archive will be integrated. Existing files are overwritten.
- If a file exists and the archive contains a directory with the same name, the file is removed and a directory is created. In the above example: if l1/l2 is a file it will be overwritten with a new directory.
- If a directory exists and the archive contains a file with the same name, the directory will be replaced if it is empty. If it contains files, the result is an error.
- An archive can also contain only a file without the parent directories. If in that case one of the parent directories exists as a file, extracting the archive results in an error. In the example: if l1/l2 is a file and the archive doesn't contain the directories but only the file l1/l2/x.txt, that would be an error.
In case of an error, it is possible that the archive has been partially extracted.
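The default rules listed above can be sketched as a small pre-check that runs before an entry is created. This is only an illustration under assumptions: the names entry_kind and prepare_target are hypothetical, untar.c has its own types and structure, and symbolic links and other special files are not considered.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical entry kinds for this sketch only. */
typedef enum { ENTRY_DIR, ENTRY_FILE } entry_kind;

/*
 * Hypothetical helper: make `path` ready to receive an archive entry of
 * the given kind, following the default tar behavior described above.
 * Returns 0 if the caller may continue (for an existing directory and a
 * directory entry there is nothing left to create), or -1 with errno set.
 */
int prepare_target(const char *path, entry_kind kind)
{
  struct stat sb;

  assert(path != NULL);

  if (lstat(path, &sb) != 0) {
    /*
     * A missing path is fine. Other failures are errors, for example
     * ENOTDIR when a parent directory exists as a regular file.
     */
    return (errno == ENOENT) ? 0 : -1;
  }

  if (kind == ENTRY_DIR) {
    if (S_ISDIR(sb.st_mode)) {
      return 0;            /* integrate into the existing directory */
    }
    return unlink(path);   /* a file is replaced by the directory */
  }

  /* kind == ENTRY_FILE */
  if (S_ISDIR(sb.st_mode)) {
    return rmdir(path);    /* succeeds only if the directory is empty */
  }

  return unlink(path);     /* an existing file is overwritten */
}
```

Because each entry is checked and created on its own, a failure in a later entry leaves the earlier ones in place, which corresponds to the partial extraction mentioned above.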
Note: GNU tar has options to change the behavior (like --recursive-unlink). I'm sure there are similar options in BSD tar. From my point of view we should adapt to the default behavior, so I ignored these options.
The POSIX pax utility
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
Default behavior is described as follows:
If an attempt is made to extract a directory when the directory already exists, this shall not be considered an error. If an attempt is made to extract a FIFO when the FIFO already exists, this shall not be considered an error.
From some quick tests, pax behaves similarly to tar. The only difference I noted is that empty directories are not overwritten with files from the archive.
Change History (2)
comment:1 Changed on 11/29/21 at 08:42:53 by Christian Mauderer
comment:2 Changed on 12/09/21 at 07:21:20 by Christian Mauderer <christian.mauderer@…>
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
In [changeset:"ff3f3490df7120c9ec039b5acd1935265c3f9262/rtems" ff3f3490/rtems]:
PS: A possibly related (closed) ticket is #3823. The solution to that ticket changed the behavior.