From: Paul Eggert Date: Wed, 8 Sep 2010 20:40:10 +0000 (-0700) Subject: tar: improve documentation of reliability and security issues X-Git-Url: https://git.dogcows.com/gitweb?p=chaz%2Ftar;a=commitdiff_plain;h=c743301494cf038412a89de6aabea16c04facf2a tar: improve documentation of reliability and security issues * doc/tar.texi (Reliability and security, Reliability): (Permissions problems, Data corruption and repair, Race conditions): (Security, Privacy, Integrity, Live untrusted data): (Security rules of thumb): New nodes. --- diff --git a/doc/tar.texi b/doc/tar.texi index 364d005..983e47a 100644 --- a/doc/tar.texi +++ b/doc/tar.texi @@ -107,6 +107,7 @@ document. The rest of the menu lists all the lower level nodes. * Date input formats:: * Formats:: * Media:: +* Reliability and security:: Appendices @@ -8556,6 +8557,9 @@ For example: $ @kbd{tar -c -f archive.tar -C / home} @end smallexample +@xref{Integrity}, for some of the security-related implications +of using this option. + @include getdate.texi @node Formats @@ -9337,6 +9341,9 @@ and use @option{--dereference} (@option{-h}): many systems do not support symbolic links, and moreover, your distribution might be unusable if it contains unresolved symbolic links. +The @option{--dereference} option is not secure if an untrusted user +can modify files during creation or extraction. @xref{Security}. + @node hard links @subsection Hard Links @cindex File names, using hard links @@ -11721,6 +11728,275 @@ disabled) switch, a notch which can be popped out or covered, a ring which can be removed from the center of a tape reel, or some other changeable feature. +@node Reliability and security +@chapter Reliability and Security + +The @command{tar} command reads and writes files as any other +application does, and is subject to the usual caveats about +reliability and security. This section contains some commonsense +advice on the topic. 
+ +@menu +* Reliability:: +* Security:: +@end menu + +@node Reliability +@section Reliability + +Ideally, when @command{tar} is creating an archive, it reads from a +file system that is not being modified, and encounters no errors or +inconsistencies while reading and writing. If this is the case, the +archive should faithfully reflect what was read. Similarly, when +extracting from an archive, ideally @command{tar} encounters +no errors and the extracted files faithfully reflect what was in the +archive. + +However, when reading or writing real-world file systems, several +things can go wrong; these include permissions problems, corruption of +data, and race conditions. + +@menu +* Permissions problems:: +* Data corruption and repair:: +* Race conditions:: +@end menu + +@node Permissions problems +@subsection Permissions Problems + +If @command{tar} encounters errors while reading or writing files, it +normally reports an error and exits with nonzero status. The work it +does may therefore be incomplete. For example, when creating an +archive, if @command{tar} cannot read a file then it cannot copy the +file into the archive. + +@node Data corruption and repair +@subsection Data Corruption and Repair + +If an archive becomes corrupted by an I/O error, this may corrupt the +data in an extracted file. Worse, it may corrupt the file's metadata, +which may cause later parts of the archive to become misinterpreted. +A tar-format archive contains a checksum that most likely will detect +errors in the metadata, but it will not detect errors in the data. + +If data corruption is a concern, you can compute and check your own +checksums of an archive by using other programs, such as +@command{cksum}. + +When attempting to recover from a read error or data corruption in an +archive, you may need to skip past the questionable data and read the +rest of the archive. This requires some expertise in the archive +format and in other software tools. 
+ +@node Race conditions +@subsection Race Conditions + +If some other process is modifying the file system while @command{tar} +is reading or writing files, the result may well be inconsistent due +to race conditions. For example, if another process creates some +files in a directory while @command{tar} is creating an archive +containing the directory's files, @command{tar} may see some of the +files but not others, or it may see a file that is in the process of +being created. The resulting archive may not be a snapshot of the +file system at any point in time. If an application such as a +database system depends on an accurate snapshot, restoring from the +@command{tar} archive of a live file system may therefore break that +consistency and may break the application. The simplest way to avoid +the consistency issues is to avoid making other changes to the file +system while @command{tar} is reading it or writing it. + +When creating an archive, several options are available to avoid race +conditions. Some hosts have a way of snapshotting a file system, or +of temporarily suspending all changes to a file system, by (say) +suspending the only virtual machine that can modify a file system; if +you use these facilities and have @command{tar -c} read from a +snapshot when creating an archive, you can avoid inconsistency +problems. More drastically, before starting @command{tar} you could +suspend or shut down all processes other than @command{tar} that have +access to the file system, or you could unmount the file system and +then mount it read-only. + +When extracting from an archive, one approach to avoid race conditions +is to create a directory that no other process can write to, and +extract into that. + +@node Security +@section Security + +In some cases @command{tar} may be used in an adversarial situation, +where an untrusted user is attempting to gain information about or +modify otherwise-inaccessible files. 
Dealing with untrusted data +(that is, data generated by an untrusted user) typically requires +extra care, because even the smallest mistake in the use of +@command{tar} is more likely to be exploited by an adversary than by a +race condition. + +@menu +* Privacy:: +* Integrity:: +* Live untrusted data:: +* Security rules of thumb:: +@end menu + +@node Privacy +@subsection Privacy + +Standard privacy concerns apply when using @command{tar}. For +example, suppose you are archiving your home directory into a file +@file{/archive/myhome.tar}. Any secret information in your home +directory, such as your SSH secret keys, is copied faithfully into +the archive. Therefore, if your home directory contains any file that +should not be read by some other user, the archive itself should +not be readable by that user. And even if the archive's data are +inaccessible to untrusted users, its metadata (such as size or +last-modified date) may reveal some information about your home +directory; if the metadata are intended to be private, the archive's +parent directory should also be inaccessible to untrusted users. + +One precaution is to create @file{/archive} so that it is not +accessible to any user, unless that user also has permission to access +all the files in your home directory. + +Similarly, when extracting from an archive, take care that the +permissions of the extracted files are not more generous than what you +want. Even if the archive itself is readable only to you, files +extracted from it have their own permissions that may differ. + +@node Integrity +@subsection Integrity + +When creating archives, take care that they are not writable by an +untrusted user; otherwise, that user could modify the archive, and +when you later extract from the archive you will get incorrect data. + +When @command{tar} extracts from an archive, by default it writes into +files relative to the working directory. 
If the archive was generated +by an untrusted user, that user therefore can write into any file +under the working directory. If the working directory contains a +symbolic link to another directory, the untrusted user can also write +into any file under the referenced directory. When extracting from an +untrusted archive, it is therefore good practice to create an empty +directory and run @command{tar} in that directory. + +When extracting from two or more untrusted archives, each one should +be extracted independently, into different empty directories. +Otherwise, the first archive could create a symbolic link into an area +outside the working directory, and the second one could follow the +link and overwrite data that is not under the working directory. For +example, when restoring from a series of incremental dumps, the +archives should have been created by a trusted process, as otherwise +the incremental restores might alter data outside the working +directory. + +If you use the @option{--absolute-names} (@option{-P}) option when +extracting, @command{tar} respects any file names in the archive, even +file names that begin with @file{/} or contain @file{..}. As this +lets the archive overwrite any file in your system that you can write, +the @option{--absolute-names} (@option{-P}) option should be used only +for trusted archives. + +Conversely, with the @option{--keep-old-files} (@option{-k}) option, +@command{tar} refuses to replace existing files when extracting; and +with the @option{--no-overwrite-dir} option, @command{tar} refuses to +replace the permissions or ownership of already-existing directories. +These options may help when extracting from untrusted archives. + +@node Live untrusted data +@subsection Dealing with Live Untrusted Data + +Extra care is required when creating from or extracting into a file +system that is accessible to untrusted users. 
For example, superusers +who invoke @command{tar} must be wary about its actions being hijacked +by an adversary who is reading or writing the file system at the same +time that @command{tar} is operating. + +When creating an archive from a live file system, @command{tar} is +vulnerable to denial-of-service attacks. For example, an adversarial +user could create the illusion of an indefinitely-deep directory +hierarchy @file{d/e/f/g/...} by creating directories one step ahead of +@command{tar}, or the illusion of an indefinitely-long file by +creating a sparse file but arranging for blocks to be allocated just +before @command{tar} reads them. There is no easy way for +@command{tar} to distinguish these scenarios from legitimate uses, so +you may need to monitor @command{tar}, just as you'd need to monitor +any other system service, to detect such attacks. + +While a superuser is extracting from an archive into a live file +system, an untrusted user might replace a directory with a symbolic +link, in hopes that @command{tar} will follow the symbolic link and +extract data into files that the untrusted user does not have access +to. Even if the archive was generated by the superuser, it may +contain a file such as @file{d/etc/passwd} that the untrusted user +earlier created in order to break in; if the untrusted user replaces +the directory @file{d/etc} with a symbolic link to @file{/etc} while +@command{tar} is running, @command{tar} will overwrite +@file{/etc/passwd}. This attack can be prevented by extracting into a +directory that is inaccessible to untrusted users. + +Similar attacks via symbolic links are also possible when creating an +archive, if the untrusted user can modify an ancestor of a top-level +argument of @command{tar}. For example, an untrusted user that can +modify @file{/home/eve} can hijack a running instance of @samp{tar -cf +- /home/eve/Documents/yesterday} by replacing +@file{/home/eve/Documents} with a symbolic link to some other +location. 
Attacks like these can be prevented by making sure that +untrusted users cannot modify any files that are top-level arguments +to @command{tar}, or any ancestor directories of these files. + +@node Security rules of thumb +@subsection Security Rules of Thumb + +This section briefly summarizes rules of thumb for avoiding security +pitfalls. + +@itemize @bullet + +@item +Protect archives at least as much as you protect any of the files +being archived. + +@item +Extract from an untrusted archive only into an otherwise-empty +directory. This directory and its parent should be accessible only to +trusted users. For example: + +@example +@group +$ @kbd{chmod go-rwx .} +$ @kbd{mkdir -m go-rwx dir} +$ @kbd{cd dir} +$ @kbd{tar -xvf /archives/got-it-off-the-net.tar.gz} +@end group +@end example + +As a corollary, do not do an incremental restore from an untrusted archive. + +@item +Do not let untrusted users access files extracted from untrusted +archives without checking first for problems such as setuid programs. + +@item +Do not let untrusted users modify directories that are ancestors of +top-level arguments of @command{tar}. For example, while you are +executing @samp{tar -cf /archive/u-home.tar /u/home}, do not let an +untrusted user modify @file{/}, @file{/archive}, or @file{/u}. + +@item +Pay attention to the diagnostics and exit status of @command{tar}. + +@item +When archiving live file systems, monitor running instances of +@command{tar} to detect denial-of-service attacks. + +@item +Avoid unusual options such as @option{--absolute-names} (@option{-P}), +@option{--dereference} (@option{-h}), @option{--overwrite}, +@option{--recursive-unlink}, and @option{--remove-files} unless you +understand their security implications. + +@end itemize + @node Changes @appendix Changes