Fix typos

diff --git a/doc/tar.texi b/doc/tar.texi

index d39c2def1d65b27e48da5d1984817a23d6c5ddb3..b4bb450abd405f53328a758f9765e05abd1be05e 100644
--- a/doc/tar.texi
+++ b/doc/tar.texi
@@ -13,14 +13,14 @@
  @c Maintenance notes:
  @c  1. Pay attention to @FIXME{}s and @UNREVISED{}s
  @c  2. Before creating final variant:
-@c    1.1. Run `make check-options' to make sure all options are properly
+@c    2.1. Run `make check-options' to make sure all options are properly
  @c         documented;
-@c    2.1. Run `make master-menu' (see comment before the master menu).
+@c    2.2. Run `make master-menu' (see comment before the master menu).
  
  @include rendition.texi
  @include value.texi
  
-@defcodeindex op  
+@defcodeindex op
  
  @c Put everything in one index (arbitrarily chosen to be the concept index).
  @syncodeindex fn cp
@@ -109,8 +109,8 @@ Appendices
  
  * Changes::
  * Configuring Help Summary::
-* Genfile::
  * Tar Internals::
+* Genfile::
  * Free Software Needs Free Documentation::
  * Copying This Manual::
  * Index of Command Line Options::
@@ -330,11 +330,18 @@ Making @command{tar} Archives More Portable
  * posix::                       @acronym{POSIX} archives
  * Checksumming::                Checksumming Problems
  * Large or Negative Values::    Large files, negative time stamps, etc.
+* Other Tars::                  How to Extract GNU-Specific Data Using
+                                Other @command{tar} Implementations
  
  @GNUTAR{} and @acronym{POSIX} @command{tar}
  
  * PAX keywords:: Controlling Extended Header Keywords.
  
+How to Extract GNU-Specific Data Using Other @command{tar} Implementations
+
+* Split Recovery::       Members Split Between Volumes
+* Sparse Recovery::      Sparse Members
+
  Using Less Space through Compression
  
  * gzip::                        Creating and Reading Compressed Archives
@@ -369,12 +376,6 @@ Using Multiple Tapes
  * Tarcat::                      Concatenate Volumes into a Single Archive
  
  
-Genfile
-
-* Generate Mode::     File Generation Mode.
-* Status Mode::       File Status Mode.
-* Exec Mode::         Synchronous Execution mode.
-
  Tar Internals
  
  * Standard::           Basic Tar Format
@@ -389,6 +390,12 @@ Storing Sparse Files
  * PAX 0::                PAX Format, Versions 0.0 and 0.1
  * PAX 1::                PAX Format, Version 1.0
  
+Genfile
+
+* Generate Mode::     File Generation Mode.
+* Status Mode::       File Status Mode.
+* Exec Mode::         Synchronous Execution mode.
+
  Copying This Manual
  
  * GNU Free Documentation License::  License for copying this manual
@@ -519,7 +526,7 @@ pipes).  @command{tar} may even access remote devices or files (as archives).
  You can use @command{tar} archives in many ways.  We want to stress a few
  of them: storage, backup, and transportation.
  
-@FIXME{the following table entries need a bit of work..}
+@FIXME{the following table entries need a bit of work.}
  @table @asis
  @item Storage
  Often, @command{tar} archives are used to store related files for
@@ -789,7 +796,7 @@ Similarly, the term ``command'' can be confusing, as it is often used in
  two different ways.  People sometimes refer to @command{tar} ``commands''.
  A @command{tar} @dfn{command} is the entire command line of user input
  which tells @command{tar} what to do --- including the operation, options,
-and any arguments (file names, pipes, other commands, etc).  However,
+and any arguments (file names, pipes, other commands, etc.).  However,
  you will also sometimes hear the term ``the @command{tar} command''.  When
  the word ``command'' is used specifically like this, a person is usually
  referring to the @command{tar} @emph{operation}, not the whole line.
@@ -891,7 +898,7 @@ clear, and we will give many examples both using and not using
  @option{--verbose} to show the differences.
  
  Each instance of @option{--verbose} on the command line increases the
-verbosity level by one, so if you need more details on the output, 
+verbosity level by one, so if you need more details on the output,
  specify it twice.
  
  When reading archives (@option{--list}, @option{--extract},
@@ -904,7 +911,7 @@ In contrast, when writing archives (@option{--create}, @option{--append},
  default.  So, a single @option{--verbose} option shows the file names
  being added to the archive, while two @option{--verbose} options
  enable the full listing.
-   
+
  For example, to create an archive in verbose mode:
  
  @smallexample
@@ -988,16 +995,10 @@ not encounter this.
  The archive member is a GNU @dfn{volume header} (@pxref{Tape Files}).
  
  @item --Continued at byte @var{n}--
-Encountered only at the beginning of a multy-volume archive
+Encountered only at the beginning of a multi-volume archive
  (@pxref{Using Multiple Tapes}).  This archive member is a continuation
  from the previous volume. The number @var{n} gives the offset where
-the original file was split.  
-
-@item --Mangled file names--
-This archive member contains @dfn{mangled file names} declarations,
-a special member type that was used by early versions of @GNUTAR{}.
-You probably will never encounter this, unless you are reading a very
-old archive.
+the original file was split.
  
  @item  unknown file type @var{c}
  An archive member of unknown type. @var{c} is the type character from
@@ -1371,7 +1372,7 @@ particular archive contains.  You can use the @option{--list}
  appear in the archive, as well as various attributes of the files at
  the time they were archived.  For example, you can examine the archive
  @file{collection.tar} that you created in the last section with the
-command, 
+command,
  
  @smallexample
  $ @kbd{tar --list --file=collection.tar}
@@ -1573,7 +1574,7 @@ mistakenly deleted one of the files you had placed in the archive
  @file{collection.tar} earlier (say, @file{blues}), you can extract it
  from the archive without changing the archive's structure.  Its
  contents will be identical to the original file @file{blues} that you
-deleted. 
+deleted.
  
  First, make sure you are in the @file{practice} directory, and list the
  files in the directory.  Now, delete the file, @samp{blues}, and list
@@ -1623,7 +1624,7 @@ Here, @option{--wildcards} instructs @command{tar} to treat
  command line arguments as globbing patterns and @option{--no-anchored}
  informs it that the patterns apply to member names after any @samp{/}
  delimiter.  The use of globbing patterns is discussed in detail in
-@xref{wildcards}. 
+@xref{wildcards}.
  
  You can extract a file to standard output by combining the above options
  with the @option{--to-stdout} (@option{-O}) option (@pxref{Writing to Standard
@@ -1887,13 +1888,33 @@ All abnormal exits, whether immediate or delayed, should always be
  clearly diagnosed on @code{stderr}, after a line stating the nature of
  the error.
  
-@GNUTAR{} returns only a few exit statuses.  I'm really
-aiming simplicity in that area, for now.  If you are not using the
-@option{--compare} @option{--diff}, @option{-d}) option, zero means
-that everything went well, besides maybe innocuous warnings.  Nonzero
-means that something went wrong. Right now, as of today, ``nonzero''
-is almost always 2, except for remote operations, where it may be
-128.
+Possible exit codes of @GNUTAR{} are summarized in the following
+table:
+
+@table @asis
+@item 0
+@samp{Successful termination}.
+
+@item 1
+@samp{Some files differ}.  If @command{tar} was invoked with the
+@option{--compare} (@option{--diff}, @option{-d}) option, this means that
+some files in the archive differ from their disk counterparts
+(@pxref{compare}).  If @command{tar} was given the @option{--create},
+@option{--append} or @option{--update} option, this exit code means
+that some files were changed while being archived, so the resulting
+archive does not contain an exact copy of the file set.
+
+@item 2
+@samp{Fatal error}.  This means that some fatal, unrecoverable error
+occurred.
+@end table
+
+If @command{tar} has invoked a subprocess and that subprocess exited with a
+nonzero exit code, @command{tar} exits with that code as well.
+This can happen, for example, if @command{tar} was given some
+compression option (@pxref{gzip}) and the external compressor program
+failed.  Another example is @command{rmt} failure during backup to the
+remote device (@pxref{Remote Tape Server}).
  
  @node using tar options
  @section Using @command{tar} Options
@@ -1973,7 +1994,7 @@ Some options @emph{may} take an argument.  Such options may have at
  most long and short forms, they do not have old style equivalent.  The
  rules for specifying an argument for such options are stricter than
  those for specifying mandatory arguments.  Please, pay special
-attention to them. 
+attention to them.
  
  @menu
  * Long Options::                Long Option Style
@@ -1988,7 +2009,7 @@ attention to them.
  Each option has at least one @dfn{long} (or @dfn{mnemonic}) name starting with two
  dashes in a row, e.g., @option{--list}.  The long names are more clear than
  their corresponding short or old names.  It sometimes happens that a
-single long option has many different different names which are
+single long option has many different names which are
  synonymous, such as @option{--compare} and @option{--diff}.  In addition,
  long option names can be given unique abbreviations.  For example,
  @option{--cre} can be used in place of @option{--create} because there is no
@@ -2426,7 +2447,7 @@ total number of hard links for the file, a warning message will be
  output @footnote{Earlier versions of @GNUTAR{} understood @option{-l} as a
  synonym for @option{--one-file-system}.  The current semantics, which
  complies to UNIX98, was introduced with version
-1.15.91. @xref{Changes}, for more information.}. 
+1.15.91. @xref{Changes}, for more information.}.
  
  @opsummary{compress}
  @opsummary{uncompress}
@@ -2484,6 +2505,11 @@ patterns in the file @var{file}.  @xref{exclude}.
  Automatically excludes all directories
  containing a cache directory tag.  @xref{exclude}.
  
+@opsummary{exclude-tag}
+@item --exclude-tag=@var{file}
+
+Exclude all directories containing a file named @var{file}.  @xref{exclude}.
+
  @opsummary{file}
  @item --file=@var{archive}
  @itemx -f @var{archive}
@@ -2569,7 +2595,7 @@ options to @command{tar} and exit. @xref{help}.
  @opsummary{ignore-case}
  @item --ignore-case
  Ignore case when matching member or file names with
-patterns. @xref{controlling pattern-matching}. 
+patterns. @xref{controlling pattern-matching}.
  
  @opsummary{ignore-command-error}
  @item --ignore-command-error
@@ -2800,7 +2826,7 @@ and group IDs when creating a @command{tar} file, rather than names.
  @item -o
  The function of this option depends on the action @command{tar} is
  performing.  When extracting files, @option{-o} is a synonym for
-@option{--no-same-owner}, i.e.  it prevents @command{tar} from
+@option{--no-same-owner}, i.e., it prevents @command{tar} from
  restoring ownership of files being extracted.
  
  When creating an archive, it is a synonym for
@@ -2881,7 +2907,7 @@ discussion, @xref{transform}.
  
  To see transformed member names in verbose listings, use
  @option{--show-transformed-names} option
-(@pxref{show-transformed-names}).  
+(@pxref{show-transformed-names}).
  
  @opsummary{quote-chars}
  @item --quote-chars=@var{string}
@@ -2973,7 +2999,7 @@ appending it to an archive.  @xref{remove files}.
  @item --restrict
  
  Disable use of some potentially harmful @command{tar} options.
-Currently this option disables shell invocaton from multi-volume menu
+Currently this option disables shell invocation from multi-volume menu
  (@pxref{Using Multiple Tapes}).
  
  @opsummary{rmt-command}
@@ -3048,6 +3074,13 @@ names.  @xref{listing member and file names}.
  Invokes a @acronym{GNU} extension when adding files to an archive that handles
  sparse files efficiently.  @xref{sparse}.
  
+@opsummary{sparse-version}
+@item --sparse-version=@var{version}
+
+Specifies the @dfn{format version} to use when archiving sparse
+files.  Implies @option{--sparse}.  @xref{sparse}.  For the description
+of the supported sparse formats, see @ref{Sparse Formats}.
+
  @opsummary{starting-file}
  @item --starting-file=@var{name}
  @itemx -K @var{name}
@@ -3263,7 +3296,7 @@ them with the equivalent long option.
  @item -m @tab @ref{--touch}.
  
  @item -o @tab When creating, @ref{--no-same-owner}, when extracting ---
-@ref{--portability}. 
+@ref{--portability}.
  
  The later usage is deprecated.  It is retained for compatibility with
  the earlier versions of @GNUTAR{}.  In the future releases
@@ -3392,14 +3425,14 @@ information about @GNUTAR{} is this Texinfo documentation.
  
  @opindex show-defaults
  @GNUTAR{} has some predefined defaults that are used when you do not
-explicitely specify another values.  To obtain a list of such
+explicitly specify another values.  To obtain a list of such
  defaults, use @option{--show-defaults} option.  This will output the
  values in the form of @command{tar} command line options:
  
  @smallexample
  @group
  @kbd{tar --show-defaults}
---format=gnu -f- -b20 --quoting-style=escape 
+--format=gnu -f- -b20 --quoting-style=escape
  --rmt-command=/etc/rmt --rsh-command=/usr/bin/rsh
  @end group
  @end smallexample
@@ -3514,14 +3547,14 @@ statistics is to be printed:
  Print statistics upon delivery of signal @var{signo}.  Valid arguments
  are: @code{SIGHUP}, @code{SIGQUIT}, @code{SIGINT}, @code{SIGUSR1} and
  @code{SIGUSR2}.  Shortened names without @samp{SIG} prefix are also
-accepted. 
+accepted.
  @end table
  
  Both forms of @option{--totals} option can be used simultaneously.
  Thus, @kbd{tar -x --totals --totals=USR1} instructs @command{tar} to
  extract all members from its default archive and print statistics
  after finishing the extraction, as well as when receiving signal
-@code{SIGUSR1}. 
+@code{SIGUSR1}.
  
  @anchor{Progress information}
  @cindex Progress information
@@ -3676,7 +3709,7 @@ consequence of doing so.  The usual consequence is that the single
  file, which was meant to be saved, is rather destroyed.
  @end enumerate
  
-So, recognizing the likelihood and the catastrophical nature of these
+So, recognizing the likelihood and the catastrophic nature of these
  errors, @GNUTAR{} now takes some distance from elegance, and
  cowardly refuses to create an archive when @option{--create} option is
  given, there are no arguments besides options, and
@@ -3926,7 +3959,7 @@ archive in the order in which they were archived.  Thus, when the
  archive is extracted, a file archived later in time will replace a
  file of the same name which was archived earlier, even though the
  older version of the file will remain in the archive unless you delete
-all versions of the file. 
+all versions of the file.
  
  Supposing you change the file @file{blues} and then append the changed
  version to @file{collection.tar}.  As you saw above, the original
@@ -4238,7 +4271,7 @@ tar: funk not found in archive
  The spirit behind the @option{--compare} (@option{--diff},
  @option{-d}) option is to check whether the archive represents the
  current state of files on disk, more than validating the integrity of
-the archive media.  For this later goal, @xref{verify}. 
+the archive media.  For this latter goal, see @ref{verify}.
  
  @node create options
  @section Options Used by @option{--create}
@@ -4447,7 +4480,7 @@ The @option{--ignore-zeros} (@option{-i}) option is turned off by default becaus
  versions of @command{tar} write garbage after the end-of-archive entry,
  since that part of the media is never supposed to be read.  @GNUTAR{}
  does not write after the end of an archive, but seeks to
-maintain compatiblity among archiving utilities.
+maintain compatibility among archiving utilities.
  
  @table @option
  @item --ignore-zeros
@@ -4645,7 +4678,7 @@ Use in conjunction with @option{--extract} (@option{--get}, @option{-x}).
  To set the modes (access permissions) of extracted files to those
  recorded for those files in the archive, use @option{--same-permissions}
  in conjunction with the @option{--extract} (@option{--get},
-@option{-x}) operation.  
+@option{-x}) operation.
  
  @table @option
  @opindex preserve-permissions
@@ -4662,7 +4695,7 @@ archive, instead of current umask settings.  Use in conjunction with
  @node Directory Modification Times and Permissions
  @unnumberedsubsubsec Directory Modification Times and Permissions
  
-After sucessfully extracting a file member, @GNUTAR{} normally
+After successfully extracting a file member, @GNUTAR{} normally
  restores its permissions and modification times, as described in the
  previous sections.  This cannot be done for directories, because
  after extracting a directory @command{tar} will almost certainly
@@ -4693,9 +4726,9 @@ incremental archives (@pxref{Incremental Dumps}).  The member order in
  an incremental archive is reversed: first all directory members are
  stored, followed by other (non-directory) members.  So, when extracting
  from incremental archives, @GNUTAR{} alters the above procedure.  It
-remebers all restored directories, and restores their meta-data
+remembers all restored directories, and restores their meta-data
  only after the entire archive has been processed.  Notice, that you do
-not need to specity any special options for that, as @GNUTAR{}
+not need to specify any special options for that, as @GNUTAR{}
  automatically detects archives in incremental format.
  
  There may be cases, when such processing is required for normal archives
@@ -4778,7 +4811,7 @@ or even like this if you want to process the concatenation of the files:
  tar -xOzf foo.tgz bigfile1 bigfile2 | process
  @end smallexample
  
-Hovewer, @option{--to-command} may be more convenient for use with
+However, @option{--to-command} may be more convenient for use with
  multiple files. See the next section.
  
  @node Writing to an External Program
@@ -5381,7 +5414,7 @@ then in order to restore the exact contents the file system  had when
  the last level was created, you will need to restore from all backups
  in turn.  Continuing our example, to restore the state of @file{/usr}
  file system, one would do@footnote{Notice, that since both archives
-were created withouth @option{-P} option (@pxref{absolute}), these
+were created without @option{-P} option (@pxref{absolute}), these
  commands should be run from the root file system.}:
  
  @smallexample
@@ -5409,7 +5442,7 @@ Versions of @GNUTAR{} up to 1.15.1 used to dump verbatim binary
  contents of the DUMPDIR header (with terminating nulls) when
  @option{--incremental} or @option{--listed-incremental} option was
  given, no matter what the verbosity level.  This behavior, and,
-especially, the binary output it produced were considered incovenient
+especially, the binary output it produced were considered inconvenient
  and were changed in version 1.16}:
  
  @smallexample
@@ -5560,7 +5593,7 @@ normally be the host that actually contains the file system.  However,
  the host machine must have @GNUTAR{} installed, and
  must be able to access the directory containing the backup scripts and
  their support files using the same file name that is used on the
-machine where the scripts are run (i.e.  what @command{pwd} will print
+machine where the scripts are run (i.e., what @command{pwd} will print
  when in that directory on that machine).  If the host that contains
  the file system does not have this capability, you can specify another
  host as long as it can access the file system through NFS.
@@ -5609,7 +5642,7 @@ to use public key authentication.
  
  @defvr {Backup variable} RSH_COMMAND
  
-Full file name of @command{rsh} binary on remote mashines.  This will
+Full file name of @command{rsh} binary on remote machines.  This will
  be passed via @option{--rsh-command} option to the remote invocation
  of @GNUTAR{}.
  @end defvr
@@ -5894,7 +5927,7 @@ Force backup even if today's log file already exists.
  @item -v[@var{level}]
  @itemx --verbose[=@var{level}]
  Set verbosity level.  The higher the level is, the more debugging
-information will be output during execution.  Devault @var{level}
+information will be output during execution.  Default @var{level}
  is 100, which means the highest debugging level.
  
  @item -t @var{start-time}
@@ -5966,7 +5999,7 @@ Start restoring from the given backup level, instead of the default 0.
  @item -v[@var{level}]
  @itemx --verbose[=@var{level}]
  Set verbosity level.  The higher the level is, the more debugging
-information will be output during execution.  Devault @var{level}
+information will be output during execution.  Default @var{level}
  is 100, which means the highest debugging level.
  
  @item -h
@@ -6069,7 +6102,7 @@ floppy disk, or CD write drive.
  If you do not name the archive, @command{tar} uses the value of the
  environment variable @env{TAPE} as the file name for the archive.  If
  that is not available, @command{tar} uses a default, compiled-in archive
-name, usually that for tape unit zero (i.e.  @file{/dev/tu00}).
+name, usually that for tape unit zero (i.e., @file{/dev/tu00}).
  
  @cindex Standard input and output
  @cindex tar to standard input and output
@@ -6136,7 +6169,7 @@ can be inhibited by using the @option{--force-local} option.
  When the archive is being created to @file{/dev/null}, @GNUTAR{}
  tries to minimize input and output operations.  The Amanda backup
  system, when used with @GNUTAR{}, has an initial sizing pass which
-uses this feature. 
+uses this feature.
  
  @node Selecting Archive Members
  @section Selecting Archive Members
@@ -6164,9 +6197,9 @@ name, replacing @dfn{escape sequences} according to the following
  table:
  
  @multitable @columnfractions 0.20 0.60
-@headitem Escape @tab Replaced with    
+@headitem Escape @tab Replaced with
  @item \a         @tab Audible bell (ASCII 7)
-@item \b         @tab Backspace (ASCII 8)  
+@item \b         @tab Backspace (ASCII 8)
  @item \f         @tab Form feed (ASCII 12)
  @item \n         @tab New line (ASCII 10)
  @item \r         @tab Carriage return (ASCII 13)
@@ -6222,7 +6255,7 @@ By default, @command{tar} takes file names from the command line.  However,
  there are other ways to specify file or member names, or to modify the
  manner in which @command{tar} selects the files or members upon which to
  operate.  In general, these methods work both for specifying the names
-of files and archive members. 
+of files and archive members.
  
  @node files
  @section Reading Names from a File
@@ -6234,7 +6267,7 @@ Instead of giving the names of files or archive members on the command
  line, you can put the names into a file, and then use the
  @option{--files-from=@var{file-of-names}} (@option{-T
  @var{file-of-names}}) option to @command{tar}.  Give the name of the
-file which contains the list of files to include as the argument to 
+file which contains the list of files to include as the argument to
  @option{--files-from}.  In the list, the file names should be separated by
  newlines.  You will frequently use this option when you have generated
  the list of files to archive with the @command{find} utility.
@@ -6363,7 +6396,7 @@ The @option{--null} option causes
  @option{--files-from=@var{file-of-names}} (@option{-T @var{file-of-names}})
  to read file names terminated by a @code{NUL} instead of a newline, so
  files whose names contain newlines can be archived using
-@option{--files-from}. 
+@option{--files-from}.
  
  @table @option
  @opindex null
@@ -6414,7 +6447,7 @@ Causes @command{tar} to ignore files that match the @var{pattern}.
  @findex exclude
  The @option{--exclude=@var{pattern}} option prevents any file or
  member whose name matches the shell wildcard (@var{pattern}) from
-being operated on. 
+being operated on.
  For example, to create an archive with all the contents of the directory
  @file{src} except for files whose names end in @file{.o}, use the
  command @samp{tar -cf src.tar --exclude='*.o' src}.
@@ -6453,6 +6486,38 @@ Various applications write cache directory tags into directories they
  use to hold regenerable, non-precious data, so that such data can be
  more easily excluded from backups.
  
+@findex exclude-tag
+Another option, @option{--exclude-tag}, provides a generalization of
+this concept.  It takes a single argument, a file name to look for.
+Any directory that contains this file will be excluded from the dump.
+
+@table @option
+@opindex exclude-tag
+@item --exclude-tag=@var{file}
+Causes @command{tar} to ignore directories containing @var{file}.
+Multiple @option{--exclude-tag} options can be given.
+@end table
+
+For example:
+
+@smallexample
+$ @kbd{find dir}
+dir
+dir/blues
+dir/jazz
+dir/folk
+dir/folk/tagfile
+$ @kbd{tar -cf archive.tar --exclude-tag=tagfile -v dir}
+dir/
+dir/blues
+dir/jazz
+./tar: dir/folk/: contains a cache directory tag tagfile; not dumped
+$ @kbd{tar -tf archive.tar}
+dir/
+dir/blues
+dir/jazz
+@end smallexample
+
  @menu
  * problems with exclude::
  @end menu
@@ -6512,7 +6577,7 @@ might fail.
  @item
  @FIXME{The change in semantics must have occurred before 1.11,
  so I doubt if it is worth mentioning at all. Anyway, should at
-least specify in which version the semantics changed.} 
+least specify in which version the semantics changed.}
  In earlier versions of @command{tar}, what is now the
  @option{--exclude-from} option was called @option{--exclude} instead.
  Now, @option{--exclude} applies to patterns listed on the command
@@ -6597,7 +6662,7 @@ There are no inclusion members in create mode (@option{--create} and
  command line refer to @emph{files}, not archive members.
  
  By default, inclusion members are compared with archive members
-literally @footnote{Notice that earlier @GNUTAR{} versions used 
+literally @footnote{Notice that earlier @GNUTAR{} versions used
  globbing for inclusion members, which contradicted to UNIX98
  specification and was not documented. @xref{Changes}, for more
  information on this and other changes.} and exclusion members are
@@ -6625,7 +6690,7 @@ This behavior can be altered by using the following options:
  @table @option
  @opindex wildcards
  @item --wildcards
-Treat all member names as wildcards. 
+Treat all member names as wildcards.
  
  @opindex no-wildcards
  @item --no-wildcards
@@ -6644,7 +6709,7 @@ b.c
  Notice quoting of the pattern to prevent the shell from interpreting
  it.
  
-The effect of @option{--wildcards} option is cancelled by
+The effect of @option{--wildcards} option is canceled by
  @option{--no-wildcards}.  This can be used to pass part of
  the command line arguments verbatim and other part as globbing
  patterns.  For example, the following invocation:
@@ -6808,7 +6873,7 @@ Quoting styles:
  No quoting, display each character as is:
  
  @smallexample
-@group 
+@group
  $ @kbd{tar tf arch.tar --quoting-style=literal}
  ./
  ./a space
@@ -6951,7 +7016,7 @@ quoting style would not quote them.
  @end table
  
  For example, using @samp{escape} quoting (compare with the usual
-escape listing above): 
+escape listing above):
  
  @smallexample
  @group
@@ -7024,7 +7089,7 @@ $ @kbd{tar -xf usr.tar --strip=2 usr/include/stdlib.h}
  
  The option @option{--strip=2} instructs @command{tar} to strip the
  two leading components (@file{usr/} and @file{include/}) off the file
-name. 
+name.
  
  If you add to the above invocation @option{--verbose} (@option{-v})
  option, you will note that the verbose listing still contains the
@@ -7117,10 +7182,10 @@ Only replace the @var{number}th match of the @var{regexp}.
  Note: the @var{posix} standard does not specify what should happen
  when you mix the @samp{g} and @var{number} modifiers.  @GNUTAR{}
  follows the GNU @command{sed} implementation in this regard, so
-the the interaction is defined to be: ignore matches before the
+the interaction is defined to be: ignore matches before the
  @var{number}th, and then match and replace all matches from the
  @var{number}th on.
-                                   
+
  @end table
  
  Any delimiter can be used in lieue of @samp{/}, the only requirement being
@@ -7188,7 +7253,7 @@ $ @kbd{tar -cf arch.tar --transform='s,^usr/,var/,' \
  If both @option{--strip-components} and @option{--transform} are used
  together, then @option{--transform} is applied first, and the required
  number of components is then stripped from its result.
-    
+
  @node after
  @section Operating Only on New Files
  @UNREVISED
@@ -7404,7 +7469,6 @@ mentioned by name on the standard error.
  
  @node directory
  @subsection Changing the Working Directory
-@UNREVISED
  
  @FIXME{need to read over this node now for continuity; i've switched
  things around some.}
@@ -7490,12 +7554,10 @@ For instance, the file list for the above example will be:
  
  @smallexample
  @group
--C
-/etc
+-C/etc
  passwd
  hosts
--C
-/lib
+--directory=/lib
  libc.a
  @end group
  @end smallexample
@@ -7507,9 +7569,6 @@ To use it, you would invoke @command{tar} as follows:
  $ @kbd{tar -c -f foo.tar --files-from list}
  @end smallexample
  
-Notice also that you can only use the short option variant in the file
-list, i.e., always use @option{-C}, not @option{--directory}.
-
  The interpretation of @option{--directory} is disabled by
  @option{--null} option.
  
@@ -7673,7 +7732,7 @@ cases the maximum file name length will be shorter than 256
  characters.
  @item The maximum length of a symbolic link name is limited to
  100 characters.
-@item Maximum size of a file the archive is able to accomodate
+@item Maximum size of a file the archive is able to accommodate
  is 8GB
  @item Maximum value of UID/GID is 2097151.
  @item Maximum number of bits in device major and minor numbers is 21.
@@ -7719,154 +7778,617 @@ to create archives in @samp{gnu} format, however, future version will
  switch to @samp{posix}.
  
  @menu
-* Portability::                 Making @command{tar} Archives More Portable
  * Compression::                 Using Less Space through Compression
  * Attributes::                  Handling File Attributes
+* Portability::                 Making @command{tar} Archives More Portable
  * cpio::                        Comparison of @command{tar} and @command{cpio}
  @end menu
  
-@node Portability
-@section Making @command{tar} Archives More Portable
-
-Creating a @command{tar} archive on a particular system that is meant to be
-useful later on many other machines and with other versions of @command{tar}
-is more challenging than you might think.  @command{tar} archive formats
-have been evolving since the first versions of Unix.  Many such formats
-are around, and are not always compatible with each other.  This section
-discusses a few problems, and gives some advice about making @command{tar}
-archives more portable.
-
-One golden rule is simplicity.  For example, limit your @command{tar}
-archives to contain only regular files and directories, avoiding
-other kind of special files.  Do not attempt to save sparse files or
-contiguous files as such.  Let's discuss a few more problems, in turn.
-
-@FIXME{Discuss GNU extensions (incremental backups, multi-volume
-archives and archive labels) in GNU and PAX formats.}
+@node Compression
+@section Using Less Space through Compression
  
  @menu
-* Portable Names::              Portable Names
-* dereference::                 Symbolic Links
-* old::                         Old V7 Archives
-* ustar::                       Ustar Archives
-* gnu::                         GNU and old GNU format archives.
-* posix::                       @acronym{POSIX} archives
-* Checksumming::                Checksumming Problems
-* Large or Negative Values::    Large files, negative time stamps, etc.
+* gzip::                        Creating and Reading Compressed Archives
+* sparse::                      Archiving Sparse Files
  @end menu
  
-@node Portable Names
-@subsection Portable Names
-
-Use portable file and member names.  A name is portable if it contains
-only ASCII letters and digits, @samp{/}, @samp{.}, @samp{_}, and
-@samp{-}; it cannot be empty, start with @samp{-} or @samp{//}, or
-contain @samp{/-}.  Avoid deep directory nesting.  For portability to
-old Unix hosts, limit your file name components to 14 characters or
-less.
+@node gzip
+@subsection Creating and Reading Compressed Archives
+@cindex Compressed archives
+@cindex Storing archives in compressed format
  
-If you intend to have your @command{tar} archives to be read under
-MSDOS, you should not rely on case distinction for file names, and you
-might use the @acronym{GNU} @command{doschk} program for helping you
-further diagnosing illegal MSDOS names, which are even more limited
-than System V's.
+@GNUTAR{} is able to create and read compressed archives.  It supports
+@command{gzip} and @command{bzip2} compression programs.  For backward
+compatibility, it also supports the @command{compress} command, although
+we strongly recommend against using it, since there is a patent
+covering the algorithm it uses and you could be sued for patent
+infringement merely by running @command{compress}!  Besides, it is less
+effective than @command{gzip} and @command{bzip2}.
  
-@node dereference
-@subsection Symbolic Links
-@cindex File names, using symbolic links
-@cindex Symbolic link as file name
+Creating a compressed archive is simple: you just specify a
+@dfn{compression option} along with the usual archive creation
+commands.  The compression option is @option{-z} (@option{--gzip}) to
+create a @command{gzip} compressed archive, @option{-j}
+(@option{--bzip2}) to create a @command{bzip2} compressed archive, and
+@option{-Z} (@option{--compress}) to use the @command{compress} program.
+For example:
  
-@opindex dereference
-Normally, when @command{tar} archives a symbolic link, it writes a
-block to the archive naming the target of the link.  In that way, the
-@command{tar} archive is a faithful record of the file system contents.
-@option{--dereference} (@option{-h}) is used with @option{--create} (@option{-c}), and causes
-@command{tar} to archive the files symbolic links point to, instead of
-the links themselves.  When this option is used, when @command{tar}
-encounters a symbolic link, it will archive the linked-to file,
-instead of simply recording the presence of a symbolic link.
+@smallexample
+$ @kbd{tar cfz archive.tar.gz .}
+@end smallexample
  
-The name under which the file is stored in the file system is not
-recorded in the archive.  To record both the symbolic link name and
-the file name in the system, archive the file under both names.  If
-all links were recorded automatically by @command{tar}, an extracted file
-might be linked to a file name that no longer exists in the file
-system.
+Reading a compressed archive is even simpler: you don't need to specify
+any additional options as @GNUTAR{} recognizes its format
+automatically.  Thus, the following commands will list and extract the
+archive created in the previous example:
  
-If a linked-to file is encountered again by @command{tar} while creating
-the same archive, an entire second copy of it will be stored.  (This
-@emph{might} be considered a bug.)
+@smallexample
+# List the compressed archive
+$ @kbd{tar tf archive.tar.gz}
+# Extract the compressed archive
+$ @kbd{tar xf archive.tar.gz}
+@end smallexample
  
-So, for portable archives, do not archive symbolic links as such,
-and use @option{--dereference} (@option{-h}): many systems do not support
-symbolic links, and moreover, your distribution might be unusable if
-it contains unresolved symbolic links.
+The only case when you have to specify a decompression option while
+reading the archive is when reading from a pipe or from a tape drive
+that does not support random access.  However, in this case @GNUTAR{}
+will indicate which option you should use.  For example:
  
-@node old
-@subsection Old V7 Archives
-@cindex Format, old style
-@cindex Old style format
-@cindex Old style archives
-@cindex v7 archive format
+@smallexample
+$ @kbd{cat archive.tar.gz | tar tf -}
+tar: Archive is compressed.  Use -z option
+tar: Error is not recoverable: exiting now
+@end smallexample
  
-Certain old versions of @command{tar} cannot handle additional
-information recorded by newer @command{tar} programs.  To create an
-archive in V7 format (not ANSI), which can be read by these old
-versions, specify the @option{--format=v7} option in
-conjunction with the @option{--create} (@option{-c}) (@command{tar} also
-accepts @option{--portability} or @samp{op-old-archive} for this
-option).  When you specify it,
-@command{tar} leaves out information about directories, pipes, fifos,
-contiguous files, and device files, and specifies file ownership by
-group and user IDs instead of group and user names.
+If you see such diagnostics, just add the suggested option to the
+invocation of @GNUTAR{}:
  
-When updating an archive, do not use @option{--format=v7}
-unless the archive was created using this option.
+@smallexample
+$ @kbd{cat archive.tar.gz | tar tfz -}
+@end smallexample
  
-In most cases, a @emph{new} format archive can be read by an @emph{old}
-@command{tar} program without serious trouble, so this option should
-seldom be needed.  On the other hand, most modern @command{tar}s are
-able to read old format archives, so it might be safer for you to
-always use @option{--format=v7} for your distributions.
+Notice also that there are several restrictions on operations on
+compressed archives.  First of all, compressed archives cannot be
+modified, i.e., you cannot update (@option{--update} (@option{-u})) them or delete
+(@option{--delete}) members from them.  Likewise, you cannot append
+another @command{tar} archive to a compressed archive using
+@option{--concatenate} (@option{-A}).  Secondly, multi-volume archives cannot be
+compressed.
  
-@node ustar
-@subsection Ustar Archive Format
+The following table summarizes compression options used by @GNUTAR{}.
  
-@cindex ustar archive format
-Archive format defined by @acronym{POSIX}.1-1988 specification is called
-@code{ustar}.  Although it is more flexible than the V7 format, it
-still has many restrictions (@xref{Formats,ustar}, for the detailed
-description of @code{ustar} format).  Along with V7 format,
-@code{ustar} format is a good choice for archives intended to be read
-with other implementations of @command{tar}.
+@table @option
+@opindex gzip
+@opindex ungzip
+@item -z
+@itemx --gzip
+@itemx --ungzip
+Filter the archive through @command{gzip}.
  
-To create archive in @code{ustar} format, use @option{--format=ustar}
-option in conjunction with the @option{--create} (@option{-c}).
+You can use @option{--gzip} and @option{--gunzip} on physical devices
+(tape drives, etc.) and remote files as well as on normal files; data
+to or from such devices or remote files is reblocked by another copy
+of the @command{tar} program to enforce the specified (or default) record
+size.  The default compression parameters are used; if you need to
+override them, set the @env{GZIP} environment variable, e.g.:
  
-@node gnu
-@subsection @acronym{GNU} and old @GNUTAR{} format
+@smallexample
+$ @kbd{GZIP=--best tar cfz archive.tar.gz subdir}
+@end smallexample
  
-@cindex GNU archive format
-@cindex Old GNU archive format
-@GNUTAR{} was based on an early draft of the
-@acronym{POSIX} 1003.1 @code{ustar} standard.  @acronym{GNU} extensions to
-@command{tar}, such as the support for file names longer than 100
-characters, use portions of the @command{tar} header record which were
-specified in that @acronym{POSIX} draft as unused.  Subsequent changes in
-@acronym{POSIX} have allocated the same parts of the header record for
-other purposes.  As a result, @GNUTAR{} format is
-incompatible with the current @acronym{POSIX} specification, and with
-@command{tar} programs that follow it.
+@noindent
+Another way would be to avoid the @option{--gzip} (@option{--gunzip}, @option{--ungzip}, @option{-z}) option and run
+@command{gzip} explicitly:
  
-In the majority of cases, @command{tar} will be configured to create
-this format by default.  This will change in the future releases, since
-we plan to make @samp{posix} format the default.
+@smallexample
+$ @kbd{tar cf - subdir | gzip --best -c - > archive.tar.gz}
+@end smallexample
  
-To force creation a @GNUTAR{} archive, use option
-@option{--format=gnu}.
+@cindex corrupted archives
+About corrupted compressed archives: @command{gzip}'ed files have no
+redundancy, for maximum compression.  The adaptive nature of the
+compression scheme means that the compression tables are implicitly
+spread all over the archive.  If you lose a few blocks, the dynamic
+construction of the compression tables becomes unsynchronized, and there
+is little chance that you could recover later in the archive.
  
-@node posix
-@subsection @GNUTAR{} and @acronym{POSIX} @command{tar}
+There are pending suggestions for having a per-volume or per-file
+compression in @GNUTAR{}.  This would allow for viewing the
+contents without decompression, and for resynchronizing decompression at
+every volume or file, in case of corrupted archives.  Doing so, we might
+lose some compressibility.  But this would make recovery easier.
+So, there are pros and cons.  We'll see!
+
+@opindex bzip2
+@item -j
+@itemx --bzip2
+Filter the archive through @code{bzip2}.  Otherwise like @option{--gzip}.
+
+@opindex compress
+@opindex uncompress
+@item -Z
+@itemx --compress
+@itemx --uncompress
+Filter the archive through @command{compress}.  Otherwise like @option{--gzip}.
+
+The @acronym{GNU} Project recommends you not use
+@command{compress}, because there is a patent covering the algorithm it
+uses.  You could be sued for patent infringement merely by running
+@command{compress}.
+
+@opindex use-compress-program
+@item --use-compress-program=@var{prog}
+Use external compression program @var{prog}.  Use this option if you
+have a compression program that @GNUTAR{} does not support.  There
+are two requirements with which @var{prog} should comply:
+
+First, when called without options, it should read data from standard
+input, compress it and output it on standard output.
+
+Secondly, if called with @option{-d} argument, it should do exactly
+the opposite, i.e., read the compressed data from the standard input
+and produce uncompressed data on the standard output.
+@end table
+
+@cindex gpg, using with tar
+@cindex gnupg, using with tar
+@cindex Using encrypted archives
+The @option{--use-compress-program} option, in particular, lets you
+implement your own filters, not necessarily dealing with
+compression/decompression.  For example, suppose you wish to implement
+PGP encryption on top of compression, using @command{gpg} (@pxref{Top,
+gpg, gpg ---- encryption and signing tool, gpg, GNU Privacy Guard
+Manual}).  The following script does that:
+
+@smallexample
+@group
+#! /bin/sh
+case $1 in
+-d) gpg --decrypt - | gzip -d -c;;
+'') gzip -c | gpg -s ;;
+*)  echo "Unknown option $1">&2; exit 1;;
+esac
+@end group
+@end smallexample
+
+Suppose you name it @file{gpgz} and save it somewhere in your
+@env{PATH}.  Then the following command will create a compressed
+archive signed with your private key:
+
+@smallexample
+$ @kbd{tar -cf foo.tar.gpgz --use-compress=gpgz .}
+@end smallexample
+
+@noindent
+Likewise, the following command will list its contents:
+
+@smallexample
+$ @kbd{tar -tf foo.tar.gpgz --use-compress=gpgz .}
+@end smallexample
+
+@ignore
+The above is based on the following discussion:
+
+     I have one question, or maybe it's a suggestion if there isn't a way
+     to do it now.  I would like to use @option{--gzip}, but I'd also like
+     the output to be fed through a program like @acronym{GNU}
+     @command{ecc} (actually, right now that's @samp{exactly} what I'd like
+     to use :-)), basically adding ECC protection on top of compression.
+     It seems as if this should be quite easy to do, but I can't work out
+     exactly how to go about it.  Of course, I can pipe the standard output
+     of @command{tar} through @command{ecc}, but then I lose (though I
+     haven't started using it yet, I confess) the ability to have
+     @command{tar} use @command{rmt} for it's I/O (I think).
+
+     I think the most straightforward thing would be to let me specify a
+     general set of filters outboard of compression (preferably ordered,
+     so the order can be automatically reversed on input operations, and
+     with the options they require specifiable), but beggars shouldn't be
+     choosers and anything you decide on would be fine with me.
+
+     By the way, I like @command{ecc} but if (as the comments say) it can't
+     deal with loss of block sync, I'm tempted to throw some time at adding
+     that capability.  Supposing I were to actually do such a thing and
+     get it (apparently) working, do you accept contributed changes to
+     utilities like that?  (Leigh Clayton @file{loc@@soliton.com}, May 1995).
+
+  Isn't that exactly the role of the
+  @option{--use-compress-prog=@var{program}} option?
+  I never tried it myself, but I suspect you may want to write a
+  @var{prog} script or program able to filter stdin to stdout to
+  way you want.  It should recognize the @option{-d} option, for when
+  extraction is needed rather than creation.
+
+  It has been reported that if one writes compressed data (through the
+  @option{--gzip} or @option{--compress} options) to a DLT and tries to use
+  the DLT compression mode, the data will actually get bigger and one will
+  end up with less space on the tape.
+@end ignore
+
+@node sparse
+@subsection Archiving Sparse Files
+@cindex Sparse Files
+
+Files in the file system occasionally have @dfn{holes}.  A @dfn{hole}
+in a file is a section of the file's contents which was never written.
+The contents of a hole read as all zeros.  On many operating systems,
+actual disk storage is not allocated for holes, but they are counted
+in the length of the file.  If you archive such a file, @command{tar}
+could create an archive longer than the original.  To have @command{tar}
+attempt to recognize the holes in a file, use @option{--sparse}
+(@option{-S}).  When you use this option, then, for any file using
+less disk space than would be expected from its length, @command{tar}
+searches the file for consecutive stretches of zeros.  It then records
+in the archive for the file where the consecutive stretches of zeros
+are, and only archives the ``real contents'' of the file.  On
+extraction (using @option{--sparse} is not needed on extraction) any
+such files have holes created wherever the continuous stretches of zeros
+were found.  Thus, if you use @option{--sparse}, @command{tar} archives
+won't take more space than the original.
+
+@table @option
+@opindex sparse
+@item -S
+@itemx --sparse
+This option instructs @command{tar} to test each file for sparseness
+before attempting to archive it.  If the file is found to be sparse it
+is treated specially, which reduces the amount of space
+used by its image in the archive.
+
+This option is meaningful only when creating or updating archives.  It
+has no effect on extraction.
+@end table
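+
+As a rough illustration (the file and directory names here are
+hypothetical), a sparse file can be created with the @command{truncate}
+utility from @acronym{GNU} coreutils and then archived with
+@option{--sparse}:
+
+@smallexample
+$ @kbd{truncate -s 1G images/disk.img}
+$ @kbd{tar --create --sparse --file=images.tar images}
+@end smallexample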
+
+Consider using @option{--sparse} when performing file system backups,
+to avoid archiving the expanded forms of files stored sparsely in the
+system.
+
+Even if your system has no sparse files currently, some may be
+created in the future.  If you use @option{--sparse} while making file
+system backups as a matter of course, you can be assured the archive
+will never take more space on the media than the files take on disk
+(otherwise, archiving a disk filled with sparse files might take
+hundreds of tapes).  @xref{Incremental Dumps}.
+
+However, be aware that the @option{--sparse} option presents a serious
+drawback.  Namely, in order to determine if the file is sparse
+@command{tar} has to read it before trying to archive it, so in total
+the file is read @strong{twice}.  So, always bear in mind that the
+time needed to process all files with this option is roughly twice
+the time needed to archive them without it.
+@FIXME{A technical note:
+
+Programs like @command{dump} do not have to read the entire file; by
+examining the file system directly, they can determine in advance
+exactly where the holes are and thus avoid reading through them.  The
+only data they need to read are the actual allocated data blocks.
+@GNUTAR{} uses a more portable and straightforward
+archiving approach; it would be fairly difficult for it to do
+otherwise.  Elizabeth Zwicky writes to @file{comp.unix.internals}, on
+1990-12-10:
+
+@quotation
+What I did say is that you cannot tell the difference between a hole and an
+equivalent number of nulls without reading raw blocks.  @code{st_blocks} at
+best tells you how many holes there are; it doesn't tell you @emph{where}.
+Just as programs may, conceivably, care what @code{st_blocks} is (care
+to name one that does?), they may also care where the holes are (I have
+no examples of this one either, but it's equally imaginable).
+
+I conclude from this that good archivers are not portable.  One can
+arguably conclude that if you want a portable program, you can in good
+conscience restore files with as many holes as possible, since you can't
+get it right.
+@end quotation
+}
+
+@cindex sparse formats, defined
+When using @samp{POSIX} archive format, @GNUTAR{} is able to store
+sparse files in three distinct ways, called @dfn{sparse
+formats}.  A sparse format is identified by its @dfn{number},
+consisting, as usual, of two decimal numbers, delimited by a dot.  By
+default, format @samp{1.0} is used.  If, for some reason, you wish to
+use an earlier format, you can select it using the
+@option{--sparse-version} option.
+
+@table @option
+@opindex sparse-version
+@item --sparse-version=@var{version}
+
+Select the format to store sparse files in.  Valid @var{version} values
+are: @samp{0.0}, @samp{0.1} and @samp{1.0}.  @xref{Sparse Formats},
+for a detailed description of each format.
+@end table
+
+Using the @option{--sparse-version} option implies @option{--sparse}.
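+
+For instance, assuming a directory named @file{dir} (a hypothetical
+name), an archive that stores sparse members in the older @samp{0.1}
+format might be created like this:
+
+@smallexample
+$ @kbd{tar --create --format=posix --sparse-version=0.1 --file=archive.tar dir}
+@end smallexample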
+
+@node Attributes
+@section Handling File Attributes
+@UNREVISED
+
+When @command{tar} reads files, it updates their access times.  To
+avoid this, use the @option{--atime-preserve[=METHOD]} option, which can either
+reset the access time retroactively or avoid changing it in the first
+place.
+
+Handling of file attributes
+
+@table @option
+@opindex atime-preserve
+@item --atime-preserve
+@itemx --atime-preserve=replace
+@itemx --atime-preserve=system
+Preserve the access times of files that are read.  This works only for
+files that you own, unless you have superuser privileges.
+
+@option{--atime-preserve=replace} works on most systems, but it also
+restores the data modification time and updates the status change
+time.  Hence it doesn't interact with incremental dumps nicely
+(@pxref{Incremental Dumps}), and it can set access or data modification times
+incorrectly if other programs access the file while @command{tar} is
+running.
+
+@option{--atime-preserve=system} avoids changing the access time in
+the first place, if the operating system supports this.
+Unfortunately, this may or may not work on any given operating system
+or file system.  If @command{tar} knows for sure it won't work, it
+complains right away.
+
+Currently @option{--atime-preserve} with no operand defaults to
+@option{--atime-preserve=replace}, but this is intended to change to
+@option{--atime-preserve=system} when the latter is better-supported.
+
+@opindex touch
+@item -m
+@itemx --touch
+Do not extract data modification time.
+
+When this option is used, @command{tar} leaves the data modification times
+of the files it extracts as the times when the files were extracted,
+instead of setting them to the times recorded in the archive.
+
+This option is meaningless with @option{--list} (@option{-t}).
+
+@opindex same-owner
+@item --same-owner
+Create extracted files with the same ownership they have in the
+archive.
+
+This is the default behavior for the superuser,
+so this option is meaningful only for non-root users, when @command{tar}
+is executed on those systems able to give files away.  This is
+considered a security flaw by many people, at least because it
+makes it quite difficult to correctly account for the disk space
+users occupy.  Also, the @code{suid} or @code{sgid} attributes of
+files are easily and silently lost when files are given away.
+
+When writing an archive, @command{tar} writes the user id and user name
+separately.  If it can't find a user name (because the user id is not
+in @file{/etc/passwd}), then it does not write one.  When restoring,
+it tries to look the name (if one was written) up in
+@file{/etc/passwd}.  If it fails, then it uses the user id stored in
+the archive instead.
+
+@opindex no-same-owner
+@item --no-same-owner
+@itemx -o
+Do not attempt to restore ownership when extracting.  This is the
+default behavior for ordinary users, so this option has an effect
+only for the superuser.
+
+@opindex numeric-owner
+@item --numeric-owner
+The @option{--numeric-owner} option allows (ANSI) archives to be written
+without user/group name information or such information to be ignored
+when extracting.  It effectively disables the generation and/or use
+of user/group name information.  This option forces extraction using
+the numeric ids from the archive, ignoring the names.
+
+This is useful in certain circumstances, when restoring a backup from
+an emergency floppy with different passwd/group files for example.
+It is otherwise impossible to extract files with the right ownerships
+if the password file in use during the extraction does not match the
+one belonging to the file system(s) being extracted.  This occurs,
+for example, if you are restoring your files after a major crash and
+had booted from an emergency floppy with no password file or put your
+disk into another machine to do the restore.
+
+The numeric ids are @emph{always} saved into @command{tar} archives.
+The identifying names are added at create time when provided by the
+system, unless @option{--old-archive} (@option{-o}) is used.  Numeric ids could be
+used when moving archives between a collection of machines using
+a centralized management for attribution of numeric ids to users
+and groups.  This is often done using the NIS capabilities.
+
+When making a @command{tar} file for distribution to other sites, it
+is sometimes cleaner to use a single owner for all files in the
+distribution, and nicer to specify the write permission bits of the
+files as stored in the archive independently of their actual value on
+the file system.  The way to prepare a clean distribution is usually
+to have some Makefile rule creating a directory, copying all needed
+files in that directory, then setting ownership and permissions as
+wanted (there are a lot of possible schemes), and only then making a
+@command{tar} archive out of this directory, before cleaning
+everything out.  Of course, we could add a lot of options to
+@GNUTAR{} for fine tuning permissions and ownership.
+This is not a good way, I think.  @GNUTAR{} is
+already crowded with options and moreover, the approach just explained
+gives you a great deal of control already.
+
+@xopindex{same-permissions, short description}
+@xopindex{preserve-permissions, short description}
+@item -p
+@itemx --same-permissions
+@itemx --preserve-permissions
+Extract all protection information.
+
+This option causes @command{tar} to set the modes (access permissions) of
+extracted files exactly as recorded in the archive.  If this option
+is not used, the current @code{umask} setting limits the permissions
+on extracted files.  This option is by default enabled when
+@command{tar} is executed by a superuser.
+
+
+This option is meaningless with @option{--list} (@option{-t}).
+
+@opindex preserve
+@item --preserve
+Same as both @option{--same-permissions} and @option{--same-order}.
+
+The @option{--preserve} option has no equivalent short option name.
+It is equivalent to @option{--same-permissions} plus @option{--same-order}.
+
+@FIXME{I do not see the purpose of such an option.  (Neither I.  FP.)
+Neither do I. --Sergey}
+
+@end table
+
+@node Portability
+@section Making @command{tar} Archives More Portable
+
+Creating a @command{tar} archive on a particular system that is meant to be
+useful later on many other machines and with other versions of @command{tar}
+is more challenging than you might think.  @command{tar} archive formats
+have been evolving since the first versions of Unix.  Many such formats
+are around, and are not always compatible with each other.  This section
+discusses a few problems, and gives some advice about making @command{tar}
+archives more portable.
+
+One golden rule is simplicity.  For example, limit your @command{tar}
+archives to contain only regular files and directories, avoiding
+other kinds of special files.  Do not attempt to save sparse files or
+contiguous files as such.  Let's discuss a few more problems, in turn.
+
+@FIXME{Discuss GNU extensions (incremental backups, multi-volume
+archives and archive labels) in GNU and PAX formats.}
+
+@menu
+* Portable Names::              Portable Names
+* dereference::                 Symbolic Links
+* old::                         Old V7 Archives
+* ustar::                       Ustar Archives
+* gnu::                         GNU and old GNU format archives.
+* posix::                       @acronym{POSIX} archives
+* Checksumming::                Checksumming Problems
+* Large or Negative Values::    Large files, negative time stamps, etc.
+* Other Tars::                  How to Extract GNU-Specific Data Using
+                                Other @command{tar} Implementations
+@end menu
+
+@node Portable Names
+@subsection Portable Names
+
+Use portable file and member names.  A name is portable if it contains
+only ASCII letters and digits, @samp{/}, @samp{.}, @samp{_}, and
+@samp{-}; it cannot be empty, start with @samp{-} or @samp{//}, or
+contain @samp{/-}.  Avoid deep directory nesting.  For portability to
+old Unix hosts, limit your file name components to 14 characters or
+less.
+
+If you intend your @command{tar} archives to be read under
+MSDOS, you should not rely on case distinction for file names, and you
+might use the @acronym{GNU} @command{doschk} program to help you
+diagnose illegal MSDOS names, which are even more limited
+than System V's.
+
+@node dereference
+@subsection Symbolic Links
+@cindex File names, using symbolic links
+@cindex Symbolic link as file name
+
+@opindex dereference
+Normally, when @command{tar} archives a symbolic link, it writes a
+block to the archive naming the target of the link.  In that way, the
+@command{tar} archive is a faithful record of the file system contents.
+@option{--dereference} (@option{-h}) is used with @option{--create} (@option{-c}), and causes
+@command{tar} to archive the files symbolic links point to, instead of
+the links themselves.  When this option is used and @command{tar}
+encounters a symbolic link, it archives the linked-to file
+instead of simply recording the presence of a symbolic link.
+
+The name under which the file is stored in the file system is not
+recorded in the archive.  To record both the symbolic link name and
+the file name in the system, archive the file under both names.  If
+all links were recorded automatically by @command{tar}, an extracted file
+might be linked to a file name that no longer exists in the file
+system.
+
+If a linked-to file is encountered again by @command{tar} while creating
+the same archive, an entire second copy of it will be stored.  (This
+@emph{might} be considered a bug.)
+
+So, for portable archives, do not archive symbolic links as such,
+and use @option{--dereference} (@option{-h}): many systems do not support
+symbolic links, and moreover, your distribution might be unusable if
+it contains unresolved symbolic links.
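+
+A minimal sketch of such an invocation, assuming a source tree in a
+directory called @file{project} (a hypothetical name):
+
+@smallexample
+$ @kbd{tar --create --dereference --file=project.tar project}
+@end smallexample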
+
+@node old
+@subsection Old V7 Archives
+@cindex Format, old style
+@cindex Old style format
+@cindex Old style archives
+@cindex v7 archive format
+
+Certain old versions of @command{tar} cannot handle additional
+information recorded by newer @command{tar} programs.  To create an
+archive in V7 format (not ANSI), which can be read by these old
+versions, specify the @option{--format=v7} option in
+conjunction with the @option{--create} (@option{-c}) (@command{tar} also
+accepts @option{--portability} or @option{--old-archive} for this
+option).  When you specify it,
+@command{tar} leaves out information about directories, pipes, fifos,
+contiguous files, and device files, and specifies file ownership by
+group and user IDs instead of group and user names.
+
+When updating an archive, do not use @option{--format=v7}
+unless the archive was created using this option.
+
+In most cases, a @emph{new} format archive can be read by an @emph{old}
+@command{tar} program without serious trouble, so this option should
+seldom be needed.  On the other hand, most modern @command{tar}s are
+able to read old format archives, so it might be safer for you to
+always use @option{--format=v7} for your distributions.  Notice,
+however, that @samp{ustar} format is a better alternative, as it is
+free from many of @samp{v7}'s drawbacks.
+
+@node ustar
+@subsection Ustar Archive Format
+
+@cindex ustar archive format
+The archive format defined by the @acronym{POSIX}.1-1988 specification is called
+@code{ustar}.  Although it is more flexible than the V7 format, it
+still has many restrictions (@pxref{Formats,ustar}, for the detailed
+description of @code{ustar} format).  Along with V7 format,
+@code{ustar} format is a good choice for archives intended to be read
+with other implementations of @command{tar}.
+
+To create an archive in @code{ustar} format, use the @option{--format=ustar}
+option in conjunction with @option{--create} (@option{-c}).
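+
+For example, with hypothetical archive and directory names:
+
+@smallexample
+$ @kbd{tar --create --format=ustar --file=archive.tar dir}
+@end smallexample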
+
+@node gnu
+@subsection @acronym{GNU} and old @GNUTAR{} format
+
+@cindex GNU archive format
+@cindex Old GNU archive format
+@GNUTAR{} was based on an early draft of the
+@acronym{POSIX} 1003.1 @code{ustar} standard.  @acronym{GNU} extensions to
+@command{tar}, such as the support for file names longer than 100
+characters, use portions of the @command{tar} header record which were
+specified in that @acronym{POSIX} draft as unused.  Subsequent changes in
+@acronym{POSIX} have allocated the same parts of the header record for
+other purposes.  As a result, @GNUTAR{} format is
+incompatible with the current @acronym{POSIX} specification, and with
+@command{tar} programs that follow it.
+
+In the majority of cases, @command{tar} will be configured to create
+this format by default.  This will change in future releases, since
+we plan to make @samp{POSIX} format the default.
+
+To force creation of a @GNUTAR{} archive, use the
+@option{--format=gnu} option.
+
+@node posix
+@subsection @GNUTAR{} and @acronym{POSIX} @command{tar}
  
  @cindex POSIX archive format
  @cindex PAX archive format
@@ -7876,7 +8398,7 @@ Starting from version 1.14 @GNUTAR{} features full support for
  A @acronym{POSIX} conformant archive will be created if @command{tar}
  was given @option{--format=posix} (@option{--format=pax}) option.  No
  special option is required to read and extract from a @acronym{POSIX}
-archive. 
+archive.
  
  @menu
  * PAX keywords:: Controlling Extended Header Keywords.
@@ -8070,483 +8592,357 @@ be extracted by any tar implementation that understands older
  @FIXME{Describe how @acronym{POSIX} archives are extracted by non
  POSIX-aware tars.}
  
-@node Compression
-@section Using Less Space through Compression
-
-@menu
-* gzip::                        Creating and Reading Compressed Archives
-* sparse::                      Archiving Sparse Files
-@end menu
-
-@node gzip
-@subsection Creating and Reading Compressed Archives
-@cindex Compressed archives
-@cindex Storing archives in compressed format
-
-@GNUTAR{} is able to create and read compressed archives.  It supports
-@command{gzip} and @command{bzip2} compression programs.  For backward
-compatibilty, it also supports @command{compress} command, although
-we strongly recommend against using it, since there is a patent
-covering the algorithm it uses and you could be sued for patent
-infringement merely by running @command{compress}!  Besides, it is less
-effective than @command{gzip} and @command{bzip2}.
-
-Creating a compressed archive is simple: you just specify a
-@dfn{compression option} along with the usual archive creation
-commands.  The compression option is @option{-z} (@option{--gzip}) to
-create a @command{gzip} compressed archive, @option{-j}
-(@option{--bzip2}) to create a @command{bzip2} compressed archive, and
-@option{-Z} (@option{--compress}) to use @command{compress} program.
-For example:
-
-@smallexample
-$ @kbd{tar cfz archive.tar.gz .}
-@end smallexample
-
-Reading compressed archive is even simpler: you don't need to specify
-any additional options as @GNUTAR{} recognizes its format
-automatically.  Thus, the following commands will list and extract the
-archive created in previous example:
-
-@smallexample
-# List the compressed archive
-$ @kbd{tar tf archive.tar.gz}
-# Extract the compressed archive
-$ @kbd{tar xf archive.tar.gz}
-@end smallexample
-
-The only case when you have to specify a decompression option while
-reading the archive is when reading from a pipe or from a tape drive
-that does not support random access.  However, in this case @GNUTAR{}
-will indicate which option you should use.  For example:
-
-@smallexample
-$ @kbd{cat archive.tar.gz | tar tf -}
-tar: Archive is compressed.  Use -z option
-tar: Error is not recoverable: exiting now
-@end smallexample
-
-If you see such diagnostics, just add the suggested option to the
-invocation of @GNUTAR{}:
-
-@smallexample
-$ @kbd{cat archive.tar.gz | tar tfz -}
-@end smallexample
-
-Notice also, that there are several restrictions on operations on
-compressed archives.  First of all, compressed archives cannot be
-modified, i.e., you cannot update (@option{--update} (@option{-u})) them or delete
-(@option{--delete}) members from them.  Likewise, you cannot append
-another @command{tar} archive to a compressed archive using
-@option{--append} (@option{-r})).  Secondly, multi-volume archives cannot be
-compressed.
-
-The following table summarizes compression options used by @GNUTAR{}.
-
-@table @option
-@opindex gzip
-@opindex ungzip
-@item -z
-@itemx --gzip
-@itemx --ungzip
-Filter the archive through @command{gzip}.
-
-You can use @option{--gzip} and @option{--gunzip} on physical devices
-(tape drives, etc.) and remote files as well as on normal files; data
-to or from such devices or remote files is reblocked by another copy
-of the @command{tar} program to enforce the specified (or default) record
-size.  The default compression parameters are used; if you need to
-override them, set @env{GZIP} environment variable, e.g.:
-
-@smallexample
-$ @kbd{GZIP=--best tar cfz archive.tar.gz subdir}
-@end smallexample
-
-@noindent
-Another way would be to avoid the @option{--gzip} (@option{--gunzip}, @option{--ungzip}, @option{-z}) option and run
-@command{gzip} explicitly:
-
-@smallexample
-$ @kbd{tar cf - subdir | gzip --best -c - > archive.tar.gz}
-@end smallexample
-
-@cindex corrupted archives
-About corrupted compressed archives: @command{gzip}'ed files have no
-redundancy, for maximum compression.  The adaptive nature of the
-compression scheme means that the compression tables are implicitly
-spread all over the archive.  If you lose a few blocks, the dynamic
-construction of the compression tables becomes unsynchronized, and there
-is little chance that you could recover later in the archive.
-
-There are pending suggestions for having a per-volume or per-file
-compression in @GNUTAR{}.  This would allow for viewing the
-contents without decompression, and for resynchronizing decompression at
-every volume or file, in case of corrupted archives.  Doing so, we might
-lose some compressibility.  But this would have make recovering easier.
-So, there are pros and cons.  We'll see!
-
-@opindex bzip2
-@item -j
-@itemx --bzip2
-Filter the archive through @code{bzip2}.  Otherwise like @option{--gzip}.
-
-@opindex compress
-@opindex uncompress
-@item -Z
-@itemx --compress
-@itemx --uncompress
-Filter the archive through @command{compress}.  Otherwise like @option{--gzip}.
-
-The @acronym{GNU} Project recommends you not use
-@command{compress}, because there is a patent covering the algorithm it
-uses.  You could be sued for patent infringement merely by running
-@command{compress}.
-
-@opindex use-compress-program
-@item --use-compress-program=@var{prog}
-Use external compression program @var{prog}.  Use this option if you
-have a compression program that @GNUTAR{} does not support.  There
-are two requirements to which @var{prog} should comply:
-
-First, when called without options, it should read data from standard
-input, compress it and output it on standard output.
-
-Secondly, if called with @option{-d} argument, it should do exactly
-the opposite, i.e., read the compressed data from the standard input
-and produce uncompressed data on the standard output.
-@end table
-
-@cindex gpg, using with tar
-@cindex gnupg, using with tar
-@cindex Using encrypted archives
-The @option{--use-compress-program} option, in particular, lets you
-implement your own filters, not necessarily dealing with
-compression/decomression.  For example, suppose you wish to implement
-PGP encryption on top of compression, using @command{gpg} (@pxref{Top,
-gpg, gpg ---- encryption and signing tool, gpg, GNU Privacy Guard
-Manual}).  The following script does that:  
-
-@smallexample
-@group
-#! /bin/sh
-case $1 in
--d) gpg --decrypt - | gzip -d -c;;
-'') gzip -c | gpg -s ;;
-*)  echo "Unknown option $1">&2; exit 1;;
-esac
-@end group
-@end smallexample
+@node Other Tars
+@subsection How to Extract GNU-Specific Data Using Other @command{tar} Implementations
  
-Suppose you name it @file{gpgz} and save it somewhere in your
-@env{PATH}.  Then the following command will create a commpressed
-archive signed with your private key:
+In previous sections you became acquainted with various quirks
+necessary to make your archives portable.  Sometimes you may need to
+extract archives containing GNU-specific members using some
+third-party @command{tar} implementation or an older version of
+@GNUTAR{}.  Of course, your best bet is to have @GNUTAR{} installed,
+but if for some reason that is impossible, this section explains
+how to cope without it.
  
-@smallexample
-$ @kbd{tar -cf foo.tar.gpgz --use-compress=gpgz .}
-@end smallexample
+When we speak about @dfn{GNU-specific} members we mean two classes of
+them: members split between the volumes of a multi-volume archive and
+sparse members.  You will always be able to recover such members if
+the archive is in PAX format.  In addition, split members can be
+recovered from archives in old GNU format.  The following subsections
+describe the required procedures in detail.
  
-@noindent
-Likewise, the following command will list its contents:
+@menu
+* Split Recovery::       Members Split Between Volumes
+* Sparse Recovery::      Sparse Members
+@end menu
  
-@smallexample
-$ @kbd{tar -tf foo.tar.gpgz --use-compress=gpgz .}
-@end smallexample
+@node Split Recovery
+@subsubsection Extracting Members Split Between Volumes
  
-@ignore
-The above is based on the following discussion:
+@cindex Multi-volume archives, extracting using non-GNU tars
+If a member is split between several volumes of an old GNU format archive,
+most third-party @command{tar} implementations will fail to extract
+it.  To extract it, use the @command{tarcat} program (@pxref{Tarcat}).
+This program is available from the
+@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/tarcat.html, @GNUTAR{}
+home page}.  It concatenates several archive volumes into a single
+valid archive.  For example, if you have three volumes named from
+@file{vol-1.tar} to @file{vol-3.tar}, you can do the following to
+extract them using a third-party @command{tar}:
  
-     I have one question, or maybe it's a suggestion if there isn't a way
-     to do it now.  I would like to use @option{--gzip}, but I'd also like
-     the output to be fed through a program like @acronym{GNU}
-     @command{ecc} (actually, right now that's @samp{exactly} what I'd like
-     to use :-)), basically adding ECC protection on top of compression.
-     It seems as if this should be quite easy to do, but I can't work out
-     exactly how to go about it.  Of course, I can pipe the standard output
-     of @command{tar} through @command{ecc}, but then I lose (though I
-     haven't started using it yet, I confess) the ability to have
-     @command{tar} use @command{rmt} for it's I/O (I think).
+@smallexample
+$ @kbd{tarcat vol-1.tar vol-2.tar vol-3.tar | tar xf -}
+@end smallexample
  
-     I think the most straightforward thing would be to let me specify a
-     general set of filters outboard of compression (preferably ordered,
-     so the order can be automatically reversed on input operations, and
-     with the options they require specifiable), but beggars shouldn't be
-     choosers and anything you decide on would be fine with me.
+@cindex Multi-volume archives in PAX format, extracting using non-GNU tars
+You could use this approach for most (although not all) PAX
+format archives as well.  However, extracting split members from a PAX
+archive is a much easier task, because PAX volumes are constructed in
+such a way that each part of a split member is extracted to a
+different file by @command{tar} implementations that are not aware of
+GNU extensions.  More specifically, the very first part retains its
+original name, and all subsequent parts are named using the pattern:
  
-     By the way, I like @command{ecc} but if (as the comments say) it can't
-     deal with loss of block sync, I'm tempted to throw some time at adding
-     that capability.  Supposing I were to actually do such a thing and
-     get it (apparently) working, do you accept contributed changes to
-     utilities like that?  (Leigh Clayton @file{loc@@soliton.com}, May 1995).
- 
-  Isn't that exactly the role of the
-  @option{--use-compress-prog=@var{program}} option? 
-  I never tried it myself, but I suspect you may want to write a
-  @var{prog} script or program able to filter stdin to stdout to
-  way you want.  It should recognize the @option{-d} option, for when
-  extraction is needed rather than creation.
+@smallexample
+%d/GNUFileParts.%p/%f.%n
+@end smallexample
  
-  It has been reported that if one writes compressed data (through the
-  @option{--gzip} or @option{--compress} options) to a DLT and tries to use
-  the DLT compression mode, the data will actually get bigger and one will
-  end up with less space on the tape.
-@end ignore
+@noindent
+where symbols preceded by @samp{%} are @dfn{macro characters} that
+have the following meaning:
  
-@node sparse
-@subsection Archiving Sparse Files
-@cindex Sparse Files
-@UNREVISED
+@multitable @columnfractions .25 .55
+@headitem Macro character @tab Replaced By
+@item %d @tab  The directory name of the file, equivalent to the
+result of the @command{dirname} utility on its full name.
+@item %f @tab  The file name of the file, equivalent to the result
+of the @command{basename} utility on its full name.
+@item %p @tab  The process ID of the @command{tar} process that
+created the archive.
+@item %n @tab  Ordinal number of this particular part.
+@end multitable
  
-@table @option
-@opindex sparse
-@item -S
-@itemx --sparse
-Handle sparse files efficiently.
-@end table
+For example, if the file @file{var/longfile} was split during archive
+creation between three volumes, and the creator @command{tar} process
+had process ID @samp{27962}, then the member names will be:
  
-This option causes all files to be put in the archive to be tested for
-sparseness, and handled specially if they are.  The @option{--sparse}
-(@option{-S}) option is useful when many @code{dbm} files, for example, are being
-backed up.  Using this option dramatically decreases the amount of
-space needed to store such a file.
+@smallexample
+var/longfile
+var/GNUFileParts.27962/longfile.1
+var/GNUFileParts.27962/longfile.2
+@end smallexample
  
-In later versions, this option may be removed, and the testing and
-treatment of sparse files may be done automatically with any special
-@acronym{GNU} options.  For now, it is an option needing to be specified on
-the command line with the creation or updating of an archive.
+When you extract your archive using a third-party @command{tar}, these
+files will be created on your disk, and the only thing you will need
+to do to restore your file in its original form is concatenate them in
+the proper order, for example:
  
-Files in the file system occasionally have @dfn{holes}.  A @dfn{hole} in a file
-is a section of the file's contents which was never written.  The
-contents of a hole read as all zeros.  On many operating systems,
-actual disk storage is not allocated for holes, but they are counted
-in the length of the file.  If you archive such a file, @command{tar}
-could create an archive longer than the original.  To have @command{tar}
-attempt to recognize the holes in a file, use @option{--sparse} (@option{-S}).  When
-you use this option, then, for any file using less disk space than
-would be expected from its length, @command{tar} searches the file for
-consecutive stretches of zeros.  It then records in the archive for
-the file where the consecutive stretches of zeros are, and only
-archives the ``real contents'' of the file.  On extraction (using
-@option{--sparse} is not needed on extraction) any such
-files have holes created wherever the continuous stretches of zeros
-were found. Thus, if you use @option{--sparse}, @command{tar} archives
-won't take more space than the original.
+@smallexample
+@group
+$ @kbd{cd var}
+$ @kbd{cat GNUFileParts.27962/longfile.1 \
+  GNUFileParts.27962/longfile.2 >> longfile}
+$ @kbd{rm -rf GNUFileParts.27962}
+@end group
+@end smallexample
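+
+If a member was split into many parts, the concatenation step can be
+scripted instead of listing every part by hand.  The following is only
+a sketch: it assumes a POSIX shell and that the parts are numbered
+consecutively starting from 1, and it reuses the names from the
+example above:
+
+@smallexample
+@group
+$ @kbd{cd var}
+$ @kbd{n=1}
+$ @kbd{while test -f GNUFileParts.27962/longfile.$n; do
+    cat GNUFileParts.27962/longfile.$n >> longfile
+    n=$((n + 1))
+  done}
+@end group
+@end smallexample
+
+@noindent
+The loop stops at the first missing part number, so check that the
+number of parts appended matches what you expect before removing the
+@file{GNUFileParts.27962} directory.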
  
-A file is sparse if it contains blocks of zeros whose existence is
-recorded, but that have no space allocated on disk.  When you specify
-the @option{--sparse} option in conjunction with the @option{--create}
-(@option{-c}) operation, @command{tar} tests all files for sparseness
-while archiving. If @command{tar} finds a file to be sparse, it uses a
-sparse representation of the file in the archive.  @xref{create}, for
-more information about creating archives.
+Note that if the @command{tar} implementation you use supports PAX
+format archives, it will probably emit warnings about unknown keywords
+during extraction.  They will look like this:
  
-@option{--sparse} is useful when archiving files, such as dbm files,
-likely to contain many nulls.  This option dramatically
-decreases the amount of space needed to store such an archive.
+@smallexample
+@group
+Tar file too small
+Unknown extended header keyword 'GNU.volume.filename' ignored.
+Unknown extended header keyword 'GNU.volume.size' ignored.
+Unknown extended header keyword 'GNU.volume.offset' ignored.
+@end group
+@end smallexample
  
-@quotation
-@strong{Please Note:} Always use @option{--sparse} when performing file
-system backups, to avoid archiving the expanded forms of files stored
-sparsely in the system.
+@noindent
+You can safely ignore these warnings.
  
-Even if your system has no sparse files currently, some may be
-created in the future.  If you use @option{--sparse} while making file
-system backups as a matter of course, you can be assured the archive
-will never take more space on the media than the files take on disk
-(otherwise, archiving a disk filled with sparse files might take
-hundreds of tapes).  @xref{Incremental Dumps}.
-@end quotation
+If your @command{tar} implementation is not PAX-aware, you will get
+more warnings and more files generated on your disk, e.g.:
  
-@command{tar} ignores the @option{--sparse} option when reading an archive.
+@smallexample
+@group
+$ @kbd{tar xf vol-1.tar}
+var/PaxHeaders.27962/longfile: Unknown file type 'x', extracted as
+normal file
+Unexpected EOF in archive
+$ @kbd{tar xf vol-2.tar}
+tmp/GlobalHead.27962.1: Unknown file type 'g', extracted as normal file
+GNUFileParts.27962/PaxHeaders.27962/sparsefile.1: Unknown file type
+'x', extracted as normal file
+@end group
+@end smallexample
  
-@table @option
-@item --sparse
-@itemx -S
-Files stored sparsely in the file system are represented sparsely in
-the archive.  Use in conjunction with write operations.
-@end table
+Ignore these warnings.  The @file{PaxHeaders.*} directories created
+will contain files with @dfn{extended header keywords} describing the
+extracted files.  You can delete them, unless they describe sparse
+members.  Read further to learn more about them.
  
-However, users should be well aware that at archive creation time,
-@GNUTAR{} still has to read whole disk file to
-locate the @dfn{holes}, and so, even if sparse files use little space
-on disk and in the archive, they may sometimes require inordinate
-amount of time for reading and examining all-zero blocks of a file.
-Although it works, it's painfully slow for a large (sparse) file, even
-though the resulting tar archive may be small.  (One user reports that
-dumping a @file{core} file of over 400 megabytes, but with only about
-3 megabytes of actual data, took about 9 minutes on a Sun Sparcstation
-ELC, with full CPU utilization.)
-
-This reading is required in all cases and is not related to the fact
-the @option{--sparse} option is used or not, so by merely @emph{not}
-using the option, you are not saving time@footnote{Well!  We should say
-the whole truth, here.  When @option{--sparse} is selected while creating
-an archive, the current @command{tar} algorithm requires sparse files to be
-read twice, not once.  We hope to develop a new archive format for saving
-sparse files in which one pass will be sufficient.}.
+@node Sparse Recovery
+@subsubsection Extracting Sparse Members
  
-Programs like @command{dump} do not have to read the entire file; by
-examining the file system directly, they can determine in advance
-exactly where the holes are and thus avoid reading through them.  The
-only data it need read are the actual allocated data blocks.
-@GNUTAR{} uses a more portable and straightforward
-archiving approach, it would be fairly difficult that it does
-otherwise.  Elizabeth Zwicky writes to @file{comp.unix.internals}, on
-1990-12-10:
+@cindex sparse files, extracting with non-GNU tars
+Any @command{tar} implementation will be able to extract sparse members from a
+PAX archive.  However, the extracted files will be @dfn{condensed},
+i.e., any zero blocks will be removed from them.  When we restore such
+a condensed file to its original form, by adding zero blocks (or
+@dfn{holes}) back to their original locations, we call this process
+@dfn{expanding} a condensed sparse file.
  
-@quotation
-What I did say is that you cannot tell the difference between a hole and an
-equivalent number of nulls without reading raw blocks.  @code{st_blocks} at
-best tells you how many holes there are; it doesn't tell you @emph{where}.
-Just as programs may, conceivably, care what @code{st_blocks} is (care
-to name one that does?), they may also care where the holes are (I have
-no examples of this one either, but it's equally imaginable).
+@pindex xsparse
+To expand a file, you will need a simple auxiliary program called
+@command{xsparse}.  It is available in source form from
+@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/xsparse.html, @GNUTAR{}
+home page}.
  
-I conclude from this that good archivers are not portable.  One can
-arguably conclude that if you want a portable program, you can in good
-conscience restore files with as many holes as possible, since you can't
-get it right.
-@end quotation
+@cindex sparse files v.1.0, extracting with non-GNU tars
+Let's begin with archive members in @dfn{sparse format
+version 1.0}@footnote{@xref{PAX 1}.}, which are the easiest to expand.
+The condensed file will contain both file map and file data, so no
+additional data will be needed to restore it.  If the original file
+name was @file{@var{dir}/@var{name}}, then the condensed file will be
+named @file{@var{dir}/@/GNUSparseFile.@var{n}/@/@var{name}}, where
+@var{n} is a decimal number@footnote{Technically speaking, @var{n} is the
+@dfn{process ID} of the @command{tar} process which created the
+archive (@pxref{PAX keywords}).}.
  
-@node Attributes
-@section Handling File Attributes
-@UNREVISED
+To expand a version 1.0 file, run @command{xsparse} as follows:
  
-When @command{tar} reads files, it updates their access times.  To
-avoid this, use the @option{--atime-preserve[=METHOD]} option, which can either
-reset the access time retroactively or avoid changing it in the first
-place.
+@smallexample
+$ @kbd{xsparse @file{cond-file}}
+@end smallexample
  
-Handling of file attributes
+@noindent
+where @file{cond-file} is the name of the condensed file.  The utility
+will deduce the name for the resulting expanded file using the
+following algorithm:
+
+@enumerate 1
+@item If @file{cond-file} does not contain any directories,
+@file{../cond-file} will be used;
+
+@item If @file{cond-file} has the form
+@file{@var{dir}/@var{t}/@var{name}}, where both @var{t} and @var{name}
+are simple names, with no @samp{/} characters in them, the output file
+name will be @file{@var{dir}/@var{name}}.
+
+@item Otherwise, if @file{cond-file} has the form
+@file{@var{dir}/@var{name}}, the output file name will be
+@file{@var{name}}.
+@end enumerate
  
-@table @option
-@opindex atime-preserve
-@item --atime-preserve
-@itemx --atime-preserve=replace
-@itemx --atime-preserve=system
-Preserve the access times of files that are read.  This works only for
-files that you own, unless you have superuser privileges.
+In the unlikely case that this algorithm does not suit your needs,
+you can explicitly specify the output file name as a second argument to
+the command:
  
-@option{--atime-preserve=replace} works on most systems, but it also
-restores the data modification time and updates the status change
-time.  Hence it doesn't interact with incremental dumps nicely
-(@pxref{Backups}), and it can set access or data modification times
-incorrectly if other programs access the file while @command{tar} is
-running.
+@smallexample
+$ @kbd{xsparse @file{cond-file} @file{out-file}}
+@end smallexample
  
-@option{--atime-preserve=system} avoids changing the access time in
-the first place, if the operating system supports this.
-Unfortunately, this may or may not work on any given operating system
-or file system.  If @command{tar} knows for sure it won't work, it
-complains right away.
+It is often a good idea to run @command{xsparse} in @dfn{dry run} mode
+first.  In this mode, the command does not actually expand the file,
+but verbosely lists all actions it would be taking to do so.  The dry
+run mode is enabled by the @option{-n} command line option:
  
-Currently @option{--atime-preserve} with no operand defaults to
-@option{--atime-preserve=replace}, but this is intended to change to
-@option{--atime-preserve=system} when the latter is better-supported.
+@smallexample
+@group
+$ @kbd{xsparse -n /home/gray/GNUSparseFile.6058/sparsefile}
+Reading v.1.0 sparse map
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
+`/home/gray/sparsefile'
+Finished dry run
+@end group
+@end smallexample
  
-@opindex touch
-@item -m
-@itemx --touch
-Do not extract data modification time.
+To actually expand the file, you would run:
  
-When this option is used, @command{tar} leaves the data modification times
-of the files it extracts as the times when the files were extracted,
-instead of setting it to the times recorded in the archive.
+@smallexample
+$ @kbd{xsparse /home/gray/GNUSparseFile.6058/sparsefile}
+@end smallexample
  
-This option is meaningless with @option{--list} (@option{-t}).
+@noindent
+The program behaves the same way all UNIX utilities do: it will keep
+quiet unless it has something important to tell you (e.g., an error
+condition).  If you wish it to produce verbose output,
+similar to that from the dry run mode, use the @option{-v} option:
  
-@opindex same-owner
-@item --same-owner
-Create extracted files with the same ownership they have in the
-archive.
+@smallexample
+@group
+$ @kbd{xsparse -v /home/gray/GNUSparseFile.6058/sparsefile}
+Reading v.1.0 sparse map
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
+`/home/gray/sparsefile'
+Done
+@end group
+@end smallexample
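+
+When an archive contains many version 1.0 sparse members, the
+condensed files can be located with @command{find} and passed to
+@command{xsparse} one at a time.  The following is only a sketch: it
+assumes a @command{find} that supports the @option{-path} primary, and
+it relies on the output name deduction described above.  Do a dry
+run first, then repeat the command without @option{-n}:
+
+@smallexample
+@group
+$ @kbd{find . -type f -path '*/GNUSparseFile.*/*' -exec xsparse -n @{@} \;}
+$ @kbd{find . -type f -path '*/GNUSparseFile.*/*' -exec xsparse @{@} \;}
+@end group
+@end smallexample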
  
-This is the default behavior for the superuser,
-so this option is meaningful only for non-root users, when @command{tar}
-is executed on those systems able to give files away.  This is
-considered as a security flaw by many people, at least because it
-makes quite difficult to correctly account users for the disk space
-they occupy.  Also, the @code{suid} or @code{sgid} attributes of
-files are easily and silently lost when files are given away.
+Additionally, if your @command{tar} implementation has extracted the
+@dfn{extended headers} for this file, you can instruct @command{xsparse}
+to use them in order to verify the integrity of the expanded file.
+The option @option{-x} sets the name of the extended header file to
+use.  Continuing our example:
  
-When writing an archive, @command{tar} writes the user id and user name
-separately.  If it can't find a user name (because the user id is not
-in @file{/etc/passwd}), then it does not write one.  When restoring,
-it tries to look the name (if one was written) up in
-@file{/etc/passwd}.  If it fails, then it uses the user id stored in
-the archive instead. 
+@smallexample
+@group
+$ @kbd{xsparse -v -x /home/gray/PaxHeaders.6058/sparsefile \
+  /home/gray/GNUSparseFile.6058/sparsefile}
+Reading extended header file
+Found variable GNU.sparse.major = 1
+Found variable GNU.sparse.minor = 0
+Found variable GNU.sparse.name = sparsefile
+Found variable GNU.sparse.realsize = 217481216
+Reading v.1.0 sparse map
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
+`/home/gray/sparsefile'
+Done
+@end group
+@end smallexample
  
-@opindex no-same-owner
-@item --no-same-owner
-@itemx -o
-Do not attempt to restore ownership when extracting.  This is the
-default behavior for ordinary users, so this option has an effect
-only for the superuser.
+@anchor{extracting sparse v.0.x}
+@cindex sparse files v.0.1, extracting with non-GNU tars
+@cindex sparse files v.0.0, extracting with non-GNU tars
+An @dfn{extended header} is a special @command{tar} archive header
+that precedes an archive member and contains a set of
+@dfn{variables}, describing the member properties that cannot be
+stored in the standard @code{ustar} header.  While optional for
+expanding sparse version 1.0 members, the use of extended headers is
+mandatory when expanding sparse members in older sparse formats: v.0.0
+and v.0.1.  (The sparse formats are described in detail in @ref{Sparse
+Formats}.)  So, for these formats, the question is: how to obtain
+extended headers from the archive?
+
+If you use a @command{tar} implementation that does not support PAX
+format, extended headers for each member will be extracted as a
+separate file.  If we represent the member name as
+@file{@var{dir}/@var{name}}, then the extended header file will be
+named @file{@var{dir}/@/PaxHeaders.@var{n}/@/@var{name}}, where
+@var{n} is an integer number.
+
+Things become more difficult if your @command{tar} implementation
+does support PAX headers, because in this case you will have to
+manually extract the headers.  We recommend the following algorithm:
+
+@enumerate 1
+@item
+Consult the documentation of your @command{tar} implementation for an
+option that prints @dfn{block numbers} along with the archive
+listing (analogous to @GNUTAR{}'s @option{-R} option).  For example,
+@command{star} has @option{-block-number}.
  
-@opindex numeric-owner
-@item --numeric-owner
-The @option{--numeric-owner} option allows (ANSI) archives to be written
-without user/group name information or such information to be ignored
-when extracting.  It effectively disables the generation and/or use
-of user/group name information.  This option forces extraction using
-the numeric ids from the archive, ignoring the names.
+@item
+Obtain a verbose listing using the @samp{block number} option, and
+find block numbers of the sparse member in question and the member
+immediately following it.  For example, running @command{star} on our
+archive we obtain:
  
-This is useful in certain circumstances, when restoring a backup from
-an emergency floppy with different passwd/group files for example.
-It is otherwise impossible to extract files with the right ownerships
-if the password file in use during the extraction does not match the
-one belonging to the file system(s) being extracted.  This occurs,
-for example, if you are restoring your files after a major crash and
-had booted from an emergency floppy with no password file or put your
-disk into another machine to do the restore.
+@smallexample
+@group
+$ @kbd{star -t -v -block-number -f arc.tar}
+@dots{}
+star: Unknown extended header keyword 'GNU.sparse.size' ignored.
+star: Unknown extended header keyword 'GNU.sparse.numblocks' ignored.
+star: Unknown extended header keyword 'GNU.sparse.name' ignored.
+star: Unknown extended header keyword 'GNU.sparse.map' ignored.
+block        56:  425984 -rw-r--r--  gray/users Jun 25 14:46 2006 GNUSparseFile.28124/sparsefile
+block       897:   65391 -rw-r--r--  gray/users Jun 24 20:06 2006 README
+@dots{}
+@end group
+@end smallexample
  
-The numeric ids are @emph{always} saved into @command{tar} archives.
-The identifying names are added at create time when provided by the
-system, unless @option{--old-archive} (@option{-o}) is used.  Numeric ids could be
-used when moving archives between a collection of machines using
-a centralized management for attribution of numeric ids to users
-and groups.  This is often made through using the NIS capabilities.
+@noindent
+(as usual, ignore the warnings about unknown keywords.)
  
-When making a @command{tar} file for distribution to other sites, it
-is sometimes cleaner to use a single owner for all files in the
-distribution, and nicer to specify the write permission bits of the
-files as stored in the archive independently of their actual value on
-the file system.  The way to prepare a clean distribution is usually
-to have some Makefile rule creating a directory, copying all needed
-files in that directory, then setting ownership and permissions as
-wanted (there are a lot of possible schemes), and only then making a
-@command{tar} archive out of this directory, before cleaning
-everything out.  Of course, we could add a lot of options to
-@GNUTAR{} for fine tuning permissions and ownership.
-This is not the good way, I think.  @GNUTAR{} is
-already crowded with options and moreover, the approach just explained
-gives you a great deal of control already.
+@item
+Let @var{size} be the size of the sparse member, @var{Bs} be its block number
+and @var{Bn} be the block number of the next member.
+Compute:
  
-@xopindex{same-permissions, short description}
-@xopindex{preserve-permissions, short description}
-@item -p
-@itemx --same-permissions
-@itemx --preserve-permissions
-Extract all protection information.
+@smallexample
+@var{N} = @var{Bn} - @var{Bs} - @var{size}/512 - 2
+@end smallexample
  
-This option causes @command{tar} to set the modes (access permissions) of
-extracted files exactly as recorded in the archive.  If this option
-is not used, the current @code{umask} setting limits the permissions
-on extracted files.  This option is by default enabled when
-@command{tar} is executed by a superuser.
+@noindent
+This number gives the size of the extended header part in tar @dfn{blocks}.
+In our example, this formula gives: @code{897 - 56 - 425984 / 512 - 2
+= 7}.
  
+@item
+Use @command{dd} to extract the headers:
  
-This option is meaningless with @option{--list} (@option{-t}).
+@smallexample
+@kbd{dd if=@var{archive} of=@var{hname} bs=512 skip=@var{Bs} count=@var{N}}
+@end smallexample
  
-@opindex preserve
-@item --preserve
-Same as both @option{--same-permissions} and @option{--same-order}.
+@noindent
+where @var{archive} is the archive name, @var{hname} is the name of the
+file in which to store the extended header, and @var{Bs} and @var{N} are
+the values computed in the previous steps.
  
-The @option{--preserve} option has no equivalent short option name.
-It is equivalent to @option{--same-permissions} plus @option{--same-order}.
+In our example, this command will be
  
-@FIXME{I do not see the purpose of such an option.  (Neither I.  FP.)
-Neither do I. --Sergey}
+@smallexample
+$ @kbd{dd if=arc.tar of=xhdr bs=512 skip=56 count=7}
+@end smallexample
+@end enumerate
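The arithmetic in the steps above can be sketched as a small shell helper. This is only an illustration: the function name `xhdr_blocks` is hypothetical and not part of tar or xsparse. It follows the operand order of the worked example (897 - 56 - 425984 / 512 - 2) and assumes the member size is a multiple of 512, as it is there.

```shell
# Hypothetical helper, not part of tar or xsparse.
# Prints N, the number of 512-byte blocks occupied by the extended header.
#   $1 = Bs,   block number of the sparse member
#   $2 = Bn,   block number of the next member
#   $3 = size, size in bytes of the sparse member as listed
xhdr_blocks() {
    bs=$1
    bn=$2
    size=$3
    # The trailing "- 2" matches the formula given in the text.
    echo $(( $bn - $bs - $size / 512 - 2 ))
}

# Values from the listing in the example: prints 7
xhdr_blocks 56 897 425984
```

The result is the value to pass as count= to dd, with Bs as the skip= value.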
  
-@end table
+Finally, you can expand the condensed file, using the obtained header:
+
+@smallexample
+@group
+$ @kbd{xsparse -v -x xhdr GNUSparseFile.28124/sparsefile}
+Reading extended header file
+Found variable GNU.sparse.size = 217481216
+Found variable GNU.sparse.numblocks = 208
+Found variable GNU.sparse.name = sparsefile
+Found variable GNU.sparse.map = 0,2048,1050624,2048,@dots{}
+Expanding file `GNUSparseFile.28124/sparsefile' to `sparsefile'
+Done
+@end group
+@end smallexample
  
  @node cpio
  @section Comparison of @command{tar} and @command{cpio}
@@ -9083,7 +9479,7 @@ examples of format parameter considerations.
  @opindex blocking-factor
  The data in an archive is grouped into blocks, which are 512 bytes.
  Blocks are read and written in whole number multiples called
-@dfn{records}.  The number of blocks in a record (i.e.  the size of a
+@dfn{records}.  The number of blocks in a record (i.e., the size of a
  record in units of 512 bytes) is called the @dfn{blocking factor}.
  The @option{--blocking-factor=@var{512-size}} (@option{-b
  @var{512-size}}) option specifies the blocking factor of an archive.
@@ -9141,7 +9537,7 @@ it would normally.  To extract files from an archive with a non-standard
  blocking factor (particularly if you're not sure what the blocking factor
  is), you can usually use the @option{--read-full-records} (@option{-B}) option while
  specifying a blocking factor larger than the blocking factor of the archive
-(i.e.  @samp{tar --extract --read-full-records --blocking-factor=300}.
+(i.e., @samp{tar --extract --read-full-records --blocking-factor=300}).
  @xref{list}, for more information on the @option{--list} (@option{-t})
  operation.  @xref{Reading}, for a more detailed explanation of that option.
  
@@ -9547,15 +9943,15 @@ on several media volumes of fixed size.  Although in this section we will
  often call @samp{volume} a @dfn{tape}, there is absolutely no
  requirement for multi-volume archives to be stored on tapes.  Instead,
  they can use whatever media type the user finds convenient, they can
-even be located on files.  
+even be located on files.
  
-When creating a multi-volume arvhive, @GNUTAR{} continues to fill
+When creating a multi-volume archive, @GNUTAR{} continues to fill
  the current volume until it runs out of space, then it switches to
  the next volume (usually the operator is asked to replace the tape at
  this point), and continues working on the new volume.  This operation
-continues untill all requested files are dumped.  If @GNUTAR{} detects
+continues until all requested files are dumped.  If @GNUTAR{} detects
  end of media while dumping a file, such a file is archived in split
-form.  Some very big files can even be split across several volumes. 
+form.  Some very big files can even be split across several volumes.
  
  Each volume is itself a valid @GNUTAR{} archive, so it can be read
  without any special options.  Consequently any file member residing
@@ -9633,7 +10029,7 @@ $ @kbd{tar --create --tape-length=41943040 --file=/dev/tape @var{files}}
  When @GNUTAR{} comes to the end of a storage media, it asks you to
  change the volume.  The built-in prompt for POSIX locale
  is@footnote{If you run @GNUTAR{} under a different locale, the
-translation to the locale's language will be used.}: 
+translation to the locale's language will be used.}:
  
  @smallexample
  Prepare volume #@var{n} for `@var{archive}' and hit return:
@@ -9672,7 +10068,7 @@ otherwise @command{tar} will write over the volume it just finished.)
  The volume number used by @command{tar} in its tape-changing prompt
  can be changed; if you give the
  @option{--volno-file=@var{file-of-number}} option, then
-@var{file-of-number} should be an unexisting file to be created, or
+@var{file-of-number} should be a nonexistent file to be created, or
  else, a file already containing a decimal number.  That number will be
  used as the volume number of the first volume written.  When
  @command{tar} is finished, it will rewrite the file with the
@@ -9688,7 +10084,7 @@ the number used in the prompt.)
  If you want more elaborate behavior than this, you can write a special
  @dfn{new volume script}, that will be responsible for changing the
  volume, and instruct @command{tar} to use it instead of its normal
-prompting procedure: 
+prompting procedure:
  
  @table @option
  @item --info-script=@var{script-name}
@@ -9752,7 +10148,7 @@ $ @kbd{tar cMff /dev/tape0 /dev/tape1 @var{files}}
  @end smallexample
  
  The second method is to use the @samp{n} response to the tape-change
-prompt.  
+prompt.
  
  Finally, the most flexible approach is to use a volume script, that
  writes new archive name to the file descriptor #3.  For example, the
@@ -9802,7 +10198,7 @@ To extract an archive member from one volume (assuming it is described
  that volume), use @option{--extract}, again without
  @option{--multi-volume}.
  
-If an archive member is split across volumes (i.e.  its entry begins on
+If an archive member is split across volumes (i.e., its entry begins on
  one volume of the media and ends on another), you need to specify
  @option{--multi-volume} to extract it successfully.  In this case, you
  should load the volume where the archive member starts, and use
@@ -9822,22 +10218,10 @@ added later.  To label subsequent volumes, specify
  @option{--label=@var{archive-label}} again in conjunction with the
  @option{--append}, @option{--update} or @option{--concatenate} operation.
  
-@FIXME{This is no longer true: Multivolume archives in @samp{POSIX}
-format can be extracted using any posix-compliant tar
-implementation.  The split members can then be recreated from parts
-using a simple shell script. Provide more information about it:}
-Beware that there is @emph{no} real standard about the proper way, for
-a @command{tar} archive, to span volume boundaries.  If you have a
-multi-volume created by some vendor's @command{tar}, there is almost
-no chance you could read all the volumes with @GNUTAR{}.
-The converse is also true: you may not expect
-multi-volume archives created by @GNUTAR{} to be
-fully recovered by vendor's @command{tar}.  Since there is little
-chance that, in mixed system configurations, some vendor's
-@command{tar} will work on another vendor's machine, and there is a
-great chance that @GNUTAR{} will work on most of
-them, your best bet is to install @GNUTAR{} on all
-machines between which you know exchange of files is possible.
+Note that multi-volume support is a GNU extension, and archives
+created in this mode should be read only with @GNUTAR{}.  If you
+absolutely have to process such archives using a third-party @command{tar}
+implementation, see @ref{Split Recovery}.
  
  @node Tape Files
  @subsection Tape Files
@@ -9935,7 +10319,7 @@ creating multiple volume archives.
  @cindex Listing volume label
    The volume label will be displayed by @option{--list} along with
  the file contents.  If verbose display is requested, it will also be
-explicitely marked as in the example below:
+explicitly marked as in the example below:
  
  @smallexample
  @group
@@ -9980,7 +10364,7 @@ with using @option{--label} option, @command{tar} will first check if
  the archive label matches the one specified and will refuse to proceed
  if it does not.  Use this as a safety precaution to avoid accidentally
  overwriting existing archives.  For example, if you wish to add files
-to @file{archive}, presumably labelled with string @samp{My volume},
+to @file{archive}, presumably labeled with string @samp{My volume},
  you will get:
  
  @smallexample
@@ -9992,7 +10376,7 @@ tar: Archive not labeled to match `My volume'
  
  @noindent
  in case its label does not match.  This will work even if
-@file{archive} is not labelled at all.
+@file{archive} is not labeled at all.
  
    Similarly, @command{tar} will refuse to list or extract the
  archive if its label doesn't match the @var{archive-label}
@@ -10223,7 +10607,7 @@ This option is deprecated.  Please use @option{--format=posix} instead.
  @appendix Configuring Help Summary
  
  Running @kbd{tar --help} displays the short @command{tar} option
-summary (@pxref{help}). This summary is organised by @dfn{groups} of
+summary (@pxref{help}). This summary is organized by @dfn{groups} of
  semantically close options. The options within each group are printed
  in the following order: a short option, eventually followed by a list
  of corresponding long option names, followed by a short description of
@@ -10432,14 +10816,14 @@ output. Default is 12.
  Right margin of the text output. Used for wrapping.
  @end deftypevr
  
-@node Genfile
-@appendix Genfile
-@include genfile.texi
-
  @node Tar Internals
  @appendix Tar Internals
  @include intern.texi
  
+@node Genfile
+@appendix Genfile
+@include genfile.texi
+
  @node Free Software Needs Free Documentation
  @appendix Free Software Needs Free Documentation
  @include freemanuals.texi
@@ -10457,7 +10841,7 @@ Right margin of the text output. Used for wrapping.
  @appendix Index of Command Line Options
  
  This appendix contains an index of all @GNUTAR{} long command line
-options. The options are listed without the preceeding double-dash.
+options. The options are listed without the preceding double-dash.
  For a cross-reference of short command line options, see @ref{Short Option Summary}.
  
  @printindex op