Talk:tar (computing)

From Wikipedia, the free encyclopedia

Error in file size limits?

The article states "12 bytes reserved for storing the file size, only 11 octal digits can be stored. This gives a maximum file size of 8 gigabytes on archived files." This seems to be in error, 11 decimal digits is 99,999,999,999 Bytes. 99,999,999,999 /1024 =~ 97,656,249 KBytes. 97,656,249 / 1000 =~ 97,656 MegaBytes. 97,656 /1000 =~ 97 GigaBytes. —Preceding unsigned comment added by 194.66.238.27 (talk) 13:27, 13 March 2008 (UTC)

Note that it says octal digits, not decimal digits. 11 octal digits is 8^11-1 = 2^33 - 1 ≈ 8 GB. MHenoch (talk) 15:54, 23 April 2008 (UTC)
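The octal arithmetic in the reply above can be checked with a few lines of Python. This is only a sketch: it assumes, as the article states, that 11 of the 12 bytes hold octal digits, and the variable names are illustrative.

```python
# Largest value that fits in 11 octal digits (the 12th byte is a terminator).
OCTAL_DIGITS = 11

max_size = int("7" * OCTAL_DIGITS, 8)   # 0o77777777777

# Each octal digit carries 3 bits, so 11 digits carry 33 bits.
assert max_size == 8**OCTAL_DIGITS - 1 == 2**33 - 1

print(max_size)           # 8589934591 bytes
print(max_size / 2**30)   # just under 8 (GiB)
```

Reading the field as decimal would give the 99,999,999,999 figure from the original question, which is where the two answers diverge.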

Format   UID        File Size   File Name   Devn
gnu      1.8e19     Unlimited   Unlimited   63
oldgnu   1.8e19     Unlimited   Unlimited   63
v7       2097151    8GB         99          n/a
ustar    2097151    8GB         256         21
posix    Unlimited  Unlimited   Unlimited   Unlimited

http://www.gnu.org/software/automake/manual/tar/Formats.html 194.66.238.27 (talk) 19:29, 18 March 2008 (UTC)


Command examples

Please, please, please add these examples back to the main tar page! Do you have any idea how useful they are?


This is the most useful information about this computer program. These simple command examples perform a service, while the discussion of compression formats is more arcane knowledge whose usefulness is rare indeed. We don't need a man page, but the simple command-line examples are advantageous. If we don't keep them on the wiki page, I would like to at least reference them here. I agree! This is the first time I've seen a wiki page get more useless. Whenever I wanted to know how to use tar, I would simply look it up on the wiki. Now the page serves no purpose.

Some simple examples of using the Tar program.

To create a Tar file

Creates a GZIP-compressed Tar file named eglinux.tar.gz of all files with a .txt suffix:

tar -czvf eglinux.tar.gz *.txt

To list files in a compressed Tar file

tar -tf eglinux.tar.gz

To extract files from a Tar file

Extracts all files from a compressed Tar file named eglinux.tar.gz.

tar -xvf eglinux.tar.gz

Extracts to a specific folder:

tar -xvf eglinux.tar.gz -C ~/des

Other versions of Tar may require the -z option to specify the compression type. —Preceding unsigned comment added by 64.52.70.60 (talk) 15:46, 29 December 2008 (UTC)
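For readers who script rather than type these commands, roughly the same three operations can be sketched with Python's standard tarfile module. The file names (a.txt, b.txt, the des folder) are made up for the demonstration, and the mapping to the commands above is approximate, not an exact equivalent of any particular tar implementation.

```python
import tarfile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    # Hypothetical sample files standing in for "*.txt".
    (work / "a.txt").write_text("hello\n")
    (work / "b.txt").write_text("world\n")
    archive = work / "eglinux.tar.gz"

    # Roughly "tar -czf eglinux.tar.gz *.txt"
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(work.glob("*.txt")):
            tar.add(path, arcname=path.name)

    # Roughly "tar -tf eglinux.tar.gz"; "r:*" detects the compression.
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()

    # Roughly "tar -xf eglinux.tar.gz -C des"
    dest = work / "des"
    dest.mkdir()
    with tarfile.open(archive, "r:*") as tar:
        tar.extractall(dest)

    extracted = sorted(p.name for p in dest.iterdir())

print(names)
print(extracted)
```

Only extract archives you trust this way; extractall writes whatever paths the archive contains.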

This article is about the tar format -- not the tar program. Command options to a program seem to me to be useless when discussing a file format. FrederikHertzum (talk) 12:40, 21 May 2009 (UTC)

Fred, that is like saying it is useless to describe or even acknowledge baking when discussing cake. The command utility is the main method of creating a tar file. —Preceding unsigned comment added by 65.19.16.50 (talk) 14:23, 2 November 2010 (UTC)

+1 for adding examples. Who'd have thought the talk page would be more useful than the main page... —Preceding unsigned comment added by 67.183.113.131 (talk) 09:00, 20 January 2010 (UTC)

+1 for re-adding examples! Why has this most useful information been removed? —Preceding unsigned comment added by 217.10.60.85 (talk) 14:35, 6 May 2010 (UTC)

+1 for putting examples back. —Preceding unsigned comment added by 87.236.134.251 (talk) 17:05, 27 January 2011 (UTC)

+1 for adding them back, quite useful indeed and a good case of how to apply the format. 62.96.36.158 (talk) 10:00, 25 July 2011 (UTC)

Again, examples for the Linux-specific tar command would be on its own separate page, just as information on Photoshop tools would not be on a page about the JPEG format. In any case, examples of CLI commands are not encyclopaedic — Preceding unsigned comment added by 49.194.90.184 (talk) 07:28, 19 August 2011 (UTC)

All examples mentioned in this section are not correct tar syntax and thus do not belong in this article. While some implementations accept a '-' in front of options, the standard does not mention it... --Schily (talk) 09:46, 19 August 2011 (UTC)

No information about the file format

I wouldn't remove information from a wiki page that tens of thousands of people use as a reference, ie the command line examples in this tar article. One tar page can contain both pedantic information as well as command line examples. —Preceding unsigned comment added by 68.4.49.44 (talk) 17:59, 8 February 2009 (UTC)

Some irony that an article called Tar file format manages to say exactly nothing about the Tar file format. --Tagishsimon (talk)

Now it does! (although I am tired so will write about the ustar extensions another day, unless anyone else wants to) Sam Jervis 23:18, 6 Jan 2005 (UTC)
IMO, the article should be split into "tar (file format)" and "tar (file archiver)" --Minghong 12:41, 23 Mar 2005 (UTC)

I would suggest "tar (utility)" to describe the command-line utility and "tar (file format)" to describe the format. In particular, there are many programs that read and write tar formats that are not the tar utility. This includes various GUI archiving tools, for example. It also includes other tools that are not general-purpose archivers, such as the FreeBSD package tools, which use tar format. (I believe RPM is based on cpio format and Deb uses ar format, but I can't remember exactly.) It's also worth noting that many command-line tar programs can read and write archive formats other than the tar format. Tim Kientzle 00:17, 2 September 2007 (UTC)

Tar vs Ar

I never knew about ar before I came to Wikipedia. But now that I do, why is tar used instead of ar? :)--Chealer 22:58, 2005 Apr 1 (UTC)

Because traditionally tar was used for backups to tape and ar was used for static libraries. So ar's format is highly tied to ld and static libraries. Furthermore, ar archives are not necessarily cross-platform due to endianness, and the tar format was standardized through POSIX.--b4hand 16:41, 30 Apr 2005 (UTC)
Thanks for the information. Don't you think this should be in the main article? 15:12, 29 February 2012 (UTC) — Preceding unsigned comment added by 217.125.117.197 (talk)
This is not really correct....
In 1978 the PDP-11 was dominating and people just started to think about byte order and similar. One important reason is that ar does not support subdirectories. --Schily (talk) 16:39, 29 February 2012 (UTC)

Conformance to quality standards of Wikipedia

This article is somewhat of a tutorial for a user of the specific tool in a specific operating system. Moreover, words like "Whoops" and examples make this article both lengthy and below par. Please consider conformance, or put a {{cleanup|January 2006}} tag to notify people who use this as an encyclopedia and not as a man-page. —Preceding unsigned comment added by Dormant25 (talk • contribs) 06:53, 3 January 2006

I've removed the cleanup tag as there are no examples of colloquialism anymore and AFAICC it is perfectly acceptable to include in an article about a file format how to create and view files in that format. I'm sure most readers want to know how to actually use it to get stuff done, not the other more esoteric details like the technical details of the format and its history. Joe Llywelyn Griffith Blakesley talk contrib 19:10, 6 August 2006 (UTC)
I disagree with Mr. Blakesley. The user who wanted to get straight to using the format would go to the manpage, reference book, tutorial book, operating system documentation, or whatever. The person who is interested in how a tar file is actually laid out, what the advantages/disadvantages of the layout are, etc., without researching it herself - she would be the one to turn to wikipedia - wikipedia would be the natural place to turn, and she is the user we should be focusing on. Now, it is appropriate to have some discussion of what tools people use to manage these archives, and what tools can read or write them. But a detailed description of how the tools are used (and a more encyclopedic description of how the tools are implemented/how they came to be as they are) I would suggest putting on a separate page (i.e. tar (file format), tar (unix command), and eventually maybe BSD tar, and GNU tar...). I note that there is already a mistake about BSD tar in the article. Jimmy hartzell 17:27, 15 February 2007 (UTC)

It looks like we have a reference to i.e. that should be e.g. Is there really only one way that tarbombs would have improper file structure? —Preceding unsigned comment added by 67.183.113.131 (talk) 08:57, 20 January 2010 (UTC)

Compression

The vast majority of instances of the "tar" program that one is likely to encounter now include the "z" (compress) option. Thus the discussion about why this is a bad idea is kind of silly. Tim Bray 05:22, 1 February 2006 (UTC)

tarball

Does anybody know where the word tarball comes from? --Lionel H. Grillet 12:36, 13 March 2006 (UTC)

Just a wild guess - tapes must have messed up sometimes back then, and when you roll it up like a ball of yarn then you've made yourself a nice tar-ball! —Preceding unsigned comment added by 90.185.94.157 (talk) 01:23, 25 January 2009 (UTC)
Probably from the word... "Tarball", which is similar to a tarbaby. --maru (talk) contribs 14:01, 13 March 2006 (UTC)
Has anyone else noticed the word tarball in the song Bubble Toes by Jack Johnson? —Preceding unsigned comment added by Samineru (talk • contribs) 00:07, April 2, 2006 (UTC)
I was under the impression that tarballs were .tgz files, not .tar files. 71.123.19.163 05:42, 30 June 2006 (UTC)
No... .tgz denotes a gzip compressed tar archive. —Preceding unsigned comment added by 98.67.36.36 (talk) 16:09, 28 June 2009 (UTC)
The term TARBALL is slang, a sort of self-deprecating joke that requires the following background:
1) In the early days, tar was used to archive programs and program source because end users and their data files were typically stored in other formats (or not at all -- we were programmers and one thing we did NOT care about was end users)
2) One of the worst comments you could make about a program design was that it was a MUDBALL -- meaning it had no structure at all, a total dirty mess from start to finish, etc.
3) Ergo your archived source code became known as a tarball
I was there.
It's a shame to let the name tarball become slang for any tar archive. —Preceding unsigned comment added by 63.197.50.90 (talk • contribs) 16:25, 3 August 2006
The article calls the term a "pun". How is it a pun? Ksn (talk) 14:02, 15 April 2014 (UTC)

Konqueror web archives

I've added an additional file extension, .war, which are Konqueror web archives. Due to the possibility of confusion with Java archives (also .war) I think this needs a citation.

There are throw-away comments on the web that ".war files are renamed .tar.gz" files, especially on KDE discussion lists, and I believe that they are reliable, because I have played with .war files and satisfied myself that they actually are tar files, compressed with gzip. E.g. tar -xzf archive.war will extract the files from one. The problem is, I can't find an authoritative source to cite. Maybe the KDE source code?

Also, there are incorrect references on some mailing lists that .war files are "zip files". I've also seen posts made on KDE mailing lists suggesting that Konq should/will use .wtz instead of .war, but I haven't seen any evidence that was ever any more than a proposal.

Assistance in finding an authoritative source to cite will be appreciated. Limeguin 15:21, 24 August 2006 (UTC)

Java WAR archives are identical in format to Java JAR archives, which are themselves "zip" files. The mailing list posters are correct, but the konqueror/java distinction will be the source of the confusion above. 81.187.12.206 (talk) 21:37, 25 November 2008 (UTC)

BSD tar

If my understanding is correct, BSD tar does not need an external program to handle gzip/gunzip, but rather uses the libarchive library it is a wrapper for. If this is in fact the case, the article should be corrected, but I cannot find any documentation that explicitly states that no external program is called. Jimmy hartzell 17:29, 15 February 2007 (UTC)

As the author of libarchive, I can confirm that it uses the libz and libbz2 libraries to implement gzip and bzip2 compression and decompression internally. My bsdtar program (not "BSD tar", by the way) does use libarchive. Kientzle 06:06, 1 September 2007 (UTC)

Additionally, BSD tar does not really exist. The major BSDs all have their own versions of tar. The OpenBSD version, for example, does not support -j --David Chisnall 17:31, 14 August 2007 (UTC)

There are at least two different programs that go by the name "bsdtar": One is a port of an old BSD Unix implementation to MSDOS that was distributed by O'Reilly under the name "bsdtar" as part of a CDROM accompanying one of their books. The other is my own from-scratch tar implementation that I named "bsdtar" because it's released under a BSD license, in contrast to "GNU tar." My "bsdtar" is currently the default system tar for FreeBSD and at least one Linux distro that I'm aware of. Kientzle 06:06, 1 September 2007 (UTC)

To the person who removed commands section

I thought this section was really useful and think it should be put back. —The preceding unsigned comment was added by 164.165.217.254 (talk) 23:36, 30 March 2007 (UTC).

Heaven forbid that wikipedia ever contains any useful information ever. Only mindless (but "Encyclopedic"!) trivia will be allowed. Grr. Freeking... grk... delete-happy... grumble...
Yeah, useful to all those unix gurus who go to wikipedia before man for usage arguments. All zero of them. Chris Cunningham 15:24, 4 April 2007 (UTC)
This might be more relevant if man pages consistently provided examples. I think a more common usage pattern is to search the internet when man pages turn out to be incomplete or misleading. Also, some distributions don't even come with man pages for some common commands by default (e.g. Cygwin doesn't provide a man page for tar in my current distribution.)
1) Man pages are significantly less readable. 2) What about the windows gurus who find themselves stuck in unix for whatever reason? -- See, this is the problem with WP - no one considers even slightly different use-cases. Case in point: various articles on games - deleted because they're "guides for gamers and we don't do that lol" - no one considered (the only slightly out-of-the-box idea) of game designers/developers doing legitimate research (gack you got me started - don't get me started!). —The preceding unsigned comment was added by 60.240.227.227 (talk) 15:47, 4 April 2007 (UTC).
Just get used to it: Wikipedia is an encyclopedia; for other kinds of content, there are (tens of?) thousands wikis out there, which you are equally welcome to edit — or even migrate deleted Wikipedia content to, as Wikipedia is licensed under the GFDL. -- intgr 16:11, 4 April 2007 (UTC)
"intgr" is right. Wikipedia's official policy as stated at WP:NOT#IINFO makes it very clear that Wikipedia articles are not instruction manuals or textbooks. Manpages may be less readable but they're certainly more reliable! Rwxrwxrwx 14:00, 5 April 2007 (UTC)
Ok, then who *is* the intended audience for this page? Is the point that this would not be read by someone who would never use tar and has no interest whatsoever in seeing what the command looks like in action and yet has a burning desire to, e.g. read about the 512 byte header and whether there is a checksum at byte 148?
It might be useful to have command information, but it doesn't belong in an article on the file format. --Brouhaha 18:42, 21 September 2007 (UTC)
This is a disambiguation page. The actual tar page discusses black resins. This is the only relevant page that comes up when you search wiki or google for the Unix Tar command. Also, the ending of the page title, "and related command line program name, the tape archiver" should be considered proof that the command specifics are relevant scope.
Would anybody object if I created a Tar (program) page (that's distinct from the Tar (file format) page)? that way it would be appropriate to have a few example commands on the program page. for what it's worth i personally think it's entirely appropriate to have a few example commands for every *nix command page. Roadnottaken (talk) 14:18, 23 February 2009 (UTC)

Tgz redirects here, but is not mentioned

I've noticed that Tgz is a redirect to this article, but there is no mention of those three letters anywhere in the article. That's not good. I don't know much about tar, but this should be fixed by someone. -- 199.60.2.105 20:15, 12 June 2007 (UTC)

Explanation added --tcsetattr (talk / contribs) 03:03, 5 September 2007 (UTC)

How the end of an archive is formatted

The sentence "The end of an archive is marked by at least two consecutive zero-filled blocks." under "Format details" is correct, but gives an incomplete picture of the total size of a tar file.

Per GNU info pages, in addition to having two full blocks of zeros at the end of it, an archive is padded with more zeros as required to make its size a multiple of a "record". A record is a group of blocks, typically 20, which get written to tape in one shot with no spaces between them. The number of blocks can be changed by using the -b option.

The result is that by default, the smallest archive is quite big (10240 bytes), which is interesting and surprising. Daniel Romaniuk 03:07, 21 June 2007 (UTC)

pardon my formatting, but:
bash-3.00$ touch blah
bash-3.00$ tar cvf blah.tar blah
a blah 0K
bash-3.00$ ls -liah | grep blah
660304 -rw-------   1 cc199700 staff          0 Jun 21 09:44 blah
661202 -rw-------   1 cc199700 staff       1.5K Jun 21 09:44 blah.tar
This is Solaris tar, admittedly, but 1.5k isn't 10k. Chris Cunningham 08:49, 21 June 2007 (UTC)
Good to know. According to Solaris documentation, you also can specify a blocking factor (or number of blocks per record) using the -b option. The difference is that yours defaults to 1. Perhaps we could add the explanation about record size, without mentioning what the default might be... Daniel Romaniuk 13:56, 21 June 2007 (UTC)
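The two observations above (10240 bytes under GNU defaults, 1.5K on Solaris) are consistent with the same rule applied with different blocking factors. Here is a minimal Python sketch of that rule. It assumes plain files with short names, so each member needs exactly one 512-byte header and no extension headers; the function names are made up for illustration.

```python
BLOCK = 512  # tar block size in bytes


def round_up(n, multiple):
    """Round n up to the next multiple of `multiple`."""
    return -(-n // multiple) * multiple


def archive_size(file_sizes, blocking_factor=20):
    """Predicted size of an uncompressed archive of plain files.

    Assumes one header block per member (short names, no extensions).
    """
    total = 0
    for size in file_sizes:
        total += BLOCK                   # one header block per member
        total += round_up(size, BLOCK)   # data padded to whole blocks
    total += 2 * BLOCK                   # two zero blocks mark the end
    # The whole archive is then padded to a multiple of the record size.
    return round_up(total, blocking_factor * BLOCK)


print(archive_size([0], blocking_factor=20))  # 10240
print(archive_size([0], blocking_factor=1))   # 1536, i.e. 1.5K
```

With a blocking factor of 1 the record padding disappears and only the header plus the two end-of-archive blocks remain, which matches the 1.5K listing.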

tape drive limitations

The article claims that early tape drives only supported 512-byte blocks. This is clearly incorrect for the vast majority of early tape drives, and in fact tar usually wrote 10240-byte tape records. The tape drive and its formatter (controller) had no idea that the 10240-byte record was divided into any smaller unit (block) by the software.

There is probably some good reason that the format was originally designed to use 512-byte blocks, and it might even be due to a limitation of some particular tape drive, but the blanket claim is false and needs to be corrected or removed. I added a 'fact' tag for now, but will remove the claim if support or clarification is not forthcoming. --Brouhaha 18:47, 21 September 2007 (UTC)

How do you download this file?

Can anybody tell me how to download this file? --WikiCats (talk) 13:47, 22 May 2008 (UTC)

Passing parameters

In the examples in this article there's usage of "-" (minus sign) before the parameters. On some systems it doesn't work with the minus sign... —Preceding unsigned comment added by 80.74.120.200 (talk) 09:03, 8 July 2008 (UTC)

Explanation added now. -j.eng (talk) 20:16, 17 October 2009 (UTC)

Sparse files should be included

GNU tar and pax both support sparse files. —Preceding unsigned comment added by Athulin (talk • contribs) 19:46, 25 March 2009 (UTC)

File Offsets should (also) be Hexadecimal

The format tables should include the file offsets in hexadecimal, not (only) in decimal. Hexadecimal offsets are widely used while decimal ones are kind of exotic. Common Unix tools like hexdump or xxd do not support the display of decimal offsets. —Preceding unsigned comment added by Martin scharrer (talk • contribs) 15:52, 29 July 2009 (UTC)

Error in Structure Table?

I have looked around, and never found a tar struct that has 156 name bytes. They are all 100, and no single byte after that. I don't want to change the page if I am wrong, but could someone else back me up and change it? At the very least it is confusing what it means to convey. —Preceding unsigned comment added by 216.50.65.6 (talk) 20:26, 25 March 2010 (UTC)
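One way to check the 100-byte name field without trusting any table is to write a tiny archive and slice the first header block by hand. The sketch below uses Python's tarfile module with its ustar format; the offsets (name at 0, size at 124, checksum at 148, type flag at 156) are the ones from the ustar layout as I understand it, and the member name blah.txt is just an example. It is possible that the 156 in the table is the offset of the type flag and not a field length, but that is a guess.

```python
import io
import tarfile

# Build a one-member ustar archive in memory.
payload = b"hello"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="blah.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

header = buf.getvalue()[:512]

# Fixed-width fields, sliced by offset.
name = header[0:100].rstrip(b"\0").decode("ascii")         # 100 bytes
size = int(header[124:136].rstrip(b"\0 "), 8)              # 12 bytes, octal
stored_checksum = int(header[148:156].rstrip(b"\0 "), 8)   # 8 bytes, octal
typeflag = header[156:157]                                 # 1 byte

# The checksum is the byte sum of the header with the checksum
# field itself counted as eight spaces.
computed_checksum = sum(header[:148]) + 8 * ord(" ") + sum(header[156:])

print(name, size, typeflag, stored_checksum == computed_checksum)
```

If the name field really were 156 bytes long, the size and checksum slices above would land in the wrong place and the checksum comparison would fail.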

Tarpipe?

Really? This got introduced as copying files, it would be playing hardball (tardball?) on the same computer. "cp -a" is not good enough? I think the main point is left out where you have no other connection between two systems just a simple pipe, like "tar c | nc -lp 8888 -q 0" on one computer and "nc 192.168.0.2 8888 | tar x" on the other. Cf. Hay (talk) 12:57, 2 July 2011 (UTC)
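The point about a plain pipe being the only connection can be illustrated in Python, whose tarfile module has stream modes ("w|" and "r|") that never seek. This is a sketch only: the in-memory buffer stands in for the socket or pipe, and the member name notes.txt is invented for the demo.

```python
import io
import tarfile

# Sender side: "w|" writes a forward-only stream, as "tar c" does into a pipe.
pipe = io.BytesIO()  # stand-in for a socket or pipe (assumption for the demo)
with tarfile.open(fileobj=pipe, mode="w|") as sender:
    data = b"copied over the wire\n"
    info = tarfile.TarInfo(name="notes.txt")
    info.size = len(data)
    sender.addfile(info, io.BytesIO(data))

# Rewind only because the demo reuses one buffer for both ends.
pipe.seek(0)

# Receiver side: "r|" reads members strictly in order, as "tar x" does from a pipe.
received = {}
with tarfile.open(fileobj=pipe, mode="r|") as receiver:
    for member in receiver:
        if member.isfile():
            received[member.name] = receiver.extractfile(member).read()

print(received)
```

Because the format is a sequence of header and data blocks with no central directory, the receiver can unpack each member as it arrives.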

Recent edits by 120.151.160.158

Wikipedia is no playground for advertizing your favorite OS.

You did not introduce new information as there are already pointers to useful and complete information. http://cdrecord.berlios.de/private/man/star/star.4.html lists the standard and vendor unique extensions and the standard http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html gives complete standard information that is not biased like yours.

Please follow the Wikipedia rules, do not introduce advertizing and do not add non-helpful partial information although the complete information is available from the article already. --Schily (talk) 10:12, 5 December 2011 (UTC)

The term 'tarball' is used before it is defined.

This article talks about 'tarballs' and only later on explains that this is an alternative term for a tar file. 'tarball' needs to be defined before it is used. 109.150.205.103 (talk) 22:56, 2 April 2012 (UTC)

Format details wording

Currently the article reads “A tar file is the concatenation of one or more files”. For me it was too easy to read as “. . . one or more tar files”, but I understand that you cannot simply append the contents of one Tar file to the end of another because there is a special end-of-file marker in the middle. Instead I think the sentence is just trying to say something like the files stored within an archive are joined together. Can we find a better way to express this? Maybe replacing “tar file” with “archive”? Vadmium (talk, contribs) 12:24, 14 April 2012 (UTC).

The word "concatenation" is not optimal, but the changed text was not better. A tar file is a stream of objects that are made of a tar header and file content. A tar file may contain other tar files. --Schily (talk) 20:53, 14 April 2012 (UTC)

tar Magic with GNU tar

You mention the magic code defined in POSIX, yet on Linux (or any GNU tar supported platform) the magic code is "ustar \0" (the letters "ustar" followed by two spaces and a null byte.)

Since you mention GNU tar, it may be a good idea to mention the proper magic for the tool (or at least mention that the POSIX magic is not followed 100%.)

Alexis Wilke (talk) 05:13, 19 July 2012 (UTC)

GNU "tar" usually does not create sufficiently TAR compatible archives and for this reason does not use a POSIX tar magic. You are confusing "GNU tar supported platform" with systems where someone uses GNU tar. A "GNU tar supported platform" is a platform where GNU tar compiles and works. Few of these platforms decided to use GNU tar as their primary archiver. While the deviating magic usually is no problem, the deviations from the POSIX archive format are a frequent problem in data exchange with GNU tar created archives. --Schily (talk) 09:14, 19 July 2012 (UTC)
The GNU Tar Manual might discuss this, and serve as citable source to expand this article's coverage on this point. There might also be some discussion on USENET, which google has archives of, but that is probably harder to find. Lentower (talk) 13:48, 19 July 2012 (UTC)
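The two magic values are easy to see side by side. The sketch below uses Python's tarfile module, which can write both a ustar header and its own rendition of the GNU header; it shows what that module emits at offset 257, which is not necessarily byte-for-byte what every version of GNU tar writes. The helper name header_magic and the member name are made up.

```python
import io
import tarfile


def header_magic(fmt):
    """Return the 8 bytes at offset 257 (magic plus version) of the first header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        tar.addfile(tarfile.TarInfo(name="empty.txt"))  # zero-length member
    return buf.getvalue()[257:265]


posix_magic = header_magic(tarfile.USTAR_FORMAT)
gnu_magic = header_magic(tarfile.GNU_FORMAT)

print(posix_magic)  # b'ustar\x0000'
print(gnu_magic)    # b'ustar  \x00'
```

The POSIX form is "ustar", a NUL, then the version "00"; the GNU form is "ustar", two spaces, then a NUL, as described at the top of this thread.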

No evidence of indexing tool for random access tar file?

The article now says there are advanced tools that can create an index file for a tar file and optionally append it to the tar file. I googled for half an hour without being able to find a tool that indexes a tar file, let alone one that integrates the index into the archive. Is it just me? Otherwise there should be a reference link, if there is even one such tool at all. 张韡武 (talk) 06:27, 11 September 2012 (UTC)
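This does not answer whether such a tool exists, but for what it is worth, the idea of an index is simple enough to sketch. The Python below walks the 512-byte headers of an uncompressed archive and records where each member's data starts. It is a toy under stated assumptions: plain ustar headers, short names, no pax or GNU extension records, and the function name build_index is invented here.

```python
import io
import tarfile

BLOCK = 512


def build_index(raw):
    """Map member name -> (data offset, size) by walking the headers."""
    index = {}
    pos = 0
    while pos + BLOCK <= len(raw):
        header = raw[pos:pos + BLOCK]
        if header == b"\0" * BLOCK:   # zero block: end of archive
            break
        name = header[0:100].rstrip(b"\0").decode("utf-8")
        size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)
        index[name] = (pos + BLOCK, size)
        # Skip the header and the data, which is padded to whole blocks.
        pos += BLOCK + -(-size // BLOCK) * BLOCK
    return index


# Demo archive with two small members (names are examples only).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for member_name, data in [("first.txt", b"first"), ("second.txt", b"second")]:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
raw = buf.getvalue()

index = build_index(raw)
offset, size = index["second.txt"]
second = raw[offset:offset + size]   # random access, no scan of first.txt
print(index, second)
```

Building the index still needs one sequential pass, and it only helps for uncompressed archives, since a gzip stream cannot be entered at an arbitrary offset.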

Nonsense ASCII rationale

The rationale provided for ASCII is nonsense:

″To ensure portability across different architectures with different byte orderings, the information in the header record is encoded in ASCII″

  1. ASCII does not address byte-ordering issues.
  2. ASCII does not ensure portability, as it is such a limited charset.

A better rationale might be:

  • To ensure portability across different architectures which encoded basic Latin differently (such as different EBCDIC variations), the information in the header record was encoded in ASCII. ASCII is so limited that byte filenames were used nonetheless; the tar format does not record the encoding, so if a non-ASCII filename is present, the tar software is unable to handle it reliably. — Preceding unsigned comment added by 84.100.0.236 (talk) 21:22, 13 April 2013 (UTC)
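One possible reading of the quoted sentence is that it refers to the numeric header fields (size, mode, timestamps), which are stored as octal digit strings instead of binary integers. Under that reading, and only as a sketch, the Python below shows why a text encoding sidesteps byte order for numbers, while saying nothing about the filename charset problem raised above. The value 1234 is arbitrary.

```python
import struct

size = 1234

# A binary 32-bit field depends on the writer's byte order.
little = struct.pack("<I", size)   # little-endian writer
big = struct.pack(">I", size)      # big-endian writer

# An octal digit string is the same byte sequence on every machine.
as_text = ("%011o" % size).encode("ascii") + b"\0"

print(little, big)                    # two different byte strings
print(as_text)                        # b'00000002322\x00'
print(int(as_text.rstrip(b"\0"), 8))  # 1234 again, regardless of host
```

So the byte-order argument holds for the numeric fields, and the charset objection applies to the name fields; the two points are about different parts of the header.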

'ustar' acronym expansion

Hi! I'm an occasional wikipedian, and a while ago the subject of the 'ustar' acronym for POSIX tar archives came up, and we wondered what it might expand to. Some googling revealed two conflicting expansions, and it wasn't really obvious which one was correct, so I did some digging to try to find the origin of the acronym. In the end the oldest reference to "Uniform Standard TApe Archive" that I could find was when it was added to this article in 2006, whereas I could find BSD manpages (that seem more authoritative on the subject than a random Wikipedia edit, for sure) from 2004 giving the "Unix Standard TAR" expansion for the acronym. With that in mind, I decided to edit the acronym expansion per WP:BOLD, citing a 2004 manpage (thus predating any mention I could find of "Uniform Standard TApe Archive"), but since this proved to be a bit controversial (apparently, judging by the edit history) I thought it'd be a good idea to provide more links explaining this change here.

This is all admittedly a rather trivial matter, but since a lot of places (including .edu pages) provide the seemingly incorrect acronym expansion citing Wikipedia, I'd like to put a stop to the misinformation.

Here's what I'm basing this on:

  • As mentioned, the "Uniform" expansion was added in October 2006, with no explicit citation but mentioning POSIX in the edit. I checked relevant pages of POSIX and couldn't find this expansion mentioned anywhere there.
  • Google searches for the Uniform variant (#1, #2) find no results when given exact strings with quotes.
  • Google searches for the Unix variant as it's phrased in the BSD manpage (#1) do find various results, dating back to 2001 (but most of the results seem to be the aforementioned manpage in various forms).
  • This course gives the Uniform expansion, but it doesn't state where from, and *does* link to the Wikipedia article at the end… so it likely derives from the Wikipedia article.

I feel a bit silly for writing all this, but I just want to avoid spreading misinformation… hopefully this is enough rationale to explain why I changed what I did. —FireFly~ 13:03, 18 May 2017 (UTC)

I took the liberty of removing the {{cn}} after adding an additional citation: a book about file formats of the internet from 1995. —FireFly~ 18:12, 27 May 2017 (UTC)