Talk:Shebang (Unix)

From Wikipedia, the free encyclopedia

Alternate names

An edit was reverted because the sources were claimed to be unreliable; however, they are cited directly by the author of TLDP's Advanced Bash-Scripting Guide -- http://tldp.org/LDP/abs/html/sha-bang.html#FTN.AEN201 84.93.143.206 (talk) 21:38, 19 January 2012 (UTC)

Just as a continuation - searching Google Books comes up with several publications calling a shebang a hash pling, pound-bang, and sha-bang. 84.93.143.206 (talk) 21:47, 19 January 2012 (UTC)
Citing a source doesn't confer reliability upon it, so even if it's cited by a reliable source, http://www.in-ulm.de/~mascheck/various/shebang should not be used to reference any facts in this article. If the facts are actually included in the reliable source, the reliable source can of course be used. If the facts are in books, you can use those too.
It does look like http://www.in-ulm.de/~mascheck/various/shebang could be a suitable external link. – Pnm (talk) 22:20, 19 January 2012 (UTC)

Ajax

I can't work out what the last sentence in the lead paragraph is saying:

The "shebang" or "hashbang" name is also sometimes used of state-preserving fragment identifiers in Ajax applications; Google Webmaster Central specifies that fragment identifiers starting with an exclamation point (...url#!state...) are indexed specially by the Googlebot.

Should this be 'sometimes used in', 'sometimes used as', or 'sometimes used instead of' or something else? 81.98.43.107 (talk) 17:11, 1 February 2012 (UTC)

Clarified a bit. Lorem Ip (talk) 00:25, 2 February 2012 (UTC)

env security

I think the phrasing about env security is unnecessarily convoluted. It is also quite a free interpretation of the source. My understanding is that the env approach is dangerous only if you use it for suid programs (or similar) without checking PATH, which is a recipe for disaster anyway. --LPfi (talk) 12:05, 27 April 2012 (UTC)

Agreed. I’ve introduced some alternative wording. Ewx (talk) 07:44, 28 April 2012 (UTC)

UTF-8 identification

The page now says "UTF-8 can reliably be recognised as such by a simple algorithm".

This is wrong - UTF-8 cannot be distinguished reliably from any other encoding which uses 8 bits per character unless the file happens to contain an invalid UTF-8 sequence. As an example, "aÃ¨!" in ISO-8859-1 is "aè!" if read as UTF-8. There is no way for an algorithm to say it's one way or the other. — Preceding unsigned comment added by 188.65.1.1 (talk) 10:33, 31 May 2012 (UTC)

If there are more than a few non-ASCII bytes and there is no invalid UTF-8, the probability that it is not UTF-8 is very low. No mathematical evidence is needed.--BIL (talk) 11:08, 31 May 2012 (UTC)
Might this be reported on the UTF-8 page? — Preceding unsigned comment added by 84.100.195.219 (talk) 23:37, 22 June 2012 (UTC)
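
For anyone following the argument above, here is a minimal Python sketch of both positions. The validity check is one reading of the "simple algorithm" under discussion; the sample bytes show why passing it is only evidence, not proof. The names looks_like_utf8, AMBIGUOUS and LATIN1_ONLY are illustrative and do not come from the article.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data contains no invalid UTF-8 sequence.

    This can only rule UTF-8 out; it cannot prove the author intended UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The bytes 61 C3 A8 21 are valid in both encodings:
# read as ISO-8859-1 they are "a", "A with tilde", "diaeresis", "!";
# read as UTF-8 they are "a", "e with grave", "!".
AMBIGUOUS = b"a\xc3\xa8!"

# "a", "e with grave", "!" written in ISO-8859-1. The byte E8 announces a
# three-byte UTF-8 sequence, but 21 is not a continuation byte, so this
# input fails validation.
LATIN1_ONLY = b"a\xe8!"
```

Here looks_like_utf8(AMBIGUOUS) is true even though the bytes decode cleanly as ISO-8859-1 too, which is the objection raised above. looks_like_utf8(LATIN1_ONLY) is false, which illustrates why real non-UTF-8 text with several non-ASCII bytes tends to be rejected.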

Unneeded Shebang

Some people say the BOM is unneeded in UTF-8, as UTF-8 has no specific byte order.

On old Unix systems, the shebang is not compatible with a BOM.

Nowadays, text files are generally written in UTF-8, and often with BOM.

This means that files with both a BOM and a shebang do not work on old Unix systems.

But scripts with both a BOM and a shebang can still be launched by naming the interpreter on the command line.

So my question:

How is the shebang unneeded, as the interpreter is generally known and provided by the caller? — Preceding unsigned comment added by 77.199.89.101 (talk) 13:36, 2 July 2012 (UTC)

"Nowadays, text files are generally written in UTF-8, and often with BOM." Agreed on the first claim, since any ASCII file is also UTF-8. But the second ... can you please name one widely used Unix file that's UTF-8 with a BOM? For example, in one of the thousands of Debian packages? JöG (talk) 04:46, 2 February 2013 (UTC)

As for your specific question: the shebang is needed if you intend to use it. The exec family of functions can only run a few different types of programs. Executables are one type; text files with a shebang line are another. exec() doesn't try to guess. JöG (talk) 04:46, 2 February 2013 (UTC)

The wrappers in libc (execvp, etc) will attempt to use the shell if execve does not recognize the file format. So arguably there is some (limited) guessing going on. Ewx (talk) 10:35, 3 February 2013 (UTC)
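
The points in this thread can be tried out with a short Python sketch. It assumes a POSIX system with /bin/sh and a temporary directory that allows execution; the helper try_exec and the three sample scripts are made up for illustration. Python's subprocess.run with a list argument asks the kernel to execute the file directly, so it does not apply the execvp-style shell fallback mentioned above.

```python
import errno
import os
import stat
import subprocess
import tempfile


def try_exec(script: bytes) -> int:
    """Write script to a temporary executable file and run it directly.

    Returns the exit status on success, or the errno reported when the
    operating system refuses to execute the file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo")
        # Close the file before running it; executing a file that is still
        # open for writing can fail with "Text file busy".
        with open(path, "wb") as f:
            f.write(script)
        os.chmod(path, stat.S_IRWXU)
        try:
            return subprocess.run([path]).returncode
        except OSError as exc:
            return exc.errno


BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark
WITH_SHEBANG = b"#!/bin/sh\nexit 0\n"
WITHOUT_SHEBANG = b"exit 0\n"
BOM_THEN_SHEBANG = BOM + WITH_SHEBANG  # file no longer starts with "#!"
```

On a typical Linux system, try_exec(WITH_SHEBANG) returns 0, while the other two return errno.ENOEXEC ("Exec format error"): without "#!" as the very first two bytes the kernel does not treat the file as a script. Running the same files as `sh demo`, or through a caller that falls back to the shell, would sidestep the error, which is the limited guessing described above.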

Examples

The existing examples are repetitive, mostly "me-toos" which are uninteresting because they are conventional scripting languages. Pruning those and adding a few different examples such as "make" and "env" would benefit the reader. TEDickey (talk) 10:02, 13 July 2012 (UTC)

I agree with this. The point of the examples is to demonstrate use of #!, not to enumerate everything you can possibly use it with (for which a category would make more sense). The article does not need multiple shells and multiple scripting languages in its examples. Ewx (talk) 07:53, 20 May 2013 (UTC)
I strongly disagree. The flags each program needs on the first line of a script differ widely: sed, awk and csh need -f and tcc needs -run, while other programs need no options at all (PHP, Python), and still others require different options ((scsh, this), (guile, -s), (gs, -dNOSAFER)). A brief list of how to use different programs' options, including all of the common ones that require options (awk, etc.), is helpful to have. I don't read it as "me-too"; I read it as "This is how to invoke this common program in a script." Maybe this should be rewritten more clearly. TL;DR: We should list the programs that require flags after the shebang in the examples section. -- 12.218.76.10 (talk) 17:04, 26 October 2014 (UTC)
That would be a completely open ended list, and could easily become longer than the rest of the article put together. People who want to know how to script in a particular language should read a tutorial for that language. Ewx (talk) 09:03, 28 October 2014 (UTC)
I doubt it would be longer than a paragraph. After all, many or most of the programs either don't require any options or just require '-f' or another small option or two, with the exceptions above. If it does get too long, we could always come up with some kind of limitation. (most-popular languages list or some such thing.) -- 12.218.76.10 (talk) 17:08, 3 November 2014 (UTC)
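
As background to the question of flags, a rough Python sketch of how the first line turns into a command may help. It assumes the Linux convention, where everything after the interpreter path is passed as one single argument; other Unix systems split or truncate the line differently. The function shebang_argv is hypothetical and only models that behaviour.

```python
from typing import List, Optional


def shebang_argv(first_line: str, script_path: str) -> Optional[List[str]]:
    """Model the argument list built for a script under the Linux convention.

    Returns None when the line is not a usable shebang line.
    """
    if not first_line.startswith("#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    # Split off the interpreter; the remainder, if any, stays in one piece.
    parts = rest.split(None, 1)
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1])
    # The path of the script itself is appended last.
    argv.append(script_path)
    return argv
```

Under this model, shebang_argv("#!/usr/bin/awk -f", "./report.awk") gives ["/usr/bin/awk", "-f", "./report.awk"], which is why awk needs -f: without it, the script path would be taken as the program text. A line such as "#!/usr/bin/env python3 -u" yields the single argument "python3 -u", which is one reason multi-flag shebang lines are not portable.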

UTF-8 as a de facto standard

Could you expand on why you removed that information? Simply not liking it is not a reason to remove it (see WP:IDONTLIKEIT). [1] and [2] both clearly show that UTF-8 is considered a de facto standard, and I could get more citations if needed (those two were just the first hits I found). Belorn (talk) 09:37, 26 July 2012 (UTC)

Your claim is wrong: UTF-8 is not a de facto standard but a real standard. It is however just one of many standards in the area of encoding. Your change made the article less exact. --Schily (talk) 09:54, 26 July 2012 (UTC)
Not my claim. To cite:
  1. "The most widely accepted ("de facto standard") character encoding method is UTF-8." (ibm.com) [3]
  2. "Modern Linux installations use UTF-8 for their environment in any country with any language and is currently the de facto standard for to represent text" (Red Hat Glossary) [4]
  3. "UTF-8 is the defacto standard console and text file on modern systems, though other encodings are still common" (Mercurial project) [5]
  4. "UTF-8: Unicode for all regions, mostly in 1-3 Octets (new de facto standard)" (linuxtopia.org) [6]
  5. "For many projects on Linux, the de facto standard is to use UTF-8." (unifont.org), and so on. Python has also moved to making UTF-8 the default for text.
  6. "itself (ASDF) only recognizes one encoding beside :default, and that is :utf-8, which is the de facto standard, already used by the vast majority of libraries that use more than ASCII." (ASDF manual published by common-lisp.net). [7]
This is just a small subset of the articles I found in a quick Google search of a few minutes. Books should have even more statements like that. Is there a reason they are not relevant sources for the suggested change to the article? — Preceding unsigned comment added by Belorn (talk • contribs) 11:48, 26 July 2012 (UTC)
The most widespread coding on UNIX is either C or ISO-8859-1. --Schily (talk) 10:43, 27 July 2012 (UTC)

You're both right; UTF-8 is a standard, as in a standardised encoding of Unicode; but it's one among many, and the choice of UTF-8 over other encodings is a de facto standard in most cases (though there are certainly cases where it's an explicit and official standard, such as in XML). To escape the whole argument, and because the Magic number section had become rather bloated and messy, I've rewritten it and trimmed it down. It now repeats itself less, contains less irrelevant information, and totally avoids the question of what sort of standard UTF-8 represents. :-) -- Perey (talk) 12:23, 26 July 2012 (UTC)

Looks like a very good compromise. Belorn (talk) 20:04, 26 July 2012 (UTC)
As the BOM is now clearly marked as superfluous, I see no problem with this text. --Schily (talk) 10:48, 27 July 2012 (UTC)

Coining of "shebang" (to mean #!)

For the record, I believe I was the one to coin this particular usage of "shebang", sometime in the late 1980s. By the time I posted [8] in 1989 I was already trying to get people to adopt the term. (The Usenet article in question was patch 7 for Perl 3.0.) I'm not aware that anyone else coined it independently, though of course that is always possible. (Sorry for the anonymous post, but someone seems to have grabbed "TimToady" already, and I don't think it was me.) --Larry Wall 71.139.24.65 (talk) 02:49, 4 September 2012 (UTC)

Since the slang word shebang was in fairly widespread use prior to computer culture, were you aware of that usage? I would trace the whole shebang back to that word; it seems unavoidable. 68.174.97.122 (talk) 14:12, 22 February 2013 (UTC)

Proposed merge with Interpreter directive

Neither this article nor Interpreter directive is really noteworthy enough for a Wikipedia article, but if we merge them together they might just pass. Felixphew (Ar! Ar! Ar!) 07:16, 19 June 2014 (UTC)

Except in UNOS (1982), #! is not an interpreter directive but a kernel feature. Schily (talk) 13:23, 14 July 2014 (UTC)