This is the talk page for discussing improvements to the Shebang (Unix) article.
This is not a forum for general discussion of the article's subject.
WikiProject Computing / Software (Rated C-class, Low-importance)
- 1 Wikipedia's Gestapo Edit Reviews
- 2 Disambiguation
- 3 Alternate names
- 4 Ajax
- 5 env security
- 6 UTF-8 identification
- 7 Unneeded Shebang
- 8 Examples
- 9 UTF-8 as a de facto standard
- 10 Coining of "shebang" (to mean #!)
- 11 Proposed merge with Interpreter directive
- 12 the pathname of the script
- 13 Merge
- 14 Too Linux Centric?
Wikipedia's Gestapo Edit Reviews
I've made a number of small edits to this page as well as other pages. So far nearly all of them have been rolled back, including ones like this where a citation was requested. We'll see if the change I made gets rolled back or not. There is a request for original research to be reviewed. I will not invest any time in this if even my minor changes get rolled back. I've been developing software for over 20 years and am qualified to review and improve this page. But not if anything I do gets rolled back!
Alternate names
An edit was reverted because the sources were claimed to be unreliable; however, they are cited directly by the author of TLDP's Advanced Scripting Guide -- http://tldp.org/LDP/abs/html/sha-bang.html#FTN.AEN201 18.104.22.168 (talk) 21:38, 19 January 2012 (UTC)
- Just as a continuation - searching Google Books comes up with several publications calling a shebang a hash pling, pound-bang, and sha-bang. 22.214.171.124 (talk) 21:47, 19 January 2012 (UTC)
- Citing a source doesn't confer reliability upon it, so even if it's cited by a reliable source, http://www.in-ulm.de/~mascheck/various/shebang should not be used to reference any facts in this article. If the facts are actually included in the reliable source, the reliable source can of course be used. If the facts are in books, you can use those too.
Ajax
I can't work out what the last sentence in the lead paragraph is saying:
The "shebang" or "hashbang" name is also sometimes used of state-preserving fragment identifiers in Ajax applications; Google Webmaster Central specifies that fragment identifiers starting with an exclamation point (...url#!state...) are indexed specially by the Googlebot.
env security
I think the phrasing about env security is unnecessarily convoluted. It is also quite a free interpretation of the source. My understanding is that the env approach is dangerous only if you use it for suid programs (or similar) without checking PATH, which is a recipe for disaster anyway. --LPfi (talk) 12:05, 27 April 2012 (UTC)
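To make the PATH point concrete, here is a minimal Python sketch. It is not taken from the article or its source. It assumes a Unix-like system and uses shutil.which as a stand-in for the lookup that env performs; the function name and the fake "python3" file are made up for illustration.

```python
import os
import shutil
import stat
import tempfile


def demo_poisoned_path():
    """Show that a PATH lookup picks whichever 'python3' comes first on PATH.

    Returns (resolved, fake): the path the lookup chose and the path of the
    stand-in executable planted in a directory placed first on PATH.
    """
    with tempfile.TemporaryDirectory() as planted_dir:
        fake = os.path.join(planted_dir, "python3")
        with open(fake, "w") as handle:
            handle.write("#!/bin/sh\necho 'not the interpreter you expected'\n")
        # Mark the stand-in as executable so the lookup accepts it.
        os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)

        # Hypothetical poisoned PATH: the planted directory comes first.
        poisoned_path = planted_dir + os.pathsep + os.defpath
        resolved = shutil.which("python3", path=poisoned_path)
        return resolved, fake
```

A script starting with "#!/usr/bin/env python3" relies on the same kind of lookup, so whoever controls PATH controls which interpreter runs. That is harmless for ordinary user scripts and a real problem for anything running with elevated privileges.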
UTF-8 identification
The page now says "UTF-8 can reliably be recognised as such by a simple algorithm".
This is wrong - UTF-8 cannot be reliably distinguished from any other encoding that uses 8 bits per character unless the file happens to contain an invalid UTF-8 sequence. As an example, "aÃ¨!" in ISO-8859-1 is "aè!" if read as UTF-8. There is no way for an algorithm to say it's one way or the other. -- Preceding unsigned comment added by 126.96.36.199 (talk) 10:33, 31 May 2012 (UTC)
- If there are more than a few non-ASCII bytes and there is no invalid UTF-8, the probability that it is not UTF-8 is very low. No mathematical evidence is needed.--BIL (talk) 11:08, 31 May 2012 (UTC)
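Both positions above can be seen in a few lines of Python. This is only a sketch; looks_like_utf8 is a name made up here, and "decodes without error" is the "simple algorithm" being assumed.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without error.

    A True result does not prove the bytes were written as UTF-8;
    it only says that nothing rules UTF-8 out.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The four bytes from the example above: valid under both encodings.
sample = b"a\xc3\xa8!"
as_utf8 = sample.decode("utf-8")         # three characters
as_latin1 = sample.decode("iso-8859-1")  # four characters

# The same text saved as ISO-8859-1 is not valid UTF-8.
latin1_only = b"a\xe8!"
```

So a short sample is genuinely ambiguous, while text that really is ISO-8859-1 tends to fail the check as soon as it contains an accented letter followed by an ASCII character, which is why the probability argument holds for longer files.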
Unneeded Shebang
Some people say they do not need the BOM in UTF-8, as UTF-8 has no specific byte order.
On old Unix, Shebang is not compatible with BOM.
Nowadays, text files are generally written in UTF-8, and often with BOM.
As a result, files that have both a BOM and a shebang do not work on old Unix.
But scripts with both BOM and shebang can still be launched by naming the interpreter to use on the command line.
So my question:
"Nowadays, text files are generally written in UTF-8, and often with BOM." Agreed on the first claim, since any ASCII file is also UTF-8. But the second ... can you please name one widely used Unix file that's UTF-8 with a BOM? For example, in one of the thousands of Debian packages? JöG (talk) 04:46, 2 February 2013 (UTC)
As for your specific question: the shebang is needed if you intend to use it. The exec family of functions can only run a few different types of programs. Executables are one type; text files with a shebang line are another. exec() doesn't try to guess. JöG (talk) 04:46, 2 February 2013 (UTC)
- The wrappers in libc (execvp, etc) will attempt to use the shell if execve does not recognize the file format. So arguably there is some (limited) guessing going on. Ewx (talk) 10:35, 3 February 2013 (UTC)
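To illustrate why a BOM gets in the way, here is a small Python sketch. It assumes the check is made on the very first two bytes of the file, which is how the magic number is described in this discussion. classify_script_header is a hypothetical helper, not something the kernel or libc provides.

```python
import codecs


def classify_script_header(head: bytes) -> str:
    """Classify the first bytes of a file the way a '#!' check would see them."""
    if head.startswith(b"#!"):
        # The magic number is in place: the file is handled as a script.
        return "shebang"
    if head.startswith(codecs.BOM_UTF8 + b"#!"):
        # The three BOM bytes (EF BB BF) come first, so the first two
        # bytes are not '#!' and the shebang is not recognised.
        return "bom-then-shebang"
    return "no-shebang"
```

For example, classify_script_header(b"#!/bin/sh\n") gives "shebang", while the same line preceded by a UTF-8 BOM gives "bom-then-shebang". In that second case the file can still be run by naming the interpreter explicitly, or it may end up in the shell fallback that Ewx mentions.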
Examples
The existing examples are repetitive, mostly "me-toos" which are uninteresting because they are conventional scripting languages. Pruning those and adding a few different examples such as "make" and "env" would benefit the reader. TEDickey (talk) 10:02, 13 July 2012 (UTC)
- I agree with this. The point of the examples is to demonstrate use of #!, not to enumerate everything you can possibly use it with (for which a category would make more sense). The article does not need multiple shells and multiple scripting languages in its examples. Ewx (talk) 07:53, 20 May 2013 (UTC)
- I strongly disagree. The options that must follow each program on the first line of a script differ widely: sed, awk and csh need -f, and tcc needs -run, while some programs need no options at all (PHP, Python) and others need different options again ((scsh, this), (guile, -s), (gs, -dNOSAFER)). A brief list of how to use different programs' options, including all of the common ones (awk, etc.) that require options, is helpful to have. I don't read it as "me-too"; I read it as "This is how to invoke this common program in a script." Maybe this should be rewritten more clearly. TL;DR: We should list the programs that require flags after the shebang in the examples section. -- 188.8.131.52 (talk) 17:04, 26 October 2014 (UTC)
- I doubt it would be longer than a paragraph. After all, many or most of the programs either don't require any options or just require '-f' or another small option or two, with the exceptions above. If it does get too long, we could always come up with some kind of limitation. (most-popular languages list or some such thing.) -- 184.108.40.206 (talk) 17:08, 3 November 2014 (UTC)
UTF-8 as a de facto standard
Could you expand on why you removed that information? Simply not liking it is not a reason to remove it (see WP:IDONTLIKEIT).  and  both clearly show that UTF-8 is considered a de facto standard, and I could get more citations if needed (those two were just the first hits I found). Belorn (talk) 09:37, 26 July 2012 (UTC)
- Your claim is wrong: UTF-8 is not a de facto standard but a real standard. It is however just one of many standards in the area of encoding. Your change made the article less exact. --Schily (talk) 09:54, 26 July 2012 (UTC)
- Not my claim. To cite:
- "The most widely accepted ("de facto standard") character encoding method is UTF-8." (ibm.com)
- "Modern Linux installations use UTF-8 for their environment in any country with any language and is currently the de facto standard for to represent text" (Red Hat Glossary)
- "UTF-8 is the defacto standard console and text file on modern systems, though other encodings are still common" (Mercurial project)
- "UTF-8: Unicode for all regions, mostly in 1-3 Octets (new de facto standard)" (linuxtopia.org)
- "For many projects on Linux, the de facto standard is to use UTF-8." (unifont.org), and so on. Python has also moved to making UTF-8 the default for text.
- "itself (ASDF) only recognizes one encoding beside :default, and that is :utf-8, which is the de facto standard, already used by the vast majority of libraries that use more than ASCII." (ASDF manual published by common-lisp.net).
- This is just a small subset of the articles I found in a quick Google search of a few minutes. Books should have even more statements like that. Is there a reason they are not relevant sources for the suggested change to the article? -- Preceding unsigned comment added by Belorn (talk • contribs) 11:48, 26 July 2012 (UTC)
You're both right; UTF-8 is a standard, as in a standardised encoding of Unicode; but it's one among many, and the choice of UTF-8 over other encodings is a de facto standard in most cases (though there are certainly cases where it's an explicit and official standard, such as in XML). To escape the whole argument, and because the Magic number section had become rather bloated and messy, I've rewritten it and trimmed it down. It now repeats itself less, contains less irrelevant information, and totally avoids the question of what sort of standard UTF-8 represents. :-) -- Perey (talk) 12:23, 26 July 2012 (UTC)
- As the BOM is now clearly marked as superfluous, I see no problem with this text. --Schily (talk) 10:48, 27 July 2012 (UTC)
Coining of "shebang" (to mean #!)
For the record, I believe I was the one to coin this particular usage of "shebang", sometime in the late 1980s. By the time I posted  in 1989 I was already trying to get people to adopt the term. (The Usenet article in question was patch 7 for Perl 3.0.) I'm not aware that anyone else coined it independently, though of course that is always possible. (Sorry for the anonymous post, but someone seems to have grabbed "TimToady" already, and I don't think it was me.) --Larry Wall 220.127.116.11 (talk) 02:49, 4 September 2012 (UTC)
- Since the slang word shebang was in fairly widespread use prior to computer culture, were you aware of that usage? I would trace the whole shebang back to that word; it seems unavoidable. 18.104.22.168 (talk) 14:12, 22 February 2013 (UTC)
Proposed merge with Interpreter directive
Neither this article nor Interpreter directive is really noteworthy enough for a Wikipedia article, but if we merge them together they might just pass. Felixphew (Ar! Ar! Ar!) 07:16, 19 June 2014 (UTC)
- Except for UNOS (1982) #! is not an interpreter directive but a kernel feature. Schily (talk) 13:23, 14 July 2014 (UTC)
the pathname of the script
It would be nice if this article would mention how a shebang shell script can set a variable to a pathname to the script. This pathname is useful so the script can use relative paths to reference other files it needs. I have shell code to do this, but it isn't pretty and I'm not sure it's robust. Perhaps someone knows a bulletproof way to do this and could add it to this article.
Encyclopedant (talk) 08:23, 6 December 2014 (UTC)
- That depends on the language used and is nothing to do with #!. In general consult a tutorial or reference documentation for your choice of language. Ewx (talk) 09:41, 6 December 2014 (UTC)
- There have been many discussions in the POSIX standard teleconferences and the reason why this feature is not yet integrated in the standard is that there is currently no suitable proposal to deal with variable path names. Note that the POSIX standard does not specify pathnames and that #! <command> needs an absolute path name for <command> in order to avoid making it a security problem. Schily (talk) 17:09, 8 December 2014 (UTC)
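As Ewx says, this depends on the language. For what it is worth, here is a minimal Python sketch of the usual idiom, offered as an illustration rather than a bulletproof recipe. script_dir and sibling are names made up here, and the script path is assumed to come from sys.argv[0] or __file__.

```python
from pathlib import Path


def script_dir(script_path: str) -> Path:
    """Directory that contains the script, with symlinks and '..' resolved."""
    return Path(script_path).resolve().parent


def sibling(script_path: str, name: str) -> Path:
    """Path of a file that lives next to the script."""
    return script_dir(script_path) / name


# Typical use inside a script (data.txt is a hypothetical file name):
#     import sys
#     config = sibling(sys.argv[0], "data.txt")
```

Resolving symlinks is a design choice: it finds files next to the real script even when the script is started through a symlink elsewhere, which may or may not be what a given script wants.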
Merge
Based on all the talk at Interpreter directive, the answer to merging the two is clearly no. One is a list; the other gives details and history. The point was made that there are different versions. I will wait a short time before removing this merge request, since there is no agreement on it. Tag removed. Telecine Guy 04:50, 26 October 2015 (UTC)
Too Linux Centric?
#!path arg translates to this system call: execve("path", ["path", "arg", "script"], env); where "script" is the pathname of the script itself.
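The translation can be spelled out a little further. This is a minimal Python sketch that assumes Linux-style handling: everything after the interpreter path is passed as one single argument, and the script's own pathname and the original arguments are appended after it. Other systems split the line differently, which is rather the point of this section's title. shebang_argv is a made-up helper name, not a real API.

```python
def shebang_argv(first_line: bytes, script_path: str, script_args=()):
    """Build the argv an interpreter would receive for a '#!' script.

    Returns None if the line is not a usable shebang line.
    Assumption: Linux-style parsing, where the text after the interpreter
    path is passed on as a single argument and not split on spaces.
    """
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].decode("utf-8", "replace").strip()
    if not rest:
        return None

    # Split once: the interpreter path, then the optional single argument.
    parts = rest.split(None, 1)
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1].strip())

    # The script's own pathname comes next, then the caller's arguments.
    argv.append(script_path)
    argv.extend(script_args)
    return argv
```

Under these assumptions, running ./s.awk in.txt with a first line of "#!/usr/bin/awk -f" yields ["/usr/bin/awk", "-f", "./s.awk", "in.txt"], and a line such as "#!/usr/bin/env python3 -u" hands env the single argument "python3 -u", which is a well-known portability trap.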