Talk:Cross-site scripting

Exploit scenarios

CLARIFICATION NEEDED ON Type-0 attack: In this section under the subsection Type-0 attack, bullet #3 says, "The malicious web page's JavaScript opens a vulnerable HTML page installed locally on Alice's computer." There is no explanation of how the vulnerable HTML page got installed locally on Alice's computer or how Mallory knew about it. This is the crux of the attack, so without this part of the explanation the scenario is not useful. I haven't found an answer to this or I would have corrected the article. I'm hoping someone with more knowledge of this attack will read this and add the clarifying information.

Terminology

The first paragraph in section: "Other forms of mitigation" is garbage. Just quoting text will not stop it from being interpreted as HTML. I can always put "> into the text to close the tag. That whole section should be either removed or heavily modified. It is naive and inaccurate. --129.97.84.62 15:06, 4 April 2006 (UTC)

The use of the word "quoting" in the entire article is very ambiguous, which is why Mr 129.97.84.62 misunderstood the article. We should take the time to clarify: "quoting" should be replaced with "encoding". --Blaufish 19:46, 3 May 2006 (UTC)

Indeed, many people are very confused by "quoting" in HTML. I believe this is the official terminology for encoding HTML special characters, and this was mentioned at the beginning of the "Avoiding XSS vulnerabilities" section. However, for casual readers who don't read that section and aren't familiar with the term in this context, it would probably be best to use "encode" or "encoding" more often. -- TDM 05:07, 28 May 2006 (UTC)

I agree... "HTML quoting" made absolutely no sense to me, and information about what that means is not easily available. Google searches for "HTML quoting means," "what is HTML quoting," "what does HTML quoting," and the ever-popular define:"HTML quoting" all yield no results. And there is no Wikipedia entry for "HTML quoting." If it is the official terminology, there should probably be a wiki article about it, as I'm still uncertain what other people think it is (or what its common usage is). "Encoding" seems better to me.
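To make the "quoting" versus "encoding" distinction in this thread concrete, here is a minimal sketch in Python. The function names and the `<input>` template are hypothetical and only for illustration; `html.escape` is the standard-library call for encoding HTML special characters. It shows why merely wrapping a value in quotation marks does not help, as 129.97.84.62 pointed out, while encoding the special characters does.

```python
import html


def quote_only(value: str) -> str:
    """Naive 'quoting': wrap the value in double quotes without encoding it."""
    return '<input value="' + value + '">'


def encode(value: str) -> str:
    """Encoding: replace HTML special characters with character entities."""
    return '<input value="' + html.escape(value, quote=True) + '">'


# A value that closes the attribute and the tag, then opens a script element.
payload = '"><script>alert(1)</script>'

print(quote_only(payload))  # the script element ends up as live markup
print(encode(payload))      # the payload stays inside the attribute as text
```

In the second case the browser sees only character entities inside the attribute value, so there is nothing for the `">` sequence to close.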

Avoiding XSS Scripting: blacklisting vs. whitelisting

There should be some mention of the two different approaches -- blacklisting (i.e., removing anything that can be recognised as a potential script injection) and whitelisting (i.e., only allowing stuff that can be determined not to be a potential script injection). If I had references for this kind of stuff, I'd add it myself, but I came here looking for them. :( JulesH 17:10, 27 July 2006 (UTC)
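As a rough illustration of the two approaches described above (not taken from the article or from any reference), the following Python sketch contrasts a blacklist filter with a whitelist check. The regular expressions, the function names, and the idea of validating a username field are all assumptions chosen for the example.

```python
import re

# Blacklist: remove anything recognised as a script tag (hypothetical rule).
SCRIPT_TAG = re.compile(r"<\s*/?\s*script[^>]*>", re.IGNORECASE)

# Whitelist: accept only values made of characters known to be harmless
# (hypothetical rule for a username field).
USERNAME = re.compile(r"[A-Za-z0-9_]{1,32}")


def blacklist_filter(value: str) -> str:
    """Strip script tags and pass everything else through unchanged."""
    return SCRIPT_TAG.sub("", value)


def whitelist_accepts(value: str) -> bool:
    """Accept the value only if every character is on the allowed list."""
    return USERNAME.fullmatch(value) is not None


# An injection that uses an event handler instead of a script tag.
payload = "<img src=x onerror=alert(1)>"

print(blacklist_filter(payload))       # unchanged: the blacklist never anticipated it
print(whitelist_accepts(payload))      # False: '<', '=', ' ' and '>' are not allowed
print(whitelist_accepts("alice_01"))   # True
```

The blacklist has to anticipate every possible injection, whereas the whitelist only has to describe what is known to be safe, which is why it fails closed on inputs nobody thought of.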

This "avoiding" section is written from a programmer's point of view. How do users avoid these problems? Justforasecond 14:55, 31 July 2006 (UTC)

True, it is written from a programmer's perspective, which is most relevant. Users can do very little to avoid such attacks, but perhaps a few things should be mentioned. All I can think of from the users' side is disabling scripting in browsers (usually unworkable) and not trusting links sent to them via email. --TDM 13:28, 8 August 2006 (UTC)

Or NoScript for Firefox ^.^ 58.160.188.225 09:34, 28 March 2007 (UTC)

Examples

I like the recent example of PayPal's XSS hole. However, it isn't mentioned what type it is. Is it a type 1 XSS? If so, we can probably remove the ATutor example, since it isn't very well known, and replace it with the PayPal one. We should also keep the number of examples down to 4-5 if possible. It could easily grow to 1000 if everyone put their favorite in, but we don't need that. --TDM 13:32, 8 August 2006 (UTC)

No one has responded to this, so I went ahead and ripped out several half-complete examples. It seems this section is becoming a bulletin board for script kiddies to advertise. Honestly people... XSS holes are a dime a dozen. Posting them to popular security mailing lists is more than enough to get your name out there. I did remove the ATutor example and improved some others, but some still don't list what type of XSS they are. If those who posted them could describe them a bit more, that would make this section more consistent and complete. TDM 22:51, 26 October 2006 (UTC)

Restructuring

Someone added the notice recently that this page may not meet Wikipedia's standards due to the need for restructuring. Could whoever added that elaborate? I don't see any new comments here specific to that evaluation. If there's some other organization that would work better, I'd be willing to improve the document. TDM 23:57, 19 October 2006 (UTC)

Guilty as charged. I should have added a note with some suggestions. Firstly, I found the article confusing and difficult to read, despite having a 15-year background as a systems software engineer. Specifically, the article does not _begin_ with a clear definition of cross-site scripting. Secondly, the sections characterising the types of cross-site scripting are hard to read. I would suggest that in a sense the article could be considered as "written backwards" in that the _examples_ given just after the "types" section show the clearest writing, and are the nearest thing the article has to a clear _definition_. Consider moving these to the front of the article since a "definition by example" would be an improvement. So "restructuring" the article could mean moving things around into a more logical order so that things are clearly defined _before_ they are referenced. And if clear definitions are not easily obtainable, perhaps because of lack of consensus, then definition-by-exemplification is definitely the way to go. CecilWard 10:20, 20 October 2006 (UTC)

I have expanded the first paragraph's description and re-ordered the first two sections, which will hopefully help a bit. I don't have time to rewrite the types at the moment, but I did try to clean up the real world examples a bit. I agree that an example early in the article will help those who don't have a clear grasp of all of the background material, but I think it should be relatively short and as simple as possible. When I originally put most of the text together, I wanted to be sure to put the vulnerability in the context of the same-origin policy, otherwise the technical reason why XSS is even a vulnerability at all may be difficult to understand. Because of the order in which things are referenced (e.g. "XSS" abbreviation), a major reordering would require a lot of rewriting as well. However, I agree that the long background section, coupled with the terminology section makes for a long read before casual readers get to any solidifying examples. Thanks for the feedback. TDM 22:48, 26 October 2006 (UTC)

Link to HTML Purifier

I'd like to add a link to HTML Purifier in the Prevention section, as it implements the most reliable method: parsing and stripping all tags/attributes not in the whitelist (as well as other protection). Unfortunately, I wrote the library, so if I add the link myself, it's vanity. So could someone take a look and, if it looks useful, add the link for me? Thanks! — Edward Z. Yang^(Talk) 23:36, 29 November 2006 (UTC)

A bit quiet around here hmm... I'll wait another week. — Edward Z. Yang^(Talk) 02:37, 2 December 2006 (UTC)

In my not-so-humble opinion, stripping tags is never the most reliable method. HTML entity encoding is likely the only safe method. Sure, you can develop a complex stripping system that is designed only to allow good things through, and this might work most of the time, but browsers are just too inconsistent for this to let many sleep well at night. I don't care if you link to it, but don't change the text saying it is the best way to go or anything like that. TDM 17:32, 23 January 2007 (UTC)

The article already has text in "Avoiding XSS vulnerabilities" that states: "The most reliable method is for web applications to parse the HTML, strip tags and attributes that do not appear in a whitelist, and output valid HTML." (Which I did not add to the article). It probably is POV, but I think it's correct (we'll need to find a citation for it, then). Making a complex stripping system is not impossible: as HTML Purifier demonstrates, it has been done.

Browser inconsistency is a trickier issue, but I believe that it too poses no problem as long as you enforce standards-compliant code. Browsers begin to have wildly differing interpretations of HTML when it's ambiguous, when you have things like <IMG src="http://ha.ckers.org/" style"="style="a /onerror=alert(String.fromCharCode(88,83,83))//" >`> . If you get rid of this craziness and enforce well-formed XHTML, you're gold. (Just don't allow comments). — Edward Z. Yang^(Talk) 04:34, 25 January 2007 (UTC)
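For readers following this thread, the "parse, strip everything not on the whitelist, output well-formed markup" idea can be sketched with Python's standard `html.parser` module. This is only an illustration under stated assumptions: it is not how HTML Purifier works internally, the allowed tags, attributes, and URL schemes are hypothetical choices, and Python's parser does not reproduce any browser's handling of malformed markup, which is exactly the concern TDM raises.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical whitelist, chosen only for this example.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https"}


def href_is_allowed(value: str) -> bool:
    """Accept only absolute URLs whose scheme is on the whitelist."""
    try:
        return urlsplit(value).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; encode all text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.open_tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, not repaired
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not href_is_allowed(value):
                continue
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return
        # Close every tag opened since the matching one, keeping output nested.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(html.escape(data, quote=False))

    # Comments are dropped: HTMLParser.handle_comment does nothing by default.


def sanitize(markup: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    while parser.open_tags:  # close anything the input left open
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


print(sanitize('<b>bold</b> <a href="javascript:alert(1)">link</a>'))
print(sanitize("<img src=x onerror=alert(1)>"))
```

Because the output is rebuilt from parsed tokens and not patched in place, whatever reaches the browser is well formed regardless of how ambiguous the input was. Whether that is enough in practice depends on the parser agreeing with browsers about what the input meant.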

Vulnerability example/demonstration

I'm not familiar enough with this article to know exactly where this should go, but I think this presentation of a Google Desktop vulnerability is extremely educational - they show how such small vulnerabilities in this case end up cascading into complete control over the victim's computer. The vulnerabilities they use are all patched (I think including one glitch that's server-side), so they no longer work, so it should be safe to show. This sounds like Type 2 in the article. —AySz88 \^-^ 20:39, 22 February 2007 (UTC)

That looks like a good resource. I wouldn't have a problem with its addition. Is there a plain-text version, though? — Edward Z. Yang^(Talk) 00:38, 25 February 2007 (UTC)

XSS v CSS

I was almost certain we'd previously had a discussion on this, but obviously this is not the case. So, I'll bring it to the floor now.

I am strongly opposed to including the acronym "CSS" in the introduction paragraph of the article. It is a misleading term that no one uses anymore, as the Terminology section already states, and thus, while deserving mention in that section, should not be in the intro. — Edward Z. Yang^(Talk) 22:42, 28 February 2007 (UTC)

It is true that most people (especially in the security community) today no longer use "CSS" to refer to cross-site scripting, since this acronym can refer to another technology. Nevertheless, AFAIK, a few existing articles (including some more recent ones) on the Internet still use this acronym, or use both acronyms simultaneously to refer to cross-site scripting (examples of using both: [1] and [2]). While we should certainly discuss the more appropriate or preferred term in the main article (e.g. in the Terminology section), it seems better that other terms are also mentioned in the intro as long as they are still used by some people or can be commonly found. Or, we can rephrase the intro statement a bit to make it clearer.--64.231.71.28 08:16, 1 March 2007 (UTC)

I can see where you're coming from. Maybe we could bump it to the end of the intro paragraph. — Edward Z. Yang^(Talk) 22:27, 1 March 2007 (UTC)

It seems good.--64.231.71.28 01:38, 2 March 2007 (UTC)

This article is very well written

It's well-structured, concise, disambiguating, sufficiently detailed, and very clear. 64.221.248.17 22:24, 6 April 2007 (UTC)

following information is so good.

There's also a singing group

called XSS. I don't know how to do disambiguation pages, and I'm not an expert on XSS (that's why I was looking them up), but maybe someone can help clarify this? All I know about XSS is that they sing sort of hip-hop style R&B in English, and that they're at least popular in the Middle East.

The Reason For Wiki Formatting?

Are XSS and the difficulty with interpreting and reformatting HTML some of the reasons why wikis don't use HTML for formatting? I know that one reason for not using HTML is that it might be difficult for some wiki users to learn. But it seems that the wiki formatting also helps prevent XSS while giving the users some control. --Lance E Sloan 16:57, 8 August 2007 (UTC)

Yes, wikis use alternate formatting languages largely due to XSS. If they allowed raw HTML, it would be trivial to hijack anyone else's account and post on their behalf in most cases. Obviously alternate languages can be easier for non-programmers to learn, but I think this is the main reason. Keep in mind that the use of an alternate language does not by itself prevent XSS. It must be very carefully implemented. I've seen bulletin board posting languages which allowed injection through attribute parameters. In the language I was testing, one would specify something like: [link url="http://..."]text[endlink] to produce <a href="http://...">text</a>. However, if you supplied "http://evil.example.org/%22%3e" as the URL, the page would render as <a href="http://evil.example.org/">">text</a>, indicating an obvious injection. TDM 13:40, 15 November 2007 (UTC)
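The attribute injection described above can be reproduced in a few lines of Python. This is a sketch, not the actual bulletin board code: the function names are hypothetical, and it assumes the board percent-decoded the URL before pasting it into the href attribute, which the comment does not state but which would explain the rendered output.

```python
import html
from urllib.parse import unquote


def render_link_vulnerable(url: str, text: str) -> str:
    """Assumed flaw: decode the URL and paste it into the attribute as-is."""
    return '<a href="' + unquote(url) + '">' + text + "</a>"


def render_link_encoded(url: str, text: str) -> str:
    """Same conversion, but with HTML special characters encoded on output."""
    return (
        '<a href="' + html.escape(unquote(url), quote=True) + '">'
        + html.escape(text)
        + "</a>"
    )


url = "http://evil.example.org/%22%3e"

print(render_link_vulnerable(url, "text"))  # the decoded "> closes the tag early
print(render_link_encoded(url, "text"))     # the decoded "> stays inside href
```

The alternate markup language itself is not the protection here. What matters is that every value copied into the generated HTML is encoded for the place it lands in.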

Maybe the wrong place, but...

I have been getting strange XSS warnings in FF 2.0.0.6 from Wikipedia articles with images lately. Does anyone know if there has been a change in the template formatting of images or if it's a FF bug?

vulnerability or attack?

Isn't cross-site scripting really an attack and not a vulnerability? The vulnerability is most clearly input validation. The attack is script injection, of which cross-site scripting is a specific type. Do we agree? —Preceding unsigned comment added by 198.169.188.227 (talk) 19:32, 5 September 2007 (UTC)

Well, I would agree that cross-site scripting could be used to refer to an attack. However, there is a vulnerability at the core of it which allows the attack to succeed. I strongly disagree with the assertion that it's an "input validation" flaw, because the real problem is output encoding. These are very different issues, even though people tend to lump them together. What if you want your application to handle nearly any kind of input (free-form text field with multiple languages/character sets) but don't want it to be vulnerable? You can't validate the input carefully (and prevent HTML special characters from getting in there), but you *can* encode the output. It's an injection flaw, whose correct fix is to treat special characters as literals. Yes, you can use validation up-front in 95% of the cases to mitigate the problem, and you *should* do this, since input validation can mitigate other types of vulnerabilities as well, but it is just a mitigation. TDM 13:31, 15 November 2007 (UTC)
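A small Python sketch of the point about free-form text: the input below contains HTML special characters and several scripts, so there is no sensible validation rule that would accept it and still keep markup out, yet encoding at output time renders it harmlessly and loses nothing. The `render_comment` function and the `<p>` wrapper are hypothetical names for illustration.

```python
import html


def render_comment(comment: str) -> str:
    """Store the input unchanged; encode special characters only on output."""
    return "<p>" + html.escape(comment, quote=True) + "</p>"


# Free-form input a validator could not reasonably reject.
comment = 'Use <b> for bold; 5 < 7 & "café" > 日本語 <script>alert(1)</script>'

rendered = render_comment(comment)
print(rendered)

# Decoding the entities gives back exactly what the user typed,
# so the special characters were treated as literals, not removed.
print(html.unescape(rendered[3:-4]) == comment)
```

Input validation would still be worth adding for fields with a known shape, such as dates or numeric IDs, but as the comment says, that is a mitigation layered on top of the encoding and not a replacement for it.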