
RE: [WEB SECURITY] Advisory: Attack of the Mongolian space evaders... (and other Medieval XSS vectors)



It is similar, but I want to point out some important differences.  Most of what you're pointing out aren't white space characters.  Of the two characters you found in FF3, only the first (&#65279, U+FEFF) is the BOM.  UTF-8 has no endianness the way UTF-16 and UTF-32 do, so there is no little- or big-endian form of a UTF-8 BOM.  The second character (&#65534, U+FFFE) is a reserved code point (a noncharacter), and it has different binary properties and a different general category than the BOM.  BTW, Mozilla has been working on a fix for the way formatting characters get handled by the JavaScript interpreter.
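To make the BOM point concrete, here is a minimal Python sketch (not part of the original tests) that decodes the two FF3 values and shows how U+FEFF serializes in each encoding. The variable and function names are made up for illustration, and the categories printed depend on the Unicode data bundled with the Python build.

```python
import codecs
import unicodedata

# The two decimal values reported for FF3.
first = chr(65279)   # &#65279 -> U+FEFF
second = chr(65534)  # &#65534 -> U+FFFE


def describe(ch):
    """Return the code point and general category of a single character."""
    return "U+%04X category=%s" % (ord(ch), unicodedata.category(ch))


# U+FEFF has exactly one UTF-8 serialization; byte order only matters
# for UTF-16 and UTF-32.
utf8_bom = first.encode("utf-8")
utf16_le = first.encode("utf-16-le")
utf16_be = first.encode("utf-16-be")

print(describe(first))
print(describe(second))
print("UTF-8:", utf8_bom.hex(), "UTF-16LE:", utf16_le.hex(), "UTF-16BE:", utf16_be.hex())
```

U+FFFE is what a reader sees when it decodes a UTF-16 BOM with the wrong byte order, which is probably where the "little- and big-endian" description came from.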

The characters in your FF2 list all share something in common - they have the general category Cf [Other, Format] assigned to them.  The RTL and LTR marks also have the pattern_whitespace binary property assigned, and some of the other chars are assigned a bidi control property, but what they all have in common is Cf.

The characters in my list are different - they're all assigned general category Zs [Separator, Space] and assigned a binary property of white space.  
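The Cf versus Zs split can be checked with a short Python sketch like the one below. It is only an illustration: the list names are made up here, and the answers come from whatever Unicode version ships with the Python build. In particular, U+180E was Zs when this was written but was reclassified as Cf in Unicode 6.3, so a current interpreter will likely report Cf for it.

```python
import unicodedata

# Decimal NCR values from the FF2 list.
FF2_LIST = [8204, 8205, 8206, 8207, 8234, 8235, 8236, 8237, 8238,
            8298, 8299, 8300, 8301, 8302, 8303, 65279]

# Code points from the "treated as a space" list.
SPACE_LIST = list(range(0x2002, 0x200B)) + [0x205F, 0x3000, 0x180E, 0x1680]


def categories(code_points):
    """Map each code point to its Unicode general category."""
    return {cp: unicodedata.category(chr(cp)) for cp in code_points}


for label, code_points in (("FF2 list", FF2_LIST), ("space list", SPACE_LIST)):
    print(label)
    for cp, cat in categories(code_points).items():
        name = unicodedata.name(chr(cp), "<unnamed>")
        print("  U+%04X  %s  %s" % (cp, cat, name))
```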

The last thing I'd point out is how my test cases differ from PDP's.  It looks like he's generating decimal NCRs (numeric character references) to insert in the middle of function names and other parts of the statement.  I believe the NCR approach is fine for this, but I generated tests that insert the UTF-8 or UTF-16 encoded bytes directly into the file, mainly because I'm testing more than HTML, but also because I don't know or trust how NCRs get interpreted across browsers, HTML, XML, CSS, and JavaScript.
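A rough sketch of the two approaches side by side, in Python. This is not the actual test generator; make_cases and its template are hypothetical, with the template borrowed from the <a href=#[U+180E]onclick=alert()> example quoted below.

```python
def make_cases(code_point, template="<a href=#{sep}onclick=alert()>"):
    """Build one test string per delivery method for a single code point."""
    ch = chr(code_point)
    return {
        # NCR approach: the separator stays ASCII and the parser decodes it.
        "ncr": template.format(sep="&#%d;" % code_point).encode("ascii"),
        # Raw approach: the encoded bytes of the character go into the file.
        "utf-8": template.format(sep=ch).encode("utf-8"),
        "utf-16-le": template.format(sep=ch).encode("utf-16-le"),
    }


cases = make_cases(0x180E)
for method, payload in cases.items():
    print(method, payload)
```

Writing each payload to its own file in binary mode keeps the bytes intact, so the same code point can be dropped into HTML, XML, CSS, or script files without depending on NCR handling.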

Chris


  

-----Original Message-----
From: Bil Corry [mailto:bil@xxxxxxxxx] 
Sent: Sunday, September 14, 2008 7:01 PM
To: websecurity@xxxxxxxxxxxxx
Subject: Re: [WEB SECURITY] Advisory: Attack of the Mongolian space evaders... (and other Medieval XSS vectors)

Chris Weber wrote on 9/13/2008 4:52 PM: 
> The following code points all get treated as a space.  Making things like:
> 
> <a href=#[U+180E]onclick=alert()>
> 
> possible. This list includes many of the Unicode code points with the 
> white_space property:
> 
> U+2002 to U+200A
> U+205F
> U+3000
> U+180E Mongolian Vowel Separator
> U+1680 Ogham Space Mark

It's similar to what gnucitizen pointed out for Firefox last year:

	http://www.gnucitizen.org/blog/snippets-of-defense-ptiv/

When I ran his JavaScript script at the time with FF2, it found these as the whitespace chars that FF2 allows:

	&#8204
	&#8205
	&#8206
	&#8207
	&#8234
	&#8235
	&#8236
	&#8237
	&#8238
	&#8298
	&#8299
	&#8300
	&#8301
	&#8302
	&#8303
	&#65279

Re-running it again with FF3, I get this:

	&#65279
	&#65534

which is the UTF-8 BOM in little- and big-endian.


- Bil


----------------------------------------------------------------------------
Join us on IRC: irc.freenode.net #webappsec

Have a question? Search The Web Security Mailing List Archives: 
http://www.webappsec.org/lists/websecurity/archive/

Subscribe via RSS: 
http://www.webappsec.org/rss/websecurity.rss [RSS Feed]

Join WASC on LinkedIn
http://www.linkedin.com/e/gis/83336/4B20E4374DBA


