
[WEB SECURITY] HPP -- What is it, and what types of attacks does it augment?



HTTP Parameter Pollution -- I see this term cropping up again, in some
upcoming presentations and BlackHat DC, and because of my work I still
get consistent questions about "HPP".

I wrote this response back in May, and realized I never posted it.

So "Parameter Pollution" is one of those absolutely terrible terms
(like "cookie poisoning") that fails to describe anything meaningful
about what it is, while casting the widest net possible.

Let us try to define it:

1) What is it?
2) What attacks does it augment?
3) Where does it fit in classification and taxonomy?

First and most importantly -- Stefano and Luca's work is excellent.
Thank you. I really love the server/technology enumeration chart. Very
useful! Great work again. (though terrible name :)


Second: while it focuses on query-string delimiters and HTTP control
characters -- there is a bigger tiger by the tail.

The stuff you touch on directly relates to Arshan's paper on HTTP
method abuses. Many of the same root causes. Also some of the
techniques I have discussed, including a few things I was unable to
publish @OWASP last year, but maybe 2010 I'll get around to it.

Here is my take on HPP and the overall bucket of issues. Merry Christmas!

nota bene: Much of the material is stolen from security peoples much
smarter than I over the years, over coffee and beers. I have long
since stopped trying to remember who came up with what on the backs of
napkins or in test sessions. If I forget to credit someone in the
notes below, please remind me. And safely assume most everything below
is stolen from someone smarter than myself.

---

Down with the HPP: What is HPP?

Concepts:

     HPP as coined is too large a bucket. More than one concept is
being mixed here. A more precise term for HPP as currently
defined might be "Unexpected, Out of Order, Concatenated, Multi-value,
and Malformed Parameter Injection".


What HPP is not:

     Ivan Ristic's "Impedance Mismatch" is clearly related to many
of these, but as a system-design weakness. HPP examples published so
far are more like attack-vectors, and go beyond Impedance Mismatch.
Though as defined today, Impedance Mismatch could fall into this huge
bucket.


Why HPP is hard to classify:

     Syntax Attacks -- obviously HPP techniques can be leveraged to
perform filter-evasion and launch all sorts of syntax-attacks, from
XSS to Control Character Injection (HTTP/RS, or back end system
attacks) to SQLi. Yet HPP does not fall exclusively into
"syntax-attack vectors".

     Semantic/Business Logic Attacks -- HPP can also be used to attack
business logic and semantic workflow, by injecting out-of-order,
unexpected-location, or multiple values into a request. Yet it is not
exclusively an attack-vector for business-logic attacks. So what is
it?

I think in most cases it is most easily classified as an "impedance
mismatch" or "canonicalization weakness", but due to the complexity of
those weaknesses across multiple interpreters, HPP is probably easiest
to describe using attacks as examples, rather than the weaknesses.
(Thoughts?)

HPP appears to me to be the following different categorical buckets:

     1.1 Delimited Value Injection
     1.2 Multi-Value Injection
     1.3 Out-of-Order Value Injection
     1.4 Malformed Value Injection via Control Characters
     1.5 All of the above through using HTTP Malformed Request Methods

At a high-level:

     1.1 Delimited Value Injection: Where one input takes many
delimited "things" as one big name=value pair. Example:
"?NAME=value:value1:value2" and "name:value:name:unexpectedcommand"
type injections targeting back-end interpreters that trusted the web
front-end validation. Your work covers this excellently for HTTP and
query-string delimiters; I am not sure I had ever seen most of the
specific examples you provide.

Both Sverre H. Huseby and HD Moore have done work on injection using
delimiters here. Sverre published some of his online, and in his book
Innocent Code. HDM also helped me with delimiter-based attacks via
webapps fronting mainframes and Tandems years ago, though I don't
think any of it ever got published. HD would recall better. In some
cases it was possible to create back-end buffer overflows using these
same techniques as well! e.g. - :name:beyond:Buffer:Sized:Value: and
:name:BufferSizedValue:BeyondBufferSizedValue:

We definitely crashed services but we never could get an exploit
going, and chalked this up to our lack of knowledge on Tandem
shell-coding.
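
To make 1.1 concrete, here is a minimal sketch. It assumes a front end that validates only the character set of a colon-delimited field, and a back end that splits the same field into name/value pairs. Both functions (frontend_accepts, backend_parse) and the field layout are invented for illustration; real mainframe or Tandem record formats will differ.

```python
import re

# Hypothetical front-end check: only the character set of the whole
# field is validated, and the ":" delimiter is allowed through.
FIELD_RE = re.compile(r"[A-Za-z0-9:]+")

def frontend_accepts(field: str) -> bool:
    return FIELD_RE.fullmatch(field) is not None

def backend_parse(field: str) -> dict:
    """Hypothetical back end: reads the field as name:value:name:value."""
    parts = field.split(":")
    # Pair up consecutive items; a trailing unpaired item is ignored.
    return dict(zip(parts[0::2], parts[1::2]))

expected = "account:1001"
injected = "account:1001:role:admin"   # extra pair the front end never modeled

for field in (expected, injected):
    print(frontend_accepts(field), backend_parse(field))
```

Both fields pass the front-end check, but the back end sees an extra "role" pair in the second one. The mismatch is in how many "things" each layer believes the field contains.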


     1.2 Multi-Value Injection. Where
&name=value&name=value&name=value are injected, but not all are
validated and interpreted in the same way.

I have not seen this very often but it seems to occur where two or
more code-bases have been merged into one application. I have some
guesses why this occurs but no concrete source examples. Possibly
order-processing issues with arrays?
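
A small sketch of how 1.2 could play out, assuming three components that each pick a different occurrence of a repeated name. The three "view" functions are hypothetical stand-ins for a validator, a downstream interpreter, and a platform that concatenates repeated values; only parse_qs is a real library call.

```python
from urllib.parse import parse_qs

query = "name=alice&name=bob&name=x%27%20OR%201%3D1--"
values = parse_qs(query)["name"]   # every occurrence, in request order

def validator_view(vals):
    """Hypothetical validation layer: looks at the first occurrence only."""
    return vals[0]

def interpreter_view(vals):
    """Hypothetical downstream component: uses the last occurrence."""
    return vals[-1]

def concatenating_view(vals):
    """Hypothetical platform that joins repeated values with commas."""
    return ",".join(vals)

print(validator_view(values))
print(interpreter_view(values))
print(concatenating_view(values))
```

The validator sees a harmless value while the interpreter receives the last one, which was never checked.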


     1.3 Out-of-Order Value Injection: similar to 1.2, but where
ACTION is expected before ROLE, and you supply
&action=read&role=users&ACTION=EDIT
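
One way the request above might slip through, sketched under assumptions: an authorization check that looks up the exact-case name and stops at the first match, and a dispatcher that folds case and lets the last match win. Both functions are hypothetical; I have no source for a real application that does exactly this.

```python
from urllib.parse import parse_qsl

query = "action=read&role=users&ACTION=EDIT"
pairs = parse_qsl(query)   # ordered list of (name, value) tuples

def authz_view(pairs):
    """Hypothetical check: exact-case lookup, first match wins."""
    for name, value in pairs:
        if name == "action":
            return value
    return None

def dispatcher_view(pairs):
    """Hypothetical dispatcher: case-insensitive names, last match wins."""
    merged = {}
    for name, value in pairs:
        merged[name.lower()] = value.lower()
    return merged.get("action")

print(authz_view(pairs), dispatcher_view(pairs))
```

The check authorizes "read" while the dispatcher performs "edit".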


    1.4 Malformed Value Injection via Control Characters: take the
classic HTTP/RS or CRLF injection:
&name=value&sessionID=1234&returnURL=%3fCRLFSET-COOKIE:%20sessionID=SessionFixation
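
A rough sketch of the same request, assuming "CRLF" in the example stands for the encoded bytes %0d%0a and that a hypothetical handler copies returnURL straight into a Location header. naive_redirect and checked_redirect are invented names, not any framework's API.

```python
from urllib.parse import parse_qs

# Assumption: "CRLF" in the example stands for the encoded bytes %0d%0a.
query = ("name=value&sessionID=1234"
         "&returnURL=%3f%0d%0aSet-Cookie:%20sessionID=SessionFixation")

return_url = parse_qs(query)["returnURL"][0]

def naive_redirect(url: str) -> str:
    """Hypothetical handler that copies input straight into a header."""
    return "HTTP/1.1 302 Found\r\nLocation: " + url + "\r\n\r\n"

def checked_redirect(url: str) -> str:
    """Same handler, but refusing CR and LF in the header value."""
    if "\r" in url or "\n" in url:
        raise ValueError("control character in header value")
    return naive_redirect(url)

response = naive_redirect(return_url)
header_lines = response.split("\r\n\r\n")[0].split("\r\n")
print(header_lines)
```

The naive handler emits a third header line, Set-Cookie, that the application never intended to send. The checked variant raises instead.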


     1.5 Malformed Request Methods. This is closely related to
Aspect Security & Arshan's work on HTTP Method Abuses. Note the
examples I posted at the time on that thread; I have also reported
these to Microsoft in relation to .NET Filter Evasions. I break
down many examples of these in my filter evasion paper, but the scope
of recombinant possibilities is vast.

This one always gets me, as it is tough to find but trivial to
exploit. Because of the significant mathematical overhead required to
test all possible permutations of syntax and semantic attack vectors
black-box, black-box testing does not seem to be the best way to find
them. The tests produce a very low exploitable hit-rate for the very
high volume of injections required to discover them accurately. I
think there are significant ways to optimize these tests, but that is
an ongoing research project at my employer and we do not have solid
statistical samples yet to say what works.

These are so clearly related to specific programming practices that it
is an area where I would expect source-code scanners (and human
auditors) to excel at discovering. However, when I would find these, I
would frequently find them while black-box testing applications that
had already been through source code scanning or human source-code
audits. I think this is because in some cases the code in question was
part of a default app server or framework function out-of-scope of the
review. In other cases it appeared to be missed because of some subtle
code added to post-processing of the entire session object after
validation, where it was not obvious to the scanner or reviewer that
the input could still be tainted at that point. </best_guess>


One other minor point:

Many types of XSS filter evasions, and double/triple encoded syntax
attacks work for the exact same root reasons as HPP-leveraged attacks.

The web servers, app servers, and code components all can make
different decisions about how to interpret BOTH the NAMEs and VALUEs
in a pair, and the same occurs when you string NAME=VALUE together in
interesting ways using metacharacters, and even multiple/new
combinations of NAMEs and VALUEs.

Long and short -- don't just test the VALUEs. Sometimes the NAMEs are
vulnerable to these attacks as well, especially when the entire
NAME=VALUE is used in dynamic queries, or in the DOM, etc. etc. Some
IDEs abstract the name of objects such that developers never think
about validating the name, or where you have NAME1, NAME2, and NAME3
they loosely accept NAME* which introduces all sorts of crazy
attack-vector possibilities.
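
Here is a sketch of the NAME problem, assuming a hypothetical query builder that binds every VALUE as a parameter but interpolates every NAME into the SQL text. The table name, allow-list, and function names are all made up for the example.

```python
from urllib.parse import parse_qsl

def build_filter(query: str):
    """Hypothetical query builder: VALUEs are bound, NAMEs are interpolated."""
    clauses, args = [], []
    for name, value in parse_qsl(query):
        clauses.append(name + " = ?")   # NAME goes into the SQL text as-is
        args.append(value)              # VALUE is passed as a bound parameter
    return "SELECT * FROM items WHERE " + " AND ".join(clauses), args

ALLOWED_NAMES = {"color", "size"}

def build_filter_checked(query: str):
    """Same builder, rejecting any NAME that is not on an explicit allow-list."""
    for name, _ in parse_qsl(query):
        if name not in ALLOWED_NAMES:
            raise ValueError("unexpected parameter name: " + repr(name))
    return build_filter(query)

sql, args = build_filter("color=red&1%3D1%20OR%20size=small")
print(sql)
print(args)
```

Every VALUE here is handled safely, yet the NAME "1=1 OR size" still rewrites the WHERE clause.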

----

I believe there is a hierarchy of attack and weakness here, but there
are so many things contained in "HPP" that I find it hard to break
them out logically.

I attempted to create a logical structure for these attacks in
Section 4 of my Syntax-attack/Filter Evasion monolith (for those of
you who have reviewed this). I hope it contains enough explanation to
give folks more ideas on where to go with this. If not, just ask me
for examples or clarification on any section.

another nota bene: The end of Section 4 (which I am posting below) on
HTTP methods attacks was significantly refined & expanded after
Arshan's paper from Aspect on this.

That work deserves full credit for HTTP method abuse documentation. I
have found and documented issues like these over the years, where
people explicitly validate variables or auth out of a specific part of
the request object (specific string that is a subset of a specific
method), but the application processes ALL values out of the entire
session object. These can lead to syntax-based filter evasion attacks,
and also to authC/Z bypasses.

People also bind arbitrary methods to default (unvalidated) method
handlers as well, opening up new attack vectors. You'll find mention
of this here and there in books on webappsec testing, and fuzzing HTTP
values (change GET to POST to HEAD, etc.). But Aspect/Arshan's work on
this is really the only single, formally documented set of examples I
know of.


This is my class structure for filter evasion in webapp Black Box testing:

	
4. Weak Input Validation bypasses for Regex and Metacharacter Ninjas

	4.1 -- The Regex BORK
		4.1.1 Null Character Injection: validation fails out on "null" value

		4.1.2 Meta Character Injection ([]{}<>): force validation [match] or
'escape' before/after attack string is processed

		4.1.3 Control Character Injection (CRLF, EOF, ::, etc): break up
attack strings in ways validation routine does not understand, but
target interpreter does
		
		4.1.4 Escape Sequences (//, <!-- comments etc.): see 4.1.3

		4.1.5 Padding Sequences (00, **): see 4.1.1

		4.1.6 Concatenation Sequences ( CharChar, Char+Char, CHcharAR,
etc.): bypass weak regex blacklist removal/replacements

		4.1.7 Regex parameter replacement by injecting commonly
matched/replaced parameters: attack regex directly, see 4.1.6
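
As an illustration of the Concatenation Sequences item, here is a minimal sketch of a blacklist filter that removes a token in a single pass. The filter and payload are hypothetical; real filters are usually regex-based, but the failure mode is the same.

```python
def strip_once(text: str) -> str:
    """Hypothetical blacklist filter: removes the token in a single pass."""
    return text.replace("<script>", "")

def strip_until_stable(text: str) -> str:
    """Repeats the removal until the text stops changing."""
    while True:
        cleaned = text.replace("<script>", "")
        if cleaned == text:
            return cleaned
        text = cleaned

payload = "<scr<script>ipt>alert(1)"
print(strip_once(payload))
print(strip_until_stable(payload))
```

Removing the inner token glues the outer fragments back into the very string the filter meant to remove. Looping until stable closes this particular hole, though rejecting the input outright is generally less fragile than rewriting it.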


	4.2 -- The Delimiter Regex Bypass
		4.2.1 Web parsers: Email addy example (alphanum@xxxxxxxxxxx ==
alpha|attack|num@do|attack|main.site)
		
		4.2.2 Non-web parsers: injecting data and commands into back-end
systems (mainframes, Tandems, VAX, etc.) "cookie=value:value1:value2"
and "name:value:name:unexpectedcommand". See "Innocent Code" for more
examples.
		
		4.2.3 SMTP: email SMTP shell-script post-processing examples
injected through HTTP

		4.2.4 Common Date Field regex match abuses (2/2/2 == 2/2/attack/2/2)


	4.3 -- The Trailing Regex Match (people write sloppy leading/trailing
regex matches):
		4.3.1 Account Number 000001001001 == 0<attack>000001001001,
<attack>1001001, etc.
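
A short sketch of the sloppy-match problem, assuming the account number is meant to be exactly 12 digits. The pattern and the two check functions are illustrative only.

```python
import re

ACCOUNT = r"\d{12}"

def sloppy_check(value: str) -> bool:
    """Hypothetical check: passes if a 12-digit run appears anywhere."""
    return re.search(ACCOUNT, value) is not None

def anchored_check(value: str) -> bool:
    """Passes only if the whole value is exactly 12 digits."""
    return re.fullmatch(ACCOUNT, value) is not None

good = "000001001001"
bad = "0<attack>000001001001"

print(sloppy_check(good), sloppy_check(bad))
print(anchored_check(good), anchored_check(bad))
```

The unanchored search accepts the attack string because a valid account number appears somewhere inside it.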
		
		
	4.4 -- The Arbitrary Value Injection: Parsers Process New Things They
Should Not Process:
		4.4.1 Arbitrary Created-Value Injection: newValue=ATTACK
		
		Where all objects are arbitrarily processed, but only
explicitly-known and explicitly-named objects are validated.
		
		
		4.4.2 Arbitrary Persisted-Value Injection: /page1.asp?newValue=foo,
/page2.asp?newValue=fooATTACK
		
		Where one can create a new object that is arbitrarily persisted in
only one specific location of the application. However, after you
inject the new, benign object, you can attack the object later after
it is persisted in a location where it is NOT explicitly validated.
See 4.4.1
		
		4.4.3 Arbitrary Multi-Value Injection: knownValue=foo:foo2:fooU
		
		Where only a subset of delimited values are validated, and others
passed on to interpreters that process unvalidated values in the
delimited chain.
		
		4.4.4 Arbitrary NAME=value injection: NAME(attack)=expected_VALUE or
ATTACK=expected_VALUE
		
		Where object NAME is the source of injection, where NAME is weakly
validated if at all, but loosely accepted (while =VALUE is rigorously
validated). NAME and not VALUE becomes the attack-vector. These can be
tough to fix if the source issue is in an unmanaged code product the
application is using.
		
		4.4.5 First/Odd/Last Order Value Processing:
name=foo1;name=foo2;name=foo3;name=fooATTACK
		
		Where validation occurs on first or last same-name object and you
can execute attack by identifying "impedance mismatch" between
validation and target interpreter processing
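
For 4.4.1 (Arbitrary Created-Value Injection), a minimal sketch assuming a validator that only knows about explicitly named fields. VALIDATORS and both functions are hypothetical; the point is only the gap between "validated" and "processed".

```python
from urllib.parse import parse_qsl

# Hypothetical per-field validators for the explicitly known names.
VALIDATORS = {"qty": str.isdigit, "sku": str.isalnum}

def validate_known(params: dict) -> bool:
    """Hypothetical validator: checks known names, ignores everything else."""
    return all(check(params[name])
               for name, check in VALIDATORS.items() if name in params)

def validate_strict(params: dict) -> bool:
    """Also rejects any name that has no validator."""
    return set(params) <= set(VALIDATORS) and validate_known(params)

params = dict(parse_qsl("qty=2&sku=AB12&newValue=%3Cattack%3E"))

print(validate_known(params), validate_strict(params))
print(params["newValue"])
```

The lenient validator passes the request, and newValue travels on to whatever processes all objects arbitrarily.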
		
		
		
	4.5 -- The Unexpected/Malformed HTTP Method Injection
		
		4.5.1 HTTP Malformed Valid Method Request:  Where POSTdata expects
&NAME=VALUE, and you inject ?NAME=ATTACK in the URI on HTTP POST
Request, and vice-versa.
		
		The validation routine in these exploit examples usually explicitly
validates Request.Data.Querystring or Request.Data.Postdata or
Request.Data.Headers ONLY. You inject the expected NAME with =ATTACK
value in the reverse of the expected GET/POST or HTTP Header value
location and it bypasses the validation routine checking only an
explicit subset of the request object.
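
A sketch of 4.5.1, assuming a handler that validates POST data only and then reads NAME from a merged view in which URI values override POST values. The merge order and the function names are assumptions made for the example; actual frameworks differ on precedence.

```python
from urllib.parse import parse_qs

def is_safe(value: str) -> bool:
    return value.isalnum()

def handle_post(query_string: str, post_body: str) -> str:
    """Hypothetical handler: validates POST data only, then reads a merged view."""
    post = parse_qs(post_body)
    if not all(is_safe(v) for vals in post.values() for v in vals):
        return "rejected"
    # Assumption: URI values override POST values in the merged request object.
    merged = {**post, **parse_qs(query_string)}
    return merged["NAME"][0]

print(handle_post("", "NAME=%3Cattack%3E"))          # attack in POST data
print(handle_post("NAME=%3Cattack%3E", "NAME=ok"))   # attack moved to the URI
```

The attack is rejected in the location the validator checks, and passes untouched from the location it does not.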
		
		
		4.5.2 First/Odd/Last Order Value Processing on expected HTTP Method:
like 4.5.1
		
		Where validation only occurs on first occurrence of a VALUE, or
expected-location of a value in the HTTP request. You inject your
attack into NAME=VALUE in a part of the HTTP Request that is processed
by the target interpreter, where validation occurs ONLY on the
legitimate NAME=VALUE in the expected part of the request.

		
		4.5.3 NAME/VALUE processing on Unexpected HTTP Method: HTTP WEAK
/foo.php?query=' DROP Table Users ;-- like 4.5.1
		
		Where validation only occurs on expected HTTP Methods. By using
unexpected or arbitrary methods (like HEAD, WEAK, or PASS)
NAME=VALUE/ATTACK is passed if validation only occurs on explicit
expected methods (e.g.-GET), but the arbitrary method is mapped to a
default method (GET) processed by the target interpreter after
bypassing the validation routine. See the paper by Aspect Security for
detailed examples of some of these.
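
A sketch of 4.5.3, assuming a dispatcher that validates only the methods it knows and lets anything else fall through to the default GET logic. All names are hypothetical, and run_query just echoes its input rather than touching a database.

```python
def is_safe(value: str) -> bool:
    return value.isalnum()

def run_query(value: str) -> str:
    """Stand-in for the target interpreter; it only echoes the value."""
    return "processed: " + value

def handle(method: str, query_value: str) -> str:
    """Hypothetical dispatcher: validates known methods, defaults the rest to GET."""
    if method in ("GET", "POST") and not is_safe(query_value):
        return "rejected"
    # Unknown methods skip the check above but still reach the GET logic.
    return run_query(query_value)

def handle_strict(method: str, query_value: str) -> str:
    """Refuses unknown methods before any processing happens."""
    if method not in ("GET", "POST"):
        return "405 Method Not Allowed"
    return handle(method, query_value)

attack = "' DROP Table Users ;--"
print(handle("GET", attack))
print(handle("WEAK", attack))
print(handle_strict("WEAK", attack))
```

With GET the attack is rejected. With the made-up method WEAK it reaches the interpreter unvalidated, unless unknown methods are refused up front.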

---

Does this structure make sense?

If so I will put work into mapping this all together and publishing it.

Or, alternately, one of you take it and run with it, and do something
smarter than I can.

Cheerio,

-- 
Arian Evans
I invest most of my money in motorcycles, mistresses, and martinis.
The rest of it I spend frivolously.
