Uncontrolled format string
Uncontrolled format string is a type of software vulnerability, discovered around 1999, that can be used in security exploits. Previously thought harmless, format string exploits can be used to crash a program or to execute harmful code. The problem stems from the use of unchecked user input as the format string parameter in certain C functions that perform formatting, such as printf(). A malicious user may use the %x format token, among others, to print data from the stack or possibly other locations in memory. One may also write arbitrary data to arbitrary locations using the %n format token, which commands printf() and similar functions to write the number of bytes formatted so far to an address stored on the stack.
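The behavior of %n can be seen in a benign setting with a fixed, trusted format string. The following is a minimal sketch; the helper name bytes_counted_by_n is hypothetical, and some hardened C libraries restrict or disable %n, so results there may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from any library): formats a constant,
 * trusted format string and returns the value that %n stored. */
int bytes_counted_by_n(char *out, size_t size)
{
    int count = -1;   /* overwritten by %n below */

    assert(out != NULL && size > 0);

    /* %n prints nothing. It stores the number of bytes produced so far
     * through the int* argument that corresponds to it. Here that
     * pointer is supplied by the program; in an attack it is whatever
     * value happens to sit in the matching argument slot. */
    snprintf(out, size, "hello%n world", &count);
    return count;
}
```

Called with a 32-byte buffer, the function returns 5, the length of "hello", while the buffer receives "hello world".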
A typical exploit uses a combination of these techniques to force a program to overwrite the address of a library function or the return address on the stack with a pointer to some malicious shellcode. The padding parameters to format specifiers are used to control the number of bytes output, and the %x token is used to pop bytes from the stack until the beginning of the format string itself is reached. The start of the format string is crafted to contain the address that the %n format token can then overwrite with the address of the malicious code to execute.
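The role of padding can be illustrated without any attack: the field width of a conversion determines how many bytes are output, and therefore the value that a later %n stores. This is a sketch under stated assumptions; count_after_padding is a hypothetical name, and the width is passed through a trusted "%*x" conversion here, whereas an attacker would embed a literal width in the injected format string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the count stored by %n after a single
 * hexadecimal conversion padded to at least `width` characters. */
int count_after_padding(int width)
{
    char out[128];
    int count = -1;

    /* Keep the output inside the buffer so nothing is truncated. */
    assert(width >= 0 && width < (int)sizeof out);

    /* "%*x" takes its minimum field width from an int argument. The
     * value 0x2a prints as "2a", left-padded with spaces to `width`. */
    snprintf(out, sizeof out, "%*x%n", width, 0x2au, &count);
    return count;
}
```

A width of 100 yields a stored count of 100, while a width of 1 yields 2, because the width is only a minimum and "2a" already occupies two characters.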
Because format bugs were long thought harmless, they went uncorrected in many common tools, which made this a common vulnerability. MITRE's CVE project lists roughly 500 vulnerable programs as of June 2007, and a trend analysis ranks it the 9th most-reported vulnerability type between 2001 and 2006.
Format string bugs most commonly appear when a programmer wishes to print a string containing user-supplied data. The programmer may mistakenly write printf(buffer) instead of printf("%s", buffer). The first version interprets buffer as a format string and parses any formatting instructions it may contain. The second version simply prints a string to the screen, as the programmer intended.
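The difference between the two calls can be sketched with snprintf, which follows the same formatting rules as printf but writes to a buffer. The function name format_untrusted is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into `out` verbatim and
 * returns the length of that text, as snprintf reports it. */
int format_untrusted(char *out, size_t size, const char *user_text)
{
    assert(out != NULL && user_text != NULL && size > 0);

    /* Unsafe variant, shown only as a comment:
     *     snprintf(out, size, user_text);
     * would parse user_text as a format string, so tokens such as
     * %x or %n inside it would be acted upon. */

    /* Safe variant: the format string is a constant, and user_text is
     * treated purely as data for the %s conversion. */
    return snprintf(out, size, "%s", user_text);
}
```

With the safe variant, input such as "%x %n %s" is copied unchanged, and none of its tokens is interpreted.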
Format bugs arise because C's argument passing conventions are not type-safe. In particular, the varargs mechanism allows functions to accept any number of arguments (e.g. printf) by "popping" as many arguments off the call stack as they wish, trusting the early arguments to indicate how many additional arguments are to be popped, and of what types.
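A small variadic function makes this trust relationship concrete. The sketch below uses the standard <stdarg.h> macros; sum_ints is a hypothetical example, not a library function.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments. The function has
 * no way to verify how many arguments were really passed, or their
 * types. It relies entirely on the first argument. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* va_arg fetches the next argument as an int. If the caller
         * passed fewer arguments than `count` claims, or arguments of
         * another type, the behavior is undefined. */
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

The call sum_ints(3, 1, 2, 3) returns 6. A call such as sum_ints(3, 1) compiles as well, but reads arguments that were never passed, which is the same mismatch a hostile format string induces in printf.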
Format string bugs can occur in programming languages other than C, although they appear less frequently and usually cannot be exploited to execute code of the attacker's choice.
Format bugs were first noted in 1990 in the fuzz testing work done at the University of Wisconsin (see Miller, Fredriksen, So 1990). They called these bugs "interaction effects" and noted their presence when testing the C shell (csh).
The use of format string bugs as an attack vector was discovered by Tymm Twillman during a security audit of the ProFTPd daemon. The audit uncovered an snprintf that directly passed user-generated data without a format string. Extensive tests with contrived arguments to printf-style functions showed that use of this for privilege escalation was actually possible. This led to the first posting in September 1999 on the Bugtraq mailing list regarding this class of vulnerabilities, including a basic exploit. It was still several months, however, before the security community became aware of the full dangers of format string vulnerabilities, as exploits for other software using this method began to surface. The first exploits leading to a successful privilege escalation attack were published simultaneously on the Bugtraq list in June 2000 by Przemysław Frasunek and a person using the nickname tf8. The seminal paper "Format String Attacks" by Tim Newsham was published in September 2000.
Many compilers can statically check format strings and produce warnings for dangerous or suspect formats.
This check is only useful for detecting bad format strings that are known at compile time. If the format string may come from the user or from a source external to the application, the application must validate the format string before using it. Care must also be taken if the application generates or selects format strings on the fly.
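Compile-time checking can be extended to a program's own printf-style wrappers. The sketch below assumes GCC or Clang, which provide a format function attribute for this purpose. The macro PRINTF_LIKE and the function log_message are hypothetical names, and on other compilers the macro expands to nothing, so no checking takes place.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumption: GCC-compatible compilers accept the format attribute. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Hypothetical wrapper. Parameter 3 is the format string and the
 * variable arguments start at parameter 4, so the compiler can check
 * call sites the same way it checks calls to printf itself. */
int log_message(char *out, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int log_message(char *out, size_t size, const char *fmt, ...)
{
    va_list args;
    int written;

    assert(out != NULL && fmt != NULL && size > 0);

    va_start(args, fmt);
    written = vsnprintf(out, size, fmt, args);
    va_end(args);
    return written;
}
```

With this declaration, a call such as log_message(buf, sizeof buf, "%s=%d", "id", 7) is checked against its format string when GCC's -Wformat is enabled. A call that passes a non-literal string as fmt with no further arguments is the pattern that -Wformat-security is intended to flag.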
See also 
- Cross-application scripting exploits a similar kind of programming error
- Improper input validation
- SQL injection is a similar attack that succeeds when input is not filtered
- Robert C. Seacord, Secure Coding in C and C++. Addison Wesley, September, 2005. ISBN 0-321-33572-4
- Tobias Klein, Buffer Overflows und Format-String-Schwachstellen. Dpunkt Verlag, ISBN 3-89864-192-9.
- Crispin Cowan, Software Security for Open-Source Systems. IEEE Computer Society, IEEE Security & Privacy, January/February 2003, http://computer.org/security
- Barton Miller, Lars Fredriksen and Bryan So, An Empirical Study of the Reliability of UNIX Utilities. Communications of the ACM, vol. 33, no. 12 (December 1990). Also appears (in German translation) as Fatale Fehlerträchtigkeit: Eine empirische Studie zur Zuverlässigkeit von UNIX-Utilities, iX, March 1991. http://www.cs.wisc.edu/~bart/fuzz/
- Crispin Cowan. FormatGuard: Automatic Protection From printf Format String Vulnerabilities. Proceedings of the 10th USENIX Security Symposium, August 2001. http://www.usenix.com/events/sec01/full_papers/cowanbarringer/cowanbarringer.pdf
- "CWE-134: Uncontrolled Format String". Common Weakness Enumeration. MITRE. December 13, 2010. Retrieved March 5, 2011.
- Bugtraq: Format String Vulnerabilities in Perl Programs
- Bugtraq: Exploit for proftpd 1.2.0pre6
- 'WUFTPD 2.6.0 remote root exploit' - MARC
- 'WuFTPD: Providing *remote* root since at least1994' - MARC
- Bugtraq: Format String Attacks
- Warning Options - Using the GNU Compiler Collection (GCC)