Statically Detecting Likely Buffer Overflow Vulnerabilities
David
Larochelle
larochelle@cs.virginia.edu
University of Virginia, Department of Computer Science
David
Evans
evans@cs.virginia.edu
University of Virginia, Department of Computer Science
Abstract
Buffer overflow attacks may be today’s single most important security threat. This paper presents a new approach to mitigating buffer overflow vulnerabilities by detecting likely vulnerabilities through an analysis of the program source code. Our approach exploits information provided in semantic comments and uses lightweight and efficient static analyses. This paper describes an implementation of our approach that extends the LCLint annotation-assisted static checking tool. Our tool is as fast as a compiler and nearly as easy to use. We present experience using our approach to detect buffer overflow vulnerabilities in two security-sensitive programs.
1. Introduction
Buffer overflow attacks are an important and persistent security problem. Buffer overflows account for approximately half of all security vulnerabilities [CWPBW00, WFBA00]. Richard Pethia of CERT identified buffer overflow attacks as the single most important security problem at a recent software engineering conference [Pethia00]; Brian Snow of the NSA predicted that buffer overflow attacks would still be a problem in twenty years [Snow99].
Programs written in C are particularly susceptible to buffer overflow attacks. Space and performance were more important design considerations for C than safety. Hence, C allows direct pointer manipulations without any bounds checking. The standard C library includes many functions that are unsafe if they are not used carefully. Nevertheless, many security-critical programs are written in C.
Several run-time approaches to mitigating the risks associated
with buffer overflows have been proposed.
Despite their availability, these techniques are not used widely enough
to substantially mitigate the effectiveness of buffer overflow attacks. The next section describes representative
run-time approaches and speculates on why they are not more widely used. We propose, instead, to tackle the problem
by detecting likely buffer overflow vulnerabilities through a static analysis
of program source code. We have implemented
a prototype tool that does this by extending LCLint [Evans96]. Our work differs from other work on static
detection of buffer overflows in three key ways: (1) we exploit semantic
comments added to source code to enable local checking of interprocedural
properties; (2) we focus on lightweight static checking techniques that have
good performance and scalability characteristics, but sacrifice soundness and
completeness; and (3) we introduce loop heuristics, a simple approach for
efficiently analyzing many loops found in typical programs.
The next section of this paper provides some background on buffer
overflow attacks and previous attempts to mitigate the problem. Section 3 gives an overview of our
approach. In Section 4, we report on
our experience using our tool on wu-ftpd
and BIND, two security-sensitive
programs. The following two sections
provide some details on how our analysis is done. Section 7 compares our work to related work on buffer
overflow detection and static analysis.
The simplest buffer overflow attack, stack smashing [AlephOne96], overwrites a buffer on the stack to
replace the return address. When the
function returns, instead of jumping to the return address, control will jump
to the address that was placed on the stack by the attacker. This gives the attacker the ability to
execute arbitrary code. Programs
written in C are particularly susceptible to this type of attack. C provides direct low-level memory access
and pointer arithmetic without bounds checking. Worse, the standard C library provides unsafe functions (such as
gets) that write an unbounded amount of
user input into a fixed size buffer without any bounds checking [ISO99]. Buffers stored on the stack are often passed
to these functions. To exploit such
vulnerabilities, an attacker merely has to enter an input larger than the size
of the buffer and encode an attack program binary in that input. The Internet Worm of 1988 [Spafford88, RE89]
exploited this type of buffer overflow vulnerability in fingerd.
More sophisticated buffer overflow attacks may exploit unsafe buffer
usage on the heap. This is harder,
since most programs do not jump to addresses loaded from the heap or to code
that is stored in the heap.
Several run-time solutions to buffer overflow attacks have been
proposed. StackGuard [CPMH+98] is a compiler
that generates binaries that incorporate code designed to prevent stack
smashing attacks. It places a special
value on the stack next to the return address, and checks that it has not been
tampered with before jumping. Baratloo,
Singh and Tsai describe two run-time approaches: one replaces unsafe library
functions with safe implementations; the other modifies executables to perform
sanity checking of return addresses on the stack before they are used
[BST00].
Software fault isolation (SFI) is
a technique that inserts bit mask instructions before memory operations to
prevent access of out-of-range memory [WLAG93]. This alone does not offer much protection against typical buffer
overflow attacks since it would not prevent a program from writing to the stack
address where the return value is stored.
Generalizations of SFI can insert more expressive checking around
potentially dangerous operations to
restrict the behavior of programs more generally. Examples include Janus, which observes and mediates behavior by
monitoring system calls [GWTB96]; Naccio [ET99, Evans00a] and PSLang/PoET [ES99,
ES00] which transform object programs according to a safety policy; and
Generic Software Wrappers [FBF99] which wraps system calls with security
checking code.
Buffer overflow attacks can be made more difficult by modifications
to the operating system that put code and data in separate memory segments,
where the code segment is read-only and instructions cannot be executed from
the data segment. This does not
eliminate the buffer overflow problem, however, since an attacker can still
overwrite an address stored on the stack to make the program jump to any point
in the code segment. For programs that
use shared libraries, it is often possible for an attacker to jump to an
address in the code segment that can be used maliciously (e.g., a call to system). Developers decided against using this
approach in the Linux kernel since it did not solve the real problem and it
would prevent legitimate uses of self-modifying code [Torvalds98, Coolbaugh99].
Despite the availability of these and other run-time approaches,
buffer overflow attacks remain a persistent problem. Much of this may be due to lack of awareness of the severity of
the problem and the availability of practical solutions. Nevertheless, there are legitimate reasons
why the run-time solutions are unacceptable in some environments. Run-time solutions always incur some
performance penalty (StackGuard reports performance overhead of up to 40%
[CBDP+99]). The other problem with
run-time solutions is that while they may be able to detect or prevent a buffer
overflow attack, they effectively turn it into a denial-of-service attack. Upon detecting a buffer overflow, there is
often no way to recover other than terminating execution.
Static checking overcomes these problems by detecting likely vulnerabilities before deployment. Detecting buffer overflow vulnerabilities by analyzing code in general is an undecidable problem.[1] Nevertheless, it is possible to produce useful results using static analysis. Rather than attempting to verify that a program has no buffer overflow vulnerabilities, we wish to have reasonable confidence of detecting a high fraction of likely buffer overflow vulnerabilities. We are willing to accept a solution that is both unsound and incomplete. This means that our checker will sometimes generate false warnings and sometimes miss real problems. Our goal is to produce a tool that produces useful results for real programs with a reasonable effort. The next section describes our approach. We compare our work with other static approaches to detecting buffer overflow vulnerabilities in Section 7.
Our static analysis tool is built upon LCLint [EGHT94, Evans96,
Evans00b], an annotation-assisted lightweight static checking tool. Examples of problems detected by LCLint include violations of
information hiding, inconsistent modifications of caller-visible state or uses
of global variables, misuses of possibly NULL
references, uses of dead storage, memory leaks and problems with parameters
aliasing. LCLint is actually used by working
programmers, especially in the open source development community [Orcero00, PG00].
Our approach is to exploit semantic comments (henceforth called annotations) that are added to source
code and standard libraries.
Annotations describe programmer assumptions and intents. They are
treated as regular C comments by the compiler, but recognized as syntactic
entities by LCLint using the @ following the /* to
identify a semantic comment. For
example, the annotation /*@notnull@*/ can be used syntactically like a
type qualifier. In a parameter
declaration, it indicates that the value passed for this parameter may not be NULL.
Although annotations can be used on any declaration, for this discussion
we will focus exclusively on function and parameter declarations. We can also use annotations similarly in
declarations of global and local variables, types and type fields.
Annotations constrain the possible values a reference can contain
either before or after a function call.
For example, the /*@notnull@*/ annotation places a constraint on
the parameter value before the function body is entered. When LCLint checks the function body, it
assumes the initial value of the parameter is not NULL. When LCLint checks a call site, it reports a
warning unless it can determine that the value passed as the corresponding
parameter is never NULL.
Prior to this work, all annotations supported by LCLint classified
references as being in one of a small number of possible states. For example, the annotation /*@null@*/
indicated that a reference may be NULL,
and the annotation /*@notnull@*/ indicated that a reference is not NULL.
In order to do useful checking of buffer overflow vulnerabilities, we
need annotations that are more expressive.
We are concerned with how much memory has been allocated for a buffer,
something that cannot be adequately modeled using a finite number of states. Hence, we need to extend LCLint to support a
more general annotation language. The
annotations are more expressive, but still within the spirit of simple semantic
comments added to programs.
The new annotations allow programmers to explicitly state function
preconditions and postconditions using requires
and ensures clauses.[2]
We can use these clauses to describe assumptions about buffers that are
passed to functions and constrain the state of buffers when functions
return. For the analyses described in
this paper, four kinds of assumptions and constraints are used: minSet, maxSet,
minRead and maxRead.[3]
When used in a requires clause, the minSet and maxSet
annotations describe assumptions about the lowest and highest indices of a
buffer that may be safely used as an lvalue (e.g., on the left-hand side of an assignment). For example, consider a function with an
array parameter a and an integer parameter i that has a precondition requires maxSet(a) >=
i. The analysis assumes that at the beginning
of the function body, a[i] may be used as an lvalue. If a[i+1] were used before any modifications
to the value of a or i, LCLint would generate a warning since the function preconditions
are not sufficient to guarantee that a[i+1] can be used safely as an
lvalue. Arrays in C start with index 0, so
the declaration
char buf[MAXSIZE]
generates the constraints
maxSet(buf) = MAXSIZE –
1 and minSet(buf) = 0.
Similarly, the minRead and maxRead
constraints indicate the minimum and maximum indices of a buffer that may be
read safely. The value of maxRead for a given buffer is always less
than or equal to the value of maxSet. In cases where there are elements of the
buffer have not yet been initialized, the value of maxRead
may be lower than the value of maxSet.
At a call site, LCLint checks that the preconditions implied by
the requires clause of the called function are
satisfied before the call. Hence, for
the requires
maxSet(a) >= i
example, it would issue a warning if it cannot determine that the array passed
as a is allocated to hold at least as many elements as the value
passed as i. If minSet or maxSet
is used in an ensures clause, it indicates the state of a
buffer after the function returns.
Checking at the call site proceeds by assuming the postconditions are
true after the call returns.
For checking, we use an annotated version of the standard library
headers. For example, the function strcpy
is annotated as[4]:
char *strcpy (char *s1, const char
*s2)
/*@requires maxSet(s1) >= maxRead(s2)@*/
/*@ensures maxRead(s1) == maxRead(s2) /\ result == s1@*/;
The requires
clause specifies the precondition that the buffer s1 is allocated to hold at least as
many characters as are readable in the buffer s2 (that is, the number of characters
up to and including its null terminator).
The postcondition reflects the behavior of strcpy – it copies the string pointed to
by s2 into the buffer s1, and returns that buffer. In ensures
clauses, we use the result keyword to denote the value
returned by the function.
Many buffer overflows result from using library functions such as strcpy in unsafe ways. By annotating the standard library, many buffer overflow vulnerabilities can be detected even before adding any annotations to the target program. Selected annotated standard library functions are shown in Appendix A.
In order to test our approach, we used our tool on wu-ftpd, a popular open source ftp server, and BIND (Berkeley Internet Name Domain), a set of domain name tools and libraries that is considered the reference implementation of DNS. This section describes the process of running LCLint on these applications, and illustrates how our checking detected both known and unknown buffer overflow vulnerabilities in each application.
We analyzed wu-ftp-2.5.0[5], a version with known security vulnerabilities.
Running LCLint is similar to running a compiler. It is typically run from the command line by listing the source code files to check, along with flags that set checking parameters and control which classes of warnings are reported. It takes just over a minute for LCLint to analyze all 17 000 lines of wu-ftpd. Running LCLint on the entire unmodified source code for wu-ftpd without adding any annotations resulted in 243 warnings related to buffer overflow checking.
Consider a representative message[6]:
ftpd.c:1112:2: Possible out-of-bounds store. Unable to
resolve constraint:
maxRead ((entry->arg[0] @ ftpd.c:1112:23)) <= (1023)
needed to satisfy precondition:
requires maxSet ((ls_short @ ftpd.c:1112:14))
>= maxRead ((entry->arg[0] @ ftpd.c:1112:23))
derived from strcpy precondition:
requires maxSet (<param 1>) >=
maxRead (<param 2>)
Relevant code fragments are shown below with line 1112 in bold:
char
ls_short[1024];
…
extern struct aclmember * getaclentry(char
*keyword, struct aclmember **next);
…
int main(int argc, char **argv, char **envp)
{
…
entry = (struct aclmember *) NULL;
if
(getaclentry("ls_short", &entry)
&& entry->arg[0]
&& (int)strlen(entry->arg[0]) > 0)
{
strcpy(ls_short,entry->arg[0]);
…
This code is part of the initialization code that reads configuration files. Several buffer overflow vulnerabilities were found in the wu-ftpd initialization code. Although this vulnerability is not likely to be exploited, it can cause security holes if an untrustworthy user is able to alter configuration files.
The warning message indicates that a possible out-of-bounds store was detected on line 1112 and contains information about the constraint LCLint was unable to resolve. The warning results from the function call to strcpy. LCLint generates a precondition constraint corresponding to the strcpy requires clause maxSet(s1) >= maxRead(s2) by substituting the actual parameters:
maxSet (ls_short @ ftpd.c:1112:14) >= maxRead (entry->arg[0] @ ftpd.c:1112:23).
Note that the locations of the expressions passed as actual parameters are recorded in the constraint. Since values of expressions may change through the code, it is important that constraints identify values at particular program points.
The global variable ls_short was declared as an array of 1024 characters. Hence, LCLint determines maxSet (ls_short) is 1023. After the call to getaclentry, the local entry->arg[0] points to a string of arbitrary length read from the configuration file. Because there are no annotations on the getaclentry function, LCLint does not assume anything about its behavior. In particular, the value of maxRead (entry->arg[0]) is unknown. LCLint reports a possible buffer misuse, since the constraint derived from the strcpy requires clause may not be satisfied if the value of maxRead (entry->arg[0]) is greater than 1023.
To fix this problem, we modified the code to handle these values safely by using strncpy. Since ls_short is a fixed size buffer, a simple change to use strncpy and store a null character at the end of the buffer is sufficient to ensure that the code is safe.[7]
In other cases, eliminating a vulnerability involved both changing the code and adding annotations. For example, LCLint generated a warning for a call to strcpy in the function acl_getlimit:
int acl_getlimit(char *class, char *msgpathbuf) {
int limit;
struct
aclmember *entry = NULL;
if (msgpathbuf) *msgpathbuf = '\0';
while (getaclentry("limit",
&entry)) {
…
if (!strcasecmp(class, entry->arg[0]))
{
…
if (entry->arg[3]
&& msgpathbuf != NULL)
strcpy(msgpathbuf,
entry->arg[3]);
…
If the size of msgputhbuf is less than the length of the string in entry->arg[3], there is a buffer overflow. To fix this we replaced the strcpy call with a safe call to strncpy:
strncpy(msgpathbuf, entry->arg[3], 199);
msgpathbuf[199] = '\0';
and added a requires clause to the function declaration:
/*@requires maxSet(msgpathbuf) >= 199@*/
The requires clause documents an assumption (that may be incorrect) about the size of the buffer passed to acl_getlimit. Because of the constraints denoted by the requires clauses, LCLint does not report a warning for the call to strncpy.
When call sites are checked, LCLint produces a warning if it is unable to determine that this requires clause is satisfied. Originally, we had modified the function acl_getlimit by adding the precondition maxSet (msgpathbuf) >= 1023. After adding this precondition, LCLint produced a warning for a call site that passed a 200-byte buffer to acl_getlimit. Hence, we replaced the requires clause with the stronger constraint and used 199 as the parameter to strncpy.
This vulnerability was still present in the current version of wu-ftpd. We contacted the wu-ftpd developers who acknowledged the bug but did not consider it security critical since the string in question is read from a local file not user input [Luckin01, Lundberg01].
In addition to the previously unreported buffer overflows in the initialization code, LCLint detected a known buffer overflow in wu-ftpd. The buffer overflow occurs in the function do_elem shown below, which passes a global buffer and its parameters to the library function strcat. The function mapping_chdir calls do_elem with a value entered by the remote user as its parameter. Because wu-ftpd fails to perform sufficient bounds checking, a remote user is able to exploit this vulnerability to overflow the buffer by carefully creating a series of directories and executing the cd command.[8]
char mapped_path [200];
…
void do_elem(char *dir) {
…
if
(!(mapped_path[0] == '/'
&& mapped_path[1] == '\0'))
strcat
(mapped_path, "/");
strcat (mapped_path, dir);
}
LCLint generates warnings for the unsafe calls to strcat. This was fixed in latter versions of wu-ftpd by calling strncat instead of strcat.
Because of the limitations of static checking, LCLint sometimes generates spurious error messages. If the user believes the code is correct, annotations can be added to precisely suppress spurious messages.
Often the code was too complex for LCLint to analyze correctly. For example, LCLint reports a spurious warning for this code fragment since it cannot determine that ((1.0*j*rand()) / (RAND_MAX + 1.0)) always produces a value between 1 and j:
i =
passive_port_max
– passive_port_min + 1;
port_array
= calloc (i, sizeof (int));
for (i =
3; … && (i > 0); i--) {
for
(j = passive_port_max
– passive_port_min + 1;
… && (j > 0); j--) {
k =
(int) ((1.0 * j * rand())
/ (RAND_MAX + 1.0));
pasv_port_array [j-1]
= port_array [k];
Determining that the port_array[k] reference is safe would require far deeper analysis and more precise specifications than is feasible within a lightweight static checking tool.
Detecting buffer overflows with LCLint is an iterative process. Many of the constraints we found involved functions that are potentially unsafe. We added function preconditions to satisfy these constraints where possible. In certain cases, the code was too convoluted for LCLint to determine that our preconditions satisfied the constraints. After convincing ourselves the code was correct, we added annotations to suppress the spurious warnings.
Before any annotations were added, running LCLint on wu-ftpd resulted in 243 warnings each corresponding to an unresolved constraint. We added 22 annotations to the source code through an iterative process similar to the examples described above. Nearly all of the annotations were used to indicate preconditions constraining the value of maxSet for function parameters.
After adding these annotations and modifying the code, running LCLint produced 143 warnings. Of these, 88 reported unresolved constraints involving maxSet. While we believe the remaining warnings did not indicate bugs in wu-ftpd, LCLint’s analyses were not sufficiently powerful to determine the code was safe. Although this is a higher number of spurious warnings than we would like, most of the spurious warnings can be quickly understood and suppressed by the user. The source code contains 225 calls to the potentially buffer overflowing functions strcat, strcpy, strncat, strncpy, fgets and gets. Only 18 of the unresolved warnings resulted from calls to these functions. Hence, LCLint is able to determine that 92% of these calls are safe automatically. The other warnings all dealt with classes of problems that could not be detected through simple lexical techniques.
BIND is a key component of the Internet infrastructure. Recently, the Wall Street Journal identified buffer overflow vulnerabilities in BIND as a critical threat to the Internet [WSJ01]. We focus on named, the DNS sever portion of BIND, in this case study. We analyzed BIND version 8.2.2p7[9], a version with known bugs. BIND is larger and more complex than wu-ftpd. The name server portion of BIND, named, contains approximately 47 000 lines of C including shared libraries. LCLint took less than three and a half minutes to check all of the named code.
We limited our analysis to a subset of named because of the time required for human analysis. We focused on three files: ns_req.c and two library files that contain functions which are called extensively by ns_req.c: ns_name.c and ns_sign.c. These files contain slightly more than 3 000 lines of code.
BIND makes extensive use of functions in its internal library rather than C library functions. In order to accurately analyze individual files, we needed to annotate the library header files. The most accurate way to annotate the library would be to iteratively run LCLint on the library and add annotations. However, the library was extremely large and contains deeply nested call chains. To avoid the human analysis this would require, we added annotations to some of the library functions without annotating all the dependent functions. In many cases, we were able to guess preconditions by using comments or the names of function parameters. For example, several functions took a pointer parameter (p) and another parameter encoding it size (psize), from which we inferred a precondition MaxSet(p) >= (psize – 1). After annotating selected BIND library functions, we were able to check the chosen files without needing to fully annotate all of BIND.
LCLint produces warnings for a series of unguarded buffer writes in the function req_query. The code in question is called in response to a specific type of query which requests information concerning the domain name server version. BIND appends a response to the buffer containing the query that includes a global string read from a configuration file. If the default configuration is used, the code is safe because this function is only called with buffers that are large enough to store the response. However, the restrictions on the safe use of this function are not obvious and could easily be overlooked by someone modifying the code. Additionally, it is possible that an administrator could reconfigure BIND to use a value for the server version string large enough to make the code unsafe. The BIND developers agreed that a bounds check should be inserted to eliminate this risk [Andrews01].
BIND uses extensive run time bounds checking. This type of defensive programming is important for writing secure programs, but does not guarantee that a program is secure. LCLint detected a known buffer overflow in a function that used run time checking but specified buffer sizes incorrectly.[10]
The function ns_req examines a DNS query and generates a response. As part of its message processing, it looks for a signature and signs its response with the function ns_sign. LCLint reported that it was unable to satisfy a precondition for ns_sign that requires the size of the message buffer be accurately described by a size parameter. This precondition was added when we initially annotated the shared library. A careful hand analysis of this function reveals that to due to careless modification of variables denoting buffer length, it is possible for the buffer length to be specified incorrectly if the message contains a signature but a valid key is not found. This buffer overflow vulnerability was introduced when a digital signature feature was added to BIND (ironically to increase security). Static analysis tools can be used to quickly alert programmers to assumptions that are broken by incremental code changes.
Based on our case studies, we believe that LCLint is a useful tool for improving the security of programs. It does not detect all possible buffer overflow vulnerabilities, and it can generate spurious warnings. In practice, however, it provides programmers concerned about security vulnerabilities with useful assistance, even for large, complex programs. In addition to aiding in the detection of exploitable buffer overflows, the process of adding annotations to code encourages a disciplined style of programming and produces programs that include reliable and precise documentation.
Our analysis is implemented by combining traditional compiler data flow analyses with constraint generation and resolution. Programs are analyzed at the function level; all interprocedural analyses are done using the information contained in annotations.