Statically Detecting Likely Buffer Overflow Vulnerabilities

 

David Larochelle

larochelle@cs.virginia.edu

University of Virginia, Department of Computer Science

David Evans

evans@cs.virginia.edu

University of Virginia, Department of Computer Science

 

Abstract

 

Buffer overflow attacks may be today’s single most important security threat.  This paper presents a new approach to mitigating buffer overflow vulnerabilities by detecting likely vulnerabilities through an analysis of the program source code.  Our approach exploits information provided in semantic comments and uses lightweight and efficient static analyses.   This paper describes an implementation of our approach that extends the LCLint annotation-assisted static checking tool.  Our tool is as fast as a compiler and nearly as easy to use.  We present experience using our approach to detect buffer overflow vulnerabilities in two security-sensitive programs.

 

 


1.   Introduction

 

Buffer overflow attacks are an important and persistent security problem.  Buffer overflows account for approximately half of all security vulnerabilities [CWPBW00, WFBA00].  Richard Pethia of CERT identified buffer overflow attacks as the single most im­por­tant security problem at a recent software engineering conference [Pethia00]; Brian Snow of the NSA predicted that buffer overflow attacks would still be a problem in twenty years [Snow99].   

 

Programs written in C are particularly susceptible to buffer overflow attacks.  Space and performance were more important design considerations for C than safety.  Hence, C allows direct pointer manipulations without any bounds checking.  The standard C library includes many functions that are unsafe if they are not used carefully.  Nevertheless, many security-critical pro­grams are written in C.

 

Several run-time approaches to mitigating the risks associated with buffer overflows have been proposed.  Despite their availability, these techniques are not used widely enough to substantially mitigate the effectiveness of buffer overflow attacks.  The next section describes representative run-time approaches and speculates on why they are not more widely used.  We propose, instead, to tackle the problem by detecting likely buffer overflow vulnerabilities through a static analysis of program source code.  We have im­ple­ment­ed a prototype tool that does this by extending LCLint [Evans96].  Our work differs from other work on static detection of buffer overflows in three key ways: (1) we exploit semantic comments added to source code to enable local checking of interprocedural properties; (2) we focus on lightweight static checking techniques that have good performance and scalability characteristics, but sacrifice soundness and completeness; and (3) we introduce loop heuristics, a simple approach for efficiently analyzing many loops found in typical programs. 

 

The next section of this paper provides some background on buffer overflow attacks and previous attempts to mitigate the problem.  Section 3 gives an overview of our approach.  In Section 4, we report on our experience using our tool on wu-ftpd and BIND, two security-sensitive programs.  The following two sec­tions provide some details on how our analysis is done.  Section 7 compares our work to related work on buffer overflow detection and static analysis. 

2.      Buffer Overflow Attacks and Defenses

The simplest buffer overflow attack, stack smashing [AlephOne96], overwrites a buffer on the stack to replace the return address.  When the function returns, instead of jumping to the return address, control will jump to the address that was placed on the stack by the attacker.  This gives the attacker the ability to execute arbitrary code.  Programs written in C are particularly susceptible to this type of attack.  C provides direct low-level memory access and pointer arithmetic without bounds checking.   Worse, the standard C library provides unsafe functions (such as gets) that write an unbounded amount of user input into a fixed size buffer without any bounds checking [ISO99].  Buffers stored on the stack are often passed to these functions.  To exploit such vulnerabilities, an attacker merely has to enter an input larger than the size of the buffer and encode an attack program binary in that input.  The Internet Worm of 1988 [Spafford88, RE89] exploited this type of buffer overflow vulnerability in fingerd.  More so­phis­ti­ca­ted buffer overflow attacks may exploit unsafe buffer usage on the heap.  This is harder, since most programs do not jump to addresses loaded from the heap or to code that is stored in the heap.

 

Several run-time solutions to buffer overflow attacks have been proposed.  StackGuard [CPMH+98] is a com­pi­ler that generates binaries that incorporate code designed to prevent stack smashing attacks.  It places a special value on the stack next to the return address, and checks that it has not been tampered with before jumping.  Baratloo, Singh and Tsai describe two run-time approaches: one replaces unsafe library func­tions with safe implementations; the other modifies executables to perform sanity checking of return ad­dress­es on the stack before they are used [BST00].

 

Software fault isolation (SFI) is a technique that inserts bit mask instructions before memory operations to prevent access of out-of-range memory [WLAG93].  This alone does not offer much protection against typical buffer overflow attacks since it would not prevent a program from writing to the stack address where the return value is stored.  Generalizations of SFI can insert more expressive checking around potentially dangerous operations to restrict the behavior of programs more generally.  Examples include Janus, which observes and mediates behavior by monitoring system calls [GWTB96]; Naccio [ET99, Evans00a] and PSLang/PoET [ES99­, ES00] which transform object programs accord­ing to a safety policy; and Generic Software Wrappers [FBF99] which wraps system calls with security checking code. 

 

Buffer overflow attacks can be made more difficult by modifications to the operating system that put code and data in separate memory segments, where the code segment is read-only and instructions cannot be executed from the data segment.  This does not eliminate the buffer overflow problem, however, since an attacker can still overwrite an address stored on the stack to make the program jump to any point in the code segment.  For programs that use shared libraries, it is often possible for an attacker to jump to an address in the code segment that can be used maliciously (e.g., a call to system).  Developers decided against using this approach in the Linux kernel since it did not solve the real problem and it would prevent legitimate uses of self-modifying code [Torvalds98, Coolbaugh99].

 

Despite the availability of these and other run-time approaches, buffer overflow attacks remain a persistent problem.  Much of this may be due to lack of awareness of the severity of the problem and the availability of practical solutions.  Nevertheless, there are legitimate reasons why the run-time solutions are unacceptable in some environments.  Run-time solutions always incur some performance penalty (StackGuard reports performance overhead of up to 40% [CBDP+99]).  The other problem with run-time solutions is that while they may be able to detect or prevent a buffer overflow attack, they effectively turn it into a denial-of-service attack.  Upon detecting a buffer overflow, there is often no way to recover other than terminating execution. 

 

Static checking overcomes these problems by detecting likely vulnerabilities before deployment.  Detecting buffer overflow vulnerabilities by analyzing code in general is an undecidable problem.[1]  Nevertheless, it is possible to produce useful results using static analysis.  Rather than attempting to verify that a program has no buffer overflow vulnerabilities, we wish to have reasonable confidence of detecting a high fraction of likely buffer overflow vulnerabilities.  We are willing to accept a solution that is both unsound and incomplete.  This means that our checker will sometimes generate false warnings and sometimes miss real problems.  Our goal is to produce a tool that produces useful results for real programs with a reasonable effort.  The next section describes our approach.  We compare our work with other static approaches to detecting buffer overflow vulnerabilities in Section 7.

3.      Approach

Our static analysis tool is built upon LCLint [EGHT94, Evans96, Evans00b], an annotation-assisted lightweight static checking tool.  Examples of problems detected by  LCLint include violations of information hiding, inconsistent modifications of caller-visible state or uses of global variables, misuses of possibly NULL references, uses of dead storage, memory leaks and problems with parameters aliasing.  LCLint is actually used by working programmers, especially in the open source development community [Orcero00, PG00].

 

Our approach is to exploit semantic comments (henceforth called annotations) that are added to source code and standard libraries.  Annotations describe programmer assumptions and intents. They are treated as regular C comments by the compiler, but recognized as syntactic entities by LCLint using the @ following the /* to identify a semantic comment.  For example, the annotation /*@notnull@*/ can be used syntactically like a type qualifier.  In a parameter declaration, it indicates that the value passed for this parameter may not be NULL.  Although annotations can be used on any declaration, for this discussion we will focus exclusively on function and parameter declarations.  We can also use annotations similarly in declarations of global and local variables, types and type fields.

 

Annotations constrain the possible values a reference can contain either before or after a function call.  For example, the /*@notnull@*/ annotation places a constraint on the parameter value before the function body is entered.  When LCLint checks the function body, it assumes the initial value of the parameter is not NULL.  When LCLint checks a call site, it reports a warning unless it can determine that the value passed as the corresponding parameter is never NULL.

 

Prior to this work, all annotations supported by LCLint classified references as being in one of a small number of possible states.  For example, the annotation /*@null@*/ indicated that a reference may be NULL, and the annotation /*@notnull@*/ indicated that a reference is not NULL.  In order to do useful checking of buffer overflow vulnerabilities, we need annotations that are more expressive.  We are concerned with how much memory has been allocated for a buffer, something that cannot be adequately modeled using a finite number of states.  Hence, we need to extend LCLint to support a more general annotation language.  The annotations are more expressive, but still within the spirit of simple semantic comments added to programs.

 

The new annotations allow programmers to explicitly state function preconditions and postconditions using requires and ensures clauses.[2]  We can use these clauses to describe assumptions about buffers that are passed to functions and constrain the state of buffers when functions return.  For the analyses described in this paper, four kinds of assumptions and constraints are used: minSet, maxSet, minRead and maxRead.[3]

 

When used in a requires clause, the minSet and maxSet annotations describe assumptions about the lowest and highest indices of a buffer that may be safely used as an lvalue (e.g., on the left-hand side of an assignment).  For example, consider a function with an array parameter a and an integer parameter i that has a pre­condition requires maxSet(a) >= i.  The analysis assumes that at the beginning of the function body, a[i] may be used as an lvalue.  If a[i+1] were used before any modifications to the value of a or i, LCLint would generate a warning since the function preconditions are not sufficient to guarantee that a[i+1] can be used safely as an lvalue.  Arrays in C start with index 0, so the declaration

         char buf[MAXSIZE]

generates the constraints

         maxSet(buf) = MAXSIZE – 1 and minSet(buf) = 0.

Similarly, the minRead and maxRead constraints indicate the minimum and maximum indices of a buffer that may be read safely.  The value of maxRead for a given buffer is always less than or equal to the value of maxSet.  In cases where there are elements of the buffer have not yet been initialized, the value of maxRead may be lower than the value of maxSet.

 

At a call site, LCLint checks that the preconditions implied by the requires clause of the called function are satisfied before the call.  Hence, for the requires maxSet(a) >= i example, it would issue a warning if it cannot determine that the array passed as a is allocated to hold at least as many elements as the value passed as i.  If minSet or maxSet is used in an ensures clause, it indicates the state of a buffer after the function returns.  Checking at the call site proceeds by assuming the postconditions are true after the call returns. 

 

For checking, we use an annotated version of the standard library headers.  For example, the function strcpy is annotated as[4]:

char *strcpy (char *s1, const char *s2)    

 /*@requires maxSet(s1) >= maxRead(s2)@*/

 /*@ensures maxRead(s1) == maxRead(s2) /\ result == s1@*/;

The requires clause specifies the precondition that the buffer s1 is allocated to hold at least as many char­acters as are readable in the buffer s2 (that is, the number of characters up to and including its null terminator).  The postcondition reflects the behavior of strcpy – it copies the string pointed to by s2 into the buffer s1, and returns that buffer.  In ensures clauses, we use the result keyword to denote the value returned by the function.

 

Many buffer overflows result from using library functions such as strcpy in unsafe ways.  By annotating the standard library, many buffer overflow vulnerabilities can be detected even before adding any annotations to the target program.  Selected annotated standard library functions are shown in Appendix A.  

4.      Experience

In order to test our approach, we used our tool on wu-ftpd, a popular open source ftp server, and BIND (Berkeley Internet Name Domain), a set of domain name tools and libraries that is considered the reference implementation of DNS.  This section describes the process of running LCLint on these applications, and illustrates how our checking detected both known and unknown buffer overflow vulnerabilities in each  appli­cation.

4.1 wu-ftpd

We analyzed wu-ftp-2.5.0[5], a version with known se­cur­ity vulnerabilities. 

Running LCLint is similar to running a compiler.  It is typically run from the command line by listing the source code files to check, along with flags that set checking parameters and control which classes of warnings are reported.  It takes just over a minute for LCLint to analyze all 17 000 lines of wu-ftpd.  Running LCLint on the entire unmodified source code for wu-ftpd without adding any annotations resulted in 243 warnings related to buffer overflow checking.

Consider a representative message[6]:

ftpd.c:1112:2: Possible out-of-bounds store.  Unable to

   resolve constraint:

      maxRead ((entry->arg[0] @ ftpd.c:1112:23)) <= (1023)

   needed to satisfy precondition:

      requires maxSet ((ls_short @ ftpd.c:1112:14))

                  >= maxRead ((entry->arg[0] @ ftpd.c:1112:23))

   derived from strcpy precondition:

      requires maxSet (<param 1>) >= maxRead (<param 2>)

Relevant code fragments are shown below with line 1112 in bold: 

char  ls_short[1024];

extern struct aclmember * getaclentry(char *keyword, struct aclmember **next);

int main(int argc, char **argv, char **envp)

{

  

   entry = (struct aclmember *) NULL;

   if (getaclentry("ls_short", &entry)

       && entry->arg[0]

       && (int)strlen(entry->arg[0]) > 0)

      {

        strcpy(ls_short,entry->arg[0]);

       

This code is part of the initialization code that reads configuration files.  Several buffer overflow vul­ner­a­bil­i­­ties were found in the wu-ftpd initialization code.  Although this vulnerability is not likely to be exploited, it can cause security holes if an untrustworthy user is able to alter configuration files.

 

The warning message indicates that a possible out-of-bounds store was detected on line 1112 and contains information about the constraint LCLint was unable to resolve.  The warning results from the function call to strcpy.  LCLint generates a pre­con­dit­ion constraint corresponding to the strcpy requires clause maxSet(s1) >= maxRead(s2) by substituting the actual parameters:

   maxSet (ls_short @ ftpd.c:1112:14) >= maxRead (entry->arg[0] @ ftpd.c:1112:23). 

Note that the locations of the expressions passed as actual parameters are recorded in the constraint.  Since values of expressions may change through the code, it is important that constraints identify values at particular program points. 

The global variable ls_short was declared as an array of 1024 characters.  Hence, LCLint determines maxSet (ls_short) is 1023.  After the call to getaclentry, the local entry->arg[0] points to a string of arbitrary length read from the configuration file.  Because there are no annotations on the getaclentry function, LCLint does not assume anything about its behavior.  In particular, the value of maxRead (entry->arg[0]) is unknown. LCLint reports a possible buffer misuse, since the constraint derived from the strcpy requires clause may not be satisfied if the value of maxRead (entry->arg[0]) is greater than 1023.

 

To fix this problem, we modified the code to handle these values safely by using strncpy.  Since ls_short is a fixed size buffer, a simple change to use strncpy and store a null character at the end of the buffer is sufficient to ensure that the code is safe.[7] 

 

In other cases, eliminating a vulnerability involved both changing the code and adding annotations.  For example, LCLint generated a warning for a call to strcpy in the function acl_getlimit:

int acl_getlimit(char *class, char *msgpathbuf) {

 int limit;

 struct aclmember *entry = NULL;

 

 if (msgpathbuf) *msgpathbuf = '\0';

 while (getaclentry("limit", &entry)) {

  

   if (!strcasecmp(class, entry->arg[0]))

   {

    

     if (entry->arg[3]

         && msgpathbuf != NULL)

       strcpy(msgpathbuf, entry->arg[3]);

    

If the size of msgputhbuf is less than the length of the string in entry->arg[3], there is a buffer overflow.  To fix this we replaced the strcpy call with a safe call to strncpy:

 strncpy(msgpathbuf, entry->arg[3], 199);

 msgpathbuf[199] = '\0';        

and added a requires clause to the function declaration:

 /*@requires maxSet(msgpathbuf) >= 199@*/

The requires clause documents an assumption (that may be incorrect) about the size of the buffer passed to acl_getlimit.  Because of the constraints denoted by the requires clauses, LCLint does not report a warning for the call to strncpy.

 

When call sites are checked, LCLint produces a warn­ing if it is unable to determine that this requires clause is satisfied.  Originally, we had modified the function acl_getlimit by adding the precondition maxSet (msgpathbuf) >= 1023.  After adding this precondition, LCLint produced a warning for a call site that passed a 200-byte buffer to acl_getlimit.  Hence, we re­placed the requires clause with the stronger constraint and used 199 as the parameter to strncpy.

 

This vulnerability was still present in the current ver­sion of wu-ftpd.  We contacted the wu-ftpd developers who acknowledged the bug but did not consider it security critical since the string in question is read from a local file not user input [Luckin01, Lundberg01].

 

In addition to the previously unreported buffer overflows in the initialization code, LCLint detected a known buffer overflow in wu-ftpd.  The buffer overflow occurs in the function do_elem shown below, which passes a global buffer and its parameters to the library function strcat.  The function mapping_chdir calls do_elem with a value entered by the remote user as its parameter.  Because wu-ftpd fails to perform sufficient bounds checking, a remote user is able to exploit this vulnerability to overflow the buffer by carefully creating a series of directories and executing the cd command.[8]

 

char mapped_path [200];

void do_elem(char *dir) {

  

   if (!(mapped_path[0] == '/'

       && mapped_path[1] == '\0'))

     strcat (mapped_path, "/");

   strcat (mapped_path, dir);

}

 

LCLint generates warnings for the unsafe calls to strcat.  This was fixed in latter versions of wu-ftpd by calling strncat instead of strcat.

Because of the limitations of static checking, LCLint some­­times generates spurious error messages.  If the user believes the code is correct, annotations can be added to precisely suppress spurious messages. 

Often the code was too complex for LCLint to analyze correctly.  For example, LCLint reports a spurious warning for this code fragment since it cannot determine that ((1.0*j*rand()) / (RAND_MAX + 1.0)) always produces a value between 1 and j:

  i = passive_port_max

      – passive_port_min + 1;

  port_array = calloc (i, sizeof (int));

  for (i = 3; … && (i > 0); i--) {

    for (j = passive_port_max

             – passive_port_min + 1;

         … && (j > 0); j--) {

       k = (int) ((1.0 * j * rand())

                  / (RAND_MAX + 1.0));

       pasv_port_array [j-1]

          = port_array [k];

Determining that the port_array[k] reference is safe would require far deeper analysis and more precise specifications than is feasible within a lightweight static checking tool.

 

Detecting buffer overflows with LCLint is an iterative process.   Many of the constraints we found involved functions that are potentially unsafe.  We added function preconditions to satisfy these constraints where possible.  In certain cases, the code was too convoluted for LCLint to determine that our preconditions satisfied the constraints.  After convincing ourselves the code was correct, we added annotations to suppress the spurious warnings.

 

Before any annotations were added, running LCLint on wu-ftpd re­sulted in 243 warn­ings each corresponding to an unresolved constraint.   We added 22 annotations to the source code through an iterative process similar to the examples described above.  Nearly all of the annotations were used to indicate preconditions constraining the value of maxSet for function parameters. 

 

After adding these annotations and modifying the code, running LCLint produced 143 warnings.  Of these, 88 reported unresolved constraints involving maxSet.  While we believe the remaining warnings did not indicate bugs in wu-ftpd, LCLint’s analyses were not sufficiently powerful to determine the code was safe.  Although this is a higher number of spurious warnings than we would like, most of the spurious warnings can be quickly understood and suppressed by the user.  The source code contains 225 calls to the potentially buffer overflowing functions strcat, strcpy, strncat, strncpy, fgets and gets.  Only 18 of the unresolved warnings resulted from calls to these functions.  Hence, LCLint is able to determine that 92% of these calls are safe automatically.  The other warnings all dealt with classes of problems that could not be detected through simple lexical techniques.

4.2 BIND

BIND is a key component of the Internet infrastructure.  Recently, the Wall Street Journal iden­ti­fied buffer overflow vulnerabilities in BIND as a critical threat to the Internet [WSJ01].  We focus on named, the DNS sever portion of BIND, in this case study.  We analyzed BIND version 8.2.2p7[9], a version with known bugs.  BIND is larger and more complex than wu-ftpd. The name server portion of BIND, named, contains approximately 47 000 lines of C including shared li­bra­ries.  LCLint took less than three and a half minutes to check all of the named code. 

 

We limited our analysis to a subset of named because of the time required for human analysis. We focused on three files: ns_req.c and two library files that contain functions which are called extensively by ns_req.c:  ns_name.c and ns_sign.c.  These files contain slightly more than 3 000 lines of code.

 

BIND makes extensive use of functions in its internal library rather than C library functions. In order to accurately analyze individual files, we needed to annotate the library header files.  The most accurate way to annotate the library would be to iteratively run LCLint on the library and add annotations.  However, the library was extremely large and contains deeply nested call chains.  To avoid the human analysis this would require, we added annotations to some of the library functions without annotating all the dependent functions.  In many cases, we were able to guess preconditions by using comments or the names of function parameters.  For example, several functions took a pointer parameter (p) and another parameter encoding it size (psize), from which we inferred a precondition MaxSet(p) >= (psize – 1).  After annotating selected BIND library functions, we were able to check the chosen files without needing to fully annotate all of BIND.

 

LCLint produces warnings for a series of unguarded buffer writes in the function req_query.  The code in question is called in response to a specific type of query which requests information concerning the domain name server version. BIND appends a response to the buffer containing the query that includes a global string read from a configuration file.  If the default configuration is used, the code is safe because this function is only called with buffers that are large enough to store the response.  However, the restrictions on the safe use of this function are not obvious and could easily be overlooked by someone modifying the code.  Additionally, it is possible that an administrator could reconfigure BIND to use a value for the server version string large enough to make the code unsafe.  The BIND developers agreed that a bounds check should be inserted to eliminate this risk [Andrews01].

 

BIND uses extensive run time bounds checking. This type of defensive programming is important for writing secure programs, but does not guarantee that a program is secure.  LCLint detected a known buffer overflow in a function that used run time checking but specified buffer sizes incorrectly.[10]

 

The function ns_req examines a DNS query and gen­er­ates a response.  As part of its message processing, it looks for a signature and signs its response with the function ns_sign.  LCLint reported that it was unable to satisfy a precondition for ns_sign that requires the size of the message buffer be accurately described by a size parameter.  This precondition was added when we initially annotated the shared library.  A careful hand analysis of this function reveals that to due to careless modification of variables denoting buffer length, it is possible for the buffer length to be specified incorrectly if the message contains a signature but a valid key is not found.  This buffer overflow vulnerability was introduced when a digital signature feature was added to BIND (ironically to increase security).  Static analysis tools can be used to quickly alert programmers to assumptions that are broken by incremental code changes.

                               

Based on our case studies, we believe that LCLint is a useful tool for improving the security of programs.  It does not detect all possible buffer overflow vulnerabilities, and it can generate spurious warnings.  In practice, however, it provides programmers concerned about security vulnerabilities with useful assistance, even for large, complex programs.  In addition to aiding in the detection of exploitable buffer overflows, the process of adding annotations to code encourages a disciplined style of programming and produces programs that include reliable and precise documentation.

5.      Implementation

Our analysis is implemented by combining traditional compiler data flow analyses with constraint generation and resolution.  Programs are analyzed at the function level; all interprocedural analyses are done using the information contained in annotations.