Body text last updated 1998-07-22. Recently this has become the most popular page of mine, presumably because a bunch of lamers want to learn how to break into things. That isn't really the focus of this document; I wrote it as a primer for people participating in the Linux Security Audit project, which is intended to find security holes so they can be fixed before people use them to break into things.
I wouldn't be surprised if calling 100-200 people a day `lamers' results in electronic attacks on me or my machine (kragen.dnaco.net). All I can say is that people who do this would thereby demonstrate their lamosity.
These paragraphs added 1999-02-26.
If a program has a bug in it that manifests under extreme circumstances, then normally, it's a minor annoyance. Usually, you can just avoid the extreme circumstances, and the bug isn't a problem. You could duplicate the effect of tickling the bug by writing your own program, if you wanted to.
But sometimes programs sit on security boundaries. They take input from other programs that don't have the same access that they do.
Some examples: your mailreader takes input from anyone you get mail from, and it has access to your display, which they probably don't. The TCP/IP stack of any computer connected to the Internet takes input from anyone on the Internet, and usually has access to everything on the computer, which most people on the Internet certainly don't.
Any program that does such things has to be careful. If it has any bugs in it, it could potentially end up allowing other people -- untrusted people -- to do things they're not allowed to do. A bug that has this property is called a "hole", or more formally, a "vulnerability".
Here are some common categories of holes.
Cryptologists and real-time programmers are used to writing software that has to behave correctly even when its environment behaves as badly as possible. Most other programmers aren't, and habits of mind from their normal-software work tend to make their software insecure.
For example, suppose you have a PostScript interpreter that was originally intended to let you preview your documents before printing them. This is not a security-sensitive role; the PostScript interpreter doesn't have any capabilities that you don't. But suppose you start using it to view documents from other people, people you don't know, even untrustworthy people. Suddenly, the presence of PostScript's file access operators becomes a threat! Someone can send you a document which will delete all your files -- or possibly stash copies of your files someplace they can get at them.
This is the source of the vulnerabilities in most Unixes' TCP/IP stacks -- they were developed on a network where essentially everyone on the network was trustworthy, and now they're deployed on a network where there are many people who aren't.
This is also the problem with Sendmail. Until it went through an audit, it was a constant source of holes.
At a more subtle level, functions that are perfectly safe when they don't cross trust boundaries can be a disaster when they do. gets() is a perfect example. If you use gets() in a situation where you control the input, you just provide a buffer bigger than anything you expect to input, and you're fine. If you accidentally crash the program by giving it too much input, the fix is "don't do that" -- or maybe expand the buffer and recompile.
But when the data is coming from an untrusted source, gets() can overflow the buffer and cause the program to do literally anything. Crashing is the most common result, but carefully crafted data can often cause the program to execute that data as code.
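To make this concrete, here is a minimal sketch (not taken from any real program; the function names are made up) of the same line-reading job done with gets() and with a bounded alternative:

    #include <stdio.h>
    #include <string.h>

    void read_name_unsafely(void)
    {
        char name[64];
        /* gets() has no idea how big 'name' is; a line longer than 63
           characters overwrites whatever lies beyond the buffer. */
        gets(name);
        printf("hello, %s\n", name);
    }

    void read_name_safely(void)
    {
        char name[64];
        /* fgets() stops after sizeof(name) - 1 characters, so an
           over-long line is truncated instead of overflowing. */
        if (fgets(name, sizeof(name), stdin) != NULL) {
            name[strcspn(name, "\n")] = '\0';  /* strip the newline, if any */
            printf("hello, %s\n", name);
        }
    }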
Which brings us to . . .
Security-problem buffer-overflows can arise in several situations:
Remember, it's not a security hole if the input is already trusted -- it's just a potential annoyance.
This is particularly nasty in most Unix environments; if the array is a local variable in some function, it's likely that the return address is somewhere after it on the stack. This seems to be the fashionable hole to exploit; thousands and thousands of holes of this nature have been found in the last couple of years.
Even buffers in other places can sometimes be overflowed to produce security holes -- particularly if they're near function pointers or credential information.
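Here is a sketch of the pattern to watch for (hypothetical code, not taken from any real server): a fixed-size local array filled by an unbounded copy of untrusted data.

    #include <string.h>

    /* 'input' comes across a trust boundary, e.g. from the network.
       On most Unix stacks the function's return address lies a short
       distance beyond 'buf', so a long enough 'input' overwrites it
       and can redirect execution to attacker-chosen code. */
    void handle_request(const char *input)
    {
        char buf[128];
        strcpy(buf, input);   /* no length check at all */
        /* ... use buf ... */
    }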
Things to look for:
A blanket solution is to compile all security-sensitive programs with bounds-checking enabled.
The first work I know of on bounds-checking for gcc was done by Richard W. M. Jones and Paul Kelly, and is at http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html.
Greg McGary (gkm@eng.ascend.com) did some other work; announcement: http://www.cygnus.com/ml/egcs/1998-May/0073.html. Richard Jones and Herman ten Brugge did further work; announcement: http://www.cygnus.com/ml/egcs/1998-May/0557.html. Greg compares the different approaches in http://www.cygnus.com/ml/egcs/1998-May/0559.html.
But if you give a filename to a security-sensitive program -- a CGI script, a setuid program, a setgid program, any network server -- it can't necessarily rely on the OS's built-in automatic protections. That's because it can do some things you can't. In the case of a web server, what it can do that you can't may be pretty minimal, but it's likely that it can at least read some files with private info.
Most such programs do some kind of checking on the data they receive. They often fall into one of several pitfalls:
This is a double-parsing problem, which we'll get into later, and also stems from fail-openness.
At any rate, programs that have privileges you don't have usually fail to limit what they do on your behalf to just what they're supposed to do. setfsuid(), setreuid(), and similar calls can help.
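As a rough illustration only -- the function below is hypothetical, and a real program needs more care with error handling and saved set-user-IDs -- a setuid program can temporarily switch its effective UID back to the real user before opening a user-supplied filename, so that the kernel applies the user's own permissions:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Open 'path' with the invoking user's privileges, not ours. */
    int open_as_real_user(const char *path, FILE **out)
    {
        uid_t real = getuid();     /* who actually ran us */
        uid_t priv = geteuid();    /* who we are setuid to */

        if (seteuid(real) == -1)   /* drop privileges before touching the file */
            return -1;

        *out = fopen(path, "r");

        if (seteuid(priv) == -1) { /* restore privileges; give up if we can't */
            perror("seteuid");
            exit(1);
        }
        return (*out != NULL) ? 0 : -1;
    }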
Another problem is that frequently, standard libraries look in environment variables for files to open, and aren't smart enough to drop privileges while doing this. (Really, they can't be.) So we're forced to resort to parsing the filename to see if it looks reasonable.
Some OSes dump core with the wrong privileges, too, and if you can make a setuid program crash, you can overwrite a file that the program's owner would be able to overwrite. (Dumping core with the user's privileges often results in the user being able to read data from the core file that they wouldn't be able to read normally.)
A fail-open system grants access when it breaks; a fail-closed system denies access when it breaks. As an example, an electronic door lock that holds the door closed with a massive electromagnet is fail-open when the power goes out: with no power to the electromagnet, the door opens easily. An electronic door lock that locks the door with a spring-loaded deadbolt, pulled out of the way by a solenoid, is fail-closed: with no power to the solenoid, it's impossible to pull back the deadbolt.
CGI scripts commonly execute other programs, passing them user data on their command lines. In order to avoid having this data interpreted by the shell (on a Unix system) as instructions to execute other programs, access other files, etc., the CGI script removes unusual characters -- things like '<', '|', ' ', '"', etc. You can do this in a fail-open way by having a list of "bad characters" that get removed. Then, if you forgot one, it's a security hole. You can do it in a fail-closed way by having a list of "good characters" that don't get removed. Then, if you forgot one, it's an inconvenience. An example of this (in Perl) is at http://www.geek-girl.com/bugtraq/1997_3/0013.html.
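The posting above is in Perl; here is the same fail-closed idea sketched in C, with a made-up function name. Everything not on the explicit list of good characters is dropped, so a forgotten character inconveniences the user instead of opening a hole:

    #include <string.h>

    void keep_only_safe_chars(char *s)
    {
        static const char good[] =
            "abcdefghijklmnopqrstuvwxyz"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "0123456789"
            "._-@";              /* anything not listed here is removed */
        char *dst = s;

        for (; *s != '\0'; s++)
            if (strchr(good, *s) != NULL)
                *dst++ = *s;
        *dst = '\0';
    }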
Fail-closed systems are a lot less convenient than fail-open ones, if they fail frequently. They're also a lot more likely to be secure.
Essentially every program I've seen to secure a Mac or Microsoft OS desktop computer has been fail-open -- if you can somehow disable the program, you have full access to the computer. By contrast, if you disable the Unix 'login' program, you have no access to the computer.
So look to see
Thus, network servers tend to be much more secure than setuid programs. Setuid programs get all sorts of things from untrustworthy sources -- environment variables, file descriptors, virtual memory mappings, command-line arguments, and probably file input, too. Network servers just get network-socket input (and possibly file input).
qmail is an example of a small security interface. Only a small part of qmail (though much more than ten lines, contrary to what I previously said on the linux-security-audit mailing list) runs as "root". The rest runs either as special qmail users, or as the mail recipient.
Internally to qmail, the buffer-overflow checking is centralized in two small functions, and all of the functions used to modify strings use these functions to check. This is another example of a small security interface -- the chance that some part of the checking is wrong is much smaller.
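This is not qmail's actual code, but the idea looks roughly like the following sketch: one growable-string type, one small function that decides whether there is room, and every string-modifying routine built on top of it, so there is exactly one bounds check to get right.

    #include <stdlib.h>
    #include <string.h>

    struct gstr { char *s; size_t len; size_t alloc; };

    /* The single place where "is there room?" is decided. */
    static int gstr_ready(struct gstr *g, size_t need)
    {
        char *p;
        if (need <= g->alloc)
            return 1;
        p = realloc(g->s, need * 2);
        if (p == NULL)
            return 0;
        g->s = p;
        g->alloc = need * 2;
        return 1;
    }

    /* All appends funnel through gstr_ready(). */
    int gstr_append(struct gstr *g, const char *data, size_t n)
    {
        if (!gstr_ready(g, g->len + n))
            return 0;
        memcpy(g->s + g->len, data, n);
        g->len += n;
        return 1;
    }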
The more network daemons you run, the bigger the security interface between the Internet and your machine.
If you have a firewall, the security interface between your network and the Internet is reduced to one machine.
The difference between viewing an untrusted HTML page and viewing an untrusted JavaScript page is also one of interface size; the routines in the JavaScript interpreter are large and complex compared to the routines in the HTML renderer.
If you're auditing, auditing such programs extra thoroughly is an excellent idea, but sometimes it's better just to rewrite them, or not to use them in the first place.
The trust relationships must be enforced at every interface between security compartments. If you're running a library terminal, you probably want the terminal to have access only to the library database (and read-only access, at that); you want to deny its users access to the Unix shell altogether. The point is that each interface should grant exactly the access the other side is supposed to have, and nothing more.
Mirabilis ICQ trusts the whole Internet to send it correct user identifications. Obviously, this is not secure.
At one point, tcp_wrappers trusted data it got from reverse DNS lookups, handing it to a shell. (It no longer does.)
Netscape Communicator would sometimes insert a user-entered FTP password into the URL in the history list when using Squid as a proxy. JavaScript programs and other web servers can see this URL.
Look at the else branches of if statements. Look at the default: cases in switch statements. Make sure they're fail-closed.
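For instance, in a made-up access check (not from any real program), the default: case denies, so a request type nobody anticipated is an inconvenience rather than a hole:

    enum request { REQ_READ, REQ_WRITE, REQ_ADMIN };

    int request_allowed(enum request r, int is_owner)
    {
        switch (r) {
        case REQ_READ:
            return 1;            /* anyone may read */
        case REQ_WRITE:
            return is_owner;     /* only the owner may write */
        default:
            return 0;            /* fail closed: anything unanticipated is denied */
        }
    }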
gcc -pg -a causes the program to produce a bb.out file that may be helpful in determining how effective your tests are at covering all branches of the code.
I believe this has been the source of many of the recent IP denial-of-service problems.
This should not be a major issue for the Linux security audit.
This document was written without any practical experience on my part; thus the relative importance I give to different things may be silly, and I may have left out something important altogether. Also, parts of it are poorly thought out.
Nevertheless, I think it may be a useful primer for people who are participating in the Linux security audit without much previous experience in security auditing.
David A. Wheeler has developed a document for programmers titled ``Secure Programming for Linux HOWTO,'' which is now included in the Linux Documentation Project. You can get a copy at http://www.dwheeler.com/secure-programs.
SunWorld Online has an article on Designing Secure Software. While Sun doesn't have the world's best reputation for security, this article is worthwhile.
BUGTRAQ announces new Unix security holes on a daily basis, with full details. geek-girl.com keeps some archives that go back to 1993. This is a very useful resource for learning about new security holes or looking up particular old ones, but a terrible resource for getting a comprehensive list of security holes.
Adam Shostack has posted some good code-review guidelines (apparently used by some company to review code to run on their firewall) at http://www.homeport.org/~adam/review.html.
COPS comes with a setuid(7) man page, which is HTMLized at http://www.homeport.org/~adam/setuid.7.html and includes guidelines for finding and preventing insecurities in setuid programs.
John Cochran of EDS pointed me to the AUSCERT programming checklist: ftp://ftp.auscert.org.au/pub/auscert/papers/secure_programming_checklist