In the previous segment, we looked at the basics of downloading, installing and configuring a basic Apache Web server for your own use. By editing the httpd.conf file, you created a management profile that helps maintain a stable and speedy Web server. In this part of the series we're going to look at the access.conf file as it relates to server security, and we'll briefly cover setting up content, or more specifically, setting up locations in which to store your site content.

Access.conf

The access.conf file, as previously mentioned, is your most basic security configuration file for the Apache server. By judiciously applying certain rules, you can lock out unfriendly or unwanted users from your site, and keep your less skilled users from creating security breaches with the CGI scripts they might be installing. It's important to note that, at least in recent versions of Apache for NT, all the information in access.conf has been appended to httpd.conf, effectively removing the need to keep two separate files. So if you don't find a copy of access.conf after installing Apache, take a look in httpd.conf for the entries discussed in this article.

Basic Security

Probably one of the first sections you notice is one that should look like this:

    # First, we configure the "default" to be a very restrictive set of
    <Directory />
    # Note that from this point forward you must specifically allow

In essence, these settings have locked the server down tight, to the point of pretty much shielding it from any and all access. As the comments state, you have to manually activate specific features. While this may seem a bit extreme, it's actually useful if you think you might have an inexperienced sysadmin installing the Web server before you get to go in and reconfigure it. Imagine the damage that could be done if, instead, Apache fired up from the get-go with all permissions and services enabled and a new or inexperienced sysadmin simply left it that way.
As you roll through this section, you will see specific functions referenced, such as "FollowSymLinks" and others. If you are not familiar with these terms, it's best to look in the manual pages (online documentation in UNIX/Linux) or the help files that come with Apache for NT. Do not modify these settings unless you are very sure of what you're doing.

Allow/Deny Configuration

Probably my favorite feature within access.conf is the Allow/Deny settings. By simply filling in the domains that you don't want accessing your Web server, you effectively lock out any and all users from that domain. Why is this useful? Here's an example.

In a company I installed Apache for, we had built a pretty good sized (5+ GB at the time) image directory that anyone could access through FTP or HTTP. We had told everyone we preferred they access it through FTP, especially if they were going to use scripts to download massive amounts of images. Some people just didn't listen, however, and since they found that scripting for Web access was easier, they did just that, and pounded the Web server with HTTP requests on an hourly basis. At one point, I found them pulling 2 GB of images in a session. And since they were doing it through HTTP, they didn't set their script to resume from where it last left off. Instead, whenever I'd blow them off the machine, they'd come right back in, re-requesting files they already had.

As you can guess, bandwidth throttled down, way down, when they hit the server. So I figured it was time to force them into either controlling themselves or using FTP. Once I edited the Allow/Deny settings and put their domain in the denied list, they found their script failed constantly. Then one day they called me and asked nicely how they could play fair with our Web server. From then on, they've kept their script well-tamed.

After that long example, I'm sure you just want to know how to edit the settings. It's really rather simple.
The basic entries look like this:

    Order allow,deny
    Allow from all

The "Order" line tells Apache to look at the Allow entries first, then the Deny entries. The Allow entry here is, obviously, allowing anyone into the server, and thus reads "Allow from all". If you wanted to block out a specific domain, let's say "bigdownloader.net", your entries would look like the following:

    Order allow,deny
    Allow from all
    Deny from bigdownloader.net

Note that there is no space after the comma in "allow,deny". And thus, bigdownloader.net's people will have their dreams of hitting your Web server dashed with a simple configuration setting.

Robots.txt

If you're like every other Webmaster, you want your site to be indexed by as many search engines as possible, thus ensuring a better hit rate. While this may seem like a great thing, sometimes those spiders (also called robots) that index your site can be just a bit more intrusive than you like, especially the poorly designed ones that tear through your site making hundreds or thousands of requests for pages and end up in directories you'd rather not have indexed. By creating a robots.txt file and placing it in the root directory of your Web server, you have a better chance of not getting your site pounded by these types of robots.

To create such a file, first decide which directories you don't want indexed or spidered, and which robots/spiders you want to block in general. Once you have an idea of what you need blocked, create a plain text file (not a formatted document such as an MS Word doc or the like) called robots.txt in the root directory of your Web site. Do not try to create this in each subdirectory of your site; it simply will not help.

Within this file, you will create specific blocking entries. For example, a basic robots.txt file may look like this:

    User-agent: *
    Disallow: /cgi-bin
    Disallow: /tmp
    Disallow: /~secretguy

These settings would block every single spider (the User-agent entry) from three directories: cgi-bin, tmp, and ~secretguy.
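If you'd like to check rules like these before relying on them, Python's standard urllib.robotparser module can read a robots.txt and answer "may this robot fetch this page?" The sketch below is only an illustration: the file contents are assumed from the three directories named above, and the robot name "AnyBot" and the sample page paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents, built from the three directories named in
# the text (cgi-bin, tmp, ~secretguy). Adjust to match your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /tmp
Disallow: /~secretguy
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if a robot identifying itself as `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


parser = build_parser(ROBOTS_TXT)

# "AnyBot" and these paths are hypothetical, chosen only for illustration.
for path in ("/index.html", "/cgi-bin/test.cgi", "/tmp/scratch.html",
             "/~secretguy/plans.html"):
    verdict = "allowed" if is_allowed(parser, "AnyBot", path) else "blocked"
    print(path, verdict)
```

With the file above, the home page comes back allowed and the three listed directories come back blocked. Keep in mind this only models what a well-behaved robot would do; robots.txt is a request, not an access control, so a rude spider can simply ignore it.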
It's very important to note that each Disallow must be on its own line; you cannot put something like "Disallow: /cgi-bin,/tmp,/~secretguy" in robots.txt.

If you wanted to block a specific spider, you would need to know what User-agent identification it is using. Let's say you wanted to block Webcrawler from your ~secretguy directory. You would place an entry in robots.txt like this:

    User-agent: Webcrawler
    Disallow: /~secretguy

If you wanted to allow everyone else unlimited access, but stop Webcrawler from even touching your site, you would configure robots.txt to look like this:

    User-agent: Webcrawler
    Disallow: /

The other agents' permissions are implied, giving them full access.

Creating and Publishing Content

With Apache, publishing content couldn't be simpler. Each user is assigned a publicly accessible directory that is used to serve up their Web content. Once that is done, the user need only connect via FTP to their directory, place the HTML files in the public directory, and they are on their way!

Every Apache installation starts with a base, or root, site directory. In most Linux distributions, this is the /home/httpd/ directory. Within that directory are directories normally named "html" and "cgi-bin". If you are serving up the Web for a domain, all of your site's specific pages go in the html directory. By default, your base page should be named "index.html". If you don't name it this, when someone types in the address of your Web server, they will either get a directory listing or a message stating they don't have directory permissions. So remember to create an index.html file in /home/httpd/html.

If you're also going to have subdomains or publishing areas for users, you'll need to create user Web directories that will contain Web pages and images.
Creating a user directory is simple. Under Linux, while logged in as the user needing the directory:

    cd ~

If you're logged in as root and need to create a user directory, the commands are almost the same, with the addition of an extra permission change:

    cd ~username

(Replace "username" with the actual user's account login name.)

In either instance, there should now be a subdirectory named "public_html" within the user's home directory. All pages for that user should be published in that directory, and they need to name their base page "index.html".

When a user needs to place files in their directory, they need only FTP to the Web server's address, being sure to log in with their regular username and password. Then they can place all their Web content into the public_html directory for immediate viewing. If so desired, they can create subdirectories under public_html to help keep everything organized. This is great for those users who like to have an "images" subdirectory.

By the way, to access a user's Web site, you would use the URL of the Web server, followed by ~username. So if you had a user called "test" on "myfakedomain.com", visitors would need to use the URL http://www.myfakedomain.com/~test.

Sub-domains

Everyone likes a simple domain. All those tildes and slashes can be a real pain to remember, and look pretty ugly on a business card. Apache has a nice trick to help out in instances where a simpler domain is needed: the sub-domain.

Let's say you have a user named "Suzie" who would really like something nicer than http://www.myfakedomain.com/~suzie as her URL. Well, if you really want to help her out, with a little work you can assign her http://suzie.myfakedomain.com, which to some is preferable to anything with a tilde in it. (Have you ever tried to explain where the tilde is on the keyboard to a PC newbie? If you have, you see why subdomains are nicer all around.)

To do this, we need to go back to our copy of httpd.conf in the /etc/httpd/conf directory.
If you open it up in an editor, you'll come across this section:

    # VirtualHost: Allows the daemon to respond to requests for more than one
    # server address, if your server machine is configured to accept IP packets
    # for multiple addresses. This can be accomplished with the ifconfig
    # alias flag, or through kernel patches like VIF.
    #<VirtualHost host.some_domain.com>

First and foremost, you need to remove the comment identifiers (the # signs) from the VirtualHost entry to make this section active. From there, you just need to change each item to match the new subdomain. In Suzie's case, it would look like this:

    <VirtualHost suzie.myfakedomain.com>

You might have noticed I referred to a directory we haven't created yet, that being /home/suzie/logs. Make sure you create this directory so Suzie has a place to store her server log information. That way, she can see how many hits she's getting on her subdomain.

Suzie would still upload all her pages to the public_html directory under her home directory, but she could access those pages from either http://www.myfakedomain.com/~suzie or http://suzie.myfakedomain.com.

Summing it up

Apache has a lot of power, and is relatively simple to configure. While I've only glossed over the basic configurations, you should have enough information to get an Apache Web server up and running now, and keep it fairly secure. To really lock down your server, be sure to read all the documentation and apply anything you think applicable.

Ted Brockwood is the Information Services Manager for a real estate listing service in Oregon. His experience covers Java, Linux, UNIX, NT, Win95/98, Win3.x, and DOS.