Where To Start

There are two potential uses for the information gathered by the profiling system - after-the-event analysis, and on-the-fly content production. We'll assume that you want to produce content on the fly from the profile information. The offline reporting angle is similar and less complicated.

The first thing you'll need is a back-end database of some description, because if your site is at all popular, there will be a great deal of information flying about. And while writing to (say) an ASCII log file line by line is fairly efficient, if you're going to do any processing at all on the resulting data you'll need a proper, purpose-built query system. The easiest way to go is to use an SQL-based system such as Oracle or Microsoft SQL Server, as these are built to handle huge datasets.

The second thing you'll need is a site that's based on scripts. If you want to build pages on the fly, the easiest way to do it is by using CGI, ASP, server-side Java or some other scripting system to query the database and figure out what to put in the areas targeted toward each specific user. You could do the job more efficiently by writing code that plugs into the Web server application itself via the server's API. This is far more difficult, though it may be worth doing because, as we've said in a previous article (Speeding Up Your CGI), working with the server API is trickier than writing CGI-type scripts but the result is much faster.

Who Are You?

Once you have a back-end database, and have chosen a scripting system (and you've worked out how to make them talk to each other!), you can plan how you're actually going to deal with the profiling information. The first task is to identify your users. This doesn't necessarily mean you have to know their names and addresses, only that you can tell people apart and, hopefully, identify people as they go around the site and when they leave and come back. The most obvious approach is to use cookies.
Browsers store them, and each cookie is accessible to any page or script on the server that dished out that cookie. So when someone comes to the site, the script asks for its cookie and, assuming there is one, can then look up the relevant profile information in the database. Each user is given a unique ID number so users can be told apart.

The problem with cookies, though, is that some old browsers don't support them, and some people turn off cookie support on their browsers for misconceived security reasons. Also, cookies tend to be blown away if people are rigorous about tidying up their temporary-files folders. It's worth looking at the alternative - standard CGI identification.

CGI Identification

This scripting mechanism, which has been part of the Web almost from day one, is a very versatile way of tracking someone through the site. You've probably seen sites where the URLs are huge long strings like

http://www.myco.com/site.cgi?userID=AEERJFJ2235w37859S845

When a script is called, it looks up the values of the variables in the URL, and then uses what it sees to find that user's session information in the profile database. It also incorporates the values of the variables in its links to other pages, in order to pass the information on down the line.

The only problem with using CGI user tracking is that when people first land on the site there is no way to tell who they are - unlike the cookie approach. The best compromise is to combine the two. Store a cookie on the user's browser, and read this cookie when people first land on the site. You then take the user ID from the cookie and use it from that point on to drive the CGI tracking process.

If there isn't a cookie, you can give the user a login box and ask them to identify themselves, or to register with your site if they've not been before. It's a less automated way of finding out who someone is, and can put your visitors off, but it is a possibility.
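The combined cookie-plus-URL approach just described can be sketched in a few lines. This is only an illustration, written in Python rather than any particular CGI or ASP dialect. The function names `identify_user` and `tag_link` are invented for the example, as is the fallback of handing out a fresh random ID when neither a cookie nor a URL variable is present. Only the `userID` variable name comes from the sample URL above.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit
import uuid


def identify_user(cookie_header, query_string):
    """Return (user_id, source). Hypothetical helper: cookie first, then URL."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "userID" in cookie:
        # Returning visitor whose browser kept our cookie.
        return cookie["userID"].value, "cookie"
    params = parse_qs(query_string or "")
    if "userID" in params:
        # No cookie, but the ID was passed down the line in the URL.
        return params["userID"][0], "url"
    # Assumption for this sketch: unknown visitors get a fresh session ID.
    return uuid.uuid4().hex, "new"


def tag_link(url, user_id):
    """Add the userID variable to a link so the next script can read it."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params["userID"] = [user_id]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))
```

A page-building script would call `identify_user` once per request, look up the profile for the ID it gets back, and run every internal link through `tag_link` before writing the page out.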
If you choose not to employ a registration process then you'll have to live with not being able to profile cookie-immune browsers except within each session.

Click Here To Revisit

The next question is: just how do you work out what someone is doing, where they've been, and what they're interested in? The most basic profiling mechanism is simply defining 'major' sections of your site and providing "quick-links" into places that people have visited before.

Say, for example, you're running a site providing technical support and helpful information for Webmasters. When someone goes into the SQL Server 7.0 section, you could register this fact, and from that point on present the SQL Server 7.0 section as a quick-link in (say) the left-hand navigation bar of their screen. People who come back a lot would soon have a nice little list of their 'favourite' sections so that they could get quickly to the areas they always look at.

Please Sir, Can I Have Some More?

Quick-links are very basic, and you'll probably want to go further. While many sites are roughly hierarchical, pages containing information on one specific subject are often found spread around different sections. This is not through error; it's a simple consequence of the fact that some subjects have many dimensions or can't easily be categorised.

It's not enough to use profiling to simply define a few vague areas of interest and point people at the top of subject areas. Instead, if someone is hopping from one page down in the tree to another at an equivalent depth in a different area, you need to figure out why, and what (say) a page about California has to do with the review of yesterday's sports coverage on the TV.

The enhanced approach is to keep the idea of tagging pages with their subject area, but forget the hierarchy and just record information about the content of each file.
It's rather like the use of metatags to specify keywords for search engines when they crawl over your pages, except you're using the information yourself rather than putting it there solely for the use of external spiders. When a page is built, you make sure it's tagged with as many relevant items as possible, and then when someone goes to a page, you record the tags that the server saw when they got there.

With this approach, you can now do two more things that weren't possible with basic section tags. First, when the user is on a page, the server can query the database and pick out some "related topics" links, which could be other pages with similar topics or - more importantly in these days of e-commerce - items in the online store. Second, you can build general pages (such as the site's main front page) with links to the areas you know that user likes to see.
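The tag-based lookups described above come down to a couple of SQL queries. The sketch below is a minimal illustration only. It uses Python with an in-memory SQLite database as a stand-in for Oracle or SQL Server, and the table names (`page_tags`, `visit_tags`), function names and ranking rule (most shared tags first) are all invented for the example.

```python
import sqlite3


def open_db():
    """Create the two hypothetical tables used by this sketch."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page_tags (page TEXT, tag TEXT);
        CREATE TABLE visit_tags (user_id TEXT, tag TEXT);
    """)
    return db


def tag_page(db, page, tags):
    """Record the content tags for a page when it is built."""
    db.executemany("INSERT INTO page_tags VALUES (?, ?)",
                   [(page, tag) for tag in tags])


def record_visit(db, user_id, page):
    """Copy the page's tags into the visitor's profile."""
    db.execute(
        "INSERT INTO visit_tags SELECT ?, tag FROM page_tags WHERE page = ?",
        (user_id, page))


def related_pages(db, page, limit=3):
    """Other pages sharing tags with this one, most shared tags first."""
    rows = db.execute("""
        SELECT other.page, COUNT(*) AS shared
        FROM page_tags AS here
        JOIN page_tags AS other
          ON other.tag = here.tag AND other.page <> here.page
        WHERE here.page = ?
        GROUP BY other.page
        ORDER BY shared DESC, other.page
        LIMIT ?""", (page, limit))
    return [row[0] for row in rows]


def favourite_tags(db, user_id, limit=3):
    """The tags this user has seen most often, for building a front page."""
    rows = db.execute("""
        SELECT tag, COUNT(*) AS hits FROM visit_tags
        WHERE user_id = ?
        GROUP BY tag
        ORDER BY hits DESC, tag
        LIMIT ?""", (user_id, limit))
    return [row[0] for row in rows]
```

Here `related_pages` would feed the "related topics" links on a page, while `favourite_tags` would drive the personalised links on general pages such as the front page. Store items can be tagged in the same table as ordinary pages, so they turn up in the related list without any special handling.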