I thought I’d post a little more about web stats in general, and also about the results of yesterday’s blog maintenance. In case any of you have the capability of looking at your own web stats and don’t know what it is you’re looking at, here’s a line from mine which I’ll talk you through. As I did yesterday, I’ve edited the URL in question.
81.240.255.226 - - [12/Jul/2005:15:16:20 -0400] "GET /cgi-bin/mt-bdcc.cgi?entry_id=174 HTTP/1.1" 200 2827 "http://zzz.hawaiiansurvey.org/online-casinos.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; AIRF)"
Taking this bit by bit we have:
81.240.255.226 – this is the IP address where the request originated
[12/Jul/2005:15:16:20 -0400] – this is the date and time my web host served the file
"GET /cgi-bin/mt-bdcc.cgi?entry_id=174 HTTP/1.1" – this is the request itself: the method (GET), the file requested and the protocol version. mt-bdcc.cgi is (was) the name of my comments script, and this particular request would have returned the comments for entry number 174 on my blog (which actually doesn’t have any comments, which suggests that they’re just requesting entries at random)
200 – this is the success code returned by the web server, saying that it has sent the requested page to the person that requested it. Other codes include 404 (file not found), 403 (forbidden) and 304 (not modified)
2827 – this is the size of the response in bytes, not counting the headers.
"http://zzz.hawaiiansurvey.org/online-casinos.html" – this is the referring web page
"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; AIRF)" – this is the identification string for the browser the surfer was using.
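For anyone who would rather let a script do the splitting, here is a rough Python sketch that pulls one line apart into the pieces listed above. It assumes the log uses the common "combined" layout with plain straight quotes and hyphens, and the names LOG_PATTERN and parse_line are ones made up for this illustration, not part of any library.

```python
import re

# Assumed layout: ip ident user [timestamp] "request" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no body was sent; treat it as 0 bytes
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# The line discussed above, with straight quotes and hyphens
sample = (
    '81.240.255.226 - - [12/Jul/2005:15:16:20 -0400] '
    '"GET /cgi-bin/mt-bdcc.cgi?entry_id=174 HTTP/1.1" 200 2827 '
    '"http://zzz.hawaiiansurvey.org/online-casinos.html" '
    '"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; AIRF)"'
)
entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["size"], entry["referrer"])
```

Lines that don't fit the assumed layout come back as None rather than raising an error, which seemed the safer choice for a log that may contain the odd malformed request.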
When you’re scanning down the log file and you see dozens, if not hundreds, of consecutive requests for the same file, from the same referrer, it leaps out at you. From what I can see these requests started on 15 June, although why they started and what led them to my site I’ve no idea.
Here’s another one:
67.28.112.46 - - [14/Jul/2005:20:14:16 -0400] "GET / HTTP/1.1" 200 48898 "http://zzz.bestfreedirectory.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
What’s noticeable about this is that the page requested ("GET / HTTP/1.1") is the root (index) page, indicated by the single slash in the middle there. My index page at the time was about 48K in size (the 48898 bytes in the line above). This is not so bad on its own, but since 10 July they have requested my index page 610 times, and 580 of those requests were yesterday alone. So they’ve gone very quickly indeed into my ban list. In fact, since I renamed my comments script and started adding IP addresses into the ban list, I’ve had 1115 requests for a page that no longer exists, and 503 requests from people who have been banned. And I still don’t know exactly what they’re after.
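Counts like these are easier to get from a script than by eye. What follows is only a sketch, assuming the same combined layout with straight quotes; the function name tally is invented for illustration, and the second and third sample lines are hypothetical (made up to show a 404 and a 403 being counted, not copied from my log).

```python
import re
from collections import Counter

# Only the fields needed here: request path, status code and referrer
PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def tally(lines):
    """Count requests per (referrer, path) pair and per status code."""
    by_referrer = Counter()
    by_status = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match is None:
            continue  # skip anything that isn't in the assumed layout
        by_referrer[(match["referrer"], match["path"])] += 1
        by_status[int(match["status"])] += 1
    return by_referrer, by_status


# First line is the one quoted above; the other two are HYPOTHETICAL,
# invented only to show a missing page (404) and a banned visitor (403).
sample = [
    '67.28.112.46 - - [14/Jul/2005:20:14:16 -0400] "GET / HTTP/1.1" 200 48898 '
    '"http://zzz.bestfreedirectory.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
    '192.0.2.10 - - [15/Jul/2005:09:00:00 -0400] '
    '"GET /cgi-bin/mt-bdcc.cgi?entry_id=174 HTTP/1.1" 404 0 '
    '"http://example.com/" "ExampleAgent/1.0"',
    '192.0.2.11 - - [15/Jul/2005:09:00:05 -0400] "GET / HTTP/1.1" 403 0 '
    '"http://example.com/" "ExampleAgent/1.0"',
]
by_referrer, by_status = tally(sample)
for (referrer, path), count in by_referrer.most_common():
    print(count, referrer, path)
print("not found:", by_status[404], "forbidden:", by_status[403])
```

Because tally only needs something it can loop over line by line, an open log file could be passed in place of the sample list.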
I may well be, in the long term, fighting a losing battle, but for now I think I’m mounting a half-decent defence.
Probably the most boring post in the world (vol 2)
#1 by annie at July 16th, 2005
That is a LOT of requests! crazy!