Sometimes, to save web space on the Cornerhost server, Michal will send a robot to gzip all the access logs more than a few days old. This is convenient for me, because I run Webalizer locally for my stats and would rather download gzipped files than plaintext over my modem. Now that I have a CGI account, though, I can do the compressing myself. This is how I do it:
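#!/usr/bin/python
# Emit the CGI header first so anything printed later shows up in the browser.
print "Content-type: text/plain"
print

import sys
sys.stderr = sys.stdout  # send tracebacks to the browser too
#import cgitb; cgitb.enable()

from gzip import GzipFile
from os.path import join
from os import chmod, environ

# The log directory is named after the host the script was requested on.
ldir = "/home/markpasc/logs/" + environ['HTTP_HOST']

# Each "&"-separated piece of the query string names one log, e.g. 20030331.
for f in environ['QUERY_STRING'].split("&"):
    gzfile = join(ldir, "%s.log.gz" % f)
    print "Compressing %s" % gzfile
    try:
        # Open the source first so a missing log doesn't leave an empty .gz.
        inf = open(join(ldir, "%s.log" % f), "r")
        out = GzipFile(gzfile, "wb")
        for line in inf.xreadlines():
            out.write(line)
        inf.close()
        out.close()
        chmod(gzfile, 0666)
    except Exception, ex:
        print "Oops, %s: %s" % (ex.__class__.__name__, ex)
    print "Done."
    print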
The particular ldir is a Cornerhost thing: it ends in whatever host name the script was requested on, so I can run the same program from markpasc.org to zip up markpasc.org's logs and from neologasm.org to zip up neologasm.org's logs. (This only matters now that I'm hosting my Radio discussion group FAQ on neologasm.org, since my idea for a neologism dictionary didn't come together.)
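For anyone on a newer Python, here is a rough sketch of the same per-host idea in Python 3. It is not the script above, just an illustration: the function names log_dir_for_host and compress_log are made up for this example, the base path is simply copied from the script, and the host check is an extra precaution that the original doesn't have.

```python
import gzip
import os
import shutil


def log_dir_for_host(host, base="/home/markpasc/logs"):
    """Return the per-domain log directory, e.g. base/markpasc.org."""
    # Assumption: drop any ":port" suffix and refuse names that could
    # point outside the base directory.
    name = host.split(":")[0].lower()
    if not name or "/" in name or name.startswith("."):
        raise ValueError("unexpected host: %r" % host)
    return os.path.join(base, name)


def compress_log(ldir, stamp):
    """Gzip ldir/<stamp>.log into ldir/<stamp>.log.gz and return the new path."""
    src = os.path.join(ldir, stamp + ".log")
    dst = src + ".gz"
    # The source is opened first, so a missing log raises before any
    # .gz file is created.
    with open(src, "rb") as inf, gzip.open(dst, "wb") as out:
        shutil.copyfileobj(inf, out)
    os.chmod(dst, 0o666)  # same permissions as the original script
    return dst
```

Something like compress_log(log_dir_for_host("markpasc.org"), "20030331") would then stand in for one pass through the loop.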
It's run by visiting a URL like http://markpasc.org/compressLogs.cgi? 20030331& 20030401& 20030402 (without spaces of course) in the browser, where each name between the ampersands is one log file to compress. Then I remove it when I'm done with it, because I don't want people messing with it. Not too shabby, I think.
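If I ever wanted to leave it installed, one way to make it harder to mess with would be to accept only names that look like dates. This is a Python 3 sketch under that assumption (eight digits, as in the URL above); parse_stamps is a hypothetical helper, not part of the script.

```python
import re

# Assumption: log names are always eight digits (YYYYMMDD), as in the URL above.
STAMP = re.compile(r"[0-9]{8}")


def parse_stamps(query_string):
    """Split 'a&b&c' into (accepted, rejected) lists of log names."""
    good, bad = [], []
    for part in query_string.split("&"):
        if not part:
            continue  # skip empty pieces, e.g. from a trailing "&"
        if STAMP.fullmatch(part):
            good.append(part)
        else:
            bad.append(part)
    return good, bad


print(parse_stamps("20030331&20030401&../../etc/passwd"))
# (['20030331', '20030401'], ['../../etc/passwd'])
```

Only the accepted names would be handed to the compression step, so a query string containing slashes or dots never reaches the file system.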