I'm cobbling together a news aggregator from whatever parts are lying about. Currently that's Blosxom and Mark Pilgrim's RSS parser, so I've basically written a real PyBlagg, as opposed to the not-really-PyBlagg Spycyroll, which used to be called pyblagg and on which development isn't developing. Only I'm not calling mine PyBlagg either, even though that's what it is (an aggregator designed to work with Blosxom, that is). Because it's replacing Kit's news page for me, I'm also writing a plugin for Blosxom that lets me slice the articles like Kit does. The "N hours, M hours ago" slice is all I've done so far, but I plan on reimplementing the whole form: that, plus from time X to time Y, or all news, possibly limited by a search term.
The two differences from Kit (which is derivative of Radio's native news interface) are that mine works at the item level, so the grey title bar (UserLand likes to use "whitesmoke" in places) holds an item title, not a feed title; and that if an item has a date tag with an ISO 8601 date, my code will os.utime the news file so Blosxom sorts it by that date. This is different from Radio in that... well, Radio doesn't sort items by dc:date or pubDate (not even with Kit), but by the time they were scanned.
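For the curious, the date-stamping step could be sketched roughly like this. It is a minimal illustration, not the aggregator's actual code: the function name stamp_entry is made up here, and treating a date with no offset as UTC is an assumption.

```python
import os
from datetime import datetime, timezone

def stamp_entry(path, iso_date):
    """Set a news file's atime and mtime from an ISO 8601 date string.

    Hypothetical helper: the name and the UTC default are assumptions.
    """
    text = iso_date.strip()
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    when = datetime.fromisoformat(text)
    if when.tzinfo is None:
        # Assumption: a date without an offset is taken to be UTC.
        when = when.replace(tzinfo=timezone.utc)
    ts = when.timestamp()
    # Blosxom orders entries by file mtime, so this is what controls the sort.
    os.utime(path, (ts, ts))
    return ts
```

Calling stamp_entry on a freshly written entry file with the item's dc:date value would then make Blosxom sort it by that date rather than by when it was scanned.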
This presents the problem of what to do with items that have no date. You could use the time of the scan. That sounds backwards: read the delinquent items first? No, my code keeps an empty lastAggr file in the data directory with its mtime set to the last time the aggregator ran to completion, and gives items with no dates that time. So, modulo the time zone issues I haven't yet considered closely enough, and leeway for hitting a feed at different points in the scanning, you're guaranteed that unmarked items will not display as newer than they were actually written.
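The lastAggr trick could be sketched along these lines. Again this is only an illustration under stated assumptions: the function names are hypothetical, and falling back to the current time when no run has completed yet is a guess at sensible first-run behaviour.

```python
import os
import time

MARKER_NAME = "lastAggr"  # the empty marker file described above

def fallback_time(data_dir):
    """Return the timestamp to give items that carry no date.

    Uses the marker's mtime, i.e. the last completed run. If the marker
    does not exist yet (assumed first run), falls back to the current time.
    """
    marker = os.path.join(data_dir, MARKER_NAME)
    try:
        return os.path.getmtime(marker)
    except OSError:
        return time.time()

def mark_run_complete(data_dir, when=None):
    """Create the marker if needed and set its mtime to `when` (default: now)."""
    marker = os.path.join(data_dir, MARKER_NAME)
    if when is None:
        when = time.time()
    # Opening in append mode creates the file without writing any content.
    with open(marker, "a"):
        pass
    os.utime(marker, (when, when))
```

The idea is to read fallback_time once at the start of a run, stamp every undated item with it, and call mark_run_complete only after every feed has been scanned, so an interrupted run leaves the old marker in place.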
The main difference in the interface is that I can't go a day (or three or four, which it has been at this point) without scanning, then view the hour including the scan to get all I missed. I've still been viewing in hour blocks (which is annoying when Blosxom can't update the form defaults yet), and it's only been like two or three items an hour once I got past a clump that was marked with now-ish times because I had to do the scan in multiple runs.
Other than that, it's been interesting, and I hope to have a fully usable solution for my news reading in the not too distant future.
- Make form work
- Let Blosxom fill in current form defaults
- Reorder titles to be Channeltitle Time instead of Time Channeltitle
- Make displayed channel title a link
- Fix bug with % in channel data
- Fix the Unicode quote bug or whatever that is
- Figure out how to put optional content (e.g., comments link) in item.html