In the spirit of the lightweight browser-based solution, I decided to create an equally lightweight server-based version based on Python and libxml2/libxslt. (I'm also working on a slightly heftier, but more powerful variation based on Berkeley DB XML; we'll explore that one next time.) [O'Reilly Network]
This article spells out, in more detail than I've gone into here, an approach to dynamic categories. During yesterday's roundtable at RSS WinterFest, I mentioned one use of this technique: querying for items, by date, that include QuickTime movies. Kevin Marks, who's now director of engineering for Technorati, pointed out, correctly, that there's nothing special about searching by date. What is special is a search that combines the sort of standard metadata captured by any content management system with what we might call “inline metadata” that emerges from the content itself.
A clearer example, because it involves only inline metadata, is this dynamic category for items related to books. It's a content-aware query that returns paragraphs (along with links, images, and other markup) containing URLs to amazon.com or allconsuming.com, the two book sites I commonly refer to. As a matter of fact, when I wrote the query I forgot about a third book site I commonly refer to: Safari. When I amended the query accordingly, a few more items appeared in the category. Note also that, because the query is content-aware, it can return more context (for example, entire items), or less context (for example, just links), by adjusting its scope.
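To make the idea of adjustable scope concrete, here is a minimal sketch in Python. It is not the actual query behind the books category, just an illustration under stated assumptions: the sample items, the `item`/`body` element names, the `book_query` and `is_book_link` helpers, and the `safari.oreilly.com` hostname are all hypothetical, and the markup is namespace-free for brevity. It uses the standard library's ElementTree rather than libxml2, so the URL test is done in Python instead of in an XPath predicate.

```python
import xml.etree.ElementTree as ET

# Hostnames treated as "book sites". The Safari hostname is an assumption.
BOOK_SITES = ("amazon.com", "allconsuming.com", "safari.oreilly.com")

# Hypothetical, well-formed sample content standing in for XHTML blog items.
SAMPLE = """<items>
  <item date="2004-01-20">
    <body>
      <p>Reading <a href="http://www.amazon.com/exec/obidos/ASIN/0000000000">a book</a> this week.</p>
      <p>An unrelated <a href="http://example.org/">link</a>.</p>
    </body>
  </item>
  <item date="2004-01-21">
    <body>
      <p>No links here.</p>
      <p>See <a href="http://safari.oreilly.com/0000000000">this title</a> on Safari.</p>
    </body>
  </item>
  <item date="2004-01-22">
    <body>
      <p>Nothing about books at all.</p>
    </body>
  </item>
</items>"""


def is_book_link(anchor):
    """True if the anchor's href mentions one of the book sites."""
    href = anchor.get("href", "")
    return any(site in href for site in BOOK_SITES)


def book_query(root, scope="paragraph"):
    """Return matching elements at the requested scope.

    scope="item"      -> whole items that contain a book link (most context)
    scope="paragraph" -> only the paragraphs that contain a book link
    scope="link"      -> only the book links themselves (least context)
    """
    if scope not in ("item", "paragraph", "link"):
        raise ValueError("unknown scope: %r" % scope)
    results = []
    for item in root.iter("item"):
        # Paragraphs in this item that contain at least one book link.
        paragraphs = [p for p in item.iter("p")
                      if any(is_book_link(a) for a in p.iter("a"))]
        if not paragraphs:
            continue
        if scope == "item":
            results.append(item)
        elif scope == "paragraph":
            results.extend(paragraphs)
        else:
            results.extend(a for p in paragraphs
                           for a in p.iter("a") if is_book_link(a))
    return results


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for element in book_query(root, scope="paragraph"):
        print(ET.tostring(element, encoding="unicode").strip())
```

Adding a site to `BOOK_SITES` widens the category without touching the items themselves, which is the effect described above when Safari was added to the query. Changing `scope` is the only thing needed to get more or less context back.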
Now, since the mountain will not come to Mohammed, Mohammed will go to the mountain. By that I mean: if the majority of blogs to which I subscribe won't provide me with XHTML content to search, then I will endeavor to XHTML-ize the feeds that they do supply. The reason: to extend these dynamic categories across the whole set of blogs I read. Here's a preview of a books query against the last few days' worth of my inbound feeds: … [Jon's Radio]