How to deal with broken feeds
Tim Bray writes about how to deal with broken PEAW feeds: "I would absolutely require basic XML well-formedness."
Me, I would absolutely love it if I, as an aggregator developer, could require well-formedness.
In other words, if a feed isn't well-formed, then NetNewsWire would not parse it and display it.
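As a rough illustration of what that strict policy means in practice, here is a minimal sketch in Python using the standard library's XML parser. The function name `is_well_formed` and the two sample feeds are made up for this example; this is not NetNewsWire's code, just one way a strict aggregator might decide whether to accept a feed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(feed_text: str) -> bool:
    """Return True only if feed_text parses as well-formed XML.

    Hypothetical helper for illustration: a strict aggregator would
    refuse to display any feed for which this returns False.
    """
    try:
        ET.fromstring(feed_text)
    except ET.ParseError:
        return False
    return True


# Two made-up feeds that differ only in how the ampersand is written.
GOOD_FEED = "<rss><channel><title>News &amp; Notes</title></channel></rss>"
BAD_FEED = "<rss><channel><title>News & Notes</title></channel></rss>"

if __name__ == "__main__":
    for name, feed in (("good", GOOD_FEED), ("bad", BAD_FEED)):
        verdict = "accept" if is_well_formed(feed) else "reject"
        print(name, verdict)
```

Under this policy a single stray character is enough to make the whole feed disappear from the reader.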
The thing is, that doesn't work now for RSS, but not because of anything special to RSS. It's because feed generators don't always produce well-formed XML. There's no reason to expect PEAW feed generators would be any different. (Both RSS and PEAW require well-formedness. No difference there.)
The single most common cause of non-well-formedness that I see is unencoded ampersands. They appear in a feed as & rather than as &amp;. This is most often in <title>s.
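To make the ampersand problem concrete, here is a minimal sketch in Python of one way an aggregator could recover from it: try a strict parse first, and only if that fails, escape ampersands that do not begin an entity or character reference and try again. The names `escape_bare_ampersands` and `parse_liberally` are invented for this example, and the regular expression is a simplification. It does not reflect how NetNewsWire actually handles these feeds.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does NOT start something shaped like a reference:
# a named entity (&amp;), a decimal reference (&#38;), or a hex one (&#x26;).
# Simplified on purpose: real XML names allow more characters than this.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Replace each bare & with &amp;, leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


def parse_liberally(feed_text: str) -> ET.Element:
    """Parse strictly first; on failure, repair bare ampersands and retry.

    If the repaired text is still not well-formed, the second parse
    raises ET.ParseError and the caller has to deal with it.
    """
    try:
        return ET.fromstring(feed_text)
    except ET.ParseError:
        return ET.fromstring(escape_bare_ampersands(feed_text))


if __name__ == "__main__":
    broken = "<rss><channel><title>News & Notes</title></channel></rss>"
    root = parse_liberally(broken)
    print(root.find("channel/title").text)
```

The repair only runs after a strict parse has failed, so well-formed feeds are never touched. It also does nothing for other kinds of breakage, such as undefined named entities or mismatched tags.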
In my experience this most often afflicts larger publications, not weblogs using Movable Type or Radio or whatever. My guess is it's because these larger publications have their own in-house systems. Those systems don't get tested the way weblog systems get tested. A weblog system will have many thousands of users, but an in-house system has just one user (the publication). (I mean "user" in the sense of publisher.)
Another thing about these larger publications is that their feeds are often very popular. So when one is non-well-formed, I get a ton of bug reports about it until they fix it.
Actually, that used to happen, but NetNewsWire has gotten progressively better about dealing with the ampersand problem, so I don't get so many bug reports.
According to Tim, aggregators should consider it a fatal error and not process these feeds. If I agreed, NetNewsWire users would pay the price.
Tim writes: "Granted that the RSS legacy necessarily required the use of liberal parsers, but hey, that was then, we have better tools now."
Setting aside the note about the RSS legacy, which isn't relevant to the issue of well-formedness, I want to note that though we indeed have better tools now, they aren't evenly distributed.
And, ironically and interestingly, John Q. NewToBlogging has better feed-generating tools than many of the large publications. [inessential.com]