Mining the intranet

Of course, sites such as Amazon and Google have reasons to create formal APIs and gate access to them. But on an enterprise intranet the threat is disuse, not overuse. You're publishing information that you want people to find, exploit, and recombine. When it's appropriate to use SOAP and WSDL (for example, when queries require fancy authorization or complex inputs), then do so. But when a simpler strategy will suffice, don't be ashamed to use it. Between the primordial tag soup of HTML and the formal realm of Web services lies a large and fertile middle ground: XHTML. Information that you publish in XHTML can be directly consumed by browsers, and it's much friendlier to spiders than ill-formed HTML. If you hope people will mine your intranet, make the job as easy as it can be. [Full story at]
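To make the "friendlier to spiders" point concrete, here is a minimal sketch of what mining a well-formed XHTML page might look like with nothing but Python's standard XML parser. The page content, the `staff` table id, the `name` and `ext` class names, and the `extract_staff` function are all hypothetical, invented for illustration; a real intranet page would have its own structure.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace; the "x" prefix is our own choice.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical intranet page. Because it is well-formed XML, an ordinary
# XML parser can read it; no tag-soup cleanup step is needed.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Staff directory</title></head>
  <body>
    <table id="staff">
      <tr><td class="name">Ana Ruiz</td><td class="ext">4410</td></tr>
      <tr><td class="name">Bo Chen</td><td class="ext">4522</td></tr>
    </table>
  </body>
</html>"""


def extract_staff(xhtml_text):
    """Return (name, extension) pairs from the hypothetical staff table."""
    root = ET.fromstring(xhtml_text)
    rows = root.findall(".//x:table[@id='staff']/x:tr", XHTML_NS)
    staff = []
    for row in rows:
        name = row.find("x:td[@class='name']", XHTML_NS)
        ext = row.find("x:td[@class='ext']", XHTML_NS)
        # Skip rows that lack either cell (for example, a header row).
        if name is not None and ext is not None:
            staff.append((name.text, ext.text))
    return staff


def is_well_formed(markup):
    """Report whether the XML parser accepts the markup at all."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for name, ext in extract_staff(PAGE):
        print(name, ext)
    # Typical tag soup: the <br> is never closed, so the parser rejects it.
    print(is_well_formed("<p>unclosed<br></p>"))
```

The same path expressions fail outright on ill-formed HTML, which is the practical difference: with XHTML the consumer writes a few lines of query, and without it the consumer first has to write (or borrow) a repair step.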

I sometimes worry that I harp too much on these kinds of simple home truths. But Mike Champion's review of my XML 2003 keynote was a nice bit of validation: [Jon's Radio]
