Thursday, November 16, 2006

Sitemaps protocol no longer just for Google

As Brent might say. Google's sitemaps protocol, mostly just an XML schema for laying out the important bits of your site for web crawlers, is being set up as a more neutral protocol to be used by Google, Yahoo, and Microsoft. I suppose this just recognizes the fact that, though the protocol was initiated by Google, sitemap files can be read by anyone. Collaboration here is pretty easy and makes for good PR.

I am planning a site that will be the most fun if it uses lots of dynamically fetched and generated content. Unfortunately, that will make it inscrutable to HTML-parsing search engines. Laying out a site map that points to the raw content, with as much metadata as possible, should make it much easier to ensure that search engines get the most information they can. The key is to give Google, or another advertiser, maximum exposure to site data and meta information so that ads can be properly targeted.

The issue that remains for me is whether or not Google will frown on sending ads to a black box. That is, will they trust that my AJAX site, when it calls in Google ads with given keywords, is serving up the same content that is indexed through my sitemaps file? The irony of a lack of trust in my situation would be that I am planning on using Google Web Toolkit (GWT) to build my dynamic site.
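To make the site map idea a bit more concrete, here is a rough sketch of generating a sitemap file with Python's standard library. The element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the protocol published at sitemaps.org. The example.com URLs, the dates, and the build_sitemap helper are made up for illustration and are not part of any real site or library.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemap XML string for a list of page dicts.

    Hypothetical helper: each dict needs a "loc" key and may carry the
    optional "lastmod", "changefreq", and "priority" hints.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Made-up entries pointing at the raw content behind a dynamic site.
pages = [
    {"loc": "http://www.example.com/", "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/content/article-1.html",
     "lastmod": "2006-11-16", "changefreq": "weekly"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The result would be saved as something like sitemap.xml at the root of the site. Note that the protocol treats changefreq and priority as hints to crawlers, not commands, so this only describes the content; it does nothing to settle the trust question above.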

Here's what Netcraft says about the site:
http://www.sitemaps.org was running GWS on unknown when last queried at 16-Nov-2006 08:56:49 GMT