Apr. 15th, 2008 02:47 pm
Beating with the clue stick
Yesterday's providers have gone one better. Their RSS contains 'link' elements, which are supposed to point back to the content on their web site, looking something like this:
<link>http://www.crappers.example/stuff//thing.html</link>
There are 190 links spread across 9 feeds. Every single one of their link elements 404s. Hmmmm. But that "//" towards the end of the URL is a bit odd, isn't it? One of my colleagues noticed this, hunted around on their web site and discovered that URLs like this work OK:
http://www.crappers.example/stuff/123456/thing.html
They've 'forgotten' to put what are clearly the unique IDs of their stories in the links they send to us.
What quality! The data is wrong at the syntactic and semantic level...
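A check along these lines would flag the problem without anyone having to hunt around the site. This is only a rough sketch in Python. The function names and the two-item sample feed are made up for illustration, and it assumes a plain RSS 2.0 feed whose link elements carry no XML namespace. It only spots the empty path segment (the "//"). It cannot know what the missing ID should have been.

```python
from typing import List
from urllib.parse import urlparse
import xml.etree.ElementTree as ET


def has_empty_path_segment(url: str) -> bool:
    """True if the URL's path contains an empty segment, e.g. '/stuff//thing.html'."""
    # urlparse strips the scheme and host, so the '//' after 'http:' is not counted.
    return "//" in urlparse(url).path


def suspicious_links(rss_text: str) -> List[str]:
    """Return every <link> URL in the feed whose path has an empty segment."""
    root = ET.fromstring(rss_text)
    links = [el.text.strip() for el in root.iter("link") if el.text]
    return [url for url in links if has_empty_path_segment(url)]


# Hypothetical sample feed built from the two URLs quoted above.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><link>http://www.crappers.example/stuff//thing.html</link></item>
<item><link>http://www.crappers.example/stuff/123456/thing.html</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for url in suspicious_links(SAMPLE_FEED):
        print("empty path segment:", url)
```

Run over all 9 feeds, something like this would list the offending links in one go. Checking that each link actually resolves would still need an HTTP request per URL.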