<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: A possible next step for hoard.it?</title>
	<atom:link href="http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/feed/" rel="self" type="application/rss+xml" />
	<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/</link>
	<description>musings about electronic culture</description>
	<lastBuildDate>Mon, 09 Jan 2012 14:02:10 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Mike</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8735</link>
		<dc:creator><![CDATA[Mike]]></dc:creator>
		<pubDate>Thu, 04 Mar 2010 17:32:41 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8735</guid>
		<description><![CDATA[Jim - cheers for the comment

Forgive me if I&#039;ve got this wrong but I think you&#039;ve misunderstood what I&#039;m suggesting. The idea isn&#039;t to hold data outside the page, but to hold a file which holds the *data shape* which refers to the existing data/html *on* the page. Sorry if I didn&#039;t make this clear!

The content in the file *could* be included somehow in the page, but the reality is that more than one page is likely to be represented in the same template shape - so it is better, like CSS, to hold this externally.

Make more sense?]]></description>
		<content:encoded><![CDATA[<p>Jim &#8211; cheers for the comment</p>
<p>Forgive me if I&#8217;ve got this wrong but I think you&#8217;ve misunderstood what I&#8217;m suggesting. The idea isn&#8217;t to hold data outside the page, but to hold a file which holds the *data shape* which refers to the existing data/html *on* the page. Sorry if I didn&#8217;t make this clear!</p>
<p>The content in the file *could* be included somehow in the page, but the reality is that more than one page is likely to be represented in the same template shape &#8211; so it is better, like CSS, to hold this externally.</p>
<p>Make more sense?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jim O'Donnell</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8734</link>
		<dc:creator><![CDATA[Jim O'Donnell]]></dc:creator>
		<pubDate>Thu, 04 Mar 2010 17:23:24 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8734</guid>
		<description><![CDATA[When RDF first came along, ten years ago, the proposal was that you could publish separate, machine-readable metadata for your web page and link to it from the head, using a link tag. Remember how we all rushed out and created RDF metadata files for our pages? Me neither. That precedent makes me very nervous about putting your machine-readable data in a separate file - in all likelihood, it will be forgotten about and won&#039;t get updated when the site changes. Hence the aversion to hidden metadata in the microformats community.

One point about embedding the metadata directly in the HTML - if you have access to add a link tag, surely that means you can add class=&quot;dc_title&quot; (microformats-style) or property=&quot;DC.Title&quot; (RDFa-style) to the appropriate HTML tag? This approach is much more robust against changes in the surrounding HTML and easier to maintain going forward.

I do agree that with the majority of museums seeing digitisation as publishing their records in HTML, there needs to be a mechanism of some sort to embed catalogue data in HTML. And it needs to be a mechanism with a low enough barrier to entry that it doesn&#039;t require huge technical skills to set up.]]></description>
		<content:encoded><![CDATA[<p>When RDF first came along, ten years ago, the proposal was that you could publish separate, machine-readable metadata for your web page and link to it from the head, using a link tag. Remember how we all rushed out and created RDF metadata files for our pages? Me neither. That precedent makes me very nervous about putting your machine-readable data in a separate file &#8211; in all likelihood, it will be forgotten about and won&#8217;t get updated when the site changes. Hence the aversion to hidden metadata in the microformats community.</p>
<p>One point about embedding the metadata directly in the HTML &#8211; if you have access to add a link tag, surely that means you can add class=&quot;dc_title&quot; (microformats-style) or property=&quot;DC.Title&quot; (RDFa-style) to the appropriate HTML tag? This approach is much more robust against changes in the surrounding HTML and easier to maintain going forward.</p>
<p>I do agree that with the majority of museums seeing digitisation as publishing their records in HTML, there needs to be a mechanism of some sort to embed catalogue data in HTML. And it needs to be a mechanism with a low enough barrier to entry that it doesn&#8217;t require huge technical skills to set up.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8731</link>
		<dc:creator><![CDATA[Mike]]></dc:creator>
		<pubDate>Wed, 03 Mar 2010 13:59:59 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8731</guid>
		<description><![CDATA[@Sean - thanks for commenting

Yeah, I originally thought about using CSS as the selector, but did some reading and thought that the jQuery approach allows a cleverer means of DOM selection. So, for example, you can exclude certain elements based on filters etc. I&#039;m not a massively good CSS&#039;er (like everything else, I hack it to make it work) but my impression was that DOM selection via CSS is a bit more limited?

re. RDFa - yes and microdata - yes, absolutely. If you have a look at my comment on this post http://doofercall.blogspot.com/2008/05/screen-scraping-and-posh.html you&#039;ll see that I suggest a hierarchical approach to grabbing data from the page, with &quot;most accurate&quot; at the top (including API, microformats, RDFa) and &quot;least accurate&quot; (the DOM approach) at the bottom. I think remaining compatible with these emerging approaches as well as with what has gone in the past is pretty key.]]></description>
		<content:encoded><![CDATA[<p>@Sean &#8211; thanks for commenting</p>
<p>Yeah, I originally thought about using CSS as the selector, but did some reading and thought that the jQuery approach allows a cleverer means of DOM selection. So, for example, you can exclude certain elements based on filters etc. I&#8217;m not a massively good CSS&#8217;er (like everything else, I hack it to make it work) but my impression was that DOM selection via CSS is a bit more limited?</p>
<p>re. RDFa &#8211; yes and microdata &#8211; yes, absolutely. If you have a look at my comment on this post <a href="http://doofercall.blogspot.com/2008/05/screen-scraping-and-posh.html" rel="nofollow">http://doofercall.blogspot.com/2008/05/screen-scraping-and-posh.html</a> you&#8217;ll see that I suggest a hierarchical approach to grabbing data from the page, with &#8220;most accurate&#8221; at the top (including API, microformats, RDFa) and &#8220;least accurate&#8221; (the DOM approach) at the bottom. I think remaining compatible with these emerging approaches as well as with what has gone in the past is pretty key.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean Gillies</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8730</link>
		<dc:creator><![CDATA[Sean Gillies]]></dc:creator>
		<pubDate>Wed, 03 Mar 2010 13:49:55 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8730</guid>
		<description><![CDATA[Or how about just sticking with CSS selectors and CSS syntax for the metadata file? MSS?

div#container ul.top-nav { metadata: DC.Title }

Of course, HTML already has a &lt;title&gt; element that you&#039;d really want to map to DC.Title, and a page that doesn&#039;t provide that probably has additional problems.

One nice feature of your idea is that it&#039;s not incompatible at all with RDFa (my preference) or with HTML5 microdata. DC metadata could be mapped to RDFa attributes as well as to HTML element text.]]></description>
		<content:encoded><![CDATA[<p>Or how about just sticking with CSS selectors and CSS syntax for the metadata file? MSS?</p>
<p>div#container ul.top-nav { metadata: DC.Title }</p>
<p>Of course, HTML already has a &lt;title&gt; element that you&#8217;d really want to map to DC.Title, and a page that doesn&#8217;t provide that probably has additional problems.</p>
<p>One nice feature of your idea is that it&#8217;s not incompatible at all with RDFa (my preference) or with HTML5 microdata. DC metadata could be mapped to RDFa attributes as well as to HTML element text.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8729</link>
		<dc:creator><![CDATA[Mike]]></dc:creator>
		<pubDate>Wed, 03 Mar 2010 12:08:53 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8729</guid>
		<description><![CDATA[@Mark - thanks for commenting, really useful.

You&#039;re absolutely right - the fact that data needs to have a certain shape (and also the fact that the data needs to be exposed on-page at all!) is to a certain extent a downside of the proposed approach. Also it is probably the case that in reality a fair amount of munging will be required of extracted data. I guess a real-world test is the only way of finding out if this is enough of a barrier to make the idea a no-go or not...]]></description>
		<content:encoded><![CDATA[<p>@Mark &#8211; thanks for commenting, really useful.</p>
<p>You&#8217;re absolutely right &#8211; the fact that data needs to have a certain shape (and also the fact that the data needs to be exposed on-page at all!) is to a certain extent a downside of the proposed approach. Also it is probably the case that in reality a fair amount of munging will be required of extracted data. I guess a real-world test is the only way of finding out if this is enough of a barrier to make the idea a no-go or not&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mark</title>
		<link>http://electronicmuseum.org.uk/2010/03/02/a-possible-next-step-for-hoard-it/#comment-8728</link>
		<dc:creator><![CDATA[mark]]></dc:creator>
		<pubDate>Wed, 03 Mar 2010 11:42:54 +0000</pubDate>
		<guid isPermaLink="false">http://electronicmuseum.org.uk/?p=664#comment-8728</guid>
		<description><![CDATA[If I understand it, then I like the idea... It sounds like you&#039;re trying to fill a similar need to RDFa and microformats - enabling people to pull data from web pages with minimal overheads required in exposing the data. However, it occurs to me that your method would require the document structure to quite closely mirror the structure of the data you want to expose. Whilst this might be the case for certain elements, like the doc title example (which, incidentally, is generally going to contain unstructured data), I doubt it will be the case for all interesting data on a page. But if it&#039;s about applying an external description of data on a page, isn&#039;t that what XSLT does?]]></description>
		<content:encoded><![CDATA[<p>If I understand it, then I like the idea&#8230; It sounds like you&#8217;re trying to fill a similar need to RDFa and microformats &#8211; enabling people to pull data from web pages with minimal overheads required in exposing the data. However, it occurs to me that your method would require the document structure to quite closely mirror the structure of the data you want to expose. Whilst this might be the case for certain elements, like the doc title example (which, incidentally, is generally going to contain unstructured data), I doubt it will be the case for all interesting data on a page. But if it&#8217;s about applying an external description of data on a page, isn&#8217;t that what XSLT does?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

