electronic museum

Entries categorized as ‘mashup’

The Brooklyn Museum API – Q&A with Shelley Bernstein and Paul Beaudoin

April 16, 2009 · 9 Comments

The concept and importance of museum-based APIs are notions that I’ve written about consistently (boringly, probably) both on this blog and elsewhere on the web. Programmatic and open access to data is – IMO – absolutely key to ensuring the long-term success of online collections.

Many conversations have been going on about how to make APIs happen over the last couple of years, and I think we’re finally seeing these conversations move away from niche groups of enthusiastic developers (e.g. Mashed Museum) into a more mainstream debate which also involves budget holders and strategists. These conversations have been aided by metrics from social media sites like Twitter which indicate that API access figures sometimes outstrip “normal web” browsing by a factor of 10 or more.

On March 4th 2009, Brooklyn Museum announced the launch of their API, the latest in a series of developments around their online collection. Brooklyn occupies a space which generates a fair amount of awe in museum web circles: Shelley Bernstein and team are always several steps in front of the curve – innovating rapidly, encouraging a “just do it” attitude, and most importantly, engaging wholly with a totally committed tribe of users. Many other museums try to do social media. Brooklyn lives social media.

So, as they say – without further ado – here’s Shelley and Paul talking about what they did, how they did it, and why.

Q: First and foremost, could you please introduce yourselves – what your main roles and responsibilities are and how you fit within the museum.

Shelley Bernstein, Chief of Technology. I manage the department that runs the Museum’s helpdesk, Network Administration, Website, gallery technology, and social media.

Paul Beaudoin, Programmer. I push data around on the back-end and build website features and internal tools.

Q: Can you explain in as non-technical language as possible what exactly the Brooklyn API is, and what it lets people do?

SB: It’s basically a way outside programmers can query our Collections data and create their own applications using it.

Q: Why did you decide to build an API? What are the main things you hope to achieve …and what about those age old “social web” problems like authority, value and so-on?

SB: First, practical… in the past we’d been asked to be a part of larger projects where institutions were trying to aggregate data across many collections (like d*hub). At the time, we couldn’t justify allocating the time to provide data sets which would become stale as fast as we could turn over the data. By developing the API, we can create this one thing that will work for many people so it no longer becomes a project every time we are asked to take part.

Second, community… the developer community is not one we’d worked with before. We’d recently had exposure to the indicommons community at the Flickr Commons and had seen developers like David Wilkinson do some great things with our data there. It’s been a very positive experience and one we wanted to carry forward into our Collection, not just the materials we are posting to The Commons.

Third, community+practical… I think we needed to recognize that ideas about our data can come from anywhere, and encourage outside partnerships. We should recognize that programmers from outside the organization will have skills and ideas that we don’t have internally and encourage everyone to use them with our data if they want to. When they do, we want to make sure we get them the credit they deserve by pointing our visitors to their sites so they get some exposure for their efforts.

Q: How have you built it? (Both from a technical and a project perspective: what platform, backend systems, relationship to collections management / website; also how long has it taken, and how have you run the project?)

PB: The API sits on top of our existing “OpenCollection” code (no relation to namesake at http://www.collectiveaccess.org) which we developed about a year ago. OpenCollection is a set of PHP classes sitting on top of a MySQL database, which contains all of the object data that’s been approved for Web.

All that data originates in our internal collections management systems and digital asset systems. SSIS scripts run nightly to identify approved data and images and push them to our FreeBSD servers for processing. We have several internal workflow tools that also contribute assets like labels, press releases, videos, podcasts, and custom-cropped thumbnails. A series of BASH and PHP scripts merge the data from the various sources and generate new derivatives as required (ImageMagick). Once compiled, new collection database dumps and images are pushed out to the Web servers overnight. Everything is scheduled to run automatically, so new data and images approved on Monday will be available in the wee hours of Tuesday.

The API itself took about four weeks to build and document (documentation may have consumed the better part of that). But that seems like a misleading figure because so much of the API piggy-backs on our existing codebase. OpenCollection itself – and all of the data flow scripts that support it – took many months to build.

Cool diagrams. Every desk should have some.

Q: How did you go about communicating the benefits of an API to internal stakeholders?

SB: Ha, well we used your hoard.it website as an example of what can happen if we don’t! The general discussion centered around how we can work with the community and develop a way people can do this under our own terms, the alternative being that people are likely to do what they want anyway. We’d rather work with, than against. It also helped us immensely that an API had been released by DigitalNZ, so we had an example out there that we could follow.

Q: It’s obviously early days, but how much interest and take-up have you had? How much are you anticipating?

SB: We are not expecting a ton, but we’ve already seen a lot of creativity flowing which you can check out in our Application Gallery. We already know of a few things brewing that are really exciting. And Luke over at the Powerhouse is working on getting our data into d*hub already, so stay tuned.

Q: Can you give us some indication of the budget – at least ballpark, or as a % compared to your annual operating budget for the website?

SB: There was no budget specifically assigned to this project. We had an opening of time where we thought we could slot in the development and took it. Moving forward, we will make changes to the API and add features as time can be allocated, but it will often need to be secondary to other projects we need to accomplish.

Q: How are you dealing with rights issues?

SB: Anything that is under copyright is being delivered at a very small thumbnail size (100px wide on the longest side) for identification purposes only.
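
As a rough illustration of that 100px rule, here is a small Python sketch that works out thumbnail dimensions so the longest side is capped at 100 pixels. This is not the Museum's code (the interview says their derivatives are generated with ImageMagick); the function name, the rounding and the "never upscale" behaviour are all assumptions.

```python
def thumbnail_size(width, height, max_side=100):
    """Scale (width, height) so that the longest side is at most max_side pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        # Assumption: images that are already small enough are left alone.
        return width, height
    # Multiply before dividing so exact ratios stay exact, then round;
    # clamp to 1 so a very thin image never ends up with a zero-pixel side.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height
```

A 2000×1000 image would come out at 100×50 under these assumptions, while an 80×60 image would be left untouched.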

Q: What restrictions do you place on users when accessing, displaying and otherwise using your data?

SB: I’m not even going to attempt to summarize this one. Here’s the Terms of Service – everyone go get a good cup of coffee before settling down with it.

Q: You chose a particular approach (REST) to expose your collections. Could you talk a bit about the technical options you considered before coming to this solution, and why you preferred REST to these others?

PB: Actually it’s been pointed out that our API isn’t perfectly RESTful, so let me say first that, humbly, we consider our API REST-inspired at best. I’ve long been a fan of REST and tend to gravitate to it in principle. But when it comes down to it, development time and ease of use are the top concerns.

At the time the API was spec’ed we decided it was more important to build something that someone could jump right into than something meeting some aesthetic ideal. Of course those aren’t mutually exclusive goals if you have all the dev time in the world, but we don’t. So we thought about our users and looked to the APIs that seemed to be getting the most play (Flickr, DigiNZ, and many Google projects come to mind) and borrowed aspects we thought worked (API keys, mindful use of HTTP verbs, simple query parameters) and left out the things we thought were extraneous or personally inappropriate (complicated session management, multiple script gateways). The result is, I think, a lightweight API with very few rules and pretty accommodating responses. You don’t have to know what an XSD is to jump in.
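
To make the "API key plus simple query parameters" style concrete, here is a minimal Python sketch of what a request URL in that spirit might look like. The endpoint and the parameter names (`api_key`, `keyword`, `limit`) are hypothetical placeholders and are not taken from the Brooklyn Museum documentation; check the real docs for the actual names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only. Not the real Brooklyn Museum URL.
BASE_URL = "https://api.example.org/collection"


def build_search_url(api_key, keyword, limit=10):
    """Build a GET URL with an API key and a couple of plain query parameters."""
    params = {
        "api_key": api_key,  # assumed parameter name
        "keyword": keyword,  # assumed parameter name
        "limit": limit,      # assumed parameter name
    }
    # urlencode escapes spaces and special characters for us.
    return BASE_URL + "?" + urlencode(params)
```

The point of the style is that a whole request fits in one readable URL: no session to set up and no envelope to construct, just a key and a handful of parameters.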

Q: What advice would you give to other museums / institutions wanting to follow the API path?

SB: You mean other than “do it” <insert grin here>? No, really, if it’s right for the institution and their goals, they should consider it. Look to the DigitalNZ project and read this interview with their team (we did and it inspired us). Try and not stress over making it perfect first time out, just try and see what it yields…then adjust as you go along. Obviously, the more institutions that can open their data in this way, the richer the applications can become.

_______

Many, many thanks to Shelley and Paul for putting in the time to answer my questions. You can follow the development of the Brooklyn Museum collections and API over on their blog, or by following @brooklynmuseum on Twitter. More importantly, go build something cool :-)

Categories: IT · api · collections · community · innovation · mashup · technology · web2.0

Introducing OneTag

March 24, 2008 · 15 Comments

You might have noticed I’ve been a bit quiet on the blog front for the last couple of weeks. This is because I’m having a drive to send some ideas partying and have therefore been knee-deep coding my latest project most evenings.

I’ve put together an idea for people who run conferences or events. It’s called OneTag (www.onetag.org). It’s very simple conceptually, although, as I’m discovering, a complete *dog* to code. The idea is that it aggregates all the “buzz” about a particular (live) event and then provides the means to view this in different ways. Find out more at http://www.onetag.org/ot/about.asp.

Usual “it’s a beta” disclaimers apply…

I’ve agreed with David Bearman and Jennifer Trant that I’ll be trialling the system during the Museums and the Web 2008 conference in Montreal.

I need your help…

First off, if you’re going to the conference and intend to blog, twitter or upload any photos then the global tag follows the same pattern as previous years and is therefore mw2008. If you’re blogging then just add this as a tag or category; if you’re twittering then please use the hashtag #mw2008 as part of your tweet.

Second, if you’re the owner of a blog or other social networking site, will be blogging about the conference and have feed addresses you can supply me with, then let me know in the comments or via email and I’ll add these to the OneTag aggregator.

Finally, if you’d like to get access to the mw2008 OneTag feeds and views to help me test them then do feel free to get in touch – again, via email if you know it or using the comments to this post. Alternatively, tweet me direct at http://twitter.com/dmje.
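
For anyone curious how the tagging convention above might be picked up by an aggregator, here is a minimal Python sketch. It is not OneTag's actual code; the item structure (`tags`, `text`, `published`) and the function names are made up for illustration.

```python
import re


def mentions_event(item, tag="mw2008"):
    """Return True if an item carries the event tag or the matching hashtag."""
    tags = {t.lower() for t in item.get("tags", [])}
    if tag.lower() in tags:
        return True
    # Match "#mw2008" as a whole hashtag, so "#mw20081" does not count.
    pattern = r"(?<!\w)#" + re.escape(tag) + r"(?!\w)"
    return re.search(pattern, item.get("text", ""), re.IGNORECASE) is not None


def aggregate(items, tag="mw2008"):
    """Keep only items about the event, newest first.

    Assumes each item has a 'published' value that sorts chronologically,
    such as an ISO date string.
    """
    matching = [item for item in items if mentions_event(item, tag)]
    return sorted(matching, key=lambda item: item["published"], reverse=True)
```

Blog posts would match on their tag or category, tweets on the hashtag in their text, which is why both checks are there.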

I’m at the stage where as many critical eyes as possible are going to help muchly…

Thanks in advance!

Categories: blogging · conference · content · mashup · museum

The progress of content

January 8, 2008 · Leave a Comment

I’m just helping Brian Kelly author a paper on Openness in Museums for the Museums and the Web conference later in the year. It just struck me that the movement of content around the web has followed / is following a pattern a little bit like this:

Phase I: content held as HTML within sites. Little or no interoperability. Content mostly viewed “on site”

Phase II: content held as XHTML within sites. Better markup means better SEO. Better SEO means that content starts to find its way out to the wider web

Phase III: content held as XHTML but also key bits of content (news in particular) syndicated out via RSS

Phase IV: content held as XHTML/XML; key segments syndicated via RSS (and some RDF) but additional movement of data via some “islands” of additional functionality such as APIs.

Phase V: content held as XHTML/XML, some/all syndicated via RSS, RDF, APIs but additional standards (OAuth, OpenSearch, Microformats etc) begin to ensure further interoperability between disparate sites

It’s a bit of a brain dump and please feel free to take it apart in the comments, but I thought I’d share it with you :-)

I’d say most big commercial sites are firmly at Phase III but moving towards IV; museums are mostly at Phase II but moving (slowly!) towards Phase III…

Categories: api · content · mashup · web2.0

Open Education search

September 5, 2007 · 3 Comments

As some of you might remember, I put together www.museumcollections.org.uk a while back to demonstrate what could be done for collections searching with next to no cash (a fiver to register a domain), time (20 mins, tops) or effort (cut and paste). Underneath this is Google coop, an implementation of the big G’s search engine which lets you search across multiple websites. In this particular example, I added a bunch of domains or sub-domains featuring museum collections and also asked people (so far about 20) to contribute if they wanted to add further domains to the list.

O’Reilly Radar posted last week about Open Education Search, a collaboration to “build a web search portal dedicated to open educational resources”. There is more about the project on this later post, but it looks as if it will make extensive use of Custom Search, another offering in the bewildering array of free search services provided by Google.

For me, the interesting thing about using Google coop is that it sets the bar for cross-domain collections searching, and automatically challenges any institutions considering the various approaches favoured in the past (such as, for instance, Z39.50) to come up with something better.

There have been rumblings in the pipeline for as long as I can remember about a national (or international) search engine for museum collections. Pretty much everyone agrees we’re in a ridiculous place right now: you have to know which institution to go to in the first place to then do the searching for the thing you’re interested in. There is no central place for finding all the Babbage-related collections on the web, for example, except for – oh, hang on – Google.

“…wait!…” shout the hardcore metadata types, “…Google apps doesn’t provide our users with the granularity they require: we need it to be better!”

Well, here’s a proposed solution to that problem: instead of a bunch of museums getting together and spending the next five years (and equivalent vast sums of money) arguing about standards, interoperability etc, before eventually self-imploding and deciding it’s all too much like hard work, how about we club together and buy a Google Enterprise or two (~£15k education price, I believe) and point it at each of the collections websites. Tweak the results, pay a designer £5k for the end result, buy a domain?

I’m being slightly fatuous (imagine!) but there’s a serious point here: Google does search really, really well, so why not use it? Yes, it’s “brute force” searching, but nothing – nothing – has come even close yet to doing it better. This is a perfection gene issue: I vote for cheap, cheerful and 90% perfect (and actually getting it done) rather than 99% perfect and still being here, £3m worse off and with nothing else to show in a few years’ time.

So. Anyone got £15k?

Categories: collections · mashup · museum · search · web2.0

Freebase is live

August 26, 2007 · Leave a Comment

Freebase has now opened its doors to anyone, at least for those who just want to browse and search. Looks like you’ll have to wait a while longer if you’re wanting to contribute. I’m still really interested in what Freebase brings to the party; how it compares and is different to Wikipedia – but most of all what such an open API can do for those of us mashing up data from across the web. When I get time (in about 2028 at this rate..) I’ll have a long hard look at their API and try a few ideas…

Meanwhile, there are some lovely mashups already built – see for example CineSpin which is not only elegant and rather beautiful to look at but also extremely content rich, and (gasp) useful, too. There are more examples here.

Categories: api · community · folksonomy · mashup · web2.0

Netvibes universe for NMSI

August 2, 2007 · Leave a Comment

I got involved with Netvibes a long time ago, first as a user and then briefly when I helped them out with some dodgy English translations. That’s how I came to be invited to set up a Netvibes Universe before the beta was opened to the public.

If you haven’t used or come across Netvibes, you’ve been seriously missing out on a major productivity improver – it’s essentially an Ajaxified tabbed start page which lets you embed feeds, calendars, video and searches into one place. What’s more, the recent opening up of the universal widget API now means that developers with little more than XHTML skills can create widgets that do pretty much anything – you could have (and at some point I’m going to build) a search box for your collections, for example. There are vast quantities of widgets available.

Netvibes have recently announced the concept of the Netvibes Universe – a place where institutions, groups or societies can set up a page with a specific focus, embedding feeds and so on that are specific to that particular field of interest. Once you’ve created a Netvibes account you can add Universes to your page. It’s all a bit difficult to explain but should make sense once you’re there…

The National Museum of Science and Industry Netvibes Universe can be seen at www.netvibes.com/nmsi – I’m still adding content and playing with what we can do with this space but I think it’s an interesting slant on using content which is about but not necessarily generated by a particular institution.

It’ll be interesting to hear what you think.

Categories: experimental · innovation · mashup · museum · web2.0

Why am I learning this stuff?

July 30, 2007 · 5 Comments

As I mentioned in a previous post, I smuggled my PC on holiday and had a go at learning Ruby on Rails. I’m not going to spend much time talking about what I think. Suffice to say, I had fun starting from knowledge = nil and gently climbing up the learning curve towards knowledge = 0.1. And yes, Ruby is obviously very cool.

After two weeks of having a go at this stuff on an evening, I thought I’d sit back and re-assess exactly what I was doing, and why.

The lowdown is this:

> I’m an average programmer (possibly slightly above average because of my extreme / anally retentive programming neatness) in ASP/VbScript
> There was a day when I did this as part of my day job, but not any longer. Now, I just send emails a lot.
> I have been able to solve most web application problems the world has thrown at me.
> I know well that VBScript/ASP has a limited lifespan, won’t be supported by MS for much longer and is laughed about in “real” developer circles.
> I accepted a looong time ago, grudgingly, that I’ll never know everything about everything. Maybe.

So apart from having a nice time learning something (and there’s a lot to be said for that without needing another reason), what exactly am I doing, trying to learn this new – completely new – way of coding?

Well, I have aspirations to retire any day now, and want to wake on a morning and know that I’ve truly made it by having to swim ashore to do my weekly shop. I intend to make this happen either with an extraordinary spate of deeply cunning crime or by building a web business that is so completely cool that no-one else will be able to touch me. The web business seems more sensible (for now at least) and I’ve got enough ideas to keep knocking them out faster than I need to. And here’s the crux. Bring on the graph…

Technical learning graph

[I love a graph almost as much as I love a diagram]

Here’s what’s going on. Up the Y axis is technical proficiency from zero at the bottom to frightening at the top. Along the X axis is time. Now think about me and my knowledge of VbScript. It’s up there – granted after a couple of years’ worth of hard slog – but I can do stuff in it, so my technical proficiency is pretty high. My knowledge might actually still be increasing so the line should be off-horizontal but there is also a threshold implied by its flatness.

Next up in this case is Ruby, but it could be any serious web development framework/language. When I started learning Ruby, I knew nothing, but you’d hope my proficiency would increase with time. So there’s a gradient to it, which is me learning.

Now, there are also two thresholds which I have called the Production threshold and the Prototyping threshold. The Prototyping threshold is the point at which I’d feel confident enough to knock out some alpha versions, do some user testing, play about with the look and feel. I’d probably still be using an Access database under the hood though, and the code would be terrible. In other words, if I got more than three visitors my application would probably emit a funny smell and die on its arse. The Production threshold is the place that serious developers want to be (actually, really serious ones want to be at the Geek threshold which isn’t shown here. You know the people I mean…)

What I’m coming to realise is that I’ll probably never now be the person who writes production-level code – instead, I’ll pay, bribe or beg someone else to do it. So for my particular purposes the Production threshold and anything beyond is an academically interesting thing but actually nothing more than that. Instead, I’m only ever going to knock out experiments and demonstrations for ideas which other people can then take and make production-level. I don’t actually care how fast, well compiled, transportable, modern or supported my prototyping language of choice is. In the slightest.

The time it takes to get me to Point A is the only thing I’m interested in. And because that green Ruby line has a gradient to it, the time to Point A is unfortunately not zero (bring on the Matrix – “I know Ruby…”). The time to get to the Prototyping threshold for VbScript has already passed. I’m there, I’ve learnt it.
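
If you want the graph as arithmetic, here is a back-of-an-envelope Python sketch. It assumes proficiency grows in a straight line (which it certainly doesn’t in real life), and the units and numbers are invented purely for illustration.

```python
def time_to_threshold(current, threshold, rate):
    """Time needed to reach a proficiency threshold, assuming linear learning.

    current   -- proficiency now (arbitrary units)
    threshold -- proficiency needed, e.g. the Prototyping threshold
    rate      -- proficiency gained per unit of time (the gradient of the line)
    """
    if current >= threshold:
        return 0.0  # already past the threshold: nothing left to learn
    if rate <= 0:
        return float("inf")  # a flat line never gets there
    return (threshold - current) / rate
```

With made-up numbers: a language I already know sits above the Prototyping threshold, so the answer is zero, whereas starting a new one from scratch with a threshold of 0.5 and a gradient of 0.25 per month gives two months before I can prototype anything.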

Uncomfortable though it is, I’m therefore going to dump Ruby. She’s very fine, but I’ve got too much to do, and not enough time to do it. VbScript will have to do…

Categories: experimental · innovation · mashup · programming · web2.0

Museum directory v2.0

July 3, 2007 · 3 Comments

In my previous post about the “museum directory” I built at UK Museums on the Web mashup day, I mentioned a museum address CSV file from the 24hr Museum which I planned to put to use at a later date.

The original source I had contained *really* dodgy data and only about 380 institutions – I’d done some seriously horrible hacking to get it out of various APIs – but the new feed is derived from the 24hr Museum “Direct Data Entry” (DDE) system. This contains around 3,800 entries and is therefore much more interesting as a dataset.

24hr Museum have asked that I don’t expose the KML file at this time, so what you see is the museum directory as it was in version one but with more, and more accurate, data. Version 2 is pretty much the same code and approach as the original – read about how I did it here.

The new data set still didn’t contain geo references, so I had to re-hack the original postcode script to query the Google AJAX API on a running basis and write the lat/long back to the database. That hurt a bit – nearly 4,000 queries take a long time, especially when postcodes weren’t found. This slightly manual approach, together with some discrepancies in the CSV that I had to deal with by hand, led me to add the disclaimer on the page about accuracy – nothing to do with the original data…
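
For the curious, the general shape of that "look up each postcode and write it back" loop is sketched below in Python. This is not the script I actually used (that was a hack around the Google AJAX API); the table layout, the column names and the `geocode` callable are assumptions, and the lookup is passed in as a function so the sketch doesn’t pretend to know any particular geocoding API.

```python
import sqlite3


def geocode_missing(conn, geocode):
    """Fill in lat/lng for rows that lack them.

    conn    -- a sqlite3 connection with a hypothetical 'museums' table
               (id, postcode, lat, lng)
    geocode -- any callable taking a postcode and returning (lat, lng),
               or None when the postcode can't be found
    Returns (number_updated, ids_not_found).
    """
    rows = conn.execute(
        "SELECT id, postcode FROM museums "
        "WHERE lat IS NULL OR lng IS NULL ORDER BY id"
    ).fetchall()
    updated, not_found = 0, []
    for museum_id, postcode in rows:
        result = geocode(postcode)
        if result is None:
            not_found.append(museum_id)  # leave these for fixing by hand
            continue
        conn.execute(
            "UPDATE museums SET lat = ?, lng = ? WHERE id = ?",
            (result[0], result[1], museum_id),
        )
        updated += 1
    conn.commit()
    return updated, not_found
```

Because only rows without coordinates are selected, the script can be re-run after a failure without repeating the lookups that already succeeded, which matters when there are nearly 4,000 of them.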

Anyway, enough tech rubbish. Go play with version two and let me know any ideas or thoughts you have. I’m already thinking about the next version which is gonna be a whole lot more exciting, functionality-wise…

Categories: conference · experimental · innovation · location based · mashup · museum · mw2007 · programming · technology · web2.0

Thought clarification: JUST DO IT but FOR A REASON

July 2, 2007 · 9 Comments

A long and interesting thread broke out on the Museums Computer Group mailing list today about how museums could use Facebook to their best advantage. As I said on the thread – although the question about how Facebook deals with organisations vs individuals is interesting, the key question to me is what we’re trying to get out of having a presence on social networking sites.

Although I spend a lot of time going on about how we should “just do it” (good tagline, that. Shame it’s been claimed by a global corporation of dubious ethics..), I’m also well aware that museums aren’t immune from the hype curve either. The suggestion we should “do something with Facebook” throughout the thread is terribly reminiscent of many requests I’ve had to “do web 2.0”. The conversation usually goes like this:

——————

Web team office, early morning. Somewhere a phone rings.

Web Team: “good morning, this is your friendly web team. how can I help?”

Important Person, usually somewhere high up in the organisation: “we need a blog/discussion board/wiki/podcast/facebook account/mobile website/[insert other new tech thingy here]”

WT: “why?”

IP: “because I read an article in the Guardian on Saturday and it’ll improve our productivity/sales/grooviness. Besides, it’s free”

WT: “what do you want to say on your blog/discussion board/wiki/[...you get the picture...] ?”

IP: “why does that matter?”

WT: “who is your audience?”

IP: “the kids, of course. da street. da yoof. innit?”

WT:

IP: “right, I’ll hope to see some serious re-alignment of our visitor figures by, say, a week Wednesday. I is expectin’ big fings in da hood. Bitchin’.”

——————-

There’s a fine line of course between what I push for – technology growth, user understanding, fast to market, flexible applications – and the Important Person’s vision. This is a subtle game, and one which often causes concerns.

I see it like this:

> the mashup environment is about playing with technology – it is therefore partially technology driven (a bad thing) but also understands and builds on content and data from disparate sources in the hope that the thing which pops out at the end is useful (a good thing). It relies on a Darwinian process to determine what works and what doesn’t: if your users like it, they’ll take to it and it’ll succeed.

> the drive to make things happen – the push which I believe museums should be making to be more leading than lagging – should always come out of user centred design. Websites should come from a user need. Ultimately, they should fill a hole in people’s lives. The bitter pill to swallow is that the needs of the institution aren’t always the needs of the user, and that’s where conversations like the one above start to cause pain.

Sometimes the needs of the institution do match (or can be bent so they match) the needs of the end user – this is when the best things happen. Take for example the fabulous English Cut blog – a fascinating look into the otherwise closed world of the Savile Row tailor. Hugh MacLeod helped put this together and he writes wonderfully about the value of the “micro smarter conversation” vs the value of the “macro brand metaphor”.

This is where web teams need to be incredibly savvy about what is out there and how to make this stuff happen. Actually, the conversation above should have a moment where Web Team gets in quickly with “Good plan, Mrs Important Person. How about a personal blog written by X about the way in which we Y”, thereby cutting off any possibility that you’ll “just do it” in the wrong direction with some god-awful corporate nonsense.

So….should museums be on Facebook? Yes, probably, if that presence does something interesting and motivating for users. Should museums be on Facebook just because it’s there? Obviously not.

Categories: design · experimental · innovation · mashup · museum · social networks · technology · ugc · usability · web2.0

Guest post on UK Web Focus

July 1, 2007 · Leave a Comment

Brian Kelly asked me to do a guest post on his UK Web Focus blog. You can read the post – “Go forth and mash” here.

Categories: experimental · mashup · programming · ukmw07 · web2.0