Archive for the ‘Matt's Adventures and Musings’ Category

RIRI Day Two: Richard Green on Institutional Repositories

Tuesday, August 12th, 2008

At the moment, I’m witnessing Richard Green from University of Hull masterfully dissecting the notion of an Institutional Repository.  It’s a treat to have someone spell this stuff out step by step from such a grounded perspective.  One wonderful element of his presentation was to simply leave some time for people to explore ePrints and DSpace repositories [1][2][3] (from the perspective of public end users).  He made the point that people, myself included, often work with only one repository system (or no repository system) and neglect to simply explore the existing options.

In the midst of his presentation about the RepoMMan project, Richard posed an interesting pair of questions regarding the prospect of giving users a private “My Repository” space for managing their stuff.  He asked us:

  1. What might a user want to get from “My Repository”?
  2. What might a user want to put into “My Repository”?

He allowed the room to ponder these questions for a while.  I must admit that I was left doubting my knee-jerk responses and in turn thinking a bit further about what users really want from systems like this.  Richard then reported that a survey of his users at University of Hull provided a resounding response.  His users wanted:  Storage (safe, backed up), Access (easy and from anywhere), Management (full version control), and Preservation (to know stuff is there when they want it, short and long term).  I found this to be much more straightforward than the responses I expected.

Richard then gave us a tour of the RepoMMan interface.  Two key characteristics of the system stood out.  The web interface, which is implemented in Flex, mimics an FTP client (to provide familiarity).  The metadata editor uses Data Fountains to pre-populate objects with automatically generated metadata, so that users can review and revise existing metadata rather than starting from a blank form.

The presentation will continue this afternoon.  By the end of the week, Richard’s full slide deck for the presentation will be up in the RIRI repository.

At RIRI: The Red Island Repository Institute fires up

Tuesday, August 12th, 2008

The Red Island Repository Institute (RIRI), hosted by the University of Prince Edward Island (UPEI), has started with a bang.  Sandy Payette spent an entire day feeding the room with a wonderful mix of vision, software architecture, social context, and technical details.

Mark Leggott has put together a great event. There are people here from all over North America, and even one visitor from Australia.  Everyone has been enjoying the beautiful environs of Prince Edward Island, and the quality of information being exchanged is top-notch.  I particularly like the fact that Mark is “drinking his own kool-aid” by setting up a Drupal/Fedora site for the institute.

This should be a great week.

In Boston, reading The Register

Wednesday, August 6th, 2008

I’m in Boston at the moment.  I’m hanging out with my sister’s pitbull today while I prepare for the Red Island Repository Institute.

This morning I added The Register to my RSS subscriptions.  I’m a bit intimidated by the volume of content that the feed puts out, but the info is just so darn tasty.

Fedora Solutions Integration Council

Thursday, July 3rd, 2008

Picking up from the ideas in The Missing Sync for Fedora Commons, I’ve been talking with Thorny and Sandy at Fedora Commons about creating a Fedora Solutions Integration Council.  We haven’t quite figured out the structure of it, but the ideas are coming together pretty quickly. Bottom line, the council’s responsibility is to help everyone make informed decisions and support each other’s work.  

 As a first stab, I’m putting effort into three things:  

  1. bring together the streams of communication (e.g. blogs, IRC)
  2. help projects find and connect with others who are doing similar work
  3. identify the major themes: problem areas, innovations, exciting solutions, etc.

Ultimately, I hope this will allow us to shed light on the various avenues of exploration in Fedora-centric application development.  So many people are doing such interesting and exciting work.  It’s time for us to talk more openly and enthusiastically about it.

The other Fedora Solutions Councils are organized around themes like eScience, Museums, and Education.   In contrast, the Integration Council is aimed at addressing the cross-cutting concerns of application development.  We all have to deal with things like access controls, scalability, and workflow.  The best solutions to these types of challenges are often applicable in many contexts, regardless of whether you are an eScience project or a small humanities archive.  Our aim is to get as much information flowing between developers as possible.  I want to let developers decide for themselves which ideas apply to their work.

Watch this space. 

The Missing Sync for Fedora Commons

Thursday, July 3rd, 2008

Last month I picked up a Palm Centro. I quickly discovered that you can’t sync Apple’s calendar and address book applications to Palm OS without a $50 product called “The Missing Sync”. Within a week I had exchanged my Centro for a small, black Samsung dumbphone.

Since then, the topic of synchronization has come up repeatedly in my life.

On my laptop, I’m finally looking into synching Apple iCal with Google Calendar.

At home, I’ve started using Dopplr to figure out travel plans with my family and friends.

In my work life, I’ve started recognizing the fact that I actually play a sync role in the Fedora Commons community. I’m passionate about helping people use Fedora, so I’m constantly asking developers “How did you do that?” or “What went wrong? How did you fix it?”. This has naturally led me to conversations where I find myself saying “Oh! You should talk to XXX project about the work that they’re doing. It’s right down your alley.” or “I think that someone has already solved that problem. Let’s ping the fedora-users list before we reinvent a wheel.”

I like this new theme. It fits with the way I want to operate in the world.

Fedora Commons is a community-driven project. The team in Ithaca has taken great strides to stabilize and facilitate community process. [In fact, the footwork and brainwork that Sandy Payette has done behind the scenes this year is fascinating, but that’s a topic for another post.] They now have a Chief Architect (Daniel Davis), a Director of Communications (Carol Minton Morris), and a Director of Community Strategy (Thornton Staples). When these three talented people joined Fedora Commons, I thought “Phew! Problem solved.” What I didn’t realize was that there is still a missing link.

I’ve learned that there is only so much that a centralized organization can do to synchronize community efforts. Ultimately, you still need people who slosh around in the morass of innovations, workarounds and hacks in order to find those gems of best practices and well designed solutions. More importantly, you need those people to put momentum behind the good ideas and ensure that they filter back into the common pool.

Until this month, I had not realized how important this is to community-driven open source software development. There are tons of projects out there that are more than happy to collaborate, to share ideas and solutions, and even to contribute code. However, one thing is consistently true about these projects: their hands are full. They rarely have time to look over each others’ shoulders and trade notes, let alone figure out how to share their code.

There are, of course, notable exceptions to this rule. For example, Gert Pedersen has done an admirable job of maintaining GSearch and making it generally useful for everyone. Whenever a new use case or problem crops up, he usually has a solution on SourceForge within a few weeks.

What about all of the other work that people are doing?

As of late, projects have started inviting me to play an advisory role in their Fedora work, to be their missing sync tool. I’m really excited about this because ultimately it means that I have an opportunity to help more people play to their strengths. I hope that by playing this role, I can help ensure that more great solutions find their way directly into Fedora itself while other solutions join the constellation of tools, services, and documentation that populate the Fedora Commons galaxy.

Visual Language and Content Repositories

Monday, May 19th, 2008

This video by Dave Gray bubbled up on swissmiss the other week.  For those who are curious why MediaShelf’s team has such a strong UX/design emphasis, this introduction to the topic of visual language provides a great explanation.  

As we accumulate and preserve massive volumes of rich, complex information in content repositories, we have to find new ways to represent and interact with that content.  Our culture has barely begun to scratch the surface of the possibilities here.  We are engaging in a new and exciting field of inquiry, one which focuses on putting more information in the hands of end-users in increasingly powerful ways.

Thinking about developer happiness at JA-SIG

Monday, April 28th, 2008

Five years ago developers spent a lot of time speaking SQL when they talked about writing a database-driven app. Since then, we have enjoyed the arrival of modern webapp frameworks with good ORM. Now developers spend very little time talking about SQL. Instead, they talk about higher level problems and application-specific challenges. In other words, we are able to spend developer resources in more potent ways. This has played a major role in the recent upsurge of innovative, user-driven apps.
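To make that shift concrete, here is a minimal sketch in Python using only the standard library. It is not taken from any particular framework: the `Article` and `ArticleStore` names and the `articles` table are hypothetical, and the hand-rolled mapper merely stands in for what a real ORM would generate. The point is that the SQL lives in one place, while the application code at the bottom speaks only in terms of objects.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Article:
    """Plain object that application code works with (hypothetical example)."""
    id: int
    title: str


class ArticleStore:
    """Tiny hand-rolled mapper standing in for a real ORM layer."""

    def __init__(self, conn):
        self.conn = conn
        # All SQL is confined to this class.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS articles "
            "(id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )

    def create(self, title):
        cur = self.conn.execute(
            "INSERT INTO articles (title) VALUES (?)", (title,)
        )
        self.conn.commit()
        return Article(id=cur.lastrowid, title=title)

    def find_by_title(self, title):
        rows = self.conn.execute(
            "SELECT id, title FROM articles WHERE title = ? ORDER BY id",
            (title,),
        ).fetchall()
        return [Article(id=row[0], title=row[1]) for row in rows]


# Application code: no SQL in sight, only objects and method calls.
conn = sqlite3.connect(":memory:")
store = ArticleStore(conn)
store.create("Hello repositories")
found = store.find_by_title("Hello repositories")
print(found[0].title)
```

Once a layer like this exists, conversations tend to move from "what query do I write?" to "what should the application do with an `Article`?".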

Right now I’m sitting in Christopher Brown’s JA-SIG presentation about writing a Fedora App in ColdFusion. Christopher has done valiant work. He’s a trailblazer. More importantly, he has a functioning application that is now in active use. However, I can’t help but feel like we’ve backpedaled five years in terms of developer experience. Christopher’s slides are dominated by Fedora-specific structures and the terminology from Fedora’s APIs. I feel like I’m back in SQL land. Being forced to think about this boilerplate code is an unnecessary burden for developers. It prevents them from fully taking advantage of Fedora’s power.

Now that we’ve had RubyFedora in hand for a few weeks and have been playing with ActiveFedora for a while, it’s really encouraging to be reminded what the alternative is. I’m so eager to set free developers like Christopher, to let them forget about the boilerplate code, so that instead they can invent new ways of helping users do crazy stuff with their digital content.

A Second Presentation at JA-SIG

Friday, March 21st, 2008

In addition to presenting on How We Integrated Fedora into Ruby On Rails, and How You Can Use It, I will now be giving a second talk at the JA-SIG Conference in late April. The second presentation is titled How we created a REST API for Fedora, Why we did it, and What it gives you. Here is the abstract:

The latest release of Fedora contains a new experimental RESTful API. MediaShelf created this API and contributed it to Fedora Commons in 2007. Matt Zumwalt, MediaShelf’s lead software architect, will present this API and its features. He will explain how it works, why we designed it, and how you can take advantage of this convenient interface.

The new REST API is included in Fedora 3.0 Beta1. You can download a copy of Fedora from the Fedora Commons website.

Upcoming Presentations at OpenRepositories and JA-SIG

Monday, March 17th, 2008

I will be presenting on How We Integrated Fedora into Ruby on Rails, and How You Can Use It at two conferences this spring. You can see the talk at OpenRepositories in Southampton, England on 03 April or at the JA-SIG Conference in Minneapolis, MN on 28 April. Here’s the abstract:

Breathe a sigh of relief, and let yourself daydream a bit. Developing Fedora-centric applications just became a lot easier.

MediaShelf (http://yourmediashelf.com) has created software libraries that allow developers to treat Fedora repository content as Ruby objects. Even better, we have made it so you can use these objects natively within Ruby on Rails. Matt Zumwalt, MediaShelf’s Lead Architect, will demonstrate how this graceful union was achieved.

Matt will start by reviewing Ruby on Rails fundamentals and showing how the new RubyFedora and ActiveRepository libraries fit into Rails. He will then walk you through a simple Rails application that uses these libraries. The session will conclude with a look at some of the underlying code and a discussion of the current development status for this work.

We will be putting out a pre-release of the RubyFedora and ActiveRepository libraries in conjunction with these presentations.

Account of the “Unpacking Fedora 3.0” Workshop

Friday, February 22nd, 2008

07 February 2008
London, University of London, Birkbeck College

I was invited to give a public “Unpacking Fedora 3.0” workshop at University of London this week. The workshop was hosted by the Bloomsbury College Consortium at U of London and by the Common Repository Interfaces Working Group (CRIG). Over the past two years, I have given a number of “Intro to Fedora” trainings. In contrast with those workshops, I saw this as an opportunity to skip the basics and delve into the more “advanced” topics in using Fedora. However, as the list of attendees began to grow, we quickly realized that we wouldn’t be able to entirely skip the basics. Instead I covered a broad spread of topics from initial deployment to pulling apart components of the new Content Model Architecture (CMA).

The workshop was really enjoyable. I think everyone went home with a hearty amount of new information to run with, as well as some new ideas.

By the time the day of the workshop arrived, so many people had registered that we couldn’t fit in our reserved classroom. Instead, we were moved to the Birkbeck Bar, a little pub within Birkbeck College. It fits with CRIG’s Barcamp-inspired style that we set up a projector in the bar and pulled up an assortment of tables, chairs and bar stools. The day started somewhat formally, with PowerPoint slides and a lecture. By lunch, people were circulating around to each others’ laptops and helping each other try things out. By the end of the day, when the regular bar clientele started to show up, our laptops went into our bags and all of the new information we had discussed dissolved into a cacophony of excited discussions about people’s new ideas, their current work, and their observations about Fedora 3.0.

Initially, we set out to cover three topics at the workshop. We wanted to start by showing people how to launch an Amazon EC2 instance with Fedora 3.0 pre-installed. After that, we were to play with Fedora’s new RESTful API, which MediaShelf created in 2007 [in collaboration with Digital Innovation South Africa]. Finally, the second half of the day was dedicated to the new Content Model Architecture.

In the final days before the workshop, I realized that there was no way to cover all of this information without first spending some time discussing the fundamental principles and concepts that are at play in Fedora. Essentially, I view Fedora as a conceptual framework and an architecture. The actual software that we deploy on our servers is just an implementation of that architecture, and it’s a work in progress. In light of this, I opened the day by giving an overview of the Fedora vision as I understand it.

It turned out that nobody was interested in tinkering with Amazon EC2 that morning, but about half of the attendees had arrived without the necessary version of Fedora installed on their laptops. It’s a testament to Fedora’s flexibility and relative maturity that it took only 45 minutes for an entire room full of people, some with no real system administration experience, to perform custom installs of Fedora 3.0 beta on three different operating systems (Windows, Mac OS X and Ubuntu Linux).

Presenting the REST API was a really rewarding experience for me. I have been convinced of the API’s importance from the first moment I conceived of it, but I was never fully sure how helpful it was until this workshop. There was a palpable moment while I was explaining the API where the idea seemed to click; suddenly people were thinking “Hey, this makes sense! Is Fedora really that easy?” When I finished explaining the basic concepts and showing a few examples, everyone downloaded a text file containing a list of sample curl commands that you can use to explore the REST API. Our pizza lunch arrived and the subsequent hour and a half turned into a free-for-all. Between the food and the gratification of interacting with Fedora in such a tactile way, everyone was in good spirits by the time we settled down to spend the afternoon chatting about the CMA.

Fedora’s Content Model Architecture (CMA) is a notion people have been tossing around for a while now, and Fedora 3.0 will contain the first public implementation of it. I’ve been watching the conversation around the CMA since it was initially proposed under the name CMDA (Content Model Dissemination Architecture) in March 2006. Since then, I’ve followed the conversations about the CMA amongst Fedora Commons developers, and even chimed in a few times.

When I sat down to prepare this week’s presentation on the CMA, I started by reading the page about the CMA in Fedora 3.0 Beta’s Documentation. Much to my excitement, I found a visionary overview of the current state of this technology, its intentions, its implementation, and the open, community-driven process by which it is being developed. Much of the brilliance in that document is quiet and understated, or implicitly refers to the really gutsy topics. Understated as it is, I’m pretty sure that the Fedora Commons team has done a solid job of laying the groundwork for the next five years of innovation in Digital Asset Management.

Above all else, the Fedora 3.0b1 documentation sends a clear invitation to participate. Fedora Commons is actively inviting the user community and developer community to engage in the process of designing and testing this innovative new architecture. This message is stated clearly at the opening of the document:

“Please install the software and give us feedback. Be a contributor to Fedora Commons. You do not need to write any software. It is just as important to share your requirements, thoughts, observations, and, particularly, any defects you can identify in this software.”

This made the workshop really easy to plan. I spent about 45 minutes sharing my understanding of the CMA, how it works, what it’s trying to achieve and why. Then we spent the rest of the afternoon getting our hands dirty. We created the objects from Example 3 in Fedora’s Tutorial 2 and then pulled them apart. By the time we finished that, the room was roiling with ideas and questions. We wrote the major topics on the wall, broke up into smaller groups and plowed through them. Some people went through the Tutorial more thoroughly. Others tried tweaking the sample bMech object to use one of its own datastreams as an XSL transform. Another group unpacked the problem of translating file hierarchies into Fedora objects, wondering what implications it carries for content modeling.

Based on the experiences of the breakout groups, we collected some really solid ideas and feedback. I’ll post that information to the fedora-developers mailing list soon.

By the end of the day, Ben O’Steen had written a wrapper in Python for the entire REST API. I couldn’t have asked for a clearer confirmation that the work we did to create that API was worth it.
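A wrapper of that kind can be quite thin, which is part of why a REST API is so inviting. The sketch below is not Ben’s code, and it does not reproduce the actual routes of Fedora’s REST API, which this post doesn’t list. The base URL, the `objects/<pid>` path layout, and the `RepositoryClient` class with its method names are all hypothetical placeholders. It only illustrates the general technique: each repository operation becomes a method that builds an HTTP request, and nothing is sent over the network here.

```python
from urllib.parse import quote
from urllib.request import Request


class RepositoryClient:
    """Hypothetical thin wrapper; paths are placeholders, not Fedora's real routes."""

    def __init__(self, base_url):
        # Normalize so that path segments can simply be appended.
        self.base_url = base_url.rstrip("/") + "/"

    def _url(self, *segments):
        # Percent-encode each segment so identifiers containing ":" are safe.
        return self.base_url + "/".join(quote(s, safe="") for s in segments)

    def get_object(self, pid):
        # Assumed layout: GET <base>/objects/<pid>
        return Request(self._url("objects", pid), method="GET")

    def delete_object(self, pid):
        # Assumed layout: DELETE <base>/objects/<pid>
        return Request(self._url("objects", pid), method="DELETE")


# Build (but do not send) a request for a made-up object identifier.
client = RepositoryClient("http://localhost:8080/fedora")
request = client.get_object("demo:1")
print(request.get_method(), request.full_url)
```

Passing one of these `Request` objects to `urllib.request.urlopen` would actually perform the call against a running server, once the placeholder paths are replaced with the routes from the API documentation.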

As we approached 6 pm, the people who had come over from the continent for the day began to depart and the workshop dissolved into an assortment of jovial conversations. It was a great conclusion to a very informative day, and the perfect predecessor to the next day’s Barcamp-style CRIG gathering (watch for my separate post on that meeting).