Towards a Global Infrastructure For Sharing Learning Resources

From Creative Commons
Revision as of 19:30, 7 January 2010 by Nathan Yergler


This document aims to help those who ask, "What should I be doing when I publish my OERs so that they are searchable and discoverable by the OER community and (perhaps) the whole world?" More specifically, it is aimed at those who hold a collection of such OERs.

Most of the fundamental questions have several possible answers, but luckily the number of options is quite limited. For nearly all producers we recommend adopting one of the choices outlined below. These choices are widely adopted and deployed, which increases the reach and exposure of your OERs. We will develop bridges between the different approaches; indeed, many of these bridges already exist.

In essence, the choices boil down to:

  • If you are looking for a repository to store the OERs and make them available to the world, then you can contact ARIADNE or Connexions, who will gladly do so on your behalf or provide you with the code to run your own repository.
  • If you have your own repository, then we strongly suggest that you make your content and metadata available for harvesting. As an alternative or in addition, you can make your repository available for federated search. We also strongly suggest that you register your repository in a registry, so that other services can discover it.
  • For your content, we make no specific assumptions: HTML documents, OpenOffice documents, Microsoft Office documents, MPEG video clips, PDF documents, MP3 sound files, etc.; anything goes. Of course, in the spirit of open educational resources, we certainly encourage you to publish in as open a format as feasible.
  • For the metadata, we strongly suggest using either Learning Object Metadata (LOM) or Dublin Core. For specific niche domains, other formats, such as MPEG, apply as well. You may also define a so-called 'application profile' that specifies the requirements your repository imposes on metadata, so that you can enforce them. In concrete terms, such metadata can be expressed as XML or RDF.
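
To make the last point concrete, here is a minimal sketch of a simple Dublin Core description serialized as XML, using only the Python standard library. The namespace URIs are the ones used for the "oai_dc" format; the resource itself (title, author, URLs) and the helper name build_dc_record are invented for illustration and are not taken from any real repository.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for simple Dublin Core inside the "oai_dc" container.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an oai_dc:dc element with one dc:* child per (name, value) pair."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical resource description, invented for illustration only.
record = build_dc_record([
    ("title", "Introduction to Photosynthesis"),
    ("creator", "Example Author"),
    ("language", "en"),
    ("format", "application/pdf"),
    ("rights", "http://creativecommons.org/licenses/by/3.0/"),
    ("identifier", "http://repository.example.org/oer/42"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

An application profile would typically add constraints on top of such a record, for example by requiring that a rights element is always present.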


Harvesting is a technique that allows a software agent to collect resources (content or metadata) from a repository. Some harvesting protocols, such as OAI-PMH, enable the requesting agent to retrieve only a specific subset of metadata or content (for instance, only the records in a given set or those changed within a given date range).
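
As an illustration of such a selective request, the sketch below builds OAI-PMH request URLs. The verb and argument names (ListRecords, metadataPrefix, set, from, until, resumptionToken) come from the OAI-PMH 2.0 specification; the repository base URL, the set name "biology" and the function names are assumptions made up for this example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords request URL for a selective harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # restrict to one set of records
    if from_date:
        params["from"] = from_date      # only records changed on or after this date
    if until_date:
        params["until"] = until_date    # only records changed on or before this date
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build the follow-up request for the next part of a long result list.

    In OAI-PMH, resumptionToken is an exclusive argument: it is sent with
    the verb only, without repeating the original selection arguments.
    """
    params = {"verb": "ListRecords", "resumptionToken": token}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name, for illustration only.
BASE_URL = "http://repository.example.org/oai"
print(list_records_url(BASE_URL, set_spec="biology", from_date="2010-01-01"))
```

A harvester would fetch the first URL, process the returned records, and keep issuing follow-up requests for as long as the response carries a non-empty resumption token.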

The major advantage of harvesting is that the harvester then has all the metadata or content available locally, so that queries from end users can be processed without further need to contact the harvested repositories. This is especially important in infrastructures with dozens of repositories or more, where the response time involved in contacting every repository in order to answer an end-user request becomes prohibitively long. Moreover, when queries are forwarded to local repositories for federated search, the local repository may incur considerable load servicing third-party queries.

The major concern with harvesting for some organisations is that this approach allows third parties to collect the metadata and content from a repository. In an OER context, this is probably not much of a concern, but it can still raise issues about visibility for the party being harvested.

In concrete terms, there are two important harvesting protocols:

  • OAI-PMH: The Open Archives Initiative Protocol for Metadata Harvesting is widely used by repositories of scholarly material and learning resources. It supports simple selection criteria (sets and date ranges) to determine which metadata or content to harvest.
  • RSS/Atom: RSS and Atom are syndication formats, commonly used for web feeds that can be read by software such as Bloglines or Google Reader. They may be used more generically to let software know about updates to a repository as resources are created or changed.

We suggest that you adopt OAI-PMH if you have no strong reason to prefer RSS, as OAI-PMH offers more features that can be important for third-party developers. If you also want to provide feeds of new objects that are deposited in your repository, then an additional RSS feed is quite useful.
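
To show how a feed of new deposits might be consumed, here is a minimal sketch that reads an Atom feed and lists the entries updated after a given moment. The element names and namespace are those of the Atom format; the feed content, the URLs and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content, invented for illustration only.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New resources at repository.example.org</title>
  <id>http://repository.example.org/feed</id>
  <updated>2010-01-07T19:30:00Z</updated>
  <entry>
    <title>Introduction to Photosynthesis</title>
    <id>http://repository.example.org/oer/42</id>
    <link href="http://repository.example.org/oer/42"/>
    <updated>2010-01-05T10:00:00Z</updated>
  </entry>
  <entry>
    <title>Cell Biology Lecture Notes</title>
    <id>http://repository.example.org/oer/43</id>
    <link href="http://repository.example.org/oer/43"/>
    <updated>2010-01-07T19:30:00Z</updated>
  </entry>
</feed>
"""


def parse_timestamp(text):
    """Parse a timestamp such as 2010-01-07T19:30:00Z into an aware datetime."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def new_entries(feed_xml, since):
    """Return (title, link) pairs for entries updated after `since`."""
    root = ET.fromstring(feed_xml.strip())
    found = []
    for entry in root.findall(f"{ATOM}entry"):
        updated = parse_timestamp(entry.findtext(f"{ATOM}updated"))
        if updated > since:
            link = entry.find(f"{ATOM}link")
            found.append((entry.findtext(f"{ATOM}title"), link.get("href")))
    return found


last_check = datetime(2010, 1, 6, tzinfo=timezone.utc)
for title, href in new_entries(SAMPLE_FEED, last_check):
    print(title, href)
```

In a real deployment the feed text would be downloaded from the repository rather than embedded as a string, and the time of the last check would be stored between runs.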


In order to enable harvesting software to contact a provider of content and metadata, this software must be aware of

  • the location of the provider (typically a URL), as well as
  • the protocol(s) that the provider supports (OAI-PMH, RSS).

This information is typically maintained in a so-called registry. In a way, a registry is a meta-repository: a repository with information about repositories. Its main goal is to enable other services to discover repositories. To this end, it sometimes also includes additional information about the content repositories, such as data about

  • the collections they hold: their domain, the number of resources, etc.
  • when the repository was last updated,
  • etc.
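
The kind of information a registry holds can be sketched as a small in-memory data structure. This is only an illustration of the idea, not the data model of any existing registry: the class names, field names and sample repositories are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RegistryEntry:
    """What a registry might record about one repository (hypothetical fields)."""
    name: str
    url: str                              # location of the provider
    protocols: frozenset                  # e.g. {"OAI-PMH", "RSS"}
    domain: str = ""                      # subject domain of the collection
    resource_count: int = 0               # number of resources held
    last_updated: Optional[date] = None   # when the repository last changed


class Registry:
    """A repository of information about repositories, keyed by URL."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Registering the same URL again replaces the earlier entry.
        self._entries[entry.url] = entry

    def find_by_protocol(self, protocol):
        """Return all registered repositories that support `protocol`."""
        return [e for e in self._entries.values() if protocol in e.protocols]


# Hypothetical repositories, invented for illustration only.
registry = Registry()
registry.register(RegistryEntry(
    name="Example Biology Collection",
    url="http://repository.example.org/oai",
    protocols=frozenset({"OAI-PMH", "RSS"}),
    domain="biology",
    resource_count=1200,
    last_updated=date(2010, 1, 7),
))
registry.register(RegistryEntry(
    name="Example History Feed",
    url="http://history.example.org/feed",
    protocols=frozenset({"RSS"}),
    domain="history",
))

for entry in registry.find_by_protocol("OAI-PMH"):
    print(entry.name, entry.url)
```

A harvester could query such a registry for all repositories that speak a protocol it supports, and then contact each of them at the recorded URL.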

We strongly encourage you to register your repository with ARIADNE or