PuSH Feed Type

From Creative Commons
Revision as of 18:39, 8 September 2010 by Nathan Yergler (talk | contribs)
Contact: Asheesh Laroia
Status: Draft


The people who run a DiscoverEd instance may wish to be updated almost immediately when a curator publishes new resources. Curators and publishers may wish to notify downstream consumers (aggregators, indexers, other tools) when changes occur.

Right now, DiscoverEd instances aggregate feeds and crawl periodically, often manually at the behest of the search engine operator. PubSubHubbub (PuSH) provides a way for a subscriber (such as a DiscoverEd instance) to subscribe to feeds and receive automatic, nearly instantaneous notification of new information in a feed. This can be built on top of the existing Atom/RSS feeds that curators already publish. Subscribers register their interest in a feed with a "hub" and receive notifications when a change occurs. Curators and publishers may either notify the hub when changes occur, or the hub will periodically poll the feed; in both cases the hub distributes notifications to subscribers.
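As an illustration of the subscriber side, the sketch below builds the form-encoded request that a subscriber sends to a hub. The parameter names (hub.mode, hub.topic, hub.callback, hub.verify) follow the PubSubHubbub 0.3 draft; later revisions of the protocol drop hub.verify. The function name and all URLs are hypothetical and are not part of DiscoverEd.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_subscription_request(hub_url, topic_url, callback_url, mode="subscribe"):
    """Build (but do not send) a PuSH subscription request for a hub.

    Assumes the PubSubHubbub 0.3 parameter set; 'hub.verify' is specific
    to that draft and may be rejected or ignored by newer hubs.
    """
    params = {
        "hub.mode": mode,              # "subscribe" or "unsubscribe"
        "hub.topic": topic_url,        # the feed the subscriber wants updates for
        "hub.callback": callback_url,  # where the hub should POST notifications
        "hub.verify": "async",         # let the hub verify the callback later
    }
    body = urlencode(params).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; a hub that accepts the request is expected to answer with a 2xx status and then verify the callback URL separately.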

This feature was initially described during a meeting on the Learning Registry with NSDL, ADLnet, and the US Department of Education.

Requirements

A complete implementation of this specification would provide the following things.

  • DiscoverEd can discover a PuSH hub mentioned in a feed.
  • DiscoverEd can register itself as a subscriber to that feed on that hub. (To do that, it has to provide a callback URL on the DiscoverEd instance to which the hub should POST when the feed is updated.)
  • When the hub pings DiscoverEd to say there is an update to that feed, DiscoverEd re-aggregates data from that feed, does a crawl, and merges the index.
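The first two requirements can be sketched roughly as follows. This assumes the hub is advertised with an Atom link element carrying rel="hub" (placed under the channel element in RSS), and that the hub verifies a subscription by sending a GET request with a hub.challenge value that the subscriber echoes back, as in the PubSubHubbub 0.3 draft. The function names and the plain-dictionary request handling are hypothetical; DiscoverEd itself is not written in Python, so this shows only the logic.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def discover_hubs(feed_xml):
    """Return the hub URLs advertised at the top level of an Atom or RSS feed."""
    root = ET.fromstring(feed_xml)
    link_tag = "{%s}link" % ATOM_NS
    # Atom: links are children of <feed>. RSS: atom:link sits under <channel>.
    candidates = root.findall(link_tag) + root.findall("channel/" + link_tag)
    hubs = []
    for link in candidates:
        rels = (link.get("rel") or "").split()
        href = link.get("href")
        if "hub" in rels and href:
            hubs.append(href)
    return hubs


def verify_subscription(query, expected_topics):
    """Answer the hub's verification GET; returns (status_code, body).

    'query' is a dict of the request's query parameters. Only topics the
    instance actually asked to subscribe to are confirmed.
    """
    mode = query.get("hub.mode")
    topic = query.get("hub.topic")
    challenge = query.get("hub.challenge")
    if mode in ("subscribe", "unsubscribe") and topic in expected_topics and challenge:
        return 200, challenge  # echoing the challenge confirms the request
    return 404, ""             # refuse anything that was not requested
```

Checking the topic against the set of feeds the instance asked for keeps a third party from subscribing the instance to arbitrary feeds. Handling the later content notification (the third requirement) would hook into the aggregation and crawl steps and is not shown here.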

Potential Issues

Aggregation and crawling are currently two different steps in the pipeline. Implementing PuSH support will require us to examine the way they interact. This should not be difficult from a code perspective (we've made progress on the aggregation side, and the Nutch API is relatively sane), but it will require us to update the index in place (as opposed to merging).

Status

  • Seeking partner to support development and test use of a hub.