Metadata Retriever Plugins

From Creative Commons
Revision as of 14:22, 17 June 2010

Contact: Nathan Yergler
Status: Complete


DEd sites may wish to include metadata about Resources from other sources, such as web services (e.g., Semantic Analysis) and databases. This page describes a plugin system for adding sources of information at aggregation time.

Requirements

Metadata retriever plugins are Nutch plugins which implement the MetadataRetriever extension point. MetadataRetriever extensions must implement a single method, retrieve. retrieve takes a Resource as an argument, and may add additional metadata to it. retrieve is not responsible for persisting the Resource.

We will implement two demonstration plugins: a "dummy" plugin which logs the URLs being passed to it, and a functional demonstration which uses the Delicious API to retrieve suggested and popular tags for a page.
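
The contract described in these requirements can be illustrated with a minimal, self-contained Java sketch. It does not use the Nutch plugin machinery or the real DEd classes: the `Resource` class here is a hypothetical stand-in, the `void` return type of `retrieve` is an assumption (the requirements only say that it takes a Resource and may add metadata to it), and `LoggingRetriever` and `FixedTagRetriever` are illustrative names. The fixed tag stands in for whatever a real web-service lookup would return.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RetrieverSketch {

    /** Hypothetical stand-in for the real Resource class (assumption). */
    static class Resource {
        private final String url;
        private final Map<String, List<String>> metadata = new LinkedHashMap<>();

        Resource(String url) {
            this.url = url;
        }

        String getUrl() {
            return url;
        }

        /** Stores one more value under the given key. */
        void addMetadata(String key, String value) {
            metadata.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        Map<String, List<String>> getMetadata() {
            return metadata;
        }
    }

    /** Assumed shape of the extension point: a single retrieve method. */
    interface MetadataRetriever {
        // May add metadata to the resource; must not persist it.
        void retrieve(Resource resource);
    }

    /** "Dummy" retriever: records and prints each URL, adds no metadata. */
    static class LoggingRetriever implements MetadataRetriever {
        final List<String> seen = new ArrayList<>();

        @Override
        public void retrieve(Resource resource) {
            seen.add(resource.getUrl());
            System.out.println("retrieve called for " + resource.getUrl());
        }
    }

    /** Adds a fixed tag; a placeholder for a real external lookup. */
    static class FixedTagRetriever implements MetadataRetriever {
        @Override
        public void retrieve(Resource resource) {
            resource.addMetadata("tag", "example");
        }
    }

    public static void main(String[] args) {
        Resource resource = new Resource("http://example.org/course");

        List<MetadataRetriever> retrievers = new ArrayList<>();
        retrievers.add(new LoggingRetriever());
        retrievers.add(new FixedTagRetriever());

        // The caller runs every retriever; persisting the resource
        // afterwards is the caller's job, not the retriever's.
        for (MetadataRetriever retriever : retrievers) {
            retriever.retrieve(resource);
        }
        System.out.println(resource.getMetadata());
    }
}
```

Keeping persistence out of `retrieve` means a retriever only needs to know how to look something up, and the aggregation code decides when and where the enriched Resource is stored.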

Implementation

  • Added support for storing arbitrary metadata on Resource objects
  • Added support to TripleStore for serialization and de-serialization
  • Implemented the org.creativecommons.learn.plugin.MetadataRetriever extension point and org.creativecommons.learn.plugin.MetadataRetrievers extension loader
  • Implemented test plugins

Deferred until later

  • Configuration parameters for plugins (the Delicious plugin reads from discovered.xml, but the plugin.xml manifest doesn't state that it needs parameters).
  • Allow sequencing of MetadataRetriever plugins, to determine which are authoritative