Metadata Retriever Plugins
DiscoverEd (DEd) sites may wish to include metadata about Resources from other sources, including web services (e.g., semantic analysis), databases, etc. The following describes a plugin system for adding sources of information at aggregation time.
This feature was defined and developed during the June 2010 DiscoverEd Sprint.
Metadata retriever plugins are Nutch plugins which implement the
MetadataRetriever extension point. MetadataRetriever extensions must implement a single method, retrieve.
retrieve takes a Resource as an argument and may add additional metadata to it.
retrieve is not responsible for persisting the Resource.
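The contract described above can be sketched roughly as follows. This is an illustration only, not the DiscoverEd source: the Resource class here is a hypothetical stand-in, and addMetadata, getMetadata, the void return type, and SchemeRetriever are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DiscoverEd's Resource class; the real class differs.
class Resource {
    private final String url;
    private final Map<String, List<String>> metadata = new HashMap<>();

    Resource(String url) {
        this.url = url;
    }

    String getUrl() {
        return url;
    }

    // Assumed helper for attaching an arbitrary metadata value to the Resource.
    void addMetadata(String key, String value) {
        metadata.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    Map<String, List<String>> getMetadata() {
        return metadata;
    }
}

// Assumed shape of the extension point: a single retrieve method.
interface MetadataRetriever {
    void retrieve(Resource resource);
}

// Illustrative retriever (hypothetical): records the URL scheme as metadata.
class SchemeRetriever implements MetadataRetriever {
    @Override
    public void retrieve(Resource resource) {
        String url = resource.getUrl();
        int idx = url.indexOf("://");
        if (idx > 0) {
            resource.addMetadata("scheme", url.substring(0, idx));
        }
        // No save or persist call here: persisting the Resource is the caller's job.
    }
}
```

The retriever only mutates the Resource it is handed. Whatever code invokes the retrievers at aggregation time is assumed to store the result afterwards.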
We will implement two demonstration plugins: a "dummy" plugin which logs the URLs being passed to it, and a functional demonstration which uses the Delicious API to retrieve suggested and popular tags for a page.
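A "dummy" retriever of the kind described above might look something like the sketch below. The Resource and MetadataRetriever types are minimal hypothetical stand-ins rather than the real DiscoverEd classes, and the choice of java.util.logging is an assumption. The actual plugin may use a different logging facility.

```java
import java.util.logging.Logger;

// Minimal hypothetical stand-in for DiscoverEd's Resource class.
class Resource {
    private final String url;

    Resource(String url) {
        this.url = url;
    }

    String getUrl() {
        return url;
    }
}

// Assumed shape of the extension point: a single retrieve method.
interface MetadataRetriever {
    void retrieve(Resource resource);
}

// Dummy retriever: logs each URL it is handed and adds no metadata.
class DummyRetriever implements MetadataRetriever {
    private static final Logger LOG =
            Logger.getLogger(DummyRetriever.class.getName());

    @Override
    public void retrieve(Resource resource) {
        LOG.info("retrieve called for " + resource.getUrl());
    }
}
```

A plugin like this is mainly useful for confirming that the extension loader finds the plugin and passes it each Resource during aggregation.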
- Added support for storing arbitrary metadata on Resource objects
- Added support to TripleStore for serialization and de-serialization
- Implemented the org.creativecommons.learn.plugin.MetadataRetriever extension point and org.creativecommons.learn.plugin.MetadataRetrievers extension loader
- Implemented test plugins
Deferred until later
- Configuration parameters for plugins (the Delicious plugin reads from discovered.xml, but the plugin.xml manifest doesn't state that it needs parameters).
- Allow sequencing of MetadataRetriever plugins, to determine which are authoritative