Latest revision as of 01:35, 24 July 2009
CcNutch is a Creative Commons plugin for the open source Nutch search engine. Its source is the ccnutch module in the cctools SourceForge repository.
A running instance was formerly available at http://search.creativecommons.org. It was turned off once commercial search engines supported CC well enough, and http://search.creativecommons.org now offers a selection of those engines instead. Nutch may be revived if we want to explore search features that do not yet have commercial interest.
Build
Crawl
Roadmap
Milestone 0
- Fill in documentation above
- Update the CcNutch plugin for the current Nutch version (may be a no-op apart from testing)
- Add support for parsing RDFa (currently embedded RDF/XML is supported)
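To illustrate the kind of markup the RDFa item refers to, here is a minimal sketch in Python that finds license assertions written as rel="license" links. It is not the CcNutch implementation (CcNutch is a Nutch plugin, and its code is not shown on this page). The names LicenseLinkParser, find_license_links and SAMPLE_PAGE are hypothetical, and the sketch handles only the simple rel="license" pattern, not full RDFa (no about or property attributes, no CURIE resolution).

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect href values of <a>/<link> elements whose rel includes 'license'.

    Hypothetical helper for illustration only; not part of CcNutch or Nutch.
    """

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="license nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "license" in rel_tokens and href:
            self.licenses.append(href)


def find_license_links(html):
    """Return the license URLs asserted by rel="license" links in html."""
    parser = LicenseLinkParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


# Assumed sample page: one license assertion and one unrelated link.
SAMPLE_PAGE = """
<html><body>
<p>This page is licensed under a
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a> license.</p>
<a href="http://example.org/">unrelated link</a>
</body></html>
"""

print(find_license_links(SAMPLE_PAGE))
```

A crawler plugin would run a check like this on each fetched page and store the resulting license URL in the index. Assertions about other objects on the page (the Milestone 1 item) would additionally need the subject of each statement, which this sketch ignores.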
Milestone 1
- Add support for indexing assertions about objects other than the current document (e.g. images, audio, video)
- Add support for indexing specific attribution metadata
Milestone 2
- Add feature requests here
Milestone 3
- Deploy CcNutch on some infrastructure for testing
Milestone 4
- Add feature requests here