Metadata Scraper
Revision as of 21:51, 30 July 2009
Description of Software | The Creative Commons Metadata Scraper is a simple crawler used by the license engine to detect metadata stored in pages. |
---|---|
Bug Tracker | http://code.creativecommons.org/issues/ |
Code Repository | http://code.creativecommons.org/viewsvn/metadata_scraper/ |
Mailing List | http://lists.ibiblio.org/mailman/listinfo/cc-devel |
What it Does
The metadata scraper is used by the license deeds to extract embedded metadata from pages. It currently scans the page for RDFa using rdfadict and pyRdfa.
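To give a rough idea of what scanning a page for RDFa involves, here is a minimal sketch that uses only the Python standard library. It is not the scraper's code and does not use rdfadict or pyRdfa; the class name `SimpleRdfaScanner` and the function `scan_rdfa` are hypothetical, and only a small subset of RDFa (`about`, `rel` with `href`, and `property`) is handled.

```python
from html.parser import HTMLParser


class SimpleRdfaScanner(HTMLParser):
    """Collect (subject, predicate, object) triples from a small RDFa subset.

    Hypothetical helper for illustration only. Real RDFa processing
    (prefix resolution, chaining, datatypes) is far more involved.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.triples = []
        self._pending = None  # (subject, predicate) waiting for element text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Without an explicit `about`, statements describe the page itself.
        subject = attrs.get("about") or self.base_url
        if attrs.get("rel") and attrs.get("href"):
            for predicate in attrs["rel"].split():
                self.triples.append((subject, predicate, attrs["href"]))
        if attrs.get("property"):
            if attrs.get("content") is not None:
                self.triples.append((subject, attrs["property"], attrs["content"]))
            else:
                # The value is the element's text, which arrives in handle_data.
                self._pending = (subject, attrs["property"])

    def handle_data(self, data):
        if self._pending and data.strip():
            subject, predicate = self._pending
            self.triples.append((subject, predicate, data.strip()))
            self._pending = None


def scan_rdfa(html, base_url):
    """Return the triples found in `html`, using `base_url` as default subject."""
    scanner = SimpleRdfaScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return scanner.triples


sample = (
    '<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">CC BY</a> '
    '<span property="cc:attributionName">Example Author</span>'
)
triples = scan_rdfa(sample, "http://example.org/work")
```

In this sketch `triples` holds one license statement and one attribution statement, both about `http://example.org/work`. A full RDFa parser would also expand prefixes such as `cc:` into complete URIs.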
When it Happens
When a creator or copyright holder selects a license, they have the opportunity to provide additional metadata about their work. This includes information such as creator and medium, as well as how the creator would like to be attributed. If this information is provided, it is encoded as RDFa in the generated HTML.
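As an illustration of what such RDFa-annotated HTML can look like, the following sketch builds a license statement with optional title and attribution metadata. The function `license_markup` is hypothetical and the output only approximates the markup produced by the license chooser; the property names (`dc:title`, `cc:attributionName`, `cc:attributionURL`) and namespace URIs are the usual ccREL and Dublin Core ones.

```python
from html import escape

CC_NS = "http://creativecommons.org/ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def license_markup(license_url, license_name, title=None,
                   attribution_name=None, attribution_url=None):
    """Return an HTML fragment stating the license and optional metadata as RDFa.

    Hypothetical helper for illustration; not the license engine's code.
    """
    pieces = []
    if title:
        pieces.append(
            '<span xmlns:dc="%s" property="dc:title">%s</span>'
            % (DC_NS, escape(title))
        )
    else:
        pieces.append("This work")
    if attribution_name and attribution_url:
        # One link carries both the attribution name (text) and URL (href).
        pieces.append(
            'by <a xmlns:cc="%s" href="%s" property="cc:attributionName" '
            'rel="cc:attributionURL">%s</a>'
            % (CC_NS, escape(attribution_url), escape(attribution_name))
        )
    elif attribution_name:
        pieces.append(
            'by <span xmlns:cc="%s" property="cc:attributionName">%s</span>'
            % (CC_NS, escape(attribution_name))
        )
    pieces.append(
        'is licensed under a <a rel="license" href="%s">%s</a>.'
        % (escape(license_url), escape(license_name))
    )
    return " ".join(pieces)


fragment = license_markup(
    "http://creativecommons.org/licenses/by/3.0/",
    "Creative Commons Attribution 3.0 License",
    title="Field Notes",
    attribution_name="A. Author",
    attribution_url="http://example.org/",
)
```

The `rel="license"` link is what identifies the license; the remaining attributes carry the optional metadata that the deed can later display.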
To provide more context-relevant information for visitors, the license deeds load a script that checks the referring page for metadata. If any is found, the metadata is used to update the deed and display additional attribution or CC+ details.
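The selection step described above can be sketched as follows, assuming the scraper's results are available as (subject, predicate, object) triples. The function `deed_details` and the keys of its result are hypothetical; how the deed script and the scraper actually exchange data is not described here. `cc:morePermissions` is the ccREL property used for CC+.

```python
def deed_details(triples, referrer):
    """Pick attribution and CC+ values stated about the referring page.

    Hypothetical helper for illustration. Statements about other
    subjects are ignored, and the first value found for each key wins.
    """
    wanted = {
        "cc:attributionName": "attribution_name",
        "cc:attributionURL": "attribution_url",
        "cc:morePermissions": "more_permissions_url",
    }
    details = {}
    for subject, predicate, value in triples:
        if subject == referrer and predicate in wanted:
            details.setdefault(wanted[predicate], value)
    return details


page = "http://example.org/work"
found = deed_details(
    [
        (page, "license", "http://creativecommons.org/licenses/by/3.0/"),
        (page, "cc:attributionName", "Example Author"),
        (page, "cc:morePermissions", "http://example.org/commercial"),
        ("http://example.org/other", "cc:attributionName", "Someone Else"),
    ],
    page,
)
```

An empty result means the referring page carried no usable metadata, in which case the deed would presumably be shown unchanged.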
Developer Information
Metadata Scraper is implemented in Python using the CherryPy web framework. The source code is available in the metadata_scraper module in Subversion. You can check out the source or browse it at http://code.creativecommons.org/viewsvn/metadata_scraper/.
Contact
Contact Nathan Yergler with questions about the scraper.