Planet Venus

From Creative Commons
Revision as of 23:02, 29 February 2008 by Mike Linksvayer (talk | contribs)

Creative Commons uses feed aggregation software to collect in one place CC blogs, jurisdiction blogs, and the blogs of people and organizations that are closely associated with or actively involved in the CC community.

The software we use for this is Planet Venus, a major rewrite of Planet by Sam Ruby. For more information and documentation on Planet Venus, please see the project's page.

CC has slightly extended Planet Venus with a few plugins. Because licensing information is central to CC, we created a plugin that pulls license information from the feeds and makes it available to the HTML templates. For an example of the output, take a look at the CC Planet. The plugin is a Python script named get_license_name.plugin and can be viewed or downloaded from the cctools Subversion repository at sourceforge.net.

NOTE: the plugin only works for HTMLTMPL templates. XSLT and Genshi templates have full access to every feed element and therefore can extract licensing data directly.

The get_license_name.plugin requires two Python modules that are not part of the standard library: Beautiful Soup and rdfadict. The plugin is short and documents itself to some extent through comments.

In order for the plugin to work, a small patch must also be applied to the Planet Venus code itself, in a single file: planet/shell/tmpl.py. Most of the patch below is context and comments.

--- venus/planet/shell/tmpl.py  2007-12-21 19:24:02.000000000 -0800
+++ branches/production/software/planet/shell/tmpl.py   2008-02-22 15:35:18.000000000 -0800
@@ -120,6 +120,8 @@
     ['published', PlanetDate, 'published_parsed'],
     ['published_822', Rfc822, 'published_parsed'],
     ['published_iso', Rfc3399, 'published_parsed'],
+    ['license', String, 'source', 'links', {'rel': 'license'}, 'href'],
+    ['default_license', String, 'source', 'planet_default_license']
 ]
 
 # Add additional rules for source information
@@ -141,6 +143,15 @@
                     elif node.get('type','')=='application/xhtml+xml':
                         node['value'] = empty.sub(r"<\1 />", node['value'])
                 node = node[path]
+                      
+            # This is a special-case elif needed to grab license info from the
+            # feed data.  Normally node will be a simple list or dict, but in
+            # the case of license information, node is a list of lists, so we
+            # need to look inside the first item, which is where the license
+            # data seems to always be.
+            elif isinstance(path, str) and isinstance(node, list) and \
+                    path in node[0]:
+                node = node[0][path]
             elif isinstance(path, int):
                 node = node[path]
             elif isinstance(path, dict):
@@ -155,8 +166,20 @@
             else:
                 break
         else:
-            if node: output[rule[0]] = rule[1](node)
-        
+            # If this node contains license information, indicated by rule[0]
+            # being 'license' or 'default_license' (from list Items), then
+            # drop the license URI into a variable that will be accessible
+            # by the template.  'default_license' is specified in the config
+            # of each blog, and can be used if no other license data is found
+            # in the feed itself.
+            if node:
+                if rule[0] == 'license' or rule[0] == 'default_license':
+                    output[rule[0]] = '<a about="%s" rel="license" \
+                        href="%s" title="License information">License</a>' \
+                        % (source.link, node)
+                else:
+                    output[rule[0]] = rule[1](node)
+
     # copy over all planet namespaced elements from parent source
     for name,value in source.items():
         if name.startswith('planet_'):
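
The path-walking logic that the second hunk extends can be tried in isolation. The following is a minimal Python 3 sketch, not the Venus code itself: the function name walk and the sample entry are hypothetical, and the guard against an empty list is an addition that the patch does not contain.

```python
def walk(node, paths):
    """Follow a rule's path through parsed feed data; return None on a miss.

    Hypothetical helper for illustration only; it mirrors the shape of the
    loop in planet/shell/tmpl.py but is not that code.
    """
    for path in paths:
        if isinstance(path, str) and path in node:
            node = node[path]
        # Special case added by the patch: node is a list, and the key we
        # want lives inside its first item. The "and node" guard against an
        # empty list is an assumption added here, not part of the patch.
        elif isinstance(path, str) and isinstance(node, list) and node \
                and path in node[0]:
            node = node[0][path]
        elif isinstance(path, int):
            node = node[path]
        elif isinstance(path, dict):
            # Pick the first list item whose fields all match the filter.
            for candidate in node:
                if all(candidate.get(k) == v for k, v in path.items()):
                    node = candidate
                    break
            else:
                return None
        else:
            return None
    return node


# Made-up entry shaped like the data the 'license' rule walks through.
entry = {
    "source": {
        "links": [
            {"rel": "self", "href": "http://somedomain.org/blog/feed/atom"},
            {"rel": "license",
             "href": "http://creativecommons.org/licenses/by/3.0"},
        ]
    }
}

license_uri = walk(entry, ["source", "links", {"rel": "license"}, "href"])
```

With this sample data the rule path 'source', 'links', {'rel': 'license'}, 'href' resolves to the license URI, and a path that does not exist yields None instead of raising.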

The patch also relies on a custom configuration parameter called default_license. Since very few blogs embed license data in a syndication feed in a machine-readable way, we needed a mechanism for supplying this information manually on a feed-by-feed basis. The plugin first looks for license information in the feed itself. If it finds none, it checks whether default_license is defined for the feed. If no license information is found in either the feed or default_license, the plugin does nothing.
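
The lookup order described above can be sketched as a small Python 3 function. This is an illustration under stated assumptions, not the plugin's code: the name license_link and its parameters are hypothetical, and the HTML escaping is an addition (the patch interpolates the values directly).

```python
from html import escape


def license_link(source_link, feed_license=None, default_license=None):
    """Return license markup, preferring feed data over the config default.

    Hypothetical helper: the patch itself exposes 'license' and
    'default_license' as separate template variables.
    """
    uri = feed_license or default_license
    if not uri:
        # No license in the feed and no default configured: do nothing.
        return None
    # Escaping the attribute values is an assumption added for safety.
    return ('<a about="%s" rel="license" href="%s" '
            'title="License information">License</a>'
            % (escape(source_link, quote=True), escape(uri, quote=True)))


link = license_link(
    "http://somedomain.org/blog",
    default_license="http://creativecommons.org/licenses/by/3.0",
)
```

A license found in the feed wins over default_license, and the function returns None when neither is available.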

Here is an example configuration:

[http://somedomain.org/blog/feed/atom]
name = Some Blog Name
default_license = http://creativecommons.org/licenses/by/3.0
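
To check which feeds in a configuration carry a default_license, the file can be read with Python's standard configparser module. This is a rough Python 3 sketch and does not show how Venus loads its configuration internally; the function name default_licenses and the SAMPLE string are hypothetical.

```python
import configparser

# Sample text copied from the example configuration above.
SAMPLE = """\
[http://somedomain.org/blog/feed/atom]
name = Some Blog Name
default_license = http://creativecommons.org/licenses/by/3.0
"""


def default_licenses(config_text):
    """Map each feed URL (section name) to its default_license, if set."""
    # Interpolation is disabled so that a '%' in a URL is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {
        section: parser[section]["default_license"]
        for section in parser.sections()
        if parser.has_option(section, "default_license")
    }


licenses = default_licenses(SAMPLE)
```

Feeds without a default_license entry are simply left out of the resulting dictionary.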