Creative Commons is participating in Google's Summer of Code as a mentoring organization. Submissions for SoC 2007 will open in March; see the timeline for more details. A list of projects accepted for last year's SoC 2006 is available here.
This page highlights ideas and suggestions for student proposals for 2007.
If you find an idea listed below that you like, or have your own idea for a Creative Commons-related open source project, we encourage you to read up on the Creative Commons Developer Community, ask questions, and then include the following in your proposal:
- a detailed description / design document
- an approximate schedule (timeline)
- brief description of past projects (including open source) that you've participated in
- brief resume/bio/contact information
The following links describe general approaches to writing a successful Summer of Code proposal:
Please read the Selection Criteria. Participants who read this will be much further along than others.
- Read up about the Creative Commons Developer Community
- Join the cc-devel mailing list and ask questions
- Join the Creative Commons chat channel, #cc, on irc.freenode.net.
More ideas are available in the Developer Challenges section of the website. What follows is a general listing of quick ideas that any student may use to identify interests. Please do not feel constrained by the ideas below; use them as a starting point and as a guide to the general areas we are interested in supporting.
- Any new tools which support publishing of content licensed with a Creative Commons license
- Develop plugins that utilize Creative Commons licenses and metadata in your favorite applications. If these are web-based, they should ideally support licensing at both the site level and the "object" (e.g., page, image) level, as well as RDFa metadata.
- Joomla, Drupal, CivicSpace, Plone
- Any new tools which support finding of content licensed with a Creative Commons license
- Beagle, Tracker, Spotlight (continue/finish)
- Extend the CcNutch codebase to support RDFa and image, audio, or video search (using scoped metadata, not image/audio/video analysis!)
- Feed aggregators
- Add support for CC licenses and metadata
- A feed mashup library
- MusicBrainz has a project idea for improving Creative Commons integration in that site.
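For the plugin ideas above, the difference between site-level and object-level licensing in RDFa mostly comes down to which resource the license statement is about. The following is a minimal sketch of that distinction, not an existing Creative Commons API: the function name `license_markup` and the example path are hypothetical.

```python
from html import escape


def license_markup(license_url, license_name, about=None):
    """Return an HTML fragment that asserts a license using RDFa.

    With about=None the statement applies to the current page, which
    covers site- or page-level licensing. Passing a URL as `about`
    makes the statement apply to that object instead (e.g. one image).
    """
    # RDFa's `about` attribute sets the subject of the statement.
    subject = f' about="{escape(about, quote=True)}"' if about else ""
    return (
        f"<div{subject}>Licensed under "
        f'<a rel="license" href="{escape(license_url, quote=True)}">'
        f"{escape(license_name)}</a></div>"
    )


if __name__ == "__main__":
    by = "http://creativecommons.org/licenses/by/3.0/"
    # Page-level statement.
    print(license_markup(by, "Creative Commons Attribution 3.0"))
    # Object-level statement for a single (hypothetical) image.
    print(license_markup(by, "Creative Commons Attribution 3.0",
                         about="/photos/sunset.jpg"))
```

A real plugin would also need to decide where the host application stores the per-object license choice; that part is application-specific and is left out here.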
Expand Other Software with ccHost's Features or ccHost
- Implement the Sample Pool API in other web backends or software applications
- WordPress, MediaWiki, Plone, Drupal
- Remix tracking within software or site
Expand Software with ccPublisher's Features or ccPublisher
- Implement support for embedding license metadata in additional file types; this support would be in the form of additions to the cctagutils library. Contact Nathan Yergler for details or specifics.
- Embedding of XMP in various formats
- Implement back-end support for other publishing platforms, such as Flickr, MySpace, etc. Basic documentation on storage providers has been started.
- Add ccPublisher's publish mechanism to another application
- Add support for selecting a license within open source applications such as the above. A successful implementation will use the web services to provide up-to-date license information.
- Integrate finding and reusing of CC-licensed content directly within applications like OpenOffice.org, the GIMP, Inkscape, Audacity, etc.
- HIGH-PRIORITY: OpenOffice.org, Other Open Source apps, Jokosher
- Integrate finding and publishing of CC-licensed content directly within the open source desktop (think GNOME or KDE integration). A starting point for GNOME may be the prototype Nautilus extension for displaying license information embedded in MP3 files.
- Extend the CC licensing extractor for Spotlight to support multiple file formats and polish it to release quality. Issues which must be addressed include extractor chaining and packaging.
- It's all the craze! Develop web mashups that combine multiple web-based APIs (Creative Commons, Amazon, Google, Flickr, archive.org) to help spread CC licensing.
Statistics and Metrics
- Write tools to perform web crawls and characterize use of CC licenses according to content type, language, geography, metadata use, etc. The ideal output would be something analogous to Google Web Authoring Statistics, focused on the data of interest to CC.
- A possible addition to labs that would specifically demo RDFa and Metadata, as both advocacy and testing
- Re-implement validator.creativecommons.org as a useful resource with RDFa and current CC metadata specs.
- Alternate takes on ccSearch, possibly using OpenSearch or multiple search APIs
- Possibly some type of top/bottom-fed search engine using CC-license-enabled feeds
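For the crawl statistics idea, one small building block is turning the license URLs found in pages into countable categories. The sketch below assumes license URLs of the form `http://creativecommons.org/licenses/<code>/<version>/[<jurisdiction>/]`; the function names `parse_license` and `tally` are hypothetical, and fetching and parsing the crawled pages is out of scope.

```python
from collections import Counter
from urllib.parse import urlparse

CC_HOSTS = ("creativecommons.org", "www.creativecommons.org")


def parse_license(url):
    """Split a CC license URL into (code, version, jurisdiction).

    Returns None for URLs that do not look like CC license URLs.
    Jurisdiction is None when the URL has no jurisdiction segment.
    """
    parts = urlparse(url)
    if parts.netloc.lower() not in CC_HOSTS:
        return None
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 3 or segments[0] != "licenses":
        return None
    code, version = segments[1], segments[2]
    # Ignore trailing segments such as "deed.en" that are not jurisdictions.
    has_jurisdiction = len(segments) > 3 and segments[3].isalpha()
    return code, version, segments[3] if has_jurisdiction else None


def tally(urls):
    """Count how often each (code, version, jurisdiction) triple occurs."""
    counts = Counter()
    for url in urls:
        parsed = parse_license(url)
        if parsed is not None:
            counts[parsed] += 1
    return counts


if __name__ == "__main__":
    found = [
        "http://creativecommons.org/licenses/by/3.0/",
        "http://creativecommons.org/licenses/by-nc/2.5/fr/",
        "http://creativecommons.org/licenses/by/3.0/",
        "http://example.org/not-a-license",
    ]
    for key, n in tally(found).most_common():
        print(key, n)
```

Breaking the counts down further by content type, language, or geography would mean carrying those attributes alongside each URL; the same `Counter` approach extends to longer keys.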
Mozilla/Gecko Extension (MozCC)
- Develop an XPCOM component implementing a metadata store which MozCC and other applications could build upon.
Open access publishing and Science
- Create or add CC licensing and RDFa support to an open access publishing tool. Example:
- A practical RDFa proposal might center on tagging a scientific HTML document with triples extracted and built by an NLP tool. The entities in the text (genes, diseases, pathways, therapeutics, etc.) would be marked up directly around the corresponding words (perhaps rendered as typed links), while the triples describing how they interplay would be represented as well. This would bring together the human-readable world and the concept-codified space.
Semantic Web for Science
- Science Commons is using semantic web technologies to promote data accessibility and interoperability, so any project that makes the semantic web work better, especially for life scientists, is of interest. Examples:
- implement a macro system for OWL
- implement an RDF library for the Scheme programming language
- implement DL-Lite, an OWL subset designed to efficiently map to relational databases (see http://www.dis.uniroma1.it/~quonto/articoli/calv-etal-AAAI-2005.pdf )
- develop a library of RDF exporters for important sources of biological data
- Build a tool for optimizing access to particular RDF graphs, e.g. by synthesizing a relational schema inferred from the content of the graph and sample queries
- Adapt OpenNLP, GATE, or another open source natural language processing system to mine the open literature for interesting biological entities (cell lines, antibodies, ...) and relationships, rendering the results as RDF or RDFa
- Set up an open 'semantic wiki' for use by biologists: adapt an existing wiki implementation to add mechanisms for entry and/or deduction of entity identifications and relationships
An ideal proposal would include support for RDFa, remixing, open formats and affordances for educational and worldwide (not just wealthy regions) use. Ability to release under an open source license and incorporation of some Creative Commons affordance are necessary. However, a solid proposal is far more important than buzzword compliance. Please read Google's Summer of Code Student FAQ and advice from past participants as you create your proposal. Good luck!
Note: email addresses, in parentheses, are all at creativecommons.org