Difference between revisions of "Summer of Code 2007"

From Creative Commons
(Web Mashups)
 
(52 intermediate revisions by 8 users not shown)
Line 1: Line 1:
[[Category:TechChallenges]]
+
[[Category:Developer Challenges]]
 
[[Category:Developer]]
 
[[Category:Developer]]
 
[[Category:Summerofcode]]
 
[[Category:Summerofcode]]
[[Category:2006]]
 
  
Creative Commons is participating in [http://code.google.com/soc/ Google's Summer of Code 2006] as a mentoring organization.
+
Creative Commons is participating in [http://code.google.com/soc/ Google's Summer of Code] as a mentoring organization.  Submissions for SoC 2007 will open in March; see the [http://code.google.com/support/bin/answer.py?answer=60325&topic=10729 timeline] for more details.  A list of projects accepted for last year's SoC 2006 is available [[Summer of Code 2006 Projects|here]].
  
This page highlights ideas for [http://creativecommons.org/weblog/entry/5857 Google Summer of Code] student proposals and feature updates as the program progresses.  
+
This page highlights ideas and suggestions for student proposals for 2007.
  
 
== Students ==  
 
== Students ==  
Line 40: Line 39:
 
== Deadlines ==  
 
== Deadlines ==  
  
Student applications open on '''May 1, 2006''' and close '''May 8, 2006'''. Final decisions by Creative Comons will be made by '''May 22, 2006''' for submission to Google.
+
TBD
  
 
== General Ideas ==
 
== General Ideas ==
  
More ideas are avaible in the [[Tech Challenges]] section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.
+
More ideas are available in the [[Developer Challenges]] section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.
  
 
=== Publish ===
 
=== Publish ===
  
Any new tools which support publishing of content licensed with a Creative Commons license
+
* Any new tools which support publishing of content licensed with a Creative Commons license
 +
* Develop plugins that utilize Creative Commons licenses and metadata in your favorite applications.  If these are web-based ideally licensing both at site-level and "object" (e.g., page, image) level should be supported, and [[RDFa]] metadata.
 +
** joomla, drupal, civicspace, plone,
  
 
=== Find (Search) ===
 
=== Find (Search) ===
  
Any new tools which support finding of content licensed with a Creative Commons license
+
* Any new tools which support finding of content licensed with a Creative Commons license
 +
** beagle, tracker, spotlight (continue/finish)
 +
* Extend the [[CcNutch]] codebase to support [[RDFa]] and image, audio, or video search (using scoped metadata, not image/audio/video analysis!)
 +
* Feed aggregators
 +
** Add cc licenses and metadata support
 +
** A feed mashup library
 +
* MusicBrainz has a [http://wiki.musicbrainz.org/SummerOfCodeIdeas project idea for improving Creative Commons integration] in that site.
  
=== ccTools ===
+
=== Expand Other Software with ccHost's Features or ccHost ===
  
The general tools created to work with Creative Commons licenses and content licensed with these licenses.
+
* Implement the [[Sample Pool API]] in other web backends or software applications
 +
** wordpress, mediawiki, plone, drupal
 +
* Remix tracking within software or site
  
==== ccHost ====
+
=== Expand Software with ccPublisher's Features or ccPublisher ===
  
* Extend ccHost to work with new media filetypes and push changes up-stream to getid3()
+
* Implement support for embedding license metadata in additional file types; this support would be in the form of additions to the <code>cctagutils</code> library.  Contact [[User:NathanYergler|Nathan Yergler]] for details or specifics.
* Setup and build a video version of ccmixter.org using cchost.
+
** Embedding of XMP in various formats
* Implement [[Sample_Pool_API|sample pool API]] in other web backends or software applications
+
* Implement back-end support for other publishing platforms, such as Flickr, MySpace, etc.  Basic [[Writing_a_Storage_Provider|documentation]] on storage providers has been started.
 +
* Add ccPublisher's publish mechanism to another application
 +
 
 +
=== Desktop Applications ===
 +
 
 +
* Add support for selecting a license within open source applications such as those listed below.  A successful implementation will use the [[Creative_Commons_Web_Services|web services]] to provide up-to-date license information.
 +
* Integrate finding and reusing of CC licensed content directly within applications like [http://www.openoffice.org OpenOffice.org], [http://gimp.org The Gimp], [http://www.inkscape.org Inkscape], [http://audacity.sourceforge.net Audacity], etc.
 +
** HIGH-PRIORITY: OpenOffice.org, Other Open Source apps, Jokosher
 +
 
 +
=== Desktop Integration ===
  
==== ccPublisher ====
+
* Integrate finding and publishing of CC licensed content directly within the Open Source Desktop (think Gnome or KDE integration).  A starting point for Gnome may be the prototype Nautilus extension for displaying license information embedded in MP3 files.
 +
** Beagle, Tracker, etc
 +
* Extend the [http://yergler.net/projects/cc-spotlight/ CC licensing extractor] for [http://www.apple.com/macosx/features/spotlight/ Spotlight] to support multiple file formats and polish it to release quality.  Issues which must be addressed include extractor chaining and packaging.
  
* Implement support for uploading works to [[CcHost|ccHost]] installations.  Note that this may require working with the ccHost codebase as well.
+
=== Web Mashups ===
* Implement support for embedding license metadata in additional file types; this support would be in the form of additions to the <code>cctagutils</code> library.  Contact [[User:NathanYergler|Nathan Yergler]] for details or specifics.
 
* Implement back end support for other publishing platforms, such as Flickr, My Space, etc.  Basic [[Writing_a_Storage_Provider|documentation]] on storage providers has been started.
 
  
=== LiveCD ===
+
* It's all the craze! Develop some [http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29 web mashups] by combining multiple different web-based APIs (creative commons, amazon, google, flickr, archive.org) to create a project that uses these APIs to help spread CC-licensing.
  
Create an Open Source [LiveCD http://en.wikipedia.org/wiki/Live_CD] which adds CC-licensed content and features the Creative Commons Tools.  Also, one should make a way to keep in sync with mainline LiveCDs so that this project does not have to maintain its own LiveCD.
+
=== Statistics and Metrics ===
  
=== Applications ===
+
* Write tools to perform web crawls and characterize use of CC licenses according to content type, language, geography, metadata use, etc. The ideal output would be something analogous to [http://code.google.com/webstats/index.html Google Web Authoring Statistics] of interest to CC.
  
* Integrate finding and publishing of CC licensed content directly within applications like [http://www.openoffice.org OpenOffice.org], [http://gimp.org The Gimp], and [http://www.inkscape.org]
+
=== RDFa/Metadata Specific ===
* Add support for selecting a license withing open source applications such as [http://gimp.org The Gimp] or [http://www.openoffice.org OpenOffice.org].  A successful implementation will use the [[Creative_Commons_Web_Services|web services]] to provide up to date license information.
 
  
=== Desktop ===
+
* A possible addition to labs that would specifically demo RDFa and Metadata, as both advocacy and testing
 +
* Re-implement validator.creativecommons.org as a useful resource with RDFa and current CC metadata specs.
  
* Integrate finding and publishing of CC licensed content directly within the Open Source Desktop (think Gnome or KDE integration).  A starting point for Gnome may be the prototype Nautilus extension for displaying license information embedded in MP3 files.
+
=== Search ===
* Extend the [http://yergler.net/projects/cc-spotlight/ CC licensing extractor] for [http://www.apple.com/macosx/features/spotlight/ Spotlight] to support multiple file formats and polish it to release quality.  Issues which must be addressed include extractor chaining and packaging.
 
  
=== Media Mixing ===
+
* Alternate takes on ccSearch, possibly using Open Search, multiple search APIs
 +
** Possibly some type of top/bottom fed search engine using cc-licensed enabled feeds
  
* Build basic media mixing tools either for the Open Source Desktop or for the web (think AJAX) that allows for mixing of legal media files (photos, videos, music, etc) in order to create interesting remixes and art.
+
=== Mozilla/Gecko Extension (MozCC) ===
  
=== Plugins ===
+
* Develop an XPCOM component implementing a metadata store which MozCC and other applications could build upon.
  
* Develop plugins that utilize Creative Commons licenses and metadata in your favorite applications
+
=== Open access publishing and Science ===
* Update and revise [http://yergler.net/projects/mozcc mozCC].  New features needed include support for [[RDFa]] embedded metadata and visualization of license information for specific elements (i.e. outlining an image if the metadata declares that the image is specifically licensed).
 
  
=== Web Mashups ===
+
* Create or add CC licensing and [[RDFa]] support to an [http://en.wikipedia.org/wiki/Open_access open access publishing tool]. Example:
 +
** It would appear a practical RDFa proposal would be around tagging a scientific HTML doc with the triples extracted and built with an NLP tool. The elements of the text, genes, diseases, pathways, therapeutics, etc, would be directly embedded in the text around such words (perhaps rendered as typed links) , while the triples of how they interplay would be represented as well. This would bring together the human readable world with the concept-codified space.
  
* Its all the craze! Develop some [http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29 web mashups] by combining multiple different web-based APIs (creative commons, amazon, google, flickr, archive.org) to create a project that uses these APIs to help spread CC-licensing
+
=== Semantic Web for Science ===
  
=== Attributors ===
+
* [http://sciencecommons.org/ Science Commons] is using semantic web technologies to promote data accessibility and interoperability, so any project that makes the semantic web work better, especially for life scientists, is of interest.  Examples:
 +
** implement computed properties for the Pellet OWL DL reasoner, with formulas expressed in JavaScript
 +
** implement a macro system for OWL
 +
** implement an RDF library for the Scheme programming language
 +
** implement DL-Lite, an OWL subset designed to efficiently map to relational databases (see http://www.dis.uniroma1.it/~quonto/articoli/calv-etal-AAAI-2005.pdf )
 +
** develop a library of RDF exporters for important sources of biological data
 +
* Build a tool for optimizing access to particular RDF graphs, e.g. by synthesizing a relational schema inferred from the content of the graph and sample queries
 +
* Adapt OpenNLP, GATE, or other open source natural language processing system to mine the open literature for interesting biological entities (cell lines, antibodies, ...) and relationships, rendering the results as RDF or RDFa
 +
* Set up an open 'semantic wiki' for use by biologists: adapt an existing wiki implementation to add mechanisms for entry and/or deduction of entity identifications and relationships
  
* Create scripts which add attribution and basic license information into media
+
== Bonus Points ==
** Photos: this would create some basic graphical overlay to the image, or basic html wrappers around the content that says author name, license (url).
+
An ''ideal'' proposal would include support for [[RDFa]], remixing, [http://en.wikipedia.org/wiki/Open_format open formats] and affordances for educational and worldwide (not just wealthy regions) use. Ability to release under an open source license and incorporation of some Creative Commons affordance are necessary. However, a solid proposal is far more important than buzzword compliance. Please read Google's [http://code.google.com/soc/studentfaq.html Summer of Code Student FAQ] and [[#Writing_Proposals|advice from past participants]] as you create your proposal. Good luck!
** Other Media: Please propose other ways one could attribute authorship on the media itself
 
  
 
== Mentors ==
 
== Mentors ==
Line 109: Line 134:
 
* [[User:NathanYergler|Nathan Yergler]] (nyergler)
 
* [[User:NathanYergler|Nathan Yergler]] (nyergler)
 
* [[User:Alex|Alex Roberts]]
 
* [[User:Alex|Alex Roberts]]
 +
* [[User:Jar|Jonathan Rees]] (jar)  ([http://sciencecommons.org/ Science Commons])
 +
* [[User:AlanRuttenberg|Alan Ruttenberg]] (alanr) ([http://sciencecommons.org/ Science Commons])
 +
 +
Note: email addresses, in parentheses, are all at creativecommons.org
  
 
== External Links ==
 
== External Links ==

Latest revision as of 15:34, 4 March 2008


Creative Commons is participating in Google's Summer of Code as a mentoring organization. Submissions for SoC 2007 will open in March; see the timeline for more details. A list of projects accepted for last year's SoC 2006 is available here.

This page highlights ideas and suggestions for student proposals for 2007.

Students

If you find an idea listed below that you like or have your own idea for a Creative Commons-related open source project, we encourage you to read up about the Creative Commons Developer Community, ask questions, and then include the following in your proposal:

  1. Detailed description / design document
  2. Approximate schedule (timeline)
  3. Brief description of past projects (including open source) that you've participated in
  4. Brief resume/bio/contact information

Writing Proposals

The following links describe approaches that have worked well for writing a Summer of Code proposal:

Selection Criteria

Please read the Selection Criteria. Participants who read this will be much further along than others.

Questions

  1. Read up about the Creative Commons Developer Community
  2. Join the cc-devel mailing list and ask questions
  3. Join the Creative Commons chat channel, #cc, on irc.freenode.net.

Deadlines

TBD

General Ideas

More ideas are available in the Developer Challenges section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.

Publish

  • Any new tools which support publishing of content licensed with a Creative Commons license
  • Develop plugins that utilize Creative Commons licenses and metadata in your favorite applications. Web-based plugins should ideally support licensing at both the site level and the "object" level (e.g., page, image), along with RDFa metadata.
    • Joomla, Drupal, CivicSpace, Plone
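
As a rough illustration of site-level versus object-level licensing, the sketch below generates the kind of markup a publishing plugin might emit. It assumes the common convention of marking a license with rel="license" on a link and scoping it to one object with an about attribute; the function names license_link and licensed_object are hypothetical and do not come from any existing plugin.

```python
from html import escape


def license_link(license_url, license_name):
    """Return an anchor that declares license_url as a license (site-level use)."""
    return '<a rel="license" href="{}">{}</a>'.format(
        escape(license_url, quote=True), escape(license_name)
    )


def licensed_object(about, license_url, license_name):
    """Wrap the license link so it applies to one object (a page, an image, ...).

    The `about` attribute names the object the license statement refers to.
    """
    return '<div about="{}">{}</div>'.format(
        escape(about, quote=True), license_link(license_url, license_name)
    )


if __name__ == "__main__":
    url = "http://creativecommons.org/licenses/by/3.0/"
    print(license_link(url, "Creative Commons Attribution 3.0"))
    print(licensed_object("photo.jpg", url, "Creative Commons Attribution 3.0"))
```

A real plugin would also need to store the chosen license per site and per object; this sketch only covers the output step.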

Find (Search)

  • Any new tools which support finding of content licensed with a Creative Commons license
    • Beagle, Tracker, Spotlight (continue/finish)
  • Extend the CcNutch codebase to support RDFa and image, audio, or video search (using scoped metadata, not image/audio/video analysis!)
  • Feed aggregators
    • Add CC license and metadata support
    • A feed mashup library
  • MusicBrainz has a project idea for improving Creative Commons integration on that site.
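
Many of the "find" ideas above start from the same step: discovering which license a page declares. The following is a minimal sketch of that step using only the Python standard library. It assumes pages declare licenses through rel="license" on a or link elements; the names LicenseLinkParser and find_licenses are illustrative, and this is not how CcNutch or any existing tool is implemented.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect the href of every <a> or <link> element whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "license" in rel_values and href:
            self.licenses.append(href)


def find_licenses(html_text):
    """Return the license URLs declared in html_text, in document order."""
    parser = LicenseLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.licenses
```

A crawler or desktop indexer could call find_licenses on each fetched document and store the result next to its other metadata.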

Expand Other Software with ccHost's Features or ccHost

  • Implement the Sample Pool API in other web backends or software applications
    • WordPress, MediaWiki, Plone, Drupal
  • Remix tracking within software or site

Expand Software with ccPublisher's Features or ccPublisher

  • Implement support for embedding license metadata in additional file types; this support would be in the form of additions to the cctagutils library. Contact Nathan Yergler for details or specifics.
    • Embedding of XMP in various formats
  • Implement back-end support for other publishing platforms, such as Flickr, MySpace, etc. Basic documentation on storage providers has been started.
  • Add ccPublisher's publish mechanism to another application

Desktop Applications

  • Add support for selecting a license within open source applications such as those listed below. A successful implementation will use the web services to provide up-to-date license information.
  • Integrate finding and reusing of CC-licensed content directly within applications like OpenOffice.org, The Gimp, Inkscape, Audacity, etc.
    • HIGH-PRIORITY: OpenOffice.org, Other Open Source apps, Jokosher

Desktop Integration

  • Integrate finding and publishing of CC licensed content directly within the Open Source Desktop (think Gnome or KDE integration). A starting point for Gnome may be the prototype Nautilus extension for displaying license information embedded in MP3 files.
    • Beagle, Tracker, etc
  • Extend the CC licensing extractor for Spotlight to support multiple file formats and polish it to release quality. Issues which must be addressed include extractor chaining and packaging.

Web Mashups

  • They're all the rage! Develop some web mashups by combining multiple web-based APIs (Creative Commons, Amazon, Google, Flickr, archive.org) to create a project that helps spread CC licensing.

Statistics and Metrics

  • Write tools to perform web crawls and characterize use of CC licenses according to content type, language, geography, metadata use, etc. The ideal output would be analogous to Google Web Authoring Statistics, covering the data of interest to CC.
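
One small piece of such a tool is turning the license URLs found during a crawl into countable categories. The sketch below assumes license URLs follow the layout /licenses/<code>/<version>/ with an optional jurisdiction segment, which matches commonly seen CC license URLs but is an assumption here; parse_license_url and tally_licenses are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlparse

CC_HOSTS = ("creativecommons.org", "www.creativecommons.org")


def parse_license_url(url):
    """Split a CC license URL into (code, version, jurisdiction), or return None.

    Assumed layout: /licenses/<code>/<version>/[<jurisdiction>/]
    """
    parsed = urlparse(url)
    if parsed.netloc.lower() not in CC_HOSTS:
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 3 or parts[0] != "licenses":
        return None
    code, version = parts[1], parts[2]
    jurisdiction = None
    # A trailing "legalcode" segment points at the legal text, not a jurisdiction.
    if len(parts) > 3 and parts[3] != "legalcode":
        jurisdiction = parts[3]
    return code, version, jurisdiction


def tally_licenses(urls):
    """Count how often each license code (by, by-nc, ...) appears in urls."""
    counts = Counter()
    for url in urls:
        info = parse_license_url(url)
        if info is not None:
            counts[info[0]] += 1
    return counts
```

The same tally could be keyed on version or jurisdiction to approximate the breakdowns listed above; content type and language would have to come from the crawl itself.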

RDFa/Metadata Specific

  • Build an addition to labs that specifically demonstrates RDFa and metadata, serving as both advocacy and testing
  • Re-implement validator.creativecommons.org as a useful resource that supports RDFa and the current CC metadata specs.

Search

  • Alternative takes on ccSearch, possibly using OpenSearch or multiple search APIs
    • Possibly some type of top/bottom-fed search engine built on CC-license-enabled feeds

Mozilla/Gecko Extension (MozCC)

  • Develop an XPCOM component implementing a metadata store which MozCC and other applications could build upon.

Open access publishing and Science

  • Create or add CC licensing and RDFa support to an open access publishing tool. Example:
    • A practical RDFa proposal might tag a scientific HTML document with triples extracted and built by an NLP tool. The elements of the text (genes, diseases, pathways, therapeutics, etc.) would be marked up directly in the text around the corresponding words (perhaps rendered as typed links), and the triples describing how they interact would be represented as well. This would bring together the human-readable world with the concept-codified space.

Semantic Web for Science

  • Science Commons is using semantic web technologies to promote data accessibility and interoperability, so any project that makes the semantic web work better, especially for life scientists, is of interest. Examples:
    • implement computed properties for the Pellet OWL DL reasoner, with formulas expressed in JavaScript
    • implement a macro system for OWL
    • implement an RDF library for the Scheme programming language
    • implement DL-Lite, an OWL subset designed to efficiently map to relational databases (see http://www.dis.uniroma1.it/~quonto/articoli/calv-etal-AAAI-2005.pdf )
    • develop a library of RDF exporters for important sources of biological data
  • Build a tool for optimizing access to particular RDF graphs, e.g. by synthesizing a relational schema inferred from the content of the graph and sample queries
  • Adapt OpenNLP, GATE, or another open source natural language processing system to mine the open literature for interesting biological entities (cell lines, antibodies, ...) and relationships, rendering the results as RDF or RDFa
  • Set up an open 'semantic wiki' for use by biologists: adapt an existing wiki implementation to add mechanisms for entry and/or deduction of entity identifications and relationships
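
To give a feel for the "RDF exporter" idea above, here is a minimal sketch that serializes one record as N-Triples. Everything specific in it is an assumption made for illustration: the record layout, the example.org URIs, the associatedWith predicate, and the function names. Only the rdfs:label URI is a real, widely used term.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical predicate, used only for this example.
ASSOCIATED_WITH = "http://example.org/vocab#associatedWith"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triple(subject, predicate, obj, obj_is_uri=False):
    """Format one N-Triples statement; obj is a URI or a plain literal."""
    if obj_is_uri:
        obj_term = "<{}>".format(obj)
    else:
        obj_term = '"{}"'.format(escape_literal(obj))
    return "<{}> <{}> {} .".format(subject, predicate, obj_term)


def export_gene(record, base="http://example.org/gene/"):
    """Export a made-up gene record (id, symbol, diseases) as N-Triples text."""
    subject = base + record["id"]
    lines = [triple(subject, RDFS_LABEL, record["symbol"])]
    for disease_uri in record.get("diseases", []):
        lines.append(triple(subject, ASSOCIATED_WITH, disease_uri, obj_is_uri=True))
    return "\n".join(lines)
```

An exporter for a real biological data source would map that source's own identifiers and fields onto agreed vocabularies instead of the placeholder URIs used here.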

Bonus Points

An ideal proposal would include support for RDFa, remixing, open formats and affordances for educational and worldwide (not just wealthy regions) use. Ability to release under an open source license and incorporation of some Creative Commons affordance are necessary. However, a solid proposal is far more important than buzzword compliance. Please read Google's Summer of Code Student FAQ and advice from past participants as you create your proposal. Good luck!

Mentors

Note: email addresses, in parentheses, are all at creativecommons.org

External Links