Difference between revisions of "Summer of Code 2007"

From Creative Commons
(General Ideas)
 
(14 intermediate revisions by 6 users not shown)
Line 3: Line 3:
 
[[Category:Summerofcode]]
 
[[Category:Summerofcode]]
  
Creative Commons is participates in [http://code.google.com/soc/ Google's Summer of Code] as a mentoring organization.  Submissions for SoC 2007 will open in March; see the [http://code.google.com/support/bin/answer.py?answer=60325&topic=10729 timeline] for more details.  A list of projects accepted for last year's SoC 2006 is available [[Summer of Code 2006 Projects|here]].
+
Creative Commons is participating in [http://code.google.com/soc/ Google's Summer of Code] as a mentoring organization.  Submissions for SoC 2007 will open in March; see the [http://code.google.com/support/bin/answer.py?answer=60325&topic=10729 timeline] for more details.  A list of projects accepted for last year's SoC 2006 is available [[Summer of Code 2006 Projects|here]].
  
 
This page highlights ideas and suggestions for student proposals for 2007.
 
This page highlights ideas and suggestions for student proposals for 2007.
Line 43: Line 43:
 
== General Ideas ==
 
== General Ideas ==
  
More ideas are avaible in the [[Developer Challenges]] section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.
+
More ideas are available in the [[Developer Challenges]] section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.
  
 
=== Publish ===
 
=== Publish ===
Line 59: Line 59:
 
** Add cc licenses and metadata support
 
** Add cc licenses and metadata support
 
** A feed mashup library
 
** A feed mashup library
 +
* MusicBrainz has a [http://wiki.musicbrainz.org/SummerOfCodeIdeas project idea for improving Creative Commons integration] in that site.
  
 
=== Expand Other Software with ccHost's Features or ccHost ===
 
=== Expand Other Software with ccHost's Features or ccHost ===
Line 87: Line 88:
 
=== Web Mashups ===
 
=== Web Mashups ===
  
* Its all the craze! Develop some [http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29 web mashups] by combining multiple different web-based APIs (creative commons, amazon, google, flickr, archive.org) to create a project that uses these APIs to help spread CC-licensing.
+
* It's all the craze! Develop some [http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29 web mashups] by combining multiple different web-based APIs (creative commons, amazon, google, flickr, archive.org) to create a project that uses these APIs to help spread CC-licensing.
 +
 
 +
=== Statistics and Metrics ===
 +
 
 +
* Write tools to perform web crawls and characterize use of CC licenses according to content type, language, geography, metadata use, etc. The ideal output would be something analogous to [http://code.google.com/webstats/index.html Google Web Authoring Statistics] of interest to CC.
 +
 
 +
=== RDFa/Metadata Specific ===
 +
 
 +
* A possible addition to labs that would specifically demo RDFa and Metadata, as both advocacy and testing
 +
* Re-implement validator.creativecommons.org as a useful resource with RDFa and current CC metadata specs.
 +
 
 +
=== Search ===
 +
 
 +
* Alternate takes on ccSearch, possibly using Open Search, multiple search APIs
 +
** Possibly some type of top/bottom fed search engine using cc-licensed enabled feeds
 +
 
 +
=== Mozilla/Gecko Extension (MozCC) ===
 +
 
 +
* Develop an XPCOM component implementing a metadata store which MozCC and other applications could build upon.
  
 
=== Open access publishing and Science ===
 
=== Open access publishing and Science ===
  
* Create or add CC licensing and [[RDFa]] support to an [http://en.wikipedia.org/wiki/Open_access open access publishing tool].
+
* Create or add CC licensing and [[RDFa]] support to an [http://en.wikipedia.org/wiki/Open_access open access publishing tool]. Example:
** Example
+
** It would appear a practical RDFa proposal would be around tagging a scientific HTML doc with the triples extracted and built with an NLP tool. The elements of the text, genes, diseases, pathways, therapeutics, etc, would be directly embedded in the text around such words (perhaps rendered as typed links) , while the triples of how they interplay would be represented as well. This would bring together the human readable world with the concept-codified space.
*** It would appear a practical RDFa proposal would be around tagging a scientific HTML doc with the triples extracted and built with an NLP tool. The elements of the text, genes, diseases, pathways, therapeutics, etc, would be directly embedded in the text around such words (perhaps rendered as typed links) , while the triples of how they interplay would be represented as well. This would bring together the human readable world with the concept-codified space.
+
 
* An application that does something interesting with [http://creativecommons.org/weblog/entry/5778 Uniprot] or other CC-licensed dataset enabled by CC licensing.
+
=== Semantic Web for Science ===
 +
 
 +
* [http://sciencecommons.org/ Science Commons] is using semantic web technologies to promote data accessibility and interoperability, so any project that makes the semantic web work better, especially for life scientists, is of interest.  Examples:
 +
** implement computed properties for the Pellet OWL DL reasoner, with formulas expressed in Javascript
 +
** implement a macro system for OWL
 +
** implement an RDF library for the Scheme programming language
 +
** implement DL-Lite, an OWL subset designed to efficiently map to relational databases (see http://www.dis.uniroma1.it/~quonto/articoli/calv-etal-AAAI-2005.pdf )
 +
** develop a library of RDF exporters for important sources of biological data
 +
* Build a tool for optimizing access to particular RDF graphs, e.g. by synthesizing a relational schema inferred from the content of the graph and sample queries
 +
* Adapt OpenNLP, GATE, or other open source natural language processing system to mine the open literature for interesting biological entities (cell lines, antibodies, ...) and relationships, rendering the results as RDF or RDFa
 +
* Set up an open 'semantic wiki' for use by biologists: adapt an existing wiki implementation to add mechanisms for entry and/or deduction of entity identifications and relationships
  
 
== Bonus Points ==
 
== Bonus Points ==
Line 105: Line 134:
 
* [[User:NathanYergler|Nathan Yergler]] (nyergler)
 
* [[User:NathanYergler|Nathan Yergler]] (nyergler)
 
* [[User:Alex|Alex Roberts]]
 
* [[User:Alex|Alex Roberts]]
 +
* [[User:Jar|Jonathan Rees]] (jar)  ([http://sciencecommons.org/ Science Commons])
 +
* [[User:AlanRuttenberg|Alan Ruttenberg]] (alanr) ([http://sciencecommons.org/ Science Commons])
 +
 +
Note: email addresses, in parentheses, are all at creativecommons.org
  
 
== External Links ==
 
== External Links ==

Latest revision as of 15:34, 4 March 2008


Creative Commons is participating in Google's Summer of Code as a mentoring organization. Submissions for SoC 2007 will open in March; see the timeline for more details. A list of projects accepted for last year's SoC 2006 is available here.

This page highlights ideas and suggestions for student proposals for 2007.

Students

If you find an idea listed below that you like or have your own idea for a Creative Commons-related open source project, we encourage you to read up about the Creative Commons Developer Community, ask questions, and then include the following in your proposal:

  1. A detailed description / design document
  2. An approximate schedule (timeline)
  3. A brief description of past projects (including open source) that you've participated in
  4. A brief resume/bio and contact information

Writing Proposals

The following links describe general approaches to writing a successful Summer of Code proposal:

Selection Criteria

Please read the Selection Criteria. Participants who read this will be much further along than others.

Questions

  1. Read up about the Creative Commons Developer Community
  2. Join the cc-devel mailing list and ask questions
  3. Join the Creative Commons chat channel, #cc, on irc.freenode.net.

Deadlines

TBD

General Ideas

More ideas are available in the Developer Challenges section of the website. What follows is a generalized listing of quick ideas which any student may use to identify interests. Please do not be constrained by the ideas below, but please use them to jumpstart and understand the general areas we are interested in supporting.

Publish

  • Any new tools which support publishing of content licensed with a Creative Commons license
  • Develop plugins that use Creative Commons licenses and metadata in your favorite applications. Web-based plugins should ideally support licensing at both the site level and the "object" level (e.g., page, image), along with RDFa metadata.
    • Joomla, Drupal, CivicSpace, Plone
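
As a rough illustration of site-level versus object-level licensing, the following Python sketch emits the license link a plugin might add to a page. The function name `license_markup` and its parameters are hypothetical and not part of any existing plugin. The only conventions assumed are the `rel="license"` link and the RDFa `about` attribute, which names the object the license applies to.

```python
from html import escape


def license_markup(license_url, license_name, about=None):
    """Return an HTML fragment declaring a license.

    Hypothetical helper for illustration only.
    - about=None: the license applies to the enclosing page (site/page level).
    - about=<URI>: the license applies to that specific object (e.g. an image).
    """
    about_attr = f' about="{escape(about, quote=True)}"' if about else ""
    return (
        f"<span{about_attr}>Licensed under "
        f'<a rel="license" href="{escape(license_url, quote=True)}">'
        f"{escape(license_name)}</a></span>"
    )


# Page-level license
page_html = license_markup(
    "http://creativecommons.org/licenses/by/3.0/", "CC BY 3.0"
)

# Object-level license for a single image (example.org URI is a placeholder)
image_html = license_markup(
    "http://creativecommons.org/licenses/by-sa/3.0/",
    "CC BY-SA 3.0",
    about="http://example.org/photos/1.jpg",
)
```

A real plugin would also need to let the author pick the license and store that choice per site and per object; the sketch only covers the markup step.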

Find (Search)

  • Any new tools which support finding of content licensed with a Creative Commons license
    • beagle, tracker, spotlight (continue/finish)
  • Extend the CcNutch codebase to support RDFa and image, audio, or video search (using scoped metadata, not image/audio/video analysis!)
  • Feed aggregators
    • Add cc licenses and metadata support
    • A feed mashup library
  • MusicBrainz has a project idea for improving Creative Commons integration in that site.

Expand Other Software with ccHost's Features or ccHost

  • Implement the Sample Pool API in other web backends or software applications
    • wordpress, mediawiki, plone, drupal
  • Remix tracking within software or site

Expand Software with ccPublisher's Features or ccPublisher

  • Implement support for embedding license metadata in additional file types; this support would be in the form of additions to the cctagutils library. Contact Nathan Yergler for details or specifics.
    • Embedding of XMP in various formats
  • Implement back-end support for other publishing platforms, such as Flickr, MySpace, etc. Basic documentation on storage providers has been started.
  • Add ccPublisher's publish mechanism to another application

Desktop Applications

  • Add support for selecting a license within open source applications such as those listed below. A successful implementation will use the web services to provide up-to-date license information.
  • Integrate finding and reusing of CC licensed content directly within applications like OpenOffice.org, The Gimp, Inkscape, Audacity, etc.
    • HIGH-PRIORITY: OpenOffice.org, Other Open Source apps, Jokosher

Desktop Integration

  • Integrate finding and publishing of CC licensed content directly within the Open Source Desktop (think Gnome or KDE integration). A starting point for Gnome may be the prototype Nautilus extension for displaying license information embedded in MP3 files.
    • Beagle, Tracker, etc
  • Extend the CC licensing extractor for Spotlight to support multiple file formats and polish it to release quality. Issues which must be addressed include extractor chaining and packaging.

Web Mashups

  • It's all the craze! Develop some web mashups by combining multiple different web-based APIs (creative commons, amazon, google, flickr, archive.org) to create a project that uses these APIs to help spread CC-licensing.

Statistics and Metrics

  • Write tools to perform web crawls and characterize use of CC licenses according to content type, language, geography, metadata use, etc. The ideal output would be something analogous to Google Web Authoring Statistics, focused on the data of interest to CC.
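
One small building block for such a tool is classifying the license links a crawler finds. The Python sketch below is a minimal example under stated assumptions: it handles only URLs of the form `http://creativecommons.org/licenses/<code>/<version>/[<jurisdiction>/]`, treats a jurisdiction as a two-letter code (a simplification), and uses hypothetical function names (`parse_license_url`, `tally_licenses`).

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches paths such as /licenses/by-sa/3.0/ or /licenses/by-nc/2.5/ca/
# Assumption: jurisdictions are two-letter codes; longer ones are ignored.
_LICENSE_PATH = re.compile(
    r"^/licenses/(?P<code>[a-z]+(?:-[a-z]+)*)/(?P<version>\d+\.\d+)"
    r"(?:/(?P<jurisdiction>[a-z]{2}))?(?:/.*)?$"
)

_CC_HOSTS = ("creativecommons.org", "www.creativecommons.org")


def parse_license_url(url):
    """Return (code, version, jurisdiction) or None if not a CC license URL."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in _CC_HOSTS:
        return None
    match = _LICENSE_PATH.match(parsed.path.lower())
    if match is None:
        return None
    return match.group("code"), match.group("version"), match.group("jurisdiction")


def tally_licenses(urls):
    """Count how often each license code (by, by-sa, ...) appears."""
    counts = Counter()
    for url in urls:
        parsed = parse_license_url(url)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts
```

The same tally could be keyed on version or jurisdiction instead, or combined with the content type and language of the crawled page to produce the breakdowns listed above.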

RDFa/Metadata Specific

  • A possible addition to labs that would specifically demo RDFa and Metadata, as both advocacy and testing
  • Re-implement validator.creativecommons.org as a useful resource with RDFa and current CC metadata specs.

Search

  • Alternate takes on ccSearch, possibly using Open Search, multiple search APIs
    • Possibly some type of top/bottom fed search engine using CC-license-enabled feeds

Mozilla/Gecko Extension (MozCC)

  • Develop an XPCOM component implementing a metadata store which MozCC and other applications could build upon.

Open access publishing and Science

  • Create or add CC licensing and RDFa support to an open access publishing tool. Example:
    • A practical RDFa proposal might center on tagging a scientific HTML document with triples extracted and built by an NLP tool. The elements of the text (genes, diseases, pathways, therapeutics, etc.) would be annotated directly in the text around the corresponding words (perhaps rendered as typed links), while the triples describing how they interplay would be represented as well. This would bring together the human-readable world and the concept-codified space.
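
The tagging step in the example above could be sketched roughly as follows in Python. This is an illustration, not a proposed design: the entity table stands in for the output of an NLP tool, the `example.org` URIs and the `ex:Gene` type are placeholders, and `annotate_entities` is a hypothetical name. It wraps each recognized term in a span carrying the RDFa `about` and `typeof` attributes; the relationships between entities are not covered.

```python
import re
from html import escape


def annotate_entities(text, entities):
    """Wrap known entity mentions in RDFa spans.

    entities maps a surface term to (entity URI, type CURIE); in a real
    system this table would come from an NLP tool, not be hand-written.
    """
    if not entities:
        return escape(text)
    # Try longer terms first so a longer mention wins over its prefix.
    terms = sorted(entities, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))

    parts = []
    pos = 0
    for match in pattern.finditer(text):
        parts.append(escape(text[pos:match.start()]))
        uri, rdf_type = entities[match.group(0)]
        parts.append(
            f'<span about="{escape(uri, quote=True)}" '
            f'typeof="{escape(rdf_type, quote=True)}">'
            f"{escape(match.group(0))}</span>"
        )
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


# Placeholder entity table (URI and type are illustrative only)
html_out = annotate_entities(
    "BRCA1 is linked to breast cancer.",
    {"BRCA1": ("http://example.org/gene/BRCA1", "ex:Gene")},
)
```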

Semantic Web for Science

  • Science Commons is using semantic web technologies to promote data accessibility and interoperability, so any project that makes the semantic web work better, especially for life scientists, is of interest. Examples:
    • implement computed properties for the Pellet OWL DL reasoner, with formulas expressed in JavaScript
    • implement a macro system for OWL
    • implement an RDF library for the Scheme programming language
    • implement DL-Lite, an OWL subset designed to efficiently map to relational databases (see http://www.dis.uniroma1.it/~quonto/articoli/calv-etal-AAAI-2005.pdf )
    • develop a library of RDF exporters for important sources of biological data
  • Build a tool for optimizing access to particular RDF graphs, e.g. by synthesizing a relational schema inferred from the content of the graph and sample queries
  • Adapt OpenNLP, GATE, or other open source natural language processing system to mine the open literature for interesting biological entities (cell lines, antibodies, ...) and relationships, rendering the results as RDF or RDFa
  • Set up an open 'semantic wiki' for use by biologists: adapt an existing wiki implementation to add mechanisms for entry and/or deduction of entity identifications and relationships

Bonus Points

An ideal proposal would include support for RDFa, remixing, open formats and affordances for educational and worldwide (not just wealthy regions) use. Ability to release under an open source license and incorporation of some Creative Commons affordance are necessary. However, a solid proposal is far more important than buzzword compliance. Please read Google's Summer of Code Student FAQ and advice from past participants as you create your proposal. Good luck!

Mentors

Note: email addresses, in parentheses, are all at creativecommons.org

External Links