Shared Names Implementation

Related To:
Tags: gsoc, scicom
Challenge Type: Developer
Is Complete: No

Several hundred important online databases and repositories are used routinely by biomedical researchers, and many of these resources include cross-references to records in other resources. The obvious way to implement these cross-references would be as web references (URIs), but this has not been workable because the existing URIs for the resources and the records they host are unstable. Instead, each resource has created its own private way of naming its external references.

Realizing that this leads to duplication and to interoperability problems when data from multiple resources is combined, representatives of bioinformatics and Semantic Web application projects have formed the Shared Names alliance to establish a common namespace to be used for cross-reference links. We need someone to implement aspects of a design that advances beyond existing prototypes. The system has many components; the basic functionality we want is listed below. We would match the exact work to be done to the candidate's interests and abilities. Work would consist of the following tasks:

  • write a script to create "about-page" templates (extract databank information from our meta-databank and render it as an RDF template). N.B. an "about-page" is a small piece of RDF served to Semantic Web clients; it provides basic metadata and links out to various presentations of the database content
  • write a script to create "about-pages" on the fly (fill in the RDF template with a particular identifier)
  • establish redirection rules based on information in the meta-databank (using either Apache or PURLZ, to be determined)
  • create a system for monitoring servers and bringing them online and offline by modifying the DNS zone file
  • improve the quality of information in the existing meta-databank through a combination of hand curation and script-based curation
  • package the result for distribution
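
To make the first two tasks concrete, here is a minimal Python sketch of the two-stage idea: first render a databank record into an RDF/XML template that still contains an identifier placeholder, then fill in that placeholder on request. Everything specific in it is an assumption made for illustration: the base URI `http://example.org/sharedname`, the record fields (`prefix`, `name`, `html_pattern`), the `${id}` placeholder syntax, and the choice of RDF properties. The actual meta-databank schema and about-page vocabulary are not specified on this page.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical base URI; the real Shared Names namespace is not given here.
BASE = "http://example.org/sharedname"

# Skeleton shared by all databanks. The properties used are illustrative only.
ABOUT_SKELETON = Template("""\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="${base}/${prefix}/${id}">
    <rdfs:label>${name} record ${id}</rdfs:label>
    <dcterms:isPartOf rdf:resource="${base}/${prefix}"/>
    <rdfs:seeAlso rdf:resource="${html_pattern}"/>
  </rdf:Description>
</rdf:RDF>
""")


def make_about_template(databank: dict) -> str:
    """Stage 1: render one databank record into a per-databank template.

    safe_substitute leaves the unknown ${id} placeholder untouched, so the
    result can be stored and filled in later.
    """
    fields = {key: escape(value) for key, value in databank.items()}
    return ABOUT_SKELETON.safe_substitute(base=BASE, **fields)


def fill_about_page(template: str, identifier: str) -> str:
    """Stage 2: fill a per-databank template with one record identifier."""
    return Template(template).substitute(id=escape(identifier))


if __name__ == "__main__":
    # Hypothetical meta-databank record.
    record = {
        "prefix": "exampledb",
        "name": "Example DB",
        "html_pattern": "http://db.example.org/entry/${id}",
    }
    template = make_about_template(record)
    print(fill_about_page(template, "P12345"))
```

Splitting the work this way means the meta-databank only has to be consulted when a template is generated, not on every about-page request.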

One current unknown is how far along the PURLZ software will be when the project starts. We will review its status at the beginning of the project. If PURLZ is sufficiently mature, we will work with the PURLZ project (also an open source project) to develop a PURLZ-based solution. Otherwise we will fall back on an Apache-based solution.
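
For the Apache fallback, the redirection rules could be generated from meta-databank records along the following lines. This is only a sketch under stated assumptions: the record fields (`prefix`, `html_pattern`), the `${id}` placeholder, the `/<prefix>/<identifier>` URL layout, and the use of a 303 redirect are all hypothetical choices, not decisions recorded on this page. The output uses the standard mod_rewrite `RewriteRule` directive in server or virtual-host context.

```python
import re


def rewrite_rule(prefix: str, html_pattern: str) -> str:
    """Return one mod_rewrite rule sending /<prefix>/<id> to the databank.

    "$1" is mod_rewrite's back-reference to the identifier captured by (.+).
    """
    target = html_pattern.replace("${id}", "$1")
    return f"RewriteRule ^/{re.escape(prefix)}/(.+)$ {target} [R=303,L]"


def render_rules(records: list) -> str:
    """Render a config fragment with one rule per hypothetical record."""
    lines = ["RewriteEngine On"]
    for record in records:
        lines.append(rewrite_rule(record["prefix"], record["html_pattern"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical meta-databank records.
    records = [
        {"prefix": "exampledb", "html_pattern": "http://db.example.org/entry/${id}"},
        {"prefix": "otherdb", "html_pattern": "http://other.example.org/view?acc=${id}"},
    ]
    print(render_rules(records), end="")
```

Because the rules are derived mechanically from the meta-databank, they can be regenerated whenever a databank's URL pattern changes, which is the instability the shared namespace is meant to absorb.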

Experience with RDF, Apache, and Nagios, or equivalent technologies, is desirable.

If successful, this project would set an example that could be replicated in other application domains, advancing the overall capability of the Semantic Web.

  • This page was last modified on 6 April 2010, at 16:04.