Case Studies/Dryad
== Media ==
[http://blog.datadryad.org/ Dryad News and Views]
Latest revision as of 16:24, 16 April 2013
Dryad is a repository for data underlying scientific publications, with an initial focus on evolution, ecology, and related fields. Dryad allows investigators to validate published findings, explore new analysis methodologies, repurpose the data for research questions unanticipated by the original authors, and perform synthetic studies such as formal meta-analyses. Dryad aims to provide one-stop data deposition upon publication by interfacing with specialized repositories which are already required for publication (such as GenBank and TreeBASE). To staunch the loss of published data that have no such repository, Dryad serves as a home for tables, spreadsheets, flat files, and all other kinds of published data that do not currently have one. A major design consideration with these data is to avoid placing an undue burden of metadata generation on individual researchers while at the same time capturing sufficient metadata to enable data discovery and reuse. (Source: https://www.nescent.org/wg_dryad/Main_Page)
Dryad is an international online repository for data underlying academic papers and other scientific publications. The National Evolutionary Synthesis Center and the University of North Carolina Metadata Research Center, in partnership with various journals and societies, are leading the development of the repository. Copyright in the deposited data is waived under the CC0 public domain dedication.
According to the Dryad Fact Sheet, the main goals of Dryad are:
- "To preserve all the underlying data reported in a paper at the time of publication, when there is the greatest incentive and the ability for authors to share their data. This is particularly important in the case of data for which a specialized repository does not exist."
- "To lower the burden of data sharing by providing one-stop data-deposition via handshaking with specialized repositories."
- "To assign globally unique identifiers to datasets, thus enabling data citations."
- "To allow end-users to perform sophisticated searches over data (not only by publication, but also by taxon, geography, geological age, biological concept, etc)."
- "To allow journals and societies to pool their resources for one shared repository."
- "To enable bidirectional search and retrieval with data repositories from related disciplines."
All data deposited in Dryad are released under the CC0 public domain dedication. Although CC0 imposes no legal requirement to give credit, the scientific norms for citing data still apply.
The following is also excerpted, from T. J. Vision, "Open Data and the Social Contract of Scientific Publishing," BioScience, May 2010, Vol. 60, No. 5, pp. 330–331:
We owe the effectiveness of the scientific enterprise in large part to the social contract under which scientists publish their findings in such a way that they may be confirmed or refuted and receive credit for their work in return. Because of the limitations of the printed page, data have been largely left out of this arrangement. We have grown accustomed to reading papers in which tables, figures, and statistics summarize the underlying data, but the data themselves are unavailable. There are exceptions, such as DNA sequences, for which there exist specialized public repositories that authors are required to use. But the vast majority of data types do not have such repositories.
Dryad (http://datadryad.org) [is] a digital repository designed specifically to enable authors to archive data upon publication and to promote the reuse of that data. The governing board of the repository is composed of representatives from a consortium of partner journals. The consortium has grown out of the original core of ecology and evolutionary biology journals that signed on to the JDAP [Joint Data Archiving Policy]. It currently includes more than a dozen journals, both society-owned and commercial.
One requirement for Dryad is that it be able to host any kind of orphan data. Therefore, the format and contents of the data files cannot practically be standardized, though journals are free to require minimal content standards or format conventions should they so choose, and the articles themselves provide important context for understanding the data.
A second critical requirement for Dryad is that it minimize the burden of submission for the author. To achieve this, partner journals provide Dryad with the bibliographic information for each article in advance of publication. Then, at the time of deposition, authors follow a link to a preexisting record in the Web submission system, log in, and upload their electronic files with some optional descriptive metadata and a “read me” file. To further minimize deposition burden, Dryad is developing interfaces to enable one-stop data submission for cases where some of the data belong in more specialized repositories.
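The two-step workflow described above (the journal registers the article first, the author deposits files later) can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, and the manuscript identifier are hypothetical and do not reflect Dryad's actual submission system or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SubmissionRecord:
    """Hypothetical record; not Dryad's real schema."""
    # Bibliographic fields supplied by the journal before publication.
    manuscript_id: str
    title: str
    journal: str
    # Fields supplied later by the author at deposition time.
    files: List[str] = field(default_factory=list)
    readme: Optional[str] = None       # the "read me" file
    description: Optional[str] = None  # optional descriptive metadata


class SubmissionSystem:
    """Hypothetical in-memory stand-in for the Web submission system."""

    def __init__(self) -> None:
        self._records: Dict[str, SubmissionRecord] = {}

    def register_article(self, manuscript_id: str, title: str,
                         journal: str) -> SubmissionRecord:
        # Step 1: the journal creates the record in advance of publication.
        record = SubmissionRecord(manuscript_id, title, journal)
        self._records[manuscript_id] = record
        return record

    def deposit(self, manuscript_id: str, files: List[str],
                readme: Optional[str] = None,
                description: Optional[str] = None) -> SubmissionRecord:
        # Step 2: the author opens the preexisting record and uploads files.
        # A KeyError here means the journal never registered the article.
        record = self._records[manuscript_id]
        if not files:
            raise ValueError("at least one data file is required")
        record.files.extend(files)
        record.readme = readme
        record.description = description
        return record
```

Because the bibliographic fields already exist when the author arrives, the author only supplies files and, optionally, a description, which is the burden-reducing design choice the excerpt describes.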
Dryad promotes data citations by assigning a unique, persistent, and resolvable digital object identifier (DOI) for inclusion in the published article. This takes the form of a DataCite DOI (http://www.datacite.org). Data are dedicated to the public domain through a Creative Commons Zero Public Domain Dedication (http://creativecommons.org/publicdomain/zero/1.0), which makes the terms of reuse both clear and nonrestrictive. A statement of community norms advises scientists who reuse the data to cite both the paper and the data as separate research products. Thus, Dryad provides a positive incentive for data archiving without erecting unnecessary barriers to data reuse.
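As a minimal sketch of what "resolvable" means in practice, a DOI can be turned into a link by appending it to the public resolver at https://doi.org/. The helper names and the two DOI strings below are hypothetical placeholders, not identifiers of real articles or datasets, and the validity check is deliberately crude.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Common ways a DOI is written in text; stripped before resolving.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Return the bare DOI (e.g. '10.1234/abc') from any common spelling."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Rough sanity check only: DOIs start with '10.' and contain a '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    return doi


def doi_to_url(doi: str) -> str:
    """Build a resolver URL, percent-encoding unsafe characters."""
    return DOI_RESOLVER + quote(normalize_doi(doi), safe="/")


def citation_links(article_doi: str, data_doi: str) -> dict:
    """Keep the paper and the data as separate citable products."""
    return {"article": doi_to_url(article_doi), "data": doi_to_url(data_doi)}


# Hypothetical identifiers, used purely for illustration.
links = citation_links("doi:10.1234/example.article", "10.1234/example.data")
```

Returning the two links separately mirrors the community norm quoted above, under which a reuser cites both the paper and the data.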