Difference between revisions of "Marking Works Technical"

From Creative Commons

Revision as of 21:53, 21 August 2006

I'm posting this here to get it off my laptop and motivate me to finish it. Please do not edit yet. Thanks. Mike Linksvayer 16:33, 15 May 2006 (UTC)

This article describes best practices for marking works licensed with a Creative Commons license or dedicated to the public domain with both human readable and machine readable notices, on the web and otherwise.

Principles

  • Copyright notice should be published on the web. Offline or peer-distributed works should refer to a notice published on the web. A web notice allows content to be found via web search and content owners to be found by viewers of untethered works; it may be trusted to the extent a web page may be trusted; and it allows for further annotation by the owner and others.
  • Copyright notice should be visible and unambiguous to humans and computers, especially the former. Human and computer visible notices should be one and the same.
  • Copyright notice best practices should be general rather than applicable only to works offered under one of the Creative Commons licenses.

Web

A web page is the preferred venue for published copyright notices. Notices published elsewhere should refer to an equivalent notice published on the web.

       This text is licensed to the public under the <a rel="license"
        href="http://creativecommons.org/licenses/by/2.5/">Creative Commons Attribution 2.5 License</a>.

That HTML snippet says that the current document is licensed under CC BY 2.5. The current document is the default subject, rel="license" sets the predicate (the verb), and the URL in the href value sets the object.
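As a rough illustration of that subject/predicate/object reading, here is a minimal Python sketch that pulls license statements out of a page using only the standard library. The names LicenseLinkParser and license_triples, and the page URL http://example.org/page.html, are made up for this example; it is not a full RDFa processor and only looks at rel="license" links.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect (subject, predicate, object) triples from rel="license" links."""

    def __init__(self, document_url):
        super().__init__()
        # The current document is the default subject of every statement.
        self.document_url = document_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values; a missing rel yields [].
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if tag == "a" and "license" in rel_values and href:
            # Predicate comes from rel, object from href.
            self.triples.append((self.document_url, "license", href))


def license_triples(html, document_url):
    """Return the license triples found in an HTML string."""
    parser = LicenseLinkParser(document_url)
    parser.feed(html)
    parser.close()
    return parser.triples


notice = (
    'This text is licensed to the public under the <a rel="license" '
    'href="http://creativecommons.org/licenses/by/2.5/">Creative Commons '
    'Attribution 2.5 License</a>.'
)
for triple in license_triples(notice, "http://example.org/page.html"):
    print(triple)
```

The link stays an ordinary, clickable link for humans; the parser reads the same element for the machine-readable statement.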

For detailed background on the web metadata model and syntax being used, see RDFa.

This use follows the aforementioned principles:

  • On the web (obviously, so long as the HTML page in question is published to a website).
  • The metadata is visible -- it is colocated with an actionable link and may further be styled with CSS for human consumption.
  • It is general -- the model used can accommodate any sort of statement about anything that has a URI, and the license predicate can take any license that has a URI as its object, not only a Creative Commons license.

Now we will see how to use the model to further annotate works with licensing-relevant metadata.

Attribution

A licensor can specify (see 4(b) of http://creativecommons.org/licenses/by/2.5/legalcode):

  1. name of author or designated entity
  2. title of work
  3. URL associated with the work that refers to its copyright notice or licensing information
  4. identification of derivative use

Ideally all of these would be available as machine-readable metadata. I think the triples we want (assuming the work in question is the current document) are:

       1. <> cc:attributedTo "James Roberts" .
       2. <> dc:title "Compact Representation of Blank Pages" .
       3. <> cc:citeURL <http://example.org/crobp.html> .
       4. <> cc:derivativeDescription "translation of 'Paginas Blancos Si!'" .

Why the made up properties?

  1. dcterms:rightsHolder and dc:creator don't say which should be attributed -- or if someone else should be.
  2. Not made up, dc:title works just fine.
  3. Often this will be the current URL, but need some way to specify that it should be cited. Probably there is some existing property to use.
  4. dc:source refers to the source work, not how source work was reused.

Fortunately, RDFa allows us to annotate human-readable notices and to include both the custom properties needed for attribution and additional useful properties, e.g.

       <a rel="cc:citeURL" href="http://example.org/crobp.html"
       property="dc:title">Compact Representation of Blank Pages</a> by
       <a rel="dc:creator" href="http://example.org/jr.html"
       property="cc:attributedTo">James Roberts</a>, a <a
       rel="dc:source" href="http://example.org/bps.html"
       property="cc:derivativeDescription">translation of 'Paginas
       Blancos Si!'</a>, is licensed to the public under the <a
       rel="license"
       href="http://creativecommons.org/licenses/by/2.5/">Creative
       Commons Attribution 2.5 License</a>.

This produces the four triples above plus these:

       <> dc:creator <http://example.org/jr.html> .
       <> dc:source <http://example.org/bps.html> .
       <> license <http://creativecommons.org/licenses/by/2.5/> .
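To check that reading of the notice, here is a simplified Python sketch that walks the annotated links and emits one triple per rel/href pair and one per property/text pair. It is an approximation under stated assumptions, not an RDFa implementation: it only handles a elements, treats each rel value as a single token, does not resolve the prefixes, and the names NoticeParser and notice_triples are invented for this example.

```python
from html.parser import HTMLParser


class NoticeParser(HTMLParser):
    """Turn annotated <a> elements into triples about the current document."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._property = None  # property attribute of the currently open <a>
        self._text = []        # text collected inside the currently open <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel, href = attributes.get("rel"), attributes.get("href")
        if rel and href:
            # rel + href: the object is a resource (a URI).
            self.triples.append(("<>", rel, "<%s>" % href))
        self._property = attributes.get("property")
        self._text = []

    def handle_data(self, data):
        if self._property:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._property:
            # property + element text: the object is a literal.
            literal = " ".join("".join(self._text).split())
            self.triples.append(("<>", self._property, '"%s"' % literal))
            self._property = None


def notice_triples(html):
    parser = NoticeParser()
    parser.feed(html)
    parser.close()
    return parser.triples


NOTICE = """
<a rel="cc:citeURL" href="http://example.org/crobp.html"
property="dc:title">Compact Representation of Blank Pages</a> by
<a rel="dc:creator" href="http://example.org/jr.html"
property="cc:attributedTo">James Roberts</a>, a <a
rel="dc:source" href="http://example.org/bps.html"
property="cc:derivativeDescription">translation of 'Paginas
Blancos Si!'</a>, is licensed to the public under the <a
rel="license"
href="http://creativecommons.org/licenses/by/2.5/">Creative
Commons Attribution 2.5 License</a>.
"""

for subject, predicate, obj in notice_triples(NOTICE):
    print(subject, predicate, obj, ".")
```

Each link carries up to two statements: the rel/href pair points at a resource, while the property attribute reuses the visible link text as a literal, so the human-readable and machine-readable notices cannot drift apart.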

Embedded Objects

Objects embedded in web pages (e.g., images) should be annotated as above within their "host" web pages. Ideally all embedded objects should also include format-native notice and metadata as described below, but this is only crucial for objects that the publisher intends to distribute, or is concerned may be distributed, outside the context of the "host" web page.

Metadata about embedded objects that have their own URIs should be qualified with about, as rel="license" without an about attribute makes a statement about the current document (which is the "host" web page). Example:

       Photo licensed under <a about="http://example.com/some-image.png" rel="license" href="http://creativecommons.org/licenses/by/2.5/">cc by 2.5</a>.
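The effect of about on the subject can be sketched in a few lines of Python. This is only an illustration: licenses_by_subject and the host page URL http://example.com/host.html are assumptions made for the example, and relative about values are not resolved.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SubjectAwareLicenseParser(HTMLParser):
    """Group license URLs by the subject each rel="license" link describes."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.licenses = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "license" not in (attributes.get("rel") or "").split():
            return
        href = attributes.get("href")
        if href:
            # With about, the statement is about that URI;
            # without it, the statement is about the host page.
            subject = attributes.get("about") or self.page_url
            self.licenses[subject].append(href)


def licenses_by_subject(html, page_url):
    parser = SubjectAwareLicenseParser(page_url)
    parser.feed(html)
    parser.close()
    return dict(parser.licenses)


page = (
    '<p>Page licensed under <a rel="license" '
    'href="http://creativecommons.org/licenses/by-sa/2.5/">cc by-sa 2.5</a>. '
    'Photo licensed under <a about="http://example.com/some-image.png" '
    'rel="license" href="http://creativecommons.org/licenses/by/2.5/">'
    'cc by 2.5</a>.</p>'
)
for subject, urls in licenses_by_subject(page, "http://example.com/host.html").items():
    print(subject, urls)
```

In this made-up page the host document and the embedded image end up with different licenses, which is exactly the case the about attribute exists to express.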

Non-Web

Objects not native to the web should adhere to the following principles regarding license notice and metadata.

  • To the extent possible copyright notice should be visible to users of the licensed object.
  • The licensed object should contain a web reference to a web page that provides an equivalent licensing notice for the object in question. This makes a non-web object's licensing status as certain as a web page's -- if the web reference does not agree with the non-web object's license notice, ignore the latter.
  • To the extent possible license notice provided by non-web objects should use mechanisms and conventions in common use for the format of the non-web object in question.
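The precedence rule in the second principle can be written down as a small Python sketch. The names LicenseStatus and resolve_license are hypothetical, and the behaviour when the web notice cannot be retrieved (fall back to the embedded notice but mark it unverified) is an assumption of this example; the principles above do not say what to do in that case.

```python
from typing import NamedTuple, Optional


class LicenseStatus(NamedTuple):
    license_url: Optional[str]  # license to rely on, if any
    verified: bool              # True only when a web notice was available
    conflict: bool              # True when the embedded notice disagreed


def resolve_license(embedded_license, web_license):
    """Pick the license for a non-web object, preferring the web notice."""
    if web_license is None:
        # Assumption: no web notice found, so report the embedded notice
        # but flag it as unverified.
        return LicenseStatus(embedded_license, False, False)
    conflict = embedded_license is not None and embedded_license != web_license
    # The web notice wins; a disagreeing embedded notice is ignored.
    return LicenseStatus(web_license, True, conflict)


print(resolve_license(
    "http://creativecommons.org/licenses/by-sa/2.5/",
    "http://creativecommons.org/licenses/by/2.5/",
))
```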

We will now apply these principles to a variety of content formats.

Office Documents

"Office documents" here means word processor, presentation, and spreadsheet files, as well as document output formats such as PDF.

All of these formats are intended for human consumption and all modern versions support web links, so visibility is easy -- the license and web reference should be noted in the document text wherever common use dictates a copyright notice would appear. Looks like we just took care of the other principles as well.

Many document formats support some form of embedded metadata that can be accessed, e.g., by selecting File|Document Properties from a menu. Where such metadata includes licensing or copyright-relevant fields, these should be populated with first priority to the web reference and second priority to the URL of the license the document in question is under.

Ideally these metadata fields should be used to directly populate the human-visible parts of the document, e.g., via a document field, as is typically used to automatically place page numbers and document titles throughout such documents. This gives us the best of both worlds -- any software that indexes embedded metadata can easily find license information, and that information is perfectly reflected in notices humans see.
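The idea of driving the visible notice from the metadata fields can be sketched in Python, independent of any particular office format. The field names (title, author, license_url, web_reference) and the function render_notice are invented for this example; real document formats name and expose their fields differently.

```python
def render_notice(fields):
    """Build the human-visible notice from the same fields software would index."""
    return (
        "{title} by {author} is licensed under {license_url}. "
        "An equivalent notice is published at {web_reference}."
    ).format(**fields)


# Hypothetical embedded metadata for a document.
document_metadata = {
    "title": "Compact Representation of Blank Pages",
    "author": "James Roberts",
    "license_url": "http://creativecommons.org/licenses/by/2.5/",
    "web_reference": "http://example.org/crobp.html",
}
print(render_notice(document_metadata))
```

Because the notice is generated from the metadata rather than typed separately, changing a field updates what both humans and indexing software see.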

What follows is a very incomplete accounting of how these guidelines may be used in various popular office document formats.

Microsoft Office

Visible notice as above. Metadata TBD.

OpenOffice.org

Visible notice as above. Metadata TBD.

PDF

Visible notice as above. XMP metadata, which has advantages and disadvantages:

  • Constrained RDF -- very expressive, compatible.
  • Not visible in document, nor linkable to visible fields.
  • Very little exposure via File|Document Properties style interfaces -- only Photoshop CS file browser?
  • Very little support for embedding -- only Photoshop CS file browser and macro for LaTeX users?

Visible notices

Documents

Video

Interactive

Images

Audio

XML

Visible metadata

Invisible notices and metadata

Images

Audio

Metadata formats

HTML

XML

PDF (XMP)