Difference between revisions of "Marking Works Technical"

From Creative Commons
 
(55 intermediate revisions by 8 users not shown)

Latest revision as of 14:31, 25 September 2014


This article describes best practices for marking works licensed with a Creative Commons license or dedicated to the public domain with both human readable and machine readable notices, on the web and otherwise.

Although the importance of human-visible attribution and notice is stressed, the primary focus of this article is machine-readable metadata. Best practice for presentation of human-visible attribution and notice in various formats (apart from collocation with machine-readable equivalents) is currently beyond the scope of this article and will be presented elsewhere. An index of specific technical recommendations, organized by file type, is available.

Principles

  • Copyright notice should be published on the web. Offline or peer-distributed works should refer to notice published on the web. A web notice allows content to be found via web search, content owners to be found by viewers of untethered works, may be trusted to the extent a web page may be trusted, and allows for further annotation by the owner and others.
  • Copyright notice should be visible and unambiguous to humans and computers, especially the former. Human and computer visible notices should be one and the same.
  • Copyright notice best practices should be general rather than applicable only to works offered under one of the Creative Commons licenses.

Web

A web page is the preferred venue for published copyright notices. Notices published elsewhere should refer to an equivalent notice published on the web.

       This text is licensed to the public under the <a rel="license"
        href="https://creativecommons.org/licenses/by-sa/4.0/">Creative
        Commons Attribution-ShareAlike 4.0 License</a>.

That HTML snippet says that the current document is licensed under CC Attribution-ShareAlike 4.0. The current document is the default subject, rel="license" sets the predicate or verb, and the URL in the href value sets the object.
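
As a minimal sketch of how a consumer might read that statement, the following Python uses only the standard library's html.parser to collect the license triple. The class name LicenseExtractor and the page URL http://example.org/page.html are illustrative assumptions, and a real RDFa processor handles far more than this single case.

```python
from html.parser import HTMLParser


class LicenseExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples for rel="license" links.

    Hypothetical helper: it only understands the simple case where the
    current document is the subject.
    """

    def __init__(self, document_url):
        super().__init__()
        self.document_url = document_url  # the default subject
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()  # rel may hold several tokens
        if "license" in rels and attrs.get("href"):
            self.triples.append((self.document_url, "license", attrs["href"]))


html = (
    'This text is licensed to the public under <a rel="license" '
    'href="https://creativecommons.org/licenses/by-sa/4.0/">this license</a>.'
)
extractor = LicenseExtractor("http://example.org/page.html")  # assumed page URL
extractor.feed(html)
for triple in extractor.triples:
    print(triple)
```

The subject is supplied by the caller because nothing in the markup names it; that is what "the current document is the default subject" means in practice.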

For detailed background on the web metadata model and syntax being used, see RDFa.

This use follows the aforementioned principles:

  • On the web (obviously, so long as the HTML page in question is published to a website).
  • The metadata is visible -- it is collocated with an actionable link and may further be styled with CSS for human consumption.
  • It is general -- the model used can accommodate any sort of statement about anything that has a URI, and the license predicate can take any license that has a URI as its object, not only a Creative Commons license.

Now we will see how to use the model to further annotate works with licensing-relevant metadata.

Attribution

A licensor can specify how they wish to be attributed for use of their work (see, for example, § 3(a) of https://creativecommons.org/licenses/by/4.0/legalcode). This includes:

  1. The title of the work
  2. The name of the person, group or organization to attribute the work to
  3. The URL to link the attribution to

Ideally this information would also be available as machine-readable metadata. As an example, this could be encoded in the following triples (assuming the work in question is the current document):

  1. <> dc:title "Compact Representation of Blank Pages" .
  2. <> cc:attributionName "James Roberts" .
  3. <> cc:attributionURL <http://example.org/crobp.html> .

Why the new properties?

  1. dcterms:rightsHolder and dc:creator don't say which should be attributed -- or if someone else should be; cc:attributionName has the semantics necessary for the task.
  2. The attribution URL will often be the current URL, but we need some way to specify that it should be cited; cc:attributionURL provides this.

Fortunately RDFa allows us to annotate human readable notices and include both the custom properties needed for attribution as well as additional useful properties, e.g.

       <a rel="cc:attributionURL" href="http://example.org/crobp.html"
       property="dc:title">Compact Representation of Blank Pages</a> by
       <a rel="dc:creator" href="http://example.org/jr.html"
       property="cc:attributionName">James Roberts</a>, a <a
       rel="dc:source" href="http://example.org/bps.html">translation 
       of 'Paginas Blancos Si!'</a>, is licensed to the public under 
       the <a rel="license"
       href="https://creativecommons.org/licenses/by/4.0/">Creative
       Commons Attribution 4.0 License</a>.

This produces the triples above plus these:

  1. <> dc:creator <http://example.org/jr.html> .
  2. <> dc:source <http://example.org/bps.html> .
  3. <> license <https://creativecommons.org/licenses/by/4.0/> .
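
Going the other way, a publishing tool could generate such a notice from a few fields. The sketch below is one possible approach, not an official generator: the function name attribution_markup and its parameters are assumptions, and it uses a plain span for the attribution name because no creator URL is supplied.

```python
from html import escape


def attribution_markup(title, name, attribution_url, license_url, license_name):
    """Return an RDFa-annotated attribution notice (hypothetical helper).

    All values are HTML-escaped so that titles or URLs containing
    characters such as & or " cannot break the markup.
    """
    return (
        '<a rel="cc:attributionURL" href="{url}" '
        'property="dc:title">{title}</a> by '
        '<span property="cc:attributionName">{name}</span> '
        'is licensed to the public under the '
        '<a rel="license" href="{lic}">{lic_name}</a>.'
    ).format(
        url=escape(attribution_url, quote=True),
        title=escape(title),
        name=escape(name),
        lic=escape(license_url, quote=True),
        lic_name=escape(license_name),
    )


notice = attribution_markup(
    title="Compact Representation of Blank Pages",
    name="James Roberts",
    attribution_url="http://example.org/crobp.html",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    license_name="Creative Commons Attribution 4.0 License",
)
print(notice)
```

Because the visible text and the metadata come from the same values, the human and machine readable notices cannot drift apart.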

Included Objects

Objects included in web pages, such as images, should be annotated as above within their "host" web pages. Ideally all included objects should also embed object format native notice and metadata as described below, but this is only crucial for objects where the publisher intends for or is concerned about distribution outside the context of the "host" web page.

Metadata about included objects that have their own URIs should be qualified with about, as rel="license" without an about attribute makes a statement about the current document (which is the "host" web page). Example:

       Photo licensed under <a about="http://example.com/some-image.png"
        rel="license" href="https://creativecommons.org/licenses/by/4.0/">cc by 4.0</a>.

See the "beyond the current document" section of the RDFa Primer for additional examples.
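
The effect of about on the subject can be sketched in a few lines of Python. This is a simplified illustration, assuming well-nested markup: the class name SubjectAwareLicenses and the page URL are hypothetical, and the subject is taken from the nearest enclosing about attribute, falling back to the host page.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class SubjectAwareLicenses(HTMLParser):
    """Map each subject URI to its license URI (hypothetical helper)."""

    def __init__(self, document_url):
        super().__init__()
        self.subjects = [document_url]  # stack; the last entry is the current subject
        self.licenses = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An about attribute overrides the inherited subject.
        subject = attrs.get("about") or self.subjects[-1]
        if "license" in (attrs.get("rel") or "").split() and attrs.get("href"):
            self.licenses[subject] = attrs["href"]
        if tag not in VOID:
            self.subjects.append(subject)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.subjects) > 1:
            self.subjects.pop()


page = (
    '<p>Page under <a rel="license" '
    'href="https://creativecommons.org/licenses/by-sa/4.0/">BY-SA</a>.</p>'
    '<p>Photo licensed under <a about="http://example.com/some-image.png" '
    'rel="license" href="https://creativecommons.org/licenses/by/4.0/">'
    'cc by 4.0</a>.</p>'
)
parser = SubjectAwareLicenses("http://example.com/host.html")  # assumed host page
parser.feed(page)
for subject, license_url in parser.licenses.items():
    print(subject, "license", license_url)
```

The host page and the image end up with different licenses, which is exactly the distinction the about attribute exists to express.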

More Rights

A page may provide any sort of metadata for itself and its included components. The following example has some obviously useful statements:

       <span rel="dc:type" href="http://purl.org/dc/dcmitype/Sound">Audio</span> <a rel="license"
        href="https://creativecommons.org/licenses/by-nc/4.0/">(cc)</a> <span property="dc:date">2006</span>.
        See <a rel="cc:morePermissions" href="http://johnsmith.com/store">my store</a> to obtain permissions not
        granted by the CC license, signed CDs, and concert tickets.

In addition to the familiar license statement and the dc:type and dc:date properties from Dublin Core, we have cc:morePermissions. The intention of cc:morePermissions is to point to a URL at which one can discover permissions beyond those granted to the public by the licensor via a CC license.

See CCPlus for more information on specifying cc:morePermissions.

Interoperability with other web-based metadata formats

RDF/XML

RDF/XML is another RDF serialization that can be used to make any statement that can be made with RDFa. RDF/XML may be referenced with a <link> element in the <head> of a web page or included in HTML comments, as the now-deprecated CC recommendation advocated.

RDF/XML is perfectly interoperable with RDFa, but the latter is preferred as it is collocated with human-visible content.

Microformats

rel=license is both an elemental microformat (specifically rel license) and an RDFa statement. Compound Microformats should be interoperable via GRDDL or hGRDDL.

Use cases for licensing and related metadata as microformats may be found on the microformats wiki; they remain hypothetical as microformats, but generalize to use cases for Creative Commons web metadata.

Feeds

RSS 1.0, RSS 2.0, and Atom 1.0 syndication formats each may include license annotations. These should also be reflected on web pages referenced by feed items.

Non-Web

Objects not native to the web should adhere to the following principles regarding license notice and metadata.

  • To the extent possible copyright notice should be visible to users of the licensed object.
  • The licensed object should contain a web reference to a web page that provides an equivalent licensing notice for the object in question. This makes a non-web object's licensing status as certain as that of a web page -- if the web reference does not agree with the non-web object's license notice, ignore the latter.
  • To the extent possible license notice provided by non-web objects should use mechanisms and conventions in common use for the format of the non-web object in question.
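
The precedence rule in the second principle can be written down as a small sketch. The function name effective_license is hypothetical, and falling back to the embedded notice when no web statement can be found is an assumption of this sketch; the principles above only cover the case where the two disagree.

```python
def effective_license(embedded_license, web_license):
    """Return (license_url, confirmed_on_web).

    embedded_license: license URL read from the object itself, or None.
    web_license: license URL published at the object's web reference, or None.
    """
    if web_license is None:
        # Assumption: with no web statement available, the embedded notice
        # is all we have, but it is not confirmed.
        return embedded_license, False
    # The web statement wins whenever it exists, including on disagreement.
    return web_license, True


print(effective_license(
    "https://creativecommons.org/licenses/by/4.0/",
    "https://creativecommons.org/licenses/by-nc/4.0/",
))
```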

We will now apply these principles to a variety of content formats, after describing two building blocks for Non-Web metadata -- web statement metadata and XMP.

Web Statement

The web page that provides licensing notice equivalent to any included in the object itself may be only human readable. Ideally the web page will publish metadata that is explicitly associated with the object in question. To do this the object must be identified, ideally with a content-derived identifier, such that a client may verify that the web page metadata concerns precisely the object in question. An example of a content-derived identifier is a content hash.

Following is an example of RDFa describing a resource identified by a hash:

   <span about="urn:sha1:DATAG7ENBVHFNQPM4W626VDVK25RYECI">
    'Good Dream' is licensed under
    <a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY</a>.
    </span>

This produces the following triple:

       <urn:sha1:DATAG7ENBVHFNQPM4W626VDVK25RYECI> license <https://creativecommons.org/licenses/by/4.0/> .
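
A client can check that such a statement concerns the file it actually holds by recomputing the identifier. The sketch below assumes the identifier is the Base32 encoding of the raw SHA-1 digest, which matches the 32-character form of the example above; the function names sha1_urn and matches are illustrative.

```python
import base64
import hashlib


def sha1_urn(data: bytes) -> str:
    """Build a urn:sha1 identifier from file content.

    Assumption: the identifier is the Base32-encoded 20-byte SHA-1 digest,
    which is always 32 characters long and needs no padding.
    """
    digest = hashlib.sha1(data).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def matches(data: bytes, urn: str) -> bool:
    """True if the web statement's subject identifies exactly this content."""
    return sha1_urn(data) == urn.strip()


content = b"example file content"  # stand-in for the bytes of the licensed file
urn = sha1_urn(content)
print(urn)
print(matches(content, urn))
```

Any change to the file, however small, yields a different identifier, so a license statement cannot be silently attached to modified content.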

XMP

XMP has the broadest support of any embedded metadata format (perhaps it is the only such format with anything approaching broad support) across many different media formats. With the exception of media formats where a workable embedded metadata format is already ubiquitous, Creative Commons recommends adopting XMP as an embedded metadata standard and use of the following two fields in particular:

  • Web reference: value of xapRights:WebStatement
  • License: value of cc:license
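
To make the two fields concrete, the following sketch builds only the RDF core of an XMP packet with Python's standard library. It is an illustration under stated assumptions: the function name xmp_rights_rdf is hypothetical, both URLs are written as rdf:resource attributes (some tools may expect element text instead), and the surrounding packet wrapper that real XMP files carry is omitted.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xapRights": "http://ns.adobe.com/xap/1.0/rights/",
    "cc": "http://creativecommons.org/ns#",
}


def xmp_rights_rdf(web_statement, license_url):
    """Return an rdf:RDF fragment holding the web reference and the license."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)  # keep readable prefixes in the output
    rdf = ET.Element("{%s}RDF" % NS["rdf"])
    # rdf:about="" means "the file this metadata is embedded in".
    desc = ET.SubElement(
        rdf, "{%s}Description" % NS["rdf"], {"{%s}about" % NS["rdf"]: ""}
    )
    ET.SubElement(
        desc, "{%s}WebStatement" % NS["xapRights"],
        {"{%s}resource" % NS["rdf"]: web_statement},
    )
    ET.SubElement(
        desc, "{%s}license" % NS["cc"],
        {"{%s}resource" % NS["rdf"]: license_url},
    )
    return ET.tostring(rdf, encoding="unicode")


packet = xmp_rights_rdf(
    "http://example.org/good-dream.html",  # assumed web statement page
    "https://creativecommons.org/licenses/by/4.0/",
)
print(packet)
```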

Audio

Human visible metadata is not possible, and audio notice is not acceptable in most music contexts. With the exception of MP3 and OGG below, XMP embedded metadata is recommended.

MP3

MP3 is the primary exception to the XMP recommendation above, as ID3 is widely supported. The following two fields are recommended:

  • Web reference: value of WOAF ("official audio file" URL)
  • License: value of WCOP (copyright URL)

OGG

"Vorbis Comments" are widely supported for OGG files. The following two fields are recommended:


  • Web reference: value of CONTACT
  • License: value of LICENSE

Note that OGG is a container format that also supports video.
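
The audio recommendations amount to a small lookup table, sketched here in Python. The names EMBEDDED_FIELDS and license_fields are hypothetical; the field names come from the lists above, and any format other than MP3 or OGG falls back to the two XMP fields.

```python
# (web reference field, license field) for formats with their own metadata scheme.
EMBEDDED_FIELDS = {
    "mp3": ("WOAF", "WCOP"),        # ID3 URL frames
    "ogg": ("CONTACT", "LICENSE"),  # Vorbis Comments
}
XMP_FIELDS = ("xapRights:WebStatement", "cc:license")


def license_fields(file_format, web_reference, license_url):
    """Return the field/value pairs to embed for the given format."""
    ref_field, lic_field = EMBEDDED_FIELDS.get(file_format.lower(), XMP_FIELDS)
    return {ref_field: web_reference, lic_field: license_url}


for fmt in ("mp3", "ogg", "wav"):
    print(fmt, license_fields(
        fmt,
        "http://example.org/good-dream.html",  # assumed web statement page
        "https://creativecommons.org/licenses/by/4.0/",
    ))
```

Writing the values into an actual file still requires a tagging library for the format in question, which this sketch does not attempt.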

Still Images

Although possible, visible attribution and copyright notice generally does not fit in an image because of size or aesthetic limitations. Even where it is possible, human-visible notice will not be collocated with machine-readable metadata (for bitmap formats currently in use).

For embedded metadata XMP is recommended.

Video

Visible attribution and copyright notice is generally provided in video frames (e.g., credit roll) or overlays.

Preferably relevant regions of the video will be clickable or otherwise interactively linked with web pages.

There are many video formats, each with its own ill-supported embedded metadata specification (if any). Creative Commons recommends adopting XMP across video formats, with the possible exception of OGG (see audio above).

Document formats

"Document formats" here means word processor, presentation and spreadsheet document files, and their output formats such as PDF.

All of these formats are intended for human consumption and all modern versions support web links, so visibility is easy -- the license and web reference should be noted in the document text wherever common use dictates a copyright notice would appear. Doing so satisfies the other principles as well.

Many document formats support some form of embedded metadata that can be accessed, e.g., by selecting File|Document Properties from a menu. Where such metadata includes licensing or copyright-relevant fields, these should be populated with first priority to the web reference and second priority to the URL of the license the document in question is under.

Ideally these metadata fields should be used to directly populate the human-visible parts of the document, e.g., via a document field, as is typically used to automatically place page numbers and document titles throughout such documents. This gives us the best of both worlds -- any software that indexes embedded metadata can easily find license information, and that information is perfectly reflected in notices humans see.

Details of how metadata may be embedded in specific file formats are available in the index of technical recommendations organized by file type.