DiscoverEd Metadata


Latest revision as of 18:10, 9 June 2010

This is a basic guide to increasing the discoverability of online educational resources by preparing them for inclusion in search engines that use structured data, such as DiscoverEd. This guide contains technical language and sample XHTML and RDFa.

DiscoverEd is an experimental project from Creative Commons intended to explore how structured data may be used to enhance the search experience. Metadata about the resources, including the license and subject information available, are exposed in the search result set. We are particularly interested in open educational resources (OER) and are collaborating with other open education projects to improve search and discovery capabilities for OER, using DiscoverEd and other available tools. For in-depth details, read the white paper that describes the goals and design of DiscoverEd.

This page is a quick checklist for maximizing the discoverability of your resources in DiscoverEd and similarly designed search engines. Not all of these steps are necessary for inclusion in DiscoverEd. For example, structured data are not technically required for resources to appear in search results, but without them the search engine can show users very little information about your resources.

Resource Feed

DiscoverEd uses resource feeds to direct its resource crawl. In order to index your educational resources, DiscoverEd needs the URL of an RSS or Atom feed that is limited to your educational resources. Few sites consist entirely of educational materials; most also include "About" pages, links to staff profiles, and so on. An index of educational resources should be composed of only actual educational materials, thereby reducing or eliminating the clutter that typically accompanies web-scale queries.

DiscoverEd consumes the feeds for each site that has been listed for inclusion. Your feed essentially provides a URL "road map" of your resources, which can then be used to run a directed crawl of the resources you curate. In other words, the crawler knows where the relevant resources are located because you, the curator, have pointed at them directly using the feed.
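
The "road map" idea can be sketched in a few lines of Python. This is only an illustration of how a crawler might read resource URLs out of an Atom feed, not a description of how DiscoverEd itself is implemented; the function name `resource_urls` and the embedded sample feed are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical one-entry feed, modeled on the Atom example in this guide.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://oersite.example.org/cc/</id>
  <title>OER Aggregation Web Site</title>
  <updated>2008-01-16T12:00:00Z</updated>
  <entry>
    <id>tag:ocw.org,2007-10-15:/math/101</id>
    <updated>2007-10-15T12:00:00Z</updated>
    <link href="http://ocw.example.org/math/101" />
    <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
    <title>Math 101</title>
  </entry>
</feed>"""


def resource_urls(feed_xml):
    """Return the resource URL of every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    urls = []
    for entry in root.findall(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            # A link with no rel attribute defaults to rel="alternate",
            # which points at the resource itself (not, say, its license).
            if link.get("rel", "alternate") == "alternate":
                urls.append(link.get("href"))
    return urls


print(resource_urls(FEED))
```

The returned list is what a directed crawl would visit; links with other `rel` values, such as the license link, are skipped.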

Many curatorial sites already have feed functionality (RSS or Atom) or support the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The MIT OpenCourseWare site, for example, allows you to subscribe to a feed of its courses, which means that you can get an update every time a course is added, deleted, or changed. This type of feed also usually contains a list of the URLs for every course already on the site. Both feeds and OAI-PMH also provide a convenient method of polling, allowing the system to periodically check for new resources. Once a feed is set up, the DiscoverEd system can be kept up to date with minimal oversight.

Resource Metadata

Once you have located the URL of a feed that is limited to your educational resources, a good next step toward increasing their discoverability is to provide metadata about those resources. We recommend XHTML+RDFa for metadata encoding and transport.

As a curator, you have certain goals for the resources you curate. Generally, you want curated resources to be as easy to find as possible. Core to this goal is enabling machines to detect and interpret metadata about the resources, such as title, language, or licensing terms, in a way that is interoperable with as many detection and interpretation methods as possible. Interoperability here means not only that different programs can read particular metadata properties, but also that the vocabularies themselves, which are sets of related properties, can evolve and be extended. It is also important that potential extensions be backward compatible: existing tools should not be disrupted when new properties are added. If possible, existing tools should even be able to handle basic aspects of new properties. This is precisely the kind of "interoperability of meaning" that RDF is designed to support.

For this and other reasons, the ideal method for metadata encoding/transport is XHTML+RDFa. We believe this has the broadest possible exposure for current and future software agents. For more information as to why we recommend and require RDFa for metadata transport, see the CC REL W3C specification and our white paper. For technical information on XHTML and RDFa, see the W3C RDFa Primer.

This section outlines some of the RDFa metadata Creative Commons is collecting for the DiscoverEd project and gives some examples of using RDFa in XHTML documents. These metadata are extracted from the document at crawl time. While our metadata store may include additional metadata information from resources, these fields are exposed by default in the search results:

  • Title
  • Summary
  • License
  • Education level
  • Language
  • Subject

Title (DCT:title)
A brief descriptive title for the resource.

Summary (DCT:description)
A relatively short summary or synopsis of the resource.

License (DCT:license, cc:license, xhtml:license)
The stable URL of the work's license; e.g., http://creativecommons.org/licenses/by/3.0/. If you are using Creative Commons licenses, we also recommend following the CC REL specification for identifying further CC license metadata.

See the CC with syndication formats documentation for more information on including this in a bootstrap feed.

Education level (DCT:educationLevel)
What grade(s) or age-level(s) this material is suitable for. The education level should indicate all levels (student ages) for which the resource is deemed appropriate. Though we accept any descriptions that seem appropriate to you, please consider using one of the following schemas:

  • primary, secondary, tertiary, adult;
  • K,1,2,3,...,20 (where the number refers to the actual grade-level).

You may include equivalent terms as well by specifying more than one value for DCT:educationLevel. For example, you might include three separate DCT:educationLevel tags, one each for 9, 10, and secondary.

Language (xml:lang, DCT:language)
The language(s) of the referenced resource (not of your site). When specifying the language for a resource, the value should be specified as described by RFC 4646. For example, use en for English. To distinguish English (United States) from English (United Kingdom), the language would be specified as en-US and en-GB, respectively.
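
As a rough illustration, the two tag shapes mentioned above (`en` and `en-US`) can be checked and normalized with a short Python helper. This is a deliberately simplified sketch: RFC 4646 allows more subtags (scripts, variants, and so on) than the pattern below accepts, and the name `normalize_language_tag` is a hypothetical helper, not part of any DiscoverEd tool.

```python
import re

# Simplified pattern: a 2-3 letter language subtag, optionally followed by a
# 2-letter region subtag. Full RFC 4646 tags can carry more subtags than this.
SIMPLE_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{2})?$")


def normalize_language_tag(tag):
    """Return the tag in conventional casing, or None if it does not match."""
    tag = tag.strip()
    if not SIMPLE_TAG.match(tag):
        return None
    language, _, region = tag.partition("-")
    if region:
        # Convention: lowercase language, uppercase region (for example en-US).
        return language.lower() + "-" + region.upper()
    return language.lower()


for raw in ["en", "en-us", "EN-gb", "English"]:
    print(raw, "->", normalize_language_tag(raw))
```

A value such as "English" is rejected because it is a language name rather than a language tag.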

In an Atom 1.0 feed, the language is specified as the xml:lang attribute of the content element. Multiple languages in a single entry are not supported.

Subject (DCT:subject)
The subject(s) of the resource; e.g., mathematics. The subject refers to the actual content in the resource; i.e., what the resource is about. For many resources, more than one subject will be necessary; in these cases, simply specify multiple subject elements. Ideally you should try to limit the contents of the subject to only those subjects that are objectively reflective of the entire resource. Other types of categories (opinions, metrics, etc.) may have other vocabularies available which are more appropriate.

Note about RDFa Vocabularies

Notice that each metadata label is preceded by a prefix, such as dc (written DCT in the labels above) or xhtml. In the RDFa specification, these are indicators of which vocabulary defines the properties, or metadata terms. We recommend the Dublin Core vocabulary for the majority of properties because of its widespread adoption. For license, we recommend using the xhtml namespace because it’s built into the XHTML specification and is equivalent to other definitions of the property.

Examples

[X]HTML + RDFa

The following is an example of how a resource at http://ocw.example.org/math/101 could be annotated with machine-readable metadata, including license and attribution information. This is our preferred manner for encoding this information as it exposes the metadata to a much wider range of clients.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:cc="http://creativecommons.org/ns#">
  <head>
   <title>OER Site</title>
  </head>

  <body>
     <h1 property="dc:title">Math 101</h1>
     <h2>by <a href="http://example.org/~johnq" property="dc:creator cc:attributionName" rel="cc:attributionURL">John Q. Public</a></h2>
     <p property="dc:description">Basic mathematics for 5th graders</p>
     <p>Subjects: <span property="dc:subject">Math</span></p>
     <p>Grade level: <span property="dc:educationLevel">5</span></p>
     <p>Language: <span property="dc:language" content="en">English</span></p>
     <p>License: <a href="http://creativecommons.org/licenses/by/3.0/" rel="license">Attribution 3.0</a></p>

     <p>Lorem ipsum, etc, etc.</p>

  </body>
</html>

If a site aggregates resources such that the metadata appear on a page other than the actual resource, the about attribute can be used to indicate that the metadata are about a different resource. For example, the following page could be published at http://commons.oer.example.org/math/101 and still refer to the same resource as the previous example:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/">
  <head>
   <title>OER Site</title>
  </head>

  <body>
     <div about="http://ocw.example.org/math/101">
       <h1 property="dc:title">Math 101</h1>
       <h2>by <span property="dc:creator">John Q. Public</span></h2>
       <p property="dc:description">Basic mathematics for 5th graders</p>
       <p>Subjects: <span property="dc:subject">Math</span></p>
       <p>Grade level: <span property="dc:educationLevel">5</span></p>
       <p>Language: <span property="dc:language" content="en">English</span></p>
       <p>License: <a href="http://creativecommons.org/licenses/by/3.0/" rel="license">Attribution 3.0</a></p>
     </div>

     <p>Lorem ipsum, etc, etc.</p>

  </body>
</html>
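
To show what a crawler could recover from markup like this, here is a minimal Python sketch that collects (subject, property, value) triples from a trimmed copy of the page. It is not a conforming RDFa parser and is not DiscoverEd's extraction code: it only handles the `about`, `property`, `content`, `rel`, and `href` attributes used in these examples, and the names `MetadataCollector` and `extract_metadata` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Where the aggregating page is published (taken from the example above).
PAGE_URL = "http://commons.oer.example.org/math/101"

# Trimmed copy of the example page.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/">
  <body>
    <div about="http://ocw.example.org/math/101">
      <h1 property="dc:title">Math 101</h1>
      <p>Language: <span property="dc:language" content="en">English</span></p>
      <p>License: <a href="http://creativecommons.org/licenses/by/3.0/" rel="license">Attribution 3.0</a></p>
    </div>
    <p>Lorem ipsum, etc, etc.</p>
  </body>
</html>"""


class MetadataCollector(HTMLParser):
    """Collect (subject, property, value) triples from simple RDFa markup."""

    def __init__(self, page_url):
        super().__init__()
        self.subjects = [page_url]  # innermost subject is last
        self.open_tags = []         # per open element: did it set a new subject?
        self.pending = None         # (subject, properties, depth) awaiting text
        self.text = []
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        sets_subject = "about" in attrs
        if sets_subject:
            self.subjects.append(attrs["about"])
        self.open_tags.append(sets_subject)
        subject = self.subjects[-1]
        if "rel" in attrs and "href" in attrs:
            for rel in attrs["rel"].split():
                self.triples.append((subject, rel, attrs["href"]))
        if "property" in attrs:
            properties = attrs["property"].split()
            if "content" in attrs:
                # An explicit content attribute overrides the element text.
                for prop in properties:
                    self.triples.append((subject, prop, attrs["content"]))
            else:
                self.pending = (subject, properties, len(self.open_tags))
                self.text = []

    def handle_data(self, data):
        if self.pending is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        # Emit the literal once the element that declared the property closes.
        if self.pending is not None and len(self.open_tags) == self.pending[2]:
            subject, properties, _ = self.pending
            value = "".join(self.text).strip()
            for prop in properties:
                self.triples.append((subject, prop, value))
            self.pending = None
        if self.open_tags and self.open_tags.pop():
            self.subjects.pop()


def extract_metadata(page, page_url):
    collector = MetadataCollector(page_url)
    collector.feed(page)
    return collector.triples


for triple in extract_metadata(PAGE, PAGE_URL):
    print(triple)
```

Every triple is attributed to http://ocw.example.org/math/101 rather than to the aggregating page, which is the effect the about attribute is meant to have.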

Atom 1.0 Example

Here is a sample one-entry Atom 1.0 feed that implements the guidelines above. Note that inclusion of additional metadata in the feed is optional and considered inferior to inclusion with the resource using RDFa.

<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://oersite.example.org/cc/</id>
  <title>OER Aggregation Web Site</title>
  <updated>2008-01-16T12:00:00Z</updated>
  <link rel="self" href="http://oersite.example.org/cc/atom.xml" type="application/atom+xml" />
  <author>
    <name>John Q. Public</name>
    <email>webmaster@oersite.example.org</email>
  </author>
  <entry>
    <id>tag:ocw.example.org,2007-10-15:/math/101</id>
    <updated>2007-10-15T12:00:00Z</updated>
    <link href="http://ocw.example.org/math/101" />
    <title>Math 101</title>
    <summary>Basic mathematics for 5th graders</summary>
    <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
    <category term="dc:subject:Math" />
    <category term="dc:educationLevel:5" />
    <content type="xhtml" xml:lang="en"><div xmlns="http://www.w3.org/1999/xhtml">The content</div></content>
  </entry>
</feed>
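
The following Python sketch shows one way a consumer might pull the optional metadata back out of such an entry. It assumes the prefix:field:value form seen in the category terms above (for example dc:subject:Math); the function name `entry_metadata` and the output dictionary keys are illustrative choices, not part of DiscoverEd.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The entry from the sample feed, as a standalone document.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="http://ocw.example.org/math/101" />
  <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
  <category term="dc:subject:Math" />
  <category term="dc:educationLevel:5" />
  <content type="xhtml" xml:lang="en"><div xmlns="http://www.w3.org/1999/xhtml">The content</div></content>
</entry>"""


def entry_metadata(entry_xml):
    """Map each metadata field of an Atom entry to a list of values."""
    entry = ET.fromstring(entry_xml)
    metadata = {}
    for category in entry.findall(ATOM + "category"):
        # Split "prefix:field:value"; skip terms that do not have that shape.
        parts = category.get("term", "").split(":", 2)
        if len(parts) != 3:
            continue
        prefix, field, value = parts
        metadata.setdefault(prefix + ":" + field, []).append(value)
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == "license":
            metadata.setdefault("license", []).append(link.get("href"))
    content = entry.find(ATOM + "content")
    if content is not None and content.get(XML_LANG):
        metadata["language"] = [content.get(XML_LANG)]
    return metadata


print(entry_metadata(ENTRY))
```

Values are kept in lists so that repeated fields, such as several subjects or education levels, are preserved.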