Difference between revisions of "DiscoverEd Metadata"

From Creative Commons
(General)
Line 1: Line 1:
 
{{draft}}
 
{{draft}}
__TOC__
 
  
== General ==
+
== Overview ==
 
This document outlines the format in which ccLearn would like to receive syndication feeds for the data that will go into our OER database.
 
This document outlines the format in which ccLearn would like to receive syndication feeds for the data that will go into our OER database.
  
 
The data must be supplied in an [http://www.atomenabled.org/developers/syndication/ Atom] or [http://www.rssboard.org/rss-specification RSS] format.  Both of these standards are in widespread use on the Internet for content syndication.   
 
The data must be supplied in an [http://www.atomenabled.org/developers/syndication/ Atom] or [http://www.rssboard.org/rss-specification RSS] format.  Both of these standards are in widespread use on the Internet for content syndication.   
 
'''NOTE''': The sample Atom and RSS 2.0 feeds below mostly implement the minimum elements required by the respective specification plus the fields that ccLearn needs.  For our purposes, a feed must minimally contain most of the elements in the examples below, but may also contain any other valid elements.  Also, though '''we prefer an Atom feed''', there is no reason that another type of feed cannot be used, as long as it is able to include all of the data CC needs ''AND'' includes the data in such a way that the [http://feedparser.org Universal Feed Parser] can extract it in a normalized way.
 
  
 
Presently, ccLearn is looking for the following data:
 
Presently, ccLearn is looking for the following data:
Line 18: Line 15:
 
* Subject (cc:subject): The subject(s) of the resource; e.g., math.
 
* Subject (cc:subject): The subject(s) of the resource; e.g., math.
  
If you want to use more than one entry for any or all of the grade-level, language, and subject fields, simply comma-separate each annotation.
+
<div style="border: 1px dashed #06f; margin: 0.5em auto 1em; padding:1em; margin-left:2em" class="boilerplate plainlinks" id="stub">
 +
'''NOTE''': The sample Atom and RSS 2.0 feeds below mostly implement the minimum elements required by the respective specification plus the fields that ccLearn needs.  For our purposes, a feed must minimally contain most of the elements in the examples below, but may also contain any other valid elements.  Also, though '''we prefer an Atom feed''', there is no reason that another type of feed cannot be used, as long as it is able to include all of the data CC needs ''AND'' includes the data in such a way that the [http://feedparser.org Universal Feed Parser] can extract it in a normalized way.
 +
</div>
  
 
== CC-specific categories (tags/fields) ==
 
== CC-specific categories (tags/fields) ==
  
Some of these fields do not have native Atom or RSS element definitions.  For these fields we suggest that they be embedded as <category> elements.  In order for us to be able to recognize these within the feed, the <category> content should be of the format:
+
The CC Specific fields do not have native Atom or RSS element definitions.  For these fields we suggest that they be embedded as category or tag specifications (<code><category></code> in Atom) with a specific prefix.  These have the general format of:
  
<blockquote>cc:<field>:<data></blockquote>
+
<pre>cc:<field>:<data></pre>
  
For example, the <category> content for Language would become something like:
+
For example, the <code><category></code> content for Language would become something like:
  
 
<pre>cc:lang:es</pre>
 
<pre>cc:lang:es</pre>
Line 43: Line 42:
 
=== Specifying Subject ===
 
=== Specifying Subject ===
  
The subject refers to the actual content in the resource; i.e., what is this resource ''about''? For many resources, more than one subject will be necessary; in this case, specify multiple subject <category> elements.  We ask that you try to limit the number of elements to only those subjects that are objectively reflective of the entire resource. If you want to include other types of categories (opinions, metrics, etc), please add those as normal <catgory> elements instead.
+
The subject refers to the actual content in the resource; i.e., what is this resource ''about''? For many resources, more than one subject will be necessary; in this case, specify multiple subject <category> elements.  We ask that you try to limit the number of elements to only those subjects that are objectively reflective of the entire resource. If you want to include other types of categories (opinions, metrics, etc), please add those as normal (un-prefixed) <category> elements instead.
  
 
=== Specifying Grade level ===
 
=== Specifying Grade level ===
Line 49: Line 48:
 
The grade level should indicate all grade levels (student ages) for which the resource is deemed appropriate. Please consider using one of the following schemas:  
 
The grade level should indicate all grade levels (student ages) for which the resource is deemed appropriate. Please consider using one of the following schemas:  
  
# primary, secondary, tertiary, adult;  
+
* primary, secondary, tertiary, adult;  
# K,1,2,3,...,20 (where the number refers to the actual grade-level).  
+
* K,1,2,3,...,20 (where the number refers to the actual grade-level).  
  
 
You may include equivalent terms as well by specifying more than one <code>cc:gradelevel</code> <category>.  For example, you might include a <code>cc:gradelevel</code> for <code>9</code>, <code>10</code>, and <code>secondary</code>.
 
You may include equivalent terms as well by specifying more than one <code>cc:gradelevel</code> <category>.  For example, you might include a <code>cc:gradelevel</code> for <code>9</code>, <code>10</code>, and <code>secondary</code>.
Line 66: Line 65:
 
<pre><link rel="license" href="http://creativecommons.org/licenses/by/3.0/" /></pre>
 
<pre><link rel="license" href="http://creativecommons.org/licenses/by/3.0/" /></pre>
  
See the [[Atom|complete CC+Atom]] documentation for more information.
+
See the complete [[Syndication|CC with syndication formats]] documentation for more information.
  
 
== Examples ==
 
== Examples ==

Revision as of 18:56, 29 January 2008

Overview

This document outlines the format in which ccLearn would like to receive syndication feeds for the data that will go into our OER database.

The data must be supplied in an Atom or RSS format. Both of these standards are in widespread use on the Internet for content syndication.

Presently, ccLearn is looking for the following data:

  • Link: Full URL of the referenced resource.
  • Title: A brief descriptive title for the resource.
  • Summary: A relatively short summary/synopsis of the resource.
  • License: This should be a URL to the license; e.g., http://creativecommons.org/licenses/by/3.0/.
  • Grade level (cc:gradelevel): What grade(s) or age-level(s) this material is suitable for.
  • Language (cc:lang): The language(s) of the referenced resource (not of your site).
  • Subject (cc:subject): The subject(s) of the resource; e.g., math.

CC-specific categories (tags/fields)

The CC-specific fields do not have native Atom or RSS element definitions. We suggest embedding them as category or tag specifications (<category> in Atom) with a specific prefix. These have the general format:

cc:<field>:<data>

For example, the <category> content for Language would become something like:

cc:lang:es

Examples for Grade level could be:

cc:gradelevel:3
cc:gradelevel:primary

The Creative Commons-specific fields build upon existing category/tag support in feeds. Therefore any cc: field may be specified multiple times if needed. The fields we currently use for refining search results include:

  • Subject: cc:subject:
  • Grade level: cc:gradelevel:
  • Language: cc:lang:
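
As a rough illustration of the cc:<field>:<data> convention, the sketch below splits a category term into its field and data parts. The function name parse_cc_term and the KNOWN_FIELDS set are hypothetical names chosen for this example; they are not part of any ccLearn tool.

```python
from typing import Optional, Tuple

# The three fields listed above; this set is an assumption of the sketch.
KNOWN_FIELDS = {"subject", "gradelevel", "lang"}


def parse_cc_term(term: str) -> Optional[Tuple[str, str]]:
    """Return (field, data) for a cc:<field>:<data> term, else None."""
    # Split on the first two colons only, so the data part may contain colons.
    parts = term.strip().split(":", 2)
    if len(parts) != 3 or parts[0] != "cc":
        return None
    _, field, data = parts
    if field not in KNOWN_FIELDS or not data:
        return None
    return field, data


print(parse_cc_term("cc:lang:es"))
print(parse_cc_term("cc:gradelevel:primary"))
print(parse_cc_term("favorites"))  # an ordinary, un-prefixed category
```

Terms that do not follow the convention yield None here, so ordinary categories can sit alongside the cc: ones without being mistaken for them.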

Specifying Subject

The subject refers to the actual content in the resource; i.e., what is this resource about? For many resources, more than one subject will be necessary; in this case, specify multiple subject <category> elements. We ask that you limit these to subjects that objectively reflect the entire resource. If you want to include other types of categories (opinions, metrics, etc.), please add those as normal (un-prefixed) <category> elements instead.

Specifying Grade level

The grade level should indicate all grade levels (student ages) for which the resource is deemed appropriate. Please consider using one of the following schemas:

  • primary, secondary, tertiary, adult;
  • K,1,2,3,...,20 (where the number refers to the actual grade-level).

You may include equivalent terms as well by specifying more than one cc:gradelevel <category>. For example, you might include a cc:gradelevel for 9, 10, and secondary.
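
A feed producer might check grade-level values against the two suggested schemas before emitting them. The following is only a sketch: the helper names are hypothetical, and treating the check as case-insensitive is an assumption of this example rather than a stated ccLearn rule.

```python
from typing import Iterable, List

NAMED_LEVELS = {"primary", "secondary", "tertiary", "adult"}


def is_recognized_gradelevel(value: str) -> bool:
    """True if value fits either suggested schema (named level, K, or 1..20)."""
    v = value.strip()
    if v.lower() in NAMED_LEVELS or v.upper() == "K":
        return True
    return v.isascii() and v.isdigit() and 1 <= int(v) <= 20


def gradelevel_terms(levels: Iterable[str]) -> List[str]:
    """Build one cc:gradelevel term per level, rejecting unrecognized values."""
    terms = []
    for level in levels:
        if not is_recognized_gradelevel(level):
            raise ValueError("unrecognized grade level: %r" % level)
        terms.append("cc:gradelevel:" + level.strip())
    return terms


print(gradelevel_terms(["9", "10", "secondary"]))
```

Each returned term would go into its own <category> element, matching the 9, 10, and secondary example above.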

Specifying Language

When specifying the language for a resource, the value should be specified as the ISO 639-1 code; for example, en for English. When specifying a national dialect, the ISO 3166-1 alpha-2 country code should be appended. For example, to distinguish English (United States) from English (United Kingdom), the language would be specified as en-US and en-GB, respectively.

In general, we are expecting that most resources will consist of a single language, but if more than one language is present, provide a cc:lang <category> for each.
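
The shape of a language value (a two-letter language code, optionally followed by a hyphen and a two-letter country code) can be checked with a short pattern. This sketch validates the form only; it does not confirm that the codes actually appear in the ISO lists, and the name lang_term is hypothetical.

```python
import re

# Two lowercase letters, optionally followed by "-" and two uppercase letters.
LANG_TAG = re.compile(r"[a-z]{2}(?:-[A-Z]{2})?")


def lang_term(tag: str) -> str:
    """Return the cc:lang term for a well-formed tag, or raise ValueError."""
    if not LANG_TAG.fullmatch(tag):
        raise ValueError("expected a tag like 'en' or 'en-GB', got %r" % tag)
    return "cc:lang:" + tag


for tag in ("en", "en-US", "en-GB"):
    print(lang_term(tag))
```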

Embedding license data

Since the licensing of a resource is expected to be conveyed via URL, we can leverage the Atom <link> element. However, we must mark up the link element so as to identify it as a license URL. This is accomplished by adding the attribute rel="license" to the <link> element. For example:

<link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />

See the complete CC with syndication formats documentation for more information.
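
To show how a consumer could tell the license link apart from the resource link, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It relies on the Atom rule that a <link> without a rel attribute is treated as rel="alternate"; the function name links_by_rel and the sample entry are made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Sample entry for illustration only.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="http://ocw.org/math/101" />
  <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
</entry>"""


def links_by_rel(entry):
    """Map each rel value to the first href seen for it."""
    result = {}
    for link in entry.findall(ATOM_NS + "link"):
        rel = link.get("rel", "alternate")  # a missing rel means "alternate"
        result.setdefault(rel, link.get("href"))
    return result


links = links_by_rel(ET.fromstring(ENTRY_XML))
print("resource:", links["alternate"])
print("license: ", links["license"])
```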

Examples

Atom 1.0 example

Here is a sample one-entry Atom 1.0 feed that implements the guidelines above.


<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://oersite.org/cc/</id>
  <title>OER Aggregation Web Site</title>
  <updated>2008-01-16T12:00:00Z</updated>
  <link rel="self" href="http://oersite.org/cc/atom.xml" type="application/atom+xml" />
  <author>
    <name>John Q. Public</name>
    <email>webmaster@oersite.org</email>
  </author>
  <entry>
    <id>tag:ocw.org,2007-10-15:/math/101</id>
    <updated>2007-10-15T12:00:00Z</updated>
    <link href="http://ocw.org/math/101" />
    <title>Math 101</title>
    <summary>Basic mathematics for 5th graders</summary>
    <link rel="license" href="http://creativecommons.org/licenses/by/3.0/" />
    <category term="cc:subject:Math" />
    <category term="cc:gradelevel:5" />
    <category term="cc:lang:en" />
  </entry>
</feed>
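
One way to sanity-check a feed like this before submitting it is to read the cc: categories back out of each entry. The sketch below uses only the Python standard library and a trimmed copy of the sample entry, with one extra un-prefixed category added for contrast; cc_fields is a hypothetical helper, not the parser ccLearn runs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Trimmed sample; the "favorites" category is added here for illustration.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Math 101</title>
    <category term="cc:subject:Math" />
    <category term="cc:gradelevel:5" />
    <category term="cc:lang:en" />
    <category term="favorites" />
  </entry>
</feed>"""


def cc_fields(entry):
    """Group the cc:<field>:<data> category terms of one entry by field."""
    fields = defaultdict(list)
    for cat in entry.findall(ATOM_NS + "category"):
        parts = cat.get("term", "").split(":", 2)
        if len(parts) == 3 and parts[0] == "cc":
            fields[parts[1]].append(parts[2])
    return dict(fields)


feed = ET.fromstring(FEED_XML)
for entry in feed.findall(ATOM_NS + "entry"):
    print(entry.findtext(ATOM_NS + "title"), cc_fields(entry))
```

Because each field maps to a list, a resource with several subjects or grade levels simply yields several values under the same key.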

RSS 2.0 example

Here is a sample one-entry RSS 2.0 feed that implements the guidelines above.


<?xml version="1.0"?>
<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
>
  <channel>
    <title>OER Web Site</title>
    <link>http://oersite.org/</link>
    <atom:link rel="self" href="http://oersite.org/cc/rss20.xml" type="application/rss+xml" />
    <description>OER Aggregation Web Site</description>
    <lastBuildDate>Wed, 16 Jan 2008 15:00:00 -0800</lastBuildDate>
    <webMaster>webmaster@oersite.org (John Q. Public)</webMaster>
    <item>
      <guid isPermaLink="false">tag:ocw.org,2007-10-15:/math/101</guid>
      <pubDate>Mon, 12 Nov 2007 09:15:00 -0800</pubDate>
      <link>http://ocw.org/math/101</link>
      <title>Math 101</title>
      <description>Basic mathematics for 5th graders</description>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
      <category>cc:subject:Math</category>
      <category>cc:gradelevel:5</category>
      <category>cc:lang:en</category>
    </item>
  </channel>
</rss>
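
For the RSS variant, the same data can be read back with the standard library, the main differences being that the license lives in the namespaced creativeCommons:license element and the categories are element text rather than term attributes. This is a sketch under those assumptions, using a trimmed copy of the sample; item_record is a hypothetical name.

```python
import xml.etree.ElementTree as ET

CC_NS = "{http://backend.userland.com/creativeCommonsRssModule}"

# Trimmed copy of the sample feed, for illustration only.
RSS_XML = """<rss version="2.0"
  xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
  <channel>
    <title>OER Web Site</title>
    <link>http://oersite.org/</link>
    <description>OER Aggregation Web Site</description>
    <item>
      <link>http://ocw.org/math/101</link>
      <title>Math 101</title>
      <description>Basic mathematics for 5th graders</description>
      <creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>
      <category>cc:subject:Math</category>
      <category>cc:gradelevel:5</category>
      <category>cc:lang:en</category>
    </item>
  </channel>
</rss>"""


def item_record(item):
    """Collect the requested fields from one RSS <item>."""
    return {
        "link": item.findtext("link"),
        "title": item.findtext("title"),
        "summary": item.findtext("description"),
        "license": item.findtext(CC_NS + "license"),
        "categories": [c.text for c in item.findall("category")],
    }


root = ET.fromstring(RSS_XML)
records = [item_record(item) for item in root.iter("item")]
print(records[0])
```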