Difference between revisions of "DiscoverEd Data"

 
(14 intermediate revisions by 3 users not shown)
{{incomplete}}
 
 
[http://oesearch.creativecommons.org Open Education Search] is a project of [http://learn.creativecommons.org ccLearn].  You can find more information on the project at http://learn.creativecommons.org/projects/oesearch/.
 
 
This page documents ways in which developers may use the data gathered by the project for other purposes, including integration and customization.
 
 
 
== Data Gathered ==
 
  
The DiscoverEd project is a scalable search of educational resources with a special emphasis on Open Educational Resources (OER).  It uses a web-wide index and promotes results that have been identified as OER.  DiscoverEd.creativecommons.org serves as an aggregation point for organizations that have identified or produced OER.

The Open Education Search (OES) project is a web-scale search of Open Educational Resources (OER).  It uses a web-wide index and promotes results that have been identified as OER.  ccLearn serves as an aggregation point for organizations that have identified or produced OER.  At this time we gather:
 
 
* URLs or URL patterns
 
* Subject annotations, sometimes called labels
 
 
 
We are not currently attempting to aggregate rich metadata sets.
 
 
 
The data gathered is currently available in a format suitable for use in a [http://google.com/coop/cse/ Google Custom Search Engine]; ccLearn is committed to making it available in a "raw" format for further reuse.
 
 
 
== Integrating Open Education Search ==
 
 
 
Any site may embed Open Education Search by pointing to our CSE context definition.  For example, the following block of HTML produces a search box with the same semantics as Open Education Search.  For more information on customizing the results format and other options, see the [http://google.com/coop/docs/cse/cref.html Linked CSE documentation] from Google.
 
 
 
<code><pre>
<form id="cref" action="http://google.com/cse">
  <input type="hidden" name="cref"
    value="http://oercloud.creativecommons.org/api/posts_context"
    />
  <input type="text" name="q" size="40" />
  <input type="submit" name="sa" value="Search" />
</form>
<script type="text/javascript"
  src="http://google.com/coop/cse/brand?form=cref"></script>
</pre></code>
 
  
== Customizing OE Search ==

Another reuse scenario is to take the dataset and adjust the result weighting.  We use labels for subject annotations and for annotating the source of each URL.  For example, URLs received from [http://cnx.org Connexions] are labeled with <code>connexions</code>.  The weighting of a label can then be adjusted, for example:

<code><pre>
<Label name="ocw" mode="BOOST" weight="0.8"></Label>
</pre></code>

== Additional Resources and Information ==

Data are aggregated from several sources, including:

* RSS and Atom feeds (title, description and subject information)
* [http://www.openarchives.org/pmh/ OAI-PMH] repositories (OAI-DC metadata)
* Crawled pages (embedded [[RDFa]])

You can read more details about our [http://wiki.creativecommons.org/CcLearn_Search_Metadata metadata specifications].  The aggregated information, along with source annotations, is stored in a triple store.  This information is available as a SPARQL endpoint.

[[Category:Learn]]
[[Category:Developer]]
[[Category:DiscoverEd]]

Latest revision as of 21:30, 18 June 2010

Data Gathered

The DiscoverEd project is a scalable search of educational resources with a special emphasis on Open Educational Resources (OER). It uses a web-wide index and promotes results that have been identified as OER. DiscoverEd.creativecommons.org serves as an aggregation point for organizations that have identified or produced OER.

Data are aggregated from several sources, including:

  • RSS and Atom feeds (title, description and subject information)
  • OAI-PMH repositories (OAI-DC metadata)
  • Crawled pages (embedded RDFa)
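
As a rough illustration of the first source type, the sketch below pulls the title, description and subject information out of an Atom feed using only the Python standard library. It is not the project's actual aggregation code: the sample feed, the `extract_entries` helper, and the choice to read subjects from `<category term="...">` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace map for Atom 1.0 elements.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed, invented for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example OER feed</title>
  <entry>
    <title>Introduction to Algebra</title>
    <link href="http://example.org/courses/algebra"/>
    <summary>An open course on basic algebra.</summary>
    <category term="mathematics"/>
    <category term="algebra"/>
  </entry>
</feed>
"""


def extract_entries(feed_xml):
    """Return one dict per Atom entry with url, title, description, subjects."""
    root = ET.fromstring(feed_xml)
    records = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        records.append({
            "url": link.get("href") if link is not None else None,
            "title": entry.findtext(
                "atom:title", default="", namespaces=ATOM_NS).strip(),
            "description": entry.findtext(
                "atom:summary", default="", namespaces=ATOM_NS).strip(),
            # Assumption: subject information is carried in category terms.
            "subjects": [
                c.get("term")
                for c in entry.findall("atom:category", ATOM_NS)
                if c.get("term")
            ],
        })
    return records


if __name__ == "__main__":
    for record in extract_entries(SAMPLE_FEED):
        print(record)
```

An RSS feed or an OAI-DC record would need different element names and namespaces, but the same pattern of mapping each item to a URL plus a small set of fields would apply.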

You can read more details about our metadata specifications. The aggregated information, along with source annotations, is stored in a triple store. This information is available as a SPARQL endpoint.
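
A minimal sketch of how a client might prepare a request for such an endpoint follows. The page does not give the endpoint address or the vocabulary used in the triple store, so the `ENDPOINT` URL is a placeholder and the use of Dublin Core `dc:subject` is an assumption (suggested only by the OAI-DC metadata mentioned above). The helper names are hypothetical. The sketch only builds the query and the GET URL, using the standard `query` parameter of the SPARQL protocol, and sends nothing over the network.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the real address is not stated on this page.
ENDPOINT = "http://example.org/sparql"


def build_subject_query(subject, limit=10):
    """Build a SPARQL SELECT for resources carrying a given subject literal.

    Assumption: subjects are stored with the Dublin Core dc:subject predicate.
    """
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = subject.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT ?resource WHERE {\n"
        f'  ?resource dc:subject "{escaped}" .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_subject_query("algebra", limit=5)
    print(query)
    print(build_request_url(ENDPOINT, query))
```

The resulting URL could be fetched with any HTTP client once the real endpoint address and vocabulary are known.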