<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://wiki.creativecommons.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dithyramble</id>
		<title>Creative Commons - User contributions [en]</title>
		<link rel="self" type="application/atom+xml" href="https://wiki.creativecommons.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dithyramble"/>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/wiki/Special:Contributions/Dithyramble"/>
		<updated>2026-06-09T16:33:02Z</updated>
		<subtitle>User contributions</subtitle>
		<generator>MediaWiki 1.30.0</generator>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36256</id>
		<title>Field Query Mapping</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36256"/>
				<updated>2010-06-29T15:41:32Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* How to use */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Asheesh Laroia&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
The people who run a DiscoverEd may wish to let users search specific metadata easily. For example, http://discovered.creativecommons.org/search/ lets users search for works &amp;quot;tagged&amp;quot; with &amp;quot;banana&amp;quot; by searching for tag:banana. (In particular, the predicate for &amp;quot;tag&amp;quot; is the term &amp;quot;subject&amp;quot; as specified by the Dublin Core.)&lt;br /&gt;
&lt;br /&gt;
These prefixes, like tag:, are stored in the DiscoverEd code right now. This feature aims to move those into a configuration file.&lt;br /&gt;
&lt;br /&gt;
This feature was defined and developed during the [[DiscoverEd Sprint (June, 2010)|June 2010 DiscoverEd Sprint]].&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
When DiscoverEd crawls feeds and resources and saves metadata such as the page title, it converts this information into RDF triples; those triples are eventually saved on disk in a triple store, namely Jena.&lt;br /&gt;
&lt;br /&gt;
We will create a new configuration file that stores a list of mappings from predicate URIs to search prefixes. For example, we might list &amp;quot;method:&amp;quot; as a shorthand for the RDF predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt;, a.k.a. &amp;quot;dct:instructionalMethod&amp;quot;. At indexing time, a Lucene column called &amp;quot;method&amp;quot; will be created in the Lucene documents corresponding to each resource that has the dct:instructionalMethod predicate set in the Jena store.&lt;br /&gt;
&lt;br /&gt;
Then, at search time, Nutch's built-in query parser handles the query, e.g., &amp;quot;method:yaddayadda&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== How to use ==&lt;br /&gt;
&lt;br /&gt;
Let's say you want to allow users to perform this query:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
 method:&amp;quot;Experiential learning&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and retrieve all web pages in your index that have a metadatum with predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt; and value &amp;quot;Experiential learning&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
To do so, first edit &amp;lt;code&amp;gt;conf/nutch-site.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;query.basic.method.boost&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;1.0&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells Nutch to accept the &amp;quot;method:&amp;quot; prefix in search queries. The value of this property indicates the weight the search engine should assign to this term.&lt;br /&gt;
&lt;br /&gt;
Next, edit &amp;lt;code&amp;gt;conf/discovered-search-prefixes.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&amp;lt;pre&amp;gt;&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;http://purl.org/dc/terms/instructionalMethod&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;method&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells DiscoverEd to copy data out of the Jena store and paste it into a format where Nutch's basic query parser can find it.&lt;br /&gt;
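&lt;br /&gt;
As a rough illustration of that copy step, here is a minimal Python sketch. It is not DiscoverEd's actual code (DiscoverEd does this in its IndexFilter); the function name, the dictionary, and the example.org URIs are all hypothetical. It assumes triples arrive as (subject, predicate, object) tuples and that the mapping mirrors the property shown above.&lt;br /&gt;

```python
# Hypothetical sketch: the names below are illustrative, not DiscoverEd's API.

# Mirrors the example entry in conf/discovered-search-prefixes.xml:
# the predicate URI (property name) maps to the search prefix (property value).
PREFIX_FOR_PREDICATE = {
    "http://purl.org/dc/terms/instructionalMethod": "method",
}


def fields_for_resource(triples, resource_uri, prefix_for_predicate):
    """Collect index fields for one resource from (subject, predicate, object) triples."""
    fields = {}
    for subject, predicate, obj in triples:
        if subject != resource_uri:
            continue  # triple describes some other resource
        field_name = prefix_for_predicate.get(predicate)
        if field_name is None:
            continue  # predicate has no configured prefix, so it is skipped
        fields.setdefault(field_name, []).append(obj)
    return fields


# Example data (made up for illustration).
triples = [
    ("http://example.org/lesson",
     "http://purl.org/dc/terms/instructionalMethod",
     "Experiential learning"),
    ("http://example.org/lesson",
     "http://purl.org/dc/terms/title",
     "A lesson"),
]

print(fields_for_resource(triples, "http://example.org/lesson", PREFIX_FOR_PREDICATE))
```

Only the predicate listed in the mapping produces a field; the title triple is ignored because no prefix is configured for it.&lt;br /&gt;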
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
* Added a sample configuration file&lt;br /&gt;
* Added code to our IndexFilter that looks for relevant triples and stores them in the Lucene document&lt;br /&gt;
* Problem: The Lucene documents do not seem to show our column, so we're going back to the drawing board and carefully reading the [http://wiki.apache.org/nutch/HowToMakeCustomSearch relevant Nutch documentation] to make sure we're using the APIs correctly&lt;br /&gt;
&lt;br /&gt;
== Deferred until later ==&lt;br /&gt;
&lt;br /&gt;
* Handling provenance with regard to this. Based on the current plan for how to handle curator exclusion, and using the above example, we have to make sure that instead of adding merely the column &amp;quot;method&amp;quot;, we add something like &amp;quot;curator1:method&amp;quot;, &amp;quot;curator2:method&amp;quot;, and so on. (This may be out of date; see the spec for Excluding curators.)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36255</id>
		<title>Field Query Mapping</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36255"/>
				<updated>2010-06-29T15:40:25Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* How to use */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Asheesh Laroia&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
The people who run a DiscoverEd may wish to let users search specific metadata easily. For example, http://discovered.creativecommons.org/search/ lets users search for works &amp;quot;tagged&amp;quot; with &amp;quot;banana&amp;quot; by searching for tag:banana. (In particular, the predicate for &amp;quot;tag&amp;quot; is the term &amp;quot;subject&amp;quot; as specified by the Dublin Core.)&lt;br /&gt;
&lt;br /&gt;
These prefixes, like tag:, are stored in the DiscoverEd code right now. This feature aims to move those into a configuration file.&lt;br /&gt;
&lt;br /&gt;
This feature was defined and developed during the [[DiscoverEd Sprint (June, 2010)|June 2010 DiscoverEd Sprint]].&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
When DiscoverEd crawls feeds and resources and saves metadata such as the page title, it converts this information into RDF triples; those triples are eventually saved on disk in a triple store, namely Jena.&lt;br /&gt;
&lt;br /&gt;
We will create a new configuration file that stores a list of mappings from predicate URIs to search prefixes. For example, we might list &amp;quot;method:&amp;quot; as a shorthand for the RDF predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt;, a.k.a. &amp;quot;dct:instructionalMethod&amp;quot;. At indexing time, a Lucene column called &amp;quot;method&amp;quot; will be created in the Lucene documents corresponding to each resource that has the dct:instructionalMethod predicate set in the Jena store.&lt;br /&gt;
&lt;br /&gt;
Then, at search time, Nutch's built-in query parser handles the query, e.g., &amp;quot;method:yaddayadda&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== How to use ==&lt;br /&gt;
&lt;br /&gt;
Let's say you want to allow users to perform this query:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&lt;br /&gt;
 method:&amp;quot;Experiential learning&amp;quot;&lt;br /&gt;
&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and retrieve all web pages in your index that have a metadatum with predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt; and value &amp;quot;Experiential learning&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
To do so, first edit &amp;lt;code&amp;gt;conf/nutch-site.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;query.basic.method.boost&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;1.0&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells Nutch to accept the &amp;quot;method:&amp;quot; prefix in search queries. The value of this property indicates the weight the search engine should assign to this term.&lt;br /&gt;
&lt;br /&gt;
Next, edit &amp;lt;code&amp;gt;conf/discovered-search-prefixes.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;http://purl.org/dc/terms/instructionalMethod&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;method&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells DiscoverEd to copy data out of the Jena store and paste it into a format where Nutch's basic query parser can find it.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
* Added a sample configuration file&lt;br /&gt;
* Added code to our IndexFilter that looks for relevant triples and stores them in the Lucene document&lt;br /&gt;
* Problem: The Lucene documents do not seem to show our column, so we're going back to the drawing board and carefully reading the [http://wiki.apache.org/nutch/HowToMakeCustomSearch relevant Nutch documentation] to make sure we're using the APIs correctly&lt;br /&gt;
&lt;br /&gt;
== Deferred until later ==&lt;br /&gt;
&lt;br /&gt;
* Handling provenance with regard to this. Based on the current plan for how to handle curator exclusion, and using the above example, we have to make sure that instead of adding merely the column &amp;quot;method&amp;quot;, we add something like &amp;quot;curator1:method&amp;quot;, &amp;quot;curator2:method&amp;quot;, and so on. (This may be out of date; see the spec for Excluding curators.)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36252</id>
		<title>Field Query Mapping</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=36252"/>
				<updated>2010-06-29T15:26:40Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: Explain how to use field query mapping&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Asheesh Laroia&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
The people who run a DiscoverEd may wish to let users search specific metadata easily. For example, http://discovered.creativecommons.org/search/ lets users search for works &amp;quot;tagged&amp;quot; with &amp;quot;banana&amp;quot; by searching for tag:banana. (In particular, the predicate for &amp;quot;tag&amp;quot; is the term &amp;quot;subject&amp;quot; as specified by the Dublin Core.)&lt;br /&gt;
&lt;br /&gt;
These prefixes, like tag:, are stored in the DiscoverEd code right now. This feature aims to move those into a configuration file.&lt;br /&gt;
&lt;br /&gt;
This feature was defined and developed during the [[DiscoverEd Sprint (June, 2010)|June 2010 DiscoverEd Sprint]].&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
When DiscoverEd crawls feeds and resources and saves metadata such as the page title, it converts this information into RDF triples; those triples are eventually saved on disk in a triple store, namely Jena.&lt;br /&gt;
&lt;br /&gt;
We will create a new configuration file that stores a list of mappings from predicate URIs to search prefixes. For example, we might list &amp;quot;method:&amp;quot; as a shorthand for the RDF predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt;, a.k.a. &amp;quot;dct:instructionalMethod&amp;quot;. At indexing time, a Lucene column called &amp;quot;method&amp;quot; will be created in the Lucene documents corresponding to each resource that has the dct:instructionalMethod predicate set in the Jena store.&lt;br /&gt;
&lt;br /&gt;
Then, at search time, Nutch's built-in query parser handles the query, e.g., &amp;quot;method:yaddayadda&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== How to use ==&lt;br /&gt;
&lt;br /&gt;
Let's say you want to allow users to perform this query:&lt;br /&gt;
&lt;br /&gt;
 method:&amp;quot;Experiential learning&amp;quot;&lt;br /&gt;
&lt;br /&gt;
and retrieve all web pages in your index that have a metadatum with predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt; and value &amp;quot;Experiential learning&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
To do so, first edit &amp;lt;code&amp;gt;conf/nutch-site.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;query.basic.method.boost&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;1.0&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells Nutch to accept the &amp;quot;method:&amp;quot; prefix in search queries. The value of this property indicates the weight the search engine should assign to this term.&lt;br /&gt;
&lt;br /&gt;
Next, edit &amp;lt;code&amp;gt;conf/discovered-search-prefixes.xml&amp;lt;/code&amp;gt;. Add this XML inside the &amp;lt;configuration&amp;gt; block.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;property&amp;gt;&lt;br /&gt;
     &amp;lt;name&amp;gt;http://purl.org/dc/terms/instructionalMethod&amp;lt;/name&amp;gt;&lt;br /&gt;
     &amp;lt;value&amp;gt;method&amp;lt;/value&amp;gt;&lt;br /&gt;
 &amp;lt;/property&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This block of XML tells DiscoverEd to copy data out of the Jena store and paste it into a format where Nutch's basic query parser can find it.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
* Added a sample configuration file&lt;br /&gt;
* Added code to our IndexFilter that looks for relevant triples and stores them in the Lucene document&lt;br /&gt;
* Problem: The Lucene documents do not seem to show our column, so we're going back to the drawing board and carefully reading the [http://wiki.apache.org/nutch/HowToMakeCustomSearch relevant Nutch documentation] to make sure we're using the APIs correctly&lt;br /&gt;
&lt;br /&gt;
== Deferred until later ==&lt;br /&gt;
&lt;br /&gt;
* Handling provenance with regard to this. Based on the current plan for how to handle curator exclusion, and using the above example, we have to make sure that instead of adding merely the column &amp;quot;method&amp;quot;, we add something like &amp;quot;curator1:method&amp;quot;, &amp;quot;curator2:method&amp;quot;, and so on. (This may be out of date; see the spec for Excluding curators.)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36028</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36028"/>
				<updated>2010-06-23T04:26:03Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What needs to be done */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment like this is fairly common. You've asked a search engine what it knows about a particular query. When the engine returns its list of results, you see a result that could be categorized more usefully. You want to tell the search engine: bring up this result when a user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler that accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field) to the Lucene document associated with the URL we're crawling. The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above.&lt;br /&gt;
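&lt;br /&gt;
To make the per-user column naming concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the text says the column is named "something like" 18__dct_subject, so the exact format, the function names, and the example.org URLs below are hypothetical, and each user's triple store is simplified to a list of (result_uri, tag) pairs.&lt;br /&gt;

```python
# Hypothetical sketch: the naming scheme is assumed from the example
# column name 18__dct_subject and may not match the real implementation.


def user_field_name(user_id, vocabulary, term):
    """Build a per-user column name such as 18__dct_subject."""
    return "{}__{}_{}".format(user_id, vocabulary, term)


def tag_fields_for_url(stores, url):
    """Gather tags for one URL from every user's store.

    stores maps a user id to that user's list of (result_uri, tag) pairs,
    standing in for one Jena triple store per user.
    """
    fields = {}
    for user_id, pairs in sorted(stores.items()):
        tags = [tag for result_uri, tag in pairs if result_uri == url]
        if tags:
            fields[user_field_name(user_id, "dct", "subject")] = tags
    return fields


# Example data (made up for illustration).
stores = {
    18: [("http://example.org/page", "banana")],
    7: [("http://example.org/other", "apple")],
}

print(tag_fields_for_url(stores, "http://example.org/page"))
```

Keeping the user id in the column name means tags from different users land in separate columns, so the source of each tag stays visible in the index.&lt;br /&gt;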
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
Look in the branch &amp;lt;tt&amp;gt;add_tagging_form&amp;lt;/tt&amp;gt; (at time of writing, this pointed to [http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8]).&lt;br /&gt;
&lt;br /&gt;
* We can add a tag to the RdfStore and retrieve it, using the bean API for both adding and retrieving. (Nothing crazy-special.)&lt;br /&gt;
* The search results JSP has the add-a-tag form&lt;br /&gt;
&lt;br /&gt;
== What needs to be done ==&lt;br /&gt;
* Make &amp;lt;tt&amp;gt;org.creativecommons.learn.test.AddATag.testCheckThatResourceIsSearchableViaTag&amp;lt;/tt&amp;gt; pass.&lt;br /&gt;
* Write a test that the HTML form submits to a POST handler which adds a tag to the RdfStore. This code adds a tag: &amp;lt;tt&amp;gt;org.creativecommons.learn.Tag.add(taggerURI, resourceURI, tag);&amp;lt;/tt&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36027</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36027"/>
				<updated>2010-06-23T04:24:02Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What needs to be done */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment like this is fairly common. You've asked a search engine what it knows about a particular query. When the engine returns its list of results, you see a result that could be categorized more usefully. You want to tell the search engine: bring up this result when a user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler that accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field) to the Lucene document associated with the URL we're crawling. The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above.&lt;br /&gt;
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
Look in the branch &amp;lt;tt&amp;gt;add_tagging_form&amp;lt;/tt&amp;gt; (at time of writing, this pointed to [http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8]).&lt;br /&gt;
&lt;br /&gt;
* We can add a tag to the RdfStore and retrieve it, using the bean API for both adding and retrieving. (Nothing crazy-special.)&lt;br /&gt;
* The search results JSP has the add-a-tag form&lt;br /&gt;
&lt;br /&gt;
== What needs to be done ==&lt;br /&gt;
* Make &amp;lt;tt&amp;gt;org.creativecommons.learn.test.AddATag.testCheckThatResourceIsSearchableViaTag&amp;lt;/tt&amp;gt; pass.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36026</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36026"/>
				<updated>2010-06-23T04:23:23Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What works */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment like this is fairly common. You've asked a search engine what it knows about a particular query. When the engine returns its list of results, you see a result that could be categorized more usefully. You want to tell the search engine: bring up this result when a user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler that accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field) to the Lucene document associated with the URL we're crawling. The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above.&lt;br /&gt;
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
Look in the branch &amp;lt;tt&amp;gt;add_tagging_form&amp;lt;/tt&amp;gt; (at time of writing, this pointed to [http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8]).&lt;br /&gt;
&lt;br /&gt;
* We can add a tag to the RdfStore and retrieve it, using the bean API for both adding and retrieving. (Nothing crazy-special.)&lt;br /&gt;
* The search results JSP has the add-a-tag form&lt;br /&gt;
&lt;br /&gt;
== What needs to be done ==&lt;br /&gt;
* Make testCheckThatResourceIsSearchableViaTag pass.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36025</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36025"/>
				<updated>2010-06-23T04:22:51Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment like this is fairly common. You've asked a search engine what it knows about a particular query. When the engine returns its list of results, you see a result that could be categorized more usefully. You want to tell the search engine: bring up this result when a user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler that accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there will be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this code looks in all the Jena triple stores for any tags associated with that URL and inserts them into Lucene as well. In Lucene terms, it adds a new field (loosely, a new column) to the Lucene document associated with the URL being crawled. The field is named something like 18__dct_subject, where the number 18 identifies the user who submitted the tag via the brightly colored box mentioned above.&lt;br /&gt;
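&lt;br /&gt;
The crawl-time step can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual indexer: the stores dictionary keyed by a numeric user id, the helper names field_name and add_user_tags, and the example.org URLs are all hypothetical, and a plain dictionary stands in for a Lucene document.&lt;br /&gt;
&lt;br /&gt;
```python
DCT_SUBJECT = "http://purl.org/dc/terms/subject"

# Hypothetical stand-in for the per-user triple stores, keyed by a numeric user id.
stores = {
    18: {("http://example.org/water", DCT_SUBJECT, "sustainability")},
    23: {("http://example.org/water", DCT_SUBJECT, "ecology"),
         ("http://example.org/other", DCT_SUBJECT, "banana")},
}


def field_name(user_id):
    """Build a per-user field name in the style of 18__dct_subject."""
    return "{}__dct_subject".format(user_id)


def add_user_tags(document, url):
    """Copy every user-submitted tag for url into the document being indexed."""
    for user_id, triples in stores.items():
        tags = sorted(obj for subj, pred, obj in triples
                      if subj == url and pred == DCT_SUBJECT)
        if tags:
            # One field per submitting user, holding that user's tags for this URL.
            document[field_name(user_id)] = tags
    return document


doc = add_user_tags({"url": "http://example.org/water"}, "http://example.org/water")
```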
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
Look in the branch &amp;lt;pre&amp;gt;add_tagging_form&amp;lt;/pre&amp;gt; (at time of writing, this pointed to [http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8]).&lt;br /&gt;
&lt;br /&gt;
* We can add a tag to the RdfStore and retrieve it, using the bean API for both adding and retrieving. (Nothing crazy-special.)&lt;br /&gt;
* The search results JSP has the add-a-tag form.&lt;br /&gt;
&lt;br /&gt;
== What needs to be done ==&lt;br /&gt;
* Make testCheckThatResourceIsSearchableViaTag pass.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36024</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36024"/>
				<updated>2010-06-23T04:21:55Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What works */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query. When the engine returns with its listing of results, you see a particular result that could be categorized more usefully. You want to tell the search engine, bring up this result when the user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;br /&gt;
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
Look in the branch &amp;lt;pre&amp;gt;add_tagging_form&amp;lt;/pre&amp;gt; (at time of writing, this pointed to [http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8]).&lt;br /&gt;
&lt;br /&gt;
* We can add a tag to the RdfStore and retrieve it, using the bean API for both adding and retrieving. (Nothing crazy-special.)&lt;br /&gt;
* The search results JSP has the add-a-tag form.&lt;br /&gt;
&lt;br /&gt;
== What needs to be done ==&lt;br /&gt;
* Make testCheckThatResourceIsSearchableViaTag pass.&lt;br /&gt;
&lt;br /&gt;
== What remains to be done ==&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36023</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36023"/>
				<updated>2010-06-23T03:40:15Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query. When the engine returns with its listing of results, you see a particular result that could be categorized more usefully. You want to tell the search engine, bring up this result when the user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;br /&gt;
&lt;br /&gt;
== What works ==&lt;br /&gt;
&lt;br /&gt;
In the branch &amp;lt;pre&amp;gt;add_tagging_form&amp;lt;/pre&amp;gt; ([http://gitorious.org/discovered/repo/commit/a2af4aea3270e4a663abc2eb89c310e1ab5148c8 a2af4aea3270e4a663abc2eb89c310e1ab5148c8])&lt;br /&gt;
&lt;br /&gt;
* &amp;quot;Add a tag&amp;quot; &lt;br /&gt;
&lt;br /&gt;
== What remains to be done ==&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36022</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36022"/>
				<updated>2010-06-23T03:33:17Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Implementation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query. When the engine returns with its listing of results, you see a particular result that could be categorized more usefully. You want to tell the search engine, bring up this result when the user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36021</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36021"/>
				<updated>2010-06-23T03:32:10Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Implementation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query. When the engine returns with its listing of results, you see a particular result that could be categorized more usefully. You want to tell the search engine, bring up this result when the user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it here to mean &amp;quot;is tagged with&amp;quot;.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36020</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36020"/>
				<updated>2010-06-23T03:31:08Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* The story from the user's point of view */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query. When the engine returns with its listing of results, you see a particular result that could be categorized more usefully. You want to tell the search engine, bring up this result when the user searches for such-and-such a word.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it to pick out the concept, 'is tagged with'.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36019</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36019"/>
				<updated>2010-06-23T02:58:34Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Implementation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query; &amp;quot;sustainability water&amp;quot; for instance. When the engine returns with its listing of results, you see a particular result that could be categorized more effectively. You want to teach the engine a new fact about one of those ecological pages.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission creates (if necessary) a new Jena triple store whose URI represents the person who filled in the form. The handler then inserts a new RDF triple into this triple store:&lt;br /&gt;
&lt;br /&gt;
 result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
(Side note: The word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it to pick out the concept, 'is tagged with'.)&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. During the crawl, when we are inserting information about a particular URL into the Lucene database, this bit of code looks in all the Jena triple stores for any tags associated with that URL. It then inserts these tags into Lucene as well. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above. It then adds a new field to the Lucene document associated with the URL we're crawling.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36018</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36018"/>
				<updated>2010-06-23T02:54:27Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Requirements */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query; &amp;quot;sustainability water&amp;quot; for instance. When the engine returns with its listing of results, you see a particular result that could be categorized more effectively. You want to teach the engine a new fact about one of those ecological pages.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a tag with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored box where you can enter the tag. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission writes a new RDF quad to the Jena quad store, consisting of four strings:&lt;br /&gt;
&lt;br /&gt;
 submitter_uri, result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
Note that the word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it to pick out the concept, 'is tagged with'.&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. This code looks in the Jena quad store for any tags stored there. It then adds these tags to Lucene. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36017</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36017"/>
				<updated>2010-06-23T02:54:00Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Requirements */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment a bit like this is fairly common. You've asked a search engine to tell you what it knows about a particular query; &amp;quot;sustainability water&amp;quot; for instance. When the engine returns with its listing of results, you see a particular result that could be categorized more effectively. You want to teach the engine a new fact about one of those ecological pages.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a word with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored rectangular box where you can enter a new word. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler which accepts the user's submission writes a new RDF quad to the Jena quad store, consisting of four strings:&lt;br /&gt;
&lt;br /&gt;
 submitter_uri, result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
Note that the word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it to pick out the concept, 'is tagged with'.&lt;br /&gt;
&lt;br /&gt;
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. This code looks in the Jena quad store for any tags stored there. It then adds these tags to Lucene. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject. The number 18 signifies the user who submitted a tag via the brightly colored box mentioned above.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36016</id>
		<title>User Supplied Metadata</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=User_Supplied_Metadata&amp;diff=36016"/>
				<updated>2010-06-23T02:53:28Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: beginning of a write up of a spec&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Raphael Krut-Landau&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
== The story from the user's point of view ==&lt;br /&gt;
&lt;br /&gt;
A moment like this is fairly common. You've asked a search engine what it knows about a particular query, &amp;quot;sustainability water&amp;quot; for instance. When the engine returns its list of results, you see a particular result that could be categorized more effectively. You want to teach the engine a new fact about one of those ecological pages.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
In this feature, we allow you, as a user of DiscoverEd, to associate a new word with a search result that you see on your screen. Next to all search results there is a small link reading &amp;quot;Add a tag&amp;quot;; click this to open a brightly colored rectangular box where you can enter a new word. The box has a small &amp;quot;submit&amp;quot; link; click this and you immediately see the word alongside all the other tags that the engine associates with the result, if there were any.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
The brightly colored box mentioned above is an HTML form. The POST handler that accepts the user's submission writes a new quad to the Jena quad store, consisting of four strings:&lt;br /&gt;
&lt;br /&gt;
 submitter_uri, result_uri, dct:subject, tag&lt;br /&gt;
&lt;br /&gt;
Note that the word &amp;quot;subject&amp;quot; above might confuse you a bit if you are into RDF. In RDF, &amp;quot;subject&amp;quot; usually means the subject of a triple (subject, predicate, object). In the Dublin Core terms (DCT), subject means a ''topic''. We use it to pick out the concept, 'is tagged with'.&lt;br /&gt;
&lt;br /&gt;
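As a minimal sketch of the four-string layout above, the following Python snippet models one submitted tag. The names TagQuad and make_tag_quad and the example URIs are hypothetical and are not part of DiscoverEd; the real POST handler would write to Jena from Java.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import NamedTuple

# Dublin Core Terms namespace; dct:subject expands to DCT_NS + "subject".
DCT_NS = "http://purl.org/dc/terms/"


class TagQuad(NamedTuple):
    """Hypothetical container for the four strings stored per tag."""
    submitter_uri: str
    result_uri: str
    predicate: str
    tag: str


def make_tag_quad(submitter_uri, result_uri, tag):
    """Build the quad a POST handler might write for one submitted tag."""
    cleaned = tag.strip()
    if not cleaned:
        raise ValueError("tag must not be empty")
    return TagQuad(submitter_uri, result_uri, DCT_NS + "subject", cleaned)


# Example with made-up URIs for the submitter and the search result.
quad = make_tag_quad("http://example.org/users/18",
                     "http://example.org/some-page",
                     "banana")
```
&lt;br /&gt;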
We want to ensure that this new tag appears whenever anybody now or in the future chances upon the search result in question using this particular installation of the DiscoverEd search engine. Here's how the engine will do that. From time to time, a webmaster asks his copy of DiscoverEd to &amp;quot;crawl&amp;quot; &amp;amp;mdash; that is, to download copies of web pages from the internet and put their text, and other information about them, into the search engine's Lucene database. We want to make sure that the user-submitted tag is included among that information we store in Lucene.&lt;br /&gt;
&lt;br /&gt;
So there'll be a bit of code that runs whenever you ask DiscoverEd to perform a crawl. This code looks in the Jena quad store for any tags stored there. It then adds these tags to Lucene. In the parlance of Lucene, it adds a new column (or you could say, a new kind of field). The column is named something like 18__dct_subject, where 18 identifies the user who submitted the tag via the brightly colored box mentioned above.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=35995</id>
		<title>Field Query Mapping</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=35995"/>
				<updated>2010-06-22T15:05:42Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Asheesh Laroia&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
The people who run a DiscoverEd may wish to let users search specific metadata easily. For example, http://discovered.creativecommons.org/search/ lets users search for works &amp;quot;tagged&amp;quot; with &amp;quot;banana&amp;quot; by searching for tag:banana. (In particular, the predicate for &amp;quot;tag&amp;quot; is the term &amp;quot;subject&amp;quot; as specified by the Dublin Core.)&lt;br /&gt;
&lt;br /&gt;
These prefixes, like tag:, are stored in the DiscoverEd code right now. This feature aims to move those into a configuration file.&lt;br /&gt;
&lt;br /&gt;
This feature was defined and developed during the [[DiscoverEd Sprint (June, 2010)|June 2010 DiscoverEd Sprint]]&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
When DiscoverEd crawls feeds and resources and saves metadata such as the page title, it converts this information into RDF triples; those triples are eventually saved on disk in a triple store, namely Jena.&lt;br /&gt;
&lt;br /&gt;
We will create a new configuration file that stores a list of mappings from search prefixes to predicate URIs. For example, we might list &amp;quot;method:&amp;quot; as a shorthand for the RDF predicate &amp;lt;http://purl.org/dc/terms/instructionalMethod&amp;gt;, a.k.a. &amp;quot;dct:instructionalMethod&amp;quot;. At indexing time, a Lucene column called &amp;quot;method&amp;quot; will be created in the Lucene document corresponding to each resource that has the dct:instructionalMethod predicate set in the Jena store.&lt;br /&gt;
&lt;br /&gt;
Then, at search time, Nutch's built-in query parser handles the query, e.g., &amp;quot;method:yaddayadda&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
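The indexing step described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the DiscoverEd IndexFilter itself: FIELD_MAPPINGS and fields_for_resource are hypothetical names, the dictionary stands in for whatever format the configuration file ends up using, and the example resource URIs are made up.&lt;br /&gt;
&lt;br /&gt;
```python
# Hypothetical in-memory form of the mapping file: search prefix to predicate URI.
FIELD_MAPPINGS = {
    "method": "http://purl.org/dc/terms/instructionalMethod",
    "tag": "http://purl.org/dc/terms/subject",
}


def fields_for_resource(resource_uri, triples, mappings=FIELD_MAPPINGS):
    """Return a dict of column name to list of values for one resource.

    triples is an iterable of (subject, predicate, object) strings, as they
    might be read back from the triple store at indexing time.
    """
    # Invert the mapping so a predicate URI can be looked up directly.
    by_predicate = {uri: name for name, uri in mappings.items()}
    fields = {}
    for subject, predicate, obj in triples:
        if subject != resource_uri:
            continue
        name = by_predicate.get(predicate)
        if name is not None:
            # Predicates without a configured prefix are simply skipped.
            fields.setdefault(name, []).append(obj)
    return fields
```
&lt;br /&gt;
Each key of the returned dictionary corresponds to one Lucene column, so a query such as &amp;quot;method:lecture&amp;quot; would match a document whose &amp;quot;method&amp;quot; column contains that value.&lt;br /&gt;
&lt;br /&gt;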
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
* Added a sample configuration file&lt;br /&gt;
* Added code to our IndexFilter that looks for relevant triples and stores them in the Lucene document&lt;br /&gt;
* Problem: The Lucene documents do not seem to show our column, so we're going back to the drawing board and carefully reading the [http://wiki.apache.org/nutch/HowToMakeCustomSearch relevant Nutch documentation] to make sure we're using the APIs correctly&lt;br /&gt;
&lt;br /&gt;
== Deferred until later ==&lt;br /&gt;
&lt;br /&gt;
* Handling provenance with regard to this. Based on the current plan for how to handle curator exclusion, and using the above example, we have to make sure that instead of adding merely the column &amp;quot;method&amp;quot;, we add something like &amp;quot;curator1:method&amp;quot;, &amp;quot;curator2:method&amp;quot;, and so on. (This may be out of date; see the spec for Excluding curators.)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=35994</id>
		<title>Field Query Mapping</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Field_Query_Mapping&amp;diff=35994"/>
				<updated>2010-06-22T14:58:47Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: disambig &amp;quot;subject&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Asheesh Laroia&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
The people who run a DiscoverEd may wish to let users search specific metadata easily. For example, http://discovered.creativecommons.org/search/ lets users search for works &amp;quot;tagged&amp;quot; with &amp;quot;banana&amp;quot; by searching for tag:banana. (In particular, the predicate for &amp;quot;tag&amp;quot; is the term &amp;quot;subject&amp;quot; as specified by the Dublin Core.)&lt;br /&gt;
&lt;br /&gt;
These prefixes, like tag:, are stored in the DiscoverEd code right now. This feature aims to move those into a configuration file.&lt;br /&gt;
&lt;br /&gt;
This feature was defined and developed during the [[DiscoverEd Sprint (June, 2010)|June 2010 DiscoverEd Sprint]]&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
When DiscoverEd crawls feeds and resources and saves metadata such as the page title, it converts this information into RDF triples; those triples are eventually saved on disk in a triple store, Jena.&lt;br /&gt;
&lt;br /&gt;
We will create a new configuration file that stores a list of mappings from search prefixes to predicate URIs (such as stating that &amp;quot;method:&amp;quot; will be a shorthand for the RDF predicate http://purl.org/dc/terms/instructionalMethod, AKA dct:instructionalMethod). At indexing time, a Lucene column called &amp;quot;method&amp;quot; will be created in the Lucene document corresponding to each resource that has the dct:instructionalMethod predicate set.&lt;br /&gt;
&lt;br /&gt;
Then, at search time, Nutch's built-in query parser handles the query.&lt;br /&gt;
&lt;br /&gt;
== Implementation ==&lt;br /&gt;
&lt;br /&gt;
* Added a sample configuration file&lt;br /&gt;
* Added code to our IndexFilter that looks for relevant triples and stores them in the Lucene document&lt;br /&gt;
* Problem: The Lucene documents do not seem to show our column, so we're going back to the drawing board and carefully reading the [http://wiki.apache.org/nutch/HowToMakeCustomSearch relevant Nutch documentation] to make sure we're using the APIs correctly&lt;br /&gt;
&lt;br /&gt;
==Deferred until later==&lt;br /&gt;
&lt;br /&gt;
* Handling provenance with regard to this.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35900</id>
		<title>DiscoverEd/Meetings/2010/06/21</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35900"/>
				<updated>2010-06-21T18:19:16Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Asheesh, Nathan and Raffi were on this phone call.&lt;br /&gt;
&lt;br /&gt;
== Sprint Follow-up ==&lt;br /&gt;
* Outstanding tasks&lt;br /&gt;
** Asheesh writes up a team report for his team (-:&lt;br /&gt;
*** structured as a spec page&lt;br /&gt;
** Raffi writes up a team report for the tag-adding team&lt;br /&gt;
* Issues from sprint&lt;br /&gt;
** NY has been refactoring the RdfStore, and was sad to see half-finished refactorings in the codebase. Going forward, we should pay more attention to these refactorings. When we add a new helper method that simply calls an existing method, maybe we can simply replace the old method. That way we could avoid leaving both the old and new versions in the class.&lt;br /&gt;
* Tests&lt;br /&gt;
** running from Ant: Asheesh and Raffi will confirm that the tests in the branch &amp;quot;next&amp;quot; do pass. Nathan will push an ant target he wrote once his laptop is resuscitated. AL and RKL will then use ant to run the tests.&lt;br /&gt;
** source tree separation (src/tests/... instead of src/java/...)&lt;br /&gt;
*** This seems to be the pattern Nathan has observed in large Java projects.  It also allows you to easily create a &amp;quot;run-time&amp;quot; that excludes your testing code.&lt;br /&gt;
&lt;br /&gt;
What we were working on before the sprint: &lt;br /&gt;
== Excluding a curator from a search ==&lt;br /&gt;
* We had written a large test, and were in the process of breaking it up into smaller pieces which could be individually tested. - Raffi&lt;br /&gt;
* That test was called &amp;quot;MinusCurator&amp;quot;, and it sort of overwhelmingly failed. Asheesh began work with Tim on migrating the TripleStoreIndexer to use the new document.add(String, String) method from Nutch rather than LuceneWriter.add(Field, String). The latter is deprecated, and moreover seems to not quite work. Yesterday Asheesh began writing a few helper methods and tests in a branch to help complete this migration.&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
The current goal is to make sure TripleStoreIndexer works. It's pretty deeply broken if we know we can't write a single TripleStore-based value into Lucene. Raffi pointed out that the tag-addition team has a test for this which he thinks already passes.&lt;br /&gt;
&lt;br /&gt;
After that, we will work on landing Tim's and Asheesh's code from the sprint, namely the work on creating new Lucene columns that represent particular RDF predicates, controlled simply by a configuration file.&lt;br /&gt;
&lt;br /&gt;
[http://piratepad.net/5OhqF55lTk The history of this document lives here at PiratePad]&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35896</id>
		<title>DiscoverEd/Meetings/2010/06/21</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35896"/>
				<updated>2010-06-21T17:24:26Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Asheesh, Nathan and Raffi were on this phone call.&lt;br /&gt;
&lt;br /&gt;
== Sprint Follow-up ==&lt;br /&gt;
* Outstanding tasks&lt;br /&gt;
** Asheesh writes up a team report for his team (-:&lt;br /&gt;
*** structured as a spec page&lt;br /&gt;
** Raffi writes up a team report for the tag-adding team&lt;br /&gt;
* Issues from sprint&lt;br /&gt;
** NY has been refactoring the RdfStore, and was sad to see half-finished refactorings in the codebase. Going forward, we should pay more attention to these refactorings. When we add a new helper method that simply calls an existing method, maybe we can simply replace the old method. That way we could avoid leaving both the old and new versions in the class.&lt;br /&gt;
* Tests&lt;br /&gt;
** running from Ant: Asheesh and Raffi will confirm that the tests in the branch &amp;quot;next&amp;quot; do pass. Nathan will push an ant target he wrote once his laptop is resuscitated. AL and RKL will then use ant to run the tests.&lt;br /&gt;
** source tree separation (src/tests/... instead of src/java/...)&lt;br /&gt;
*** This seems to be the pattern Nathan has observed in large Java projects.  It also allows you to easily create a &amp;quot;run-time&amp;quot; that excludes your testing code.&lt;br /&gt;
&lt;br /&gt;
What we were working on before the sprint: &lt;br /&gt;
== Excluding a curator from a search ==&lt;br /&gt;
* We had written a large test, and were in the process of breaking it up into smaller pieces which could be individually tested. - Raffi&lt;br /&gt;
* That test was called &amp;quot;MinusCurator&amp;quot;, and it sort of overwhelmingly failed. Asheesh began work with Tim on migrating the TripleStoreIndexer to use the new document.add(String, String) method from Nutch rather than LuceneWriter.add(Field, String). The latter is deprecated, and moreover seems to not quite work. Yesterday Asheesh began writing a few helper methods and tests in a branch to help complete this migration.&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
The current goal is to make sure TripleStoreIndexer works. It's pretty deeply broken if we know we can't write a single TripleStore-based value into Lucene. Raffi pointed out that the tag-addition team has a test for this which he thinks already passes.&lt;br /&gt;
&lt;br /&gt;
After that, we will work on landing Tim's and Asheesh's code from the sprint, namely the work on creating new Lucene columns that represent particular RDF predicates, controlled simply by a configuration file.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35895</id>
		<title>DiscoverEd/Meetings/2010/06/21</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35895"/>
				<updated>2010-06-21T17:24:10Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* DiscoverEd Checkin 21 June 2010 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Asheesh, Nathan and Raffi were on this phone call.&lt;br /&gt;
&lt;br /&gt;
=== Sprint Follow-up ===&lt;br /&gt;
* Outstanding tasks&lt;br /&gt;
** Asheesh writes up a team report for his team (-:&lt;br /&gt;
*** structured as a spec page&lt;br /&gt;
** Raffi writes up a team report for the tag-adding team&lt;br /&gt;
* Issues from sprint&lt;br /&gt;
** NY has been refactoring the RdfStore, and was sad to see half-finished refactorings in the codebase. Going forward, we should pay more attention to these refactorings. When we add a new helper method that simply calls an existing method, maybe we can simply replace the old method. That way we could avoid leaving both the old and new versions in the class.&lt;br /&gt;
* Tests&lt;br /&gt;
** running from Ant: Asheesh and Raffi will confirm that the tests in the branch &amp;quot;next&amp;quot; do pass. Nathan will push an ant target he wrote once his laptop is resuscitated. AL and RKL will then use ant to run the tests.&lt;br /&gt;
** source tree separation (src/tests/... instead of src/java/...)&lt;br /&gt;
*** This seems to be the pattern Nathan has observed in large Java projects.  It also allows you to easily create a &amp;quot;run-time&amp;quot; that excludes your testing code.&lt;br /&gt;
&lt;br /&gt;
What we were working on before the sprint: &lt;br /&gt;
=== Excluding a curator from a search ===&lt;br /&gt;
* We had written a large test, and were in the process of breaking it up into smaller pieces which could be individually tested. - Raffi&lt;br /&gt;
* That test was called &amp;quot;MinusCurator&amp;quot;, and it sort of overwhelmingly failed. Asheesh began work with Tim on migrating the TripleStoreIndexer to use the new document.add(String, String) method from Nutch rather than LuceneWriter.add(Field, String). The latter is deprecated, and moreover seems to not quite work. Yesterday Asheesh began writing a few helper methods and tests in a branch to help complete this migration.&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
The current goal is to make sure TripleStoreIndexer works. It's pretty deeply broken if we know we can't write a single TripleStore-based value into Lucene. Raffi pointed out that the tag-addition team has a test for this which he thinks already passes.&lt;br /&gt;
&lt;br /&gt;
After that, we will work on landing Tim's and Asheesh's code from the sprint, namely the work on creating new Lucene columns that represent particular RDF predicates, controlled simply by a configuration file.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35894</id>
		<title>DiscoverEd/Meetings/2010/06/21</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Meetings/2010/06/21&amp;diff=35894"/>
				<updated>2010-06-21T17:23:30Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: add notes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== DiscoverEd Checkin 21 June 2010 ==&lt;br /&gt;
&lt;br /&gt;
=== Sprint Follow-up ===&lt;br /&gt;
* Outstanding tasks&lt;br /&gt;
** Asheesh writes up a team report for his team (-:&lt;br /&gt;
*** structured as a spec page&lt;br /&gt;
** Raffi writes up a team report for the tag-adding team&lt;br /&gt;
* Issues from sprint&lt;br /&gt;
** NY has been refactoring the RdfStore, and was sad to see half-finished refactorings in the codebase. Going forward, we should pay more attention to these refactorings. When we add a new helper method that simply calls an existing method, maybe we can simply replace the old method. That way we could avoid leaving both the old and new versions in the class.&lt;br /&gt;
* Tests&lt;br /&gt;
** running from Ant: Asheesh and Raffi will confirm that the tests in the branch &amp;quot;next&amp;quot; do pass. Nathan will push an ant target he wrote once his laptop is resuscitated. AL and RKL will then use ant to run the tests.&lt;br /&gt;
** source tree separation (src/tests/... instead of src/java/...)&lt;br /&gt;
*** This seems to be the pattern Nathan has observed in large Java projects.  It also allows you to easily create a &amp;quot;run-time&amp;quot; that excludes your testing code.&lt;br /&gt;
&lt;br /&gt;
What we were working on before the sprint: &lt;br /&gt;
=== Excluding a curator from a search ===&lt;br /&gt;
* We had written a large test, and were in the process of breaking it up into smaller pieces which could be individually tested. - Raffi&lt;br /&gt;
* That test was called &amp;quot;MinusCurator&amp;quot;, and it sort of overwhelmingly failed. Asheesh began work with Tim on migrating the TripleStoreIndexer to use the new document.add(String, String) method from Nutch rather than LuceneWriter.add(Field, String). The latter is deprecated, and moreover seems to not quite work. Yesterday Asheesh began writing a few helper methods and tests in a branch to help complete this migration.&lt;br /&gt;
&lt;br /&gt;
== Next steps ==&lt;br /&gt;
&lt;br /&gt;
The current goal is to make sure TripleStoreIndexer works. It's pretty deeply broken if we know we can't write a single TripleStore-based value into Lucene. Raffi pointed out that the tag-addition team has a test for this which he thinks already passes.&lt;br /&gt;
&lt;br /&gt;
After that, we will work on landing Tim's and Asheesh's code from the sprint, namely the work on creating new Lucene columns that represent particular RDF predicates, controlled simply by a configuration file.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35887</id>
		<title>DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35887"/>
				<updated>2010-06-21T17:11:00Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Development */ link to unmade meetings page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DiscoverEd is a search prototype developed by Creative Commons to explore metadata enhanced search, specifically for OER.  DiscoverEd combines full text search with [[DiscoverEd Data|metadata about the resources]].  DiscoverEd is built on [http://lucene.apache.org/nutch/ Nutch].&lt;br /&gt;
&lt;br /&gt;
== General documentation ==&lt;br /&gt;
*[[DiscoverEd FAQ|FAQ]]&lt;br /&gt;
*[[DiscoverEd Glossary|Glossary]]&lt;br /&gt;
** Glossary of DiscoverEd-related terms.&lt;br /&gt;
*[[DiscoverEd Metadata|Metadata]]&lt;br /&gt;
** Basic guide on metadata markup for DiscoverEd.&lt;br /&gt;
*[[DiscoverEd Quickstart|Quickstart]]&lt;br /&gt;
*[[Running DiscoverEd]]&lt;br /&gt;
*[[DiscoverEd Data|Data]]&lt;br /&gt;
**This page documents ways in which developers may use the data gathered by the project for other purposes.&lt;br /&gt;
&lt;br /&gt;
== Development ==&lt;br /&gt;
* Source repository  ([http://gitorious.org/discovered gitorious])&lt;br /&gt;
* Project planning ([https://www.pivotaltracker.com/projects/77041 Pivotal Tracker])&lt;br /&gt;
* [[DiscoverEd/Development notes|Development notes]]&lt;br /&gt;
* [[Hacking DiscoverEd]]&lt;br /&gt;
*[[:Category:DiscoverEd_Specification|DiscoverEd dev spec pages]]&lt;br /&gt;
* [[/Meetings]]&lt;br /&gt;
&lt;br /&gt;
== Additional Information ==&lt;br /&gt;
*[[Related Efforts]]&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35796</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35796"/>
				<updated>2010-06-17T14:52:45Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Check out and build the source code */ remove dollar signs from some of the code&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;How to deploy a hackable DiscoverEd, make changes, and update your deployment&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
= Check out and build the source code =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
cd discovered&lt;br /&gt;
git checkout (whatever branch we're working on today)&lt;br /&gt;
ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Add a curator and a feed =&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
By default DiscoverEd uses MySQL and looks for a database called discovered.  '''Configure your database settings by editing &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;.'''&lt;br /&gt;
&lt;br /&gt;
Make sure the database exists and then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/english/@@rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See [[DiscoverEd Feeds]] for information on supported feed types.&lt;br /&gt;
&lt;br /&gt;
More information on &amp;lt;code&amp;gt;./bin/feeds&amp;lt;/code&amp;gt; commands at [[Running DiscoverEd]]  (some information will be discovered.cc specific)&lt;br /&gt;
&lt;br /&gt;
= Aggregate and crawl resources =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Run the web application =&lt;br /&gt;
&lt;br /&gt;
'''Edit conf/nutch-site.xml to point to your crawl location.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ant war&lt;br /&gt;
$ cp build/nutch-1.1.war [substitute the location for your J2EE container here; e.g., /var/lib/tomcat6/webapps ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Hacking The Code  =&lt;br /&gt;
&lt;br /&gt;
* Run Eclipse&lt;br /&gt;
* Do File -&amp;gt; Import...&lt;br /&gt;
** When it suggests &amp;quot;Existing projects into workspace,&amp;quot; choose &amp;quot;General -&amp;gt; File System&amp;quot; instead&lt;br /&gt;
** Select the location of your source tree&lt;br /&gt;
** Click Finish&lt;br /&gt;
&lt;br /&gt;
(There are three options: 1. &amp;quot;Existing projects into workspace&amp;quot;; 2. &amp;quot;Create from existing source&amp;quot;; 3. &amp;quot;File System&amp;quot;. Some of these trigger an error regarding Nutch MP3 code.)&lt;br /&gt;
&lt;br /&gt;
The DiscoverEd source code lives in two locations:&lt;br /&gt;
&lt;br /&gt;
* ded/src/java contains DiscoverEd specific code, primarily related to interfacing with the RDF store.&lt;br /&gt;
* src/plugins/cclearn contains the DiscoverEd Nutch plugin, which provides some filtering features to Nutch and ensures metadata indexed in the RDF store is injected into the Lucene index&lt;br /&gt;
&lt;br /&gt;
Generally, the plugin may depend upon code in the ded/src/java tree, but classes in the plugin may not be available to that code.&lt;br /&gt;
&lt;br /&gt;
= Committing Changes and Merging to the Main Repository =&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Troubleshooting =&lt;br /&gt;
&lt;br /&gt;
==== I get a big long Java backtrace talking about Jena and MySQL the first time I run the code ====&lt;br /&gt;
&lt;br /&gt;
This means that you need to CREATE DATABASE discovered in MySQL. DiscoverEd stores its data in MySQL by default, and you need to either (a) create that database, or (b) choose a different configuration file.&lt;br /&gt;
&lt;br /&gt;
==== Database permissions ====&lt;br /&gt;
&lt;br /&gt;
You might need to change the MySQL credentials or database configuration value in &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;. DiscoverEd does not require that you use the root user; it does require that the database already exist.&lt;br /&gt;
&lt;br /&gt;
==== JAVA_HOME on a Mac ====&lt;br /&gt;
&lt;br /&gt;
Mac users setting JAVA_HOME should use&lt;br /&gt;
/usr/libexec/java_home to determine the current JAVA_HOME&lt;br /&gt;
&lt;br /&gt;
If you're really lazy, add&lt;br /&gt;
export JAVA_HOME=`/usr/libexec/java_home` &lt;br /&gt;
to .bash_profile and it will set JAVA_HOME each time you invoke a shell. (This is a good idea!)&lt;br /&gt;
&lt;br /&gt;
==== Error message: &amp;quot;Feature 'http://apache.org/xml/features/xinclude' is not recognized.&amp;quot; ====&lt;br /&gt;
&amp;quot;You probably have an older version of Xerces somewhere in your classpath or something is overriding the default parser configuration with one that doesn't support XInclude.&amp;quot; (http://marc.info/?l=xerces-j-user&amp;amp;m=117066278506146&amp;amp;w=2)&lt;br /&gt;
&lt;br /&gt;
==== AccessControlException ====&lt;br /&gt;
&lt;br /&gt;
When starting Tomcat, if you get a traceback like this in your tomcat log (e.g., in /var/lib/tomcat6/logs/localhost-$date.log):&lt;br /&gt;
&lt;br /&gt;
 SEVERE: Exception sending context initialized event to listener instance of class org.apache.nutch.searcher.NutchBean$NutchBeanConstructor&lt;br /&gt;
 java.lang.RuntimeException: java.security.AccessControlException: access denied (java.lang.reflect.ReflectPermission suppressAccessChecks)&lt;br /&gt;
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1377)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)&lt;br /&gt;
&lt;br /&gt;
and so on, try changing the Tomcat policy in /etc/tomcat6/policy.d/04webapps.policy. Add these lines in the grant {} block:&lt;br /&gt;
&lt;br /&gt;
    // Attempt to get Nutch working&lt;br /&gt;
    // Courtesy of Alex McLintock at http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200907.mbox/&amp;lt;d398ec7f0907041237j6acffe0fm10b7cd374a77795b@mail.gmail.com&amp;gt;&lt;br /&gt;
    permission java.security.AllPermission;&lt;br /&gt;
&lt;br /&gt;
This is obviously inappropriate for any site running a public instance of DiscoverEd. But it might be useful for your local dev environment. If you know how to specify a class level permission, please update this document.&lt;br /&gt;
&lt;br /&gt;
==== Missing build/plugins ====&lt;br /&gt;
&lt;br /&gt;
Be sure to run ant in the root repo directory.&lt;br /&gt;
&lt;br /&gt;
==== Missing parse-mp3 plugin ====&lt;br /&gt;
&lt;br /&gt;
Remove that source folder from the build path (in Eclipse, Project &amp;gt; Properties &amp;gt; Java Build Path &amp;gt; Source).&lt;br /&gt;
&lt;br /&gt;
==== Eclipse complains: Wrong version number in .class file ====&lt;br /&gt;
&lt;br /&gt;
Use Java 1.6 as your compiler. Be sure to use the right JVM for this project.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35560</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35560"/>
				<updated>2010-06-16T14:20:27Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Troubleshooting */ +two more troubleshooting sections&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;How to deploy a hackable DiscoverEd, make changes, and update your deployment&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
= Check out and build the source code =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ git checkout (whatever branch we're working on today)&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Add a curator and a feed =&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
By default, DiscoverEd uses MySQL and looks for a database called &amp;lt;code&amp;gt;discovered&amp;lt;/code&amp;gt;. '''Configure your database settings by editing &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;.'''&lt;br /&gt;
&lt;br /&gt;
Make sure the database exists and then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/english/@@rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See [[DiscoverEd Feeds]] for information on supported feed types.&lt;br /&gt;
&lt;br /&gt;
For more information on &amp;lt;code&amp;gt;./bin/feeds&amp;lt;/code&amp;gt; commands, see [[Running DiscoverEd]] (some of the information there is specific to discovered.cc).&lt;br /&gt;
&lt;br /&gt;
= Aggregate and crawl resources =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Run the web application =&lt;br /&gt;
&lt;br /&gt;
'''Edit conf/nutch-site.xml to point to your crawl location.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ant war&lt;br /&gt;
$ cp build/nutch-1.1.war [substitute the location for your J2EE container here; e.g., /var/lib/tomcat6/webapps ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Hacking The Code  =&lt;br /&gt;
&lt;br /&gt;
* Run Eclipse&lt;br /&gt;
* Do File -&amp;gt; Import...&lt;br /&gt;
** When it offers &amp;quot;Existing projects into workspace,&amp;quot; choose &amp;quot;General -&amp;gt; File System&amp;quot; instead&lt;br /&gt;
** Select the location of your source tree&lt;br /&gt;
** Click Finish&lt;br /&gt;
&lt;br /&gt;
(There are three options: 1. &amp;quot;Existing projects into workspace&amp;quot;, 2. &amp;quot;Create from existing source&amp;quot;, 3. &amp;quot;File System&amp;quot;. Some of these trigger an error regarding the Nutch MP3 code.)&lt;br /&gt;
&lt;br /&gt;
The DiscoverEd source code lives in two locations:&lt;br /&gt;
&lt;br /&gt;
* ded/src/java contains DiscoverEd specific code, primarily related to interfacing with the RDF store.&lt;br /&gt;
* src/plugins/cclearn contains the DiscoverEd Nutch plugin, which provides some filtering features to Nutch and ensures metadata indexed in the RDF store is injected into the Lucene index&lt;br /&gt;
&lt;br /&gt;
Generally, the plugin may depend upon code in the ded/src/java tree, but classes in the plugin may not be available to that code.&lt;br /&gt;
&lt;br /&gt;
= Committing Changes and Merging to the Main Repository =&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Troubleshooting =&lt;br /&gt;
&lt;br /&gt;
==== I get a big long Java backtrace talking about Jena and MySQL the first time I run the code ====&lt;br /&gt;
&lt;br /&gt;
This means that you need to CREATE DATABASE discovered in MySQL. DiscoverEd stores its data in MySQL by default, and you need to either (a) create that database, or (b) choose a different configuration file.&lt;br /&gt;
&lt;br /&gt;
==== Database permissions ====&lt;br /&gt;
&lt;br /&gt;
You might need to change the MySQL credentials or database configuration value in &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;. DiscoverEd does not require that you use the root user; it does require that the database already exist.&lt;br /&gt;
&lt;br /&gt;
==== JAVA_HOME on a Mac ====&lt;br /&gt;
&lt;br /&gt;
Mac users setting JAVA_HOME should run &amp;lt;code&amp;gt;/usr/libexec/java_home&amp;lt;/code&amp;gt; to determine the current JAVA_HOME.&lt;br /&gt;
&lt;br /&gt;
If you're really lazy, add&lt;br /&gt;
 export JAVA_HOME=`/usr/libexec/java_home`&lt;br /&gt;
to &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; and it will set JAVA_HOME each time you invoke a shell. (This is a good idea!)&lt;br /&gt;
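&lt;br /&gt;
A slightly more defensive variant of that profile line, as a minimal sketch: it assumes the same profile file may also be read on machines that lack the macOS helper, so it checks for the helper before calling it. The guard is an addition for illustration, not something DiscoverEd requires.&lt;br /&gt;
&lt;br /&gt;
```shell
# Set JAVA_HOME from the macOS helper only when the helper is present;
# on other systems, leave whatever JAVA_HOME is already set untouched.
if [ -x /usr/libexec/java_home ]; then
    JAVA_HOME="$(/usr/libexec/java_home)"
    export JAVA_HOME
fi
```
&lt;br /&gt;
Exporting the variable matters because tools started from the shell, such as ant, only see exported variables.&lt;br /&gt;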
&lt;br /&gt;
==== Error message: &amp;quot;Feature 'http://apache.org/xml/features/xinclude' is not recognized.&amp;quot; ====&lt;br /&gt;
&amp;quot;You probably have an older version of Xerces somewhere in your classpath or something is overriding the default parser configuration with one that doesn't support XInclude.&amp;quot; (http://marc.info/?l=xerces-j-user&amp;amp;m=117066278506146&amp;amp;w=2)&lt;br /&gt;
&lt;br /&gt;
==== AccessControlException ====&lt;br /&gt;
&lt;br /&gt;
When starting Tomcat, if you get a traceback like this in your tomcat log (e.g., in /var/lib/tomcat6/logs/localhost-$date.log):&lt;br /&gt;
&lt;br /&gt;
 SEVERE: Exception sending context initialized event to listener instance of class org.apache.nutch.searcher.NutchBean$NutchBeanConstructor&lt;br /&gt;
 java.lang.RuntimeException: java.security.AccessControlException: access denied (java.lang.reflect.ReflectPermission suppressAccessChecks)&lt;br /&gt;
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1377)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)&lt;br /&gt;
&lt;br /&gt;
and so on, try changing the Tomcat policy in /etc/tomcat6/policy.d/04webapps.policy. Add these lines in the grant {} block:&lt;br /&gt;
&lt;br /&gt;
    // Attempt to get Nutch working&lt;br /&gt;
    // Courtesy of Alex McLintock at http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200907.mbox/&amp;lt;d398ec7f0907041237j6acffe0fm10b7cd374a77795b@mail.gmail.com&amp;gt;&lt;br /&gt;
    permission java.security.AllPermission;&lt;br /&gt;
&lt;br /&gt;
This is obviously inappropriate for any site running a public instance of DiscoverEd. But it might be useful for your local dev environment. If you know how to specify a class level permission, please update this document.&lt;br /&gt;
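&lt;br /&gt;
One narrower alternative, offered as an untested sketch rather than a confirmed fix: grant only the permission named in the traceback, and only to code loaded from the web application's directory. The directory name nutch-1.1 is an assumption based on the WAR file name used earlier on this page. Nutch and Hadoop may well need further permissions, each of which would show up as a new AccessControlException, so treat this as a starting point. Unlike the AllPermission line, this is a complete grant block of its own and goes at the top level of the policy file.&lt;br /&gt;
&lt;br /&gt;
```
// Assumption: the WAR was deployed to ${catalina.base}/webapps/nutch-1.1
grant codeBase "file:${catalina.base}/webapps/nutch-1.1/-" {
    // The permission reported as denied in the traceback
    permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
};
```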
&lt;br /&gt;
==== Missing build/plugins ====&lt;br /&gt;
&lt;br /&gt;
Be sure to run ant in the root repo directory.&lt;br /&gt;
&lt;br /&gt;
==== Missing parse-mp3 plugin ====&lt;br /&gt;
&lt;br /&gt;
Remove that source folder from the build path (in Eclipse: Project &amp;gt; Properties &amp;gt; Java Build Path &amp;gt; Source).&lt;br /&gt;
&lt;br /&gt;
==== Eclipse complains: Wrong version number in .class file ====&lt;br /&gt;
&lt;br /&gt;
Use Java 1.6 as your compiler. Be sure to use the right JVM for this project.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35559</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35559"/>
				<updated>2010-06-16T14:17:58Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Troubleshooting */ +section about what to do if you're missing build/plugins&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;How to deploy a hackable DiscoverEd, make changes, and update your deployment&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
= Check out and build the source code =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ git checkout (whatever branch we're working on today)&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Add a curator and a feed =&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
By default, DiscoverEd uses MySQL and looks for a database called &amp;lt;code&amp;gt;discovered&amp;lt;/code&amp;gt;. '''Configure your database settings by editing &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;.'''&lt;br /&gt;
&lt;br /&gt;
Make sure the database exists and then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/english/@@rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See [[DiscoverEd Feeds]] for information on supported feed types.&lt;br /&gt;
&lt;br /&gt;
For more information on &amp;lt;code&amp;gt;./bin/feeds&amp;lt;/code&amp;gt; commands, see [[Running DiscoverEd]] (some of the information there is specific to discovered.cc).&lt;br /&gt;
&lt;br /&gt;
= Aggregate and crawl resources =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Run the web application =&lt;br /&gt;
&lt;br /&gt;
'''Edit conf/nutch-site.xml to point to your crawl location.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ant war&lt;br /&gt;
$ cp build/nutch-1.1.war [substitute the location for your J2EE container here; e.g., /var/lib/tomcat6/webapps ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Hacking The Code  =&lt;br /&gt;
&lt;br /&gt;
* Run Eclipse&lt;br /&gt;
* Do File -&amp;gt; Import...&lt;br /&gt;
** When it offers &amp;quot;Existing projects into workspace,&amp;quot; choose &amp;quot;General -&amp;gt; File System&amp;quot; instead&lt;br /&gt;
** Select the location of your source tree&lt;br /&gt;
** Click Finish&lt;br /&gt;
&lt;br /&gt;
(There are three options: 1. &amp;quot;Existing projects into workspace&amp;quot;, 2. &amp;quot;Create from existing source&amp;quot;, 3. &amp;quot;File System&amp;quot;. Some of these trigger an error regarding the Nutch MP3 code.)&lt;br /&gt;
&lt;br /&gt;
The DiscoverEd source code lives in two locations:&lt;br /&gt;
&lt;br /&gt;
* ded/src/java contains DiscoverEd specific code, primarily related to interfacing with the RDF store.&lt;br /&gt;
* src/plugins/cclearn contains the DiscoverEd Nutch plugin, which provides some filtering features to Nutch and ensures metadata indexed in the RDF store is injected into the Lucene index&lt;br /&gt;
&lt;br /&gt;
Generally, the plugin may depend upon code in the ded/src/java tree, but classes in the plugin may not be available to that code.&lt;br /&gt;
&lt;br /&gt;
= Committing Changes and Merging to the Main Repository =&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Troubleshooting =&lt;br /&gt;
&lt;br /&gt;
==== I get a big long Java backtrace talking about Jena and MySQL the first time I run the code ====&lt;br /&gt;
&lt;br /&gt;
This means that you need to CREATE DATABASE discovered in MySQL. DiscoverEd stores its data in MySQL by default, and you need to either (a) create that database, or (b) choose a different configuration file.&lt;br /&gt;
&lt;br /&gt;
==== Database permissions ====&lt;br /&gt;
&lt;br /&gt;
You might need to change the MySQL credentials or database configuration value in &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;. DiscoverEd does not require that you use the root user; it does require that the database already exist.&lt;br /&gt;
&lt;br /&gt;
==== JAVA_HOME on a Mac ====&lt;br /&gt;
&lt;br /&gt;
Mac users setting JAVA_HOME should run &amp;lt;code&amp;gt;/usr/libexec/java_home&amp;lt;/code&amp;gt; to determine the current JAVA_HOME.&lt;br /&gt;
&lt;br /&gt;
If you're really lazy, add&lt;br /&gt;
 export JAVA_HOME=`/usr/libexec/java_home`&lt;br /&gt;
to &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; and it will set JAVA_HOME each time you invoke a shell. (This is a good idea!)&lt;br /&gt;
&lt;br /&gt;
==== Error message: &amp;quot;Feature 'http://apache.org/xml/features/xinclude' is not recognized.&amp;quot; ====&lt;br /&gt;
&amp;quot;You probably have an older version of Xerces somewhere in your classpath or something is overriding the default parser configuration with one that doesn't support XInclude.&amp;quot; (http://marc.info/?l=xerces-j-user&amp;amp;m=117066278506146&amp;amp;w=2)&lt;br /&gt;
&lt;br /&gt;
==== AccessControlException ====&lt;br /&gt;
&lt;br /&gt;
When starting Tomcat, if you get a traceback like this in your tomcat log (e.g., in /var/lib/tomcat6/logs/localhost-$date.log):&lt;br /&gt;
&lt;br /&gt;
 SEVERE: Exception sending context initialized event to listener instance of class org.apache.nutch.searcher.NutchBean$NutchBeanConstructor&lt;br /&gt;
 java.lang.RuntimeException: java.security.AccessControlException: access denied (java.lang.reflect.ReflectPermission suppressAccessChecks)&lt;br /&gt;
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1377)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)&lt;br /&gt;
&lt;br /&gt;
and so on, try changing the Tomcat policy in /etc/tomcat6/policy.d/04webapps.policy. Add these lines in the grant {} block:&lt;br /&gt;
&lt;br /&gt;
    // Attempt to get Nutch working&lt;br /&gt;
    // Courtesy of Alex McLintock at http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200907.mbox/&amp;lt;d398ec7f0907041237j6acffe0fm10b7cd374a77795b@mail.gmail.com&amp;gt;&lt;br /&gt;
    permission java.security.AllPermission;&lt;br /&gt;
&lt;br /&gt;
This is obviously inappropriate for any site running a public instance of DiscoverEd. But it might be useful for your local dev environment. If you know how to specify a class level permission, please update this document.&lt;br /&gt;
&lt;br /&gt;
==== Missing build/plugins ====&lt;br /&gt;
&lt;br /&gt;
Be sure to run &amp;lt;code&amp;gt;ant&amp;lt;/code&amp;gt; in the root repo directory.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35543</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35543"/>
				<updated>2010-06-15T18:59:59Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* AccessControlException */ formatting edit&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;How to deploy a hackable DiscoverEd, make changes, and update your deployment&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
= Check out and build the source code =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ git checkout (whatever branch we're working on today)&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Add a curator and a feed =&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
By default, DiscoverEd uses MySQL and looks for a database called &amp;lt;code&amp;gt;discovered&amp;lt;/code&amp;gt;. '''Configure your database settings by editing &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;.'''&lt;br /&gt;
&lt;br /&gt;
Make sure the database exists and then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/english/@@rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See [[DiscoverEd Feeds]] for information on supported feed types.&lt;br /&gt;
&lt;br /&gt;
For more information on &amp;lt;code&amp;gt;./bin/feeds&amp;lt;/code&amp;gt; commands, see [[Running DiscoverEd]] (some of the information there is specific to discovered.cc).&lt;br /&gt;
&lt;br /&gt;
= Aggregate and crawl resources =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Run the web application =&lt;br /&gt;
&lt;br /&gt;
'''Edit conf/nutch-site.xml to point to your crawl location.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ant war&lt;br /&gt;
$ cp build/nutch-1.1.war [substitute the location for your J2EE container here; e.g., /var/lib/tomcat6/webapps ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Hacking The Code  =&lt;br /&gt;
&lt;br /&gt;
* Run Eclipse&lt;br /&gt;
* Do File -&amp;gt; Import...&lt;br /&gt;
** When it offers &amp;quot;Existing projects into workspace,&amp;quot; choose &amp;quot;General -&amp;gt; File System&amp;quot; instead&lt;br /&gt;
** Select the location of your source tree&lt;br /&gt;
** Click Finish&lt;br /&gt;
&lt;br /&gt;
(There are three options: 1. &amp;quot;Existing projects into workspace&amp;quot;, 2. &amp;quot;Create from existing source&amp;quot;, 3. &amp;quot;File System&amp;quot;. Some of these trigger an error regarding the Nutch MP3 code.)&lt;br /&gt;
&lt;br /&gt;
The DiscoverEd source code lives in two locations:&lt;br /&gt;
&lt;br /&gt;
* ded/src/java contains DiscoverEd specific code, primarily related to interfacing with the RDF store.&lt;br /&gt;
* src/plugins/cclearn contains the DiscoverEd Nutch plugin, which provides some filtering features to Nutch and ensures metadata indexed in the RDF store is injected into the Lucene index&lt;br /&gt;
&lt;br /&gt;
Generally, the plugin may depend upon code in the ded/src/java tree, but classes in the plugin may not be available to that code.&lt;br /&gt;
&lt;br /&gt;
= Committing Changes and Merging to the Main Repository =&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Troubleshooting =&lt;br /&gt;
&lt;br /&gt;
==== I get a big long Java backtrace talking about Jena and MySQL the first time I run the code ====&lt;br /&gt;
&lt;br /&gt;
This means that you need to CREATE DATABASE discovered in MySQL. DiscoverEd stores its data in MySQL by default, and you need to either (a) create that database, or (b) choose a different configuration file.&lt;br /&gt;
&lt;br /&gt;
==== Database permissions ====&lt;br /&gt;
&lt;br /&gt;
You might need to change the MySQL credentials or database configuration value in &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;. DiscoverEd does not require that you use the root user; it does require that the database already exist.&lt;br /&gt;
&lt;br /&gt;
==== JAVA_HOME on a Mac ====&lt;br /&gt;
&lt;br /&gt;
Mac users setting JAVA_HOME should run &amp;lt;code&amp;gt;/usr/libexec/java_home&amp;lt;/code&amp;gt; to determine the current JAVA_HOME.&lt;br /&gt;
&lt;br /&gt;
If you're really lazy, add&lt;br /&gt;
 export JAVA_HOME=`/usr/libexec/java_home`&lt;br /&gt;
to &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; and it will set JAVA_HOME each time you invoke a shell. (This is a good idea!)&lt;br /&gt;
&lt;br /&gt;
==== Error message: &amp;quot;Feature 'http://apache.org/xml/features/xinclude' is not recognized.&amp;quot; ====&lt;br /&gt;
&amp;quot;You probably have an older version of Xerces somewhere in your classpath or something is overriding the default parser configuration with one that doesn't support XInclude.&amp;quot; (http://marc.info/?l=xerces-j-user&amp;amp;m=117066278506146&amp;amp;w=2)&lt;br /&gt;
&lt;br /&gt;
==== AccessControlException ====&lt;br /&gt;
&lt;br /&gt;
When starting Tomcat, if you get a traceback like this in your tomcat log (e.g., in /var/lib/tomcat6/logs/localhost-$date.log):&lt;br /&gt;
&lt;br /&gt;
 SEVERE: Exception sending context initialized event to listener instance of class org.apache.nutch.searcher.NutchBean$NutchBeanConstructor&lt;br /&gt;
 java.lang.RuntimeException: java.security.AccessControlException: access denied (java.lang.reflect.ReflectPermission suppressAccessChecks)&lt;br /&gt;
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1377)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)&lt;br /&gt;
&lt;br /&gt;
and so on, try changing the Tomcat policy in /etc/tomcat6/policy.d/04webapps.policy. Add these lines in the grant {} block:&lt;br /&gt;
&lt;br /&gt;
    // Attempt to get Nutch working&lt;br /&gt;
    // Courtesy of Alex McLintock at http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200907.mbox/&amp;lt;d398ec7f0907041237j6acffe0fm10b7cd374a77795b@mail.gmail.com&amp;gt;&lt;br /&gt;
    permission java.security.AllPermission;&lt;br /&gt;
&lt;br /&gt;
This is obviously inappropriate for any site running a public instance of DiscoverEd. But it might be useful for your local dev environment. If you know how to specify a class level permission, please update this document.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35542</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35542"/>
				<updated>2010-06-15T18:59:20Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Troubleshooting */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;How to deploy a hackable DiscoverEd, make changes, and update your deployment&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
= Check out and build the source code =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ git checkout (whatever branch we're working on today)&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Add a curator and a feed =&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
By default, DiscoverEd uses MySQL and looks for a database called &amp;lt;code&amp;gt;discovered&amp;lt;/code&amp;gt;. '''Configure your database settings by editing &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;.'''&lt;br /&gt;
&lt;br /&gt;
Make sure the database exists and then:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/english/@@rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See [[DiscoverEd Feeds]] for information on supported feed types.&lt;br /&gt;
&lt;br /&gt;
For more information on &amp;lt;code&amp;gt;./bin/feeds&amp;lt;/code&amp;gt; commands, see [[Running DiscoverEd]] (some of the information there is specific to discovered.cc).&lt;br /&gt;
&lt;br /&gt;
= Aggregate and crawl resources =&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Run the web application =&lt;br /&gt;
&lt;br /&gt;
'''Edit conf/nutch-site.xml to point to your crawl location.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ant war&lt;br /&gt;
$ cp build/nutch-1.1.war [substitute the location for your J2EE container here; e.g., /var/lib/tomcat6/webapps ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Hacking The Code  =&lt;br /&gt;
&lt;br /&gt;
* Run Eclipse&lt;br /&gt;
* Do File -&amp;gt; Import...&lt;br /&gt;
** When it offers &amp;quot;Existing projects into workspace,&amp;quot; choose &amp;quot;General -&amp;gt; File System&amp;quot; instead&lt;br /&gt;
** Select the location of your source tree&lt;br /&gt;
** Click Finish&lt;br /&gt;
&lt;br /&gt;
(There are three options: 1. &amp;quot;Existing projects into workspace&amp;quot;, 2. &amp;quot;Create from existing source&amp;quot;, 3. &amp;quot;File System&amp;quot;. Some of these trigger an error regarding the Nutch MP3 code.)&lt;br /&gt;
&lt;br /&gt;
The DiscoverEd source code lives in two locations:&lt;br /&gt;
&lt;br /&gt;
* ded/src/java contains DiscoverEd specific code, primarily related to interfacing with the RDF store.&lt;br /&gt;
* src/plugins/cclearn contains the DiscoverEd Nutch plugin, which provides some filtering features to Nutch and ensures metadata indexed in the RDF store is injected into the Lucene index&lt;br /&gt;
&lt;br /&gt;
Generally, the plugin may depend upon code in the ded/src/java tree, but classes in the plugin may not be available to that code.&lt;br /&gt;
&lt;br /&gt;
= Committing Changes and Merging to the Main Repository =&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Troubleshooting =&lt;br /&gt;
&lt;br /&gt;
==== I get a big long Java backtrace talking about Jena and MySQL the first time I run the code ====&lt;br /&gt;
&lt;br /&gt;
This means that you need to CREATE DATABASE discovered in MySQL. DiscoverEd stores its data in MySQL by default, and you need to either (a) create that database, or (b) choose a different configuration file.&lt;br /&gt;
&lt;br /&gt;
==== Database permissions ====&lt;br /&gt;
&lt;br /&gt;
You might need to change the MySQL credentials or database configuration value in &amp;lt;code&amp;gt;conf/discovered.xml&amp;lt;/code&amp;gt;. DiscoverEd does not require that you use the root user; it does require that the database already exist.&lt;br /&gt;
&lt;br /&gt;
==== JAVA_HOME on a Mac ====&lt;br /&gt;
&lt;br /&gt;
Mac users setting JAVA_HOME should run &amp;lt;code&amp;gt;/usr/libexec/java_home&amp;lt;/code&amp;gt; to determine the current JAVA_HOME.&lt;br /&gt;
&lt;br /&gt;
If you're really lazy, add&lt;br /&gt;
 export JAVA_HOME=`/usr/libexec/java_home`&lt;br /&gt;
to &amp;lt;code&amp;gt;.bash_profile&amp;lt;/code&amp;gt; and it will set JAVA_HOME each time you invoke a shell. (This is a good idea!)&lt;br /&gt;
&lt;br /&gt;
==== Error message: &amp;quot;Feature 'http://apache.org/xml/features/xinclude' is not recognized.&amp;quot; ====&lt;br /&gt;
&amp;quot;You probably have an older version of Xerces somewhere in your classpath or something is overriding the default parser configuration with one that doesn't support XInclude.&amp;quot; (http://marc.info/?l=xerces-j-user&amp;amp;m=117066278506146&amp;amp;w=2)&lt;br /&gt;
&lt;br /&gt;
==== AccessControlException ====&lt;br /&gt;
&lt;br /&gt;
When starting Tomcat, if you get a traceback like this in your tomcat log (e.g., in /var/lib/tomcat6/logs/localhost-$date.log):&lt;br /&gt;
&lt;br /&gt;
 SEVERE: Exception sending context initialized event to listener instance of class org.apache.nutch.searcher.NutchBean$NutchBeanConstructor&lt;br /&gt;
 java.lang.RuntimeException: java.security.AccessControlException: access denied (java.lang.reflect.ReflectPermission suppressAccessChecks)&lt;br /&gt;
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1377)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)&lt;br /&gt;
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)&lt;br /&gt;
&lt;br /&gt;
and so on, try changing the Tomcat policy in /etc/tomcat6/policy.d/04webapps.policy. Add these lines in the grant {} block:&lt;br /&gt;
&lt;br /&gt;
    // Attempt to get Nutch working&lt;br /&gt;
    // Courtesy of Alex McLintock at http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200907.mbox/&amp;lt;d398ec7f0907041237j6acffe0fm10b7cd374a77795b@mail.gmail.com&amp;gt;&lt;br /&gt;
    permission java.security.AllPermission;&lt;br /&gt;
&lt;br /&gt;
This is obviously inappropriate for any site running a public instance of DiscoverEd. But it might be useful for your local dev environment. If you know how to specify a class level permission, please update this document.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35473</id>
		<title>DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35473"/>
				<updated>2010-06-15T13:57:32Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Additional Information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DiscoverEd is a search prototype developed by Creative Commons to explore metadata enhanced search, specifically for OER.  DiscoverEd combines full text search with [[DiscoverEd Data|metadata about the resources]].  DiscoverEd is built on [http://lucene.apache.org/nutch/ Nutch].&lt;br /&gt;
&lt;br /&gt;
== Additional Information ==&lt;br /&gt;
&lt;br /&gt;
* Source repository  ([http://gitorious.org/discovered gitorious])&lt;br /&gt;
* Project planning ([https://www.pivotaltracker.com/projects/77041 Pivotal Tracker])&lt;br /&gt;
* [[DiscoverEd/Development notes]]&lt;br /&gt;
* [[/Deploy for programmers]]&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35470</id>
		<title>Hacking DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Hacking_DiscoverEd&amp;diff=35470"/>
				<updated>2010-06-15T13:57:02Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: Created page with 'Edit this page here: http://piratepad.net/IgSLjgAcA2'&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Edit this page here: http://piratepad.net/IgSLjgAcA2&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Sprint_(June,_2010)&amp;diff=35415</id>
		<title>DiscoverEd Sprint (June, 2010)</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Sprint_(June,_2010)&amp;diff=35415"/>
				<updated>2010-06-14T16:34:16Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Attendees */ fix minor typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
* '''What:''' A sprint on development of [[DiscoverEd]]; see [[:Category:DiscoverEd_Specification|DiscoverEd Specifications]] for possible areas of work&lt;br /&gt;
* '''When:''' Tuesday, June 15 through Thursday, June 17, 2010&lt;br /&gt;
* '''Where:''' [http://vudat.msu.edu/location/ Wills House], Michigan State University, East Lansing, MI ([http://maps.google.com/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=101+Wills+House,+east+lansing,+mi&amp;amp;sll=37.0625,-95.677068&amp;amp;sspn=41.818029,58.447266&amp;amp;ie=UTF8&amp;amp;hq=&amp;amp;hnear=Wills+House,+East+Lansing,+Ingham,+Michigan+48823&amp;amp;z=14 map])&lt;br /&gt;
&lt;br /&gt;
== Attendees ==&lt;br /&gt;
&lt;br /&gt;
* Asheesh Laroia (OpenHatch / Creative Commons)&lt;br /&gt;
* Raphael Krut-Landau (OpenHatch / Creative Commons)&lt;br /&gt;
* [[User:Nathan Yergler|Nathan Yergler]] (Creative Commons)&lt;br /&gt;
* Alex Kozak (Creative Commons)&lt;br /&gt;
* Ali Asad Lotia (open.michigan)&lt;br /&gt;
* Kevin Coffman (open.michigan)&lt;br /&gt;
* ''add your name and affiliation here''&lt;br /&gt;
&lt;br /&gt;
== Travel &amp;amp; Accommodations ==&lt;br /&gt;
&lt;br /&gt;
* [http://vudat.msu.edu/directions/ Directions to Wills House]&lt;br /&gt;
** Brendan Guenther will provide MSU Guest Parking Passes when you arrive in the morning&lt;br /&gt;
* Area Hotels (mention MSU for discounted rate)&lt;br /&gt;
** [http://www.marriott.com/hotels/travel/lants-towneplace-suites-east-lansing/ TownePlace Suites by Marriott]&lt;br /&gt;
** [http://www.hamptoninn.com/en/hp/hotels/index.jhtml?ctyhocn=LANETHX Hampton Inn]&lt;br /&gt;
&lt;br /&gt;
== Agenda ==&lt;br /&gt;
&lt;br /&gt;
''This is a draft, subject to change.''&lt;br /&gt;
&lt;br /&gt;
=== Tuesday ===&lt;br /&gt;
&lt;br /&gt;
* 9:00 AM - Welcome&lt;br /&gt;
* DiscoverEd Context: Why, When, Where are we going? (10 min)&lt;br /&gt;
* Introductions&amp;lt;br/&amp;gt;Developers introduce themselves, give brief statement on what they've done with DiscoverEd, what they're interested in working on. (5 min ea.)&lt;br /&gt;
* MSU Context: AgShare, FSKN, etc (Chris Geith, 10-15 min)&lt;br /&gt;
* 10:30 AM - Identify Themes, Possible Blocks of Work&lt;br /&gt;
* 11:30 AM - Pair up and begin work&lt;br /&gt;
* 4:45 PM - Brief report back from each group: unexpected issues, things to bring to the group as a whole, etc.&lt;br /&gt;
&lt;br /&gt;
=== Wednesday ===&lt;br /&gt;
&lt;br /&gt;
No schedule; work as pairs.&lt;br /&gt;
&lt;br /&gt;
=== Thursday ===&lt;br /&gt;
&lt;br /&gt;
* 3:30 PM - Pairs begin to make sure work is pushed to Gitorious, determine next steps&lt;br /&gt;
* 4:00 PM - Full report back: pairs report on progress and the state of their code. Pairs are encouraged to identify follow-up steps they will take after the sprint to see tasks through to completion.&lt;br /&gt;
&lt;br /&gt;
== Preparation ==&lt;br /&gt;
&lt;br /&gt;
To minimize time spent configuring laptops and similar setup, please try to do the following before arriving at the sprint:&lt;br /&gt;
&lt;br /&gt;
# Make sure you have prerequisite software installed and working:&lt;br /&gt;
#* git (Windows users, see http://progit.org/book/ch1-4.html and http://code.google.com/p/msysgit/)&lt;br /&gt;
#* Java 1.6 JDK&lt;br /&gt;
#* Eclipse (not required, but can make life easier)&lt;br /&gt;
# Generate an SSH public key (if needed; see http://progit.org/book/ch4-3.html for some instructions)&lt;br /&gt;
# Create a [http://gitorious.org Gitorious] account and add your SSH key to it&lt;br /&gt;
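&lt;br /&gt;
The checklist above can be verified with a small script. This is only a sketch under stated assumptions: the executable names &amp;quot;git&amp;quot; and &amp;quot;javac&amp;quot;, the key location ~/.ssh, and the helper names are illustrative and are not part of DiscoverEd. It does not check the Java version or the Gitorious account.&lt;br /&gt;

```python
import shutil
from pathlib import Path

# Command names are assumptions: "git" and "javac" are the usual executables
# for git and a Java JDK. Eclipse is optional, so it is not checked.
REQUIRED_TOOLS = ["git", "javac"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def find_public_keys(ssh_dir):
    """Return the names of public key files (*.pub) found in `ssh_dir`."""
    ssh_dir = Path(ssh_dir)
    if not ssh_dir.is_dir():
        return []
    return sorted(path.name for path in ssh_dir.glob("*.pub"))


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS):
        print("Missing prerequisite:", tool)
    keys = find_public_keys(Path.home() / ".ssh")
    print("SSH public keys found:", ", ".join(keys) if keys else "none")
```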
&lt;br /&gt;
=== Code Preparation ===&lt;br /&gt;
&lt;br /&gt;
''TBD''&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Linked_Data_Curation&amp;diff=35264</id>
		<title>Linked Data Curation</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Linked_Data_Curation&amp;diff=35264"/>
				<updated>2010-06-09T14:29:17Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: minor typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Nathan Yergler&lt;br /&gt;
|project=DiscoverEd&lt;br /&gt;
|status=Draft&lt;br /&gt;
}}&lt;br /&gt;
DiscoverEd currently relies on feeds or OAI-PMH to aggregate resources from a curator.  Third-party curators may wish to provide a list of resources with additional metadata without the overhead or restrictions of providing feeds or OAI-PMH support.  This specification describes a linked, structured data model for curating resources for DiscoverEd without feeds or OAI-PMH.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.openarchives.org/ore/ OAI-ORE] (Object Reuse &amp;amp; Exchange)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Resource_Analytics&amp;diff=35263</id>
		<title>Resource Analytics</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Resource_Analytics&amp;diff=35263"/>
				<updated>2010-06-09T14:28:19Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* Requirements */ minor typo&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Nathan Yergler&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=Draft&lt;br /&gt;
}}&lt;br /&gt;
Curators (both publishers and third parties) want to verify that their resources are ingested correctly and to see how often those resources are searched for and used.  Operators want to explore unsuccessful searches and understand how users interact with DiscoverEd.  Analytics will provide information about indexed resources, searches performed, and user activity in a DiscoverEd instance.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
* Provide a web-based dashboard which displays the status of curator indexes.  This includes: last aggregation, last crawl, number of resources indexed, any errors which occurred during aggregation or crawl.&lt;br /&gt;
* Provide a web page which displays search analytics.  This includes basic web analytics, such as number of visitors, bounce rate, referrers, and time spent on site.  It also includes DiscoverEd-specific analytics, including: &lt;br /&gt;
** searches grouped by number of results (for exploring queries that have no results), &lt;br /&gt;
** popular search terms, and&lt;br /&gt;
** resource refinements (i.e., do users refine by curator? subject? license?)&lt;br /&gt;
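&lt;br /&gt;
The first two DiscoverEd-specific reports could be computed along the following lines. This is a minimal sketch, not the planned implementation: the search log format, the sample queries, and the function names are all hypothetical, and a real instance would first need to record each query together with its result count.&lt;br /&gt;

```python
from collections import Counter, defaultdict

# Hypothetical search log: (query, number of results) pairs.
SEARCH_LOG = [
    ("biology cells", 12),
    ("tag:banana", 0),
    ("biology cells", 12),
    ("photosynthesis", 3),
    ("qwertyuiop", 0),
]


def group_by_result_count(log):
    """Map each result count to the sorted distinct queries that produced it."""
    groups = defaultdict(set)
    for query, count in log:
        groups[count].add(query)
    return {count: sorted(queries) for count, queries in groups.items()}


def popular_terms(log, top=3):
    """Count individual search terms across all logged queries."""
    counts = Counter(term for query, _ in log for term in query.split())
    return counts.most_common(top)


if __name__ == "__main__":
    # Queries with zero results are the ones operators want to explore.
    print("No results:", group_by_result_count(SEARCH_LOG).get(0, []))
    print("Popular terms:", popular_terms(SEARCH_LOG))
```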
&lt;br /&gt;
This may also include resource click-through tracking and reporting.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=Metadata_Provenance&amp;diff=35262</id>
		<title>Metadata Provenance</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=Metadata_Provenance&amp;diff=35262"/>
				<updated>2010-06-09T14:26:23Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &amp;quot;short coming&amp;quot; -&amp;gt; &amp;quot;shortcoming&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{DiscoverEd Specification&lt;br /&gt;
|contact=Nathan Yergler&lt;br /&gt;
|project=AgShare&lt;br /&gt;
|status=In Development&lt;br /&gt;
}}&lt;br /&gt;
{{Draft}}&lt;br /&gt;
&lt;br /&gt;
The initial version of DiscoverEd does not include provenance support.  Provenance means tracking the source of resource metadata.  Due to this limitation, DiscoverEd has limited ability to filter by curator.  While you can filter for resources with a specific curator, the remaining search terms are not limited to metadata provided by that curator.  This is a significant shortcoming for resources with multiple curators.&lt;br /&gt;
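&lt;br /&gt;
The limitation can be illustrated with a toy model. This is a sketch only: the record layout, the example URLs, and both function names are assumptions made for illustration and do not reflect how DiscoverEd or its RDF store actually represent metadata.&lt;br /&gt;

```python
# Hypothetical metadata records: (resource, field, value, curator).
# The fourth element is the provenance, i.e. who supplied the statement.
METADATA = [
    ("http://example.com/cells", "subject", "biology", "http://a.example.org"),
    ("http://example.com/cells", "subject", "chemistry", "http://b.example.org"),
    ("http://example.com/rocks", "subject", "geology", "http://a.example.org"),
]


def search_without_provenance(records, curator, field, value):
    """Match resources that have the curator and, separately, the field value."""
    curated = {res for res, _, _, cur in records if cur == curator}
    matching = {res for res, fld, val, _ in records if fld == field and val == value}
    return sorted(curated.intersection(matching))


def search_with_provenance(records, curator, field, value):
    """Match only field values that were supplied by the given curator."""
    return sorted({
        res for res, fld, val, cur in records
        if cur == curator and fld == field and val == value
    })
```

In this toy data, a search for curator http://a.example.org with subject &amp;quot;chemistry&amp;quot; matches the cells resource when provenance is ignored, even though that subject was supplied by a different curator. With provenance tracked, the same search matches nothing.&lt;br /&gt;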
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
* The provenance of metadata discovered through RSS, Atom, and OAI-PMH is stored in the RDF Store.&lt;br /&gt;
* Metadata extracted from structured data is stored with provenance reflecting the page it was extracted from.&lt;br /&gt;
* Users can filter a query to exclude a curator, and metadata provided by that curator is not considered for other query terms.  For example, &amp;quot;&amp;lt;code&amp;gt;-curator:http://example.org subject:biology cells&amp;lt;/code&amp;gt;&amp;quot; would return results containing the term &amp;quot;cells&amp;quot;, with the subject tag &amp;quot;biology&amp;quot; provided by a curator &amp;lt;strong&amp;gt;other than&amp;lt;/strong&amp;gt; http://example.org.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Development_notes&amp;diff=35169</id>
		<title>DiscoverEd/Development notes</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Development_notes&amp;diff=35169"/>
				<updated>2010-06-07T18:15:15Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: Created page with 'To start using Eclipse on another computer, you'll need to go File &amp;gt; Import and specify the root directory in the repo  (We actually haven't totally checked that this works yet.)'&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To start using Eclipse on another computer, you'll need to go to File &amp;gt; Import and specify the root directory of the repo.&lt;br /&gt;
&lt;br /&gt;
(We actually haven't totally checked that this works yet.)&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35167</id>
		<title>DiscoverEd</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd&amp;diff=35167"/>
				<updated>2010-06-07T18:14:30Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: add link to * DiscoverEd/Development notes&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;DiscoverEd is a search prototype developed by Creative Commons to explore metadata-enhanced search, specifically for open educational resources (OER).  DiscoverEd combines full text search with [[DiscoverEd Data|metadata about the resources]].  DiscoverEd is built on [http://lucene.apache.org/nutch/ Nutch].&lt;br /&gt;
&lt;br /&gt;
== Additional Information ==&lt;br /&gt;
&lt;br /&gt;
* Source repository  ([http://gitorious.org/discovered gitorious])&lt;br /&gt;
* Project planning ([https://www.pivotaltracker.com/projects/77041 Pivotal Tracker])&lt;br /&gt;
* [[DiscoverEd/Development notes]]&lt;br /&gt;
&lt;br /&gt;
[[Category:DiscoverEd]]&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34557</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34557"/>
				<updated>2010-05-11T14:50:14Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* How to install DiscoverEd */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 160%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The script will check for dependencies.&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps use the search engine without using a web browser. To make it all work in your web browser, the script will then do the following:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
=== What you should expect to see ===&lt;br /&gt;
&lt;br /&gt;
This is a development branch. What you'll see is a search engine that says &amp;quot;DiscoverEd&amp;quot;, but which is colored and laid out incorrectly. The text of the results won't display. (We're working on it.)&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34556</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34556"/>
				<updated>2010-05-11T14:48:51Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What does the script do? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 160%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps use the search engine without using a web browser. To make it all work in your web browser, the script will then do the following:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
=== What you should expect to see ===&lt;br /&gt;
&lt;br /&gt;
This is a development branch. What you'll see is a search engine that says &amp;quot;DiscoverEd&amp;quot;, but which is colored and laid out incorrectly. The text of the results won't display. (We're working on it.)&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34555</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34555"/>
				<updated>2010-05-11T14:45:03Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: /* What does the script do? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 160%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
=== What you should expect to see ===&lt;br /&gt;
&lt;br /&gt;
This is a development branch. What you'll see is a search engine that says &amp;quot;DiscoverEd&amp;quot;, but which is colored and laid out incorrectly. The text of the results won't display. (We're working on it.)&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34494</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34494"/>
				<updated>2010-05-07T19:03:09Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 160%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
=== What you should expect to see ===&lt;br /&gt;
&lt;br /&gt;
This is a development branch. What you'll see is a search engine that says &amp;quot;DiscoverEd&amp;quot;, but which is colored and laid out incorrectly. The text of the results won't display. (We're working on it.)&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34493</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34493"/>
				<updated>2010-05-07T18:57:05Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 160%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34492</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34492"/>
				<updated>2010-05-07T18:50:41Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
1. Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em; line-height: 140%;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
2. Edit the script and change &amp;lt;tt&amp;gt;MYSQL_ROOT_PASSWORD&amp;lt;/tt&amp;gt; to the right thing.&lt;br /&gt;
&lt;br /&gt;
3. Run this command to execute the script.&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34491</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34491"/>
				<updated>2010-05-07T18:50:19Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
1. Run these commands to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
cd /tmp/ # As good a place as any&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
2. Edit the script and change &amp;lt;tt&amp;gt;MYSQL_ROOT_PASSWORD&amp;lt;/tt&amp;gt; to the right thing.&lt;br /&gt;
&lt;br /&gt;
3. Run this command to execute the script.&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34490</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34490"/>
				<updated>2010-05-07T18:49:28Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
1. Run this command to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
2. Edit the script and change &amp;lt;tt&amp;gt;MYSQL_ROOT_PASSWORD&amp;lt;/tt&amp;gt; to the right thing.&lt;br /&gt;
&lt;br /&gt;
3. Run this command to execute the script.&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in &amp;lt;tt&amp;gt;/var/lib/discovered/&amp;lt;/tt&amp;gt;&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34489</id>
		<title>DiscoverEd Quickstart</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd_Quickstart&amp;diff=34489"/>
				<updated>2010-05-07T18:48:53Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
&lt;br /&gt;
== How to install DiscoverEd ==&lt;br /&gt;
&lt;br /&gt;
1. Run this command to download a quickstart script.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
wget http://gitorious.org/discovered/repo/blobs/raw/deploy_script/gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
2. Edit the script and change &amp;lt;tt&amp;gt;MYSQL_ROOT_PASSWORD&amp;lt;/tt&amp;gt; to the right thing.&lt;br /&gt;
&lt;br /&gt;
3. Run this command to execute the script.&lt;br /&gt;
&amp;lt;pre style='margin: 0 0 2em 2em;'&amp;gt;&lt;br /&gt;
bash gimme-discovered&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== What does the script do? ===&lt;br /&gt;
&lt;br /&gt;
This will do a few things:&lt;br /&gt;
* Install the DiscoverEd code in /var/lib/discovered&lt;br /&gt;
* Add a sample curator, and a sample feed&lt;br /&gt;
* Download the web pages linked to by that feed&lt;br /&gt;
* Run a test search for the term &amp;quot;crime&amp;quot;, and print the results to your terminal&lt;br /&gt;
&lt;br /&gt;
The above steps are performed by DiscoverEd. Next, we install a copy of the excellent web server Tomcat, which will allow you to perform searches through your web browser. So the script will do a few more things:&lt;br /&gt;
* Install a copy of the excellent web server Tomcat in the same place&lt;br /&gt;
* Run that copy of Tomcat&lt;br /&gt;
* Open the search engine in Firefox&lt;br /&gt;
&lt;br /&gt;
== Or do it manually ==&lt;br /&gt;
&lt;br /&gt;
See '''[[DiscoverEd/Install manually]]'''.&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Install_manually&amp;diff=34488</id>
		<title>DiscoverEd/Install manually</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Install_manually&amp;diff=34488"/>
				<updated>2010-05-07T18:46:41Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: +cat +stub&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:DiscoverEd]]&lt;br /&gt;
{{Stub}}&lt;br /&gt;
&lt;br /&gt;
=== Check out and build the source code ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Add a curator and a feed ===&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/front-page/courselist/rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Aggregate and crawl resources ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ mkdir seed&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Run the web server ===&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	<entry>
		<id>https://wiki.creativecommons.org/index.php?title=DiscoverEd/Install_manually&amp;diff=34487</id>
		<title>DiscoverEd/Install manually</title>
		<link rel="alternate" type="text/html" href="https://wiki.creativecommons.org/index.php?title=DiscoverEd/Install_manually&amp;diff=34487"/>
				<updated>2010-05-07T18:45:51Z</updated>
		
		<summary type="html">&lt;p&gt;Dithyramble: This text was moved from DiscoverEd Quickstart&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Check out and build the source code ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ git clone git://gitorious.org/discovered/repo.git discovered&lt;br /&gt;
$ cd discovered&lt;br /&gt;
$ ant&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Add a curator and a feed ===&lt;br /&gt;
&lt;br /&gt;
DiscoverEd uses feeds to help identify resources to crawl.  Feeds are provided by curators, who can also provide metadata about resources.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds addcurator &amp;quot;ND OCW&amp;quot; http://ocw.nd.edu/ &lt;br /&gt;
$ ./bin/feeds addfeed rss http://ocw.nd.edu/front-page/courselist/rss http://ocw.nd.edu/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Aggregate and crawl resources ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./bin/feeds aggregate&lt;br /&gt;
$ mkdir seed&lt;br /&gt;
$ ./bin/feeds seed &amp;gt; seed/urls.txt&lt;br /&gt;
$ ant -f dedbuild.xml crawl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Run the web server ===&lt;/div&gt;</summary>
		<author><name>Dithyramble</name></author>	</entry>

	</feed>