DiscoverEd/Install manually
Latest revision as of 14:38, 7 September 2010
DiscoverEd is based on Nutch. As such, you may wish to consult the Nutch Wiki for general deployment questions.
Check out and build the source code
$ git clone git://gitorious.org/discovered/repo.git discovered
$ cd discovered
$ ant
Add a curator and a feed
DiscoverEd uses feeds to help identify resources to crawl. Feeds are provided by curators, who can also provide metadata about resources.
$ ./bin/feeds addcurator "ND OCW" http://ocw.nd.edu/
$ ./bin/feeds addfeed rss http://ocw.nd.edu/front-page/courselist/rss http://ocw.nd.edu/
Aggregate and crawl resources
$ ./bin/feeds aggregate
$ mkdir seed
$ ./bin/feeds seed > seed/urls.txt
$ ant -f dedbuild.xml crawl
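The steps above can be chained into one script. The following is a minimal sketch, not something shipped with DiscoverEd: the `run` helper, the `crawl_pipeline` function, and the `DRY_RUN` variable are hypothetical names introduced here. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, so you can review them before running anything.

```shell
#!/bin/sh
# Sketch: chain the aggregate, seed, and crawl steps from this section.
# DRY_RUN, run, and crawl_pipeline are illustrative names, not part of DiscoverEd.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it and stop on failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@" || exit 1
    fi
}

crawl_pipeline() {
    run ./bin/feeds aggregate
    # -p (unlike the plain mkdir above) lets the script be re-run safely.
    run mkdir -p seed
    # The redirect is handled separately because run() cannot wrap it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ ./bin/feeds seed > seed/urls.txt"
    else
        ./bin/feeds seed > seed/urls.txt || exit 1
    fi
    run ant -f dedbuild.xml crawl
}

crawl_pipeline
```

Run it from the top of the checkout with `DRY_RUN=0` to execute the commands for real.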
Run the web application
Edit conf/nutch-site.xml to point to your crawl location.
$ ant war
$ [copy the war file to your J2EE container]
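This page does not say which setting in conf/nutch-site.xml holds the crawl location. In stock Nutch 1.x the search web application reads it from the `searcher.dir` property. Assuming DiscoverEd keeps that behavior, the sketch below prints a property element you could paste inside the `<configuration>` element. The function name `searcher_dir_property` and the path `/path/to/discovered/crawl` are placeholders introduced for illustration.

```shell
#!/bin/sh
# Sketch: print a <property> element for conf/nutch-site.xml.
# searcher.dir is the stock Nutch 1.x setting; that DiscoverEd honors it
# unchanged is an assumption. searcher_dir_property is a hypothetical helper.
searcher_dir_property() {
    crawl_dir="$1"
    cat <<EOF
<property>
  <name>searcher.dir</name>
  <value>${crawl_dir}</value>
</property>
EOF
}

# /path/to/discovered/crawl is a placeholder; use your own crawl directory.
searcher_dir_property "/path/to/discovered/crawl"
```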
Switching to MySQL
By default, DiscoverEd (at least on the next branch) uses an on-disk database called Derby for storing resource metadata. You should use a different database, like MySQL, in production.
To do that, edit conf/discovered.xml and update the following sections as appropriate:
<property>
  <name>rdfstore.db.driver</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>rdfstore.db.url</name>
  <value>jdbc:mysql://localhost/discovered?autoReconnect=true</value>
</property>

<property>
  <name>rdfstore.db.user</name>
  <value>discovered</value>
</property>

<property>
  <name>rdfstore.db.password</name>
  <value></value>
</property>
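The settings above presuppose a MySQL database named `discovered` on localhost and a user `discovered` that can connect to it. This page does not show how to create them, so the following is only a sketch: the `discovered_sql` helper is a hypothetical name, and the `IF NOT EXISTS` form of `CREATE USER` assumes a reasonably recent MySQL server. The function only prints SQL, so you can inspect it before running it.

```shell
#!/bin/sh
# Sketch: print SQL that creates the database and user named in
# conf/discovered.xml. discovered_sql is a hypothetical helper.
# The password defaults to empty to match the example configuration;
# a password containing a single quote would break the quoting below.
discovered_sql() {
    db_password="${1:-}"
    cat <<EOF
CREATE DATABASE IF NOT EXISTS discovered;
CREATE USER IF NOT EXISTS 'discovered'@'localhost' IDENTIFIED BY '${db_password}';
GRANT ALL PRIVILEGES ON discovered.* TO 'discovered'@'localhost';
EOF
}

discovered_sql
```

To apply the statements, pipe them to the client as an administrative user, for example `discovered_sql | mysql -u root -p`. If you pass a password to `discovered_sql`, put the same value in `rdfstore.db.password`. The `com.mysql.jdbc.Driver` class comes from MySQL Connector/J, so that jar must also be on the application's classpath.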
Known issues
Derby and OAI-PMH aren't compatible
If you use the default Derby backend, OAI-PMH crawls won't work: the code fails with SQL syntax errors. We haven't fully diagnosed the problem. If you hit these errors, switch to MySQL as described in the "Switching to MySQL" section.
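Before starting such a crawl, it can help to confirm which backend is actually configured. The sketch below reads the `rdfstore.db.driver` value from conf/discovered.xml. The helpers `rdfstore_driver` and `check_backend` are hypothetical names, and the parsing assumes the `<value>` element follows its `<name>` element, as in the example configuration on this page.

```shell
#!/bin/sh
# Sketch: report which rdfstore backend a DiscoverEd config file selects.
# rdfstore_driver and check_backend are illustrative helpers.

# Print the <value> that follows <name>rdfstore.db.driver</name>, if any.
rdfstore_driver() {
    awk '
        /<name>rdfstore\.db\.driver<\/name>/ { found = 1 }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Classify the driver; an empty result means the property is not set,
# in which case the built-in default applies.
check_backend() {
    driver=$(rdfstore_driver "$1")
    case "$driver" in
        *mysql*) echo "mysql" ;;
        *derby*) echo "derby" ;;
        "")      echo "default" ;;
        *)       echo "other: $driver" ;;
    esac
}

if [ -f conf/discovered.xml ]; then
    check_backend conf/discovered.xml
fi
```

A result of `derby` or `default` means OAI-PMH crawls are likely to hit the errors described above.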