AgShare/Tech
The AgShare deployment works analogously to the CC Labs deployment of DiscoverEd. Some important things to note:
- Username: agshare
- Host name: search.agshare.org (currently the same as discovered.labs.creativecommons.org)
So, for example, to set up your environment, do:
$ sudo su - agshare
Given that, give Running DiscoverEd a look!
Deploying new WARs
To deploy a new WAR, do this:
- rm -rf ~/tomcat/webapps/ROOT
- cp nutch-1.1.war ~/tomcat/webapps/ROOT.war
Then restart Tomcat.
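The two steps above can be wrapped in a small helper. This is only a sketch: `deploy_war` is a hypothetical function name, and the optional second argument (the Tomcat directory, defaulting to `~/tomcat`) is an assumption added so the helper can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: deploy a WAR as Tomcat's ROOT webapp.
# deploy_war is a hypothetical helper, not part of the deployment itself.
deploy_war() {
    war="$1"
    tomcat_dir="${2:-$HOME/tomcat}"   # assumed default, matching the paths above

    # Refuse to remove the old webapp if the new WAR is missing.
    if [ ! -f "$war" ]; then
        echo "WAR file not found: $war" >&2
        return 1
    fi

    # Remove the previously unpacked webapp, then copy the new WAR into place.
    rm -rf "$tomcat_dir/webapps/ROOT"
    cp "$war" "$tomcat_dir/webapps/ROOT.war"
}

# Example: deploy_war nutch-1.1.war
```

Checking for the WAR before deleting anything avoids leaving the site without a ROOT webapp if the file name is mistyped.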
Restarting Tomcat
The AgShare deployment uses a Tomcat instance in its $HOME (supported by the tomcat6-instance-create script). It's wrapped as "/etc/init.d/agshare" so the boot process can use it. But you can restart it this way:
- ~/tomcat/bin/shutdown.sh
- ~/tomcat/bin/startup.sh
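If you restart often, the two commands can be combined. The sketch below assumes a short pause between shutdown and startup is enough for Tomcat to release its ports; `restart_tomcat`, the directory argument, and the 5-second default are all hypothetical choices, not something the deployment provides.

```shell
#!/bin/sh
# Sketch: stop Tomcat, wait briefly, then start it again.
# restart_tomcat is a hypothetical helper; the wait time is an assumption.
restart_tomcat() {
    tomcat_dir="${1:-$HOME/tomcat}"
    wait_seconds="${2:-5}"

    # shutdown.sh fails if Tomcat is not running; warn but carry on.
    "$tomcat_dir/bin/shutdown.sh" || echo "shutdown.sh failed (was Tomcat running?)" >&2

    sleep "$wait_seconds"
    "$tomcat_dir/bin/startup.sh"
}

# Example: restart_tomcat
```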
Starting Tomcat at boot
/etc/rc.local contains a call to run ~/tomcat/bin/startup.sh as the agshare user. That's kind of hackish, I realize.
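For orientation, a line of roughly this shape in /etc/rc.local would have that effect. This is an illustration of the approach, not a copy of the actual file; check the server for the real line.

```shell
# Illustrative only: start the agshare user's Tomcat at boot.
# rc.local runs as root, so switch to the agshare user with a login shell;
# the single quotes make ~ expand to agshare's home, not root's.
su - agshare -c '~/tomcat/bin/startup.sh'
```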
Piwik analytics
We use a self-hosted package called Piwik to record search engine queries and measure traffic to the website. All the data stays with us.
You can use the Piwik admin interface to view the stats, if you have an account. If you want an account, talk to Nathan.
Piwik general configuration
- Configuration: It uses a MySQL database. You can see the details in the Piwik configuration file.
- Path on the server: /var/www/search.agshare.org/www/static/piwik/piwik
- Web serving: Apache + mod_php5 serve it up. We set up /var/www/search.agshare.org/www/static to be served by Apache; you can see that in /etc/apache2/sites-available/search.agshare.org.
To get Piwik running, we had to add Piwik to the default template. I implemented that in a commit.
Site search
We added the SiteSearch plugin (still in beta; see this Piwik ticket) to let us analyze site search.
The site search plugin requires that we:
- Change the default translations so that they
- Configure it: In the Site Search settings, I set the "Search URL" to "search.jsp" (no leading slash) and the "Search Parameter" to "query". This matches queries like this.
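To make those two settings concrete, the sketch below pulls the search term out of a URL the same way the settings describe: the page must be search.jsp and the term is the value of the `query` parameter. It only illustrates the idea and is not the plugin's actual matching logic; `search_term` and the example URL are hypothetical.

```shell
#!/bin/sh
# Sketch: extract the search term from a search.jsp URL.
# search_term is a hypothetical helper for illustration only.
search_term() {
    url="$1"

    # Only URLs that hit search.jsp with a query string count as searches.
    case "$url" in
        *"search.jsp?"*) ;;
        *) return 1 ;;
    esac

    # Keep the part after the first "?", split it on "&",
    # and print the value of the first "query" parameter (still URL-encoded).
    printf '%s\n' "${url#*[?]}" | tr '&' '\n' | sed -n 's/^query=//p' | head -n 1
}

# Example (hypothetical URL):
# search_term 'http://search.agshare.org/search.jsp?query=maize&lang=en'
```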
Piwik SiteSearch can keep track of the number of results that the search engine returns for each query. To do that, it needs either to "scrape" the information out of the web page or to have the servlet provide it. I chose the "scrape" option. I implemented that in a commit.
Version control
The AgShare deployment's git repository can be found on Gitorious. It is available from within the AgShare deployment as a git remote named mirror.
When you want to back up the AgShare deployment's git state, just do:
$ git push mirror --mirror
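Note that `--mirror` pushes every ref and also deletes refs on the remote that no longer exist locally, so it is worth confirming the remote is the one you expect first. A minimal sketch, assuming a git new enough to have `git remote get-url`; `backup_to_mirror` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: push all refs to the "mirror" remote, but only if it is configured.
# backup_to_mirror is a hypothetical helper, not part of the deployment.
backup_to_mirror() {
    repo_dir="${1:-.}"

    # Fail early if there is no remote named "mirror".
    if ! git -C "$repo_dir" remote get-url mirror >/dev/null 2>&1; then
        echo "no git remote named 'mirror' in $repo_dir" >&2
        return 1
    fi

    git -C "$repo_dir" push mirror --mirror
}

# Example: backup_to_mirror    (run from inside the deployment's repository)
```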