AgShare/Tech

The AgShare deployment works analogously to the CC Labs deployment of DiscoverEd. Some important things to note:

  • Username: agshare
  • Host name: search.agshare.org (currently the same as discovered.labs.creativecommons.org)

So, for example, to set up your environment, do:

$ sudo su - agshare

Given that, give Running DiscoverEd a look!

Deploying new WARs

To deploy a new WAR, do this:

  • rm -rf ~/tomcat/webapps/ROOT
  • cp nutch-1.1.war ~/tomcat/webapps/ROOT.war

Then restart Tomcat.
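
The two steps above can be wrapped in a small shell function so that the old webapp is only removed when the new WAR actually exists. This is a minimal sketch, not part of the deployment: the function name deploy_war and its optional second argument are hypothetical, and the default path assumes the ~/tomcat layout described on this page.

```shell
#!/bin/sh
# Sketch only: deploy_war is a hypothetical helper, not an existing script.
# Usage: deploy_war WAR_FILE [TOMCAT_DIR]
deploy_war() {
    war="$1"                           # WAR to deploy, e.g. nutch-1.1.war
    tomcat_home="${2:-$HOME/tomcat}"   # Tomcat instance directory (assumed default)

    # Refuse to touch the running webapp if the new WAR is missing.
    if [ ! -f "$war" ]; then
        echo "deploy_war: no such file: $war" >&2
        return 1
    fi

    # Remove the exploded copy of the old webapp, then install the new WAR
    # under the name Tomcat serves as the root context.
    rm -rf "$tomcat_home/webapps/ROOT"
    cp "$war" "$tomcat_home/webapps/ROOT.war"
}
```

After sourcing this, running "deploy_war nutch-1.1.war" as the agshare user performs the same two commands; Tomcat still needs to be restarted afterwards.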

Restarting Tomcat

The AgShare deployment uses a Tomcat instance in its $HOME (set up with the tomcat6-instance-create script). It is wrapped as "/etc/init.d/agshare" so the boot process can use it, but you can also restart it by hand:

  • ~/tomcat/bin/shutdown.sh
  • ~/tomcat/bin/startup.sh
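
If you restart often, the two commands can be combined with a short pause between them, since shutdown.sh may return before Tomcat has fully stopped. This is a rough sketch under that assumption: restart_tomcat and the RESTART_PAUSE variable are hypothetical names, and the 5-second default is a guess rather than a measured value.

```shell
#!/bin/sh
# Sketch only: restart_tomcat and RESTART_PAUSE are hypothetical names.
# Usage: restart_tomcat [TOMCAT_DIR]
restart_tomcat() {
    tomcat_home="${1:-$HOME/tomcat}"   # Tomcat instance directory (assumed default)

    # A failed shutdown usually just means Tomcat was not running, so warn and carry on.
    "$tomcat_home/bin/shutdown.sh" ||
        echo "restart_tomcat: shutdown.sh failed (was Tomcat running?)" >&2

    # Give the old JVM a moment to exit before starting a new one.
    sleep "${RESTART_PAUSE:-5}"

    "$tomcat_home/bin/startup.sh"
}
```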

Starting Tomcat at boot

/etc/rc.local contains a call to run ~/tomcat/bin/startup.sh as the agshare user. That's kind of hackish, I realize.
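
For illustration, an rc.local entry of this kind could look like the fragment below. The exact line used on the server is not reproduced on this page, so treat this as an assumed form rather than a copy of the real file.

```shell
# Hypothetical /etc/rc.local entry (assumed form, not copied from the server).
# "su - agshare" starts a login shell as agshare, so ~ resolves to that user's home.
su - agshare -c '~/tomcat/bin/startup.sh'
```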

Piwik analytics

We use a self-hosted package called Piwik to record search engine queries and measure traffic to the website. All the data stays with us.

You can use the Piwik admin interface to view the stats, if you have an account. If you want an account, talk to Nathan.

Piwik general configuration

  • Configuration: It uses a MySQL database. You can see the details in the Piwik configuration file.
  • Path on the server: /var/www/search.agshare.org/www/static/piwik/piwik
  • Web serving: Apache + mod_php5 serve it up. We set up /var/www/search.agshare.org/www/static to be served by Apache; you can see that in /etc/apache2/sites-available/search.agshare.org.

To get Piwik running, we had to add Piwik to the default template. I implemented that in a commit.

Site search

We added the SiteSearch plugin (still in beta; see this Piwik ticket) to let us analyze site search.

The site search plugin requires that we:

  • Change the default translations so that they
  • Configure it: In the Site Search settings, I set the "Search URL" to "search.jsp" (no leading slash) and the "Search Parameter" to "query". This matches queries like this.
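
To make the effect of those two settings concrete, the sketch below pulls the search terms out of a URL whose path ends in search.jsp and which carries a query parameter. It is not Piwik's own matching code, only an approximation of the rule described above; the function name search_terms and the example search term are made up for illustration.

```shell
#!/bin/sh
# Sketch only: approximates "Search URL = search.jsp, Search Parameter = query".
# Prints the raw (still URL-encoded) value of the query parameter,
# or nothing if the URL is not a search.jsp request with that parameter.
search_terms() {
    printf '%s\n' "$1" |
        sed -n 's/^[^?#]*search\.jsp?\(.*&\)\{0,1\}query=\([^&#]*\).*$/\2/p'
}
```

For example, "search_terms 'http://search.agshare.org/search.jsp?query=soil'" prints "soil", while a URL for another page or one without a query parameter prints nothing.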

Piwik SiteSearch can keep track of the number of results that the search engine returns for each query. To do that, it needs to be able to "scrape" the information out of the web page, or alternatively to have the servlet provide it. I chose the "scrape" option. I implemented that in a commit.

Version control

The AgShare deployment's git repository can be found on Gitorious. It is available from within the AgShare deployment as a git remote named mirror.

When you want to back up the AgShare deployment's git state, just do:

$ git push mirror --mirror
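
Because --mirror makes the remote match the local refs exactly (including deleting remote refs that no longer exist locally), it can be worth confirming that the mirror remote is configured before pushing. A small sketch of that check follows; backup_to_mirror is a hypothetical helper name, and the repository path argument is an assumption.

```shell
#!/bin/sh
# Sketch only: backup_to_mirror is a hypothetical helper, not an existing script.
# Usage: backup_to_mirror [REPO_DIR]
backup_to_mirror() {
    repo="${1:-.}"   # working copy of the deployment (defaults to the current directory)

    # Bail out early if no remote named "mirror" is configured.
    if ! git -C "$repo" config --get remote.mirror.url >/dev/null; then
        echo "backup_to_mirror: no remote named 'mirror' in $repo" >&2
        return 1
    fi

    # Push every ref so the remote matches the local repository.
    git -C "$repo" push mirror --mirror
}
```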