Difference between revisions of "AgShare/Tech"

From Creative Commons
Latest revision as of 15:45, 12 October 2010


The AgShare deployment works analogously to the CC Labs deployment of DiscoverEd. Some important things to note:

  • Username: agshare
  • Host name: search.agshare.org (currently the same as discovered.labs.creativecommons.org)

So, for example, to set up your environment, do:

$ sudo su - agshare

Given that, give Running DiscoverEd a look!

Deploying new WARs

To deploy a new WAR, do this:

  • rm -rf ~/tomcat/webapps/ROOT
  • cp nutch-1.1.war ~/tomcat/webapps/ROOT.war

Then restart Tomcat.
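
The two steps above can be wrapped in a small shell function. This is only a sketch: the name deploy_war and its optional second argument (the Tomcat directory, defaulting to ~/tomcat) are assumptions for illustration and are not part of the AgShare setup.

```shell
# Sketch only: deploy_war is a hypothetical helper, not something installed on the server.
# Usage: deploy_war <path-to-war> [tomcat-dir]
deploy_war() {
    war="$1"
    tomcat_dir="${2:-$HOME/tomcat}"

    # Refuse to touch webapps if the new WAR is missing.
    if [ ! -f "$war" ]; then
        echo "deploy_war: no such file: $war" >&2
        return 1
    fi

    # Remove the previously unpacked webapp so Tomcat unpacks the new WAR.
    rm -rf "$tomcat_dir/webapps/ROOT"
    # Install the new WAR as the root webapp.
    cp "$war" "$tomcat_dir/webapps/ROOT.war"
}
```

For example, deploy_war nutch-1.1.war performs the same two commands as the list above, but stops early if the WAR file does not exist. Tomcat still needs a restart afterwards.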

Restarting Tomcat

The AgShare deployment uses a Tomcat instance in its $HOME (supported by the tomcat6-instance-create script). It's wrapped as "/etc/init.d/agshare" so the boot process can use it. But you can restart it this way:

  • ~/tomcat/bin/shutdown.sh
  • ~/tomcat/bin/startup.sh
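
As a rough sketch, the restart can be scripted as one function. The name restart_tomcat, the TOMCAT_STOP_WAIT variable and the five-second default pause are assumptions added here; the pause is included because shutdown.sh can return before the JVM has fully exited.

```shell
# Sketch only: restart_tomcat and TOMCAT_STOP_WAIT are hypothetical names.
# Usage: restart_tomcat [tomcat-dir]
restart_tomcat() {
    tomcat_dir="${1:-$HOME/tomcat}"

    # Ask the running instance to stop; carry on if nothing was running.
    "$tomcat_dir/bin/shutdown.sh" || true

    # Give the old JVM a moment to exit before starting a new one.
    sleep "${TOMCAT_STOP_WAIT:-5}"

    "$tomcat_dir/bin/startup.sh"
}
```

Run it as the agshare user, since the Tomcat instance lives in that user's $HOME.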

Starting Tomcat at boot

/etc/rc.local contains a call to run ~/tomcat/bin/startup.sh as the agshare user. That's kind of hackish, I realize.

Piwik analytics

We use a self-hosted package called Piwik (http://piwik.org/) to record search engine queries and measure traffic to the website. All the data stays with us.

You can use the Piwik admin interface (http://search.agshare.org/static/piwik/piwik/index.php) to view the stats, if you have an account. If you want an account, talk to Nathan.

Piwik general configuration

  • Configuration: It uses a MySQL database. You can see the details in the Piwik configuration file.
  • Path on the server: /var/www/search.agshare.org/www/static/piwik/piwik
  • Web serving: Apache + mod_php5 serve it up. We set up /var/www/search.agshare.org/www/static to be served by Apache; you can see that in /etc/apache2/sites-available/search.agshare.org.

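To get Piwik running, we had to add Piwik to the default template. I implemented that in this commit: http://gitorious.org/+discovereders/discovered/agshare-live/commit/b396498f99de2d21259aed48bbf7918f7cf436d2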

Site search

We added the sitesearch plugin (http://github.com/BeezyT/piwik-sitesearch; still in beta, see Piwik ticket http://dev.piwik.org/trac/ticket/49) to let us analyze site search.

The site search plugin requires that we:

  • Change the default translations so that they
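  • Configure it: In the Site Search settings (http://search.agshare.org/static/piwik/piwik/index.php?module=SiteSearch&action=admin&idSite=1&period=day&date=yesterday), I set the "Search URL" to "search.jsp" (no leading slash) and the "Search Parameter" to "query". This matches queries like http://search.agshare.org/search.jsp?query=body.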

Piwik SiteSearch can keep track of the number of results that the search engine returns for each query. To do that, it needs to be able to "scrape" the information out of the web page, or alternatively have the servlet provide it. I chose the "scrape" option. I implemented that in this commit: http://gitorious.org/+discovereders/discovered/agshare-live/commit/4ffdd225670c9af3e57d686111907aa5e5d150fe

Version control

The AgShare deployment's git repository can be found on Gitorious (http://gitorious.org/+discovereders/discovered/agshare-live/). It is available from within the AgShare deployment as a git remote named mirror.

When you want to back up the AgShare deployment's git state, just do:

$ git push mirror --mirror
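
A slightly more defensive version of that backup step might first check that the mirror remote is configured. This is a sketch under assumptions: the function name backup_git_state and its optional repository-directory argument are made up for illustration.

```shell
# Sketch only: backup_git_state is a hypothetical helper name.
# Usage: backup_git_state [repo-dir]
backup_git_state() {
    repo_dir="${1:-.}"

    # "git remote" prints one remote name per line; require an exact match.
    if ! git -C "$repo_dir" remote | grep -qx mirror; then
        echo "backup_git_state: no remote named 'mirror' in $repo_dir" >&2
        return 1
    fi

    # --mirror pushes every ref and deletes refs on the remote that no longer exist locally.
    git -C "$repo_dir" push mirror --mirror
}
```

Because --mirror also removes remote refs that are missing locally, it suits a dedicated backup remote like this one rather than a repository that other people push to.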