Deployment research

From Creative Commons
Revision as of 22:44, 18 March 2011 by Christopher Webber (talk | contribs) (Puppet)

In the future, deployment will happen at the touch of a button. If you want a new server set up, you will just touch this button, several magical woodland fairies will be sacrificed to the elder gods, and a server will just appear set up for you, completely automagical. When something goes wrong, you just destroy the server, and maybe send apologetic cards to the families of the ethereal woodland creatures. This will greatly improve our scalability.

However, we have no idea how this will happen yet. Watch here for details as this is researched.

Goals

In short:

  • Failover / load balancing.
    • If a server goes down, another should pick it up.
    • If things are slow, another server should be able to pick things up.
  • Boot and shoot approach
    • Quick and mindless spin-up of a new server
    • Ability to just take down a server / node if it's not working well
    • Still have the ability to step in and debug things as much as we like
  • An approach for spinning up live & devel servers using mostly the same setup

The cc.engine stack has some advantages. For the most part, there simply is no "changing database"... everything is stored in RDF files that are in a git repository / in python packages.

We also need to support CC Wordpress, but this can also be checked out of an svn repository on the fly. Assuming we do edits somewhere else and just push to the server, no need to do backups of these "node" servers, even!

(However maybe we will eventually want to use this setup with things that *do* have a database that matters, like cc network?)

Cleanness is also a goal. It should be clear enough what's happening in the deployment setup and how to adjust / reproduce things. In the talk Continuous Deployment, Laurens Van Houtven describes a setup wherein anything goes, and perl scripts wrap scheme scripts alongside erlang alongside java and PHP...:

"And developers' sense of decency and morals just completely falter, and they start implementing just completely anything, and then you have some giant shambling lovecraftian horror of a deployment system..."

That's the deployment setup we don't want :)

Deployment

Deployment covers:

  • Server creation (spinning up a new server with our setup)
  • Server management (which stuff is on which machine?)
  • Server updating (update software / data on server)

The following tools are being considered:

plain ol' ssh and bash

Of course, we could always just have ssh commands that run remote commands, like:

  $ ssh webadmin@someserver run_command.sh

or even

  $ ssh webadmin@someserver run_command.py

... this is the most minimalist solution! :) But also, maybe the least "powerful".
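
If we ever wanted the same minimalism across more than one machine, the ssh call above could be wrapped in a few lines of python. This is only a sketch: the function names are made up for illustration, and the `webadmin` user and script name are simply the ones from the example above. The runner is passed in as a parameter so the loop can be tried without touching a real server.

```python
import subprocess


def build_ssh_command(host, remote_command, user="webadmin"):
    """Return the argv list for running remote_command on host over ssh."""
    return ["ssh", f"{user}@{host}", remote_command]


def run_on_hosts(hosts, remote_command, runner=subprocess.run):
    """Run remote_command on every host in turn and collect exit codes.

    runner defaults to subprocess.run; anything that accepts an argv list
    and returns an object with a returncode attribute will do.
    """
    results = {}
    for host in hosts:
        argv = build_ssh_command(host, remote_command)
        completed = runner(argv)
        results[host] = completed.returncode
    return results
```

Calling `run_on_hosts(["someserver"], "run_command.sh")` would then shell out to ssh once per host and report which hosts returned a non-zero exit code.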


Fabric

I really like the idea of Fabric because it doesn't do "too much". It's just a system for running a lot of remote commands from your machine. Combine this with being able to run local commands on your machine, and maybe you have a good system for checking some local information and pushing a lot of updates at once to a number of machines? In this way, it's kind of like a python-wrapped "plain ol' ssh and bash" solution. Not a bad thing: we can combine the power of python's functions / etc with remote script execution, nice output, etc.

On the other hand, the downside is that Fabric doesn't do too much. Unlike Puppet, there's no "description" of what our remote systems should be, so if we start changing things manually on servers there's no way to automagically propagate that setup to all our running servers.

But back to the first hand, automagic is sometimes just not nice, and generally confusing and hard to debug anyway!
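
To make the "check something locally, then push to a number of machines" shape concrete, here is a rough sketch in plain python. It deliberately does not import Fabric, so it is not Fabric's API; it only models the same idea. The environment names mirror the staging / production split, but the host names, the `/srv/app` checkout path and all function names are hypothetical.

```python
import subprocess

# Hypothetical environments; the host names are placeholders.
ENVIRONMENTS = {
    "staging": ["staging1.example.org"],
    "production": ["web1.example.org", "web2.example.org"],
}


def local_revision(run_local=subprocess.check_output):
    """Ask the local git checkout which revision is about to be pushed."""
    out = run_local(["git", "rev-parse", "HEAD"])
    return out.decode().strip()


def deploy_commands(revision, checkout_dir="/srv/app"):
    """Remote shell commands that move a checkout to the given revision."""
    return [
        f"cd {checkout_dir} && git fetch origin",
        f"cd {checkout_dir} && git checkout {revision}",
    ]


def deploy(environment, revision, run_remote):
    """Run the deploy commands on each host of one environment.

    run_remote(host, command) is supplied by the caller (for example a
    thin wrapper around ssh), so this function contains no network code.
    """
    for host in ENVIRONMENTS[environment]:
        for command in deploy_commands(revision):
            run_remote(host, command)
```

The point of the sketch is the same as the upside described above: everything is an ordinary function, so there is nothing hidden to debug. The downside also shows: nothing here records what a server is supposed to look like, only the commands we chose to run.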

Old stuff

(Old stuff from Jurisdiction Management)

On live

Perhaps in a future time, the following will replace the above steps:

  $ fab staging deploy

http://docs.fabfile.org/1.0/

On staging

Perhaps in a future time, the following can replace the above steps:

  $ fab production deploy
  $ fab idot imgsplat

XXX It'd be nice to have this maintained in a fab file so we don't need extended instructions; maybe we need a repo of "deployment tools" that contain these scripts? Feels a bit like an inappropriate mixing of concerns.

Silver Lining

http://cloudsilverlining.org/

I really like the approach that Silver Lining takes in many ways. The idea of being able to just take a virtualenv and push it to a new server is quite appealing. It's also fairly declarative and uses a config file format that's like paste deploy.

However, we need some very custom stuff. Our apache config files are large and bloated. Maybe they could be a little bit less bloated, but there is a ton of stuff in there like all the rewriting and static file serving we're doing that Silver Lining doesn't support and doesn't want to support. So for cc.engine at least, silver lining is right out.

Puppet

http://www.puppetlabs.com/

A decent PyCon talk on Puppet

Puppet seems similar to Silver Lining in the sense that it has an abstracted config file setup that "describes" the server and you use that to push to a bunch of slave nodes. Unlike Silver Lining though it appears to be a lot less "simple" and allows things like having your own apache configs.

As I understand it, Puppet kind of has its own config "language" to describe the server setup. It also has a full node management system, which would be useful.

There are some advantages of this in that, as I understand it, you can have some existing nodes running, and if you change the description of the server, it can reconfigure the server for you to match the new "described" setup.

Or so I think.
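
The "describe the server, then let the tool make it match" idea can be sketched in a few lines of plain python. To be clear, this is not Puppet's config language or anything from Puppet itself; it is a toy model, and the state layout (a set of packages plus a mapping of file paths to contents) is an assumption made up for the example.

```python
def plan(desired, current):
    """Return the ordered actions that would make current match desired.

    Both arguments are dicts with a "packages" set and a "files" mapping
    of path -> content. Nothing is executed; the result is only a plan.
    """
    actions = []
    for package in sorted(desired["packages"] - current["packages"]):
        actions.append(("install", package))
    for path, content in sorted(desired["files"].items()):
        if current["files"].get(path) != content:
            actions.append(("write", path))
    return actions


def apply_plan(actions, desired, current):
    """Apply a plan to an in-memory state (a stand-in for a real server)."""
    for action, target in actions:
        if action == "install":
            current["packages"].add(target)
        elif action == "write":
            current["files"][target] = desired["files"][target]
    return current
```

Once a plan has been applied, planning again against the same description yields no actions, which is the property that would let a changed description be pushed out to nodes that are already running. This toy version never removes anything that is present but not described; a real tool would need a policy for that.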

Anyway that all seems pretty cool but also pretty complicated and abstract, maybe a lot more than we need initially.

Oh yeah, and something cool... apparently blueprint is great for reverse-engineering a puppet setup.

Chef

I don't know much about Chef except that it's apparently a lot closer to Puppet. But instead of an abstracted config system, you write a lot of ruby.

For now, that makes me pretty uninterested in Chef. I don't understand yet how it could be better than Fabric on that end.

bcfg2

openstack

Rackspace Cloud, for example, runs this.

Actually, OpenStack isn't a technology, it's a bunch of technologies. And much in the way that "cloud computing" is confusing because it seems to encapsulate so many ideas, OpenStack encapsulates a whole bunch of technologies in a way that leaves me confused when I look at it. But I don't think it's actually a deployment technology, so maybe it doesn't belong here, though it has some deployment technologies in it.

I think I feel about OpenStack the way I felt about Pylons' documentation the first time I looked at it. Holy cow, there are a lot of things going on here, and as a complete newbie I have no idea what is going on or how all this stuff relates together. Also the documentation seems all at once congealed and fragmented. What?

Maybe also like Pylons I will eventually look back at the documentation after I come to understand a lot of components individually and think, "Oh this isn't so confusing, but also no wonder I was confused."

Where to deploy to

This section is about what we might end up deploying to in the end.

Ideally our deployment system will be generic enough to deploy to multiple types of servers, but...

generic metal servers

Linode

Load balancing

Milestones