Recently I've been figuring out how to deploy kvm guests with systemimager and pxe. This is a continuation of my previous story.
If you go for kvm you will probably end up using the virtio network and virtio disk drivers with your guests, as they provide the best performance. Using these drivers implies different disk naming: your installation lives on /dev/vda instead of /dev/sda. Unfortunately systemimager (or rather systeminstaller, one of its components) will not recognize vda disks properly, which leads to some complications.
Normally, when you run si_prepareclient on your golden-client, the partitioning configuration is placed in the file /etc/systemimager/autoinstallscript.conf in the <config><disk> section (this is in fact done by systeminstaller). This section is then used by si_mkautoinstallscript to generate parted commands in your <imagename>.master script. In the case of a /dev/vda disk, the <config><disk> section is empty, so you don't get the partitioning section in the .master script. As a result, deployment of your kvm guest image will fail because no partitions are created. You may either debug systeminstaller (try to debug perl code ;-) or just work around it, which is what I chose.
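A quick way to check whether you are affected (a sketch - adjust the path if your setup differs): after si_prepareclient, run
grep -c '<part' /etc/systemimager/autoinstallscript.conf
If it prints 0, no partition entries were generated and the workaround below applies.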
What you need to do
First, after you have run si_prepareclient on the golden client, manually add the <disk> section to /etc/systemimager/autoinstallscript.conf. The file should then look something like the example below.
This one is taken from a redhat system with no LVM; the disk size is 12 GB, of which 2 GB go to swap:
<config>
<disk dev="/dev/vda" label_type="msdos" unit_of_measurement="MB">
<!--
This disk's output was brought to you by the partition tool "parted",
and by the numbers 4 and 5 and the letter Q.
-->
<part num="1" size="101" p_type="primary" p_name="-" flags="boot" />
<part num="2" size="10676" p_type="primary" p_name="-" flags="-" />
<part num="3" size="2097" p_type="primary" p_name="-" flags="-" />
</disk>
<fsinfo line="10" real_dev="/dev/vda2" mount_dev="LABEL=/" mp="/" fs="ext3" options="defaults" dump="1" pass="1" />
<fsinfo line="20" real_dev="/dev/vda1" mount_dev="LABEL=/boot" mp="/boot" fs="ext3" options="defaults" dump="1" pass="2" />
<fsinfo line="30" real_dev="tmpfs" mp="/dev/shm" fs="tmpfs" options="defaults" dump="0" pass="0" />
<fsinfo line="40" real_dev="devpts" mp="/dev/pts" fs="devpts" options="gid=5,mode=620" dump="0" pass="0" />
<fsinfo line="50" real_dev="sysfs" mp="/sys" fs="sysfs" options="defaults" dump="0" pass="0" />
<fsinfo line="60" real_dev="proc" mp="/proc" fs="proc" options="defaults" dump="0" pass="0" />
<fsinfo line="70" real_dev="/dev/vda3" mount_dev="LABEL=SWAP-vda3" mp="swap" fs="swap" options="defaults" dump="0" pass="0" />
<boel devstyle="udev"/>
</config>
Second: pull the image from the golden-client (si_getimage).
Third: check that your /var/lib/systemimager/scripts/<imagename>.master contains a "parted" section (cat <imagename>.master | grep parted or something like this will verify it).
Fourth: have your image deployed to a new vm:
si_mkclientnetboot --image yourimage --flavor yourimage
The --flavor option is needed so that the deployment kernel includes the virtio drivers - it has to recognize your /dev/vda and the virtio network card.
That's all. Reboot your new vm with pxe boot and watch your image being deployed.
Comments welcome ;-)
Thursday, May 6, 2010
Consistency in the datacenter
One of the main challenges of system administration is dealing with chaos.
This is in fact one of the features of the world we live in. Every construction degrades over time. Every organized structure turns into chaos if not maintained appropriately.
At its birth, the server is just like an innocent child: clean, lean, configured, doing what it was designed for. Over time, developers, sysadmins and whoever else has access to it introduce changes to this setup. Some of them are made by you and hopefully are recorded somewhere. Some of them you probably didn't authorize and don't even know were performed. One day you find yourself saying "Where the fuck is this load coming from?" and you discover that something that was once designed to be a plain mysql server now runs apache, nfs and a whole bunch of other stuff.
Of course the life of a sysadmin is hard. You have lots of machines and hardly any time to record every single thing you've done on each of them. Still, you want certain aspects of your systems to be the same on every single server: every machine should get its time from your company's ntp server, log to a remote logging server, get its hostname from local dns, and so on.
This can be called configuration consistency management. Usually when talking about large server farms, people focus on rapid deployment (1000 machines in an hour - large numbers are impressive). But in fact it is not the deployment itself that poses the greatest challenge. Maintaining configuration over time is much harder.
So how do you do it? It depends on what kind of farm you are running. If you run a homogeneous computational cluster, you probably have just one setup that every worker node in the cluster should have. If you work for a dotcom, you are probably dealing with a larger number of configs (database servers, www servers, cache servers etc.). And if the dotcom runs more than one website, the number of server types is multiplied. Add different linux distributions, bsd and solaris to the mix, and you find yourself in the middle of chaos.
In the case of a homogeneous server farm, you are likely to have just one configuration to maintain. Your focus is not on imposing server configuration, but rather on maintaining a consistent server image over time. When you deal with different configurations and different websites, you probably want to control only certain aspects of every server (ntp for example), while others are left untouched and can be modified directly by the people working on a given website (users, groups, etc).
The tool I recommend for running a homogeneous system farm is systemimager (http://systemimager.sourceforge.net). It is a suite which combines automated deployment (via pxe & rsync/torrent) with further server management. You install one cluster node and pull its image to the systemimager server. The image is stored in a directory into which you can chroot and make changes. You deploy the image to the nodes with pxe: a small linux distro loads itself into a ramdisk first, creates the filesystems and rsyncs the image from the systemimager server. And here comes the most interesting part about consistency. After you have deployed all your systems and need to update them, there is no need to go to all the servers, run scripts, etc. You just chroot into your image, install software, add users, whatever. Then you rsync your image to the nodes. And voila - all your nodes are upgraded and in a consistent state. Of course systemimager is smart enough not to overwrite anything in /home, /tmp, /var etc. Downside - AFAIK it currently fully supports only linux.
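A sketch of that update cycle (the image name workernode and the package are made up; the paths follow the systemimager defaults and assume a redhat-style image):
On the image server:
chroot /var/lib/systemimager/images/workernode yum -y install htop
Then on each node (or via systemimager's parallel shell):
si_updateclient --server your-imageserver --image workernode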
If you run lots of different system types, systemimager is probably not the best
choice for you, as you would need to have a separate image for every single node
type. Puppet (http://reductivelabs.com) is probably better here. Puppet is a language to describe system configuration. You can specify which packages must be present, which users should be added, which services should be running. A process called "puppetd" runs on every node and applies this configuration periodically. You can have
classes of nodes for different hardware types, different services. What's important - you can model only certain aspects of your configuration and leave others untouched.
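A minimal sketch of what such a partial model might look like in puppet's own language (the class name and the choice of ntp are made up; only ntp is managed, everything else on the node is left alone):
class ntp_client {
  package { "ntp": ensure => installed }
  service { "ntpd":
    ensure  => running,
    enable  => true,
    require => Package["ntp"],
  }
}
node default { include ntp_client }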
Labels:
consistency,
puppet,
systemimager
Sunday, May 2, 2010
Systemimager & puppet
I've recently read an article on the puppet wiki describing how to deploy systems for puppet. The method uses kickstart and cobbler. Here I describe how to provision systems for puppet with systemimager.
The goal
Do a bare metal provisioning of a huge number of servers with systemimager.
Have them automagically registered in puppet.
Why I do it
Systemimager has certain advantages over deploying servers with kickstart or preseed:
-> It is distro-independent
-> It scales (installations can be done over torrent, which lets you deploy several hundred nodes in around 10 minutes - there is a paper on this on the systemimager website)
-> The deployed image is always the same, as opposed to kickstart, where the resulting OS depends on the state (package versions) of the repos it was installed from. So unless you keep a local mirror on which you control package versions, you cannot really ensure that the OSes you deploy are identical. If you decide to re-deploy a system a month later, you might find that it differs from your previous installations.
-> Images can easily be modified later on, and changes can be pushed to the clients without interrupting their work (if you want to update a client image, just chroot into it and do apt-get update or yum update - as simple as that ;-). Then propagate the change to the clients with the systemimager tools: they will sync the changes you made in the image to the clients.
-> You can easily monitor progress/errors of your installation with systemimager-monitor
-> With systemimager you get not only a deployment solution, but also cluster management tools like a parallel shell, file syncing and - most important - the si_updateclient utility. Suppose you have deployed your image to the servers and forgot to put your software in /opt. You chroot into your reference image on the systemimager-server and untar your software. Then you run si_updateclient on the client and voila, the changes are synced - your package is installed on the client (see the sketch below). This nicely complements puppet, which is not designed to transfer large amounts of data with its profiles.
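A sketch of that /opt scenario (the tarball name is made up; the image directory follows the systemimager default):
On the systemimager-server:
cp /tmp/mysoft.tgz /var/lib/systemimager/images/img-name/tmp/
chroot /var/lib/systemimager/images/img-name tar -C /opt -xzf /tmp/mysoft.tgz
On the client:
si_updateclient --server imageserver-ip --image img-name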
Assumptions
-> I assume that a flavour of linux is to be deployed (any distro *NOT* using grub2 will do). Examples are based on centos.
-> You have a systemimager-server installed and running. This requires dhcp, pxe-boot and storage for images, all of them set up for systemimager. Systemimager has a set of wizards for it.
-> You have a puppetmaster instance already in place.
Procedure overview
-> Manually install a basic, minimal linux on one of your new servers. Install only the base release of your linux flavour (like the "Base" installation in centos). This speeds up deployments as the image has fewer files.
-> Prepare it for puppet
-> Have its image retrieved by systemimager
-> Deploy the image to other systems
-> Register all systems in puppet
Step 1: Install linux reference image
No comments here - just use your distro iso ;-)
Step 2: Modify the OS to operate with puppet
First install ntpd and configure it. Puppet uses certificates for security, and it's likely that the hardware clock on new servers does not show the correct time, so the CSR generated by puppet might only be valid somewhere in the future or far in the past. Even if it is signed by the puppetmaster, it will not be valid at deployment time.
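On centos this boils down to something like (a sketch; the ntp server name is made up):
yum -y install ntp
ntpdate ntp.yourcompany.com
chkconfig ntpd on
service ntpd start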
Install the puppet client and configure it to point to your puppetmaster and to start at boot time.
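For example, on centos (a sketch; the master hostname is made up, and puppet.conf sections varied a bit between puppet versions of that era):
yum -y install puppet
Then point the client at the master in /etc/puppet/puppet.conf:
[main]
    server = puppet.yourcompany.com
And make it start at boot:
chkconfig puppet on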
Install systemimager-client.
Edit /etc/systemimager/updateclient.local.exclude and add /var/lib/puppet/ (if you do further management with the systemimager suite, the contents of this directory will be left untouched).
Configure passwordless ssh from the systemimager-server to your clients. On the systemimager-server, generate ssh keys without a passphrase (or with a passphrase that you later cache with ssh-agent). Copy /root/.ssh/id_rsa.pub to /root/.ssh/authorized_keys on your clients.
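For example, on the systemimager-server (a sketch):
ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@golden-client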
Do further modifications as you like.
Step 3: Retrieve golden client image with systemimager
Please see the systemimager manual for details. This is the general procedure:
On the systemimager-server:
/etc/init.d/systemimager-server-rsyncd start
On the "golden client":
si_prepareclient --server systemimager-server-ip
On the systemimager-server:
si_getimage --image img-name --golden-client client-ip-addr
The image is stored in a plain dir in /var/lib/systemimager/images/. You can chroot into it and adjust if you forgot something in step 2.
Step 4: Deploy the image to other systems
On the systemimager-server prepare other clients to pxe-boot:
si_mkclientnetboot --netboot --clients ip-list-of-nodes --image img-name
This command generates the dhcp, pxe and tftp configuration for your clients so that they install the image the next time they boot.
Reboot your new servers and watch them deploy the image ;-) (You might have time for a cup of coffee here, unless you are using torrents for deployment, which is extremely fast ;-)
After the last node is deployed, run:
si_mkclientnetboot --localboot --clients ip-list-of-nodes
This makes nodes boot from local hdd instead of pxe.
Step 5: Register new clients with puppet
After reboot, all you need to do is sign the new nodes' certificates as they appear; they are then ready for puppet configuration. If you have problems at this stage (not all clients appear in puppet, etc.), you can use systemimager's parallel shell to troubleshoot (e.g. si_psh --hosts 'host_list' 'puppetd --verbose -o'). This is what you enabled passwordless login in your image for.
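Signing is done on the puppetmaster (a sketch for the puppet versions of that era; newer releases use "puppet cert" instead):
puppetca --list
puppetca --sign node01.yourdomain.com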
Summary
I think the procedure described here is a more versatile replacement for kickstart and preseed installations. Besides, systemimager is a great cluster management tool by itself.
-> It's faster and less complicated.
-> You don't need a local copy of your repo.
-> Easy to fine tune your images (no scripting for this as it is with kickstart).
-> Systemimager configures pxe, dhcp, tftp stuff for you.
-> If you have a homogeneous cluster that is not too big (HPC worker nodes are a good example), you may find that you don't even need puppet to manage it. Systemimager will do.
I mention systemimager in some of my posts. Please, check them out on the tag cloud.
Comments are very welcome as usual ;-)
Labels:
deployment,
puppet,
systemimager
Sunday, April 18, 2010
Degrees of control
This post was inspired by what happened to me lately at work. A guy from security came in and told me it would be great if we could allow only certain packages to be installed on our linux boxes. Everything not specified in the machine's profile would be automatically erased.
When I look at this situation, I come to think that there are times and setups when you want to control every change that happens on your server farm, and others where you only want to control some parameters of your machines.
So there are basically two approaches:
-> "God-mode": you have a reference server image to which you introduce changes and then sync your servers to this image (changes entered manually on your servers are overwritten)
-> "modelling-mode": you say: this server must have an httpd & postfix running, also group apache needs to be present, etc. . You care only about httpd, postfix and apache group - the rest can be modified freely.
Approach 1 you can use if you run a homogeneous server farm, such as an HPC cluster where you have a headnode and a number of similar worker nodes. This approach does not deal well with situations where you have a mixture of different OSes, hardware and machine types. On the upside - you always know what you are running. The security guy is always happy ;-) Also, the tools used here are quite simple: all you have to do is sync your clients with the reference image.
Approach 2 you use if you run a more diverse environment (who'd have guessed ;-). I mean here a bunch of large websites serving different domains, several database configurations, proxies etc. - here you might easily have over 10 installation types, each of them possibly running on different OSes and hardware. When you think about it, it is easy to realize that controlling this mess with approach 1 is impossible, especially when there are several admins, each controlling his domain of expertise. It's likely that your database admins don't know your configuration tools, and they know databases better than you. So it's reasonable only to ensure that the postgres or mysql package is installed on their machines and leave the rest of the system tuning up to your fellow admins.
Some words about tools that can be used here:
For approach 1:
-> systemimager - a cluster deployment and management suite. You store images of your servers in a central repository. They are plain directories so you can chroot into them, install some software, add users, etc. and then propagate changes to your clients. All of this is done with rsync so you don't interrupt your farm members' work.
-> starting machines with a common nfs root - machines mount a common root filesystem from an NFS server. Whatever you change on the nfs share is immediately propagated to the clients
For approach 2:
I recommend running puppet + nagios. With puppet you ensure that certain aspects of your servers are the way you want them (i.e. apache installed, user apache present, etc.). However, puppet falls short on reporting, so you need nagios checks to monitor how puppet imposes your configuration. All the rest is in the hands of your fellow admins.
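A sketch of such a check via nrpe (the plugin path and thresholds are assumptions - adjust them to your layout): on every node define
command[check_puppetd]=/usr/lib/nagios/plugins/check_procs -c 1:1 -C puppetd
and poll check_puppetd from the nagios server to make sure the puppetd process is actually running.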
Comments and suggestions highly welcome ;-)
Labels:
change,
nagios,
puppet,
systemimager
Sunday, April 11, 2010
How to migrate linux between different hardware
Any experienced systems administrator must have come across this issue in his life. Your server became outdated 3 years ago with all RAM and disk slots already filled. You need more power. Gotta buy a new server, set it up, install the software, configure it and run the services on the new server.
But wait a minute... The old redhat 3.x you are presently using is hardly available now. What about the code your fellow admin wrote 3 years ago and left soon afterwards (the code still works, but you have no idea about its dependencies, etc.)?
The best idea is to clone the server and deploy it on the new one. But this is usually hardware dependent - which means you cannot redeploy the clone on servers which require different drivers.
Fortunately, linux handles hardware changes quite nicely and it is easy to set up hardware-independent imaging.
However some conditions apply:
-> on the new server, the only things that will change are probably the modules used, along with the initrd. These things are hardware dependent.
-> you will be running grub1 on the destination server (grub2 is not supported by systemimager AFAIK - see later)
-> you might experience some minor problems with udev, which may fail to start during boot. In my experience this is usually not a major problem.
Software:
-> systemimager
-> an iso of your linux distribution
Hardware:
-> old server (ServerA)
-> new server (ServerB)
-> a third server (ServerC) to run systemimager-server - just for the duration of the migration
Procedure overview:
We first get the image from ServerA and transfer it to systemimager-server on ServerC. We also install a basic, plain operating system on serverB.
Then we overwrite serverB with ServerA's image excluding parts which are hardware dependent (modprobe.conf, modprobe.d). We regenerate initrd and reboot ;-)
Diving in
-> install your linux distro onto ServerB from iso
-> On ServerC download and install systemimager-server package (instructions available on the systemimager website. My ubuntu has it out-of-the-box in apt). The server on which you install should have enough disk capacity to hold ServerA's filesystem. Start systemimager-server-rsyncd service.
-> download and install systemimager-client on ServerA and ServerB
-> on serverA turn off the firewall and shut down production services (as many daemons as you can). Then run:
si_prepareclient --server serverC
-> on serverC:
si_getimage --image image-serverA --golden-client serverA
this will start the image retrieval, which is done with rsync
-> while the image is being cloned, log in to serverB and specify which files must not be overwritten. On serverB, edit the file /etc/systemimager/updateclient.local.exclude and add the following lines:
/etc/modprobe.d/
/etc/modprobe.conf
-> when the image retrieval is finished, on serverB run:
si_updateclient --server serverC --image image-serverA --no-bootloader
this will transfer the image from serverC onto your new server serverB
-> to support the new hardware on serverB, you need to regenerate the initrd and check that all grub entries are correct (see the sketch after this list). mkinitrd reads /etc/modprobe* to determine which hardware modules are needed to start the system; these files were not overwritten by the cloning because you excluded them earlier. (redhat provides an easy command for the whole process, called new-kernel-pkg. On redhat it also works to reinstall the kernel package with rpm's --force option.)
-> after this is finished, reboot.
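On redhat-style systems the manual variant looks roughly like this (a sketch; the kernel version is an example - use whatever /lib/modules on serverB actually contains, which may differ from the kernel you booted):
KVER=2.6.18-164.el5
mkinitrd -f /boot/initrd-$KVER.img $KVER
or, using the helper mentioned above:
new-kernel-pkg --mkinitrd --depmod --install $KVER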
Conclusion
That's it. In practice this has worked for me several times. I would mostly reimage an IBM eServer (from /dev/sda) onto an HP ProLiant (onto /dev/cciss/c0d0) and vice versa. It also worked well on dell blades. I have successfully done V2V and P2V migrations on vmware, xen and kvm. The systems were redhat 5 and 4, and debian.
However, I expect some problems might arise if you try to advance too much in kernel versions (for example running an ancient redhat with a modern kernel on new hardware). An idea to handle this is to exclude from imaging not only the modprobe* stuff but also the whole /boot partition and /lib/modules/*. This is still to be tested ;-)
Labels:
cloning,
hardware,
independent,
systemimager