ParallelKnoppix Tutorial
Michael Creel, Universitat Autònoma de Barcelona
30 Jan., 2006
Welcome to ParallelKnoppix! This tutorial explains how to set it up and
gives some examples of how to use it.
For more information see the home page.
Questions that are not answered by this tutorial should be asked at the
forum.
Disclaimer: P-KPX is offered as
is, with no warranty. I offer no guarantees that it will work
properly, and assume no responsibility for any losses that may result
from its use. P-KPX allows you to view and potentially destroy data on
any of the computers that form part of the cluster. Respect the privacy
of data, and be careful not to destroy it, especially if it's not yours.
Contents
Introduction
Booting the master node
Setting up the cluster
Examples
Installing new software on a running cluster
Shutting down
Advanced topics
Introduction
ParallelKnoppix (P-KPX) is a bootable CD that allows users with average
computing skills to create an HPC cluster in very little time. P-KPX
contains libraries (examples: LAM/MPI,
MPICH, MPITB, PVM) and software
packages (examples: Octave, R, xpvm) that
allow one to run example programs immediately after creating a cluster.
The computers used in a P-KPX cluster may be heterogeneous, and the
cluster is temporary, in the sense that nothing is installed on the
computers that are used in the cluster - they are not altered in any
way. Thus, for example, the computers in a university computer room
that are used for students' work during the day could be converted into
an HPC cluster for nighttime research work, without affecting their use
by students the next day.
P-KPX is based upon the Knoppix
distribution of Linux. Needless to say, there are many people to thank
for those resources, but I'd like to mention Klaus Knopper, Linus
Torvalds, and the GNU Project. If
you like P-KPX and you have some spare money, please make a donation to
the Free Software Foundation.
Booting the master node
You need to download the P-KPX CD image (see the home page for
download links) and burn it to a CD. I recommend checking the md5 sum
of your downloaded image against the correct sum posted on the download
page to make sure that your image is not corrupted. When burning the
CD, use a reasonably low speed. Then boot your master node using the
CD. You will see something like:

OK, first thing: slow down and read this before hitting enter. Note that
the release version appears right above the boot: prompt. Before
continuing, you should make sure that there is not a newer release. You
can hit F2 and F3 to get some information about boot options. By default,
DMA is not enabled. You can enable it by typing "knoppix dma" before
hitting enter. I recommend trying this, since it works with most hardware
and speeds up access to the CD drive. There's more information on
cheatcodes available on the Net, if you have trouble getting the master
node to boot. OK, now you can hit enter.... When the computer finishes
booting, you're in the KDE Desktop, looking at the following:

Then we can move on to setting up the cluster.
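The md5 check recommended at the start of this section can be sketched in Python as follows. This is only an illustration: the image file name and the expected sum in the usage note are placeholders (assumptions), not values taken from the download page. On a Linux machine the standard md5sum command gives the same digest.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a full CD image never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_intact(path, expected_md5):
    """Compare the computed sum with the published one (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, image_is_intact("parallelknoppix.iso", "sum copied from the download page") would return True only if the download is byte-for-byte what was published. Both arguments here are hypothetical.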
Setting up the cluster
To set up a cluster, you need at least one more computer. The computer
you booted with the CD is the master node, and the other computers are
the slave nodes. They need to be connected together in a network. You
can use an existing ethernet, you can buy a switch and some cables, or
to really keep it simple, you can use a crossover cable to connect a
single slave to the master node. I
recommend disconnecting the master node from any network other than
your cluster, at least until we take some steps to ensure that the
external connection will be secure. This is also important to ensure
that the slave nodes do not see any DHCP server other than the master
node, since a second DHCP server causes all kinds of headaches.
The slave nodes can be booted either using copies of the P-KPX CD, or
across the network, using the PXE boot capabilities of their network
cards. To use the CD method, you need a P-KPX CD for each slave. This
works fine, provided your cluster is
relatively small. It has the advantage that it works with network cards
that don't do PXE boot. Also, you can use this method even if you don't
know what kind of network cards the slaves have, since you won't have
to worry about choosing the kernel modules to include in the terminal
server configuration (see below).
To use the PXE method, you may need to enable this feature in the BIOS
setup routines of the slave nodes. Set the slaves to try PXE boot before
booting from their hard drives. If your network cards are too old to do
PXE boot, I recommend replacing them with newer ones, if you value your
time at all. If you're unable to afford that, and you're willing to get
into grimy details, rom-o-matic can be very useful.
One last detail before we start. Your friendly computer vendor may have
supplied you with a hard disk
that has nothing but NTFS partitions. If that's the case, plug a USB
storage device with a FAT32, reiserfs, ext2, ext3, or any other
Linux-friendly partition type into the master node now. Most USB
storage units are sold formatted as FAT32, so as long as you have one
with some free space you're OK.
Assuming you have done the physical setup of a cluster, and the slaves
are ready to net boot, we can get started. Find the
ParallelKnoppix menu in the panel:

Then click on the Setup
ParallelKnoppix entry:

The following message appears:

If you have more than one network card in your master node, you must
select which card connects to your cluster. Which card has which name
may not be obvious to you. If the slave nodes won't boot with your
first choice, start again and try the other(s). Note to advanced users:
open a terminal and type dmesg | grep eth to get some information about
which cards were found.
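To show what that filtering step produces, here is a minimal Python sketch that pulls the ethN lines out of kernel log text. The sample log lines are invented for illustration (an assumption); real dmesg output depends on your hardware and kernel. Note that the pattern here is stricter than grep eth, which would also match words such as "ethernet".

```python
import re

# Matches interface names such as eth0 or eth1 as whole words.
ETH_NAME = re.compile(r"\beth\d+\b")


def ethernet_lines(kernel_log):
    """Return the kernel log lines that mention an ethN interface."""
    return [line for line in kernel_log.splitlines() if ETH_NAME.search(line)]


def interface_names(kernel_log):
    """Return the distinct ethN names, in order of first appearance."""
    names = []
    for line in ethernet_lines(kernel_log):
        for name in ETH_NAME.findall(line):
            if name not in names:
                names.append(name)
    return names


# Invented sample, only to show the shape of the input.
SAMPLE_LOG = """\
eth0: RealTek RTL8139 at 0xd000, irq 11
usb 1-1: new full speed USB device
eth1: Intel(R) PRO/1000 Network Connection
eth0: link up, 100Mbps, full-duplex
"""

print(interface_names(SAMPLE_LOG))
```

With two names in the list, the setup script's question about which card connects to the cluster has two possible answers, and the driver text on each line is the hint for telling them apart.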

Next we need to configure the process that will boot the slave nodes.
There is some information:

Then you need to start the configuration. My experience as a worker in
a fast food restaurant is apparent here:

Click on OK. Next you need to specify how many nodes (including the
master node) you have in your cluster:

Next, we come to an important point, one of your best opportunities to
have problems. If the following is a problem, and your cluster is small,
try booting the slaves using copies of the P-KPX CD, and forget about
this step - just click OK using the defaults. But if your cluster is
large, you will want to get this working. You need to select the drivers
for the network cards that are in your slave nodes. To do this, you need
to know what kind of network cards they have, and you need to know the
Linux kernel's name for the driver. Some popular cards are pre-selected.
Be careful not to select too many modules. Basically, for each one you
add, you need to de-select another, though the exact number that can be
used may depend upon which particular modules you select. If you have no
idea about all of this, just try clicking OK; maybe you'll be lucky. If
you have trouble with a given slave node, try booting it using the P-KPX
CD, open a terminal, and type dmesg | grep eth to see what kernel modules
are loaded.

Click OK once you have selected your modules. Next we see the
following, where you can add boot options. I recommend not adding
anything here, and giving the defaults a try. Some hardware may require
options like acpi=off, etc. See this information on
cheatcodes if you have trouble getting the slaves to boot.
Keep in mind that all the slaves receive the same options.

In the background, you can see the preparation of the boot image for
the slave nodes. All of this stuff that looks like terrible errors is
normal; don't worry about it.

Now, you are told to boot the slave nodes. DO IT NOW, either relying on PXE, or using copies of the P-KPX CD.

Up to this point, everything is in memory - the hard disk(s) of the
master node have not been touched. Now we need to mount some storage
media to create a shared directory that all the nodes of the cluster
can see. You need to select a storage device. This can be a hard
disk partition, a USB storage device, etc. It will be mounted
read-write, and a directory called parallel_knoppix_working will
be created there. Later, you will be given the opportunity to remove
this directory, to leave the master node exactly as you found it, if
that is required. The most important thing is that you cannot use NTFS partitions, and
they will not appear on the list to prevent you from accidentally
choosing one. Choose a partition, any partition:

You get a message telling you that the working directory has been
created, and a handy link to it appears on your desktop:

Now the master node repeatedly pings the slaves to check whether or not
they have booted up. This may not be very useful if the slave nodes are
visible to you, but if they're remote or headless nodes, it is useful.
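The idea behind this step, polling each slave until it answers, can be sketched in Python as below. This is not the code of the tool the setup script uses; it is an illustration under stated assumptions: a Linux-style ping that accepts -c (count) and -W (timeout), and host addresses that you supply yourself.

```python
import subprocess
import time


def ping_once(host):
    """Send one ICMP echo request; True if the host answered.

    Assumes a Linux-style ping that accepts -c (count) and -W (timeout).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_nodes(hosts, probe=ping_once, interval=2.0, max_rounds=30):
    """Poll every host until all answer or max_rounds is exhausted.

    Returns the set of hosts that never answered (empty means all are up).
    """
    pending = set(hosts)
    for round_number in range(max_rounds):
        # Keep only the hosts that still fail the probe in this round.
        pending = {host for host in pending if not probe(host)}
        if not pending:
            break
        if round_number < max_rounds - 1:
            time.sleep(interval)
    return pending
```

The probe is passed in as a parameter so that the loop can be tried out with a stand-in function before pointing it at real machines.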

Once the tkping window has all green buttons, click OK. The master node
will pause for about a minute, so be a little patient. This is to make
sure that the slave nodes are running ssh. Then the working directory is
NFS mounted on the slaves. Finally, LAM/MPI and PVM are configured
automatically.

TAA DAA! The cluster is running. But wait, let's make it safe to
connect to the Internet, so that we can get data/results on/off the
cluster:

Click OK, and your RSA keys are regenerated.

OK, that's done. Here's a little message:

Remember, to use ssh/scp/fish, etc. to copy things onto the cluster,
you need to set a password for the knoppix user. To do that, open a
terminal, type passwd, and follow the instructions. Once you do that you
can connect to the
master node. For example, using the konqueror browser on my regular
desktop machine, I can connect to a P-KPX master node as follows:

After connecting, I'm in the master node's home directory, and I can
copy files on/off the cluster:

An alternative is to use the master node's hard disks, a USB storage
device, etc. to copy information on/off the cluster. To go that way
there is no need to set a password.
Examples
PVM
To run PVM, just right-click on the desktop, select "run command", and
enter xpvm, as follows:

This opens up the following window, where we see the master (node1) and
a single slave (node2). If you have set up a larger cluster you'll see
more nodes. I'm not a PVM user, so I don't have any nifty examples. If
you have a good one, send it to me and I'll add it to the CD.

C
On to MPI. Open up the parallel_knoppix_working directory:

Go to ./Examples/C/pi. Open up a terminal, and type mpirun -np 10 pi.
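To give a feel for what a parallel pi calculation does with those 10 processes, here is a minimal Python sketch. It rests on an assumption: that the work is a midpoint-rule integration of 4/(1+x^2) on [0,1], with the intervals dealt out round-robin, as in the classic MPI pi example. Whether the pi program on the CD does exactly this is not confirmed here. No MPI library is used; the "processes" are simulated by a plain loop.

```python
import math


def partial_pi(rank, size, intervals):
    """Sum the midpoint-rule terms that process `rank` out of `size` would handle."""
    width = 1.0 / intervals
    total = 0.0
    # Round-robin split: rank r takes intervals r, r + size, r + 2*size, ...
    for i in range(rank, intervals, size):
        x = width * (i + 0.5)
        total += 4.0 / (1.0 + x * x)
    return total * width


def estimate_pi(size, intervals=100000):
    """Add up the partial sums, as a reduce-to-master step would."""
    return sum(partial_pi(rank, size, intervals) for rank in range(size))


if __name__ == "__main__":
    for size in (1, 2, 10):
        approx = estimate_pi(size)
        print(size, approx, abs(approx - math.pi))
```

The estimate is essentially the same for any number of processes. What changes on a real cluster is that each partial sum runs on a different node at the same time.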

LINPACK
Great, you have just done parallel computations on a Linux cluster. By
the way, make sure to read all the READMEs you will find scattered
around the Examples directory. P-KPX contains the LINPACK benchmark, in
case you want to try to get into the Top500.
You'll find it in the ./Examples/hpl/bin/Linx_ParallelKnoppix
directory. It's not tuned at all carefully. If anyone gets
better results using different tunings, please let me know.

FORTRAN
If you doubt the ability of the C language to calculate pi,
we can try it with FORTRAN. Go to the ./Examples/FORTRAN directory,
and do what the README tells you to:

Octave
And now, the best for last: Octave with MPITB. Go to
./Examples/Octave/kernel, open a terminal (F4), type octave, and then at
the octave prompt, type kernel_example1. You will see the following
startling graphic:

There are a number of other examples for Octave. The main reason I
developed P-KPX was to be able to use Octave with MPITB on large
clusters. My thanks again to Javier Fernández Baldomero for
making this great code available under the GPL.
Installing new software on a running cluster
Often the software you need won't be on the P-KPX CD. You can install
it in the parallel_knoppix_working directory, if you like. This section
shows how. An alternative is to create your own remastered version of
P-KPX, which is discussed briefly under Advanced topics.
First go to the Examples directory:

Uncompress the mpich.tar.gz
file:

Open up a konsole in the new mpich-1.2.7pl
directory (hit F4 when the mouse cursor is in the konqueror window),
then type "./configure". A lot of output will
result, and it will take a while to complete the process. The following
is just the beginning...

...after configuration finishes, type "make". Then relax a bit more,
this will take a little while too....

... OK, back again! Once mpich is built, cd to mpe/contrib/mandel,
and have a look at what files are there:

Make the example by typing "make", and then run it by typing
"../../../bin/mpirun -np 2 pmandel". We supply the path to mpirun because
we want to use the one that goes with the mpich we just compiled, not the
default LAM mpirun that you would get without specifying the path.
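Why the explicit path matters can be sketched in Python as follows. A bare command name is looked up in a list of directories, in order, and the first match wins. This is a simplified illustration: the directory layout is hypothetical, the files are empty stand-ins, and a real shell also checks that the file is executable, which this sketch skips.

```python
import os
import tempfile


def resolve(command, search_dirs):
    """Return the first file named `command` found in search_dirs, else None.

    Simplified model of a shell's PATH lookup (no permission check).
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Build two fake installs and report which mpirun a bare name picks."""
    with tempfile.TemporaryDirectory() as root:
        lam_bin = os.path.join(root, "lam", "bin")      # hypothetical layout
        mpich_bin = os.path.join(root, "mpich", "bin")  # hypothetical layout
        os.makedirs(lam_bin)
        os.makedirs(mpich_bin)
        lam = os.path.join(lam_bin, "mpirun")
        mpich = os.path.join(mpich_bin, "mpirun")
        for path in (lam, mpich):
            open(path, "w").close()  # empty stand-in for the real program
        # LAM's directory comes first, so a bare "mpirun" resolves to it.
        default = resolve("mpirun", [lam_bin, mpich_bin])
        return default == lam, default == mpich


print(demo())
```

Typing the relative path to the freshly built mpirun bypasses this lookup entirely, which is why the mpich version runs instead of the LAM one.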

Taa daa!

You can zoom by highlighting a region using the mouse:

You can keep on zooming in as much as you like. You can re-run the
example using different numbers of slave nodes to see the effect of
doing this in parallel. Thanks to Scott Granneman for suggesting this
example in his book "Hacking Knoppix".
The same way we installed mpich, you can install whatever else you need
into the working directory. But you might like to do a more
standard install so that odd paths won't have to be specified. For
that, see the section on remastering.
Shutting down
When you're done, there is a menu item "Shutdown ParallelKnoppix". This
will turn off the slave nodes for you (a good thing when they're
numerous and/or remote) and will offer to remove the working directory
from the storage device that you mounted. Removing it will leave all
nodes in their original state. Leaving it there will be useful if you
have done work that you would like to return to in the future. You can
always make a tgz file and extract it later, too.
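Making and restoring such a tgz file can be sketched with Python's standard tarfile module. The paths in the usage note are hypothetical; where the working directory actually lives depends on the partition you chose during setup.

```python
import os
import tarfile


def archive_directory(source_dir, archive_path):
    """Write source_dir and its contents to a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps only the directory's own name inside the archive.
        name = os.path.basename(os.path.normpath(source_dir))
        archive.add(source_dir, arcname=name)
    return archive_path


def restore_archive(archive_path, destination):
    """Extract the archive under destination.

    Only extract archives you created yourself.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(destination)
```

For example, archive_directory("/path/to/parallel_knoppix_working", "/path/to/usb/work.tgz") before shutting down, and restore_archive on the same file after setting up a new cluster. Both paths are placeholders.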
Advanced topics
There is a menu item that will help you to remaster the P-KPX CD. This
will be useful if you want to add/remove software. A script will copy
the CD contents to a hard disk partition. Another script will set you
up in a chroot environment, so you can use apt-get to add packages. A
third script will create a new CD image for you. Basic remastering is
not difficult, and it's a good way to
build up a collection of coasters for your coffee cups. There's a lot
of information here
and here.
An alternative to remastering is to compile the software in the
parallel_knoppix_working directory. Hacking Knoppix by Scott Granneman
has an example, which I'll probably get around to including here
sometime. If you leave your working directory on the hard drive when you
shut down, its contents will still be there when you return.
Another interesting thing is to use a persistent image. This allows you
to personalize your setup quite easily.