Duplicate a node

The goal of duplication is to deploy a system over the network easily, regardless of the number of computers. In this documentation, the golden node is the node we want to clone.

We can duplicate SCSI or IDE hard drives, and duplication supports multiple filesystems (reiserfs, ext2, ext3, ext4, xfs, jfs).

WARNING: all data on client nodes will be erased! We duplicate the partitions of the golden node's hard drive, and the process runs an fdisk command on each client node, so ALL YOUR DATA on the client nodes will be erased.

1.1. KA method

With the KA method you can quickly duplicate a node using a desc file that describes its partitions. The KA method only duplicates the data on the partitions, so if you have an 80 GB hard drive with only 10 GB of data on it, KA duplicates only those 10 GB, not the whole disk. The KA method does not support software RAID.

Drawbacks:

the KA method doesn't support software RAID
you can only clone Linux filesystems (if you want to duplicate another kind of FS, it's up to you to modify the scripts)
you can only duplicate to the same kind of HDD (IDE or SCSI)

1.2. HOW it works

1.2.1. Steps

The clone process works in three steps:

PXE boot to retrieve stage1: the computer boots in PXE mode and retrieves vmlinuz and an initrd image. The computer is now in stage1 mode, and is able to get the stage2 through KA. The network is up.
get stage2: the computer gets the stage2 with the KA method. The stage2 contains all the tools needed to recognize your hardware (the most important thing is to detect your HDD and your network card), and all the tools needed to finalize the cloning process.
Duplication process: the computer auto-probes the modules needed to access the HDD. A basic log server is launched on the client node so that you can run commands and get the status of the KA duplication process. The computer reconfigures modprobe.conf and restores the bootloader (grub or lilo).

1.2.2. Needed files

All needed files are available in Mandriva Linux cooker:

install/stage2/rescue.sqfs: this is the stage2 file, with all the files needed to detect and probe modules and to launch the third step of the duplication process. This file will be used on the golden node.
isolinux/alt0/vmlinuz: the Linux kernel, needed in the /var/lib/tftpboot/X86PC/linux/images/ directory of the PXE server
isolinux/alt0/all.rdz: stage1 with all needed modules and tools.
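Before going further, it can help to check that the three files are present in your local copy of the cooker tree. The following is only a minimal sketch: the function name missing_files and the idea of a local mirror directory are assumptions made for illustration, and the stage2 file name is written rescue.sqfs as in section 2.1.

```python
from pathlib import Path

# Relative paths of the files listed above.
NEEDED_FILES = [
    "install/stage2/rescue.sqfs",
    "isolinux/alt0/vmlinuz",
    "isolinux/alt0/all.rdz",
]


def missing_files(mirror_root):
    """Return the needed files that are absent under mirror_root.

    mirror_root is a hypothetical local copy of the cooker tree.
    """
    root = Path(mirror_root)
    return [rel for rel in NEEDED_FILES if not (root / rel).is_file()]
```

An empty list means all three files were found; otherwise the list names what still has to be downloaded.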

1.3. Step 1: PXE, TFTP, DHCPD services

To easily clone a computer node, we use PXE technology to boot a kernel and an initrd image which contains all the modules needed for network and storage media. Documentation about PXE can be found here: PXE doc. Please keep in mind that setting up such services can DISTURB your current network architecture.

1.3.1. PXE parameters on server

The Mandriva Linux installer supports various methods to install a computer. With the PXE configuration file you can specify which method you want to use to install your node, or add a specific option at the boot prompt. Edit your default PXE configuration file (/var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default) to add your custom entry.

 label kamethod
     KERNEL images/vmlinuz
     APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
         automatic=method:ka,interface:eth0,network:dhcp root=/dev/ram3 rw kamethod

At the boot prompt you can now boot:

DEFAULT local: the default boot is the local one; change it to the name of another LABEL
local: boot locally
kamethod: automatic mode, get the stage2 through KA. The network interface is set to eth0. The network is set up automatically with DHCP, and the KA technology is used to launch the replication method.

1.3.2. TFTP server

TFTP server should be activated in /etc/xinetd.d/tftp file, and the xinetd service started.

 service tftp
 {
 	    ...
 	    cps= 100 2
 	    flags= IPv4
 }

1.3.3. PXE configuration

 # which interface to use
 interface=eth0
 ...
 prompt_timeout=2
 ...
 # what services to provide, priority in ordering
 # CSA = Client System Architecture
 # service=<CSA>,<min layer>,<max layer>,<basename>,<menu entry>
 service=X86PC,0,2,linux,Mandriva Linux x86
 service=IA64PC,0,2,linux,Mandriva Linux IA64
 service=X86PC,0,0,local,Local boot

 # tftpd base dir
 tftpdbase=/
 ...
 # domain=guibland.com
 domain=

1.3.4. DHCPD configuration

Example of an /etc/dhcpd.conf configuration file. Replace IPADDR_TFTP with the IP address of the TFTP server, and NET with your network prefix. Don't forget to adjust the domain-name and the domain-name-servers.

 subnet NET.0 netmask 255.255.255.0 {
   ...
   range NET.30 NET.40;
   }
 }
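The NET and IPADDR_TFTP placeholders can be filled in by hand or with a few lines of script. The sketch below is only an illustration: the function name fill_placeholders is hypothetical, and in the sample template only the subnet and range lines come from the example above, while the next-server line is an assumption about where IPADDR_TFTP appears.

```python
import re

# Hypothetical template. Only the subnet and range lines are taken from the
# example above; the next-server line is an assumption.
TEMPLATE = """\
subnet NET.0 netmask 255.255.255.0 {
  next-server IPADDR_TFTP;
  range NET.30 NET.40;
}
"""


def fill_placeholders(template, net, tftp_ip):
    """Replace the NET and IPADDR_TFTP placeholders in a dhcpd.conf template."""
    text = template.replace("IPADDR_TFTP", tftp_ip)
    # \b restricts the match to the standalone upper-case token NET,
    # so the lower-case word "netmask" is left untouched.
    return re.sub(r"\bNET\b", net, text)


print(fill_placeholders(TEMPLATE, "10.0.1", "10.0.1.1"))
```

With a network prefix of 10.0.1, the range line becomes range 10.0.1.30 10.0.1.40;.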

2. Setup a node as a golden node

2.1. The rescue.sqfs file

You need the rescue disk (which contains the /ka directory). Just extract this file, and copy all directories into /mnt/ka.

 [root@guibpiv ~]# mkdir /mnt/ka
 [root@guibpiv ~]# cd /mnt/ka/
 [root@guibpiv ka]# unsquashfs rescue.sqfs
 [root@guibpiv ka]# mv squashfs-root/* .
 [root@guibpiv ka]# ls
 bin/ dev/ etc/ ka/ lib/ modules/ proc/ sbin/ squashfs-root/ tmp/ usr/ var/

Go to the /mnt/ka/ka directory, and look at all the new files available. All those files are needed for a KA duplication process. We will now explain the role of each of them. You can modify any of them; those files will be copied into the /tmp/stage2 directory of the client node.

2.1.1. ka-d.sh

This is the master script to declare a node as a golden node. This script takes a lot of arguments.

     -h, --help : display this message
     -n num : specify the number of (destination) nodes
     -x dir : exclude directory
     -X sdb|sdc : exclude sdb for the replication
     -m drive : copy the master boot record (for windows) of this drive
     -M drive file : use 'file' as master boot record (must be 446 bytes long) for the specified drive
     -D partition : also copy partition 'partition'
     ...
     -d delay : delay between the release of 2 clients (1/10 second)
     -r 'grub|lilo' : choose the bootloader (you can add mkinitrd options)
     ie: ka-d.sh -n 3 -p sda /tmp/desc -X 'sdb|sdc' -r 'grub --with=ata_piix --with=piix'
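Because the -X and -r values contain characters the shell interprets (| and spaces), quoting matters when the command is built by a script. Here is a minimal sketch that assembles the command line from the options shown above; the function build_ka_d_command and its parameters are hypothetical, and repeating -x for several directories is an assumption.

```python
import shlex


def build_ka_d_command(nodes, drive, desc, exclude_dirs=(),
                       exclude_drives=(), bootloader=None):
    """Build a shell-safe ka-d.sh command line (illustrative helper)."""
    args = ["ka-d.sh", "-n", str(nodes), "-p", drive, desc]
    for directory in exclude_dirs:
        # Assumption: -x may be given once per excluded directory.
        args += ["-x", directory]
    if exclude_drives:
        # Drives are joined with '|', as in the -X 'sdb|sdc' example.
        args += ["-X", "|".join(exclude_drives)]
    if bootloader:
        args += ["-r", bootloader]
    # shlex.join quotes every argument that needs it.
    return shlex.join(args)


print(build_ka_d_command(3, "sda", "/tmp/desc",
                         exclude_drives=["sdb", "sdc"],
                         bootloader="grub --with=ata_piix --with=piix"))
```

The printed line can be pasted into a root shell on the golden node, from the directory containing ka-d.sh.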

2.1.2. replication.conf

This file contains all the variables needed by the other scripts. It also tries to get information such as the IP address.

2.1.3. fdisk_to_desc

This script generates the description table of the hard drive in the /tmp/desc file. This file must follow some rules: one line per partition, with two fields: the type of partition and its size in megabytes. The type can be linux, swap or extended. Other types can be obtained by appending their hexadecimal number to 'type'. For example, linux is the same as type83. The size is either a number of megabytes, or the keyword fill (to take all available space). Logical partitions must have the logical keyword.
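To make the rules of the desc format concrete, here is a small sketch of a parser for it. This is not the code used by the KA scripts. The function parse_desc is hypothetical, the logical keyword is assumed to come first on the line, and the numeric IDs for swap (0x82) and extended (0x05) are the usual MBR values, assumed here rather than taken from the scripts.

```python
def parse_desc(text):
    """Parse the content of a desc file into a list of partition dicts."""
    # linux == type83 is stated above; the two other IDs are assumptions.
    aliases = {"linux": 0x83, "swap": 0x82, "extended": 0x05}
    partitions = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip empty lines
        logical = fields[0] == "logical"
        if logical:
            fields = fields[1:]
        if len(fields) != 2:
            raise ValueError(f"expected 'type size', got: {raw!r}")
        kind, size = fields
        if kind in aliases:
            type_id = aliases[kind]
        elif kind.startswith("type"):
            type_id = int(kind[4:], 16)  # e.g. type83 -> 0x83
        else:
            raise ValueError(f"unknown partition type: {kind!r}")
        # None stands for 'fill' (take all available space).
        size_mb = None if size == "fill" else int(size)
        partitions.append(
            {"type_id": type_id, "size_mb": size_mb, "logical": logical}
        )
    return partitions
```

For example, the line logical linux fill is parsed as a logical partition of type 0x83 with no fixed size.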

2.1.4. gen_modprobe_conf.pl

This script creates a basic output similar to the content of the /etc/modprobe.conf file. Drawback: this script must be updated for each new module available in the kernel (it is based on the kernel/list_modules.pm file).

2.1.5. ka-d-client

The ka-d-client binary file is used to get the stage2 with the KA method, and afterwards to get the whole system. The important argument is -s, the session name. A KA client can only connect to a specific session (getstage2, kainstall ...). The source code is available in the ka-deploy-0.92 SRPM.

2.1.6. ka-d-server

The ka-d-server binary file is used to turn the golden node into a KA server. As with ka-d-client, the session argument is an important parameter (-s session_name). The session name is getstage2 when retrieving the stage2 (after the PXE boot) and kainstall1 during the duplication step. If you want to run more than one duplication process at the same time, you should synchronize the ka_session name between the server and the client. The source code is available in the ka-deploy SRPM.

2.1.7. ka_replication.sh

Script launched on the KA client (after getting the stage2 and probing modules) to run the full KA duplication process. This script calls other scripts to prepare the node (prepare_node.sh) and to configure the bootloader (make_initrd_grub or make_initrd_lilo).

2.1.8. store_log.sh

Basic script to store the log of the KA duplication process on an FTP server. Adjust it to fit your needs, and uncomment the line #store_log.sh in the /mnt/ka/ka/ka_replication.sh file.

2.1.9. bootable_flag.sh

Script to make an HDD bootable using fdisk. The first argument must be the HDD device.

2.1.10. make_initrd_grub

Restores and reloads the Grub bootloader in the /mnt/disk directory. It's a very basic script, and using the restore_bootloader of the Mandriva Linux Rescue would perhaps be a better idea.

2.1.11. make_initrd_lilo

Restores and reloads the lilo bootloader in the /mnt/disk directory. Again, it's a very basic script; perhaps we should use the restore_bootloader of the Mandriva Linux Rescue.

2.1.12. prepare_node.sh

This script removes the old network udev rules and the old dhcp cache files from the future system, launches the gen_modprobe_conf.pl script to regenerate an up-to-date /etc/modprobe.conf in the new system, and launches the script that restores the bootloader. If you want to perform more actions on the installed system, you can modify this script.

2.1.13. send_status.pl

Very basic Perl script that opens port 12345 and outputs the content of the /tmp/ka* file. It also permits the execution of commands on the node, if the user sends a message from the golden node with the exec prefix.

2.1.14. status_node.pl

Script to connect to a client node; the first argument must be the IP address of the node. You can run a command on the node with the exec prefix.
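The exchange between status_node.pl and send_status.pl can be pictured as a plain TCP conversation on port 12345. The sketch below is not a port of the Perl scripts. It only illustrates the idea, and the exact wire format (the word exec, a space, the command, then a newline) is an assumption. The function name query_node is hypothetical.

```python
import socket


def query_node(host, command=None, port=12345, timeout=5.0):
    """Connect to a node's status port and return the text it sends back.

    If command is given, it is sent with the exec prefix first
    (assumed wire format: "exec <command>\\n").
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if command is not None:
            sock.sendall(f"exec {command}\n".encode())
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # the node closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # no more data within the timeout: return what we have
    return b"".join(chunks).decode(errors="replace")
```

Under these assumptions, calling query_node("10.0.1.35") from the golden node would return the current log of that client.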

3. The golden node, KA server

Now it is time to build a description of the node partitions. You can use the script /mnt/ka/ka/fdisk_to_desc as root user, or, with your favorite text editor, write a file like this one:

 ...
 logical linux fill

... partition, and /var fills the rest. Of course, you can adjust the sizes according to your system.

Type the following as root user on the golden node to start the KA replication server:

 ...
 Socket 5 on port 30764 on node40.guibland.com ready.
     

       
-r "grub --with=jfs --with=ata_piix": use the grub bootloader, and pass the --with=jfs --with=ata_piix options to mkinitrd in the chrooted system after the KA deployment
-n nb_nodes: specify how many nodes are clients
-p sda/hda desc: specify whether you want to duplicate SCSI or IDE storage, the name of the HDD, and the desc file
-x /tmp: exclude the /tmp directory
-X sdb: exclude the sdb HDD from the duplication

     

Now the golden node is waiting for the client nodes to start the replication.

4. KA client node

4.1. PXE server (kamethod)

We have to configure the PXE to boot by default on kamethod. To do this just edit /var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default and set DEFAULT to kamethod:

DEFAULT kamethod

So, next time a node boots, the PXE server will force the node to boot using the kamethod entry.
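Since the DEFAULT line has to be switched to kamethod before the replication and back to local afterwards, the change can be scripted. This is a minimal sketch working on the text of the configuration file; the function name set_default_label is hypothetical, and reading and writing the file itself is left to you.

```python
import re


def set_default_label(config_text, label):
    """Return config_text with its first DEFAULT line pointing at label."""
    new_text, count = re.subn(
        r"(?im)^([ \t]*DEFAULT[ \t]+)\S+",  # keep the keyword, swap the label
        lambda match: match.group(1) + label,
        config_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no DEFAULT line found")
    return new_text
```

Calling set_default_label(text, "kamethod") before booting the clients, then set_default_label(text, "local") once the replication is finished, covers both changes described in this section.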

4.2. Stage1 KA method, node waiting stage2

Now boot all the remaining nodes. The replication process will start once all nodes are up and waiting on the KA screen. If a node cannot reach the KA server, the message Can't reach a valid KA server will appear. Each node will try five times to reach the KA server; after that, the node will reboot. As the node boots on kamethod, it will retry until it finds the server.

4.3. Stage2, the duplication process

Once all the nodes have found the KA server, the first duplication process will start. This step duplicates the stage2 from the /mnt/ka directory of the golden node into the client nodes' memory (/dev/ram3, formatted as ext2). Then, the nodes chroot into this memory copy (the /tmp/stage2 directory), and launch the drvinst command from the stage2 to probe all the modules (drivers) they need. Then, the second step of the duplication starts.

The duplication process will clone your drives following the description you have made (/tmp/desc of the golden node). Nodes will rewrite their partition table, then format their filesystems (ReiserFS, XFS, ext2/3/4, JFS). All new partitions will be mounted in the /mnt/disk directory. Then, the drive duplication process will begin. On a Fast Ethernet switch you can reach speeds of 10 MBytes/sec.
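As a rough order of magnitude, the duration of the drive duplication is the amount of data divided by the throughput. The sketch below applies this to the 10 GB example from the KA method section at the 10 MBytes/sec quoted above. The function name is hypothetical, and 1 GB is taken as 1024 MB, which is an assumption.

```python
def estimated_seconds(data_mbytes, throughput_mbytes_per_sec):
    """Estimate the transfer time in seconds for a given amount of data."""
    if throughput_mbytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return data_mbytes / throughput_mbytes_per_sec


# 10 GB of data (10 * 1024 MB) at 10 MBytes/sec
print(estimated_seconds(10 * 1024, 10))  # 1024.0 seconds, about 17 minutes
```

The real duration depends on your network and disks, so treat the result as an estimate only.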

4.4. Prepare the node

At the end of the duplication process, each node will chroot into its partitions and rebuild its /boot/initrd.img and /etc/modprobe.conf files. [...] SCSI drives and adjusting its network card driver. Before rebooting, each node reinstalls lilo/grub. All your nodes are now ready, and are clones of the master node.

4.5. PXE server to local boot

Don't forget to change the default PXE boot back to local, so that the nodes boot locally after the replication.

5. full log of a KA duplication

5.1. Golden node side

 [root@node40 ka]# ./ka-d.sh -n 1 -p sda /root/desc -X sdb -r "grub --with=jfs --with=ata_piix"
 takembr =
 ...
 Transfer time = 127.140 seconds, throughput = 4.937 Mbytes/second
 The pipeline was emptied in 1.549 seconds

5.2. KA client side

Just launch /mnt/ka/ka/status_node.pl IPADD to get log of the KA client.

 10.0.1.35> ------| Ka |---- Install starting...