AUTHOR: Bryan Kadzban <bryan at linuxfromscratch dot org>

DATE: 2009-09-25

LICENSE:
Creative Commons Attribution-Share Alike 3.0 United States
(http://creativecommons.org/licenses/by-sa/3.0/us/)

SYNOPSIS: LFS on RAID, dm-crypt, and/or LVM2

DESCRIPTION:
This hint explains how to build an LFS system capable of booting most RAID, most
dm-crypt, and most LVM2 setups.  It allows you to use dmraid (most "RAID" add-on
boards), and/or Linux kernel "md" RAID.  It also allows you to encrypt the RAID
array, and use LVM2 on top of the encryption.  The rootfs can then be a logical
volume.

Any of these transformations can be omitted, but the transformations must be
layered in this order.  It is worth noting explicitly that this hint does not
cover encrypting a single logical volume (encryption after LVM); it only covers
encrypting the entire PV.

PREREQUISITES:
On the host:

- LVM2 userspace tools, if required.
- dmraid, if required.
- mdadm, if required.
- LUKS-capable cryptsetup, if required.
- Also all dependencies of the above packages.

For the LFS system:

Required:

- Sources for LVM2 userspace tools (with device-mapper): at least version
  2.02.53.  Note that all the other packages require device-mapper, so you will
  need the LVM2 package even if you don't plan on using LVM.

  http://sourceware.org/lvm2/
  ftp://sources.redhat.com/pub/lvm2/LVM2.2.02.53.tgz

- Patch for LVM2 udev rules, to make them work better with udev in LFS.

  http://www.linuxfromscratch.org/patches/downloads/lvm2/LVM2-2.02.53-fix_udev_rules-1.patch

- Sources for lfs-initramfs, newest stable version.

  http://www.linuxfromscratch.org/~bryan/lfs-initramfs-1.0.tar.bz2

Optional (depends on configuration):

- dmraid sources: newest stable version.

  http://people.redhat.com/heinzm/sw/dmraid/src/
  http://people.redhat.com/heinzm/sw/dmraid/src/dmraid-1.0.0.rc15.tar.bz2

- mdadm sources: newest stable version.

  http://neil.brown.name/blog/mdadm
  http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.0.tar.bz2

- LUKS-capable cryptsetup sources: newest stable version.

  http://code.google.com/p/cryptsetup/
  http://cryptsetup.googlecode.com/files/cryptsetup-1.0.7.tar.bz2

  - You will need popt for cryptsetup: latest stable version.
    http://www.linuxfromscratch.org/blfs/view/svn/general/popt.html

HINT:

0. Preface

Throughout this hint, individual steps are annotated according to which sets of
transformations require them.  [dmr] denotes dmraid; [mdr] denotes md-raid;
[enc] denotes encryption, and [lvm] denotes LVM2.  Multiple transformations per
note are comma-separated.

Steps are grouped according to what portion of the book they apply to.  A note
precedes each group of steps, offset by lots of = characters.

Also note that the only dmraid setup supported is RAID1.  Although dmraid
probably supports RAID0/RAID5 if your particular RAID BIOS does, grub does not.
If grub cannot read the /boot partition, you will not be able to boot.  If you
have another single disk, you can put /boot on that disk, of course.  Most of
this hint should still work in that case, but that setup is untested.

1. [dmr,mdr,enc,lvm] Decide on a disk layout

dmraid arrays are most often composed of full disks.  md RAID can be composed of
either whole disks or partitions, but because grub must be able to read /boot,
only partitions are supported here.  dm-crypt can handle any source block
device, as can LVM.
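
The layering order required by this hint (RAID, then encryption, then LVM) can
be checked mechanically before you commit to a layout.  The following is only a
sketch: check_layers is a hypothetical helper, not part of any package used in
this hint, and it takes the same tags that annotate the steps (dmr, mdr, enc,
lvm).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given layers are listed in the
# order this hint requires (RAID first, then encryption, then LVM).
check_layers() {
    last=0
    for layer in "$@"; do
        case "$layer" in
            dmr|mdr) rank=1 ;;
            enc)     rank=2 ;;
            lvm)     rank=3 ;;
            *) echo "unknown layer: $layer" >&2; return 2 ;;
        esac
        if [ "$rank" -lt "$last" ]; then
            echo "layer '$layer' is out of order" >&2
            return 1
        fi
        last=$rank
    done
    return 0
}
```

For example, "check_layers mdr enc lvm" succeeds, while "check_layers lvm enc"
fails, because encryption on top of LVM is not covered here.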

===============================================================================
  The following steps apply at section 2.2 of the LFS book: "Creating a New
  Partition".  Follow them *instead of* the instructions in that section of the
  book, until the next ==-delimited note.
===============================================================================

2. [dmr] Build the array

Build your array according to whatever process your BIOS uses.  This will write
out the RAID signature sector to all involved drives, so that the dmraid program
will be able to find it in the next step.  (This usually must be done inside the
BIOS or option ROM setup, so you'll have to reboot to do it.)  Then boot your
host system again.

3. [dmr] Bring up the dmraid array

You will need the "dm-mod" kernel module loaded, if it is not already.  To scan
your hard drives and bring up the array, run (as root):

  dmraid -ay

Explanation of argument:

  -ay: Activates all scanned dmraid arrays ("activate yes").

This will create a new device node or symlink (depending on your version of
device-mapper and its configuration) in /dev/mapper named after the ID of your
array.  This name is supposed to be unique.

4. [dmr] Partition the dmraid array

Run your favorite partitioning tool on /dev/mapper/<dmraid array>, as root.  You
will need at least one partition for the following steps, but if you also plan
to use dm-crypt or LVM, you will need two: one for /boot and one for the rest of
the data.

5. [mdr] Bring up md-raid on the host

First set aside a partition for /boot.  You may be able to get away without this
if you use RAID1 and the correct metadata format (to avoid overwriting the first
few sectors of the disk), but this may not work.  The size of this partition can
be as small as a few hundred megabytes; it only has to hold any kernel images
you wish to boot to, the initramfs image for those kernels, and the config files
for grub.  This is usually on the order of 10 megabytes per kernel.

mdadm should load the correct kernel modules for this step.  Ensure the source
partitions you decided on (step 1 above) are correct and present, and as root:

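  mdadm --create <md device> --metadata=1.2 --homehost="<id string>" \
        --level=<X> --raid-devices=<N> --auto=<md|mdp> \
        /dev/sdAB /dev/sdCD /dev/sdXY

Explanation of arguments:

  --create <md device>: Create a new array as the given md device node.  See
                        the mdadm(8) man page for the device names accepted
                        with each --auto setting.

  --raid-devices=<N>: The number of active member devices in the array (three
                      in the command above).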

  --metadata=1.2: Use the newest metadata format.  This is not necessarily
                  required.  All supported options:
    --metadata=1.0: Put the superblock at the end of each device.
    --metadata=1.2: Put the superblock 4K into each device.

    There is also a --metadata=1.1 option, but it definitely doesn't work with
    grub, since it puts the metadata right at the start of each device.  There
    is also a 0.90 metadata format, but that has serious limitations and is not
    supported by this hint at all.

  --homehost="<id string>": Set the "home host" string for this array.  This is
                            used for auto-assembly inside the initramfs: any
                            device with a matching home host string will be put
                            into the RAID array.  Remember the string you use.

  --level=<X>: Select the RAID level to use.  Valid options are 0 (not
               recommended), 1, 5, 6, 10, and any others listed in your local
               mdadm(8) man page.

  --auto=<md|mdp>: Choose either "md" or "mdp".  "mdp" allows partitions to be
                   created on the result of the RAID array; this is normally not
                   needed with LVM, but may be useful otherwise.  Remember the
                   value you choose.

  /dev/sd*: Device files to use as RAID sources.  These are usually partitions,
            but if you're brave, you can try to get a RAID1 to work with grub
            across a pair of disks.

Note that mdadm will find the smallest source device, and act as if the rest of
the devices were also that size.  Note also that standard RAID array device
counts apply here: a RAID5 array needs at least three members, a RAID1 array
needs at least two members, etc.
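
To see what the smallest-member rule means for capacity, here is a rough
sketch.  usable_size is a hypothetical helper; it ignores metadata overhead and
assumes the default two copies for RAID10.  RAID0 is left out on purpose.  Sizes
may be in any unit, as long as they are whole numbers in the same unit.

```shell
#!/bin/sh
# usable_size LEVEL SIZE...
# Print the approximate usable capacity of an array whose members have the
# given sizes, treating every member as if it were the smallest one.
usable_size() {
    level=$1
    shift
    n=$#
    min=$1
    for s in "$@"; do
        if [ "$s" -lt "$min" ]; then
            min=$s
        fi
    done
    case "$level" in
        1)  echo "$min" ;;                   # every member holds a full copy
        5)  echo $(( (n - 1) * min )) ;;     # one member's worth of parity
        6)  echo $(( (n - 2) * min )) ;;     # two members' worth of parity
        10) echo $(( n * min / 2 )) ;;       # assumes two copies of the data
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
}
```

For example, "usable_size 5 100 120 110" prints 200: the 120 and 110 members are
treated as 100, and one member's worth of space goes to parity.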

Note also that you could use partitions created on the dmraid array, but that
type of setup is at the edge of what this hint supports.  It may work, but mdadm
might have issues reassembling the array inside initramfs, for example.  (You
may be able to hack on mdadm.conf to get this to work, though.)

Finally, note that you could name the array if you want.  See the mdadm(8) man
page for even more options.

6. [enc] Create encrypted volume

If you didn't do md-raid above (step 5), then set aside a partition for /boot
now.  The same comments as above apply: a few hundred megabytes is large enough.

You will need the "dm-crypt" kernel module loaded, if it is not already.  As
root, first fill the partition with random data, to foil some types of attack
which I am not familiar with but which most of the dm-crypt guides seem to know
about:

  dd if=/dev/urandom of=<device> bs=1048576

Here, <device> is the disk or partition or dmraid array or mdadm array that you
wish to encrypt.  This operation will take a *long* time; approximately five
minutes per gigabyte depending on disk and processor speeds.  If that is too
long, you can use a worse-quality random number generator that will run faster:

  badblocks -c 10240 -s -w -t random -v <device>

You could also skip this step entirely if you really want to.
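
To decide whether the /dev/urandom pass is worth the wait, the rough figure
above (about five minutes per gigabyte) can be turned into an estimate.  This
is only a sketch: fill_minutes is a hypothetical helper, and the real rate
depends on your disk and processor.

```shell
#!/bin/sh
# fill_minutes BYTES
# Estimate the minutes needed to fill a device of the given size, at roughly
# five minutes per gigabyte (2^30 bytes), rounding the size up.
fill_minutes() {
    bytes=$1
    gib=$(( (bytes + 1073741823) / 1073741824 ))
    echo $(( gib * 5 ))
}
```

The size in bytes of a block device can be read as root with "blockdev
--getsize64 <device>", so an estimate for a real device would be
fill_minutes "$(blockdev --getsize64 <device>)".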

Now put the encryption header on the volume, and choose a key:

  cryptsetup --cipher=aes-cbc-essiv:sha256 --key-size=256 luksFormat <device>

The cipher given here is not required (use any cipher your kernel supports),
although using -cbc-essiv is recommended to defend against watermark attacks.
Note that the output size of the ESSIV hash chosen (256 bits for sha256) must
match the --key-size argument.
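
The hash-size rule can be checked before running luksFormat.  The sketch below
takes that rule at face value; check_cipher and essiv_hash_bits are hypothetical
helpers, and the table only covers a few common hash names.

```shell
#!/bin/sh
# Print the digest size, in bits, of a few common hash names.
essiv_hash_bits() {
    case "$1" in
        md5)    echo 128 ;;
        sha1)   echo 160 ;;
        sha256) echo 256 ;;
        sha512) echo 512 ;;
        *)      return 1 ;;
    esac
}

# check_cipher CIPHER-SPEC KEY-BITS
# Succeed if the ESSIV hash in the cipher spec matches the key size.
check_cipher() {
    spec=$1
    keybits=$2
    case "$spec" in
        *-essiv:*) hash=${spec##*:} ;;
        *)         return 0 ;;    # no ESSIV hash, so nothing to compare
    esac
    bits=$(essiv_hash_bits "$hash") || {
        echo "unknown hash: $hash" >&2
        return 2
    }
    [ "$bits" -eq "$keybits" ]
}
```

With the values used above, "check_cipher aes-cbc-essiv:sha256 256" succeeds;
the same cipher string with a key size of 128 fails.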

Now open the encrypted volume:

  cryptsetup luksOpen <device> pv

The last argument ("pv") is arbitrary, although "pv" makes sense if you're going
to put LVM on this encrypted volume.  The string given will be used verbatim for
the decrypted device name in /dev/mapper, so remember it if you use something
else.

7. [lvm] Set up LVM

If you did not do encryption (step 6) or md-raid (step 5) above, then set aside
a partition for /boot now.  The same comments apply: a few hundred megabytes
will be more than enough.

LVM groups a set of physical volumes (PVs) into a volume group (VG), which is
then split into logical volumes (LVs).  LVs can also be resized at runtime,
either larger (if room is available on any of the VG's PVs) or smaller.

To start, format the sources as an LVM PV:

  pvcreate --zero y /dev/<device>

Use whichever <device> is current at this point: either /dev/mapper/pv, or
/dev/mdXpX, or /dev/mdX, or /dev/mapper/<dmraid-name>, or /dev/sdXY.

You can also remove the "--zero y" arguments if you don't wish to zero out the
contents of the volume before you start.

(Writing zeros may or may not increase security if the underlying volume is
encrypted.  On the one hand, if an attacker knows that a certain sector is
all-zeros in plaintext, this may -- depending on the encryption algorithm --
help them attack the encryption key.  On the other hand, it does write out lots
of seemingly-random data to the raw disk device, and this may disguise the fact
that it contains an encrypted volume if you didn't already write random data out
to it in step 6.  Writing zeros also takes a while.)

Repeat this step for any other device you wish to add to the volume group.  (VGs
can do a sorta-kinda-almost-RAID setup, where a set of logical volumes spans any
number of physical volumes in the VG.  It's easiest with a single PV though.)

Now create a VG containing this PV:

  vgcreate lfs /dev/<device>

Use the same /dev/<device> that you just formatted as a PV.  If you have more
than one PV, you can specify more than one /dev/<device>.

The volume group is named "lfs" here.  You can use any name you wish.  Once
logical volumes are created below, they will be accessible as symlinks at
/dev/<vg-name>/<lv-name>, so use any volume group name that will make them easy
to keep separate from all the other stuff in /dev.

Now create the logical volume(s):

  lvcreate -L 10G -n root lfs
  lvcreate -L 1G -n swap lfs
  ...

The argument to -L is the size of the logical volume.  Any size that will fit in
the current volume group is acceptable, though I recommend leaving some space
free for future LV growth, or for snapshot LVs.  (These can be used to check
your filesystems in the background, for instance, instead of having a 180-day
timeout after which fsck does a full check at boot time.)

The argument to -n is the logical volume's name: here, root and swap.  The final
argument is the volume group that the new LVs are part of: here, lfs.

At this point, you should have a /dev/lfs/root and /dev/lfs/swap (or whatever
names you chose); use these for the LFS and swap partitions in the book.
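
The commands in this step can be collected into one small script.  This is a
sketch only: run and setup_lvm are hypothetical helpers, and the sizes and names
are the examples from above.  By default it only prints the commands; set
DRY_RUN=0 (as root) to actually run them.

```shell
#!/bin/sh
# Print the command (default), or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_lvm PV-DEVICE VG-NAME
setup_lvm() {
    pv=$1    # for example /dev/mapper/pv
    vg=$2    # for example lfs
    run pvcreate --zero y "$pv"
    run vgcreate "$vg" "$pv"
    run lvcreate -L 10G -n root "$vg"
    run lvcreate -L 1G -n swap "$vg"
}
```

Running "setup_lvm /dev/mapper/pv lfs" with DRY_RUN unset prints the four
commands in order, which makes it easy to review them before anything is
written to disk.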

===============================================================================
  Notes on section 2.4: Mounting the New Partition
===============================================================================

8. [mdr,enc,lvm] Mounting /boot

Be sure to mount your boot partition at $LFS/boot in this step.

Also be sure to use the correct (final) root and swap devices (for example,
/dev/lfs/root and /dev/lfs/swap if you are doing LVM, you used "lfs" for your
VG name, and you used "root" / "swap" for your LV names).

===============================================================================
  Notes on sections 3.2: All Packages, and 3.3: Needed Patches
===============================================================================

9. [dmr,mdr,enc,lvm] New packages

You need LVM2 and lfs-initramfs, and maybe dmraid, mdadm, and cryptsetup+popt.
You also need the patch from above for LVM2.  See above for links.

===============================================================================
  Modifications to Chapter 6
===============================================================================

10. [dmr,mdr,enc,lvm] Modifications to Udev-XXX (any recent version)

Do *not* install rules/packages/64-device-mapper.rules, even though the book
says to.  Instead, install rules/suse/64-device-mapper.rules -- this is a much
better base for device-mapper support.  (Installing 64-md-raid.rules is fine,
and is required if you plan to do md RAID.)

install -m644 -v rules/packages/64-md-raid.rules /lib/udev/rules.d/
install -m644 -v rules/suse/64-device-mapper.rules /lib/udev/rules.d/

Before installing udev-config, stop it from ignoring device-mapper devices (so
that the device-mapper udev rules will work), by commenting out the rule:

sed -i -e 's/^KERNEL=="dm-\*",.*ignore_device"$/#&/' 55-lfs.rules
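
To see what that sed expression does before touching the real file, try it on a
scratch copy first.  The two sample rules below are assumptions about what
55-lfs.rules might contain, not lines quoted from it; check your own copy.

```shell
#!/bin/sh
# Apply the same sed expression to a temporary file with two sample rules.
tmp=$(mktemp)
printf '%s\n' \
    'KERNEL=="dm-*", OPTIONS+="ignore_device"' \
    'KERNEL=="loop*", GROUP="disk"' > "$tmp"

sed -i -e 's/^KERNEL=="dm-\*",.*ignore_device"$/#&/' "$tmp"

# Only the dm-* rule should now start with a comment character.
result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

If the dm-* line in your real file does not end with ignore_device" exactly, the
pattern will not match, and the rule will have to be commented out by hand.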

===============================================================================
  New packages at the end of chapter 6
===============================================================================

11. [dmr,mdr,enc,lvm] Installing LVM2

Once you're finished with chapter 6, you need to install a few more packages to
support your chosen root FS type.  All of these packages require device-mapper
(which is now part of LVM2).  Whether you want to use LVM or not:

patch -Np1 -i ../LVM2-2.02.53-fix_udev_rules-1.patch
./configure --prefix=/usr --bindir=/bin --sbindir=/sbin --libdir=/lib \
    --enable-readline --with-udev-prefix= --with-udevdir=/etc/udev/rules.d \
    --enable-udev_rules

If you do *not* want to use LVM, build and install just device-mapper:

make device-mapper
make install_device-mapper

If you *do* want to use LVM, build and install everything:

make
make check
make install
install -m644 doc/example.conf /etc/lvm/lvm.conf

(Note that every test in the testsuite will fail, unless you run it as root.
It requires the ability to mknod(2) so it can create a path to the device-mapper
kernel driver.  Also note that the tests require support for pretty much
everything in the kernel -- all device-mapper targets.  Therefore, if you have
configured your kernel to disable any of these, or you build as a non-root user,
then either consider the tests optional, or do not be surprised if they fail.)

12. [dmr] Installing dmraid

./configure --prefix=/usr --bindir=/bin --sbindir=/sbin --libdir=/lib
make
make install

13. [mdr] Installing mdadm

First, prevent mdadm from installing udev rules that rely on vol_id (which is
now gone), and which are also duplicated by the 64-md-raid.rules file that you
already have:

sed -i -e '/^install :/s/ install-udev//' Makefile

Then build and install it:

make INSTALL=install
make INSTALL=install install

(Note that mdadm also has a testsuite, which you can run -- but again, only as
root -- with "sh ./test" after building.)

14. [enc] Installing cryptsetup

First install popt and move its library to /lib (but keep the compile-time
symlink in /usr/lib):

./configure --prefix=/usr
make
make check
make install
mv -v /usr/lib/libpopt.so.* /lib
ln -sfv ../../lib/libpopt.so.0 /usr/lib/libpopt.so

Then install cryptsetup:

./configure --prefix=/usr --libdir=/lib --sbindir=/sbin --sysconfdir=/etc
make
make install

15. [dmr,mdr,enc,lvm] Installing lfs-initramfs

./configure --prefix=/usr --sysconfdir=/etc
make
make install

Now edit /etc/mkinitramfs.conf; see the inline comments for guidance on what
variables to set to what values.

Once the config file looks sane, continue in the LFS book.

===============================================================================
  Modifications to Chapter 8
===============================================================================

16. [dmr,mdr,lvm,enc] Kernel Configuration

Be sure to enable the kernel options required for your chosen layers of device
mapping.
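
One way to double-check the configuration is to look for the relevant symbols
in the kernel's .config.  This is a sketch: missing_opts is a hypothetical
helper, and the option names in the usage note are my assumption about which
symbols correspond to each layer, not a list taken from lfs-initramfs.

```shell
#!/bin/sh
# missing_opts CONFIG-FILE OPTION...
# Print each option that is not set to y or m in the given kernel config.
missing_opts() {
    cfg=$1
    shift
    for opt in "$@"; do
        grep -Eq "^${opt}=(y|m)\$" "$cfg" || echo "$opt"
    done
}
```

For example, from the kernel source directory: "missing_opts .config
CONFIG_BLK_DEV_INITRD CONFIG_BLK_DEV_DM" for any setup, plus CONFIG_DM_CRYPT
for [enc], CONFIG_DM_MIRROR for [dmr], and CONFIG_BLK_DEV_MD with the
CONFIG_MD_RAID* symbol for your level for [mdr].  No output means every listed
option is enabled.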

17. [enc] Creating /etc/crypttab

The initramfs uses /etc/crypttab to find out which source devices to decrypt
into which target names.  Find the source device for your encryption.  Then
create the crypttab file:

cat <<'EOF' >/etc/crypttab
# Begin /etc/crypttab

# Syntax is:
# dest src password options
#   "options" and "password" are optional (and "password" may not be supported
#   in the initramfs).  "none" means no password or options.

pv /dev/xxxxxxxxxxx none none

# End /etc/crypttab
EOF
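
To confirm that the file reads the way you intended, the entries can be listed
with a few lines of shell.  This only illustrates the four-field syntax shown
above; list_crypttab is a hypothetical helper and is not how the initramfs
itself parses the file.

```shell
#!/bin/sh
# list_crypttab FILE
# Print "dest <- src" for every entry, skipping comments and blank lines.
list_crypttab() {
    while read -r dest src password options; do
        case "$dest" in
            ''|'#'*) continue ;;
        esac
        echo "$dest <- $src"
    done < "$1"
}
```

Running "list_crypttab /etc/crypttab" on the file above should print a single
line naming pv and the source device you filled in.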

18. [dmr,mdr,lvm,enc] Creating the initramfs image

Now that everything is set up, you can actually create the initramfs.  The
mkinitramfs script has several options; we will explicitly force it to pull the
files for the LFS book's current kernel version.  (Add "-f" if you have already
run the script once, and it has created the initramfs-<version>.cpio.gz file in
/boot.)  As root:

mkinitramfs -k <kernel-version>

19. [dmr,mdr,lvm,enc] Grub configuration

A note: since we're using a separate partition for /boot, none of the grub path
names will begin with /boot.  Grub's root is that other partition, so its
pathnames will be (for example) /grub/menu.lst or /lfskernel-<version>.

Another note: you don't actually need a "root=X" kernel command line parameter
with this setup.  The initramfs will (try to) find it from /etc/fstab.  This
*does* mean that whenever you change /etc/fstab (or /etc/crypttab), you will
want to rerun mkinitramfs so it picks up the changes, but you don't need to keep
changing menu.lst if you want to change how the root FS is found.  You will also
be able to choose more mount-time options (e.g., "barrier=1" now works, assuming
your device-mapper stack supports it all the way down; "data=journal" does as
well), since they all come from fstab.

On to the actual changes.  In addition to the kernel line in menu.lst, you will
need an "initrd" line that refers to the initramfs-<kernel-version>.cpio.gz
file.  (Don't worry, it's not really an initrd.  The interface from grub to the
kernel is the same as it would be for an initrd, so keeping the same name in
grub's configuration makes sense.)
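
As an illustration, a menu.lst entry along those lines could be appended as
follows.  This is a sketch under assumptions: add_lfs_entry is a hypothetical
helper, the title is arbitrary, and "root (hd0,0)" assumes that /boot is the
first partition of the first disk grub sees; adjust it to your layout.

```shell
#!/bin/sh
# add_lfs_entry MENU-FILE KERNEL-VERSION
# Append a grub entry that loads the kernel and its initramfs image.  Paths
# are relative to the /boot partition, so they do not start with /boot.
add_lfs_entry() {
    menu=$1       # for example /boot/grub/menu.lst
    version=$2    # the kernel version passed to mkinitramfs -k
    cat >> "$menu" <<EOF

title LFS $version (initramfs)
root (hd0,0)
kernel /lfskernel-$version
initrd /initramfs-$version.cpio.gz
EOF
}
```

Because the function only appends, the existing entries in the file are left
untouched.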

Be *SURE* to keep an entry in menu.lst for your host distro.  If the initramfs
breaks, your system will likely be unusable; you will have to mount everything
back up from the host and figure out how to fix the initramfs.  (Having a
reasonable LiveCD available also helps with this.)

===============================================================================
  End: Use the initramfs to boot
===============================================================================

Reboot.  Choose the LFS entry from the GRUB menu.  Enter the dm-crypt password
if you configured that, when requested.  (Note that you may have to add a
rootdelay=X kernel command line parameter, depending on your disk hardware and
discovery speeds.)  If you are dropped to "an extremely minimal shell", that
means that something went wrong: you can either reboot and play with kernel
command line options to try to get it to work, or manually run the commands in
the script named /init to see where the breakage is.

Unfortunately there are hardly any debugging tools on the initramfs; there may
never be very many.


ACKNOWLEDGEMENTS:

- Alexander Patrakov <patrakov AT gmail.com>
  For filing the bug that started this.  Also for providing feedback on early
  versions of lfs-initramfs, and the list of features he thought it needed.

- Simon Geard <delgarde AT ihug.co.nz>
  For testing many early versions of lfs-initramfs.


CHANGELOG:

[2009-09-25]
    Initial Release
