Yet another GRUB recovery article… Yep, I can’t deny, that’s what it’s about!

Introduction

GRUB is a bootloader, i.e. a piece of software that gets loaded very early on when you boot your machine, and which is in charge of booting the operating system. GRUB is the bootloader of choice for many Linux distributions.

From time to time, poor GRUB gets trampled by some very rude operating systems. In my case, I needed the infamous Windows, and as I didn’t want to install it, I googled my way to a tutorial that explained how to create a bootable USB stick with Windows. Great, I thought, and it worked! Except that after doing my thing in Windows, I restarted and found that the machine couldn’t boot anymore. GRUB had been wiped away, silently, without any warning, by the infamous Windows…

If you find yourself in a similar situation, fear not! We can always boot a live system from a USB stick, and repair the boot partition (i.e. re-install GRUB) from there. It’s fairly easy, it’s just a matter of knowing the right commands.

There are already many tutorials of this kind available, each one slightly different from the others depending on the particular hardware and software of the person writing it. The same goes for this article, so let me tell you immediately about my configuration:

  • We’re talking about recovering GRUB with a GParted Live CD.
  • This happens on a UEFI machine.
  • The disk is partitioned with GPT.
  • The disk happens to be an NVMe SSD.
  • The whole disk is encrypted with LUKS.
  • The LUKS partition is “partitioned” using LVM.

If you have no idea what I’m talking about, you’re probably on the wrong page. But if you understand what I mean, and even if your configuration is a bit different, read on. I’ll do my best to explain what’s going on so that you can adapt these commands to your particular setup.

A word of caution though: we’re dealing with the disks and filesystems here, so if you screw up you can lose data. I will just assume that you understand what’s going on, and that you’re able to adapt the commands below to your own setup. Mostly, it’s about changing the device names here and there, and I can’t do that for you.

First, let’s understand how things work

So let’s get started with some explanations. If you already know how things work and just want to copy/paste some commands, skip this part. But if you want to understand a bit what we’re doing here, read on.

The commands in this part are meant to be run when your machine is up and running, the purpose is to look at how things are set up. So of course, if right now your machine can’t boot, you can’t run these commands. I mean, don’t run them from a live system, they won’t give you the expected result.

Overview of the disk partitions

Let’s have an overview of the partitions present on the disk, the best command for that being lsblk. On my laptop, here’s the output:

$ lsblk
NAME                 MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1              259:0    0   477G  0 disk  
├─nvme0n1p1          259:1    0   512M  0 part  /boot/efi
├─nvme0n1p2          259:2    0   244M  0 part  /boot
└─nvme0n1p3          259:3    0 476.2G  0 part  
  └─nvme0n1p3_crypt  253:0    0 476.2G  0 crypt 
    ├─deb--vg-root   253:1    0  23.3G  0 lvm   /
    ├─deb--vg-var    253:2    0    40G  0 lvm   /var
    ├─deb--vg-swap_1 253:3    0  15.9G  0 lvm   [SWAP]
    ├─deb--vg-tmp    253:4    0    10G  0 lvm   /tmp
    └─deb--vg-home   253:5    0 385.9G  0 lvm   /home

As I explained above, I have a UEFI machine, the drive is an SSD, partitioned with GPT, and the disk is encrypted. You can see all of that above, i.e.:

  • nvme0n1 is what you get if you have an NVMe SSD. For comparison, a SATA drive (an older HDD, or a SATA SSD) would be named something like sda. nvme stands for Non-Volatile Memory Express, nvme0 is the first controller, and n1 is the first namespace on it, which in practice means disk number one. Additionally, the suffix p1, p2 and so on is appended to give the partition number.
  • nvme0n1p1 and nvme0n1p2 are the two partitions involved in the boot process.
  • nvme0n1p3 is the LUKS encrypted partition that takes up the rest of the disk. When the partition is unlocked, it is available unencrypted at nvme0n1p3_crypt, thanks to the Linux device mapper.
  • Additionally, all the deb-- devices are the different logical volumes present inside the encrypted partition. If you’re familiar with Linux distributions, you can recognize the usual partitions /, /home and the swap partition. I also decided to have separate /tmp and /var partitions, but that’s a matter of taste.
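
Since most of the adapting you will have to do is about device names, here is a minimal sketch of the naming rule described above. The function name part_dev is my own invention for illustration: it just inserts the p separator when the disk name ends in a digit.

```shell
# part_dev DISK N: print the device node for partition N of DISK.
# Linux inserts a "p" before the partition number when the disk
# name itself ends in a digit (nvme0n1 -> nvme0n1p3, sda -> sda3).
# (part_dev is a hypothetical helper, not an existing tool.)
part_dev() {
    case "$1" in
        *[0-9]) printf '%sp%s\n' "$1" "$2" ;;
        *)      printf '%s%s\n'  "$1" "$2" ;;
    esac
}
```

For example, part_dev /dev/nvme0n1 3 gives the LUKS partition on my laptop, while part_dev /dev/sda 3 gives what it would be called on a SATA disk.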

The logical volumes are not exactly partitions the way we know them. They appear to the user as block devices, but they are virtual block devices: you won’t find them by name directly in /dev, but in the sub-directory /dev/mapper.

$ ls -1 /dev/mapper/
control
nvme0n1p3_crypt
deb--vg-home
deb--vg-root
deb--vg-swap_1
deb--vg-tmp
deb--vg-var
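
If the doubled hyphen in these names puzzles you: the device mapper joins the volume group name and the logical volume name with a single hyphen, and doubles any hyphen that is already part of either name. Here is a small sketch of that rule, assuming my volume group is named deb-vg (which is what the listing suggests). The helper name dm_name is made up for the example.

```shell
# dm_name VG LV: print the /dev/mapper name LVM uses for a logical volume.
# Hyphens inside the VG or LV name are doubled, then both parts are
# joined with a single hyphen. (dm_name is a hypothetical helper.)
dm_name() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '%s-%s\n' "$vg" "$lv"
}
```

So dm_name deb-vg root prints deb--vg-root, which is the name you will have to type when mounting the root volume later on.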

In case you wonder, such a layout is nothing fancy. It’s what you get with the Debian Buster installer, assuming that you select whole-disk encryption during the install process.

Ok, let’s dive a bit more into details.

The boot partition

There are two partitions involved in the boot process.

The first partition of the disk, nvme0n1p1, is the EFI System Partition. At boot time, the UEFI firmware reads this partition and boots your system from there. In our case, this is where GRUB lives.

On a Debian system, this partition is usually mounted at /boot/efi, and you can have a look at what’s inside.

$ sudo tree /boot/efi
/boot/efi
└── EFI
    ├── debian
    │   └── grubx64.efi
...

Indeed, GRUB lives there, among other things.

The kernel and initrd partition

I mentioned that my disk is encrypted, right? It means that at boot time, I need to enter my password in order to decrypt the operating system. The decryption can be implemented in different places, and recent versions of GRUB can actually handle it. However, in the setup I describe here, this is not the case. GRUB just boots an unencrypted kernel, and it’s the kernel, helped by the initrd, that is in charge of unlocking the encrypted partition.

It means that we need an unencrypted partition somewhere to store the kernel and the initrd, and this is the purpose of the second partition, nvme0n1p2.

On a Debian system, this partition is usually mounted at /boot. Once again, the best is to have a look and convince yourself.

$ ls -1 /boot
config-4.13.0-1-amd64
config-4.14.0-3-amd64
efi
grub
initrd.img-4.13.0-1-amd64
initrd.img-4.14.0-3-amd64
lost+found
System.map-4.13.0-1-amd64
System.map-4.14.0-3-amd64
vmlinuz-4.13.0-1-amd64
vmlinuz-4.14.0-3-amd64

vmlinuz is the usual filename for the Linux kernel, while initrd.img is the initial ramdisk, i.e. a temporary, minimal root filesystem that GRUB loads into memory alongside the kernel, and that performs some initial setup (such as asking for the LUKS passphrase) before mounting your “real” root filesystem.
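
Each kernel needs its matching initrd, otherwise it won’t be able to unlock the disk. Here is a quick sketch to check that the pairs are complete. The function name check_kernels is mine, and it only relies on the vmlinuz-VERSION / initrd.img-VERSION naming shown above.

```shell
# check_kernels [DIR]: for every vmlinuz-VERSION found in DIR (default
# /boot), report whether a matching initrd.img-VERSION is present.
# (check_kernels is a hypothetical helper written for this article.)
check_kernels() {
    dir=${1:-/boot}
    for k in "$dir"/vmlinuz-*; do
        [ -e "$k" ] || continue      # no kernel at all: the glob stayed literal
        ver=${k##*/vmlinuz-}         # strip everything up to "vmlinuz-"
        if [ -e "$dir/initrd.img-$ver" ]; then
            echo "$ver: ok"
        else
            echo "$ver: initrd missing"
        fi
    done
}
```

Run it without argument on a working system, or with /mnt/boot later on, once the partitions are mounted from the live system.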

Some more commands

Another good command to get a hierarchical view of the mount points is findmnt.

$ findmnt
...
├─/boot        /dev/nvme0n1p2  ext2  rw,relatime,...
│ └─/boot/efi  /dev/nvme0n1p1  vfat  rw,relatime,...
...

The output shows clearly how the efi mount point is nested inside the /boot mount point. This matters because during the recovery process we will have to reproduce this layout.

Finally, you can also get some useful information with fdisk.

$ sudo fdisk -l
Disk /dev/nvme0n1: 477 GiB, 512110190592 bytes, 1000215216 sectors
Units: ...
Sector size ...
I/O size ...
Disklabel type: gpt
Disk identifier: ...

Device           Start        End   Sectors   Size Type
/dev/nvme0n1p1    2048    1050623   1048576   512M EFI System
/dev/nvme0n1p2 1050624    1550335    499712   244M Linux filesystem
/dev/nvme0n1p3 1550336 1000214527 998664192 476.2G Linux filesystem
...

One interesting detail here is the Disklabel type, which should be gpt if your disk is partitioned with GPT, or dos otherwise.
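
If you only care about that one line, here is a small sketch of a filter that picks it out of the fdisk output. The function name disklabel_type is made up, and it assumes the output format shown above.

```shell
# disklabel_type: read `fdisk -l` output on stdin and print the value of
# the "Disklabel type:" line (gpt or dos).
# (disklabel_type is a hypothetical helper; it only parses text.)
disklabel_type() {
    awk -F': ' '/^Disklabel type:/ { print $2 }'
}
```

Use it like this: sudo fdisk -l /dev/nvme0n1 | disklabel_type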

Ok, I hope you have a good overview of things now, time to get started with the real thing: recovering GRUB.

Get yourself a GParted Live CD

Let’s get started with the recovery process!

The first thing to do is to prepare a USB stick (or a CD-ROM if you’re a bit old-fashioned) with a live system of some kind. I usually go with a GParted Live CD/USB for this task, as it’s been around forever and has always done the job. But there are alternatives, and you might want to give rEFInd a try if you want to try out new stuff.

So, let’s visit https://gparted.org/download.php, download an iso, and install that on a USB stick. I won’t cover these details, it’s nothing complicated and there’s plenty of explanation available on the GParted website already.

When you’re done, plug in the USB stick, reboot your machine, get a boot prompt, and then choose to boot from the USB stick.

Be sure that your machine is configured to boot in EFI mode, and not in legacy BIOS mode. If I’m not mistaken, GRUB detects the mode later on and installs itself accordingly. So if you boot the live system in legacy BIOS mode, GRUB will install itself the legacy way, and you will not be able to boot it through UEFI.
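
Once the live system is up, you can check which mode it was booted in: on Linux, the directory /sys/firmware/efi only exists after a UEFI boot. Here is a minimal sketch. The function name boot_mode and its optional argument (an alternative sysfs location, only there to make the function easy to try out) are my own.

```shell
# boot_mode [SYSFS]: print UEFI if the running system was booted through
# UEFI, BIOS otherwise. SYSFS defaults to /sys.
# (boot_mode is a hypothetical helper, not a standard command.)
boot_mode() {
    sysfs=${1:-/sys}
    if [ -d "$sysfs/firmware/efi" ]; then
        echo UEFI
    else
        echo BIOS
    fi
}
```

If boot_mode prints BIOS in the live system, reboot and pick the UEFI entry for the USB stick before going any further.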

Your first steps in GParted

The desktop looks a bit old-fashioned, it’s not super pretty, but that’s not the point. The point is that GParted comes equipped with all the tools you need to deal with disks and filesystems, all of that up-to-date and well maintained.

There are a few icons on the desktop, one is about changing the display resolution, and it can be very useful. Other than that, just right click somewhere on the desktop, and choose to open a root terminal.

And that’s pretty much all we need to know.

The recovery process

Now, it’s just a matter of chrooting properly. chroot (literally “change root”) is the magic of executing a process in another root directory. Right now, if you type ls / in your terminal, you will see the current root directory of the system. And which system? The live system, the one you booted from the USB stick.

However, in order to recover GRUB, we will need to run some commands in the root directory of your machine, the one you can’t boot at the moment. The reason is that GRUB needs to access resources in several places of this hierarchy. So the right way to achieve that is to chroot. And we need a bit of work to prepare the environment in which we will chroot, it’s a bit more complicated than just doing cd somewhere, or just remounting a directory.

So, yep, that’s where all the difficulty is, simply because these are the kind of commands we never have to run in daily life. And there are different elements in the picture: we will have to deal with the encryption, the logical volumes, and all the rest.

So, let me guide you in the process.

Mapping every device

Please remember that I told you to open a root terminal!

# whoami
root

A lot of magic will happen in /dev/mapper, so you might as well have a look right now.

# ls /dev/mapper

Yep, right now it’s more or less empty, but we will populate it.

The first thing we need to handle is the encryption: we need to unlock our encrypted partition before we can work with it.

This is actually super easy, as long as you know the command :)

# cryptsetup luksOpen /dev/nvme0n1p3 cryptdisk

And have a look at /dev/mapper immediately to see what happened.

# ls /dev/mapper

Done, so from now on your encrypted partition is unlocked. Let’s move on to the next step, which is to activate the LVM logical volumes. I’m not sure this is always needed, but it doesn’t hurt either.

Just enter these LVM commands.

# vgscan
# vgchange -ay

And once again, be curious, check what’s up in /dev/mapper.

# ls /dev/mapper

As you can see, all your logical volumes are now visible there. That’s great, we can start to mount them now.

Preparing the chroot

We will prepare the chroot in /mnt. Right now, it’s empty.

# ls /mnt

We need to mount the root partition first.

# mount /dev/mapper/deb--vg-root /mnt
# ls /mnt

If you read the explanations above, then you know that the two boot partitions are supposed to be mounted at /boot and /boot/efi. Since one mount point is nested into the other, the order of these commands matters.

# mount /dev/nvme0n1p2 /mnt/boot
# mount /dev/nvme0n1p1 /mnt/boot/efi
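
If your layout has more nested mount points than mine, a simple way to get a valid order is to sort the mount points: a parent path is always a prefix of its children, so it sorts first. Here is a minimal sketch, with a made-up function name mount_order.

```shell
# mount_order: read mount points on stdin (one per line) and print them
# in an order where every parent comes before its children.
# A plain byte-wise sort is enough, since a parent path is a prefix of
# the paths nested below it. (mount_order is a hypothetical helper.)
mount_order() {
    LC_ALL=C sort -u
}
```

For example, printf '%s\n' /boot/efi /var / /boot | mount_order lists /, /boot, /boot/efi, /var. Reverse that order when unmounting.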

Additionally, other system partitions need to be mounted, if any. So if you have a separate /var partition, it’s time to mount it.

# mount /dev/mapper/deb--vg-var /mnt/var

Finally, we bind-mount the virtual filesystems. These are some special directories, containing runtime information that the system needs to operate properly.

# mount -o bind /dev  /mnt/dev
# mount -o bind /proc /mnt/proc
# mount -o bind /sys  /mnt/sys

And we’re done! We can now enter another world, i.e. change the root directory.

# chroot /mnt
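
For the record, here is the whole preparation gathered in one place, as a sketch rather than a ready-made tool. The device and volume names are the ones from my laptop, so adapt them. The run wrapper, the DRY_RUN variable and the prepare_chroot function are my own inventions: by default nothing is executed, the commands are only printed so that you can review them first.

```shell
# Names from this article; adapt them to your own machine.
DISK_EFI=/dev/nvme0n1p1
DISK_BOOT=/dev/nvme0n1p2
DISK_LUKS=/dev/nvme0n1p3
VG_PREFIX=/dev/mapper/deb--vg
TARGET=/mnt

# run CMD...: print the command, and execute it only when DRY_RUN=0.
# (run, DRY_RUN and prepare_chroot are hypothetical names.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# prepare_chroot: unlock the LUKS partition, activate LVM, then mount
# everything under $TARGET, parents before children.
prepare_chroot() {
    run cryptsetup luksOpen "$DISK_LUKS" cryptdisk &&
    run vgscan &&
    run vgchange -ay &&
    run mount "$VG_PREFIX-root" "$TARGET" &&
    run mount "$DISK_BOOT" "$TARGET/boot" &&
    run mount "$DISK_EFI" "$TARGET/boot/efi" &&
    run mount "$VG_PREFIX-var" "$TARGET/var" &&
    for d in dev proc sys; do
        run mount -o bind "/$d" "$TARGET/$d" || return 1
    done
}
```

Calling prepare_chroot as is prints the ten commands. Once you are happy with what you see, DRY_RUN=0 prepare_chroot runs them for real, from a root terminal, and stops at the first failure.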

As you noticed, your prompt changed. That’s the indication that you are now in a different environment.

Recovering GRUB

It’s now super easy. We want to re-install GRUB, and we just need two commands for that.

# grub-install /dev/nvme0n1
Installing for x86_64-efi platform.
Installation finished. No error reported.

# update-grub
Generating grub configuration file...
Found background image: blablabla
Found linux image: /boot/vmlinuz-4.14.0-3-amd64
...
Adding boot menu entry for EFI firmware configuration
done

If you see this warning along the way, fear not, it’s harmless.

WARNING: Failed to connect to lvmetad. Falling back to device scanning.

Done! Now you want to exit the chroot.

# exit

See how your prompt changed? You’re out of the chroot, back to the real world! Time to reboot and see if GRUB is happy again.

# reboot

After that, I’m not even sure your system will boot ;) You might need to enter your UEFI manager, and then manually add a boot entry: select the file bootx64.efi. On my laptop, the UEFI manager is a bit dumb, I also need to give a name when I create the entry, otherwise it fails with some unhelpful error message.
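
One possible reason for a missing boot entry, and this is a guess on my side rather than something I verified on this laptop: mount -o bind is not recursive, so the efivarfs filesystem that the live system mounts at /sys/firmware/efi/efivars may not be visible inside the chroot, in which case grub-install can’t register an entry with the firmware. Here is a small sketch to check that before chrooting. The function name efivars_visible is made up.

```shell
# efivars_visible [ROOT]: succeed if ROOT/sys/firmware/efi/efivars exists
# and is not empty, i.e. the EFI variables can be seen from under ROOT.
# ROOT defaults to the current root directory.
# (efivars_visible is a hypothetical helper, and the diagnosis it
# supports is an assumption, not a verified fact.)
efivars_visible() {
    dir="${1:-}/sys/firmware/efi/efivars"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

If efivars_visible /mnt fails while efivars_visible succeeds, bind-mounting /sys/firmware/efi/efivars onto /mnt/sys/firmware/efi/efivars before the chroot might spare you the trip to the UEFI manager.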

Thanks

Here are some readings that made this article possible.