IT-LINUXMAKER, OpenSource, Tutorials

Repair Proxmox VM disks with fsck

An fsck (file system check) is necessary when a file system is inconsistent or corrupted, typically after an unclean shutdown, a sudden power outage, or kernel I/O errors. Traditionally this is done with a live system such as GParted Live: its ISO image is made available on the Proxmox host and attached to the CD drive of the VM in question, either via the Proxmox GUI or from the Bash terminal.

Here, however, we'll introduce the file system check via the Bash console on the Proxmox host. It's also assumed that the VM images are located on a ZFS-based storage, as is standard with Proxmox.
The idea is this:
The virtual hard disk and its partitions are stored in the VM's disk image, which is a ZFS volume (zvol) on the host. In principle, fsck can therefore be run directly from the KVM host against the VM's system disk. However, it's not that simple: the zvol is a complete virtual drive with a partition table (e.g., MBR or GPT) and typically one or more partitions. For example, /dev/zvol/zp_100/vm-100-disk-0 contains what the guest sees as sda1, sda2, etc. Running

~# fsck /dev/zvol/zp_100/vm-100-disk-0

would therefore not check an actual file system but the raw device including the partition table, which is incorrect and can lead to data loss.

So for this procedure the program kpartx is installed first.

~# apt-get update
~# apt install -y kpartx

Main function of kpartx

kpartx is a Linux command-line tool that reads the partition table of an image file or block device and creates device-mapper devices for the partitions it contains. The result is devices such as /dev/mapper/loop0p1, /dev/mapper/loop0p2, etc., which can be mounted or checked like normal partitions.

Procedure for the file system check with kpartx

First, the ID of the VM is required. It can be determined with:

~# qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 DNS-Master-Server    running    3072             150.00 1626922
       101 Secondary-DNS-Server running    3072             150.00 1358521

In addition, the VM to be checked, here VM 100, must be stopped:

~# qm stop 100

Next, the location of the hard disk is required. It is listed in the “scsi0:” line of the output of

~# qm config 100

This shows where the hard drive is located and what its name is:

scsi0: zp_100:vm-100-disk-0,cache=writethrough,iothread=1,size=150G

The corresponding device node on the host is found with:

~# ls -l /dev/zvol/zp_100/vm-100-disk-0
lrwxrwxrwx 1 root root 10 Jun 11 12:27 /dev/zvol/zp_100/vm-100-disk-0 -> ../../zd16
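
The step from the "scsi0:" line to the device path can be sketched as a small shell function. This is only an illustration: zvol_path_from_config_line is a hypothetical helper name, and it assumes that the Proxmox storage ID (zp_100 here) is also the name of the ZFS pool, as in the example above.

```shell
#!/bin/sh
# Sketch: turn a disk line from `qm config` into the zvol device path.
# Assumption: the storage ID before the colon is also the ZFS pool name.
zvol_path_from_config_line() {
    # $1 is e.g. "scsi0: zp_100:vm-100-disk-0,cache=writethrough,iothread=1,size=150G"
    volume=${1#*: }         # drop the leading "scsi0: " key
    volume=${volume%%,*}    # drop the options after the first comma
    storage=${volume%%:*}   # storage ID, e.g. "zp_100"
    disk=${volume#*:}       # volume name, e.g. "vm-100-disk-0"
    printf '/dev/zvol/%s/%s\n' "$storage" "$disk"
}
```

Called with the line shown above, the function prints /dev/zvol/zp_100/vm-100-disk-0. Passing that path to readlink -f resolves the symlink to the zd device node that the ls output shows after the arrow.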

Now kpartx can be applied, provided that

~# qm status 100

returns "status: stopped". The device to use is the one the symlink above points to, /dev/zd16. The command

~# kpartx -av /dev/zd16

maps the partitions, which can then be displayed with

~# blkid /dev/mapper/zd16p1
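
Because mapping the partitions of a running VM risks corrupting its disk, the status check and the mapping can be tied together. The following is a minimal sketch; vm_is_stopped and map_if_stopped are hypothetical helper names, and the comparison assumes that qm status prints exactly one line of the form "status: stopped".

```shell
#!/bin/sh
# Sketch: only map the partitions if the VM is reported as stopped.
vm_is_stopped() {
    # $1 is the text printed by `qm status <vmid>`, e.g. "status: stopped"
    [ "$1" = "status: stopped" ]
}

map_if_stopped() {
    vmid=$1      # VM ID as shown by `qm list`
    device=$2    # zd device node the zvol symlink points to
    if vm_is_stopped "$(qm status "$vmid")"; then
        kpartx -av "$device"
    else
        echo "VM $vmid is not stopped, not mapping $device" >&2
        return 1
    fi
}
```

A call takes the VM ID and the device node as arguments. If the VM is still running, nothing is mapped and the function returns a non-zero status.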

Now the file system check can be performed on the individual partitions.

~# fsck -f -y /dev/mapper/zd16p1
~# fsck -f -y /dev/mapper/zd16p2
~# fsck -f -y /dev/mapper/zd16p5
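
Instead of typing one fsck call per partition, the mapped devices can be taken from the output of kpartx -av and checked in a loop. The sketch below uses two hypothetical helpers, mapper_devices and fsck_ext_partitions. It assumes that only ext2/3/4 file systems are to be checked and skips everything else (swap, extended-partition containers, unknown content) based on what blkid reports.

```shell
#!/bin/sh
# Sketch: derive the mapper devices from `kpartx -av` output and check
# only those that blkid identifies as ext2/3/4.
mapper_devices() {
    # reads lines such as "add map loop0p1 (253:0): 0 497952 linear 7:0 2048"
    awk '$1 == "add" && $2 == "map" { print "/dev/mapper/" $3 }'
}

fsck_ext_partitions() {
    # reads one device path per line on stdin
    while read -r part; do
        fstype=$(blkid -o value -s TYPE "$part")
        case $fstype in
            ext2|ext3|ext4)
                # stdin is redirected so fsck cannot consume the device list
                fsck -f -y "$part" </dev/null
                ;;
            *)
                echo "skipping $part (type: ${fstype:-none})"
                ;;
        esac
    done
}
```

One possible use is to save the output of kpartx -av to a file when mapping, and later run mapper_devices on that file and pipe the result into fsck_ext_partitions.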

If fsck cannot repair an ext2/3/4 file system, the additional option -c may help: it is passed on to e2fsck, which then runs a read-only badblocks scan. Afterwards the mapping must be removed again.

~# kpartx -dv /dev/zd16
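
Removing the mapping is easy to forget when a check fails halfway. One way to guard against that is a wrapper that maps the partitions, runs a command, and always unmaps afterwards. This is a sketch under that assumption, and with_partitions_mapped is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: map the partitions, run a command, and always remove the mapping.
with_partitions_mapped() {
    device=$1
    shift
    kpartx -av "$device" || return 1
    "$@"                   # the command to run while the partitions are mapped
    rc=$?
    kpartx -dv "$device"   # undo the mapping even if the command failed
    return "$rc"
}
```

The first argument is the device node, the remaining arguments are the command with its options, for example an fsck call on one of the mapped partitions. The exit status of that command is passed back to the caller.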

Then the VM can be started again.

~# qm start 100
~# qm status 100

Logged in to the Bash console on VM 100, the test

~# dumpe2fs -h /dev/sda1 | grep 'Error count'

is now run. If everything went well, it no longer reports any errors, and the system disk is "clean" again.
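
The "clean" state can also be read directly from the superblock summary. A minimal sketch, assuming an ext2/3/4 file system whose dumpe2fs -h output contains a "Filesystem state:" line. filesystem_state is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: extract the file system state from `dumpe2fs -h` output.
filesystem_state() {
    # reads the dumpe2fs header on stdin and prints e.g. "clean"
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}
```

Inside the VM, piping the output of dumpe2fs -h /dev/sda1 into filesystem_state prints only the state, which can then be compared against "clean".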



© IT-LINUXMAKER 2025