An fsck (file system check) is necessary when a file system is inconsistent or corrupted, typically after an unclean shutdown, a sudden power outage, or kernel I/O errors. Traditionally this is done with a live system such as GParted Live, which is made available as an ISO image on the Proxmox host and attached to the virtual CD drive of the VM in question. The ISO can be attached either via the Proxmox GUI or from the Bash terminal.
Here, however, we'll run the file system check from the Bash console on the Proxmox host. It's also assumed that the VM images are located on ZFS-based storage, as is common with Proxmox.
The idea is this:
The VM's hard disk and its partitions live inside the VM's disk image, which is a ZFS volume (zvol) on the host. In principle, fsck could then be run directly from the KVM host against the VM's system disk. However, it's not that simple: the zvol is a complete virtual drive with a partition table (e.g., MBR or GPT) and typically one or more partitions. For example, /dev/zvol/zp_100/vm-100-disk-0 contains what the guest sees as sda1, sda2, etc. So if
~# fsck /dev/zvol/zp_100/vm-100-disk-0
were simply run as is, it would not check the actual file systems but the raw device including the partition table, which is incorrect and can lead to data loss.
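To make the point about the partition table concrete, here is a minimal sketch that looks at the first two sectors of a disk or image and reports which partition-table signature it finds. The function name ptable_hint is hypothetical, and the sketch assumes 512-byte logical sectors. It is only a heuristic: the 0x55AA marker also closes some file system boot sectors, so "mbr" means "probably partitioned", not proof.

```shell
# Heuristic check for a partition table on a whole-disk device or image.
# Assumption: 512-byte logical sectors, so a GPT header would sit at byte 512.
ptable_hint() (
    dev=$1
    # A GPT header begins with the 8 ASCII bytes "EFI PART" in sector 1.
    gpt=$(dd if="$dev" bs=1 skip=512 count=8 2>/dev/null | tr -d '\0')
    # An MBR (and the protective MBR of a GPT disk) ends with 0x55 0xAA at offset 510.
    mbr=$(dd if="$dev" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$gpt" = "EFI PART" ]; then
        echo gpt
    elif [ "$mbr" = "55aa" ]; then
        echo mbr
    else
        echo none
    fi
)

# Example call on the host (read-only):
#   ptable_hint /dev/zvol/zp_100/vm-100-disk-0
```

If this prints gpt or mbr, the device is most likely a partitioned disk and fsck should be pointed at the individual partitions rather than at the device itself.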
So for this procedure, the program kpartx is installed first.
~# apt-get update
~# apt install -y kpartx
kpartx is a Linux command-line tool that reads the partition table of an image file or block device and makes the partitions it contains available to the system by creating device-mapper entries for them. This yields devices such as /dev/mapper/loop0p1, /dev/mapper/loop0p2, etc., which can be checked or mounted like normal partitions.
First, the ID of the VM is required, which can be determined this way.
~# qm list
VMID  NAME                  STATUS   MEM(MB)  BOOTDISK(GB)  PID
100   DNS-Master-Server     running  3072     150.00        1626922
101   Secondary-DNS-Server  running  3072     150.00        1358521
In addition, VM 100, the one to be checked, must be stopped:
~# qm stop 100
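Because everything that follows relies on the VM really being off, a script may want to verify this instead of trusting the stop command. A minimal sketch, assuming qm status prints a single line of the form "status: <state>"; the helper name vm_is_stopped is hypothetical.

```shell
# Succeeds only if the status text on stdin is exactly "status: stopped".
# Assumption: `qm status <vmid>` prints one line like "status: running".
vm_is_stopped() {
    [ "$(cat)" = "status: stopped" ]
}

# Example call on the host:
#   qm status 100 | vm_is_stopped || { echo "VM 100 is not stopped" >&2; exit 1; }
```

Comparing the whole line keeps the check strict: any unexpected output counts as "not stopped".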
Next, the location of the hard disk is required. It is given in the "scsi0:" line of
~# qm config 100
which shows where the hard drive is located and what it is called:
scsi0: zp_100:vm-100-disk-0,cache=writethrough,iothread=1,size=150G
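The disk line can be translated into a device path mechanically. The sketch below assumes that the storage ID (zp_100) is identical to the ZFS pool name, as it is in this example; on setups where the storage points at a dataset inside a pool, this does not hold and the path must be looked up differently. The function name zvol_path_from_config is hypothetical.

```shell
# Read `qm config <vmid>` output on stdin and print the zvol path for one disk key.
# ASSUMPTION: storage ID == ZFS pool name (true for zp_100 in this article).
zvol_path_from_config() {
    # $1 = disk key without the colon, e.g. scsi0
    awk -v key="$1:" '
        $1 == key {
            split($2, opts, ",")      # opts[1] = "zp_100:vm-100-disk-0"
            split(opts[1], vol, ":")  # vol[1] = storage, vol[2] = volume name
            print "/dev/zvol/" vol[1] "/" vol[2]
            exit
        }'
}

# Example call on the host:
#   qm config 100 | zvol_path_from_config scsi0
```

For the line shown above, this yields /dev/zvol/zp_100/vm-100-disk-0.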
On the host, this volume appears as a symlink to the actual block device:
~# ls -l /dev/zvol/zp_100/vm-100-disk-0
lrwxrwxrwx 1 root root 10 Jun 11 12:27 /dev/zvol/zp_100/vm-100-disk-0 -> ../../zd16
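Instead of reading the symlink target by eye, it can be resolved with readlink. A short sketch; the helper name zvol_kernel_name is hypothetical.

```shell
# Print the kernel device name (e.g. zd16) behind a /dev/zvol/... symlink.
zvol_kernel_name() {
    # readlink -f follows the link and prints the absolute target path
    basename "$(readlink -f "$1")"
}

# Example call on the host:
#   zvol_kernel_name /dev/zvol/zp_100/vm-100-disk-0
```

The resulting name is what shows up as the prefix of the mapper devices later on.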
Now kpartx can be applied, provided
~# qm status 100
reports the VM as stopped. The command
~# kpartx -av /dev/zd16
maps the partitions of the device the symlink points to, which can then be inspected with
~# blkid /dev/mapper/zd16p1
Now the file system check can be performed on the individual partitions.
~# fsck -f -y /dev/mapper/zd16p1
~# fsck -f -y /dev/mapper/zd16p2
~# fsck -f -y /dev/mapper/zd16p5
If fsck can't repair a file system, adding -c may help on ext2/3/4: it makes e2fsck run a read-only badblocks scan. Afterwards, the mapping must be undone.
~# kpartx -dv /dev/zd16
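The exit status of each fsck call above says whether anything was found or fixed. It is a bit mask, so several conditions can be set at once. The following sketch decodes it using the values documented in fsck(8); the function name describe_fsck_rc is hypothetical.

```shell
# Translate an fsck exit status (a bit mask) into readable text.
describe_fsck_rc() (
    rc=$1
    if [ "$rc" -eq 0 ]; then
        echo "no errors"
        exit 0
    fi
    msg=""
    for pair in "1:errors corrected" "2:reboot recommended" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage error" "32:cancelled by user" \
                "128:shared-library error"; do
        bit=${pair%%:*}    # number before the colon
        text=${pair#*:}    # description after the colon
        if [ $((rc & bit)) -ne 0 ]; then
            msg="${msg:+$msg, }$text"
        fi
    done
    echo "$msg"
)

# Example use after a check on the host:
#   fsck -f -y <partition>; describe_fsck_rc $?
```

A status with bit 4 set means errors remain, and the VM should probably not be started until they are dealt with.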
Then the VM can be started again.
~# qm start 100
~# qm status 100
Logged in to the Bash console on VM 100, the test
~# dumpe2fs -h /dev/sda1 | grep 'Error count'
is now run. If everything went well, no errors should be reported anymore, and the system disk is "clean" again.
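For an ext2/3/4 file system, the same test can be made a little stricter by also looking at the recorded file system state. A minimal sketch, assuming the header printed by dumpe2fs -h contains a "Filesystem state:" line and, only when errors have been logged, an "FS Error count:" line. The helper name ext_fs_is_clean is hypothetical.

```shell
# Read `dumpe2fs -h` output on stdin; succeed only if the state is "clean"
# and no non-zero error count is recorded.
ext_fs_is_clean() {
    awk -F': *' '
        $1 == "Filesystem state" { state = $2 }
        $1 == "FS Error count"   { errors = $2 }
        END { exit !(state == "clean" && errors + 0 == 0) }'
}

# Example call inside the VM:
#   dumpe2fs -h /dev/sda1 2>/dev/null | ext_fs_is_clean && echo "sda1 is clean"
```

A missing "FS Error count" line is treated as zero errors, which matches the grep test above printing nothing.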