When you manage a large number of virtual machines on a shared datastore, it can be very dangerous for different virtual machines to access the same disk file or volume. This can lead to data consistency problems and, in the worst-case scenario, the loss of all information on the disk.

We all know that things that should never happen sometimes do. That’s why it’s good practice to add an extra security layer to ensure that disk access is always controlled.

A tool for this purpose is listed on the libvirt webpage:
http://libvirt.org/locking.html

With libvirt-lock-sanlock, we create a connection between libvirt and sanlock. Whenever libvirt uses a disk or volume, sanlock creates a lock, so other virtual machines that try to use the same disk won’t be able to start.

Abiquo supports a configuration for using this libvirt functionality on KVM. The requirements are:

  • NFS resource (with at least 1 MB of space) to save the locks
  • KVM installed on CentOS 6
  • Shared datastore
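As a reference, enabling the sanlock lock manager in libvirt involves two configuration files, following the libvirt locking documentation. The values below are an illustrative sketch: adjust disk_lease_dir to your NFS mount point and give each hypervisor its own host_id.

```
# /etc/libvirt/qemu.conf — tell libvirt to use the sanlock lock manager
lock_manager = "sanlock"

# /etc/libvirt/qemu-sanlock.conf — sanlock plugin settings
auto_disk_leases = 1          # create a lease automatically for every disk
disk_lease_dir = "/sanlock"   # shared NFS location for the lock files
host_id = 1                   # unique ID for this hypervisor
require_lease_for_disks = 1   # refuse to start a VM if the lease can't be acquired
```

After changing these files, restart the sanlock and libvirtd services on the hypervisor.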

This will lock:

  • Main disks
  • Volumes

You will find the Abiquo documentation at:
http://wiki.abiquo.com/display/DEVOPS/Configuring+Sanlock+on+KVM+Centos+6

How libvirt-lock-sanlock works

To understand and troubleshoot libvirt and sanlock, first you must understand how they connect and work together.

Libvirt locks a disk file whenever a virtual machine that uses it is in the “running” state. The lock is released when:

  1. The virtual machine is stopped
  2. The virtual machine goes into the “suspended” state
  3. The lock timestamp is not updated for a certain time

To understand the third case, you first need to understand how sanlock works.

Sanlock requires a shared location to synchronize all the hypervisors. At this location, two different kinds of files will be created, and each hypervisor will be assigned a unique identifier.

A single file called “__LIBVIRT__DISKS__” will be created at the shared location. This file stores information about every hypervisor registered with sanlock. The useful data registered for each hypervisor is:

  1. Sanlock ID (own)
  2. Timestamp
  3. Sanlock registration ID (gen)

Also, for each volume or disk being used, a new lock file will be created. The useful data registered is:

  1. Resource (UUID that identifies the volume)
  2. Timestamp
  3. Owner (the ID of the hypervisor locking the disk)
  4. Sanlock register lock ID (gen)

When a hypervisor is correctly registered with sanlock, it constantly updates its own timestamp and the timestamps of the locks on the volumes it is using. When the hypervisor has an issue (for example, a power outage), the timestamps are no longer updated, so after a certain time the other hypervisors know that the disk can be used.

Every time the hypervisor is restarted, the gen in __LIBVIRT__DISKS__ is increased by one (on re-registration). So if a hypervisor is restarted without releasing its locks, once it registers again the gen field in its old lock files will no longer match, and the other hypervisors will know that those lock files are outdated.
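The timestamp-based liveness check can be illustrated with a small shell sketch: take two snapshots of the lockspace dump a watchdog interval apart, and any host whose timestamp has not advanced is presumed dead. This is only an illustration of the logic (with simplified host names adapted from the dump shown later in this article), not how sanlock itself is invoked.

```shell
#!/bin/sh
# Two snapshots of `sanlock direct dump /sanlock/__LIBVIRT__DISKS__`,
# taken some time apart (fields: offset, lockspace, host, timestamp, own, gen).
snap1='00000000 __LIBVIRT__DISKS__ host-a 0000644810 0001 0003
00000512 __LIBVIRT__DISKS__ host-b 0000004027 0002 0004'
snap2='00000000 __LIBVIRT__DISKS__ host-a 0000644890 0001 0003
00000512 __LIBVIRT__DISKS__ host-b 0000004027 0002 0004'

# Compare the timestamp (4th field) per host ID (5th field): a host whose
# timestamp did not advance between snapshots is presumed dead, and its
# locks can eventually be reclaimed by the other hypervisors.
dead=$(echo "$snap1" | while read off ls host ts own gen; do
  ts2=$(echo "$snap2" | awk -v o="$own" '$5 == o { print $4 }')
  [ "$ts" = "$ts2" ] && echo "host $own ($host) looks dead: timestamp stuck at $ts"
done)
echo "$dead"
```

Here host-a's timestamp advanced (644810 to 644890) while host-b's did not, so only host-b is flagged.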

Troubleshooting libvirt-sanlock

Some useful tips for troubleshooting libvirt-sanlock:

1. Sanlock status

[root@bc3blade6 sanlock]# sanlock status
daemon 2eee3c5f-b57c-4220-8c91-f9bc26f42ccb.bc3blade6.
p -1 helper
p -1 listener
p -1 status
p 7540 ABQ_1c310eb0-0263-4aa8-aef9-0a4172248f68
p 7580 ABQ_cc4b4473-9545-49d5-8441-7e3a23ed8968
p 7609 ABQ_613dc482-5315-4980-9fac-9196d63dcc83
s __LIBVIRT__DISKS__:1:/sanlock/__LIBVIRT__DISKS__:0
r __LIBVIRT__DISKS__:7e32cf41ef5f940a9dd3fde71e4c18fc:/sanlock/7e32cf41ef5f940a9dd3fde71e4c18fc:0:7 p 7609
r __LIBVIRT__DISKS__:7b0ffb9006b82366fcb97cb2e6ea84e5:/sanlock/7b0ffb9006b82366fcb97cb2e6ea84e5:0:8 p 7580
r __LIBVIRT__DISKS__:9439e2b9e36fdeddc4865da01a62b432:/sanlock/9439e2b9e36fdeddc4865da01a62b432:0:3 p 7540

This command gives us some useful information.

s __LIBVIRT__DISKS__:1:/sanlock/__LIBVIRT__DISKS__:0

__LIBVIRT__DISKS__ is the lockspace created by libvirt. This tells us that sanlock is registered at the shared location.


p 7540 ABQ_1c310eb0-0263-4aa8-aef9-0a4172248f68

This tells us that sanlock is monitoring process 7540, which is the process of the virtual machine ABQ_1c310eb0-0263-4aa8-aef9-0a4172248f68.


r __LIBVIRT__DISKS__:7e32cf41ef5f940a9dd3fde71e4c18fc:/sanlock/7e32cf41ef5f940a9dd3fde71e4c18fc:0:7 p 7609

This line describes a disk lock. As you can see at the end of the line, it is held by process 7609, so it comes from virtual machine ABQ_613dc482-5315-4980-9fac-9196d63dcc83. The colon-separated fields tell us (in order):
– The lockspace it is registered to
– The UUID of the lock
– The path of the lock file
– The offset within the file
– The lock version (lver)
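Putting the p and r lines together, each lock can be mapped back to its virtual machine by joining on the PID. A quick awk sketch over the sample status output shown above (in practice you would pipe `sanlock status` in instead of the here-string):

```shell
#!/bin/sh
# Sample `sanlock status` lines from this article: p lines map a PID to a
# VM name, r lines map a lock resource to the PID that holds it.
status='p 7540 ABQ_1c310eb0-0263-4aa8-aef9-0a4172248f68
p 7580 ABQ_cc4b4473-9545-49d5-8441-7e3a23ed8968
p 7609 ABQ_613dc482-5315-4980-9fac-9196d63dcc83
r __LIBVIRT__DISKS__:7e32cf41ef5f940a9dd3fde71e4c18fc:/sanlock/7e32cf41ef5f940a9dd3fde71e4c18fc:0:7 p 7609
r __LIBVIRT__DISKS__:7b0ffb9006b82366fcb97cb2e6ea84e5:/sanlock/7b0ffb9006b82366fcb97cb2e6ea84e5:0:8 p 7580'

# Remember the VM name for each PID, then print "lock UUID -> VM" for
# every resource line (field 2 of an r line is lockspace:uuid:path:offset:lver).
mapping=$(echo "$status" | awk '
  $1 == "p" { vm[$2] = $3 }
  $1 == "r" { split($2, f, ":"); print f[2] " -> " vm[$4] }')
echo "$mapping"
```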

2. sanlock direct dump <path>

With this command, we can read the information directly from the files at the shared location. For example, the lockspace file:

[root@bc3blade6 sanlock]# sanlock direct dump /sanlock/__LIBVIRT__DISKS__
offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                   __LIBVIRT__DISKS__  2eee3c5f-b57c-4220-8c91-f9bc26f42ccb.bc3blade6. 0000644810 0001 0003
00000512                   __LIBVIRT__DISKS__  cac0938e-74f7-4d38-a5ab-2930044479b0.bc3blade7. 0000004027 0002 0004
00001024                   __LIBVIRT__DISKS__  96eef8df-1d1b-4397-860a-cce3e20905ed.bc3blade8. 0000643532 0003 0002

Or the information about a lock:

[root@bc3blade6 sanlock]# sanlock direct dump /sanlock/7e32cf41ef5f940a9dd3fde71e4c18fc
offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                   __LIBVIRT__DISKS__                 7e32cf41ef5f940a9dd3fde71e4c18fc 0000006784 0001 0003 7

Possible problems

After everything is correctly configured, it should all work smoothly. However, during the configuration process you may run into problems like the following:

1. The hypervisor does not register correctly after a reboot, and I can see several __LIBVIRT__DISKS__ files on the NFS

Check the NFS mount options. An incorrect configuration may create different pointers to the __LIBVIRT__DISKS__ file.
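As an illustration only (the right options depend on your NFS server; the export name below is an assumption, and this is not an official Abiquo recommendation), a hard, synchronous mount with attribute caching disabled avoids stale views of the lock files:

```
# /etc/fstab — illustrative entry; "nfs-server:/sanlock" is a placeholder export
nfs-server:/sanlock  /sanlock  nfs  rw,hard,sync,actimeo=0  0 0
```

`actimeo=0` turns off NFS attribute caching, so every hypervisor sees the same view of the lock directory.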

2. HA is not working correctly with sanlock

You might see that after an HA operation, the messages log shows that the disk is still locked. Sanlock takes some time to recognize that the disk is unlocked because of the timestamp. In Abiquo tests, setting HA checks every minute works fine. If you notice this issue, try increasing the time between HA checks.

3. The disk appears as locked and libvirt won’t unlock it

If sanlock did not unlock the disk and you are sure the file should be unlocked, delete the lock file and the disk will be unlocked automatically.

4. I have a large number of unused lock files

Lock files are not deleted when a disk is unlocked. You can use the “virt-sanlock-cleanup” command to clean up old lock files; we recommend configuring a cron job for this purpose.
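For instance, a daily cron entry could look like this (the schedule is an arbitrary choice, and the binary path may differ on your distribution):

```
# /etc/cron.d/virt-sanlock-cleanup — remove lock files for unused disks, daily
30 3 * * *  root  /usr/sbin/virt-sanlock-cleanup
```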

5. The hypervisor can’t access the sanlock shared folder

If a hypervisor can’t access the sanlock shared folder, the other hypervisors might not be informed that a disk is locked. We recommend creating the sanlock shared NFS folder on the same device as the shared datastore, and monitoring access to the NFS.