Backup of large VM reports errors and lists no backup, but space on the backup server is used

I tried backing up a VM that we use as a database server and that is quite large (around 800 GB). The backup took at least a day and a half. At the end I saw this error message:

Creating a new virtual machine snapshot for MYDB
Uploaded file /vmfs/volumes/VM2-SSD/MYDB/MYDB.vmdk
Uploaded file /vmfs/volumes/VM2-SSD/MYDB/MYDB_3.vmdk
Uploading file MYDB_3-flat.vmdk
Uploaded file MYDB_3-flat.vmdk 6.62MB/s 1 day 06:05:43
Uploading file MYDB-flat.vmdk
Uploaded file MYDB-flat.vmdk 13.52MB/s 01:40:57
Uploaded file MYDB.vmx
Uploaded file MYDB.vmxf
Removing all snapshots of MYDB
Command '/bin/vim-cmd vmsvc/snapshot.removeall 6' times out
Exception in thread Thread-12:
Traceback (most recent call last):
File "threading.py", line 801, in __bootstrap_inner
File "threading.py", line 1073, in run
File "vertical_backup.py", line 31, in runCommandTimeout
File "vertical_utils.py", line 46, in LOG_ERROR
File "vertical_utils.py", line 80, in log
VerticalBackupError: COMMAND_RUN Command '/bin/vim-cmd vmsvc/snapshot.removeall 6' times out

Remove All Snapshots:
Command '/bin/vim-cmd vmsvc/snapshot.removeall 6' returned -9
/opt/verticalbackup # df -h

=============================================================================================

/opt/verticalbackup # vertical list
Vertical Backup 1.0.4
Storage set to sftp://user@172.1.1.2//home/vmwarebackup

No backup is listed, but space on the backup volume has been allocated for this large backup, so something must be there. What should I do?

The default timeout for removing the snapshots is 1200 seconds, which was clearly not enough in your case.
Because the timeout aborted the run, the backup file never got a chance to be uploaded, although all chunks uploaded up to that point remain in the storage. That is why vertical list shows no backup even though space is used.

The real issue here is that the backup file should be uploaded before the snapshots are removed. Then it doesn’t matter if the final snapshot removal takes too long; it will most likely have finished by the time the next backup starts.

So I made a change to move the uploading of the backup file to before the snapshot removal and uploaded a new version 1.0.5. You can download it from http://verticalbackup.com/esxi/vertical.
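
As a rough illustration of the failure mode above, the sketch below runs a command under a time limit and kills it if the limit is exceeded. It is not Vertical Backup's code: run_with_timeout is a hypothetical helper written for this thread, and 1200 seconds is simply the default quoted above. In a shell, a command killed with SIGKILL exits with status 137 (128 + 9); Python reports the same event as -9, which matches the "returned -9" line in the log.

```shell
# Hypothetical helper (not part of Vertical Backup): run a command and
# kill it with SIGKILL if it is still running after LIMIT seconds.
# Usage: run_with_timeout LIMIT command [args...]
run_with_timeout() {
    limit=$1
    shift

    # Start the command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleep for the limit, then kill the command if it is still there.
    ( sleep "$limit"; kill -9 "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # Wait for the command; a SIGKILLed command yields status 137 (128 + 9).
    status=0
    wait "$cmd_pid" || status=$?

    # The command is done, so stop the watchdog (its sleep may linger
    # harmlessly until it expires).
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true
    return "$status"
}
```

For example, run_with_timeout 1200 vim-cmd vmsvc/snapshot.removeall 6 would give up after 20 minutes. With the ordering introduced in 1.0.5, a timeout at this step comes after the backup file has been uploaded, so the backup itself is not lost.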

Ok thanks will download! What should I do after downloading the latest version? Just run another backup and it will remove the previous backup attempt?

Currently there is space on the backup drive allocated to this backup attempt, and I’m not sure how to free it up.

Right, just run another backup which will be much faster since many chunks have been uploaded by the previous backup attempt. However, some chunks uploaded by the previous backup attempt will become unreferenced due to changes in the disk files, so you’ll need to clean up these chunks after the backup is finished.

This command will print out all unreferenced chunks:

vertical prune -d --exclusive --exhaustive

Without the -d option it will remove all unreferenced chunks (assuming no backup to the same storage is in progress):

vertical prune --exclusive --exhaustive 

My network connection died while a backup was running, and I was wondering how to check the status of the running backup.

I get “Another backup job is in progress” when I type vertical backup VM.

You can run pgrep vertical to see if the backup process is running, pkill vertical to kill it.

If pgrep vertical doesn’t report any process while you still get “Another backup job is in progress” then you can manually remove the file .vertical/plock.
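
The two checks above can be combined so the lock is only removed when nothing is running. This is a minimal sketch: clear_stale_lock is a hypothetical helper, the default path is the relative .vertical/plock mentioned above (adjust it to wherever your .vertical directory lives), and pgrep only sees processes on the local host.

```shell
# Hypothetical helper: remove the lock file only when no vertical process
# is running on this host.
# Usage: clear_stale_lock [path-to-plock]
clear_stale_lock() {
    lock_file=${1:-.vertical/plock}

    # pgrep exits with 0 when at least one matching process exists.
    if pgrep vertical >/dev/null 2>&1; then
        echo "vertical is still running; leaving $lock_file in place" >&2
        return 1
    fi

    if [ -e "$lock_file" ]; then
        rm -f "$lock_file" && echo "removed stale lock $lock_file"
    else
        echo "no lock file at $lock_file"
    fi
}
```

Run it from the directory that contains .vertical, or pass the full path to plock as the first argument.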

I see the process… do you know if the output of the process gets written to a log file?

No, it doesn’t log to a file unless you redirected stdout to a file when you started the program. You can run strace -p pid; if the backup is running, it will print out tons of messages.
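
If you want a record of a long-running backup, one option is to redirect its output yourself when you start it and detach it from the terminal, so that a dropped SSH session does not take the job down with it. The sketch below assumes nohup is available in your shell; start_logged and the log file name in the usage line are made-up names for illustration.

```shell
# Hypothetical helper: start a command in the background, immune to
# hangups, with stdout and stderr written to a log file.
# Usage: start_logged LOG_FILE command [args...]
start_logged() {
    log_file=$1
    shift

    # nohup keeps the command alive if the terminal goes away;
    # 2>&1 sends error messages to the same log as normal output.
    nohup "$@" >"$log_file" 2>&1 &

    echo "started PID $!, logging to $log_file"
}
```

For example: start_logged /vmfs/volumes/VM2_Data/vertical-MYDB.log vertical backup MYDB, then follow progress with tail -f on that file. Putting the log on a datastore rather than in /tmp avoids filling a small scratch area.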

Thanks for your response… Now whenever I execute a vertical command I get the following error (was trying to do a vertical list):

File "vertical_main.py", line 8, in <module>
File "/tmp/pip-build-VI6Ne5/pyinstaller/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
File "vertical_backup.py", line 23, in <module>
File "/tmp/pip-build-VI6Ne5/pyinstaller/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
File "vertical_snapshot.py", line 18, in <module>
OSError: [Errno 28] No space left on device
Failed to execute script vertical_main

But I have space available on the vmware server and the data backup server:

VMFS-5 953.0G 894.2G 58.8G 94% /vmfs/volumes/VM2-SSD
VMFS-5 1.8T 1.3T 493.0G 73% /vmfs/volumes/VM2_Data
vfat 4.0G 22.8M 4.0G 1% /vmfs/volumes/52721af7-72e1bda9-b39b-a0369f294c78
vfat 249.7M 158.8M 90.9M 64% /vmfs/volumes/8808c774-0879865b-3767-e2c4cfe15f86
vfat 249.7M 8.0K 249.7M 0% /vmfs/volumes/c6cee871-7b0506fd-f898-2a5e09b59206
vfat 285.8M 191.3M 94.6M 67% /vmfs/volumes/52721aee-dd51aa53-fe0e-a0369f294c78

VMFS-5 volume is getting close to being full but not there yet, not sure what the issue is.

Your /tmp dir is full. There might be some folders similar to /tmp/pip-build-VI6Ne5 which were left by previous Vertical Backup processes not exiting cleanly. You can remove these folders to free up space.
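
A quick way to check this is sketched below. On ESXi, /tmp normally sits on a small ramdisk that df does not list, which would explain why the datastores can show free space while /tmp is full; vdf -h reports ramdisk usage. report_tmp_usage is a hypothetical helper, and the vdf call is skipped on systems that do not have it.

```shell
# Hypothetical helper: show how much a scratch directory holds and,
# on ESXi, how full the ramdisks are.
# Usage: report_tmp_usage [directory]
report_tmp_usage() {
    dir=${1:-/tmp}

    # Total size of everything under the directory, in kilobytes.
    du -sk "$dir"

    # List all entries, including hidden ones that a plain ls would skip.
    ls -la "$dir"

    # vdf is ESXi-specific and also reports ramdisk usage; skip it elsewhere.
    if command -v vdf >/dev/null 2>&1; then
        vdf -h
    fi
}
```

If the directory looks empty but the tmp ramdisk is still reported as full, the space may be held by files that were deleted while a process still had them open.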

The /tmp dir is empty actually…

I’m getting new issues when running the backup:

Storage set to sftp://user@172.16.8.55//home/vmwarebackup
Listing all virtual machines
Backing up MYDB, id: 6, vmx path: /vmfs/volumes/VM2-SSD/MYDB/MYDB.vmx, guest os: centos64Guest
No previous backup found
Virtual machine MYDB is powered on
Removing all snapshots of MYDB
Remove All Snapshots:
Remove all snapshots failed
Command ‘/bin/vim-cmd vmsvc/snapshot.removeall 6’ returned 1

The backup is not running. I killed the previous vertical process because I had no idea what it was doing.

You can do a manual snapshot consolidation in vSphere Client (right-click the VM -> Snapshot -> Consolidate), wait until it has completed, and then start a backup again.

Is this something I can do from the command line? I’m kind of shuffling VMware hosts around on the 3-host Essentials license.

I think vSphere Client is free; see, for example, https://communities.vmware.com/thread/445148

OK, I was able to fix the issue by running the following command manually:

vim-cmd vmsvc/snapshot.removeall [VMID]
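
For anyone who needs to find the VMID first: vim-cmd vmsvc/getallvms lists each VM with its Vmid in the first column and its name in the second (MYDB has id 6 in the logs above). The sketch below looks the ID up by name and then removes all snapshots. remove_all_snapshots is a hypothetical helper and assumes the VM name contains no spaces.

```shell
# Hypothetical helper: look up a VM's ID by name, then remove all of its
# snapshots. Assumes the VM name has no spaces.
# Usage: remove_all_snapshots VM_NAME
remove_all_snapshots() {
    vm_name=$1

    # getallvms prints one VM per line: Vmid, Name, File, Guest OS, ...
    vmid=$(vim-cmd vmsvc/getallvms | awk -v name="$vm_name" '$2 == name { print $1 }')

    if [ -z "$vmid" ]; then
        echo "VM $vm_name not found" >&2
        return 1
    fi

    # Run by hand like this, the command is not subject to the backup
    # tool's timeout, so it can take as long as it needs.
    vim-cmd vmsvc/snapshot.removeall "$vmid"
}
```

For example: remove_all_snapshots MYDB. On a large, busy VM this can run for a long time, so let it finish before starting the next backup.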