"Another backup job is in progress" email when the scheduled job kicks off


#1

I’ve set up a scheduled job using the vertical cron command.

It looks like this in /etc/rc.local.d/local.sh

#VERTICALBACKUP
/bin/kill $(cat /var/run/crond.pid)
/bin/echo '#VERTICALBACKUP' >> /var/spool/cron/crontabs/root
/bin/echo '0 */12 * * * /vmfs/volumes/534fcc7f-1ca88523-5866-a0b3cce17ffe/verticalbackup/vertical backup --email --exclude-disk *Toshiba_E300* &> /vmfs/volumes/Crucial\ M500/verticalbackup/vertical.log' >> /var/spool/cron/crontabs/root
/usr/lib/vmware/busybox/bin/busybox crond

Here’s what it looks like in /var/spool/cron/crontabs/root

#min hour day mon dow command
1    1    *   *   *   /sbin/tmpwatch.py
1    *    *   *   *   /sbin/auto-backup.sh
0    *    *   *   *   /usr/lib/vmware/vmksummary/log-heartbeat.py
*/5  *    *   *   *   /sbin/hostd-probe ++group=host/vim/vmvisor/hostd-probe
*/2 * * * * /usr/lib/vmware/vsan/bin/vsanObserver.sh
#VERTICALBACKUP
0 */12 * * * /vmfs/volumes/534fcc7f-1ca88523-5866-a0b3cce17ffe/verticalbackup/vertical backup --email --exclude-disk *Toshiba_E300* &> /vmfs/volumes/Crucial\ M500/verticalbackup/vertical.log

Whenever the job kicks off at the scheduled time, I always get a failure email:

Received at 12:00pm:

2018-02-08 12:00:05.509582 INFO PROGRAM_VERSION Vertical Backup 1.1.5
2018-02-08 12:00:08.515325 INFO LICENSE_INFO Trial license expires on 2018-02-18
2018-02-08 12:00:08.515656 ERROR BACKUP_BUSY Another backup job is in progress

Then I get another email an hour or so later confirming the status of the job.

Received at 12:41pm:

2018-02-08 12:00:05.738726 INFO PROGRAM_VERSION Vertical Backup 1.1.5
2018-02-08 12:00:08.467337 INFO LICENSE_INFO Trial license expires on 2018-02-18
2018-02-08 12:00:08.550497 INFO STORAGE_CREATE Storage set to /vmfs/volumes/boxroomnas01/Vertical_Backup/
2018-02-08 12:00:08.702639 INFO SNAPSHOT_GETALLVM Listing all virtual machines



...



2018-02-08 12:40:56.622548 INFO BACKUP_DONE Backup OpenVPN Access Server 2.0.12@BoxroomESXi01 at revision 29 has been successfully completed
2018-02-08 12:40:56.622807 INFO BACKUP_STATS Total 8198 chunks, 8192.61M bytes; 21 new, 18.53M bytes, 2.14M uploaded
2018-02-08 12:40:56.622893 INFO BACKUP_TIME Total backup time: 00:01:46
2018-02-08 12:40:56.623934 INFO SNAPSHOT_REMOVE Removing all snapshots of OpenVPN Access Server 2.0.12

This happens twice a day.
So I’m guessing I’ve somehow managed to schedule the job to execute twice, but I can’t see where. The jobs are 12 hours apart, and the backups take about an hour now that the first ones are done.

If I run ps -i | grep "vertical" between jobs, nothing comes back.

Any other places I should be checking?


#2

You have two instances of /usr/lib/vmware/busybox/bin/busybox crond running. You can confirm this by running ps -i | grep busybox.
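A rough sketch of that check, counting matches instead of eyeballing them. It assumes `ps -c` prints full command lines on this ESXi build (so crond can be told apart from other BusyBox applets); the function name `count_crond` is made up for illustration.

```shell
# Count the lines of a process listing that mention crond.
# Reads the listing on stdin, so it also works on saved output.
count_crond() {
    # [c]rond stops a grep process from matching its own command line
    grep -c '[c]rond'
}

# Assumed usage on the host (ps -c is an assumption, see above):
# ps -c | count_crond
```

Anything above 1 means more than one crond is reading the same crontab, so every job fires once per instance.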


#3

Nice, cheers! You are right.

33407    33407  busybox                                
1192385  1192385  busybox 

Not sure how that happened.


#4

So, it looks like BusyBox crond is somehow starting on boot but not updating crond.pid.

So when local.sh runs, it’s unable to kill the existing instance and ends up launching another.
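One possible workaround, as a sketch only: have local.sh find crond in the process list instead of trusting crond.pid. This assumes `ps -c` shows full command lines and that the first column is an ID `kill` accepts (both ID columns are identical in the output above). `crond_pids` and `restart_crond` are hypothetical helper names, not part of ESXi or Vertical Backup.

```shell
# Print the first column (ID) of every process-list line mentioning crond.
crond_pids() {
    # [c]rond avoids matching this awk process; the numeric test skips headers
    awk '/[c]rond/ && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Kill every running crond, then start a single fresh instance.
# Assumption: ps -c lists full command lines on this host.
restart_crond() {
    for pid in $(ps -c | crond_pids); do
        /bin/kill "$pid"
    done
    /usr/lib/vmware/busybox/bin/busybox crond
}

# In local.sh this would stand in for the kill line and the final crond line:
# restart_crond
```

This does not explain why crond.pid is stale, but it keeps local.sh from leaving a second instance behind.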

I need to get to the bottom of why this is happening, but it’s certainly not a Vertical Backup issue.