Purging over slow net connection

AshMcK · June 4, 2018, 3:04pm

Hello,

We are now looking at purging old backups from the off-site backup server. Unfortunately, it’s on a slow internet connection. I ran the purge command manually for around 1.5 hours, but it didn’t look like it was making any progress.

What happens in the background in terms of running this command? Would it actually work if we’re attempting to do it over a slow connection?

What would your suggestions be?

Thanks

gchen · June 5, 2018, 2:50pm

Prune can be slow if there are a large number of chunks to delete or rename. I would recommend using Duplicacy for this purpose – I’m adding a -threads option to the prune command which allows you to use multiple threads to delete or rename chunks. This should be done today or tomorrow.
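To illustrate why more threads help on a slow link: each delete or rename is a separate round trip to the storage, so running several at once overlaps the waiting time. The following is only a minimal sketch of that idea, not Duplicacy's actual implementation. `delete_chunk` and `prune_chunks` are hypothetical names, and the `time.sleep` call stands in for network latency.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def delete_chunk(chunk_id, latency=0.01):
    """Hypothetical stand-in for one remote delete.

    The sleep mimics a network round trip; a real backend would
    issue the delete request to the storage here.
    """
    time.sleep(latency)
    return chunk_id


def prune_chunks(chunk_ids, threads=4):
    """Delete every chunk using a pool of worker threads.

    Returns the ids that were deleted, in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(delete_chunk, chunk_ids))


if __name__ == "__main__":
    ids = [f"chunk-{i:04d}" for i in range(40)]
    for n in (1, 8):
        start = time.perf_counter()
        prune_chunks(ids, threads=n)
        print(f"{n} thread(s): {time.perf_counter() - start:.2f}s")
```

With simulated latency the run with more threads should finish noticeably faster, because the threads spend most of their time waiting on the network rather than using the CPU.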

AshMcK · June 6, 2018, 7:52am

Thanks for the response @gchen.

Why would it be better to use Duplicacy? In theory, Vertical Backup does exactly what we want?

Does Duplicacy do things like retry when a failure occurs? Or are you able to view the backup progress?

Does the comment about using threads when purging apply to Vertical Backup, Duplicacy, or both?

Thanks!

gchen · June 7, 2018, 1:21am

ESXi imposes a limit on the maximum memory a process can use, which is usually about 700M to 800M. You can run Duplicacy on most computers so there isn’t such a limit.

Multi-threaded pruning in Duplicacy has been implemented by https://github.com/gilbertchen/duplicacy/pull/441. If you need a binary, please let me know.

AshMcK · June 7, 2018, 8:58am

Hi @gchen.

Sorry, I’m not quite following…

Are you saying we should move away from Vertical Backup and instead use Duplicacy?
Or should we be using them both in some way?

Also, I’m not sure if Duplicacy has a retry option for SFTP if the net goes down?

Thanks.

gchen · June 7, 2018, 6:07pm

Sorry about the confusion. I was referring to the prune command only. Duplicacy doesn’t run on ESXi at all and that was why I developed Vertical Backup.

Duplicacy does not have a retry option for the SFTP backend. This is one of the frequently requested features, so I will consider implementing it soon.
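For readers unfamiliar with what such a retry option usually involves, here is a generic sketch of retrying with exponential backoff. It does not describe anything Duplicacy currently does. `with_retry` is a hypothetical helper, and treating `ConnectionError` as the signal for a dropped connection is an assumption, since a real SFTP library may raise different exceptions.

```python
import time


def with_retry(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait and try again.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...).
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap a single upload or delete, for example `with_retry(lambda: upload(chunk))`, where `upload` is whatever function performs the transfer.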

AshMcK · June 11, 2018, 8:38am

Sorry @gchen. I’m still not understanding why Duplicacy would be better than using Vertical. What are the benefits of us using Duplicacy instead of Vertical?

gchen · June 12, 2018, 1:04am

If the storage contains many backups with a large number of chunks, the prune command needs a lot of memory to construct the list of chunks. A process running inside ESXi can use at most 700 to 800 MB of memory, which may not be enough to hold the complete chunk list needed by the prune command. With Duplicacy, there is no such artificial limit set by the OS, and the amount of memory available to it is usually much larger.
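As a rough way to reason about this limit, the arithmetic can be sketched as below. The per-entry cost of 200 bytes is purely an assumed figure for illustration and not a measured value for either program. Only the 700 MB ceiling comes from the discussion above.

```python
def chunk_list_memory_mb(num_chunks, bytes_per_entry=200):
    """Estimate memory for an in-memory chunk list, in MB.

    bytes_per_entry is an assumed average cost per chunk
    (hash plus bookkeeping), not a measured figure.
    """
    return num_chunks * bytes_per_entry / (1024 * 1024)


def fits_in_limit(num_chunks, limit_mb=700, bytes_per_entry=200):
    """Return True if the estimated chunk list fits under limit_mb."""
    return chunk_list_memory_mb(num_chunks, bytes_per_entry) <= limit_mb


if __name__ == "__main__":
    for n in (1_000_000, 5_000_000, 10_000_000):
        print(n, round(chunk_list_memory_mb(n)), "MB", fits_in_limit(n))
```

Under that assumption, a storage with a few million chunks would approach or exceed the ESXi ceiling, which is why running prune from a regular machine is suggested.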