How to code? — 12

Thalapathy Krishnamurthy
4 min read · Nov 29, 2019

Okay, we are back after a lull. Please read the previous posts to catch up.

There is one important thing I want to go over again here. The main problem in backing up files is ensuring the integrity of the data. Specifically, what happens when bakitup is abruptly stopped in the middle of a backup? And where should it begin again when it is restarted?

You may have noticed that I have been circling this question: trying to answer it, finding flaws in the answer, and then finding another way. Every problem has a central, lingering question like this. We may not have the correct answer at first, but we work our way to it.

The last answer we arrived at was to compile a list of files in the source folders (the folders we need to back up) whose size differs from that of their copy in the destination, and then go over this list, copying each file into the destination folder. This seems to be the simplest and most doable approach after examining the other options.

So whenever bakitup runs, it compares file sizes between source and destination and copies the files on the resulting list. There is one additional check we want to add: if the sizes are the same, we also check whether the source file's timestamp is later than the destination's, and include such files as well. This ensures that files modified without any change in size are not left out.
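The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not bakitup's actual code: the function names `needs_backup` and `files_to_copy` are made up for this example, and I am assuming that a file missing from the destination also counts as needing a backup.

```python
import os


def needs_backup(src_path, dst_path):
    """Return True if src_path should be copied over dst_path."""
    if not os.path.exists(dst_path):
        return True  # assumption: a file never backed up must be copied
    src, dst = os.stat(src_path), os.stat(dst_path)
    if src.st_size != dst.st_size:
        return True  # changed, or only partially copied last time
    # Same size: fall back to the timestamp check.
    return src.st_mtime > dst.st_mtime


def files_to_copy(source_dir, dest_dir):
    """List files (relative to source_dir) that need to be backed up."""
    pending = []
    for root, _dirs, names in os.walk(source_dir):
        for name in names:
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, source_dir)
            if needs_backup(src_path, os.path.join(dest_dir, rel)):
                pending.append(rel)
    return sorted(pending)
```

Note that the timestamp check only settles down if the copy step keeps the destination's timestamp at or after the source's, for example by copying with `shutil.copy2`, which preserves the modification time.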

Now let us imagine a scenario where a user uses bakitup to back up the photos on his hard disk to an external USB drive. He has his photos in a couple of folders, and he would like to ensure he does not lose any of them. He travels a lot and carries his laptop, so there is a high probability of his disk going bad. This made him install bakitup. Whenever he works on his laptop, he keeps the external drive connected to it. Assuming bakitup is configured to run automatically and begin the backup whenever an external drive is connected, the source folders, containing say thousands of photos, are constantly backed up to the external drive. The user keeps dumping new photos every day, and bakitup keeps pace by finding the photos that were newly added, along with those it only partially copied last time, which get picked up by the size difference checked in our copy() algorithm.

This use case is a pretty good one for validating bakitup, and it appears that bakitup will survive it easily. Let me go forward with the story. Imagine the user travels to a city, tries to power on the laptop after reaching the hotel, and finds that it is not booting up. Now he panics. He cannot see his photos. However, he pats himself on the back for making the wise decision to use bakitup, and he believes all his photos are safely on the external drive. He trusts bakitup to have done the job. He connects the external drive to the hotel computer to see if the photos are intact. And he gets a bit of a shock: a bunch of photos he took on a beach and uploaded from his camera the previous evening are missing. He is pretty upset.

Now how did this happen?

As we have been discussing, the user must have shut down his PC while bakitup was running, so not all the photos were transferred to the external drive during the last backup run. Now who is responsible for this? The user who shut down abruptly, or bakitup, which failed to copy the remaining photos?

These are gaps in thinking about or visualizing the functions of the software. Obviously, one cannot expect a naive user of an application like bakitup to think about this. He bought it with the expectation that bakitup would take care of preserving his files. So it is important that the developer considers this type of scenario and handles it adequately while implementing bakitup.

Is it possible for a developer to handle this? It is, provided bakitup gets a signal from the OS when the computer is shutting down. Bakitup can then notify the user that the backup is not yet complete and that he may be left with an incomplete backup. This is a way to alert naive users to the consequences of their actions.
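Here is a rough Python sketch of that idea, under some loud assumptions. On Unix-like systems a shutdown typically delivers `SIGTERM` to running processes, and Python's `signal` module lets us catch it; Windows notifies applications differently, so this would not be enough there. The names `ShutdownGuard`, `run_backup`, `warning_for`, and the `copy_one` callback are all invented for illustration.

```python
import signal


class ShutdownGuard:
    """Remembers that the OS asked us to stop (e.g. at shutdown)."""

    def __init__(self):
        self.stop_requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: just set a flag for the copy loop to see.
        self.stop_requested = True

    def install(self):
        # Must be called from the main thread of the program.
        signal.signal(signal.SIGTERM, self.handle)


def run_backup(files, copy_one, guard):
    """Copy files one by one; return those left uncopied if stopped."""
    remaining = list(files)
    while remaining and not guard.stop_requested:
        copy_one(remaining[0])
        remaining.pop(0)  # only drop a file once its copy has finished
    return remaining


def warning_for(remaining):
    """Build the message to show the user, or None if nothing is pending."""
    if not remaining:
        return None
    return f"Backup incomplete: {len(remaining)} file(s) not yet copied."
```

In a real run we would call `guard.install()` once at startup, and show the message from `warning_for()` through whatever notification mechanism the OS offers. How much time the OS gives a program before it is killed varies, so the warning may not always be seen.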

There is one more thing that can be done. Bakitup can run all the time in the background. Whenever a file changes in the source folder, bakitup can sense it, either through a notification from the OS or by periodically scanning the source folder for changes, and back the file up immediately to the external drive if it is connected. This is called syncing. Google Photos does this on your mobile phone. In our case, providing this feature may be very useful, as it can narrow the loss. Instead of a large number of files missing from the backup after a shutdown, the number can come down to one or two, which reduces the damage.
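The periodic-scan variant is the easier of the two to sketch. The idea is to take a snapshot of the source folder (each file's size and timestamp), compare it with the previous snapshot, and back up whatever differs. The names `snapshot`, `changed_since`, `watch`, and the `backup_one` callback are hypothetical, and the five-second interval is an arbitrary choice for illustration.

```python
import os
import time


def snapshot(folder):
    """Map each file (relative path) to its (size, modified time)."""
    state = {}
    for root, _dirs, names in os.walk(folder):
        for name in names:
            path = os.path.join(root, name)
            info = os.stat(path)
            state[os.path.relpath(path, folder)] = (info.st_size, info.st_mtime)
    return state


def changed_since(previous, current):
    """Files that are new, or whose size or timestamp changed."""
    return sorted(rel for rel, sig in current.items() if previous.get(rel) != sig)


def watch(folder, backup_one, interval_seconds=5.0, rounds=None):
    """Scan the folder repeatedly, backing up each changed file.

    rounds=None scans forever; a number limits the scans (handy for tests).
    """
    previous = {}
    done = 0
    while rounds is None or done < rounds:
        current = snapshot(folder)
        for rel in changed_since(previous, current):
            backup_one(rel)
        previous = current
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
```

Scanning is simple but gets slow with very large folders, which is where OS change notifications become attractive. This sketch also ignores deleted files and does not check whether the external drive is connected.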
