DLT tape recovery - can data be recovered after an overwrite?

Posted by Mark on Nov 21, 2008

An interesting data recovery job, this one, especially as it had been sent to three other companies in continental Europe before it found its way across the moat to dear old Blighty.

The data on this tape was an Amanda backup from a Sun UNIX system. In the region of 80GB had been written to an SDLTII data cartridge for safekeeping, but no-one had set the tape to write protect. Consequently, when one of the IT staff went to recover some data from the tape, something went wrong and the DLT was re-initialised. From being a vital system backup brimming with data ready to be accessed, the DLT became just another scratch tape waiting to be used. Needless to say, the accounts department who needed the data were none too happy when their data could no longer be restored.

So what happens when a DLT is re-initialised? Is all of the data on the tape destroyed? No, it is not. There is a policy with backup tapes that the last thing you wrote is the last thing you will get back; access is not given to old data once a tape is re-used. But much of the data is still present and can be recovered.

We are not talking about the sci-fi babble of getting data back that has been over-recorded. Despite what I read recently on a forum, that “the Government could do this”, they actually cannot. Government agencies are subject to the same laws of physics as the rest of us, though some have clearly not studied the bit about car windows being no protection against having your laptop stolen.

When a tape is written to, any existing data is overwritten by the new recording, and when the writing stops the tape drive sets a condition known as EOD, or end-of-recorded-data. This is a full stop to any data access from the tape: you cannot position beyond it. The consequence is that, in the case mentioned above, 80GB was on the tape, then 3 blocks of 32KB each and a filemark were written over the start of it, so almost 80GB of data was still present but could not be accessed.
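
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes decimal gigabytes and binary kilobytes, which the figures above do not specify, and it ignores the filemark, the EOD marker and any drive-level formatting overhead, so treat the result as a lower bound on what was lost rather than an exact figure.

```python
# Rough arithmetic for the re-initialised tape described above.
# Unit conventions are assumptions: 1 GB = 10**9 bytes, 1 KB = 1024 bytes.
ORIGINAL_BYTES = 80 * 10**9      # "in the region of 80GB" on the tape
BLOCK_BYTES = 32 * 1024          # one 32KB block
BLOCKS_WRITTEN = 3               # blocks written by the re-initialisation

# Bytes of the original recording replaced by the new blocks.
overwritten = BLOCKS_WRITTEN * BLOCK_BYTES

# Everything after the new blocks is physically still on the tape.
remaining = ORIGINAL_BYTES - overwritten
fraction_left = remaining / ORIGINAL_BYTES

print(f"overwritten by new blocks: {overwritten} bytes")
print(f"still on tape: {fraction_left:.4%} of the original")
```

On these assumptions the new blocks account for under 100KB of an 80GB recording, which is why the overwhelming majority of the backup survives behind the EOD.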

Getting access to this requires a deep understanding of how the drive works. LTO, DLT, AIT etc. each use their own proprietary method of encoding EOD and making it clear to any application which areas of the tape it can access and which it cannot. In this case we were able to use one of our data recovery drives to effectively ignore the EOD and position into the original data (no, they are not available in the shops, you have to build your own), then transfer it all to one of our systems. Job done? Well, no. The data was written by an application named Amanda, and for this application to read it back it needed access to all of the original recording. We had 99%+ but were missing the start.
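
The idea of an EOD acting as a full stop can be pictured with a toy model. The Python below is purely illustrative: the `Record` class and both read functions are made up for this sketch, a real drive encodes EOD in its own proprietary on-tape format, and nothing here reflects how our recovery drives actually work.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One logical item on the toy tape (hypothetical model, not a real format)."""
    kind: str            # "data", "filemark" or "eod"
    payload: bytes = b""


def read_normal(tape):
    """Behave like an ordinary drive: stop dead at the first EOD."""
    out = []
    for rec in tape:
        if rec.kind == "eod":
            break
        out.append(rec)
    return out


def read_past_eod(tape):
    """Behave like the imagined recovery drive: skip the EOD and carry on."""
    return [rec for rec in tape if rec.kind != "eod"]


# Toy tape after re-initialisation: 3 new blocks, a filemark, the EOD,
# and then whatever is left of the original recording.
tape = (
    [Record("data", b"new") for _ in range(3)]
    + [Record("filemark"), Record("eod")]
    + [Record("data", b"old backup block") for _ in range(5)]
)

visible = read_normal(tape)                      # what an application can see
recovered = read_past_eod(tape)[len(visible):]   # what sits beyond the EOD

print(len(visible), "records visible,", len(recovered), "records beyond EOD")
```

An ordinary application only ever gets `visible`; the old blocks are intact but sit on the wrong side of the marker.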

This is the next stage of tape data recovery, moving beyond the salvage of data to actually having something to return. The files were in the ufsdump format, an inode backup format often used on Sun and other UNIX systems. To get the files back from within the ufsdump encoding would require some development work and a complete understanding of the format. I’m not about to ramble on about how the ufsdump format works and how to tie the data from each inode that has been backed up to its file and directory information. A few people will already know this, so will pick holes if I don’t explain something correctly; for everyone else a detailed explanation would simply prove that tape data recovery engineers make lousy party guests.
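
For the curious, one way to find your feet in a dump image that has lost its start is to scan for record headers. The Python sketch below is not the code used on this job. It assumes the header layout published in the BSD `dumprestore.h` header: 1024-byte records, big-endian fields as written by a SPARC system, a magic number of 60012 at byte offset 24, and a checksum rule where all 32-bit words of a header sum to 84446. Treat every constant and offset as an assumption to be checked against the headers for the system in question; the function names are made up for the example.

```python
import struct

# All constants below are assumptions taken from the BSD dump headers.
TP_BSIZE = 1024        # size of one dump record
NFS_MAGIC = 60012      # "new format" magic number
CHECKSUM = 84446       # required sum of all 32-bit words in a header
TS_INODE = 2           # record type: a file or directory follows
INUMBER_OFFSET = 20    # byte offset of the inode number field
MAGIC_OFFSET = 24      # byte offset of the magic number field
CHECKSUM_OFFSET = 28   # byte offset of the checksum field


def looks_like_header(block):
    """True if a 1024-byte block passes both the magic and checksum tests."""
    if len(block) != TP_BSIZE:
        return False
    (magic,) = struct.unpack_from(">i", block, MAGIC_OFFSET)
    if magic != NFS_MAGIC:
        return False
    words = struct.unpack(">256I", block)
    return (sum(words) & 0xFFFFFFFF) == CHECKSUM


def find_inode_headers(data):
    """Yield (byte offset, inode number) for each plausible inode header."""
    for off in range(0, len(data) - TP_BSIZE + 1, TP_BSIZE):
        block = data[off:off + TP_BSIZE]
        if not looks_like_header(block):
            continue
        (c_type,) = struct.unpack_from(">i", block, 0)
        (inumber,) = struct.unpack_from(">I", block, INUMBER_OFFSET)
        if c_type == TS_INODE:
            yield off, inumber


def make_header(c_type, inumber):
    """Build a synthetic header for the demo below (not real tape data)."""
    block = bytearray(TP_BSIZE)
    struct.pack_into(">i", block, 0, c_type)
    struct.pack_into(">I", block, INUMBER_OFFSET, inumber)
    struct.pack_into(">i", block, MAGIC_OFFSET, NFS_MAGIC)
    partial = sum(struct.unpack(">256I", block)) & 0xFFFFFFFF
    struct.pack_into(">I", block, CHECKSUM_OFFSET, (CHECKSUM - partial) & 0xFFFFFFFF)
    return bytes(block)


# Demo on made-up data: one fake inode header surrounded by empty records.
sample = bytes(TP_BSIZE) + make_header(TS_INODE, 1234) + bytes(2 * TP_BSIZE)
hits = list(find_inode_headers(sample))
print(hits)
```

The scan steps in whole 1024-byte records, which only works if the salvaged data starts on a record boundary; otherwise you would slide through byte by byte until the first header turns up. Finding the headers is the easy part. Tying each inode back to a file name and directory is where the real work lies.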

The upshot was that (thanks to a clever bit of tape hardware, a hex editor, a C compiler and too much coffee) all of the important data was recovered and returned as a nice and simple tar backup, so everyone lived happily ever after. Well, for a while at least.
