(but not really) backups of multi-user systems

2007-12-25 9:45:00

This topic keeps coming up on this mailing list, so I thought that

I would take this note and send it along to everyone -- I sent an

earlier draft of this to someone who'd asked about it, but never

saw a summary from that person. I invite comments on this and will

post a real summary (per our usual protocol) once they stop flowing in.

Notes: I use the terms "dump" and "restore" because that's what I'm

accustomed to. Substitute "ufsdump" and "ufsrestore" as needed.

About Multiuser Backups on Unix Systems

Unless you have a *really* specialized environment, this is much less

of a problem than you might think. Oh, sure, the folks who peddle

add-on backup software, like Legato Networker and Budtool and the

like, will try to tell you that unless you use their package that you

risk losing data or other horrible things. And, very strictly speaking,

they're right -- but not by much. Yes, there's a non-zero, finite chance

that something will go amiss and you'll miss a file here or a directory

there, but the chance is pretty darn small. *Unless* you have a

really specialized environment, like I said above. (I'll explain

what that means later.)

Why? Because all modern versions of dump have code in them that

goes to some lengths to try to cope with filesystems that may be

changing while they're being dumped. That code has a history

that involves Berkeley, CalTech, and Purdue -- and possibly others --

and goes back most of a decade. Back in the days of 4.3BSD and 2.8BSD,

lots of folks were running Unix systems in 24x7 production mode, mostly

in academic environments where taking the systems down at 3 AM to do

backups wasn't a terribly good option, because zillions of budding

hackers were banging away at ADM3A's and VT100's at that hour, and

got testy if their compiler projects were interrupted. So various

and sundry people started figuring out ways to be able to run dump

(a) without ending up with an unreadable dump image on tape and

(b) without skipping half the filesystem. If you take a look at

the code for dump in the freely-available BSD sources, you'll find

most of this work -- I also know that it is definitely in the SunOS 4.1.4

dump, because I've seen the source code (the nice part about working

somewhere with a source license) and I have a number of reasons to

believe that it's in Solaris's ufsdump, Ultrix's dump, etc. (After all,

why wouldn't it be, since any of Sun/Digital/et al. could just grab

it from the BSD source tree? No, of course, I would never disassemble

their code to check. :-) )

Now, this code is not completely bulletproof -- in fact, I know some

explicit ways to break it by sending dump a SIGSTOP, then doing some

ugly things to the filesystem, then sending it a SIGCONT and watching

it fall all over itself. But that's an awfully artificial case, and

I've never seen it arise on a real-world machine.

This is why I've deployed live backups on every Unix network I've touched

over the last ten years. And in that time, I have yet to encounter a

dump tape that I couldn't restore. The size of those networks ranges

from tiny (my Sparc here at home) to rather large (several hundred

machines with several hundred filesystems and a dozen tapedrives).

I've probably pulled close to a thousand tapes during that time to

extract files for restoration, and haven't been disappointed yet.

So what I'm telling you is that unless there are some really severe

circumstances in place at your site, you can probably do this too

and not lose any sleep over it. There are a number of things that

you can put in your favor, as well -- most of them are probably already

true, but I think it's worth my while to list them.

1. The quieter the filesystems are, the better. (No surprise.)

For most sites, this means doing dumps in the middle of the night.

2. The less busy the machine is, the better. This not only relates to #1,

but it means that more CPU cycles will be available for dump, which

means that dump runs faster, which means that it runs in less elapsed

time, which means a smaller window in which the filesystem can change.

3. Dumps which span multiple tapes are a *bad* idea. Besides a long

history of multi-tape related bugs, and besides the pain-in-the-ass

that this represents, it also means that there could be a substantial

amount of time going by while somebody figures out that tape #1 is

full and feeds tape #2 to dump. Again, the faster dump runs, the less

time the filesystems have to change. (And as an aside, I have yet

to use a tape stacker/jukebox/carousel that didn't make me want

to dropkick it after a month.)

4. Doing dumps across the network is not the greatest idea in the

world. It's hard to do securely, and it really drops the throughput

rate, which means that dumps take longer, which means...(you know).

Tapedrives are now getting cheap enough that putting a reasonably

high-capacity drive on each machine isn't totally unrealistic -- and in

some cases, it's a much better/cheaper solution than putting a stacker

on one central machine.

5. If you have database applications (e.g. Oracle) then dumping the

raw database files is nice...but probably not useful. Use the utilities

which come with the database package to take an ASCII snapshot of the entire

database and make sure *that* is backed up. Same for Sybase or whatever

other applications you run that store their data in some customized

internal format -- but export/import it via ASCII. Having your data

in ASCII also means that if disaster ensues, you can at least attack

the problem with standard Unix tools like sed/awk/perl, whereas if

it's in the raw database form...well, you're stuck. Also, these snapshot

tools are usually able to take advantage of their knowledge of

the database's internal structure in order to create a static

and self-consistent picture of the database's contents.

6. Back up *all* the filesystems on *all* your machines. I don't

care if you're using rdist to keep /usr/local in sync -- back it

up anyway. Should there be an rdist problem, or an intruder, or

any other kind of problem that's restricted to a single machine,

you will want that backup image. Besides, tape is cheap, cycles

are cheap, and system administrator time is scarce and expensive.

7. Use a rotating schedule of backups, full (level 0) and incremental

(levels 1-9). If you can do the rotation daily, that's even better.

For example:

Machine  Filesystem  Mon  Tue  Wed  Thu  Fri  Sat  Sun
fred     /             0    1    2    3    4    5    6
fred     /usr          0    1    2    3    4    5    6
barney   /             4    5    6    0    1    2    3
barney   /usr          4    5    6    0    1    2    3
barney   /home00       2    3    4    5    6    0    1

There's a bunch of reasons for doing this. For starters, a file

that was being changed in fred:/usr on Monday when the full dump

came through will probably not be changing on Tuesday when the

partial comes through. [Note: it pays to examine your "cron"

and "at" job queues to make sure that large batch jobs of whatever

nature are not trying to run at the same time as your backups.

It's a Bad Thing to run the one job that you have that modifies

/etc/passwd every night at the exact same time that you're trying

to create a dump of /. ;-) ]
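
One rough way to do the cron check in the note above is to scan a crontab for jobs that start inside your backup window. This is only a sketch under some loud assumptions: the function name and the window hours are made up for illustration, it only understands plain numeric hour fields (anything else is flagged for a look by hand), it ignores "at" jobs entirely, and it does not handle a window that crosses midnight.

```python
def jobs_in_window(crontab_text, start_hour, end_hour):
    """Return (verdict, command) pairs for cron jobs that may start
    during the backup window [start_hour, end_hour).

    Hypothetical helper for illustration only. A verdict of "overlaps"
    means the hour field is a plain number inside the window; a verdict
    of "check by hand" means the hour field is something this sketch
    does not try to interpret (such as "*" or a list).
    """
    flagged = []
    for line in crontab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # A crontab entry has five time fields followed by the command.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        hour, command = fields[1], fields[5]
        if not hour.isdigit():
            flagged.append(("check by hand", command))
        elif start_hour <= int(hour) < end_hour:
            flagged.append(("overlaps", command))
    return flagged
```

You would feed it the text of each crontab on the machine and the hours during which your dumps normally run, then go read whatever it flags.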

This also helps balance out the size of the dump images that are headed

for tape -- very helpful if you're putting a bunch of dump images on

one tape, which you probably are. For instance, the level 5 dump

of barney:/home00 on Thursday is probably going to be pretty small,

which is good, because barney:/usr is getting dumped at level 0...

probably on the same tape.

This also relates back to #6: dumping fred:/usr and barney:/usr on

different schedules means that even if the tape with fred:/usr

at level 0 on it fails or gets lost, at least you have barney:/usr

at level 0. Hey, it's better than starting over with distribution media.
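
The example rotation in #7 boils down to one fact per filesystem: the day it gets its level 0. Every other day's level is just the number of days since then. Here is a minimal sketch of that idea; the names DAYS, FULL_DUMP_DAY, and dump_level are mine and purely illustrative, and the day assignments are copied from the example table.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Day on which each filesystem gets its full (level 0) dump,
# taken from the example schedule in #7.
FULL_DUMP_DAY = {
    ("fred", "/"): "Mon",
    ("fred", "/usr"): "Mon",
    ("barney", "/"): "Thu",
    ("barney", "/usr"): "Thu",
    ("barney", "/home00"): "Sat",
}


def dump_level(machine, filesystem, day):
    """Return the dump level for this filesystem on the given day.

    The level is the number of days elapsed since the most recent
    level 0, so it runs 0 through 6 over the week.
    """
    full_day = DAYS.index(FULL_DUMP_DAY[(machine, filesystem)])
    return (DAYS.index(day) - full_day) % 7


def schedule_row(machine, filesystem):
    """Return the seven levels, Monday through Sunday, for one filesystem."""
    return [dump_level(machine, filesystem, day) for day in DAYS]
```

For instance, dump_level("barney", "/home00", "Thu") comes out as 5 and dump_level("barney", "/usr", "Thu") as 0, which is the small-incremental-next-to-a-full pairing described above. A nightly script could call something like this to pick the level to hand to dump.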

8. If you're still worried, then set up your scripts to do a

"restore tf" or "restore tvf" on each dump image when you're

done scribbling it on the tape. This also has the nice

side-effect of giving you a catalog of all your dump tapes,

which is nice when a user comes up to you and says "Can you

restore /home00/luser/foobar? Uh...no, I don't know when

I changed it last."
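
The catalog idea in #8 might be sketched like this. Treat it as a rough illustration, not a finished tool: the function names and tape labels are made up, the device name is whatever your tape drive is called, and the parser assumes that each file line in the "restore tf" listing is an inode number followed by a path relative to the top of the dumped filesystem. Check that assumption against what your own restore prints (and substitute "ufsrestore" as needed).

```python
import subprocess


def list_dump_image(device):
    """Run "restore tf <device>" and return its listing as text.

    Raises subprocess.CalledProcessError if restore exits non-zero,
    which is itself a useful sign that the image is not readable.
    """
    result = subprocess.run(
        ["restore", "tf", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_listing(text):
    """Extract the paths from a listing.

    Assumes file lines look like "<inode> <path>"; any line that does
    not start with a number (header lines, for example) is skipped.
    """
    paths = []
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit():
            paths.append(fields[1].strip())
    return paths


def find_tapes(catalogs, wanted_path):
    """Return the labels of every dump image that contains wanted_path.

    catalogs maps a label of your choosing (tape name, date, level)
    to the list of paths parsed from that image.
    """
    return sorted(label for label, paths in catalogs.items()
                  if wanted_path in paths)
```

Save the parsed listing for each image under a label that says which tape, filesystem, date, and level it was. When the user shows up, find_tapes(catalogs, "./luser/foobar") tells you which images to consider, keeping in mind that the paths are relative to the filesystem (here /home00), not to /.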

9. About the "verify" option to dump: the comment in the manual page

(for most versions of dump) is pretty accurate: if you try this on

anything but an unmounted filesystem, it's probably going to whine

at you. Actually, it'd be really nice if there was just a "verbose"

flag that would emit the names of files/directories as they're being

dumped, but that's not really a realistic expectation. I think

the "answer", if there is one, for people who want to do some kind

of verification on their backup images, is to rewind and

do a "restore tf" or "restore tvf" on the dump image. See #8 above.

10. I told you I'd explain what a "really specialized environment"

was. Well, I'd say that:

- An environment with multiple databases which are used 24x7 with

no real slowdown, making it difficult to find a window during which

they can be dumped to ASCII, e.g. a transaction-processing environment

- An environment with non-vanilla filesystems, e.g. journaled filesystems

- An environment whose CPU utilization is so high 24x7 that it's

difficult to grab enough cycles to run dump in a reasonable period

of time

constitute really specialized environments. This doesn't necessarily

mean that you can't use plain old dump for your backups; but it does

mean that you may need to be somewhat more clever about how you do it.

11. One more note: using tar or cpio (or GNU tar or bar or whatever)

just makes it worse. All of them have their problems, and none of

them have code to cope with non-quiescent filesystems.

12. This may all sound pretty darn complicated -- looking back, I certainly

have written a lot here. Chalk it up to this morning's coffee finally

kicking in. ;-) But the bottom line is that 99% of the people

out there can just use dump/ufsdump with a little planning and

avoid the expense and hassle of going single-user or using one

of the third-party tools. And given what I've seen happen to sites

using those third-party tools, you really do want to avoid them.

(I'm aware of one site that has an expensive 3rd-party backup

package which now occupies 2 people full-time as they try to coax

it to actually do their backups and restores. It's unbelievable.)

Cheers,

Rich Kulawiec

rsk@itw.com
