cpio throughput

2007-12-25 11:24:00

My original question had to do with what throughput to expect from cpio and what happens during multi-tape archiving. In short: don't use cpio for large amounts of data, especially if you have the option to use tar!

Roger Fujii mentioned:

"Remember that the MAX block size you can use with cpio is 5120 (using -B). It defaults to 512. Tar defaults to 20 blocks (10K). The *last* time I dealt with this (a while ago), you had to be *very* careful of what you typed and when at the continue prompt. The next tape had to be ready and the device must be typed in (w/o space, etc)."

This confirmed the obvious fact that cpio's throughput generally sucks. If you *ever* have to use cpio, make sure you use the -B option: in the stats below it raised throughput about fifteenfold (161.7 to 2420.1 kw/s).

Rich Pieri had some things to say about solutions to my problem (and also mentioned the block-size debacle):

"cpio to tape should always use '-B', otherwise yes, it will take forever... There are some tricks you can use to get around [having to swap tapes manually]. Specifically, instead of one huge backup set, break it up into smaller chunks (one archive per directory, for instance). There are many tricks you can use to calculate the size of each set (du, piping a test run through wc, etc). If the size of the next set would exceed the current tape's available capacity, send an 'offline' signal (mt -f foo offline) to the device to cycle the magazine... Were either the -I or -O switches used? -I redirects stdin, -O redirects stdout. Either will screw around with multi-volume archive prompting."

I hadn't used either of those options, and actually didn't have any control over how the huge-ass 90 GB archive had been backed up, but I thank Rich for those ideas - I'll probably just opt to never use cpio. :)
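
Rich's sizing trick can be sketched in a few lines of shell. This is only an illustration, not something that was run on this archive: the function name plan_sets, the /data/* path, and the 35000000 KB capacity figure in the usage comment are assumptions, and it expects a POSIX awk (on Solaris that likely means nawk or /usr/xpg4/bin/awk).

```shell
# plan_sets CAPACITY_KB   (hypothetical helper, not a standard tool)
# Reads "SIZE_KB<whitespace>NAME" lines, the format du -sk prints, on stdin
# and prints "tape N: NAME". It moves on to the next tape whenever the next
# set would not fit in what is left of the current one.
plan_sets() {
    awk -v cap="$1" '
        BEGIN { tape = 1; used = 0 }
        {
            size = $1
            name = $0
            sub(/^[0-9]+[ \t]+/, "", name)   # strip the size column, keep the name
            if (used > 0 && used + size > cap) {
                tape++                       # next set will not fit: new tape
                used = 0
            }
            used += size
            printf "tape %d: %s\n", tape, name
        }'
}

# Assumed usage, one backup set per directory:
#   du -sk /data/* | plan_sets 35000000
```

Wherever the tape number changes in that plan, a real backup script would run the mt -f foo offline that Rich mentions before writing the next set. A single set larger than the capacity still lands on one tape in this sketch, so it would have to be split further by hand.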

Marianne Rodgers had an interesting point:

"For multiple diskettes, or I would [think] multiple tapes, you should have volume manager off (/etc/init.d/volmgt stop)"

The man page for the Sun version of cpio does mention this about floppies, but doesn't say anything about tape drives. Just to be safe, I turned off vold for the duration of this restore, and also compiled and used the latest GNU version of cpio.

Did it ever work? Nope. The drive kept wanting to be cleaned about 40 hours in, and cpio would time out while the cleaning tape ran. We ran into so many problems with how this was done that the person who originally decided to do it this way finally opted to abort the cpio and back everything up again with tar.

In case people are interested, here are some stats (iostat -x output) from a DLT7000 drive using a Compac IV tape. It's fairly obvious where cpio rates in this comparison. The file used was a 180 MB data file:

*--< TAR >--*
root@u2300c(1):/data> tar cvf /dev/rmt/2 tools/file01.dat
*__extended device statistics__*
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
st36      0.0  302.8    0.0  3028.1   0.0   0.9    3.1   0  94

*--< PAX >--*
root@u2300c(1):/data> pax -w -f /dev/rmt/2 tools/file01.dat
*__extended device statistics__*
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
st36      0.0  305.4    0.0  3053.5   0.0   1.0    3.1   0  96

*--< UFSDUMP >--*
root@u2300c(1):/data> ufsdump 0ucf /dev/rmt/2 tools/file01.dat
*__extended device statistics__*
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
st36      0.0   58.3    0.0  3675.2   0.0   1.0   17.0   0  99

*--< CPIO (w/ -B option) >--*
root@u2300c(1):/data> find tools/file01.dat | /local/bin/cpio -oBv > /dev/rmt/2
*__extended device statistics__*
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
st36      0.0  484.0    0.0  2420.1   0.0   0.9    1.8   0  88

*--< CPIO (w/out -B option) >--*
root@u2300c(1):/data> find tools/file01.dat | /local/bin/cpio -ov > /dev/rmt/2
*__extended device statistics__*
device    r/s    w/s   kr/s    kw/s  wait  actv  svc_t  %w  %b
st36      0.0  323.3    0.0   161.7   0.0   1.0    3.0   0  97
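
Dividing kw/s by w/s in each block above gives the average size of a single write to the tape, which makes the block-size story easy to see. The sketch below does that arithmetic with awk. The helper names avg_write_kb and hours_for are hypothetical, and hours_for rests on two assumptions: 1 GB = 1024*1024 KB, and the rate measured on the 180 MB file holding for a whole 90 GB archive (no tape swaps, no cleaning cycles).

```shell
# Average KB per write: kw/s divided by w/s.
# $1 = writes per second (w/s), $2 = KB written per second (kw/s)
avg_write_kb() {
    awk -v w="$1" -v kw="$2" 'BEGIN { printf "%.1f\n", kw / w }'
}

# Rough hours to write an archive at a sustained rate.
# $1 = archive size in GB (assumed 1 GB = 1024*1024 KB), $2 = kw/s
hours_for() {
    awk -v gb="$1" -v kw="$2" 'BEGIN { printf "%.1f\n", gb * 1024 * 1024 / kw / 3600 }'
}

avg_write_kb 302.8 3028.1   # tar              -> 10.0
avg_write_kb 305.4 3053.5   # pax              -> 10.0
avg_write_kb 58.3  3675.2   # ufsdump          -> 63.0
avg_write_kb 484.0 2420.1   # cpio with -B     -> 5.0
avg_write_kb 323.3 161.7    # cpio without -B  -> 0.5

hours_for 90 161.7          # cpio without -B  -> 162.1
hours_for 90 2420.1         # cpio with -B     -> 10.8
hours_for 90 3028.1         # tar              -> 8.7
```

The 10 KB, 5 KB, and 0.5 KB write sizes line up with the tar default of 20 blocks, cpio's 5120 bytes with -B, and cpio's 512-byte default that Roger described. The hour figures are only a back-of-the-envelope estimate, but they show why an unblocked cpio run on an archive this size was never going to finish quickly.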

------------------------------------

Matthew Ross Davis -----------------

Senior Internet System Administrator

Digex ---- Unix Technical Operations

------------------------------------
