SS2/SCSI chain problem

2007-12-25 7:26:00

On November 6, I inquired about an SS2 with 5 disks and an EXB8500

on its SCSI chain, which has problems when we try to dump it

(kernel "SCSI transport failed" messages, etc.).

We seem to have gotten rid of most of our problems by a combination

of drastically shortening all cables and making sure the user's dump

script sleeps before every invocation of dump.
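
The "sleep before every dump" idea can be sketched as a small wrapper.
This is only an illustration, not the script we actually use: the
function name, the 60-second default, and the example dump command
lines in the comment are assumptions.

```python
import subprocess
import time


def run_dumps_with_pause(commands, pause_seconds=60,
                         runner=subprocess.call, sleeper=time.sleep):
    """Run each dump command in turn, sleeping before every invocation.

    Returns the list of exit statuses, one per command.
    """
    statuses = []
    for argv in commands:
        # Give the tape drive time to settle before the next dump starts.
        sleeper(pause_seconds)
        statuses.append(runner(argv))
    return statuses


# Hypothetical usage; substitute your own filesystems and tape device:
#   run_dumps_with_pause([
#       ["dump", "0uf", "/dev/nrst0", "/"],
#       ["dump", "0uf", "/dev/nrst0", "/usr"],
#   ])
```

The runner and sleeper arguments exist only so the wrapper can be
tried out without a tape drive attached.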

Here are all the possible causes and remedies that respondents identified:

  The symptoms can indicate a SCSI chain that is too long,

especially since some drives have considerable equivalent cable length inside.

One person reported that he had to shorten his SCSI cables to

only 12", on a SS1+ that had only three external disks, to make

things work.

  A couple of people stated that the SCSI chain on an SS2 will not

operate anywhere near the maximum length of approximately 6 meters when

running synchronous; one person claimed that Sun has pushed the SCSI spec

beyond its limits.

  One person stated that 5 disks and a tape is "pushing it."

  Use higher-quality (shielded) cables, not ribbon cables, and use

the same kind of cables everywhere.

  There may be a device in the chain that is internally terminated

but shouldn't be (we had already checked our first external disk for

this).

  Make sure that the external terminator is powered - some devices

don't provide power on the TERMPWR line of the SCSI bus.

  If your chain includes an Exabyte, make sure it's going synchronous

along with the disks; otherwise you will have problems.

  The tape should be moved to the end of the chain. [However, I had

read something somewhere that suggested an Exabyte should be close

to the start of the chain, but I can't remember where; it may have

come from the vendor, R-Squared.]

  Make sure there is a suitable pause (like 60 seconds) between

successive dumps to the Exabyte.

  Buy an SBus SCSI host adapter, plug it into the SS2, and move some

of the devices to the second chain. ($495 retail from Sun; there are

other sources.)

  Synchronous SCSI can be disabled if no other solution can be found.

Using adb, turn off the 0x20 bit in the kernel's "scsi_options".
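
The bit arithmetic behind that last suggestion can be sketched as
follows. Only the 0x20 bit comes from the respondents; the helper name
and the starting value are hypothetical, so read the real current value
of scsi_options from your own kernel with adb before changing anything.

```python
SYNC_BIT = 0x20  # bit the respondents said to turn off in scsi_options


def clear_sync_bit(scsi_options):
    """Return scsi_options with the 0x20 bit off and all other bits unchanged."""
    return scsi_options & ~SYNC_BIT


# Hypothetical current value, for illustration only.
current = 0x38
print(hex(clear_sync_bit(current)))  # 0x18
```

The result is the value you would write back into the kernel variable.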

Thanks to the following persons for their suggestions:

kpc!kpc.com!cdr@uunet.UU.NET (Carl Rigney)

dan@breeze.bellcore.com (Daniel Strick)

stern@sunne.East.Sun.COM (Hal Stern - NE Area Tactical Engineering)

keves@meaddata.com (Brian Keves - Consultant)

johnb@edge.CIS.McMaster.CA (John Benjamins)

poffen@sj.ate.slb.com (Russ Poffenberger)

admin%esrg@hub.ucsb.edu (system administrator)

athey@lorien.ocf.llnl.gov (Charles L. Athey III)

deltam!dm!mark@uunet.UU.NET (mark galbraith)

Ken Nawyn <ken@nynexst.com>

andrew@calvin.doc.ca (Andrew Patrick)

----------

Ed Arnold * NCAR * POB 3000, Boulder, CO 80307-3000 * 303-497-1253(voice)

303-497-1137(fax) * era@ncar.ucar.edu [128.117.64.4] * era@ncario.BITNET
