Strange 670MP crash

2007-12-25 7:40:00

Original message (posted 2/27):

-------------------------------------------------------------

Hardware: 670MP, 4.1.2, 1 IPI drive, 3 SCSI drives

Symptoms: Four crashes in two days, each with the following messages,

           repeated ad infinitum, with different numbers each time:

     ipi0: missing interrupt: refnum 352

     id000b: block 17805 (51018 abs)

         write: missing interrupt - recovery in progress

     ipi0: missing interrupt: refnum 4a9

     id000b: block 12343 (49238 abs)

         write: missing interrupt - recovery in progress

-------------------------------------------------------------

I received 6-8 replies to this question, all with different suggestions.

CAUSE: In our case, the crashes were caused by a C program written by

       a programming student. On a SPARC 1, the program simply exits

       with "Out of Memory Error", but on the 670MP, the IPI disk died

       (and brought the entire network down with it).

SOLUTION: Swap out the IPI disk controller for the latest rev

          (this was free since our 670MP is on hardware maintenance).

          Now, when I run the C program, it exits with "Out of Memory"

          like it's supposed to.

Thanks to all who took the time to reply. I hope my solution

works for you!

-----------------------------------------------------------------

David Mostardi                              Phone: (510) 643-6071

Systems Administrator                       FAX:   (510) 643-5348

Mathematical Sciences Research Institute    Email: david@msri.org

1000 Centennial Drive, Berkeley CA 94720
