Difference between revisions of "SLab:Todo"

From CCGB

Revision as of 11:37, 26 March 2010

CRITICAL DISK STORAGE PROBLEM WITH MD1K-2

There is a serious problem with md1k-2, one of the PowerVault MD1000s connected to s3. It can get into a state where two disks appear to have failed. A genuine two-disk failure under RAID-5 would mean complete data loss. Fortunately, I've found a remedy that lets us copy the data to a different array.

Dell Higher Education Support
1-800-274-7799
Enter Express Service Code 43288304365 when prompted on the call.
host    service tag  contract end  description        notes
c8      CPM1NF1      02/15/2011    PowerEdge 1950     old schuster storage, moved PERC 5/E to s3
s3      GHNCVH1      06/22/2012    PowerEdge 1950     connected to md1k-2 via PERC 5/E
md1k-1  FVWQLF1      02/07/2011    PowerVault MD1000  enclosure 3
md1k-2  JVWQLF1      02/07/2011    PowerVault MD1000  enclosure 2
md1k-3  4X9NLF1      03/06/2011    PowerVault MD1000  enclosure 1


I've narrowed down the error by looking through the adapter event logs. Two errors appear, the second about 45 seconds after the first:

Tue Mar 23 23:09:56 2010   Error on PD 31(e2/s0) (Error f0)
Tue Mar 23 23:10:43 2010   Error on PD 30(e2/s1) (Error f0)
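
To spot this pattern without reading the logs by hand, something like the following Python sketch could scan saved event log text for two different disks in the same enclosure erroring within a short window. It assumes the log lines look exactly like the two quoted above; the function names and the 60-second window are illustrative choices, not part of MegaCli.

```python
import re
from datetime import datetime

# Assumption: lines have the same shape as the two entries quoted above,
# e.g. "Tue Mar 23 23:09:56 2010   Error on PD 31(e2/s0) (Error f0)".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"Error on PD (?P<pd>\d+)\(e(?P<encl>\d+)/s(?P<slot>\d+)\)"
)


def parse_pd_errors(log_text):
    """Return a list of (timestamp, pd, enclosure, slot) tuples."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append(
                (ts, int(m.group("pd")), int(m.group("encl")), int(m.group("slot")))
            )
    return events


def find_double_failures(events, window_seconds=60):
    """Return consecutive error pairs on different disks in one enclosure
    that occur no more than window_seconds apart (hypothetical threshold)."""
    pairs = []
    ordered = sorted(events)
    for first, second in zip(ordered, ordered[1:]):
        same_enclosure = first[2] == second[2]
        different_disk = first[1] != second[1]
        gap = (second[0] - first[0]).total_seconds()
        if same_enclosure and different_disk and gap <= window_seconds:
            pairs.append((first, second))
    return pairs
```

Feeding it the two lines above yields one pair, covering enclosure 2 slots 0 and 1.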

After that, the virtual disk goes offline.

[root@s3: ~/storage]# MegaCli -LDInfo L1 -a0

Adapter 0 -- Virtual Drive Information:
Virtual Disk: 1 (Target Id: 1)
Name:md1k-2
RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
Size:8.862 TB
State: Offline
Stripe Size: 64 KB
Number Of Drives:14
Span Depth:1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy: Read/Write
Disk Cache Policy: Disk's Default
Encryption Type: None
Number of Dedicated Hot Spares: 1
    0 : EnclId - 18 SlotId - 1 
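
For a quick check of whether the virtual disk is offline, a minimal Python sketch like the one below could parse the text printed by the command above. It only handles already-captured output in the "Key: value" layout shown here; `parse_ldinfo` and `is_offline` are hypothetical helper names, not MegaCli features.

```python
def parse_ldinfo(output):
    """Turn 'Key: value' lines from captured -LDInfo output into a dict.

    Assumption: the output follows the layout shown above. Lines with no
    value after the first colon (such as the adapter header) are skipped,
    and the first occurrence of a key wins. Pass only the command output,
    not the shell prompt line.
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def is_offline(output):
    """True when the State field of the captured output reads Offline."""
    return parse_ldinfo(output).get("State", "").lower() == "offline"
```

The output could be saved to a file first and read back in, so the sketch itself never needs to invoke MegaCli.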

If you go into the server room, you'll see flashing amber lights on the disks in md1k-2 slots 0 and 1. I can bring md1k-2 back using the following procedure (replace slots 0 and 1 with whichever slots are appropriate):

  1. Take the disks in md1k-2 slots 0 and 1 about half-way out and then push them back in.
  2. Wait a few seconds for the lights on md1k-2 slots 0 and 1 to return to green.
  3. Press the power button on s3 until it turns off.
  4. Press the power button on s3 again to turn it back on.
  5. Log into s3 as root after it has finished booting.

Miscellaneous