‘R’ is for Redundant.
My RAID array is broken. So can I get it to work again, preferably with my data intact?
chino:/# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive sdd1[1] sdb1[3] sda1[2]
734419136 blocks
unused devices: <none>
Some Googling found a page describing a similar problem, so I tried a copycat approach:
chino:/# mdadm --stop /dev/md0
chino:/# mdadm -A /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
chino:/# mdadm -A --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: device /dev/md0 already active - cannot assemble it
chino:/# mdadm --stop /dev/md0
chino:/# mdadm -A --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: /dev/md0 has been started with 3 drives (out of 4).
chino:/# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdd1[1] sdb1[3] sda1[2]
732587712 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
unused devices: <none>
chino:/# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Tue Jan 31 21:45:55 2006
Raid Level : raid5
Array Size : 732587712 (698.65 GiB 750.17 GB)
Device Size : 244195904 (232.88 GiB 250.06 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Aug 25 10:01:50 2006
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : ebdfaedf:59e64777:d81d8f6e:8d6b0392
Events : 0.150987
Number Major Minor RaidDevice State
0 0 0 - removed
1 8 49 1 active sync /dev/sdd1
2 8 1 2 active sync /dev/sda1
3 8 17 3 active sync /dev/sdb1
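As a quick sanity check on the numbers above: a RAID 5 array gives up one member's worth of space to parity, so the usable size should be (members - 1) times the per-device size. The sketch below redoes that sum in plain shell arithmetic using the figures from the `mdadm --detail` output. The function name `raid5_capacity_kib` is made up for illustration, and the sizes are taken to be in KiB, which is consistent with the GiB figures mdadm prints next to them.

```shell
#!/bin/sh
# raid5_capacity_kib MEMBERS DEVICE_SIZE_KIB
# Hypothetical helper: usable RAID 5 size is (MEMBERS - 1) * DEVICE_SIZE_KIB,
# because one member's worth of space holds parity.
raid5_capacity_kib() {
    echo $(( ($1 - 1) * $2 ))
}

# "Raid Devices : 4" and "Device Size : 244195904" from the output above.
raid5_capacity_kib 4 244195904   # prints 732587712, matching "Array Size"
```

The result agrees with the reported Array Size, which suggests the forced assembly picked up the original geometry rather than something invented.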
Now this seems better already. As with the original poster, I then found that a “cat /dev/md0” produced output. Continuing as suggested:
chino:/# e2fsck -n -f -v /dev/md0
e2fsck 1.37 (21-Mar-2005)
Warning: skipping journal recovery because doing a read-only filesystem check .
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (169366819, counted=168627552).
Fix? no
Free inodes count wrong (91579502, counted=91578664).
Fix? no
/home: ********** WARNING: Filesystem still has errors **********
7058 inodes used (0%)
6259 non-contiguous inodes (88.7%)
# of inodes with ind/dind/tind blocks: 6442/3227/0
13780109 blocks used (7%)
0 bad blocks
0 large files
7075 regular files
797 directories
0 character device files
0 block device files
0 fifos
0 links
15 symbolic links (15 fast symbolic links)
0 sockets
--------
7887 files
This is much more promising – e2fsck thinks there’s a file system there. It’s not perfect, but it’s better than the nothing I thought I had a couple of days ago. I really don’t know the first thing about ext3 and journalling, but I suppose it’s possible that some of the problems that fsck is finding will go away when the journal is applied.
chino:/# e2fsck -f -v /dev/md0
e2fsck 1.37 (21-Mar-2005)
/home: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/home: ***** FILE SYSTEM WAS MODIFIED *****
7896 inodes used (0%)
6259 non-contiguous inodes (79.3%)
# of inodes with ind/dind/tind blocks: 6442/3227/0
14519376 blocks used (7%)
0 bad blocks
0 large files
7075 regular files
797 directories
0 character device files
0 block device files
0 fifos
0 links
15 symbolic links (15 fast symbolic links)
0 sockets
--------
7887 files
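Rather than reading the summary by eye, the exit status of e2fsck can be checked as well. The e2fsck man page documents it as a sum of flags (1, 2, 4, 8, 16, 32, 128), and the sketch below decodes such a value into words. The function name `describe_fsck_status` is an assumption for illustration, not something that ships with e2fsprogs.

```shell
#!/bin/sh
# describe_fsck_status STATUS
# Hypothetical helper: print one line for each flag set in an e2fsck exit
# status, using the meanings listed in the e2fsck man page.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:errors corrected" \
        "2:errors corrected, reboot recommended" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:cancelled by user request" \
        "128:shared library error"
    do
        bit=${entry%%:*}    # number before the first colon
        text=${entry#*:}    # description after the first colon
        if [ $(( status & bit )) -ne 0 ]; then
            echo "$text"
        fi
    done
}
```

Typical use would be `e2fsck -f -v /dev/md0; describe_fsck_status $?`. A read-only run that ends with "Filesystem still has errors" would be expected to return a status with the 4 flag set, and a run that repairs the file system one with the 1 flag set.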
That all looks good, so the final step is to add in the removed drive and let the array rebuild:
chino:/# mdadm -a /dev/md0 /dev/sdc1
mdadm: hot added /dev/sdc1
chino:/# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdc1[4] sdd1[1] sdb1[3] sda1[2]
732587712 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
[>....................] recovery = 0.2% (551168/244195904) finish=66.3min speed=61240K/sec
unused devices: <none>
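Checking on a rebuild like this one by hand gets tedious, so here is a rough sketch of a helper that classifies an array from /proc/mdstat-style text. It relies only on the patterns visible in the output above: the `md0 : active ...` header, an underscore in the `[_UUU]` member map for a missing member, and a `recovery` (or `resync`) progress line. The function name `md_state` and its labels are made up for illustration, and other kernels may format the file differently.

```shell
#!/bin/sh
# md_state DEVICE [FILE]
# Hypothetical helper: print inactive, degraded, rebuilding or healthy for
# DEVICE, reading /proc/mdstat unless another FILE is given.
md_state() {
    awk -v dev="$1" '
        $2 == ":"         { in_dev = ($1 == dev) }   # header such as "md0 : active raid5 ..."
        /^unused devices/ { in_dev = 0 }             # trailer line ends the last array
        !in_dev           { next }
        $1 == dev         { state = ($3 == "active") ? "healthy" : $3; next }
        /recovery|resync/ { state = "rebuilding" }   # progress bar line
        /\[[U_]*_[U_]*\]/ && state == "healthy" { state = "degraded" }   # e.g. [_UUU]
        END               { print (state == "" ? "unknown" : state) }
    ' "${2:-/proc/mdstat}"
}
```

With that defined, something like `while [ "$(md_state md0)" = "rebuilding" ]; do sleep 60; done` would block until the recovery line disappears, after which the array can be mounted.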
Fantastic. The array is rebuilding, so it needs to be left to cook for a while. I’m not feeling brave enough to remount it while it’s still rebuilding. After ninety minutes:
chino:/# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdc1[0] sdd1[1] sdb1[3] sda1[2]
732587712 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
chino:/# mount -a
chino:/# cd /home
chino:/home#
So, thanks to djlee, Linux Format and Google, my RAID array is working again.