One of my zpools has experienced two successive drive failures. While I was resilvering the first, the second drive failed and I got two errors, in snapshots. The resilvering finished, and I then used "zpool replace" to replace the second faulty drive.
The pool is mounted, all data safe and available except for the two files:
  pool: gggpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
  scan: resilvered 2,35T in 19h29m with 5 errors on Sat Sep 21 03:08:24 2013
config:

        NAME                                             STATE     READ WRITE CKSUM
        gggpool                                          DEGRADED     0     0     5
          raidz1-0                                       DEGRADED     0     0    10
            scsi-SATA_ST3000DM001-9YN_Z1F0NJKS           ONLINE       0     0     0
            scsi-SATA_ST3000DM001-9YN_Z1F0RPKE           ONLINE       0     0     0
            scsi-SATA_ST3000DM001-9YN_Z1F0RPZG           ONLINE       0     0     0
            scsi-SATA_ST3000DM001-9YN_Z1F0RQJ2           ONLINE       0     0     0
            scsi-SATA_ST3000DM001-9YN_Z1F0RQSV           ONLINE       0     0     0
            scsi-SATA_ST3000DM001-9YN_Z1F0T6VN           ONLINE       0     0     0
            spare-6                                      DEGRADED     0     0     0
              scsi-SATA_WDC_WD30EZRX-00_WD-WMC1T4095404  UNAVAIL      0     0     0
              scsi-SATA_ST3000DM001-9YN_Z1F118BA         ONLINE       0     0     0
            replacing-7                                  UNAVAIL      0     0     0
              scsi-SATA_ST3000DM001-1CH_Z1F2Z9VC         UNAVAIL      0     0     0
              scsi-SATA_ST3000DM001-1CH_Z1F2Z8SM         ONLINE       0     0     0
        spares
          scsi-SATA_ST3000DM001-9YN_Z1F118BA             INUSE     currently in use

The remaining errors probably point to where the faulty files were - I destroyed the relevant snapshots, but these error indications remain:

errors: Permanent errors have been detected in the following files:

        <0x218>:<0x7308>
        <0x3a0>:<0x295a6b>

I am not worried about these errors. I am trying to detach the two failed drives, both of which have been replaced, but zpool refuses:
root@ggg:~# zpool detach gggpool scsi-SATA_ST3000DM001-1CH_Z1F2Z9VC
cannot detach scsi-SATA_ST3000DM001-1CH_Z1F2Z9VC: no valid replicas
root@ggg:~# zpool detach gggpool scsi-SATA_WDC_WD30EZRX-00_WD-WMC1T4095404
cannot detach scsi-SATA_WDC_WD30EZRX-00_WD-WMC1T4095404: no valid replicas

The two drives have been physically removed from the array - sent in for warranty replacement - but they live on in the zpool configuration. How do I get rid of them?
When reading data from the pool, I can see the "replacing-7" vdev is not active:
                                                   capacity     operations    bandwidth
pool                                              alloc   free   read  write   read  write
------------------------------------------------  -----  -----  -----  -----  -----  -----
gggpool                                           19,8T  1,96T    323      0  36,8M      0
  raidz1                                          19,8T  1,96T    323      0  36,8M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0NJKS                -      -    177      0  5,42M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0RPKE                -      -    184      0  5,26M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0RPZG                -      -    183      0  5,55M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0RQJ2                -      -    183      0  5,25M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0RQSV                -      -    180      0  5,39M      0
    scsi-SATA_ST3000DM001-9YN_Z1F0T6VN                -      -    181      0  5,21M      0
    spare                                             -      -    298      0  5,47M      0
      scsi-SATA_WDC_WD30EZRX-00_WD-WMC1T4095404       -      -      0      0      0      0
      scsi-SATA_ST3000DM001-9YN_Z1F118BA              -      -    230      0  5,49M      0
    replacing                                         -      -      0      0      0      0
      scsi-SATA_ST3000DM001-1CH_Z1F2Z9VC              -      -      0      0      0      0
      scsi-SATA_ST3000DM001-1CH_Z1F2Z8SM              -      -      0      0      0      0
------------------------------------------------  -----  -----  -----  -----  -----  -----

This is worrying because without this vdev working, the pool has no redundancy - yet I cannot remove or detach either of its two drives. I am in the process of making a full backup - only a day to go. However, destroying this pool and rebuilding it would cause a LOT of headaches, with many filesystems and smb and afs shares having to be set up again.
Any ideas how I can force this failed replacing-7 vdev to work again?
1 Answer
SOLVED
Steps:

- Destroy all the snapshots containing the errors.

Then issue this:

zpool online gggpool [the drive in 'spare' or 'replacing' that says ONLINE but is not really online]

This starts a resilver on every vdev that needs one. Wait for the resilvering to finish; the vdevs will then all show ONLINE instead of DEGRADED.

Finally, detach the stubborn removed disks:

zpool detach gggpool [unavailable drive]

All pools healthy.
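For reference, the whole sequence can be sketched as a dry-run shell script, filling in the pool and device names from the question above. This is only a sketch: every command is prefixed with echo so it prints instead of executing; remove the echo (and run as root) once you have verified the snapshot list yourself.

```shell
#!/bin/sh
# Dry-run sketch of the recovery steps. Names are taken from the question;
# nothing here touches the pool until the "echo" guards are removed.
POOL=gggpool
SPARE=scsi-SATA_ST3000DM001-9YN_Z1F118BA
GHOSTS="scsi-SATA_ST3000DM001-1CH_Z1F2Z9VC scsi-SATA_WDC_WD30EZRX-00_WD-WMC1T4095404"

# 1. List the snapshots, then destroy (by hand) the ones holding the errors.
echo zfs list -H -t snapshot -o name -r "$POOL"

# 2. Online the in-use spare that claims ONLINE but is idle; this kicks off
#    the resilver on the stuck vdevs.
echo zpool online "$POOL" "$SPARE"

# 3. Watch the resilver until every vdev reports ONLINE.
echo zpool status -v "$POOL"

# 4. Detach the removed disks that linger in the configuration.
for d in $GHOSTS; do
    echo zpool detach "$POOL" "$d"
done
```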