Wait for udev to recreate /dev/PTN entries when calibrating (#762941)

File system specific commands sometimes fail, reporting that the
partition specific /dev entry doesn't exist.  Example failing check
operation details:

    Check and repair file system (ext4) on /dev/sdb4
      calibrate /dev/sdb4
        path: /dev/sdb4 (partition)
        start: 4196352
        end: 6293503
        size: 2097152 (1.00 GiB)
      check file system on /dev/sdb4 for errors and (if possible) fix them
        e2fsck -f -y -v -C 0 /dev/sdb4
          e2fsck 1.42.9 (28-Dec-2013)
          e2fsck: No such file or directory while trying to open /dev/sdb4
          Possibly non-existent device?

This has been reproduced on CentOS 7.  Debugging shows that the
libparted calls used to re-read the partition details in
GParted_Core::calibrate_partition() lead to udev removing and re-adding
all the partition /dev entries for the disk.

    # udevadm monitor &
    # gpartedbin
    ...
    16.480662 +12.618659 calibrate_partition() calling get_device("/dev/sdb", lp_device) ...
    16.483644 +0.002982 calibrate_partition() get_device() returned
    16.483678 +0.000034 calibrate_partition() calling get_disk(lp_device, lp_disk) ...
    16.618113 +0.134435 calibrate_partition() get_disk() returned
    KERNEL[19275.707968] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
    16.618561 +0.000448 destroy_device_and_disk() calling ped_disk_destroy(lp_disk) ...
    16.618584 +0.000023 destroy_device_and_disk() ped_disk_destroy() returned
    16.618591 +0.000007 destroy_device_and_disk() calling ped_device_destroy(lp_device) ...
    16.618602 +0.000011 destroy_device_and_disk() ped_device_destroy() returned
    16.618687 +0.000085 calibrate_partition() return true
    16.618851 +0.000164 execute_command() e2fsck -f -y -v -C 0 /dev/sdb4
    KERNEL[19275.708389] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
    KERNEL[19275.708500] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
    KERNEL[19275.708643] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
    KERNEL[19275.768278] change   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
    KERNEL[19275.771171] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
    KERNEL[19275.771360] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
    KERNEL[19275.771542] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
    KERNEL[19275.775858] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
    UDEV  [19275.820153] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
    UDEV  [19275.823152] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
    UDEV  [19275.828275] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
    16.742735 +0.123884 execute_command() exit status 8
    UDEV  [19275.841425] remove   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
    UDEV  [19275.905478] change   /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
    UDEV  [19276.013580] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
    UDEV  [19276.034728] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
    UDEV  [19276.174840] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
    UDEV  [19276.237105] add      /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)

So exactly when GParted is running the external e2fsck command, udev is
in the middle of removing and re-adding all the /dev partition entries
for the disk.  Hence the above failure reporting that /dev/sdb4 didn't
exist.  The error depends on the timing between GParted running the
external file system specific command and udev removing and re-adding
the entries, so sometimes it works and sometimes it fails.

Further debugging showed that simply opening and closing the whole disk
device read-write triggers the same removing and re-adding of all the
partition /dev entries with udev >= 219.  Opening the whole disk device
read-write is what libparted has always done, until this post libparted
3.2 patch made it open read-only when probing:

    http://git.savannah.gnu.org/cgit/parted.git/commit/?id=44d5ae0115c4ecfe3158748309e9912c5aede92d
    libparted: Use read only when probing devices on linux (#1245144)

To fix this, simply wait for udev devices to settle in
calibrate_partition().  The longest I have seen udev take to do this is
0.80 seconds in a VM.  Wait up to 10 seconds, as is done in commit() ->
commit_to_os(), which is also called when applying operations.

On configurations which don't have this issue, executing udevadm settle
returns immediately and adds at most 0.1 seconds to the time taken for
the calibrate step.  This won't be noticed in the time taken shown in
the operation details, so there is no point in trying to avoid
executing udevadm settle when it is not needed.

Bug 762941 - Operations sometimes failing with: No such file or
             directory
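
The trigger described above (opening and closing the whole disk device read-write) could be exercised in isolation with something like the following. This is only a minimal sketch: open_close_read_write() is a hypothetical helper name, not a GParted or libparted function, and whether udev reacts depends on the udev version and rules in use.

```cpp
#include <cassert>
#include <fcntl.h>    // open(), O_RDWR
#include <unistd.h>   // close()

// Hypothetical helper (not part of GParted or libparted): open `path`
// read-write and close it again without reading or writing anything.
// Returns true only if both the open() and the close() succeeded.
bool open_close_read_write(const char *path)
{
    assert(path != nullptr);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return false;   // e.g. no such device, or insufficient permission

    return close(fd) == 0;
}
```

Calling it as root on a whole disk device such as /dev/sdb while udevadm monitor is running should, according to the commit message, show the same remove/add events on affected systems.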
parent a93a678a7b
commit fd9013d5f6
@@ -3521,6 +3521,12 @@ bool GParted_Core::calibrate_partition( Partition & partition, OperationDetail &
         destroy_device_and_disk( lp_device, lp_disk ) ;
     }
 
+    // (#762941) Above libparted partition querying triggers udev >= 219 to
+    // remove and re-add all the partition specific /dev/ entries.  Wait for
+    // this to complete to avoid FS specific commands failing because they
+    // happen to run just when the needed /dev/PTN entry doesn't exist.
+    settle_device( 10 );
+
     operationdetail.get_last_child().set_status( success ? STATUS_SUCCES : STATUS_ERROR );
     return success;
 }
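
The body of settle_device() is not part of this diff. Going only by the commit message, which says the wait is done by executing udevadm settle for up to 10 seconds, the idea can be sketched roughly as below. The names udev_settle_command() and wait_for_udev() are hypothetical and this is not GParted's actual implementation; the only external interface assumed is the udevadm settle --timeout=SECONDS command line.

```cpp
#include <cassert>
#include <cstdlib>   // std::system
#include <string>

// Hypothetical helper: build the udevadm command line that waits for
// the udev event queue to empty, giving up after `timeout_seconds`.
std::string udev_settle_command(int timeout_seconds)
{
    assert(timeout_seconds >= 0);
    return "udevadm settle --timeout=" + std::to_string(timeout_seconds);
}

// Hypothetical helper: run the command through the shell.  On POSIX
// systems std::system() returning 0 means the command was started and
// exited with status 0; any other value is treated here as "did not
// settle" (timeout, udevadm missing, and so on).
bool wait_for_udev(int timeout_seconds)
{
    const std::string command = udev_settle_command(timeout_seconds);
    return std::system(command.c_str()) == 0;
}
```

A caller would invoke wait_for_udev( 10 ) after the libparted device and disk objects have been destroyed and before launching the file system specific command, mirroring where settle_device( 10 ) sits in the hunk above.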