Background information about UDF: when creating a file system its
block size must match the underlying device's sector size. For
optical media like CDs and DVDs that is 2K. For hard drives that is
usually 512 bytes or 4K. However if a UDF file system has been copied
from a device with a different sector size the UDF block size won't
match the sector size. Linux will happily mount such a UDF file system.
Therefore the derived udf::get_filesystem_limits() will need access to
the file system block size when determining the size limits of an
existing UDF file system being resized and use the device sector size
when a new UDF file system is being created. All this can be queried
from an appropriate Partition object passed to get_filesystem_limits().
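The block size selection described above can be sketched as follows. This is only an illustration under stated assumptions: the reduced Partition and FS_Limits structs, the helper name udf_limits() and the two block count constants are hypothetical placeholders, not GParted's actual definitions or values.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for GParted's types; only the fields used here.
struct Partition
{
    std::int64_t sector_size;    // Sector size of the underlying device, in bytes
    std::int64_t fs_block_size;  // Block size of an existing file system, or -1 if none
};

struct FS_Limits
{
    std::int64_t min_size;  // Bytes
    std::int64_t max_size;  // Bytes
};

// Hypothetical block count limits, for illustration only.
const std::int64_t EXAMPLE_MIN_BLOCKS = 300;
const std::int64_t EXAMPLE_MAX_BLOCKS = (1LL << 32) - 1;

// Use the file system block size when resizing an existing file system,
// otherwise fall back to the device sector size for a new one.
FS_Limits udf_limits(const Partition& partition)
{
    const std::int64_t block_size = (partition.fs_block_size > 0)
                                    ? partition.fs_block_size
                                    : partition.sector_size;
    assert(block_size > 0);

    FS_Limits limits;
    limits.min_size = EXAMPLE_MIN_BLOCKS * block_size;
    limits.max_size = EXAMPLE_MAX_BLOCKS * block_size;
    return limits;
}
```

With this shape, a UDF file system copied from optical media onto a 512 byte sector drive would still have its limits computed from its own 2K blocks.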
All the calls to get_filesystem_limits() have an appropriate Partition
object available already, except in Win_GParted::activate_reformat()
when composing a format operation. Or more correctly
activate_reformat() constructs temp_ptn, a suitable Partition object,
including with fs_block_size member defaulting to -1 indicating not a
resize, but not until after the file system size limits had been checked
and get_filesystem_limits() called.
Therefore reorder the code in activate_reformat() so that the file system
size limits are checked after the wanted Partition object has been
created. No functional change with this commit.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
All the code has been switched to call get_filesystem_limits() and use
struct FS_Limits. Remove struct FS members .MIN & .MAX.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Change Dialog_Partition_New to use a fs_limits rather than struct FS
and .MIN and .MAX. No passing of struct FS_Limits required. Just use
the FILESYSTEMS vector of struct FS to provide the file system type and
look up its size limits each time the selection changes.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Refactor Win_GParted::activate_resize() to query the file system size
limits using the new get_filesystem_limits() method and pass those
limits into the dialog class as struct FS_Limits.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Changes the internal code in Dialog_Partition_Resize_Move to use
fs_limits instead of fs.MIN and fs.MAX. The limits are still passed
into the constructor via struct FS and its members .MIN and .MAX but
immediately used to assign to fs_limits.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Refactor Win_GParted::activate_paste() to query the file system size
limits using the new get_filesystem_limits() method and pass those
limits into the dialog class as struct FS_Limits.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Adds working copy fs_limits member into common Dialog_Base_Partition
class. Changes the internal code in Dialog_Partition_Copy class to use
fs_limits instead of fs.MIN and fs.MAX. The limits are still passed
into the constructor via object of struct FS and its members .MIN and
.MAX but immediately used to assign to the fs_limits member.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Duplicate the assignment of file system size limits into
struct FS_Limits, matching the fixed values currently assigned to
struct FS members .MIN and .MAX.
Bug 787204 - Minimum and maximum size of the UDF partition/disk
PATCH SET OVERVIEW:
Currently the supported actions of each file system and their size
limits are stored in struct FS objects. These are created by calling
file system specific derived implementations of
FileSystem::get_filesystem_support(). This happens when GParted is
started or when a rescan for supported actions is performed. The
file system size limits are expressed as a fixed number of bytes.
The maximum UDF file system size is specified in terms of file system
block size units. Also the file system block size must match the sector
size of the underlying device. Typically 2K for optical media and 512
bytes or 4K for hard drives.
Therefore GParted can't properly express the true UDF file system size
limits because they depend on the block size of an existing UDF file
system or the sector size of the device for new UDF file systems. In
fact other file systems such as EXT2/3/4 and XFS actually express their
maximum file system size in terms of numbers of file system blocks but
these tend to always be 4K and don't have to match the sector size of
the underlying device, so fixed byte values tend to suffice.
To update GParted for this, first separate file system size limits from
struct FS into struct FS_Limits and provide new
FileSystem::get_filesystem_limits() method to allow the limits to be
queried independently of the calls to get_filesystem_support().
Second, pass Partition objects and allow derived get_filesystem_limits()
implementations.
THIS PATCH:
Just creates a separate structure storing fixed value file system
minimum and maximum size limits along with getter method
get_filesystem_limits().
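A minimal sketch of what such a structure and getter might look like is given below. The Byte_Value typedef, the member names, the "0 means no limit" convention and the example_fs class are assumptions made for illustration; the source only states that the limits are fixed byte values returned by get_filesystem_limits().

```cpp
#include <cassert>
#include <cstdint>

typedef std::int64_t Byte_Value;  // Assumed byte count type

// Size limits kept separate from the supported actions data in struct FS.
struct FS_Limits
{
    Byte_Value min_size;  // Assumed convention: 0 => no minimum
    Byte_Value max_size;  // Assumed convention: 0 => no maximum

    FS_Limits() : min_size(0), max_size(0) {}
    FS_Limits(Byte_Value min, Byte_Value max) : min_size(min), max_size(max) {}
};

class FileSystem
{
public:
    virtual ~FileSystem() {}

    // Returns the stored fixed limits; virtual so a file system whose limits
    // depend on block size could later provide its own implementation.
    virtual FS_Limits get_filesystem_limits() const
    {
        assert(fs_limits.min_size >= 0 && fs_limits.max_size >= 0);
        return fs_limits;
    }

protected:
    FS_Limits fs_limits;
};

// Hypothetical file system with a fixed 16 MiB minimum and no maximum.
class example_fs : public FileSystem
{
public:
    example_fs() { fs_limits = FS_Limits(16 * 1024 * 1024, 0); }
};
```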
Bug 787204 - Minimum and maximum size of the UDF partition/disk
Those external tools were introduced in version 2.0 of udftools package
and can show or change UDF label, UDF uuid and can provide information
needed for counting total/free sectors.
Bug 792052 - Add support for changing UDF label/uuid and show disk usage
Attempt to grow a partition to more than twice its size. If committing
that change to the partition fails in such a way that the new larger
partition boundaries are not written to the disk drive then rolling back
will fail with libparted error:
Can't have overlapping partitions.
Example operation details:
Grow /dev/sdb8 from 1.00 GiB to 2.20 GiB
* calibrate /dev/sdb8 (SUCCESS)
* check file system on /dev/sdb8 for errors and (if poss...(SUCCESS)
* grow partition from 1.00 GiB to 2.20 GiB (ERROR)
* attempt to rollback failed change to the partition (ERROR)
original start: 7350272
original end: 9447423
original size: 2097152 (1.00 GiB)
* libparted messages (ERROR)
Can't have overlapping partitions.
What happened is that resize_move_partition() passed the new Partition
object to resize_move_partition_implement() as the source partition for
the rollback, and then called ped_disk_partition_by_sector() with a
sector in the middle to identify the partition to be changed. However
the new partition was never written to the drive so that middle sector
was outside the old smaller partition. Therefore libparted identified empty
space after the partition, rather than the partition itself, as the
intended target so when ped_disk_set_partition_geom() was called it
reported error "Can't have overlapping partitions" because it thought
another partition was being created with the same boundaries as the old
partition, rather than the boundaries of the old partition being
updated.
The same error also occurs when rolling back a failed partition change
as part of a move operation when the middle of the new partition falls
outside of the boundaries of the old partition.
Fix by making a temporary Partition object from the intersection of the
old and new partition boundaries just to be used to identify the
partition being changed to libparted. As this is only rolling back a
single step adjusting the partition boundaries as part of a resize
and/or move operation, the old and new partition boundaries must
intersect (and in fact that intersection contains the file system data).
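The arithmetic behind the failure and the fix can be sketched as below. The Range struct and helper names are hypothetical, and the new end sector used in the usage note (11964005, roughly 2.20 GiB from the same start) is an assumed value, since the operation details above only record the original boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

typedef std::int64_t Sector;

struct Range
{
    Sector start;
    Sector end;  // Inclusive
};

// Overlap of two sector ranges; the caller must ensure they intersect.
Range intersection(const Range& a, const Range& b)
{
    Range r = { std::max(a.start, b.start), std::min(a.end, b.end) };
    assert(r.start <= r.end);
    return r;
}

// Sector in the middle of a range, as used to identify a partition.
Sector middle(const Range& r)
{
    return r.start + (r.end - r.start) / 2;
}

bool contains(const Range& r, Sector s)
{
    return r.start <= s && s <= r.end;
}
```

For the old partition 7350272 to 9447423 and an assumed new partition 7350272 to 11964005, middle() of the new range is 9657138, which lies beyond the old end, while middle() of the intersection is 8398847, which lies inside the partition actually on disk.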
Bug 791875 - Rollback specific failed partition change steps
The general rule is that:
1) For a partition change step BEFORE a file system change step,
rollback on failure;
2) For a partition change step AFTER a file system change step, don't
rollback on failure.
Examining every case where resize_move_partition() is called and whether
rollback on failure is wanted or not:
* In resize_move()
Resize / move extended partition. No associated file system change.
NO ROLLBACK
Just keep the possibly applied operation.
* #1 in move()
Making all encompassing partition before moving file system.
ROLLBACK
To restore partition boundaries back to those of the file system.
* #2 in move()
Recreating original partition boundaries after file system move
failed or was cancelled and has been rolled back.
NO ROLLBACK
To keep updated partition boundaries to match restored file system
data.
* #3 in move()
Replacing all encompassing partition with final partition after
successful file system move.
NO ROLLBACK
Keep new partition boundaries to match moved file system.
* #1 in resize_encryption()
Making the partition larger before growing closed LUKS encrypted
data.
ROLLBACK
Restore partition boundaries back to those of the closed LUKS
encrypted data.
* #2 in resize_encryption()
Shrinking the partition after open LUKS mapping has been shrunk, but
before swap is re-created (smaller).
NO ROLLBACK
Difficult case because the partition shrink is in the middle of a
LUKS shrink and a swap shrink (re-create). If swap was actually
shrunk like other types of file system, rather than re-created, then
the operation sequence would be (1) shrink swap, (2) shrink LUKS
encryption, (3) shrink partition. In this hypothetical case and the
actual case no rollback is preferred to try to keep the new
partition boundaries matching the shrunk open LUKS encryption mapping.
* #3 in resize_encryption()
Grow the partition before growing open LUKS mapping and re-creating
swap larger.
ROLLBACK
Restore partition boundaries back to those of the smaller open LUKS
encryption mapping.
* #4 in resize_encryption()
Shrink the partition after shrinking the file system and open LUKS
encryption mapping.
NO ROLLBACK
Keep new smaller partition boundaries to match shrunk encrypted file
system.
* #5 in resize_encryption()
Grow the partition before growing the open LUKS encryption mapping
and file system.
ROLLBACK
Restore partition boundaries back to those of the not yet grown
encrypted file system.
* #1 in resize_plain()
Resize partition before re-creating swap a different size.
ROLLBACK
Restore partition boundaries back to those of the not yet resized
(re-created) swap space.
* #2 in resize_plain()
Shrink partition after shrinking the file system.
NO ROLLBACK
Keep new smaller partition boundaries to match shrunk file system.
* #3 in resize_plain()
Grow partition before growing the file system.
ROLLBACK
Restore partition boundaries back to those of the not yet grown
file system.
Removes the default value from the rollback_on_fail parameter so
rollback or not has to be explicitly specified for every call of
resize_move_partition().
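The effect of an explicit rollback_on_fail argument can be sketched like this. It is a simplified illustration, not the real function: the name resize_move_partition_sketch() and the two callbacks standing in for the libparted work are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Apply a partition change and, only if requested, try to restore the old
// boundaries when applying it fails.  There is deliberately no default value
// for rollback_on_fail, so every caller has to state its choice.
bool resize_move_partition_sketch(const std::function<bool()>& apply_change,
                                  const std::function<bool()>& restore_old,
                                  bool rollback_on_fail)
{
    assert(apply_change && restore_old);

    if (apply_change())
        return true;

    if (rollback_on_fail)
        restore_old();  // Best effort; the step has failed either way

    return false;
}
```

A caller growing a partition before growing its file system would pass true, while a caller shrinking a partition after the file system has already been shrunk would pass false.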
Bug 791875 - Rollback specific failed partition change steps
Even after implementing a fix for bug 790418 "Unable to inform the
kernel of the change message may lead to corrupted partition table"
GParted/libparted can still encounter errors informing the kernel of the
new partition layout. This has been seen with GParted on CentOS 7 with
libparted 3.1.
In such a case the partition has been successfully written to the disk
but just informing the kernel failed. This is a problem because when a
partition is being moved in advance of a file system move step, failure
to inform the kernel leaves the partition boundaries not matching the on
disk limits of the file system. For a move to the left this leaves the
partition reported as unknown, apparently losing the user's data.
For example start with a 512 MiB partition containing an XFS file
system. This is recognised by blkid and parted, hence also by GParted.
# blkid /dev/sdb1
/dev/sdb1: UUID=... TYPE="xfs" PARTUUID="37965980-01"
# parted /dev/sdb unit s print
Model: ATA VBOX HARDDISK (scsi)
Disk /dev/sdb: 16777216s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1048576s 2097151s 1048576s primary xfs
Now move the partition 100 MiB to the left and have it fail to inform
the kernel after the first partition change step. Operation details:
Move /dev/sdb1 to the left (ERROR)
* calibrate /dev/sdb1 (SUCCESS)
* check file system on /dev/sdb1 for errors and (if poss...(SUCCESS)
* grow partition from 512.00 MiB to 612.00 MiB (ERROR)
old start: 1048576
old end: 2097151
old size: 1048576 (512.00 MiB)
requested start: 843776
requested end: 2097151
requested size: 1253376 (612.00 MiB)
* libparted messages (ERROR)
Error informing the kernel about modifications to partition
/dev/sdb1 -- Device or resource busy. This means Linux won't
know about any changes you made to /dev/sdb1 until you reboot
-- so you shouldn't mount it or use it in any way before
rebooting. Failed to add partition 1 (resource temporarily
unavailable)
Now because the start of the partition is 100 MiB before the start of
the file system, the file system is no longer recognised, and apparently
the user's data has been lost.
# blkid /dev/sdb1
/dev/sdb1: PARTUUID="37965980-01"
# parted /dev/sdb unit s print
...
Number Start End Size Type File system Flags
1 843776s 2097151s 1253376s primary
It doesn't matter why updating the partition failed, even if it was
because of an error writing to the disk. Rollback of the change to the
partition should be attempted. The worst case scenario is that rollback
of the change fails, which is equivalent to how the code worked
before this patch set.
However in other cases where the partition boundaries are being updated
after a file system move or shrink step then the partition should be
updated to match the new location of the file system itself. And no
rollback is wanted. If the failure was only informing the kernel then
in fact the partition has actually been updated on disk after all.
So each partition resize/move step needs examining on a case by case
basis to decide if rolling back the change to the partition is wanted or
not.
This patch only adds partition change rollback into
resize_move_partition(). Rollback remains disabled until all cases are
examined in the following patch.
Bug 791875 - Rollback specific failed partition change steps
Extract common code which updates a DMRaid device mapper entry into a
sub-function. This will also be needed when adding rollback of a
partition change on failure.
Bug 791875 - Rollback specific failed partition change steps
Extract the code which actually implements the partition change into a
sub-function ready for adding rollback of the change on failure.
Bug 791875 - Rollback specific failed partition change steps
This is not required, but it is more logical to have an OperationDetail
object created and its final status set in the same function rather
than split between caller and callee. So move creation of
"copy %1 using a block size of %2" OperationDetail objects into
GParted_Core::copy().
Also introduces a couple of variables to remove some recomputation:
benchmark_od & remaining_length.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Performing a copy or move operation which uses GParted's internal copy
routine triggered the new GParted bug message. Example operation
details:
Copy /dev/sdb8 to /dev/sdb (start at 4.51 GiB) (SUCCESS)
* calibrate /dev/sdb8 (SUCCESS)
* check file system on /dev/sdb8 for errors and (if possib...(SUCCESS)
* create empty partition (SUCCESS)
* set partition type on /dev/sdb9 (SUCCESS)
* copy file system from /dev/sdb8 to /dev/sdb9 (SUCCESS)
using internal algorithm
copy 1.00 GiB
* finding optimal block size
* copy 16.00 MiB using a block size of 1.00 MiB (SUCCESS)
16.00 MiB of 16.00 MiB copied
GParted Bug: Adding more information to the result...(WARNING)
0.797269 seconds
* copy 16.00 MiB using a block size of 2.00 MiB (SUCCESS)
* copy 16.00 MiB using a block size of 4.00 MiB (SUCCESS)
* copy 16.00 MiB using a block size of 8.00 MiB (SUCCESS)
* copy 16.00 MiB using a block size of 16.00 MiB (SUCCESS)
optimal block size is 1.00 MiB
* copy 944.00 MiB using a block size of 1.00 MiB (SUCCESS)
This is because when performing the initial benchmarking copies the time
taken by each copy is added to the operation detail results in the
calling GParted_Core::copy_blocks() after the final status was set in
CopyBlocks::copy() with set_success_and_capture_errors(). Fix by
setting the final status in the parent function after adding the time to
the benchmark copies.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
To be consistent with all previous bug messages being translatable.
Also only mark the bug as a warning instead of an error because the bug
doesn't cause any disk drive operations to fail.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
There is still another subtle issue. When GParted_Core::commit() closes
the device, the kernel initiates a second set of events which removes
and re-adds the partitions again. Need to wait for these to complete
to prevent any following step failing with missing partition device
nodes.
Bug 790418 - "Unable to inform the kernel of the change" message may
lead to corrupted partition table
Operations involving modifications to a partition are sometimes failing
with a libparted error informing the kernel about modifications to
partitions. For example I encountered these errors when just creating a
fourth partition on CentOS 7 in a VirtualBox VM. Operation results:
Create Primary Partition #1 (ext4, 4.73 GiB) on /dev/sdb (ERROR)
* create empty partition (ERROR)
* libparted messages (ERROR)
* Error informing the kernel about modifications to partition
/dev/sdb1 -- Device or resource busy. This means Linux won't
know about any changes you made to /dev/sdb1 until you reboot
-- so you shouldn't mount it or use it in any way before
rebooting.
* Failed to add partition 1 (Resource temporarily unavailable)
Those two libparted messages were presented in "Libparted Error" dialogs
and [Cancel] was selected both times.
Libparted Error
(-) Error informing the kernel about modifications to partition
/dev/sdb1 -- Device or resource busy. This means Linux won't
know about any changes you made to /dev/sdb1 until you reboot --
so you shouldn't mount it or use it in any way before rebooting.
[ Cancel ] [ Ignore ]
Libparted Error
(-) Failed to add partition 1 (Resource temporarily unavailable)
[ Retry ] [ Cancel ]
This is the edited output showing GParted print debugging, stracing of
GParted and monitoring of udev events for this case.
# ./gpartedbin /dev/sdb
======================
libparted : 3.1
======================
...
24.541604 +23.923435 create_partition() start (new_partition, optdet, min_size=0) new_partition.device_path="/dev/sdb"
24.556101 +0.014497 create_partition() type=PED_PARTITION_NORMAL
24.556354 +0.000253 commit() start (lp_disk) lp_disk->dev->path="/dev/sdb"
D: strace pid 18054. Press [Return] to continue.
^Z
[1]+ Stopped ./gpartedbin /dev/sdb
# udevadm monitor &
[2] 18124
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent
# strace -p 18054 -e open,ioctl,write,close &
[3] 18129
strace: Process 18054 attached
# fg %1
./gpartedbin /dev/sdb
128.175811 +103.619457 commit() calling ped_disk_commit_to_dev(lp_disk) ...
open("/dev/sdb", O_RDWR) = 6
ioctl(6, BLKFLSBUF) = 0
write(6, "\372\270\0\20\216\320\274\0\260\270\0\0\216\330\216\300\373\276\0|\277\0\6\271\0\2\363\244\352!\6\0"..., 512) = 512
ioctl(6, BLKFLSBUF) = 0
close(6)
128.181352 +0.005542 commit() ped_disk_commit_to_dev(lp_disk) returned true
128.181475 +0.000122 commit_to_os() start (lp_disk, timeout=10) lp_disk->dev->path="/dev/sdb"
128.181527 +0.000052 commit_to_os() calling ped_disk_commit_to_os(lp_disk) ...
open("/dev/sdb", O_RDWR) = 6
ioctl(6, BLKFLSBUF) = 0
open("/sys/block/sdb/ext_range", O_RDONLY) = 7
close(7) = 0
KERNEL[1158935.380543] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
KERNEL[1158935.380565] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[1158935.380578] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=1, devname="", volname=""}}) = -1 ENXIO (No such device or address)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=2, devname="", volname=""}}) = -1 ENXIO (No such device or address)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=3, devname="", volname=""}}) = -1 ENXIO (No such device or address)
...
KERNEL[1158935.380977] change /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
KERNEL[1158935.381296] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
KERNEL[1158935.381367] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[1158935.381432] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
KERNEL[1158935.382992] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=62, devname="", volname=""}}) = -1 ENXIO (No such device or address)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=63, devname="", volname=""}}) = -1 ENXIO (No such device or address)
ioctl(6, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, data={start=0, length=0, pno=64, devname="", volname=""}}) = -1 ENXIO (No such device or address)
ioctl(6, BLKPG, {BLKPG_ADD_PARTITION, flags=0, datalen=152, data={start=1048576, length=1073741824, pno=1, devname="/dev/sdb1", volname=""}}) = -1 EBUSY (Device or resource busy)
write(2, "Error informing the kernel about"..., 251) = 251
Error informing the kernel about modifications to partition
/dev/sdb1 -- Device or resource busy. This means Linux won't know
about any changes you made to /dev/sdb1 until you reboot -- so you
shouldn't mount it or use it in any way before rebooting.
UDEV [1158935.384641] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
UDEV [1158935.390203] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
UDEV [1158935.390243] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [1158935.462866] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [1158935.469207] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [1158935.471512] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
UDEV [1158935.492173] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
write(2, "Failed to add partition 1 (Resou"..., 60) = 60
Failed to add partition 1 (Resource temporarily unavailable)
close(6)
KERNEL[1158955.730960] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
KERNEL[1158955.731095] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[1158955.731314] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
KERNEL[1158955.731397] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
KERNEL[1158955.731817] change /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
KERNEL[1158955.731981] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
KERNEL[1158955.732166] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[1158955.732232] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
KERNEL[1158955.733955] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
148.533154 +20.351627 commit_to_os() ped_disk_commit_to_os(lp_disk) returned false
UDEV [1158955.738262] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
UDEV [1158955.738460] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [1158955.738525] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
148.537648 +0.004494 execute_command() udevadm settle --timeout=10
UDEV [1158955.740864] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [1158955.760192] change /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
UDEV [1158955.801211] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [1158955.815262] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [1158955.815314] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
UDEV [1158955.828134] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
148.630797 +0.093149 execute_command() exit status 0
148.630882 +0.000085 commit_to_os() return false
D: stop strace pid 18054. Press [Return] to continue.
^Z
[1]+ Stopped ./gpartedbin /dev/sdb
# kill %3
strace: Process 18054 detached
[3]- Done strace -p 18054 -e open,ioctl,write,close
# kill %2
[2] Done udevadm monitor
# fg %1
./gpartedbin /dev/sdb
173.700143 +25.069261 commit() return false
173.700470 +0.000327 create_partition() return false
What happens is that GParted calls ped_disk_commit_to_dev() which opens
the device, writes the updated partition table and closes the device.
When the device closes the kernel initiates asynchronous uevents and
user space udev rules which remove and re-add all the partitions. In
the meantime GParted calls ped_disk_commit_to_os() to inform the kernel
of the changes to the partition table. This involves opening the
device, using ioctl() to remove all possible partitions [1] and re-add
needed partitions. It finds partitions 1 to 3 already removed and
accepts this along with all other non-existent partitions up to 64.
When it tries to re-add partition 1 the ioctl() BLKPG_ADD_PARTITION call
returns EBUSY, presumably because the partition is in use by udev which
is in the process of running the user space rules associated with
removing and re-adding it. Then ped_disk_commit_to_os() closes the
device which initiates a second round of asynchronous uevents and user
space udev rules removing and re-adding all the partitions again.
So in summary the kernel and udev are removing and re-adding the
partitions exactly when libparted is trying to do exactly the same
thing!
[1] The algorithm in libparted 3.1 is to try to remove all possible
partitions, 64 for this kernel, followed by re-adding the needed
partitions.
parted/libparted/arch/linux.c:_disk_sync_part_table()
http://git.savannah.gnu.org/cgit/parted.git/tree/libparted/arch/linux.c?h=v3.1#n2541
Partprobe has had exactly the same issue with failing to inform the
kernel about modifications to the partition table [2]. This was fixed
in libparted post v3.2 release by this commit [3].
[2] rhbz#1339705 - ceph-disk prepare: Error: partprobe /dev/vdb failed :
Error: Error informing the kernel about modifications to partition
/dev/vdb1 -- Device or resource busy.
https://bugzilla.redhat.com/show_bug.cgi?id=1339705
[3] partprobe: Open the device once for probing
Previously there were 3 open/close pairs for the device, which may
result in triggering extra udev actions. Instead, open it once at
the start of process_dev and close it at the end.
http://git.savannah.gnu.org/cgit/parted.git/commit/?id=cfafa4394998a11f871a0f8d172b13314f9062c2
Implement the same fix as implemented for partprobe. Hold a file handle
open which libparted can use internally to avoid having to open() and
close() the device itself twice, once for each of the calls
ped_disk_commit_to_dev() and ped_disk_commit_to_os(). This avoids the
first close() initiating the kernel and udev to remove and re-add the
partitions exactly when ped_disk_commit_to_os() is trying to do the same
thing.
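The idea can be illustrated with a small mock, which does not use libparted at all. It assumes that a device already held open by the caller is reference counted, so nested open and close pairs do not really close it; MockDevice and the stand-in commit functions are hypothetical names for this sketch.

```cpp
#include <cassert>

// Mock device which counts how often it is really closed.  Nested open()
// calls only increase a reference count (an assumption about how the
// library treats a device that the caller already holds open).
class MockDevice
{
public:
    MockDevice() : open_count(0), real_closes(0) {}

    void open() { ++open_count; }

    void close()
    {
        assert(open_count > 0);
        if (--open_count == 0)
            ++real_closes;  // A real close is what triggers the uevents
    }

    int open_count;
    int real_closes;
};

// Stand-ins for the two commit calls, each opening and closing internally.
void commit_to_dev(MockDevice& dev) { dev.open(); /* write table */   dev.close(); }
void commit_to_os(MockDevice& dev)  { dev.open(); /* inform kernel */ dev.close(); }

// Hold the device open across both calls so there is only one real close,
// and it happens after the kernel has been informed.
int commit_holding_open(MockDevice& dev)
{
    dev.open();
    commit_to_dev(dev);
    commit_to_os(dev);
    dev.close();
    return dev.real_closes;
}
```

Calling the two commit stand-ins back to back produces two real closes, the first of them between the calls, whereas commit_holding_open() produces a single real close at the very end.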
Bug 790418 - "Unable to inform the kernel of the change" message may
lead to corrupted partition table
Removal of these includes was missed by this earlier refactoring
commit which reduced direct access to the single ProgressBar object:
b1313281bd
Simplify use of the progress bar via OperationDetail (#760709)
All libparted messages were reported as informational, even for a step
which failed. Instead identify libparted messages as either
informational or errors depending on whether this step was successful
or not respectively.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Resizing any unmounted file system which has to be mounted to be resized
triggered the new GParted bug message. However the operation did
complete successfully. Example operation details:
Grow /dev/sdb8 from 1.00 GiB to 1.50 GiB (SUCCESS)
* calibrate /dev/sdb8 (SUCCESS)
* check file system on /dev/sdb8 for errors and (if possib...(SUCCESS)
* grow partition from 1.00 GiB to 1.50 GiB (SUCCESS)
* grow file system to fill the partition (SUCCESS)
* mkdir -v /tmp/gparted-wvH0Ez (SUCCESS)
* GParted Bug: Adding another child after no_more_chil...(ERROR)
* Created directory /tmp/gparted-wvH0Ez
* mount -v -t btrfs '/dev/sdb8' '/tmp/gparted-wvH0Ez' (SUCCESS)
* btrfs filesystem resize 1:max '/tmp/gparted-wvH0Ez' (SUCCESS)
* umount -v '/tmp/gparted-wvH0Ez' (SUCCESS)
* rmdir -v /tmp/gparted-wvH0Ez (SUCCESS)
* GParted Bug: Adding another child after no_more_chil...(ERROR)
* Removed directory /tmp/gparted-wvH0Ez
This is because set_success_and_capture_errors() was called first and
the child details added after. Reverse this ordering to fix.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Transition all remaining code, DMRaid and file system code, to use the
new method of reporting success of a step and automatic error
collection. None of this code calls libparted so can't generate any
libparted exceptions. This is just for consistency so all code follows
the same pattern using set_success_and_capture_errors().
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Refactor nested if-then-else into a sequence of if-fail-return-early checks.
Makes the code simpler to understand and converts separate
OperationDetail::set_status() calls for success or error into a single
call using ternary conditional matching how it is or was done everywhere
else. This is also ready for status and error capture refactoring.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Transition GParted block copying code and partition manipulation code,
which uses libparted API, to the new method of reporting success of a
step and automatic error collection. Libparted exceptions are now
reported with the step at which they occurred.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Just copies the callback into each newly added child detail. As there
are no more uses of set_success_and_capture_errors() yet, libparted
errors are still only captured once at the top-level of each operation.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Replace the explicit adding of libparted exception messages with a
callback to do it instead, and fire the callback just once per operation
by only changing the very top-level OperationDetail to use the new
set_success_and_capture_errors(). Therefore this still produces exactly
the same operation details with libparted messages at the end of each
operation.
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
All code implementing a step of an operation follows this pattern:
    od.add_child(OperationDetail("Step heading"));
    od.get_last_child().add_child(OperationDetail("More details"));
    // Do step
    success = ...
    od.get_last_child().set_status(success ? STATUS_SUCCESS
                                           : STATUS_ERROR);
At this point any libparted messages reported via exceptions need to be
added into the OperationDetail tree. Also adding further children into
the tree after collecting those errors needs to be prohibited (as much
as the previous patch prohibited it).
Add a new method to replace the final set_status() call above. It sets
the status, captures the errors and flags that further children
shouldn't be added:
    ...
    od.get_last_child().set_success_and_capture_errors(success);
It emits a callback to capture the errors to provide flexibility and so
that the OperationDetail class doesn't have to get into the details of
how GParted_Core saves libparted exception messages.
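A rough, self-contained sketch of the idea follows. It is an assumption
for illustration, not the actual implementation: std::function stands in
for whatever callback mechanism the real class uses, and the members
capture_errors and no_more_children plus the pending_messages store and
capture_pending_messages() are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum OperationDetailStatus { STATUS_NONE, STATUS_SUCCESS, STATUS_ERROR, STATUS_INFO };

// Simplified stand-in for OperationDetail (assumption, not the real class).
class OperationDetail
{
public:
    explicit OperationDetail(const std::string& text = "",
                             OperationDetailStatus st = STATUS_NONE)
        : description(text), status(st) {}

    void add_child(const OperationDetail& child)
    {
        children.push_back(child);
        // Copy the callback into each newly added child detail.
        children.back().capture_errors = capture_errors;
    }

    OperationDetail& get_last_child() { return children.back(); }

    void set_status(OperationDetailStatus st) { status = st; }

    // Set the status, capture the errors via the callback and flag that
    // further children shouldn't be added (the flag is only recorded here).
    void set_success_and_capture_errors(bool success)
    {
        set_status(success ? STATUS_SUCCESS : STATUS_ERROR);
        if (capture_errors)
            capture_errors(*this);
        no_more_children = true;
    }

    std::string description;
    OperationDetailStatus status;
    std::vector<OperationDetail> children;
    std::function<void(OperationDetail&)> capture_errors;
    bool no_more_children = false;
};

// Hypothetical stand-in for how the core saves libparted exception messages.
std::vector<std::string> pending_messages;

// Callback: move any saved messages into the tree as the last child.
void capture_pending_messages(OperationDetail& od)
{
    if (pending_messages.empty())
        return;
    od.add_child(OperationDetail("libparted messages", STATUS_INFO));
    for (const std::string& msg : pending_messages)
        od.get_last_child().add_child(OperationDetail(msg));
    pending_messages.clear();
}

// Usage following the pattern described in this message.
void demo_step()
{
    OperationDetail od("Operation");
    od.capture_errors = capture_pending_messages;

    od.add_child(OperationDetail("Step heading"));
    od.get_last_child().add_child(OperationDetail("More details"));
    pending_messages.push_back("made-up libparted message");  // "Do step"
    bool success = true;
    od.get_last_child().set_success_and_capture_errors(success);

    assert(od.get_last_child().children.back().description == "libparted messages");
}
```

Keeping the message store behind a callback means the tree class only
needs to know when to ask for errors, not where they come from.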
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
Want functionality to prevent further child details being added to an
OperationDetail. This is so that the captured libparted error messages
are always the last child in the list, and more details (at that point
in the tree) can't be added.
For example we want GParted to report like this:
    Move /dev/sdb3 to the right and shrink it from 1.14 GiB to...(SUCCESS)
    ...
    * shrink partition from 1.14 GiB to 1.00 GiB (SUCCESS)
      * old start: 4464640
        old end: 6856703
        old size: 2392064 (1.14 GiB)
      * new start: 4464640
        new end: 6561791
        new size: 2097152 (1.00 GiB)
      * libparted messages (INFO)
        * DEBUG: GParted generated synthetic libparted excepti...
and not like this:
    Move /dev/sdb3 to the right and shrink it from 1.14 GiB to...(SUCCESS)
    ...
    * shrink partition from 1.14 GiB to 1.00 GiB (SUCCESS)
      * old start: 4464640
        old end: 6856703
        old size: 2392064 (1.14 GiB)
      * libparted messages (INFO)
        * DEBUG: GParted generated synthetic libparted excepti...
      * new start: 4464640
        new end: 6561791
        new size: 2097152 (1.00 GiB)
However, actually preventing the addition of more child details would
stop users seeing information they should see. So instead just report a
bug message into the operation details. This doesn't stop anything, but
the bug message will be seen and will allow us to fix GParted.
So far nothing is enforced. This patch just adds the mechanism to
report a bug when a new child detail is added when prohibited.
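A minimal sketch of such a mechanism, under stated assumptions, is shown
below. The class is a stripped-down stand-in, and the names
prohibit_more_children() and no_more_children as well as the wording of
the bug message are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stripped-down stand-in for OperationDetail (assumption, not the real class).
class OperationDetail
{
public:
    explicit OperationDetail(const std::string& text = "") : description(text) {}

    void add_child(const OperationDetail& child)
    {
        if (no_more_children)
        {
            // Don't refuse the child, users should still see it.
            // Report a visible bug message into the details instead.
            children.push_back(OperationDetail(
                "GParted Bug: child detail added when prohibited"));
        }
        children.push_back(child);
    }

    // Nothing calls this yet in the sketch's scenario until errors are captured.
    void prohibit_more_children() { no_more_children = true; }

    std::string description;
    std::vector<OperationDetail> children;

private:
    bool no_more_children = false;
};

// Usage: the late child is still added, preceded by the bug message.
void demo_bug_report()
{
    OperationDetail od("shrink partition from 1.14 GiB to 1.00 GiB");
    od.add_child(OperationDetail("old start: 4464640"));
    od.prohibit_more_children();
    od.add_child(OperationDetail("new start: 4464640"));

    assert(od.children.size() == 3);
    assert(od.children[2].description == "new start: 4464640");
}
```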
Bug 790842 - Report libparted messages into operation details at the
point at which they occur
PATCH SET SUMMARY:
Libparted exception messages are reported into the operation details at
the end of each separate operation. For operations which involve
multiple steps of partition manipulation there is no way to identify
which exceptions occurred with which steps.
Example resize/move operation in which multiple libparted exceptions
were raised:
    Move /dev/sdb to the right and shrink it from 1.15 GiB to ...(ERROR)
    * calibrate /dev/sdb3 (SUCCESS)
    * check file system on /dev/sdb3 for errors and (if possib...(SUCCESS)
      * e2fsck -f -y -v -C 0 '/dev/sdb3' (SUCCESS)
    * shrink file system (SUCCESS)
      * resize2fs -p '/dev/sdb3' 1048576K (SUCCESS)
    * shrink partition from 1.14 GiB to 1.00 GiB (SUCCESS)
    * check file system on /dev/sdb3 for errors and (if possib...(SUCCESS)
      * e2fsck -f -y -v -C 0 '/dev/sdb3' (SUCCESS)
    * grow partition from 1.00 GiB to 1.12 GiB (SUCCESS)
    * move file system to the right (SUCCESS)
      * e2image -ra -p -O 134217728 '/dev/sdb3' (SUCCESS)
    * shrink partition from 1.12 GiB to 1.00 GiB (ERROR)
    * libparted messages (INFO)
      * DEBUG: GParted generated synthetic libparted exception...
      * Error informing the kernel about modifications to part...
      * Error informing the kernel about modifications to part...
      * DEBUG: GParted generated synthetic libparted exception...
      * DEBUG: GParted generated synthetic libparted exception...
But there is no way to know which of the 4 libparted steps, 1 calibrate
and 3 partition resizes, encountered which exceptions.
Fix this by reporting the libparted messages into the operation details
at the point at which they occur. Then the above example would become:
    Move /dev/sdb to the right and shrink it from 1.15 GiB to ...(ERROR)
    * calibrate /dev/sdb3 (SUCCESS)
    * check file system on /dev/sdb3 for errors and (if possib...(SUCCESS)
      * e2fsck -f -y -v -C 0 '/dev/sdb3' (SUCCESS)
    * shrink file system (SUCCESS)
      * resize2fs -p '/dev/sdb3' 1048576K (SUCCESS)
    * shrink partition from 1.14 GiB to 1.00 GiB (SUCCESS)
      * libparted messages (INFO)
        * DEBUG: GParted generated synthetic libparted excepti...
    * check file system on /dev/sdb3 for errors and (if possib...(SUCCESS)
      * e2fsck -f -y -v -C 0 '/dev/sdb3' (SUCCESS)
    * grow partition from 1.00 GiB to 1.12 GiB (SUCCESS)
      * libparted messages (INFO)
        * Error informing the kernel about modifications to pa...
        * Error informing the kernel about modifications to pa...
        * DEBUG: GParted generated synthetic libparted excepti...
    * move file system to the right (SUCCESS)
      * e2image -ra -p -O 134217728 '/dev/sdb3' (SUCCESS)
    * shrink partition from 1.12 GiB to 1.00 GiB (ERROR)
      * libparted messages (ERROR)
        * DEBUG: GParted generated synthetic libparted excepti...
THIS PATCH:
Small change so that setting the status of an OperationDetail to N/A,
warning, also stops the execution timer if it was running, matching
what happens when the status is set to either success or error.
This avoids having to set the status twice when reporting a warning:
the first time just to stop the timer and the second time to set the
desired status.
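The intent can be sketched roughly as below. This is a simplified
assumption of how such a status setter might be written, not the actual
code: the enum values, the time_start and time_elapsed members and the
timer_running() and elapsed() accessors are hypothetical names.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical status values for illustration only.
enum OperationDetailStatus
{
    STATUS_NONE, STATUS_EXECUTE, STATUS_SUCCESS, STATUS_ERROR, STATUS_N_A
};

// Simplified stand-in for OperationDetail (assumption, not the real class).
class OperationDetail
{
public:
    void set_status(OperationDetailStatus new_status)
    {
        switch (new_status)
        {
            case STATUS_EXECUTE:
                // Start the execution timer.
                time_elapsed = -1;
                time_start = std::time(nullptr);
                break;
            case STATUS_SUCCESS:
            case STATUS_ERROR:
            case STATUS_N_A:  // Warning: now stops the timer too.
                if (time_start != -1)
                {
                    time_elapsed = std::time(nullptr) - time_start;
                    time_start = -1;
                }
                break;
            default:
                break;
        }
        status = new_status;
    }

    bool timer_running() const { return time_start != -1; }
    std::time_t elapsed() const { return time_elapsed; }
    OperationDetailStatus get_status() const { return status; }

private:
    OperationDetailStatus status = STATUS_NONE;
    std::time_t time_start = -1;    // -1 means the timer is not running
    std::time_t time_elapsed = -1;  // -1 means no elapsed time recorded
};

// Usage: a single call is enough to stop the timer and report the warning.
void demo_warning_stops_timer()
{
    OperationDetail od;
    od.set_status(STATUS_EXECUTE);
    assert(od.timer_running());
    od.set_status(STATUS_N_A);
    assert(!od.timer_running());
}
```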
Bug 790842 - Report libparted messages into operation details at the
point at which they occur