Background
GParted stored a list of paths for Device and Partition objects. It
sorted this list [1][2] and treated the first specially as that is what
get_path() returned and was used almost everywhere; with the file system
specific tools, looked up in various *_Info caches, etc.
[1] Device::add_path(), ::add_paths()
[2] Partition::add_path(), ::add_paths()
Mount point display [3] was the only bit of GParted which really worked
with the path list. Busy file system detection [4] just used the path
provided by libparted, or for LUKS /dev/mapper/* names. It checked that
single path against the mounted file systems found from /proc/mounts,
expanded with additional block device names when symlinks were
encountered.
[3] GParted_Core::set_mountpoints() -> set_mountpoints_helper()
[4] GParted_Core::set_device_partitions() -> is_busy()
GParted_Core::set_device_one_partition() -> is_busy()
GParted_Core::set_luks_partition() -> is_busy()
Having the first path, by sort order, treated specially by being used
everywhere and virtually ignoring the others was wrong, complicated to
remember and difficult to code with. As all the additional paths were
virtually unused and made no difference, remove them. The "improved
detection of mountpoins, free space, etc.." benefit from commit [5]
doesn't seem to exist. Therefore simplify to a single path for Device
and Partition objects.
[5] commit 6d8b169e73
changed the way devices and partitions store their device paths.
Instead of holding a 'realpath' and a symbolic path we store paths
in a list. This allows for improved detection of mountpoins, free
space, etc..
This patch
Simplify the Device object from a vector of paths to a single path.
Remove add_paths() and get_paths() methods. Keep add_path() and
get_path() for now.
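
As a rough illustration only, the simplified shape of the class might
look like the sketch below. This is a stripped-down stand-in, not the
real Device class: the member name m_path is hypothetical, and having
add_path() overwrite the stored path is an assumption.

```cpp
#include <string>

// Hypothetical, stripped-down stand-in for GParted's Device class.
class Device
{
public:
    // Kept for now.  Assumed here to simply overwrite the single path.
    void add_path(const std::string& path) { m_path = path; }

    // Returns the one and only path held by the object.
    const std::string& get_path() const { return m_path; }

private:
    std::string m_path;  // Previously a sorted vector of paths
};
```

With a single member there is no longer a "first path by sort order";
whatever was stored last is what get_path() returns.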
Bug 767842 - File system usage missing when tools report alternate block
device names
Recognise GRUB2 core.img boot code written to a partition without a file
system. Such setups are possible/likely with GPT partitioned disks as
there is a specific partition type reserved for it [1][2]:
21686148-6449-6E6F-744E-656564454649 (BIOS Boot partition)
[1] GUID Partition Table, Partition types
https://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs
[2] BIOS boot partition
https://en.wikipedia.org/wiki/BIOS_boot_partition
Bug 766989 - zfsonline support - need file system name support for ZFS
type codes
Composing these operations caused GParted to abort on an assert failure:
(1) Delete an existing partition,
(2) Create a new partition,
(3) Delete new partition.
This is the same issue as fixed in the previous commit, except with
the delete operation rather than the check operation:
Prevent assert failure from OperationCheck::get_partition_new() (#767233)
# ./gpartedbin
======================
libparted : 2.4
======================
**
ERROR:OperationDelete.cc:41:virtual GParted::Partition& GParted::OperationDelete::get_partition_new(): assertion failed: (false)
Aborted (core dumped)
# gdb ./gpartedbin core.19232 --batch --quiet --ex backtrace -ex quit
[New Thread 19232]
[New Thread 19234]
[Thread debugging using libthread_db enabled]
Core was generated by `./gpartedbin'.
Program terminated with signal 6, Aborted.
#0 0x000000361f2325e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
#0 0x000000361f2325e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x000000361f233dc5 in abort () at abort.c:92
#2 0x0000003620a67324 in g_assertion_message (domain=<value optimized out>, file=<value optimized out>, line=<value optimized out>, func=0x50fcc0 "virtual GParted::Partition& GParted::OperationDelete::get_partition_new()", message=0x1b55f60 "assertion failed: (false)") at gtestutils.c:1358
#3 0x0000003620a678f0 in g_assertion_message_expr (domain=0x0, file=0x50fa68 "OperationDelete.cc", line=41, func=0x50fcc0 "virtual GParted::Partition& GParted::OperationDelete::get_partition_new()", expr=<value optimized out>) at gtestutils.c:1369
#4 0x000000000049aa0d in GParted::OperationDelete::get_partition_new (this=0x1b321b0) at OperationDelete.cc:41
#5 0x00000000004c6700 in GParted::Win_GParted::activate_delete (this=0x7ffc91403670) at Win_GParted.cc:2068
...
As before, the crash happened in Win_GParted::activate_delete() as it
was going through the list of operations removing those which applied to
the never created partition. It came across the delete operation of an
existing partition and called get_partition_new(). As partition_new was
not set or used by the delete operation this asserted false and crashed
GParted.
Unlike the check operation case, the delete operation doesn't have a
partition afterwards. (As GParted represents unallocated space with
partition objects then the delete operation could be populated with a
new partition by constructing the correctly merged unallocated space
partition object, but that is what OperationDelete::apply_to_visual()
does and having a place holder doesn't seem like the right thing to do).
Instead exclude delete operations, on existing partitions, when looking
for operations applying to this not yet created partition as they are
mutually exclusive.
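
A minimal sketch of that exclusion, using hypothetical simplified types
(Op, OpType and remove_operations_on_new_partition() are illustrative
names, not the real GParted classes or the code in
Win_GParted::activate_delete()):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operation kinds, for illustration only.
enum class OpType { Create, Delete, Check, Format };

// Hypothetical, simplified pending operation record.
struct Op
{
    OpType      type;
    std::string original_path;  // Partition before the operation
    std::string new_path;       // Partition after; not set for Delete
};

// Remove every pending operation which applies to the not yet created
// partition identified by new_partition_path.
void remove_operations_on_new_partition(std::vector<Op>& ops,
                                        const std::string& new_partition_path)
{
    for (std::size_t i = ops.size(); i-- > 0; )
    {
        // A delete operation only ever applies to an existing partition
        // and has no new partition to inspect, so skip it.
        if (ops[i].type == OpType::Delete)
            continue;

        if (ops[i].new_path == new_partition_path)
            ops.erase(ops.begin() + static_cast<std::ptrdiff_t>(i));
    }
}
```

The point of the sketch is only the early continue: the new partition of
a delete operation is never asked for.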
Bug 767233 - GParted core dump on assert failure in
OperationDelete::get_partition_new()
Composing these operations caused GParted to abort on an assert failure:
(1) Check an existing partition,
(2) Create a new partition,
(3) Delete new partition.
# ./gpartedbin
======================
libparted : 2.4
======================
**
ERROR:OperationCheck.cc:40:virtual GParted::Partition& GParted::OperationCheck::get_partition_new(): assertion failed: (false)
Aborted (core dumped)
# gdb ./gpartedbin core.8876 --batch --quiet --ex backtrace -ex quit
[New Thread 8876]
[New Thread 8879]
[Thread debugging using libthread_db enabled]
Core was generated by `./gpartedbin'.
Program terminated with signal 6, Aborted.
#0 0x000000361f2325e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
#0 0x000000361f2325e5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x000000361f233dc5 in abort () at abort.c:92
#2 0x0000003620a67324 in g_assertion_message (domain=<value optimized out>, file=<value optimized out>, line=<value optimized out>, func=0x50f400 "virtual GParted::Partition& GParted::OperationCheck::get_partition_new()", message=0x1d37a00 "assertion failed: (false)") at gtestutils.c:1358
#3 0x0000003620a678f0 in g_assertion_message_expr (domain=0x0, file=0x50f1a8 "OperationCheck.cc", line=40, func=0x50f400 "virtual GParted::Partition& GParted::OperationCheck::get_partition_new()", expr=<value optimized out>) at gtestutils.c:1369
#4 0x0000000000498e21 in GParted::OperationCheck::get_partition_new (this=0x1d1bb30) at OperationCheck.cc:40
#5 0x00000000004c66ec in GParted::Win_GParted::activate_delete (this=0x7fff031c3e30) at Win_GParted.cc:2068
...
When Win_GParted::activate_delete() was stepping through the operation
list removing operations (2 & 3 in the above recreation steps) which
related to the new partition never to be created it called
get_partition_new() on all operations in the list. This included
calling get_partition_new() on the check operation (1 in the above
recreation steps). As partition_new was not set or used by the check
operation get_partition_new() asserted false and crashed GParted.
Fix by populating the partition_new member in OperationCheck objects,
thus allowing get_partition_new() to be called on the object. As a
check operation doesn't change any partition boundaries or file system
attributes, just duplicate the new partition from the original
partition.
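
The idea can be sketched as follows. The Partition struct and the
member names are hypothetical simplifications, not the real GParted
declarations:

```cpp
#include <string>

// Hypothetical, simplified partition description.
struct Partition
{
    std::string path;
    long long   sector_start;
    long long   sector_end;
};

// Simplified stand-in for a check operation.
class OperationCheck
{
public:
    explicit OperationCheck(const Partition& partition_orig)
        : m_partition_original(partition_orig),
          // A check changes no boundaries or file system attributes,
          // so the new partition is just a duplicate of the original.
          m_partition_new(partition_orig)
    {}

    const Partition& get_partition_original() const { return m_partition_original; }

    // Always safe to call now that the member is populated.
    const Partition& get_partition_new() const { return m_partition_new; }

private:
    Partition m_partition_original;
    Partition m_partition_new;
};
```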
Bug 767233 - GParted core dump on assert failure in
OperationDelete::get_partition_new()
E2fsprogs 1.42 adds ext4 64bit feature [1] allowing volume sizes larger
than 16 TiB. However only enable large volumes from e2fsprogs 1.42.9
when a large number of 64bit bugs were fixed [2]. (Also RHEL / CentOS 7
use e2fsprogs 1.42.9 and always enable 64bit feature by default).
[1] Release notes, E2fsprogs 1.42 (November 29, 2011)
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42
"This release of e2fsprogs has support for file systems > 16TB.
Online resize requires kernel support which will hopefully be in
Linux version 3.2. Offline support is not yet available for > 16TB
file systems, but will be coming".
[2] Release notes, E2fsprogs 1.42.9 (December 28, 2013)
http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.42.9
"Fixed a large number of bugs in resize2fs, e2fsck, debugfs, and
libext2fs to correctly handle bigalloc and 64-bit file systems.
There were many corner cases that had not been noticed in previous
uses of these file systems, since they are not as common. Some of
the bugs could cause file system corruption or data loss, so users
of 64-bit or bigalloc file systems are strongly urged to upgrade to
e2fsprogs 1.42.9".
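
A minimal sketch of such a version gate, assuming the version is
available as a "MAJOR.MINOR[.PATCH]" string. The function name and the
parsing are illustrative, not how GParted obtains or compares the
e2fsprogs version:

```cpp
#include <cstdio>
#include <string>

// Report whether an e2fsprogs version string is at least 1.42.9.
// Unparseable input is treated conservatively as "not supported".
bool e2fsprogs_supports_large_volumes(const std::string& version)
{
    int ver_major = 0;
    int ver_minor = 0;
    int ver_patch = 0;  // Stays 0 when the string has no patch level
    int fields = std::sscanf(version.c_str(), "%d.%d.%d",
                             &ver_major, &ver_minor, &ver_patch);
    if (fields < 2)
        return false;

    if (ver_major != 1)
        return ver_major > 1;
    if (ver_minor != 42)
        return ver_minor > 42;
    return ver_patch >= 9;
}
```

So "1.42" and "1.42.8" are rejected even though they know about the
64bit feature, while "1.42.9" and later are accepted.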
Bug 766910 - Multiple boot loaders don't work on 64bit EXT4 file systems
Calling libparted ped_geometry_new() creates a new PedGeometry object
from malloced memory, however the corresponding ped_geometry_destroy()
is never called to destroy the object and free the memory.
Performing a resize of a FAT file system when running GParted under
valgrind identifies several memory blocks leaked via ped_geometry_new()
from resize_move_filesystem_using_libparted(). One such example:
# valgrind --track-origins=yes --leak-check=full ./gpartedbin
...
==32069== 32 bytes in 1 blocks are definitely lost in loss record 5,419 of 11,542
==32069== at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==32069== by 0x8ECD8C5: ped_malloc (libparted.c:231)
==32069== by 0x8ED23C1: ped_geometry_new (geom.c:79)
==32069== by 0x4764F3: GParted::GParted_Core::resize_move_filesystem_using_libparted(GParted::Partition const&, GParted::Partition const&, GParted::OperationDetail&) (GParted_Core.cc:2666)
==32069== by 0x478007: GParted::GParted_Core::resize_filesystem(GParted::Partition const&, GParted::Partition const&, GParted::OperationDetail&, bool) (GParted_Core.cc:2990)
==32069== by 0x478440: GParted::GParted_Core::maximize_filesystem(GParted::Partition const&, GParted::OperationDetail&) (GParted_Core.cc:3037)
==32069== by 0x4769A0: GParted::GParted_Core::resize(GParted::Partition const&, GParted::Partition const&, GParted::OperationDetail&) (GParted_Core.cc:2746)
==32069== by 0x47582B: GParted::GParted_Core::resize_move(GParted::Partition const&, GParted::Partition&, GParted::OperationDetail&) (GParted_Core.cc:2457)
==32069== by 0x46DDB2: GParted::GParted_Core::apply_operation_to_disk(GParted::Operation*) (GParted_Core.cc:767)
...
There is also a leak of a PedGeometry object from
resize_move_partition(). Fix by calling ped_geometry_destroy() to
delete all the allocated PedGeometry objects and free the memory.
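
The fix itself adds explicit ped_geometry_destroy() calls. Purely as an
illustration of how this class of leak can be avoided, the sketch below
pairs every allocation with its destroy call using std::unique_ptr and
a custom deleter. This is a different technique from the one in this
patch. FakeGeometry, fake_geometry_new() and fake_geometry_destroy()
are stand-ins so the example builds without libparted.

```cpp
#include <memory>

// Stand-ins for PedGeometry, ped_geometry_new() and
// ped_geometry_destroy(), so the sketch builds without libparted.
struct FakeGeometry
{
    long long start;
    long long length;
};

static int g_live_geometries = 0;  // Objects created but not yet destroyed

FakeGeometry* fake_geometry_new(long long start, long long length)
{
    ++g_live_geometries;
    return new FakeGeometry{start, length};
}

void fake_geometry_destroy(FakeGeometry* geom)
{
    --g_live_geometries;
    delete geom;
}

// Deleter which calls the matching destroy function.
struct GeometryDeleter
{
    void operator()(FakeGeometry* geom) const { fake_geometry_destroy(geom); }
};
using GeometryPtr = std::unique_ptr<FakeGeometry, GeometryDeleter>;

// Example user: the geometry is destroyed on every return path.
long long geometry_end_sector(long long start, long long length)
{
    GeometryPtr geom(fake_geometry_new(start, length));
    if (length <= 0)
        return -1;  // Early return, no leak
    return geom->start + geom->length - 1;
}
```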
Bug 767009 - PedGeometry objects are memory leaked
When replacing the list of paths for the other partition object involved
in either a Resize/Move or Format operation in apply_operation_to_disk(),
the whole list of paths should be replaced, not just the first path.
Copying the whole path list is the correct action. It makes no material
difference because secondary partition paths are only used to discover
mount points during the refresh phase and not at the apply phase, where
only the primary path is used.
Bug 766349 - Resolve code ugliness with partition path getting set to
"copy of /dev/SRC"
Quoting the relevant comments from GParted_Core::calibrate_partition():
Re-add the real partition path from libparted.
When creating a copy operation by pasting into unallocated space the
list of paths for the partition object was set to
["Copy of /dev/SRC"] because the partition didn't yet exist before
the operations were applied. Additional operations on that new
partition also got the list of paths set to ["Copy of /dev/SRC"].
This re-adds the real path to the start of the list making it
["/dev/NEW", "Copy of /dev/SRC"]. This provides the real path for
file system specific tools used during those additional operations
such mkfs for the format operation or fsck and others for the
resize/move operation.
FIXME: Having this work just because "/dev/NEW" happens to sort
before "Copy of /dev/SRC" is ugly! Probably have a separate display
path which can be changed at will without affecting the list of real
paths for the partition.
Having a separate display path is overly complicated and unnecessary.
Just replace the list of paths with the real one reported by libparted
if it contained "Copy of /dev/SRC", determined by checking if the file
exists. Otherwise continue to add the libparted name as an alternate
path. Whole disk devices can never be named "Copy of /dev/SRC" because
they are not partitioned so never created or deleted by GParted, only
ever written to, hence don't need the extra exists test logic.
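
A minimal sketch of that logic, assuming the path list is a plain
vector of strings. file_exists() and calibrate_paths() are hypothetical
helpers, and the sorting of the real path list is not modelled:

```cpp
#include <algorithm>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical helper: true when something exists at path.
bool file_exists(const std::string& path)
{
    return ::access(path.c_str(), F_OK) == 0;
}

// paths     - current path list of the partition object
// real_path - partition name as reported by libparted
void calibrate_paths(std::vector<std::string>& paths, const std::string& real_path)
{
    if (!paths.empty() && !file_exists(paths.front()))
    {
        // Placeholder name such as "Copy of /dev/SRC": replace the list.
        paths.clear();
        paths.push_back(real_path);
    }
    else if (std::find(paths.begin(), paths.end(), real_path) == paths.end())
    {
        // Existing real name: keep it and add the libparted name as an
        // alternate path.
        paths.push_back(real_path);
    }
}
```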
Bug 766349 - Resolve code ugliness with partition path getting set to
"copy of /dev/SRC"
When composing a copy operation it always named the destination
partition as "copy of /dev/SRC". For the case of pasting into
unallocated space creating a new partition this was the right thing to
do as the partition doesn't yet exist so the path is not yet known.
However for the case of pasting into an existing partition the path is
known and replacing it with "copy of /dev/SRC" is wrong. No other
operation when operating on an existing partition changes its path.
Given a set of existing partitions, sdb1 to sdb4, compose a set of copy
operations as: copy sdb1 to sdb2, copy sdb2 to sdb3 and copy sdb3 to
sdb4. The displayed partitions before being applied become:
/dev/sdb1
copy of /dev/sdb1
copy of copy of /dev/sdb1
copy of copy of copy of /dev/sdb1
And the pending operations are named:
Copy /dev/sdb1 to /dev/sdb2
Copy copy of /dev/sdb1 to /dev/sdb3
Copy copy of copy of /dev/sdb1 to /dev/sdb4
This is perverse. In the case of pasting into an existing partition
keep the real path name. This keeps the partitions being displayed as:
/dev/sdb1
/dev/sdb2
/dev/sdb3
/dev/sdb4
And the pending operations named as:
Copy /dev/sdb1 to /dev/sdb2
Copy /dev/sdb2 to /dev/sdb3
Copy /dev/sdb3 to /dev/sdb4
Much more understandable.
Also switch to an upper case "C" in "Copy of /dev/SRC" as the temporary
path name when pasting into unallocated space. Finally update the
comment in calibrate_partition() to describe the remaining cases when
re-adding the path is still required.
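
The naming rule can be sketched as below. copy_destination_path() is a
hypothetical helper for illustration, not a function in GParted:

```cpp
#include <string>

// Choose the path shown for the destination of a copy operation.
// dst_exists is true when pasting into an existing partition and false
// when pasting into unallocated space.
std::string copy_destination_path(const std::string& src_path,
                                  const std::string& dst_path,
                                  bool dst_exists)
{
    if (dst_exists)
        return dst_path;           // Keep the real name, e.g. /dev/sdb2
    return "Copy of " + src_path;  // Temporary name until created
}
```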
Bug 766349 - Resolve code ugliness with partition path getting set to
"copy of /dev/SRC"
Make the code a little more self documenting by adding the symbolic
constants:
SETTLE_DEVICE_APPLY_MAX_WAIT_SECONDS
SETTLE_DEVICE_PROBE_MAX_WAIT_SECONDS
which highlight that settle_device() is called in two different
contexts, device probe and apply operations, with two different timeout
values.
File system specific commands sometimes fail reporting that the
partition specific /dev entry doesn't exist. Example failing check
operation details:
Check and repair file system (ext4) on /dev/sdb4
calibrate /dev/sdb4
path: /dev/sdb4 (partition)
start: 4196352
end: 6293503
size: 2097152 (1.00 GiB)
check file system on /dev/sdb4 for errors and (if possible) fix them
e2fsck -f -y -v -C 0 /dev/sdb4
e2fsck 1.42.9 (28-Dec-2013)
e2fsck: No such file or directory while trying to open /dev/sdb4
Possibly non-existent device?
This has been reproduced on CentOS 7. Debugging shows that the
libparted calls used to re-read the partition details in
GParted_Core::calibrate_partition() leads to udev removing and re-adding
all the partition /dev entries for the disk.
# udevadm monitor &
# gpartedbin
...
16.480662 +12.618659 calibrate_partition() calling get_device("/dev/sdb", lp_device) ...
16.483644 +0.002982 calibrate_partition() get_device() returned
16.483678 +0.000034 calibrate_partition() calling get_disk(lp_device, lp_disk) ...
16.618113 +0.134435 calibrate_partition() get_disk() returned
KERNEL[19275.707968] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
16.618561 +0.000448 destroy_device_and_disk() calling ped_disk_destroy(lp_disk) ...
16.618584 +0.000023 destroy_device_and_disk() ped_disk_destroy() returned
16.618591 +0.000007 destroy_device_and_disk() calling ped_device_destroy(lp_disk) ...
16.618602 +0.000011 destroy_device_and_disk() ped_device_destroy() returned
16.618687 +0.000085 calibrate_partition() return true
16.618851 +0.000164 execute_command() e2fsck -f -y -v -C 0 /dev/sdb4
KERNEL[19275.708389] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[19275.708500] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
KERNEL[19275.708643] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
KERNEL[19275.768278] change /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
KERNEL[19275.771171] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
KERNEL[19275.771360] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
KERNEL[19275.771542] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
KERNEL[19275.775858] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [19275.820153] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [19275.823152] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [19275.828275] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
16.742735 +0.123884 execute_command() exit status 8
UDEV [19275.841425] remove /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
UDEV [19275.905478] change /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb (block)
UDEV [19276.013580] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb3 (block)
UDEV [19276.034728] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb4 (block)
UDEV [19276.174840] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb1 (block)
UDEV [19276.237105] add /devices/pci0000:00/0000:00:0d.0/ata4/host3/target3:0:0/3:0:0:0/block/sdb/sdb2 (block)
So exactly when GParted is running the external e2fsck command, udev is
in the middle of removing and re-adding all the /dev partition entries
for the disk. Hence the above failure reporting that /dev/sdb4 didn't
exist. This error depends on the timing between GParted running the
external file system specific command and udev removing and re-adding
the entries, so sometimes it works and sometimes it fails.
Further debugging showed that simply opening and closing the whole disk
device read-write triggers the same removing and re-adding of all the
partition /dev entries with udev >= 219. Opening the whole disk device
read-write is what libparted had always done, until this patch, made
after libparted 3.2, changed it to open read-only when probing:
http://git.savannah.gnu.org/cgit/parted.git/commit/?id=44d5ae0115c4ecfe3158748309e9912c5aede92d
libparted: Use read only when probing devices on linux (#1245144)
To fix this simply wait for udev devices to settle in
calibrate_partition(). The longest I have seen udev take to do this is
0.80 seconds in a VM. Wait up to 10 seconds as is done in commit() ->
commit_to_os(), also called when applying operations.
On configurations which don't have this issue, executing udevadm settle,
which returns immediately, adds at most 0.1 seconds to the time taken
for the calibrate step. This won't be noticed in the times shown in the
operation details, so there is no point in trying to avoid executing
udevadm settle when not needed.
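
For illustration, the command being run can be sketched as below.
settle_command() is a hypothetical helper which only builds the command
line. How GParted's settle_device() actually runs it is not shown.

```cpp
#include <string>

// Maximum wait when applying operations: the 10 seconds stated above.
const int SETTLE_DEVICE_APPLY_MAX_WAIT_SECONDS = 10;

// Build the udevadm command which waits for the udev event queue to
// empty, giving up after max_wait_seconds.
std::string settle_command(int max_wait_seconds)
{
    return "udevadm settle --timeout=" + std::to_string(max_wait_seconds);
}
```

The timeout is only an upper bound. When the udev queue is already
empty the command returns straight away.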
Bug 762941 - Operations sometimes failing with: No such file or
directory
Minor issues:
1) In the while loop reading from /proc/partitions into variable line,
just after the sscanf() call the variable was re-purposed to hold the
device name making the code unnecessarily hard to follow.
2) Variable c_str was a fixed sized buffer holding the device name read
from /proc/partitions.
3) Variable c_str name provides no meaning as to what data it holds.
4) Return value from all the Utils::regexp_label() calls is converted
from Glib::ustring to std::string to be stored in device variable.
Resolve by using Utils::regexp_label() to extract the device name from
each line in /proc/partitions and store in the variable device, already
used for this purpose and now changed to type Glib::ustring.
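
A minimal sketch of extracting the name from one line. std::regex
stands in for Utils::regexp_label() here, and both the pattern and the
function name device_name_from_line() are assumptions for illustration:

```cpp
#include <regex>
#include <string>

// Return the device name from one line of /proc/partitions, or an empty
// string for the header, blank lines and anything else not matching
// "major minor #blocks name".
std::string device_name_from_line(const std::string& line)
{
    static const std::regex pattern(
        R"(^\s*[0-9]+\s+[0-9]+\s+[0-9]+\s+(\S+)\s*$)");
    std::smatch match;
    if (std::regex_match(line, match, pattern))
        return match[1].str();
    return "";
}
```

The input line is left untouched and the result goes into its own
variable, with no fixed sized buffer involved.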
realpath(3) manual page says:
BUGS
The POSIX.1-2001 standard version of this function is broken by
design, since it is impossible to determine a suitable size for
the output buffer, resolved_path. According to POSIX.1-2001 a
buffer of size PATH_MAX suffices, but PATH_MAX need not be a
defined constant, and may have to be obtained using pathconf(3).
And asking pathconf(3) does not really help, since, on the one
hand POSIX warns that the result of pathconf(3) may be huge and
unsuitable for mallocing memory, and on the other hand
pathconf(3) may return -1 to signify that PATH_MAX is not
bounded. The resolved_path == NULL feature, not standardized in
POSIX.1-2001, but standardized in POSIX.1-2008, allows this
design problem to be avoided.
The resolved_path == NULL feature of realpath() has existed as a Glibc
extension since realpath() was first added to Glibc 1.90, released in
June 1996. Therefore it can be used unconditionally.
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=fa0bc87c32d02cd81ec4d0ae00e0d943c683e6e1
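
A minimal sketch of the safe calling pattern. resolve_path() is a
hypothetical wrapper, and returning the input unchanged on failure is
an assumption made for the example:

```cpp
#include <stdlib.h>
#include <string>

// Resolve symbolic links and relative components in path.  Passing NULL
// as resolved_path makes realpath() allocate a big enough buffer, which
// the caller must free().
std::string resolve_path(const std::string& path)
{
    char* resolved = ::realpath(path.c_str(), NULL);
    if (resolved == NULL)
        return path;  // Assumption: fall back to the input on failure

    std::string result(resolved);
    ::free(resolved);
    return result;
}
```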
Bug 764369 - Use realpath() safely
The previous commit (Fix crash reading NTFS usage when there is no
/dev/PTN entry) identified that the FileSystem member variable "index"
is too small on 64-bit machines. Also this member variable stores no
FileSystem class information and was being used as a local variable.
Replace with local variables of the correct type, wide enough to
store the npos not found value.
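
The width problem can be sketched as below, assuming the old member was
an unsigned int. The function names are illustrative only:

```cpp
#include <string>

// Correct: the result of find() is held in std::string::size_type,
// which is wide enough to be compared against npos.
bool contains(const std::string& text, const std::string& needle)
{
    std::string::size_type index = text.find(needle);
    return index != std::string::npos;
}

// Shows the problem: does npos still compare equal to itself after
// being stored in an unsigned int?  Only when unsigned int is as wide
// as size_type, which is not the case on typical 64-bit machines.
bool npos_survives_unsigned_int()
{
    unsigned int narrowed = static_cast<unsigned int>(std::string::npos);
    return narrowed == std::string::npos;
}
```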
Bug 764658 - GParted crashes when reading NTFS usage when there is no
/dev/PTN entry