Reiser4 has introduced a new disk format which includes support for
spanning the file system over multiple block devices (subvolumes)
[1][2]. As a result the output of debugfs.reiser4 for the UUID has
changed slightly. So far the new reiser4progs package (version 2.0.x)
is only available as a Debian experimental package.
Using reiser4progs 1.2.1 the old output was like this:
$ debugfs.reiser4 test.img
debugfs.reiser4 1.2.1
Format release: 4.0.2
Copyright (C) 2001-2005 by Hans Reiser, licensing governed by reiser4progs/COPYING.
Master super block (16):
magic: ReIsEr4
blksize: 4096
format: 0x0 (format40)
uuid: 1116afce-99fd-4a6e-94cb-2d9f19c91d67
label: <none>
...
With reiser4progs 2.0.4 the new output is like this:
$ debugfs.reiser4 test.img
debugfs.reiser4
Package Version: 2.0.4
Software Framework Release: 5.1.3
Copyright (C) 2001-2005 by Hans Reiser, licensing governed by reiser4progs/COPYING.
Master super block (16):
magic: ReIsEr4
blksize: 4096
volume: 0x1 (asym)
distrib: 0x1 (fsx32m)
format: 0x1 (format41)
description: Standard layout for logical volumes.
stripe bits: 14
mirror id: 0
replicas: 0
volume uuid: 9538bfa3-5694-4abe-864c-edc288a9d801
subvol uuid: d841c692-2042-49e6-ac55-57e454691782
label: <none>
...
GParted happens to read the correct UUID just because the first matching
"uuid" string in the output is the volume UUID. Make the code more
robust by explicitly reading the volume uuid when labelled as such.
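As an illustration only (GParted itself is C++ and its actual parsing
code is not shown here), the idea of preferring the explicitly labelled
line could be sketched like this. The function name and regular
expressions are assumptions made for this sketch:

```python
import re

# Textual form of a UUID, e.g. 9538bfa3-5694-4abe-864c-edc288a9d801
UUID_RE = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"


def reiser4_uuid(debugfs_output: str) -> str:
    """Return the file system UUID from debugfs.reiser4 output, or ""."""
    # Prefer the explicitly labelled line printed by reiser4progs 2.0.x.
    m = re.search(r"^[ \t]*volume uuid:[ \t]*(" + UUID_RE + r")",
                  debugfs_output, re.MULTILINE)
    if m is None:
        # Fall back to the plain "uuid:" line printed by reiser4progs 1.2.x.
        # Anchoring at the start of the line skips "subvol uuid:".
        m = re.search(r"^[ \t]*uuid:[ \t]*(" + UUID_RE + r")",
                      debugfs_output, re.MULTILINE)
    return m.group(1) if m else ""
```

Anchoring each pattern to the start of a line means the result no longer
depends on which "uuid" string happens to come first in the output.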
[1] Logical Volumes Howto
https://reiser4.wiki.kernel.org/index.php/Logical_Volumes_Howto
[2] Logical Volumes Background
https://reiser4.wiki.kernel.org/index.php/Logical_Volumes_Background
The GitLab CI ubuntu_test job has occasionally been failing like this,
perhaps once every few weeks or so.
[ RUN ] My/SupportedFileSystemsTest.CreateAndReadUUID/reiser4
test_SupportedFileSystems.cc:569: Failure
Expected: (m_partition.uuid.size()) >= (9U), actual: 0 vs 9
[ FAILED ] My/SupportedFileSystemsTest.CreateAndReadUUID/reiser4, where GetParam() = 24 (17 ms)
[----------] 1 test from My/SupportedFileSystemsTest (17 ms total)
It turns out there are 2 bugs in reiser4progs. One causes
debugfs.reiser4 to report a null UUID if the first byte of the UUID
happens to be zero [1], and the other causes mkfs.reiser4 to write a
corrupted UUID, sometimes a null (all zeros) UUID [2].
There is a 1 in 256 chance of getting a null UUID [2] when creating and
reading a reiser4 file system, hence the occasional failure of the CI
job. The centos_test job isn't affected because CentOS doesn't have the
reiser4progs package.
Fix this by detecting when reiser4 reports a null UUID and assigning a
dummy UUID to make the test pass. This does mean that there is a 1 in
256 chance of not detecting a true failure. However that still means
there is a 255 in 256 chance of detecting a true failure. That's good
odds. When a null UUID is detected for a reiser4 file system the test
output looks like this:
[ RUN ] My/SupportedFileSystemsTest.CreateAndReadUUID/reiser4
test_SupportedFileSystems.cc:580: Ignore test failure of a null UUID.
[ OK ] My/SupportedFileSystemsTest.CreateAndReadUUID/reiser4 (46 ms)
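To relate the 1 in 256 figure to how often a CI job might fail, here is
a small worked calculation. It assumes every run is independent and
uses the per-run probability stated above; the function name is
hypothetical:

```python
def chance_of_any_null_uuid(runs: int) -> float:
    """Chance that at least one of `runs` independent runs sees a null UUID."""
    per_run = 1 / 256
    # Complement of every run avoiding the null UUID.
    return 1 - (1 - per_run) ** runs
```

Under those assumptions a failure becomes more likely than not only
after a couple of hundred runs, which is in keeping with the job failing
only occasionally.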
[1] 4802cdb18a
Fix up repair_master_print()
[2] 44cc024f39
Stop occasionally making file systems with null UUIDs
Closes #145 - Sporadic failure of test case
My/SupportedFileSystemsTest.CreateAndReadUUID/reiser4
Print the kernel version and supported file systems inside the GNOME
GitLab CI jobs as a debugging aid. Kernel version helps identify the
CI job runner's distribution to identify kernel features. Supported
file systems identifies which ones can be mounted, should that be
possible in future. Print supported file systems before and after the
tests because checking for support may load additional modules. See
calls to Utils::kernel_supports_fs() for: btrfs, jfs, nilfs2 and xfs.
Closes #147 - GitLab CI test failure from *.CreateAndGrow/jfs
For the first time ever the ubuntu_test GitLab CI job failed running the
JFS grow test like this. Fragment from tests/test-suite.log:
[ RUN ] My/SupportedFileSystemsTest.CreateAndGrow/jfs
test_SupportedFileSystems.cc:387: Failure
Failed
create_loopdev(): Execute: losetup --find --show 'test_SupportedFileSystems.img'
losetup: cannot find an unused loop device
create_loopdev(): Losetup failed with exit status 1
create_loopdev(): Failed to create required loop device
Error: Could not stat device - No such file or directory.
test_SupportedFileSystems.cc:446: Failure
Value of: lp_device != NULL
Actual: false
Expected: true
test_SupportedFileSystems.cc:649: Failure
Value of: m_fs_object->create(m_partition, m_operation_detail)
Actual: false
Expected: true
Operation details:
mkfs.jfs -q -L '' '' 00:00:00 (ERROR)
mkfs.jfs version 1.1.15, 04-Mar-2011
The system cannot find the specified device.
detach_loopdev(): Execute: losetup --detach ''
losetup: : failed to use device: No such device
detach_loopdev(): Losetup failed with exit status 1
detach_loopdev(): Failed to detach loop device. Test NOT affected
[ FAILED ] My/SupportedFileSystemsTest.CreateAndGrow/jfs, where GetParam() = 17 (24 ms)
JFS can only be grown when mounted by the kernel and GParted only
enables JFS grow support when, among other things, kernel support is
detected.
Unknowingly the JFS grow test had always previously been skipped, even
in the ubuntu_test CI job which installs jfsutils, because the kernel
didn't support JFS. Capture of this test from another run of the
ubuntu_test CI job:
[ RUN ] My/SupportedFileSystemsTest.CreateAndGrow/jfs
test_SupportedFileSystems.cc:641: Skip test. grow not supported or support not found
[ OK ] My/SupportedFileSystemsTest.CreateAndGrow/jfs (0 ms)
Plus additional debugging added into the job, based on what
Utils::kernel_supports_fs() does to identify kernel support, shows:
$ fgrep jfs /proc/filesystems || true
$ modprobe jfs || true
modprobe: FATAL: Module jfs not found in directory /lib/modules/3.10.0-1160.11.1.el7.x86_64
$ fgrep jfs /proc/filesystems || true
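The lookup half of that check can be sketched as follows. This is only
an illustration of reading /proc/filesystems text that has already been
captured; it does not run modprobe, and the function name is an
assumption rather than GParted's code:

```python
def listed_in_proc_filesystems(proc_filesystems_text: str, fs_name: str) -> bool:
    """True if fs_name is one of the names listed in /proc/filesystems text."""
    for line in proc_filesystems_text.splitlines():
        # Each line is an optional "nodev" flag followed by the name.
        fields = line.split()
        if fields and fields[-1] == fs_name:
            return True
    return False
```

Unlike the fgrep shown above, this compares whole names, so a short name
cannot match as part of a longer one.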
Therefore until now no GitLab job runner machine's kernel supported
JFS, but for the first time ever this ubuntu_test job ran on a runner
machine where the kernel did support JFS, hence the attempt to use
losetup.
Examining test_SupportedFileSystems.cc there are 24 file system tests
which specify SKIP_IF_NOT_ROOT_FOR_REQUIRED_LOOPDEV_FOR_FS(), but only
17 exclusions in .gitlab-ci.yml [1]. The 7 tests without exclusions
are:
*.CreateAndReadLabel/lvm2pv
*.CreateAndReadUUID/lvm2pv
*.CreateAndWriteLabel/lvm2pv
*.CreateAndWriteUUID/lvm2pv
*.CreateAndGrow/jfs
*.CreateAndGrow/nilfs2
*.CreateAndShrink/nilfs2
For LVM2 PVs reading and writing of labels and UUIDs aren't implemented
(only reading of UUIDs could be supported as the others are impossible)
so those tests are always skipped. Add unit test exclusions just for
completeness.
JFS grow is the case seen here. NILFS2 grow and shrink are further
cases where kernel support is needed. Add unit test exclusions to stop
attempting to run the JFS and NILFS2 resizing tests, which don't
currently work because losetup doesn't work in the GitLab CI docker
images [1].
[1] 39fdfe51da
Exclude unit tests needing losetup in Docker CI image (!59)
Closes #147 - GitLab CI job failure from *.CreateAndGrow/jfs
Executables which are not intended for execution by users, but by other
programs, should be installed into /usr/libexec [1][2]. gpartedbin
falls into this category. Update its installation accordingly.
Standard Autotools details: gpartedbin will be installed into
EPREFIX/libexec by default. To install gpartedbin into a different
directory set libexecdir when configuring the build system. Like this
from git:
./autogen.sh --libexecdir=DIR
or like this from tar release:
./configure --libexecdir=DIR
[1] Filesystem Hierarchy Standard, version 3.0,
4.7. /usr/libexec : Binaries run by other programs (optional)
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html
"/usr/libexec includes internal binaries that are not intended to be
executed directly by users or shell scripts.
"
[2] GNU Coding Standards, June 12, 2020,
7.2.5 Variables for Installation Directories
https://www.gnu.org/prep/standards/html_node/Directory-Variables.html
"libexecdir
The directory for installing executable programs to be run by other
programs rather than by users. This directory should normally be
/usr/local/libexec, but write it as $(exec_prefix)/libexec. (If you
are using Autoconf, write it as '@libexecdir@'.)
"
Closes #85 - Please install gpartedbin under /usr/libexec instead of
/usr/sbin
... in class Dialog_Partition_New and slightly refactor the code in
build_filesystems_combo() method which sets it.
Change the name from first_creatable_fs to default_fs to be more
immediately obvious what the variable represents. As default_fs is used
to index the items in the combo_filesystem derived ComboBox, make its
type an int to match the type of the parameter passed to
Gtk::ComboBox::set_active() [1]. Initialise default_fs to -1 (no
selection) in the class constructor [2], which also allows removal of
local variable set_first just used to track whether first_creatable_fs
had been assigned yet or not.
[1] gtkmm: Gtk::ComboBox Class Reference, set_active()
https://developer.gnome.org/gtkmm/stable/classGtk_1_1ComboBox.html#a4f23cf08e85733d23f120935b235096d
[2] C++ FAQ / Should my constructors use "initialization lists" or
"assignment"?
https://isocpp.org/wiki/faq/ctors#init-lists
... in Dialog_Partition_New::build_filesystems_combo(). set_data()
populates this->FILESYSTEMS[] vector with supported file systems with
cleared, unformatted and extended added to the end. Then
build_filesystems_combo() adds those items to combo_filesystem, skipping
extended. It then makes the last item in the combobox sensitive,
relying on the fact that it is unformatted.
Refactor the code so build_filesystems_combo() no longer relies on
unformatted being the last item in combo_filesystem to always enable it.
... in Dialog_Partition_New::set_data(). As the change signal for
combo_filesystem has already been connected, combo_changed(false) is
automatically called by setting the active selection. Therefore remove
the unnecessary call.
"Type" was rather a generic name. Use "combo_type_changed" which makes
it clear that the boolean parameter indicates whether a change to
combo_type or one of the other ComboBoxes triggered this callback.
On an MSDOS partitioned drive, open the Create New Partition dialog and
change "created as" from Primary Partition to Extended Partition and
back to Primary Partition. On Fedora and RHEL/CentOS 8, which build
packages with FORTIFY_SOURCE [1][2] and _GLIBCXX_ASSERTIONS [3][4]
enabled, GParted will crash.
Run GParted built with the default compilation options under valgrind
and repeat the test. Multiple out of bounds reads are reported like
this:
# valgrind --track-origins=yes ./gpartedbin
...
==232613== Invalid read of size 8
==232613== at 0x441AF6: GParted::Dialog_Partition_New::combobox_changed(bool) (Dialog_Partition_New.cc:354)
==232613== by 0x443DBD: sigc::bound_mem_functor1<void, GParted::Dialog_Partition_New, bool>::operator()(bool const&) const (mem_fun.h:2066)
Coming from Dialog_Partition_New.cc:
328 void Dialog_Partition_New::combobox_changed(bool type)
329 {
...
351 // combo_filesystem and combo_alignment
352 if ( ! type )
353 {
> 354 fs = FILESYSTEMS[combo_filesystem.get_active_row_number()];
When the partition type is changed to Extended the file system is forced
to be "Extended" too. This is done in ::combobox_changed() method by
modifying combo_filesystem to add "Extended", making that the selected
item and setting the widget as inactive.
Then when the partition type is changed back to primary the file system
combobox is returned to its previous state. This is done by first
removing the last "Extended" item, making the widget active and setting
the selected item. However as "Extended" is the currently selected
item, removing it forces there to be no selected item and triggers a
change to combo_filesystem, which in turn triggers a recursive call to
::combobox_changed() where combo_filesystem.get_active_row_number()
returns -1 (no selection) [5] and on line 354 the code accesses item -1
of the FILESYSTEMS[] vector.
Fix by setting the new combo_filesystem selection before removing the
currently selected "Extended" item. This has the added benefit of only
triggering a change to combo_filesystem once, when the default item is
selected, rather than twice: when the currently selected "Extended" item
is removed and again when the default item is selected.
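The effect of the ordering can be shown with a toy model. This is not
gtkmm; ComboModel and restore_selection() are hypothetical stand-ins
which only assume that removing the selected item leaves no selection
and fires the change callback:

```python
class ComboModel:
    """Toy combo box which calls on_changed whenever the selection changes."""

    def __init__(self, items, on_changed):
        self.items = list(items)
        self.active = -1              # -1 means no selection
        self.on_changed = on_changed

    def set_active(self, index):
        if index != self.active:
            self.active = index
            self.on_changed(self.active)

    def remove_last(self):
        removed = len(self.items) - 1
        self.items.pop()
        if self.active == removed:
            # Removing the selected item leaves nothing selected.
            self.active = -1
            self.on_changed(self.active)


def restore_selection(default_index, select_first):
    """Return the active indexes the callback sees while leaving "Extended"."""
    seen = []
    combo = ComboModel(["ext4", "fat32", "Extended"], seen.append)
    combo.active = 2                  # "Extended" is currently selected
    if select_first:
        combo.set_active(default_index)   # order used by the fix
        combo.remove_last()
    else:
        combo.remove_last()               # original order
        combo.set_active(default_index)
    return seen
```

In this model the original order delivers -1 to the callback before the
default index, while the fixed order delivers only the default index.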
[1] [Fedora] Security Features, Compile Time Buffer Checks
(FORTIFY_SOURCE)
https://fedoraproject.org/wiki/Security_Features#Compile_Time_Buffer_Checks_.28FORTIFY_SOURCE.29
[2] Enhance application security with FORTIFY_SOURCE
https://access.redhat.com/blogs/766093/posts/1976213
[3] Security Features Matrix (GLIBXX_Assertions)
https://fedoraproject.org/wiki/Security_Features_Matrix
[4] GParted 1.2.0-1.fc33 package build.log for Fedora 33
https://kojipkgs.fedoraproject.org/packages/gparted/1.2.0/1.fc33/data/logs/x86_64/build.log
CXXFLAGS='-O2 -g ... -Wp,-D_FORTIFY_SOURCE=2
-Wp,-D_GLIBCXX_ASSERTIONS ...'
[5] gtkmm: Gtk::ComboBox Class Reference, get_active_row_number()
https://developer.gnome.org/gtkmm/stable/classGtk_1_1ComboBox.html#a53531bc041b5a460826babb8496c363b
Closes #101 - Crash changing Partition type in "Create new partition"
dialog
This reverts commit:
e9223207e6.
Exclude PipeCapture read NUL byte unit tests in GitLab CI jobs (!60)
now that PipeCapture has been fixed to read NUL characters again.
Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
On newer distributions the PipeCapture tests have been failing like
this:
$ ./test_PipeCapture
...
[ RUN ] PipeCaptureTest.ReadEmbeddedNULCharacter
test_PipeCapture.cc:336: Failure
Expected: inputstr
Of length: 6
To be equal to: capturedstr.raw()
Of length: 5
With first binary difference:
< 0x00000000 "ABC.EF" 41 42 43 00 45 46
--
> 0x00000000 "ABCEF" 41 42 43 45 46
[ FAILED ] PipeCaptureTest.ReadEmbeddedNULCharacter (0 ms)
[ RUN ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character
test_PipeCapture.cc:353: Failure
Expected: expectedstr
Of length: 7
To be equal to: capturedstr.raw()
Of length: 6
With first binary difference:
< 0x00000000 "._45678" 00 5F 34 35 36 37 38
--
> 0x00000000 "_45678" 5F 34 35 36 37 38
[ FAILED ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character (0 ms)
...
Found that test_PipeCapture succeeds on Fedora 31 and fails on
Fedora 32. Also test_PipeCapture binary from Fedora 31 and 32 both pass
on Fedora 31 and both fail on Fedora 32. So something outside of the
GParted code and tests is the cause.
Confirmed that this GLib change "Add a missing check to
g_utf8_get_char_validated()" [1], first released in GLib 2.63.0, made
the difference. On Fedora 32 with GLib 2.64.6, rebuilt GLib with that
change reverted and the tests passed. Anyway fix the wrapper GParted
has around g_utf8_get_char_validated() to also handle this case of
reading a NUL character.
[1] 568720006c
Add a missing check to g_utf8_get_char_validated()
Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
Extract call to GLib's g_utf8_get_char_validated() and the associated
workaround to also read NUL characters into a separate function to make
PipeCapture::OnReadable() a little smaller and simpler, so easier to
understand.
Add max_len > 0 clause into get_utf8_char_validated() like this:
if (uc == UTF8_PARTIAL && max_len > 0)
so that the NUL character reading workaround is only applied when
max_len specifies the maximum number of bytes to read, rather than
when -1 specifies reading a NUL terminated string. This makes
get_utf8_char_validated() a complete wrapper of
g_utf8_get_char_validated() [1], even though GParted always specifies
the maximum number of bytes to read.
No longer describe the inability to read NUL characters as a bug [2]
since the GLib authors said it wasn't [3].
[1] GLib Reference Manual, Unicode Manipulation Functions,
g_utf8_get_char_validated ()
https://developer.gnome.org/glib/stable/glib-Unicode-Manipulation.html#g-utf8-get-char-validated
[2] 8dbbb47ce2
Workaround g_utf8_get_char_validate() bug with embedded NUL bytes
(#777973)
[3] Bug 780095 - g_utf8_get_char_validated() stopping at nul byte even
for length specified buffers
https://bugzilla.gnome.org/show_bug.cgi?id=780095#18
"If g_utf8_get_char_validated() encounters a nul byte in the
middle of a string of given longer length, it returns -2,
indicating a partial gunichar. That is not the obvious
behaviour, but since g_utf8_get_char_validated() has been API
for a long time, the behaviour cannot be changed.
"
Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
This previous commit [1] suggested that in future partition deletion
might be allowed even while a LUKS mapping was active in that partition.
To allow deletion of a partition while it has active content is wrong.
That is a significant reason GParted has busy detection of otherwise
unrecognised file systems [2] and recognition and busy detection of, but
otherwise not controllable support for, Linux Software RAID [3] and
ATARAID [4][5] arrays.
To automatically close the LUKS partition first would be against the
pattern of behaviour that GParted has established, of requiring explicit
deactivation of file systems, swap and volume groups before allowing
deletion. Therefore update the comment accordingly.
[1] f1e3d42b56
Prevent deletion of open LUKS mappings (#774818)
[2] 49a2e19462
Restore busy detection of unknown mounted file systems (#723842)
[3] d2e1130ad2
Detect busy status of Linux Software RAID members (#709640)
[4] 6e990ea48a
Detect busy status of mdadm started ATARAID members (#75)
[5] caec22871e
Detect busy status of dmraid started ATARAID members (#75)
Also with exfatprogs 1.1.0 [1], tune.exfat and exfatlabel gained the
capability to report and set the exFAT Volume Serial Number [2][3][4].
This is what blkid and therefore GParted reports as the UUID.
Report serial number:
# tune.exfat -i /dev/sdb1
exfatprogs version : 1.1.0
volume serial : 0x772ffe5d
# echo $?
0
# blkid /dev/sdb1
/dev/sdb1: LABEL="test exfat" UUID="772F-FE5D" TYPE="exfat" PTTYPE="dos"
Set serial number:
# tune.exfat -I 0xf96ef190 /dev/sdb1
exfatprogs version : 1.1.0
New volume serial : 0xf96ef190
# echo $?
0
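The relationship between the serial number printed by tune.exfat and the
UUID printed by blkid, as seen in the transcripts above, can be sketched
like this. The function names are hypothetical and the formatting is
inferred only from those two sample outputs:

```python
def serial_to_uuid(serial: int) -> str:
    """Format a 32-bit volume serial number the way blkid shows it."""
    hex8 = f"{serial:08X}"
    return hex8[:4] + "-" + hex8[4:]


def uuid_to_serial_arg(uuid: str) -> str:
    """Turn a UUID such as "F96E-F190" into a value for tune.exfat -I."""
    return "0x" + uuid.replace("-", "").lower()
```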
tune.exfat exists in earlier releases of exfatprogs so check it has the
capability by searching for "Set volume serial" in the help output
before enabling this capability.
# tune.exfat
exfatprogs version : 1.1.0
Usage: tune.exfat
-l | --print-label Print volume label
-L | --set-label=label Set volume label
-i | --print-serial Print volume serial
-L | --set-serial=value Set volume serial
-V | --version Show version
-v | --verbose Print debug
-h | --help Show help
(Note the cut and paste error reporting the set volume serial flag as
'-L' rather than actually '-S').
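A minimal sketch of that capability check, assuming the help output has
already been captured as text. The function name is hypothetical:

```python
def tune_exfat_can_set_serial(help_output: str) -> bool:
    """True if the tune.exfat help text advertises setting the serial."""
    # Match on the description rather than the flag, because the help
    # text prints the wrong flag for this option.
    return "Set volume serial" in help_output
```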
[1] exfatprogs-1.1.0 version released
http://github.com/exfatprogs/exfatprogs/releases/tag/1.1.0
[2] [tools][feature request] Allow To Change Volume Serial Number ("ID")
#138
https://github.com/exfatprogs/exfatprogs/issues/138
[3] exfatlabel:add get/set volume serial option
b4d9c9eeb5
[4] exFAT file system specification, 3.1.11 VolumeSerialNumber Field
https://docs.microsoft.com/en-us/windows/win32/fileio/exfat-specification#3111-volumeserialnumber-field
Closes !67 - Add support for reading exFAT usage and updating the UUID
exfatprogs 1.1.0 released 2021-02-09 [1] has gained support for
reporting file system usage [2][3] so add that capability to GParted.
It works like this:
# dump.exfat /dev/sdb1 | egrep 'Volume Length\(sectors\):|Sector Size Bits:|Sector per Cluster bits:|Free Clusters:'
Volume Length(sectors): 524288
Sector Size Bits: 9
Sector per Cluster bits: 3
Free Clusters: 23585
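One way those four values could be turned into byte counts is sketched
below. It assumes the volume length is counted in sectors of
2^(Sector Size Bits) bytes and that a cluster is
2^(Sector per Cluster bits) sectors; the function name is hypothetical
and GParted's own calculation may differ:

```python
import re


def exfat_usage(dump_output: str):
    """Return (total_bytes, free_bytes) from dump.exfat output, or None."""
    def field(name):
        m = re.search(r"^[ \t]*" + re.escape(name) + r":[ \t]*(\d+)",
                      dump_output, re.MULTILINE)
        return int(m.group(1)) if m else None

    sectors = field("Volume Length(sectors)")
    sector_bits = field("Sector Size Bits")
    cluster_bits = field("Sector per Cluster bits")
    free_clusters = field("Free Clusters")
    if None in (sectors, sector_bits, cluster_bits, free_clusters):
        return None                     # a required line was missing

    sector_size = 1 << sector_bits
    cluster_size = sector_size << cluster_bits
    return sectors * sector_size, free_clusters * cluster_size
```

For the sample output above this gives 512 byte sectors and 4096 byte
clusters.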
Unfortunately dump.exfat returns a non-zero status on success so that
can't be used to check for failure:
# dump.exfat /dev/sdb1
exfatprogs version : 1.1.0
-------------- Dump Boot sector region --------------
Volume Length(sectors): 524288
...
# echo $?
192
dump.exfat only writes errors to stderr, so use this to identify failure:
# dump.exfat /dev/sdb1 1> /dev/null
# echo $?
192
# dump.exfat /dev/zero 1> /dev/null
invalid block device size(/dev/zero)
bogus sector size bits : 0
# echo $?
234
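The approach of ignoring the exit status and treating any stderr output
as failure can be sketched as follows. The helper name is hypothetical,
and the demonstration uses a child Python process as a stand-in because
dump.exfat may not be installed:

```python
import subprocess
import sys


def run_and_check_stderr(argv):
    """Run argv, ignore its exit status, treat any stderr output as failure."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, text=True)
    failed = proc.stderr != ""
    return failed, proc.stdout


if __name__ == "__main__":
    # Stand-in child: writes to stdout only and exits with status 192,
    # like the successful dump.exfat run shown above.
    child = [sys.executable, "-c",
             "print('Free Clusters: 23585'); raise SystemExit(192)"]
    print(run_and_check_stderr(child))
```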
[1] exfatprogs-1.1.0 version released
http://github.com/exfatprogs/exfatprogs/releases/tag/1.1.0
[2] [feature request] File system usage reporting
https://github.com/exfatprogs/exfatprogs/issues/139
[3] exfatprogs: add dump.exfat
7ce9b2336b
Closes !67 - Add support for reading exFAT usage and updating the UUID
A user had exfat-utils installed and tried to use GParted to create an
exfat file system. GParted ran this command but it failed:
# mkfs.exfat -L '' '/dev/sdb1'
mkexfatfs 1.3.0
mkfs.exfat: invalid option -- 'L'
Usage: mkfs.exfat [-i volume-id] [-n label] [-p partition-first-sector] [-s sectors-per-cluster] [-V] <device>
The problem is that both exfat-utils and exfatprogs packages provide
mkfs.exfat and fsck.exfat commands but they have incompatible command
line options and GParted is programmed for exfatprogs. So far GParted
just checks the executable exists, hence the mis-identification.
Reported version of exfat-utils commands:
$ mkfs.exfat -V 2> /dev/null
mkexfatfs 1.3.0
Copyright (C) 2011-2018 Andrew Nayenko
$ fsck.exfat -V 2> /dev/null
exfatfsck 1.3.0
Copyright (C) 2011-2018 Andrew Nayenko
Reported versions of exfatprogs commands:
$ mkfs.exfat -V 2> /dev/null
exfatprogs version : 1.0.4
$ fsck.exfat -V 2> /dev/null
exfatprogs version : 1.0.4
Fix this by additionally only enabling exfat support when the version
string of each command starts with "exfatprogs version". Note that this
extra checking is not needed for tune.exfat because only exfatprogs
provides that executable.
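A minimal sketch of that check on already captured `-V` output. The
function name is hypothetical:

```python
def is_exfatprogs(version_output: str) -> bool:
    """True if `-V` output identifies the command as coming from exfatprogs."""
    return version_output.startswith("exfatprogs version")
```

With the outputs shown above, the exfat-utils commands are rejected
because their version strings start with "mkexfatfs" and "exfatfsck".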
Closes #137 - Creating exfat partition with a label fails with error
The gparted shell wrapper always exits with a 0 status even if
gpartedbin fails. For example make gpartedbin fail with a non-zero exit
status like this:
$ (unset DISPLAY; unset XAUTHORITY; /usr/sbin/gpartedbin)
(gpartedbin:3936): Gtk-WARNING **: 16:36:06.263: cannot open display:
$ echo $?
1
However the gparted shell wrapper instead exits with successful status
0:
$ (unset DISPLAY; unset XAUTHORITY; gparted)
(gpartedbin:4282): Gtk-WARNING **: 16:39:23.514: cannot open display:
$ echo $?
0
Fix this.
This method is now only called from one location in the code so put its
two lines of code there.
Closes #131 - GParted hangs when non-named device is hung
Now that we always want to run blkid naming all paths, ensure the
FS_Info cache is explicitly loaded first. Report an error if this has
not been done and remove the cache loading code from running blkid
without naming all paths. This leaves fewer code paths to consider and
reason about.
Closes #131 - GParted hangs when non-named device is hung
Again on Fedora 31 with a slightly different disk layout to the previous
commit. sdb is partitioned with 1 empty partition and sdc remains
completely empty:
# lsblk -o name,maj:min,rm,size,ro,type,fstype,label,mountpoint
NAME MAJ:MIN RM SIZE RO TYPE FSTYPE LABEL MOUNTPOINT
sda 8:0 0 20G 0 disk
|-sda1 8:1 0 1G 0 part ext4 /boot
\-sda2 8:2 0 19G 0 part LVM2_member
|-fedora-root 253:0 0 17G 0 lvm ext4 /
\-fedora-swap 253:1 0 2G 0 lvm swap [SWAP]
sdb 8:16 0 8G 0 disk
\-sdb1 8:17 0 1G 0 part
sdc 8:32 0 8G 0 disk
sr0 11:0 1 1024M 0 rom
# blkid -v
blkid from util-linux 2.34 (libblkid 2.34.0, 14-Jun-2019)
# blkid /dev/sda /dev/sda1 /dev/sda2 /dev/sdb /dev/sdb1 /dev/sdc
/dev/sda: PTUUID="5012fb1f" PTTYPE="dos"
/dev/sda1: UUID="3cd48816-7817-4636-9fec-5f1afe76c1b2" TYPE="ext4" PARTUUID="5012fb1f-01"
/dev/sda2: UUID="PH94ej-C8xU-bnMJ-UIh8-ZimI-4B7f-dHlZxh" TYPE="LVM2_member" PARTUUID="5012fb1f-02"
/dev/sdb: PTUUID="1d120b57" PTTYPE="dos"
/dev/sdb1: PARTUUID="1d120b57-01"
Stracing GParted shows these executions of blkid:
# strace -f -q -bexecve -eexecve ./gpartedbin 2>&1 1> /dev/null | egrep -v 'ENOENT|SIGCHLD'
...
[pid 160040] execve("/usr/sbin/blkid", ["blkid", "/dev/sda", "/dev/sda1", "/dev/sda2", "/dev/sdb", "/dev/sdb1", "/dev/sdc"], 0xa4e1b0 /* 32 vars */ <detached ...>
[pid 160041] execve("/usr/sbin/blkid", ["blkid", "/dev/sdc"], 0xa4e1b0 /* 32 vars */ <detached ...>
...
On Fedora 31 with blkid from util-linux 2.34 it reports information for
sdb (partitioned drive) and sdb1 (empty partition), with no information
only for sdc (empty whole disk drive). Hence no FS_Info cache entry and
re-execution of blkid just for sdc.
On older CentOS 7 with the same disk layout blkid reports this:
# blkid -v
blkid from util-linux 2.23.2 (libblkid 2.23.0, 25-Apr-2013)
# blkid /dev/sda /dev/sda1 /dev/sda2 /dev/sdb /dev/sdb1 /dev/sdc
/dev/sda: PTTYPE="dos"
/dev/sda1: UUID="e7d559e4-3e1d-4fbc-b034-3fdeb498f44d" TYPE="xfs"
/dev/sda2: UUID="B7ODFx-BfTE-hq7N-UlrF-f5ML-CPRe-klSy26" TYPE="LVM2_member"
/dev/sdb: PTTYPE="dos"
And stracing GParted shows these executions of blkid:
# strace -f -q -bexecve -eexecve ./gpartedbin 2>&1 1> /dev/null | egrep -v 'ENOENT|SIGCHLD'
...
[pid 1889] execve("/sbin/blkid", ["blkid", "/dev/sda", "/dev/sda1", "/dev/sda2", "/dev/sdb", "/dev/sdb1", "/dev/sdc"], 0x10b8b10 /* 26 vars */ <detached ...>
[pid 1890] execve("/sbin/blkid", ["blkid", "/dev/sdb1"], 0x10b8b10 /* 26 vars */ <detached ...>
[pid 1891] execve("/sbin/blkid", ["blkid", "/dev/sdc"], 0x10b8b10 /* 26 vars */ <detached ...>
...
This time on CentOS 7 with blkid from util-linux 2.23.2 it reports
information for only sdb (partitioned drive), but not sdb1 (empty
partition) or sdc (empty whole disk drive). Hence no FS_Info cache
entries and re-execution of blkid for both sdb1 and sdc.
GParted needs blkid identification of file system images, LVM Logical
Volumes or any other partitions named on the command line which it
wouldn't normally scan [1]. Now that every name of interest is passed
to blkid, additional executions of blkid won't get any extra information
and are redundant. Therefore remove this unnecessary code.
Note that these last 2 commits remove creation of "blank" cache entries
(just block special with blank fstype and other attributes) when blkid
reports no information for a particular path. Those entries were needed
to suppress unnecessary additional execution of blkid. However now that
blkid is only executed once (excluding querying the label) this is no
longer necessary. All the getter functions return suitable blank values
when no cache entry is found.
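The behaviour of getters returning blank values without needing "blank"
entries can be sketched as follows. BlkidCache is a hypothetical
stand-in written for this illustration, not the FS_Info class:

```python
import re


class BlkidCache:
    """Toy cache built from the output of a single run of blkid."""

    def __init__(self, blkid_output: str):
        self.entries = {}
        for line in blkid_output.splitlines():
            path, sep, rest = line.partition(": ")
            if not sep:
                continue
            # Collect NAME="value" pairs such as TYPE="ext4".
            self.entries[path] = dict(re.findall(r'(\w+)="([^"]*)"', rest))

    def get_fs_type(self, path: str) -> str:
        # No entry, or an entry without TYPE, both give a blank value.
        return self.entries.get(path, {}).get("TYPE", "")
```

A path which blkid reported nothing for simply has no entry, and the
getter still returns an empty string for it.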
[1] e8f0504b13
Make sure that FS_Info cache is loaded for all named paths (#787181)
Closes #131 - GParted hangs when non-named device is hung
On Fedora 31 with this simple disk layout where both sdb and sdc are
completely empty:
# lsblk -o name,maj:min,rm,size,ro,type,fstype,label,mountpoint
NAME MAJ:MIN RM SIZE RO TYPE FSTYPE LABEL MOUNTPOINT
sda 8:0 0 20G 0 disk
|-sda1 8:1 0 1G 0 part ext4 /boot
\-sda2 8:2 0 19G 0 part LVM2_member
|-fedora-root 253:0 0 17G 0 lvm ext4 /
\-fedora-swap 253:1 0 2G 0 lvm swap [SWAP]
sdb 8:16 0 8G 0 disk
sdc 8:32 0 8G 0 disk
sr0 11:0 1 1024M 0 rom
# blkid /dev/sda /dev/sda1 /dev/sda2 /dev/sdb /dev/sdc
/dev/sda: PTUUID="5012fb1f" PTTYPE="dos"
/dev/sda1: UUID="3cd48816-7817-4636-9fec-5f1afe76c1b2" TYPE="ext4" PARTUUID="5012fb1f-01"
/dev/sda2: UUID="PH94ej-C8xU-bnMJ-UIh8-ZimI-4B7f-dHlZxh" TYPE="LVM2_member" PARTUUID="5012fb1f-02"
Stracing GParted shows extra executions of blkid:
# strace -f -q -bexecve -eexecve ./gpartedbin 2>&1 1> /dev/null | egrep -v 'ENOENT|SIGCHLD'
...
[pid 7659] execve("/usr/sbin/blkid", ["blkid", "/dev/sda", "/dev/sda1", "/dev/sda2", "/dev/sdb", "/dev/sdc"], 0x1d300f0 /* 32 vars */ <detached ...>
[pid 7660] execve("/usr/sbin/blkid", ["blkid", "/dev/sdb"], 0x1d300f0 /* 32 vars */ <detached ...>
[pid 7661] execve("/usr/sbin/blkid", ["blkid", "/dev/sdc"], 0x1d300f0 /* 32 vars */ <detached ...>
...
blkid is only run again for sdb and sdc, not sda, because blkid didn't
report anything for them from the first execution. GParted needs blkid
identification of whole disk devices to ensure that ISO9660 images on
whole disk devices are correctly identified [1]. Now that the first run
of blkid passes all the device names, this additional execution of blkid
won't get any extra information and is redundant. Therefore remove this
unnecessary code.
[1] b2190372d0
Ensure blkid FS_Info cache has entries for all whole disk devices
(#771244)
Closes #131 - GParted hangs when non-named device is hung
A user reported that GParted would hang at "scanning all devices...",
when a fully working disk was named on the command line, but another
device on the machine was hung.
This can be replicated like this:
(on Ubuntu 20.04 LTS for its NBD support)
1. Export and import NBD:
# truncate -s 1G /tmp/disk-1G.img
# nbd-server -C /dev/null 9000 /tmp/disk-1G.img
# nbd-client localhost 9000 /dev/nbd0
2. Hang the NBD server and therefore /dev/nbd0:
# killall -STOP nbd-server
3. Run GParted:
$ gparted /dev/sda
Tracing GParted shows that execution of blkid never returns.
# strace -f -tt -q -bexecve -eexecve ./gpartedbin 2>&1 1> /dev/null | fgrep -v ENOENT
...
[pid 37823] 13:56:24.814139 execve("/usr/sbin/mkudffs", ["mkudffs", "--help"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
[pid 37814] 13:56:24.829246 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=37823, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
[pid 37825] 13:56:25.376796 execve("/usr/sbin/blkid", ["blkid", "-v"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
[pid 37824] 13:56:25.380824 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=37825, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
[pid 37826] 13:56:25.402512 execve("/usr/sbin/blkid", ["blkid"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
Tracing blkid shows that it hangs on either the open of or first read
from /dev/nbd0.
# strace blkid
...
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=4560, ...}) = 0
lstat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
stat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=4560, ...}) = 0
lstat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
access("/dev/nbd0", F_OK) = 0
stat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
openat(AT_FDCWD, "/sys/dev/block/43:0", O_RDONLY|O_CLOEXEC) = 4
openat(4, "dm/uuid", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
close(4) = 0
openat(AT_FDCWD, "/dev/nbd0", O_RDONLY|O_CLOEXEC
Clean up:
1. Resume NBD server:
# killall -CONT nbd-server
2. Delete NBD setup:
# nbd-client -d /dev/nbd0
# killall nbd-server
# rm /tmp/disk-1G.img
Fix this by making GParted specify the whole disk device and partition
names that it is interested in to blkid, rather than letting blkid scan
and report all block devices. Do this both when GParted determines the
devices for itself and when they are named on the command line.
Also update the example blkid command output being parsed and the cache
values to match this change in how blkid is executed.
Closes #131 - GParted hangs when non-named device is hung
GParted already always reads /proc/partitions for whole disk device
names no matter whether it uses whole disk devices named on the command
line, from /proc/partitions or from libparted. As /proc/partitions
lists all the block devices that the kernel knows about, and therefore
all the possible ones blkid could probe, use it to provide partition
names and the device to partition mapping. See code comments for more
details about the assumptions the /proc/partitions parsing code makes
and the fact that these are confirmed by examining the Linux kernel
source.
This commit just adds debugging to print the existing vector of
validated devices GParted shows in the UI and the vector with all
partitions added, ready for but not yet passed to blkid.
# ./gpartedbin
...
DEBUG: device_paths=["/dev/sda","/dev/sdb"]
DEBUG: device_and_partition_paths=["/dev/sda","/dev/sda1","/dev/sda2","/dev/sdb","/dev/sdb1"]
Also demonstrating that this continues to support named devices,
including file system image files [1].
# truncate -s 256M /tmp/ext4.img
# mkfs.ext4 /tmp/ext4.img
# ./gpartedbin /dev/sda /tmp/ext4.img
...
DEBUG: device_paths=["/dev/sda","/tmp/ext4.img"]
DEBUG: device_and_partition_paths=["/dev/sda","/dev/sda1","/dev/sda2","/tmp/ext4.img"]
[1] e8f0504b13
Make sure that FS_Info cache is loaded for all named paths (#787181)
Closes #131 - GParted hangs when non-named device is hung
Put the whole disk device name matching code into a helper function to
make the /proc/partitions parsing code easier to understand.
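For illustration, such a helper might look like the Python sketch below.
The function name and the three patterns are assumptions chosen for the
example; they are not the list of device name patterns GParted matches.

```python
import re

# Illustrative patterns only: IDE/SCSI/virtio style disks, NVMe
# namespaces and MMC/SD cards. A real list would cover more drivers.
_WHOLE_DISK_PATTERNS = [
    re.compile(r"[hsv]d[a-z]+"),
    re.compile(r"nvme\d+n\d+"),
    re.compile(r"mmcblk\d+"),
]


def is_whole_disk_name(name):
    """Return True if name looks like a whole disk, not a partition."""
    return any(p.fullmatch(name) for p in _WHOLE_DISK_PATTERNS)


for candidate in ["sda", "sda1", "nvme0n1", "nvme0n1p2", "mmcblk0"]:
    print(candidate, is_whole_disk_name(candidate))
```

Keeping the matching in one predicate lets the line by line parsing loop
read as a simple filter.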
Closes #131 - GParted hangs when non-named device is hung
Now that FS_Info::load_cache() and ::load_cache_for_paths() are nearly
next to each other, merge them together to simplify the code a little.
This turns the special case, which ensured that file system images named
on the command line were queried by blkid and loaded into the FS_Info
cache [1], into the normal cache loading method. Already passing all
discovered or named devices to load_cache_for_paths() is also a step on
the way to doing it for all devices and partitions of interest.
The only requirement is that load_cache_for_paths() always loads the
cache, as load_cache() did, rather than only when it hasn't already been
loaded. Otherwise GParted would only ever run blkid and load the cache
once at startup and not on each refresh.
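The required behaviour can be sketched in Python as follows. This is not
the FS_Info class itself; FsInfoCache, its run_blkid callable and the fake
runner are hypothetical stand-ins used to show that every call re-queries
instead of reusing an already loaded cache.

```python
class FsInfoCache:
    """Toy cache that is rebuilt on every load_cache_for_paths() call."""

    def __init__(self, run_blkid):
        # run_blkid is a hypothetical callable: list of paths -> dict
        self._run_blkid = run_blkid
        self._cache = {}

    def load_cache_for_paths(self, paths):
        # No "already loaded" check here: always query again so that a
        # refresh picks up changes made since the previous load.
        self._cache = dict(self._run_blkid(list(paths)))

    def get(self, path):
        return self._cache.get(path)


calls = []


def fake_blkid(paths):
    """Stand-in for running blkid; records each invocation."""
    calls.append(paths)
    return {p: {"TYPE": "ext4"} for p in paths}


cache = FsInfoCache(fake_blkid)
cache.load_cache_for_paths(["/dev/sda1"])  # startup
cache.load_cache_for_paths(["/dev/sda1"])  # refresh
print(len(calls))
```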
[1] e8f0504b13
Make sure that FS_Info cache is loaded for all named paths (#787181)
Closes #131 - GParted hangs when non-named device is hung
PATCHSET OVERVIEW
A user reported that GParted would hang at "scanning all devices...",
when a fully working disk was named on the command line, but another
device on the machine was hung.
This can be replicated like this:
(on Ubuntu 20.04 LTS for its NBD support)
1. Export and import NBD:
# truncate -s 1G /tmp/disk-1G.img
# nbd-server -C /dev/null 9000 /tmp/disk-1G.img
# nbd-client localhost 9000 /dev/nbd0
2. Hang the NBD server and therefore /dev/nbd0:
# killall -STOP nbd-server
3. Run GParted:
$ gparted /dev/sda
Tracing GParted shows that execution of blkid never returns.
# strace -f -tt -q -bexecve -eexecve /usr/sbin/gpartedbin 2>&1 1> /dev/null | fgrep -v ENOENT
...
[pid 37823] 13:56:24.814139 execve("/usr/sbin/mkudffs", ["mkudffs", "--help"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
[pid 37814] 13:56:24.829246 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=37823, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
[pid 37825] 13:56:25.376796 execve("/usr/sbin/blkid", ["blkid", "-v"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
[pid 37824] 13:56:25.380824 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=37825, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
[pid 37826] 13:56:25.402512 execve("/usr/sbin/blkid", ["blkid"], 0x55e2a3f2d230 /* 20 vars */ <detached ...>
Tracing of blkid shows that it hangs on either the open of or first
read from /dev/nbd0.
# strace blkid
...
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=4560, ...}) = 0
lstat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
stat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
lstat("/dev", {st_mode=S_IFDIR|0755, st_size=4560, ...}) = 0
lstat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
access("/dev/nbd0", F_OK) = 0
stat("/dev/nbd0", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x2b, 0), ...}) = 0
openat(AT_FDCWD, "/sys/dev/block/43:0", O_RDONLY|O_CLOEXEC) = 4
openat(4, "dm/uuid", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
close(4) = 0
openat(AT_FDCWD, "/dev/nbd0", O_RDONLY|O_CLOEXEC
Clean up:
1. Resume NBD server:
# killall -CONT nbd-server
2. Delete NBD setup:
# nbd-client -d /dev/nbd0
# killall nbd-server
# rm /tmp/disk-1G.img
The patchset fixes this by making GParted specify the device and partition
names that it is interested in to blkid, rather than letting blkid scan
and report all block devices. Do this both when GParted determines the
devices for itself and when they are named on the command line.
THIS PATCH
Move the loading and initialising of caches used during content
discovery to after device and partition discovery and just before
content discovery. This only prepares the code for the next change.
Closes #131 - GParted hangs when non-named device is hung