Commit Graph

12 Commits

Author SHA1 Message Date
Mike Fleetwood b1cad17a14 Refactor ::OnReadable() creating get_utf8_char_validated() (#136)
Extract call to GLib's g_utf8_get_char_validated() and the associated
workaround to also read NUL characters into a separate function to make
PipeCapture::OnReadable() a little smaller and simpler, so easier to
understand.

Add max_len > 0 clause into get_utf8_char_validated() like this:
    if (uc == UTF8_PARTIAL && max_len > 0)
so that the NUL character reading workaround is only applied when
max_len specifies the maximum number of bytes to read, rather than
when -1 specifies reading a NUL termination string.  This makes
get_utf8_char_validated() a complete wrapper of
g_utf8_get_char_validated() [1], even though GParted always specifies
the maximum number of bytes to read.

No longer describe the inability to read NUL characters as a bug [2]
since the GLib author's said it wasn't [3].

[1] GLib Reference Manual, Unicode Manipulation Functions,
    g_utf8_get_char_validated ()
    https://developer.gnome.org/glib/stable/glib-Unicode-Manipulation.html#g-utf8-get-char-validated

[2] 8dbbb47ce2
    Workaround g_utf8_get_char_validate() bug with embedded NUL bytes
    (#777973)

[3] Bug 780095 - g_utf8_get_char_validated() stopping at nul byte even
    for length specified buffers
    https://bugzilla.gnome.org/show_bug.cgi?id=780095#18
        "If g_utf8_get_char_validated() encounters a nul byte in the
        middle of a string of given longer length, it returns -2,
        indicating a partial gunichar.  That is not the obvious
        behaviour, but since g_utf8_get_char_validated() has been API
        for a long time, the behaviour cannot be changed.
        "

Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
2021-02-22 16:14:35 +00:00
Mike Fleetwood 25780c611b Further improve speed of PipeCapture for non-watched output (#777973)
For large output a lot of time is used copying capturebuf to callerbuf
to provide a Glib::ustring copy of the buffer for the update callback.
However update callbacks are only used when commands are run to apply
operations by FileSystem::execute_command() and their output is
incrementally displayed in the UI.  Whereas update callbacks are never
used when commands are used to query information via
Utils::execute_command().

Stop performing interim copying of capturebuf to callerbuf when there
are no update callbacks registered as it is unnecessary.

Time to read portions of the recorded fsck.fat output via
fat16::set_used_sectors() and intermediate copies aren't required:

                     1 MiB     10 MiB   122 MiB
    old code :   0.074 sec   1.41 sec   210 sec [3:30]
    new code :   0.063 sec   0.56 sec     6.57 sec

Bug 777973 - Segmentation fault on bad disk
2017-06-02 11:47:35 -06:00
Mike Fleetwood 8dbbb47ce2 Workaround g_utf8_get_char_validate() bug with embedded NUL bytes (#777973)
If PipeCapture reads a NUL byte in the middle of what is expected to be
a multi-byte UTF-8 character then PipeCapture either returns the
captured characters to the previous update or loops forever depending on
whether the end of the stream is encountered before the read buffer is
full or not.  This is equivalent to saying whether the NUL byte occurs
within the last 512 bytes of the output or not.

This is caused by a bug in g_utf8_get_char_validated() reporting that a
partial UTF-8 character has been found when the NUL byte is encountered
in the middle of a multi-byte character even though more bytes are
available in the length specified buffer.  g_utf8_get_char_validated()
is always stopping at the NUL byte assuming it is working with a NUL
terminated string.

Workaround this by checking for g_utf8_get_char_validated() claiming a
partial UTF-8 character has been found when in fact there are at least
enough bytes in the read buffer to instead determine that it is really
an invalid UTF-8 character.

Reference:
    Bug 780095 - g_utf8_get_char_validated() stopping at nul byte even
                 for length specified buffers
    https://bugzilla.gnome.org/show_bug.cgi?id=780095

Bug 777973 - Segmentation fault on bad disk
2017-06-02 11:47:35 -06:00
Mike Fleetwood b5b3d199f8 Improve the performance of PipeCapture for large output (#777973)
A user had a very corrupted FAT file system such that fsck.fat produced
122 MiB of output.  GParted has to read this output to get the file
system usage information.  However GParted takes more than 48 hours to
read the 122 MiB of output, while using 100% CPU time and is
unresponsive for the duration.

Modified fsck.fat to output just the first 1 MiB of output and used perf
to capture performance data from GParted reading that output:
    # perf -g -F 1999 -- ./gpartedbin
    # perf report --stdio
      67.84% Glib::ustring::replace      [4.23s]
      17.67% g_utf8_pointer_to_offset    [1.10s]
       8.48% g_utf8_offset_to_pointer    [0.53s]
    [  6.01% (everything else)       ]   [0.38s]
    [100.00% TOTAL                   ]   [6.24s]

And to read the first 10 MiB of output the performance figures are:
      92.95% Glib::ustring::replace      [257.44s]
       4.35% g_utf8_pointer_to_offset    [ 12.05s]
       2.13% g_utf8_offset_to_pointer    [  5.90s]
    [  0.58% (everything else)       ]   [  1.61s]
    [100.00% TOTAL                   ]   [277.00s]

See how the total time is increasing non-linearly, 44 times longer for
only 10 times as much data.  This is because of the exponential increase
in time spent in Glib::ustring::replace.

After a lot of experimentation I came to the conclusion that
Glib::ustrings are not appropriate for storing and editing large buffers
of data, sizes megabytes and above.  The issues are that iterators are
invalid after the content changes and replacing UTF-8 characters by
index gets exponentially slower as the size of the string increases.
Hence the > 48 hours of 100% CPU time to read and apply the line
discipline to the 122 MiB of fsck.fat output.  See code comment for a
more detailed description of the issues found.

Rewrote OnReadable() to use Glib::ustrings as little as possible.
Instead using buffers and vectors of fixed width data types allowing for
fast access using pointers and indexes (converted to pointers by the
compiler with simple arithmetic).  Again see code comment for a more
detailed description of the implementation.

Repeating the performance capture with the new code for the first 1 MiB
of fsck.fat output:
      63.34% memcpy                      [0.35s]
    [ 36.66% (everything else)       ]   [0.21s]
    [100.00% TOTAL                   ]   [0.56s]

And for the first 10 MiB of fsck.fat output:
      96.66% memcpy                      [63.60s]
    [  3.34% (everything else)       ]   [ 2.20s]
    [100.00% TOTAL                   ]   [65.80s]

Simple timings taken to read portions of the fsck.fat output (when not
using perf):

                   1 MiB    10 MiB      122 MiB
    old code :   6.2 sec   277 sec   > 48 hours
                            (4:37)
    new code :   0.6 sec    66 sec    17262 sec
                            (1:06)    (4:47:42)

Performance of the code is still non-linear because of the assignment
of the ever growing capturebuf to callerbuf for every block of input
read.  This is required to generate a consistent Glib::ustring copy of
the input for the update callback.  However this is much faster than
before, and I have a plan for further improvements.

Bug 777973 - Segmentation fault on bad disk
2017-06-02 11:47:35 -06:00
Mike Fleetwood 0fcfd18061 Prevent cross thread write after free in _OnReadable() (#731752)
Fragment of debugging and valgrind output:
D: tid=2193 main()
...
D: tid=2202 GParted_Core::set_devices_thread()
...
D: tid=2202 Utils::execute_command(command="dumpe2fs -h /dev/sda1", output, error, use_C_locale=1)
D: tid=2202 this=0x13fef4a0 PipeCapture::PipeCapture()
D: tid=2202 this=0x13fef4f0 PipeCapture::PipeCapture()
D: tid=2202 this=0x13fef4a0 PipeCapture::connect_signal()
D:  sourceid=77
D: tid=2202 this=0x13fef4f0 PipeCapture::connect_signal()
D:  sourceid=78
D: tid=2193 data=0x13fef4a0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4a0 PipeCapture::OnReadable()
D:  signal_update.emit()
D:  return true
D: tid=2193 data=0x13fef4f0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4f0 PipeCapture::OnReadable()
D:  signal_update.emit()
D:  return true
D: tid=2193 data=0x13fef4a0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4a0 PipeCapture::OnReadable()
D:  signal_update.emit()
D:  return true
D: tid=2193 data=0x13fef4f0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4f0 PipeCapture::OnReadable()
D:  signal_eof.emit()
D:  return false
D:  (!rc)  &(pc->sourceid)=0x13fef518
D: tid=2193 data=0x13fef4a0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4a0 PipeCapture::OnReadable()
D:  signal_update.emit()
D:  return true
D: tid=2193 data=0x13fef4a0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4a0 PipeCapture::OnReadable()
D:  signal_update.emit()
D:  return true
D: tid=2193 data=0x13fef4a0 PipeCapture::_OnReadable()
D: tid=2193 this=0x13fef4a0 PipeCapture::OnReadable()
D:  signal_eof.emit()
D: tid=2202 this=0x13fef4f0 PipeCapture::~PipeCapture()
D:  sourceid=0
D: tid=2202 this=0x13fef4a0 PipeCapture::~PipeCapture()
D:  sourceid=77
D:  return false
D:  (!rc)  &(pc->sourceid)=0x13fef4c8
==2193== Thread 1:
==2193== Invalid write of size 4
==2193==    at 0x490580: GParted::PipeCapture::_OnReadable(_GIOChannel*, GIOCondition, void*) (PipeCapture.cc:56)
==2193==    by 0x38662492A5: g_main_context_dispatch (gmain.c:3066)
==2193==    by 0x3866249627: g_main_context_iterate.isra.24 (gmain.c:3713)
==2193==    by 0x3866249A39: g_main_loop_run (gmain.c:3907)
==2193==    by 0x3D7FD45C26: gtk_main (gtkmain.c:1257)
==2193==    by 0x469743: GParted::GParted_Core::set_devices(std::vector<GParted::Device, std::allocator<GParted::Device> >&) (GParted_Core.cc:155)
==2193==    by 0x4A78F1: GParted::Win_GParted::menu_gparted_refresh_devices() (Win_GParted.cc:1259)
==2193==    by 0x4A7886: GParted::Win_GParted::on_show() (Win_GParted.cc:1253)
==2193==    by 0x3D82B2009C: Gtk::Widget_Class::show_callback(_GtkWidget*) (widget.cc:3855)
==2193==    by 0x3867210297: g_closure_invoke (gclosure.c:777)
==2193==    by 0x3867221B86: signal_emit_unlocked_R (gsignal.c:3516)
==2193==    by 0x386722A0F1: g_signal_emit_valist (gsignal.c:3330)
==2193==  Address 0x13fef4c8 is not stack'd, malloc'd or (recently) free'd
==2193==

PipeCapture.cc (with debugging):
    46  gboolean PipeCapture::_OnReadable( GIOChannel *source,
    47                                     GIOCondition condition,
    48                                     gpointer data )
    49  {
    50          std::cout << "D: tid=" << (long int)syscall(SYS_gettid) << " data=" << data << " PipeCapture::_OnReadable()" << std::endl;
    51          PipeCapture *pc = static_cast<PipeCapture *>(data);
    52          gboolean rc = pc->OnReadable( Glib::IOCondition(condition) );
    53          if (!rc)
    54          {
    55                  std::cout << "D:  (!rc)  &(pc->sourceid)=" << &(pc->sourceid) << std::endl;
    56                  pc->sourceid = 0;
    57          }
    58          return rc;
    59  }

The use after free across threads only happens when an external program
is being executed from a thread other than the main() thread.  This is
because by default glib registered callbacks are run by the glib main
loop, which is only called from the main() thread with Gtk::Main::run().

Event sequence:
tid=2193                      tid=2202

main()
...
  GParted_Core::set_devices()
    Glib::Thread::create(... set_devices_thread ...)
    Gtk::Main::run()          GParted_Core::set_devices_thread()
                              ...
                                Utils::execute_command("dumpe2fs ... /dev/sda1" ...)
                                  Glib::spawn_async_with_pipes()
                                  PipeCapture outputcapture(out, output)
                                  outputcapture.connect_signal()
      //Glib main loop runs callback
      PipeCapture::_OnReadable()
        pc->OnReadable()
          //output read
          signal_update.emit()
          return true
      ...
      //Glib main loop runs callback
      PipeCapture::_OnReadable()
        pc->OnReadable()
          //eof reached
[1]       signal_eof.emit()
                                  return status.exit_status
[2]                               PipeCapture::~PipeCapture()
[3]       return false
[4]     pc->sourceid = 0

What is happening is that the PipeCapture destructor [2] is running in
the set_devices_thread() thread and freeing the object's memory as soon
as signal_eof.emit() [1] has been called.  Then signal_eof.emit()
returns back to OnReadable() which then returns false [3] back to the
_OnReadable() callback function which then assigns 0 to sourceid member
variable [4] in the already freed object, detected by valgrind as:
    Invalid write of size 4
       at ... GParted::PipeCapture::_OnReadable(...) (PipeCapture.cc:56)

This is happening because PipeCapture member variable sourceid is being
saved, in a different thread, just so the _OnReadable() callback can be
removed.  However a glib IOChannel callback, type GIOFunc(), returning
false will be automatically removed.

    GLib Reference Manual 2.26 / IO Channels
    https://developer.gnome.org/glib/2.26/glib-IO-Channels.html#GIOFunc

    GIOFunc()

    Returns : the function should return FALSE if the event source
              should be removed

Therefore fix by just not saving the event sourceid at all, and not
calling g_source_remove() to manually remove the callback, but instead
letting glib automatically remove the callback when it returns false.

Bug #731752 - Write after free cross thread race in
              PipeCapture::_OnReadable()
2014-07-06 10:22:40 -06:00
Daniel Mustieles 3861b9257b Replace obsolete FSF postal address in copyright notices (#721565)
This is part of parent bug:
    Bug #721455 - Obsolete info in license text on multiple modules

and GNOME Goal:
    https://wiki.gnome.org/Initiatives/GnomeGoals/Proposals

    * verify all source files to make sure they have a license and a
      copyright, and that both are up-to-date

Bug #721565 -  License text contains obsolete FSF postal address
2014-01-26 10:53:23 +00:00
Mike Fleetwood 1b54123580 Enhance line discipline in PipeCapture::OnReadable() (#709276)
Add concept of cursor position within the current line, separate from
the end of the buffer.  This is so that programs which output a text
progress bar using backspace, such as resize2fs -p, are displayed
correctly.

Bug #709276 - Percentage indicator for subcommand
2013-10-03 10:23:46 -06:00
Mike Fleetwood 2b51d87147 Make include guards unique (#539297)
Include guards need to be unique within GParted code and all included
library header files.
    http://en.wikipedia.org/wiki/Include_guard#Difficulties

Use this model for all include guards:
    #ifndef GPARTED_FILE_NAME_H
    #define GPARTED_FILE_NAME_H
    ...
    #endif /* GPARTED_FILE_NAME_H */

Closes Bug #539297 - Make include guards unique
2013-06-05 10:57:39 -06:00
Mike Fleetwood 0d52cd19d7 Rename signals update and eof
Rename the libsigc++ signals to signal_update and signal_eof to match
the naming used for signals in GParted.
    fgrep 'sigc::signal' include/*.h

Also explicitly use the emit() method rather than using the object
operator().  This again is to match the convention in GParted and make
it more obvious what is happening.
    fgrep '.emit(' include/*.h
2013-05-10 10:03:53 -06:00
Phillip Susi 9475731ac8 Avoid glibmm GSource bug/crash (again) (#697727)
The previous commit missed one glibmm GSource wrapper in the form of the
io watch for the PipeCapture class.  Convert this one to use glib
directly as well.

Bug #697727 - Segfault in livecd Gparted v 0.15.0-3 when copying
              partition
2013-04-29 18:42:53 +01:00
Mike Fleetwood 1386984def Add copyright notice to PipeCapture .cc and .h files 2013-04-28 13:51:06 -06:00
Phillip Susi 52a2a9b00a Reduce threading (#685740)
Win_Gparted and Dialog_Progress were creating threads to perform most
functions in the background.  Most of the time, the only reason the
threads blocked was to execute an external command.  The external command
execution has been changed to spawn the command asynchronously and wait
for completion with a nested main loop.  While waiting for completion,
the pipe output is captured via events.  In the future, this will allow
for it to be parsed in real time to obtain progress information.

Those tasks in GParted_Core that still block now spawn a background thread
and wait for it to complete with a nested main loop to avoid hanging the
gui.

Part of Bug #685740 - Refactor to use asynchronous command execution
2013-03-11 18:40:31 -06:00