From 7dbf0691f17a86190ad557467c393b80ba661527 Mon Sep 17 00:00:00 2001
From: Mike Fleetwood
Date: Sat, 20 Feb 2021 23:56:28 +0000
Subject: [PATCH] Accept NUL as a valid UTF-8 character again (#136)

On newer distributions the PipeCapture tests have been failing like
this:

    $ ./test_PipeCapture
    ...
    [ RUN ] PipeCaptureTest.ReadEmbeddedNULCharacter
    test_PipeCapture.cc:336: Failure
    Expected: inputstr
      Of length: 6
    To be equal to: capturedstr.raw()
      Of length: 5
    With first binary difference:
    < 0x00000000 "ABC.EF" 41 42 43 00 45 46
    --
    > 0x00000000 "ABCEF" 41 42 43 45 46
    [ FAILED ] PipeCaptureTest.ReadEmbeddedNULCharacter (0 ms)
    [ RUN ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character
    test_PipeCapture.cc:353: Failure
    Expected: expectedstr
      Of length: 7
    To be equal to: capturedstr.raw()
      Of length: 6
    With first binary difference:
    < 0x00000000 "._45678" 00 5F 34 35 36 37 38
    --
    > 0x00000000 "_45678" 5F 34 35 36 37 38
    [ FAILED ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character (0 ms)
    ...

Found that test_PipeCapture succeeds on Fedora 31 and fails on Fedora
32.  Also test_PipeCapture binary from Fedora 31 and 32 both pass on
Fedora 31 and both fail on Fedora 32.  So something outside of the
GParted code and tests is the cause.

Confirmed that this GLib change "Add a missing check to
g_utf8_get_char_validated()" [1], first released in GLib 2.63.0, made
the difference.  On Fedora 32 with GLib 2.64.6, rebuilt GLib with that
change reverted and the tests passed.

Anyway fix the wrapper GParted has around g_utf8_get_char_validated()
to also handle this case of reading a NUL character.
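
As an illustration only, and not part of the patch: the sketch below models
the behaviour change with a hypothetical, ASCII-only stand-in
(fake_get_char_validated) for g_utf8_get_char_validated(), so it builds
without GLib.  The names unichar, PARTIAL, new_glib and
wrapped_get_char_validated are assumptions made for this sketch.  The
stand-in assumes that newer GLib reports a NUL byte read with max_len > 0 as
a partial character, which is what the failing tests above suggest.

```cpp
#include <cstdint>

typedef std::uint32_t unichar;                       // stand-in for GLib's gunichar
const unichar PARTIAL = static_cast<unichar>(-2);    // stand-in for UTF8_PARTIAL

// Hypothetical, ASCII-only stand-in for g_utf8_get_char_validated().
// new_glib == true models the assumed behaviour from GLib 2.63.0 onwards:
// a NUL byte read with max_len > 0 is reported as a partial character.
unichar fake_get_char_validated(const char *p, long max_len, bool new_glib)
{
    if (max_len == 0)
        return PARTIAL;
    unichar uc = static_cast<unsigned char>(*p);
    if (new_glib && uc == 0 && max_len > 0)
        return PARTIAL;
    return uc;
}

// Wrapper reduced to the check this patch adds.
unichar wrapped_get_char_validated(const char *p, long max_len, bool new_glib)
{
    unichar uc = fake_get_char_validated(p, max_len, new_glib);
    if (uc == PARTIAL && max_len > 0)
    {
        // Report NUL character as such.
        if (*p == '\0')
            return '\0';
    }
    return uc;
}
```

With new_glib set, the stand-in returns PARTIAL for the NUL in "ABC\0EF"
while the wrapper returns 0, which is the result the two failing tests
expect.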

[1] https://gitlab.gnome.org/GNOME/glib/-/commit/568720006cd1da3390c239915337ed0a56a23f2e
    Add a missing check to g_utf8_get_char_validated()

Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
---
 src/PipeCapture.cc | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/PipeCapture.cc b/src/PipeCapture.cc
index d5b6f4f2..3244123a 100644
--- a/src/PipeCapture.cc
+++ b/src/PipeCapture.cc
@@ -260,6 +260,10 @@ gunichar PipeCapture::get_utf8_char_validated(const char *p, gssize max_len)
 	gunichar uc = g_utf8_get_char_validated(p, max_len);
 	if (uc == UTF8_PARTIAL && max_len > 0)
 	{
+		// Report NUL character as such.
+		if (*p == '\0')
+			return '\0';
+
 		// If g_utf8_get_char_validated() found a NUL byte in the middle of a
 		// multi-byte character, even when there are more bytes available as
 		// specified by max_len, it reports a partial UTF-8 character.  Report