Accept NUL as a valid UTF-8 character again (#136)
On newer distributions the PipeCapture tests have been failing like
this:
$ ./test_PipeCapture
...
[ RUN ] PipeCaptureTest.ReadEmbeddedNULCharacter
test_PipeCapture.cc:336: Failure
Expected: inputstr
Of length: 6
To be equal to: capturedstr.raw()
Of length: 5
With first binary difference:
< 0x00000000 "ABC.EF" 41 42 43 00 45 46
--
> 0x00000000 "ABCEF" 41 42 43 45 46
[ FAILED ] PipeCaptureTest.ReadEmbeddedNULCharacter (0 ms)
[ RUN ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character
test_PipeCapture.cc:353: Failure
Expected: expectedstr
Of length: 7
To be equal to: capturedstr.raw()
Of length: 6
With first binary difference:
< 0x00000000 "._45678" 00 5F 34 35 36 37 38
--
> 0x00000000 "_45678" 5F 34 35 36 37 38
[ FAILED ] PipeCaptureTest.ReadNULByteInMiddleOfMultiByteUTF8Character (0 ms)
...
Found that test_PipeCapture succeeds on Fedora 31 and fails on
Fedora 32. Also, the test_PipeCapture binaries from Fedora 31 and 32
both pass when run on Fedora 31 and both fail when run on Fedora 32.
So something outside of the GParted code and tests is the cause.
Confirmed that the GLib change "Add a missing check to
g_utf8_get_char_validated()" [1], first released in GLib 2.63.0, made
the difference: on Fedora 32 with GLib 2.64.6, the tests passed after
rebuilding GLib with that change reverted. Fix the wrapper GParted has
around g_utf8_get_char_validated() so that it also handles this case
of reading a NUL character.
[1] 568720006c
Add a missing check to g_utf8_get_char_validated()
Closes #136 - 1.2.0: test suite is failing in test_PipeCapture
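A minimal, GLib-free sketch of the idea behind the fix follows. It is not the GParted code: `unichar`, `PARTIAL`, `INVALID`, `stub_get_char_validated()` and `get_char_reporting_nul()` are hypothetical names, and the stub is an ASCII-only stand-in that models only the behaviour relevant here, namely that a NUL byte read with a positive max_len is reported as a partial character. The wrapper then turns that specific case back into a NUL character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for gunichar and the sentinel values (-2 partial, -1 invalid).
using unichar = std::uint32_t;
const unichar PARTIAL = static_cast<unichar>(-2);
const unichar INVALID = static_cast<unichar>(-1);

// Assumption: ASCII-only stub imitating how newer GLib treats a NUL byte
// when max_len is positive.  Multi-byte decoding is deliberately not modelled.
unichar stub_get_char_validated(const char *p, std::ptrdiff_t max_len)
{
    if (max_len == 0)
        return PARTIAL;                 // No bytes available to read.
    unsigned char c = static_cast<unsigned char>(*p);
    if (c == 0 && max_len > 0)
        return PARTIAL;                 // NUL inside a bounded buffer.
    if (c < 0x80)
        return c;                       // Plain ASCII (or terminating NUL).
    return INVALID;                     // Not modelled in this sketch.
}

// Wrapper in the spirit of the fix: report a leading NUL byte as a NUL
// character instead of passing on the "partial character" result.
unichar get_char_reporting_nul(const char *p, std::ptrdiff_t max_len)
{
    unichar uc = stub_get_char_validated(p, max_len);
    if (uc == PARTIAL && max_len > 0 && *p == '\0')
        return '\0';
    return uc;
}
```

With the buffer from the failing test ("ABC", NUL, "EF"), the stub reports the byte at offset 3 as partial while the wrapper reports it as NUL, so a caller appending validated characters would keep all 6 bytes.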
parent b1cad17a14
commit 7dbf0691f1
@@ -260,6 +260,10 @@ gunichar PipeCapture::get_utf8_char_validated(const char *p, gssize max_len)
     gunichar uc = g_utf8_get_char_validated(p, max_len);
     if (uc == UTF8_PARTIAL && max_len > 0)
     {
+        // Report NUL character as such.
+        if (*p == '\0')
+            return '\0';
+
         // If g_utf8_get_char_validated() found a NUL byte in the middle of a
         // multi-byte character, even when there are more bytes available as
         // specified by max_len, it reports a partial UTF-8 character. Report