[scribus] xtg fileimport

Tue Jun 5 14:21:23 UTC 2018

On Tuesday, 05 June 2018 15:35 CEST, "Ralf Mattes" <r.mattes at mh-freiburg.de> wrote:

>  - from what I can tell the xtg import plugin correctly detects the declared encoding and does find
>    the correct QTextCodec for the encoding, so m_codec seems to be correct. But somehow the
>    actual text imported is garbled. 
>  - If I import non-multibyte encoded text everything works as expected.
>  Cheers, RalfD

After a bit of code reading:

In file xtgscanner.cpp, line 1460:

QChar XtgScanner::nextSymbol()
{
	char ch = 0;
	if (top < input_Buffer.length())
	{
		ch = input_Buffer.at(top++);
		QByteArray ba;
		ba.append(ch);
		QString m_txt = m_codec->toUnicode(ba);

How is this supposed to work? 'ba' will contain a single _char_, which is passed to QTextCodec::toUnicode.
Isn't that guaranteed to return invalid characters for multibyte input, since the first byte of a multibyte
sequence is never a valid character on its own?
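
To illustrate what I mean, here is a minimal sketch in plain C++ (no Qt), using UTF-8 as the example encoding. Utf8Decoder, decodeStateful and decodeBytewise are hypothetical names of mine, not anything from Scribus or Qt. The decoder is deliberately simplified (no checks for overlong forms or surrogates). It only shows that a decoder has to keep state across bytes, and that handing each byte to a fresh decoder yields replacement characters. In Qt 4/5 the stateful equivalent would, as far as I know, be a QTextDecoder obtained from QTextCodec::makeDecoder().

```cpp
#include <cassert>
#include <string>

// U+FFFD REPLACEMENT CHARACTER, emitted for undecodable input.
const char32_t kReplacement = 0xFFFD;

// Hypothetical, simplified incremental UTF-8 decoder (illustration only).
class Utf8Decoder {
public:
    // Feed one byte. Returns true and sets `out` when a code point is
    // complete or an error was detected; returns false while more
    // continuation bytes are still needed.
    bool feed(unsigned char byte, char32_t& out) {
        assert(pending_ >= 0 && pending_ <= 3);
        if (pending_ == 0) {
            if (byte < 0x80) { out = byte; return true; }           // ASCII
            if ((byte & 0xE0) == 0xC0)      { value_ = byte & 0x1F; pending_ = 1; }
            else if ((byte & 0xF0) == 0xE0) { value_ = byte & 0x0F; pending_ = 2; }
            else if ((byte & 0xF8) == 0xF0) { value_ = byte & 0x07; pending_ = 3; }
            else { out = kReplacement; return true; }               // stray byte
            return false;                                           // lead byte only
        }
        if ((byte & 0xC0) != 0x80) {                                // broken sequence
            pending_ = 0;
            out = kReplacement;
            return true;
        }
        value_ = (value_ << 6) | (byte & 0x3F);
        if (--pending_ == 0) { out = value_; return true; }
        return false;
    }

private:
    char32_t value_ = 0;
    int pending_ = 0;
};

// One decoder lives across the whole buffer, so state is preserved.
std::u32string decodeStateful(const std::string& bytes) {
    Utf8Decoder dec;
    std::u32string out;
    char32_t cp = 0;
    for (unsigned char b : bytes)
        if (dec.feed(b, cp)) out.push_back(cp);
    return out;
}

// Mimics the per-byte pattern: every byte is decoded in isolation,
// so lead and continuation bytes can never be combined.
std::u32string decodeBytewise(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        Utf8Decoder fresh;
        char32_t cp = 0;
        out.push_back(fresh.feed(b, cp) ? cp : kReplacement);
    }
    return out;
}
```

With the input "a\xC3\xA4" (the UTF-8 bytes for "aä"), decodeStateful returns two code points, while decodeBytewise returns 'a' followed by two replacement characters.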

Also, in the constructor XtgScanner::XtgScanner there seems to be an attempt to skip over BOM:

	if ((input_Buffer[0] == '\xFF') && (input_Buffer[1] == '\xFE'))
	{
		QByteArray tmpBuf;
		for (int a = 2; a < input_Buffer.count(); a += 2)
		{
			tmpBuf.append(input_Buffer[a]);
		}
		input_Buffer = tmpBuf;
	}

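Won't this fail for UTF-16-BE, whose BOM is FE FF (or, for that matter, for UTF-8 with BOM, which starts with EF BB BF)?
FF FE is the UTF-16-LE BOM, and even in that case the loop keeps only the low byte of each 16-bit code unit,
so anything outside Latin-1 gets mangled.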
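
For comparison, a minimal sketch of BOM handling that distinguishes the common cases, again in plain C++ without Qt. The names Bom, detectBom, bomLength and stripBom are hypothetical and only for illustration. The sketch assumes the buffer holds the raw file bytes and only removes the mark; the remaining bytes would still need to be decoded with the matching codec. It does not try to tell UTF-16-LE apart from UTF-32-LE, whose BOM also begins with FF FE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Hypothetical helper: identify a leading byte order mark, if any.
Bom detectBom(const std::string& data) {
    // Compare as unsigned so bytes >= 0x80 behave the same on all platforms.
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(data[i]); };
    if (data.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return Bom::Utf8;
    if (data.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return Bom::Utf16LE;
    if (data.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return Bom::Utf16BE;
    return Bom::None;
}

// Number of bytes occupied by the given BOM.
std::size_t bomLength(Bom bom) {
    switch (bom) {
        case Bom::Utf8:    return 3;
        case Bom::Utf16LE: return 2;
        case Bom::Utf16BE: return 2;
        case Bom::None:    return 0;
    }
    return 0;
}

// Return the buffer without its BOM; the payload bytes are left untouched.
std::string stripBom(const std::string& data) {
    const std::size_t n = bomLength(detectBom(data));
    assert(n <= data.size());
    return data.substr(n);
}
```

The point of the design is that the BOM is only skipped, never used as an excuse to drop every second byte; the decision how to decode the payload stays with the codec.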

Cheers, RalfD