[scribus] windows to linux charset issue for PDF annotation

Gregory Pittman gpittman at iglou.com
Fri May 29 17:16:11 UTC 2020


On 5/29/20 12:00 PM, JLuc wrote:
> Hello
> It's about a scribus created pdf but this is not strictly related to scribus :
> A friend on windows has proofread and annotated a scribus PDF document.
> When opening it on Ubuntu, some of the notes are readable,
> but others are scrambled and others seem to be cut off in the middle of the text.
>
> It could be related to notes having accented or special characters such as " or «
> because none of the readable notes has such accented characters, afaict.
> I've tried with Evince and Okular.
> 
> Do you have any advice on how to read these notes correctly on Linux?
> (Or on how to fix that in the annotation tool on windows ?)

Hi JLuc,

Here is an issue I noticed just yesterday, which might relate to your problem. I used a script called ExtractText.py, which writes the text content of a document to a plain text file. I had never seen problems with this before. When I ran it yesterday and tried to import the resulting text into a new document, the line endings were wrong -- running less on the file showed that they were carriage returns (Ctrl-M) instead of LF (line feed).
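
If less is not handy, the same check can be done in a few lines of Python. This is only a minimal sketch: the function name count_line_endings is made up for illustration and is not part of ExtractText.py. It counts Windows-style CRLF pairs, lone carriage returns, and lone line feeds in the raw bytes of a file.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs, and lone LFs in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one CR and one LF, so subtract the pairs
    # to get the carriage returns and line feeds that stand alone.
    lone_cr = data.count(b"\r") - crlf
    lone_lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "cr": lone_cr, "lf": lone_lf}


def count_line_endings_in_file(path: str) -> dict:
    """Read a file in binary mode so Python does not translate newlines."""
    with open(path, "rb") as handle:
        return count_line_endings(handle.read())
```

A file with clean Unix line endings should report zero for both "crlf" and "cr".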

There are two ways to fix this text file. I used KWrite, which interpreted the carriage returns correctly; once I saved the file, they were all converted to LFs and the text imported into Scribus properly.
The other option is to use dos2unix on the command line:

dos2unix -n old.txt new.txt
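
If dos2unix is not installed, roughly the same conversion can be sketched in Python. The function names below are hypothetical, chosen only for this example. Note one difference: as far as I know, dos2unix in its default mode converts only CRLF pairs, while this sketch also converts lone carriage returns to LF.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Convert CRLF pairs and lone CRs to LF, working on raw bytes."""
    # Replace CRLF first so each pair becomes a single LF,
    # then convert any remaining lone carriage returns.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(old_path: str, new_path: str) -> None:
    """Write a copy of old_path with LF line endings to new_path."""
    with open(old_path, "rb") as source:
        data = source.read()
    with open(new_path, "wb") as target:
        target.write(to_unix_newlines(data))
```

Calling convert_file("old.txt", "new.txt") leaves old.txt untouched, like the -n option above. Working on bytes means the text encoding (UTF-8 or otherwise) is passed through unchanged.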

I wrote the ExtractText.py script myself and it has never done this before, so something must have changed in Scribus such that it is not using UTF-8 consistently.

Greg



