Extra Paragraphs, just load / save rtf to rtf/docx?

a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Post by a.weber »

One more question:
if I load an RTF file from our pool (for example with the ActionTestUni sample) and save it to RTF or DOCX, one or more new paragraphs are generated. These lead to new pages in the document when we open it again in MS Word or LibreOffice. Any idea or hint how to avoid this?

In these screenshots, the left side shows the original RTF; the right side shows the newly saved RTF / DOCX.
[Screenshots: pngrtf1.png, pngrtf2.png]

André
Attachments: pngdocx2.png, pngdocx1.png
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

Hello Sergey,
I have found the reason for this problem: the source document uses a distance of zero from the page border for the header / footer, but a value of zero is not used by the reader.
Please have a look at "RVRTFProps.pas". I have modified the function

Code:

TRVRTFReaderProperties.ReaderEndParsing
in this way. Original:

Code:

    if Reader.CurSectProps.FooterYTw > 0 then
      DocParams.FooterY := DocParams.FromTwips
        (Reader.CurSectProps.FooterYTw);
    if Reader.CurSectProps.HeaderYTw > 0 then
      DocParams.HeaderY := DocParams.FromTwips
        (Reader.CurSectProps.HeaderYTw);
to this:

Code:

    if Reader.CurSectProps.FooterYTw >= 0 then
      DocParams.FooterY := DocParams.FromTwips
        (Reader.CurSectProps.FooterYTw);
    if Reader.CurSectProps.HeaderYTw >= 0 then
      DocParams.HeaderY := DocParams.FromTwips
        (Reader.CurSectProps.HeaderYTw);
Is this the right way to fix this, or am I missing something? (What about RVF, does it need this fix too?)
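
To illustrate why the comparison operator matters, here is a minimal standalone sketch (not TRichView code). It assumes that a value missing from the RTF is represented by a negative sentinel; the sentinel `NotSpecified`, the default `DefaultYTw` and both function names are hypothetical and exist only for this illustration.

```pascal
program ZeroDistanceDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  NotSpecified = -1;   // hypothetical sentinel: value absent from the RTF
  DefaultYTw   = 720;  // hypothetical default distance, in twips

// Original test: a parsed value of 0 is treated like "absent",
// so the current (default) value is kept.
function ApplyIfPositive(ParsedTw, CurrentTw: Integer): Integer;
begin
  if ParsedTw > 0 then
    Result := ParsedTw
  else
    Result := CurrentTw;
end;

// Modified test: 0 is accepted as a real value; only negative
// values leave the current value untouched.
function ApplyIfNonNegative(ParsedTw, CurrentTw: Integer): Integer;
begin
  if ParsedTw >= 0 then
    Result := ParsedTw
  else
    Result := CurrentTw;
end;

begin
  WriteLn('> 0  with parsed 0: ', ApplyIfPositive(0, DefaultYTw));
  WriteLn('>= 0 with parsed 0: ', ApplyIfNonNegative(0, DefaultYTw));
  WriteLn('>= 0 with absent:   ', ApplyIfNonNegative(NotSpecified, DefaultYTw));
end.
```

With `> 0`, a header or footer distance of zero silently falls back to the default, which pushes the content down. With `>= 0`, the zero is kept, provided "absent" really is encoded as a negative number.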

André
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

Yes, this is the correct fix.
I'll include it in the next update.
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

Hi Sergey,

OK, the zero distance was part one of the issue. Part two is the extra paragraphs created inside the header / footer: headers and footers are always saved, even if they are empty. Disabling header/footer saving with the option [rvrtfSaveHeaderFooter] is not a solution, because we will sometimes have a header or a footer, but not both.

Most code uses the function

Code:

RootProperties.HasHeaderOrFooter
to test whether there are headers or footers. But this function only tests whether the RVData of one of the HF subdocuments is nil; it does not combine the nil test with RVData.isEmpty().
Changing this function would break the RTF reader, I think (TRVRTFReaderProperties.InitReader), because the reader uses this test before loading to check whether the header/footer etc. objects are initialized and whether they should be loaded from the RTF stream. Maybe we need another function,

Code:

HasNonEmptyHeaderOrFooter
as a decision helper for the saving code, to tell whether there really is a header or footer that needs saving?

But this would be only part of the solution: in any case, shouldn't every header / footer be checked for emptiness before it is saved?
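
The idea of two separate tests can be sketched as follows. This is a standalone illustration, not TRichView code: `TMockSubDoc` is a hypothetical stand-in for a header/footer subdocument, and treating "empty" as "zero items" is a simplifying assumption.

```pascal
program NonEmptyHFDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a header/footer subdocument.
  TMockSubDoc = class
  public
    ItemCount: Integer;
    function IsEmpty: Boolean;
  end;

function TMockSubDoc.IsEmpty: Boolean;
begin
  Result := ItemCount = 0;  // simplification for this sketch
end;

// Reader-side test: is any subdocument object assigned at all?
function HasHeaderOrFooter(const Docs: array of TMockSubDoc): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := Low(Docs) to High(Docs) do
    if Docs[i] <> nil then
    begin
      Result := True;
      Exit;
    end;
end;

// Saver-side test: is any subdocument assigned AND non-empty?
function HasNonEmptyHeaderOrFooter(const Docs: array of TMockSubDoc): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := Low(Docs) to High(Docs) do
    if (Docs[i] <> nil) and not Docs[i].IsEmpty then
    begin
      Result := True;
      Exit;
    end;
end;

var
  Header, Footer: TMockSubDoc;
begin
  Header := TMockSubDoc.Create;  // assigned, but still empty
  Footer := nil;                 // not assigned
  try
    WriteLn(HasHeaderOrFooter([Header, Footer]));
    WriteLn(HasNonEmptyHeaderOrFooter([Header, Footer]));
    Header.ItemCount := 1;       // header now has content
    WriteLn(HasNonEmptyHeaderOrFooter([Header, Footer]));
  finally
    Header.Free;
  end;
end.
```

Keeping the existing test unchanged for the reader and adding the stricter one only for the saving code avoids touching the loading logic.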

I modified CRVData.pas to save RTF without empty headers and/or footers: I added a "not isEmpty" test to the function SaveHF and also inside SaveListOverrideTable, but there I am not sure whether it is required.

For DOCX I changed RVDocXSave.pas:
- the saving loop now tests that the HF is not empty before saving
- maybe SaveLists and SaveHeadersFootersAndLists need the test on the result of GetRVDataByIndex too?
-> May I send you my changed files?

André
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

The extra paragraph in the DOCX that causes the page break comes from an empty text element.
The template contains two <TCustomRVItemInfo> items at root level: the first is the table containing most / all of the content,
and the second is a very small, nearly invisible TRVTextItemInfo with no text and a font size of 1.

This is written to document.xml as

Code:

<w:p>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs ="Times New Roman" w:eastAsia="Times New Roman"/>
<w:sz w:val="2"/>
</w:rPr>
<w:t></w:t>
</w:r>
</w:p>
As long as there is no text inside the <w:t> tag, Word inserts a line break with the big font instead of size 1. If I fake the XML like this and insert a "-", Word renders the document as expected:

Code:

<w:p>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs ="Times New Roman" w:eastAsia="Times New Roman"/>
<w:sz w:val="2"/>
</w:rPr>
<w:t>-</w:t>
</w:r>
</w:p>
Do you have any idea whether this may be a bug in Word or on your side? If Word itself converts the original RTF to DOCX, this paragraph is encoded this way:

Code:

<w:p w:rsidR="00874ED8" w:rsidRDefault="00874ED8">
<w:pPr>
<w:rPr>
<w:sz w:val="2"/>
<w:szCs w:val="2"/>
</w:rPr>
</w:pPr>
</w:p>
There is a <w:pPr> instead of your run <w:r>, and no <w:t>?

Any help / idea is appreciated. For the moment I will write a small loop that deletes all empty items from the end, like an "rtrim". I hope that works.
Sorry: deleting the empty items before saving only avoids their encoding into document.xml; the line break / page break in Word is still there.
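
For reference, the "rtrim" loop mentioned above could look roughly like this. It is a standalone sketch: a `TStrings` list stands in for the document's root-level items, with an empty string playing the role of an empty text item. The procedure name `TrimTrailingEmpty` is hypothetical and no TRichView API is used.

```pascal
program TrimTrailingEmptyDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

// Deletes empty entries from the end of the list, like an "rtrim".
// Entries in the middle of the list are left untouched.
procedure TrimTrailingEmpty(Items: TStrings);
begin
  while (Items.Count > 0) and (Items[Items.Count - 1] = '') do
    Items.Delete(Items.Count - 1);
end;

var
  Items: TStringList;
begin
  Items := TStringList.Create;
  try
    Items.Add('<table>');  // stand-in for the table item
    Items.Add('');         // trailing empty item
    Items.Add('');         // another trailing empty item
    TrimTrailingEmpty(Items);
    WriteLn('Items left: ', Items.Count);
  finally
    Items.Free;
  end;
end.
```

As noted above, this only keeps the empty item out of the saved file; it does not remove the page break that Word shows.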
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

Do you use the newest version?
I believe the font problem for empty paragraphs has been fixed.
MS Word ignores font properties specified for empty text runs, so for an empty paragraph the font must be specified in the paragraph properties, not in the text run properties.
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

OK, I need to update, and I will report whether this is fixed with the update. But the empty paragraph at the end is only the smaller of my problems.
The bigger problem is the tables and their changing sizes (height/width) after just a single load RTF / save RTF/DOCX cycle.
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

Does it change width every time you save and reload it in TRichView?
Or does it change width after the initial loading of a file that was created in MS Word?
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

Hi,
Quote: I believe the font problem for empty paragraphs has been fixed.
The problem with the empty paragraph at the end (DOCX opened in MS Word) still exists in the current version, I am sorry.
Quote: Does it change width every time you save and reload it in TRichView?
No, only the first save changes the layout; repeated open / save as does not change it any further.

The original cell width in Word was fixed at 3,25 cm. After saving it was reduced to 3,23 cm, but the whole table got wider?
Maybe it has to do with spacing / padding?

I also tried adding [rvtoRTFSaveCellPixelBestWidth] to TRVTableItemInfo.Options. With it, the cell width after load/save was unchanged in Word, but the whole table still got wider.

Please look into viewtopic.php?t=11217
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

As for the font size.
For empty paragraphs, the new version of TRichView must save font properties not only inside <w:p>|<w:r>|<w:rPr>, but also inside <w:p>|<w:pPr>|<w:rPr>, and it must be understood by MS Word.

As for tables, I'll check next week, after releasing an update.
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

The last paragraph after the table in the DOCX is written like this:

Code:

</w:tbl>
<w:p>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs ="Times New Roman" w:eastAsia="Times New Roman"/>
<w:sz w:val="2"/>
</w:rPr>
<w:t></w:t>
</w:r>
</w:p>
As long as there is no real text inside <w:t>, my MS Word generates this extra paragraph from it?
EDIT: My MS Word is an old one, 2010 :)
EDIT: If I look at the table properties in Word, they are the same as before, so I don't know why Word renders the newly written RTF differently from the original. The width of the table changes by one border width (using a table with one column, two rows and a border width of 6 pt).

As for the tables, it's fine if you look into the issue next week.
Thanks
André
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

I confirm the small bug: if the document starts with table(s) followed by an empty paragraph, the text properties of this text item are not duplicated in the paragraph properties when saving to DocX, so MS Word ignores them.
It will be fixed in the next update (this week).
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

Ok, thank you for fixing that.

I have played around with different table settings in Word. The last try was a simple table with a single column and two rows: loading the Word RTF and saving it back to RTF changes the width by one border width, I think (6 pt in my sample). Maybe that helps you.
(But keep in mind that there is still a difference between saving to RTF and saving to DOCX, which leads to different table layouts.)
Attachment: TableTest_WrongAfterLoadSaveRTF.rtf (try this Word file as a sample)
Sergey Tkachenko
Site Admin
Posts: 17554
Joined: Sat Aug 27, 2005 10:28 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by Sergey Tkachenko »

I'll check it next week.
But the first thing to verify is the measuring units used in the editor.
If they are pixels (the default), typographic points cannot be converted to them without losing precision.
The measuring units should be set to twips or EMU.
https://www.trichview.com/help/idh_type ... units.html
https://www.trichview.com/help-scaleric ... vertto.htm
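
The precision loss can be illustrated with a small standalone calculation (not TRichView code). It assumes that pixels are measured at 96 DPI and that a width is stored as a whole number of pixels; the function names are hypothetical. The input of 3.25 cm is simply the cell width mentioned earlier in this topic.

```pascal
program PixelRoundTripDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  TwipsPerInch = 1440;
  CmPerInch    = 2.54;
  AssumedDPI   = 96;   // assumption: pixels are measured at 96 DPI

function CmToTwips(Cm: Double): Double;
begin
  Result := Cm / CmPerInch * TwipsPerInch;
end;

function TwipsToCm(Twips: Double): Double;
begin
  Result := Twips / TwipsPerInch * CmPerInch;
end;

// Stores a width as whole pixels, then converts it back to twips.
function RoundTripViaPixels(Twips: Double): Double;
var
  Pixels: Integer;
begin
  Pixels := Round(Twips * AssumedDPI / TwipsPerInch);
  Result := Pixels * TwipsPerInch / AssumedDPI;
end;

var
  OrigTw, BackTw: Double;
begin
  OrigTw := CmToTwips(3.25);
  BackTw := RoundTripViaPixels(OrigTw);
  WriteLn(Format('original:   %.2f twips (%.3f cm)',
    [OrigTw, TwipsToCm(OrigTw)]));
  WriteLn(Format('via pixels: %.2f twips (%.3f cm)',
    [BackTw, TwipsToCm(BackTw)]));
end.
```

Under these assumptions one pixel equals 15 twips, so any width that is not a multiple of 15 twips shifts slightly when it passes through pixels; here the shift is roughly 2.5 twips. Twips and EMU are fine enough to avoid this for typical Word measurements.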
a.weber
Posts: 63
Joined: Wed Mar 02, 2022 7:02 am

Re: Extra Paragraphs, just load / save rtf to rtf/docx?

Post by a.weber »

Hi Sergey,
the sample ("ActionTestUni") I used is configured for EMU. Without the option [rvtoRTFSaveCellPixelBestWidth], the saving logic for table cell widths uses the current pixel width and DPI (?). Since I set the option [rvtoRTFSaveCellPixelBestWidth] during loading, the saved column width shows the same value in Word as before, but the whole table width changes.