Highlight search result in Arabic text

Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

As for the second question, you need to patch it as well; there is no standard solution (although we are thinking about adding this functionality to TRichView.SearchText).

Modify MarkSubString_:

Code: Select all

...
    if P > 0 then
    begin
      ...
      // store RVData.GetSourceRVData and I
    end;
...

(Important: the code for storing must be at the very end of the "if P > 0" block.)

To select StoredRVData and StoredItemNo:

Code: Select all

var
  RVData: TCustomRVFormattedData;
begin
  RVData := TCustomRVFormattedData(StoredRVData.Edit);
  RVData.SetSelectionBounds(StoredItemNo, RVData.GetOffsBeforeItem(StoredItemNo),
    StoredItemNo, RVData.GetOffsAfterItem(StoredItemNo));
end;
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

I created these types in MarkSearch.pas:

Code: Select all

type
  TMyRVData = record
    RVData: TCustomRVFormattedData;
    ItemNo: Integer;
  end;

  MyRVData = array of TMyRVData;
Then I passed MyRVData to the MarkSubStringW and MarkSubString_ functions as a var parameter, to store RVData and I in it.

I stored RVData and I in MarkSubString_ like this:

Code: Select all

function MarkSubString_(RVData: TCustomRVData; const s: TRVAnsiString;
  const sw: TRVRawByteString; Color, BackColor: TColor;
  DelimSet: TSetOfChar; DelimW: PWideChar; DelimWLen: Integer;
  Options: TRVMarkStringOptions; var MyStoredRVData: MyRVData): Integer;
var
  ...
  j: Integer;
begin
  Result := 0;
  SetLength(MyStoredRVData, RVData.ItemCount);
  j := 0;
  ...
  if P > 0 then
  begin
    ...
    MyStoredRVData[j].RVData := TCustomRVFormattedData(RVData.GetSourceRVData);
    MyStoredRVData[j].ItemNo := i;
    j := j + 1;
  end;
Then in my main unit I called MarkSubStringW:

Code: Select all

var
  MyRVData2: MyRVData;
  FoundItemNo: Integer;
...
MarkSubStringW(MySearchText, clRed, clYellow, myOptions, RichViewEdit1, MyRVData2);
FoundItemNo := 0;
In my NextButton OnClick event I wrote:

Code: Select all

var
  tmpRVData: TCustomRVFormattedData;
  StoredItemNo: Integer;
begin
  if FoundItemNo <= Length(MyRVData2) - 1 then
  begin
    StoredItemNo := MyRVData2[FoundItemNo].ItemNo;
    tmpRVData := TCustomRVFormattedData(MyRVData2[FoundItemNo].RVData.Edit);
    tmpRVData.SetSelectionBounds(StoredItemNo, tmpRVData.GetOffsBeforeItem(StoredItemNo),
      StoredItemNo, tmpRVData.GetOffsAfterItem(StoredItemNo));
    FoundItemNo := FoundItemNo + 1;
  end;
end;
In my PreviousButton OnClick event I wrote:

Code: Select all

var
  tmpRVData: TCustomRVFormattedData;
  StoredItemNo: Integer;
begin
  if FoundItemNo >= 0 then
  begin
    StoredItemNo := MyRVData2[FoundItemNo].ItemNo;
    tmpRVData := TCustomRVFormattedData(MyRVData2[FoundItemNo].RVData.Edit);
    tmpRVData.SetSelectionBounds(StoredItemNo, tmpRVData.GetOffsBeforeItem(StoredItemNo),
      StoredItemNo, tmpRVData.GetOffsAfterItem(StoredItemNo));
    FoundItemNo := FoundItemNo - 1;
  end;
end;
But it doesn't work correctly. Please help.
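
A note on the navigation logic in the two handlers: both select first and change FoundItemNo afterwards, so after a direction change the shared index points one step away from the match the user expects, and the Previous handler does not check the upper bound. One possible arrangement is to move the index first and select only if the move succeeded. The sketch below uses a hypothetical helper, StepIndex (it is not part of TRichView or MarkSearch.pas), and assumes that FoundItemNo is set to -1 after marking and that the array has been trimmed to the number of stored matches, for example with SetLength(MyStoredRVData, j) at the end of MarkSubString_.

```pascal
// Hypothetical helper, not part of TRichView or MarkSearch.pas.
// Moves Current by Delta (+1 for Next, -1 for Previous) within 0..Count-1.
// Returns False and leaves Current unchanged if the move would leave the range.
function StepIndex(var Current: Integer; Delta, Count: Integer): Boolean;
var
  Candidate: Integer;
begin
  Candidate := Current + Delta;
  Result := (Candidate >= 0) and (Candidate < Count);
  if Result then
    Current := Candidate;
end;
```

With this helper, the Next handler would call StepIndex(FoundItemNo, 1, Length(MyRVData2)) and the Previous handler StepIndex(FoundItemNo, -1, Length(MyRVData2)); each would run the selection code for MyRVData2[FoundItemNo] only when the call returns True.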
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

To get the positions of the marked items and select them, I created a text style at design time with the color and background color that I pass to MarkSubStringW. After calling MarkSubStringW, I get the list of marked items with this code:

Code: Select all

  
MarkedItemsIndexList := TStringList.Create;
...
MarkedItemsIndexList.Clear;
for i := 0 to RichViewEdit1.ItemCount - 1 do
begin
  if RichViewEdit1.GetItemStyle(i) = RVStyle1.TextStyles.FindStyleWithColor(0, clRed, clYellow) then
    MarkedItemsIndexList.Add(IntToStr(i));
end;
Then in my NextButton and PrevButton OnClick events I select the item:

Code: Select all

StoredItemNo := StrToInt(MarkedItemsIndexList[CurrentItem]);
RichViewEdit1.SetSelectionBounds(StoredItemNo, RichViewEdit1.GetOffsBeforeItem(StoredItemNo),
  StoredItemNo, RichViewEdit1.GetOffsAfterItem(StoredItemNo));
and it works correctly.

I tested your fixed version, it works correctly and everything is ok.

I have 3 suggestions for improving text highlighting with MarkSubStringW, which I will describe later. Right now I'm in a hurry and must go.
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

I have 2 points, not 3!
1. When we want to highlight an expression of two or more words, and that expression exists in the text but with delimiters between its words (such as , : ; / ...), the MarkSubStringW function cannot find and highlight it.
For example, the text is "Sergey / Tkachenko" and we want to highlight "Sergey Tkachenko" in it.
Is there a way to do this?

2. Unlike English, letters in Arabic text are joined together (comparable to lowercase). When MarkSubStringW highlights a substring, the first and last characters of the substring lose their joined form (as if they were switched to uppercase), and the substring is separated from the characters before and after it. Is it possible to fix this?
Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

1. Currently, the only search that supports searching across multiple items is the SearchText method. MarkSubString searches in each item separately.
Multi-item search is not simple to implement. It will be implemented, but I am not ready to do it now, because immediately after the release of ReportWorkshop I want to make an important change in TRichView internals: all text inside will be Unicode, not a mix of Unicode and non-Unicode items. This change will simplify the implementation of many things, including multi-item search.

2. Unfortunately, TRichView does not support ligation of characters belonging to different items. It is planned for the future, but it does not have a very high priority. Currently, the only way to keep existing ligations is to mark text without changing the document.
This is possible, see the last post here: http://www.trichview.com/forums/viewtopic.php?t=7225
But it needs the same modification as MarkSearchW to ignore diacritics.
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

I don't mean multi-item search. I mean that, within a single item, when the text contains delimiters between two words, MarkSubStringW should ignore those delimiters, similar to how delimiters are treated when rvmsoWholeWords is included in the options.
For example, in my text Sergey,Tkachenko is in a single item, and I want to find it by searching for Sergey Tkachenko, without the , character.
Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

This feature is not supported, but you can try to implement it yourself.
It can be implemented in the same way as ignoring diacritics.

Modify RVPreprocessString (used to normalize the string to find) and RVPreprocessStringRaw (used to normalize strings in the text). They must return a string with replaced delimiters (as I understand your requirements, you need to replace any sequence of adjacent delimiters with a single space character). You must adjust not only the returned string but also Map (a map from characters in the resulting string to characters of the initial string).
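
The idea can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code of RVPreprocessString: the name CollapseDelimiters, the TIntArray type and the layout of Map (a zero-based array holding, for each character of the result, its 1-based index in the source) are all hypothetical, and the real Map in MarkSearch.pas may be organized differently. It also assumes a Unicode version of Delphi, where string is UnicodeString.

```pascal
// Hypothetical sketch, not the actual RVPreprocessString code.
type
  TIntArray = array of Integer;

// Replaces every run of adjacent delimiter characters with a single space.
// Map[k - 1] receives the 1-based index in S of the character that
// produced Result[k]; a collapsed run maps to its first character.
function CollapseDelimiters(const S, Delims: string;
  out Map: TIntArray): string;
var
  i, Len: Integer;
  InRun: Boolean;
begin
  SetLength(Result, Length(S));
  SetLength(Map, Length(S));
  Len := 0;
  InRun := False;
  for i := 1 to Length(S) do
    if Pos(S[i], Delims) > 0 then
    begin
      // Only the first delimiter of a run produces output
      if not InRun then
      begin
        Inc(Len);
        Result[Len] := ' ';
        Map[Len - 1] := i;
        InRun := True;
      end;
    end
    else
    begin
      Inc(Len);
      Result[Len] := S[i];
      Map[Len - 1] := i;
      InRun := False;
    end;
  // Trim both outputs to the number of characters actually written
  SetLength(Result, Len);
  SetLength(Map, Len);
end;
```

For the space character itself to be collapsed together with the punctuation around it, it must be listed in Delims. When a match is found in the normalized string, Map translates the start and end of the match back to positions in the original item text.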
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

As I understand it, RVPreprocessString and RVPreprocessStringRaw delete diacritics in this part of the code:

Code: Select all

...
      if StripDiacritic then
      begin
        if Len3 > 1 then
          Delete(Result,
            (PRVAnsiChar(PtrDest) - PRVAnsiChar(PtrStartDest)) div 2 + 2,
            Len3 - 1);
        inc(PRVAnsiChar(PtrDest), 2);
      end
...
I want to delete some other characters as well, such as , : ; and so on. Which Index and Count parameters must I pass to the Delete procedure?
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

Please help me do this.
Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

I'll modify the example in the next couple of days. Sorry for the delay.
Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

I updated http://www.trichview.com/support/files/ ... search.zip

A new option rvmsoIgnorePunctuation is added for MarkSubStringW.
If included, any sequence of punctuation is treated as a single space character.
Punctuation characters are listed in Delimiters property.

Note that the unit has been changed internally: there are no longer two versions of RVPreprocessString.
saeid2016
Posts: 70
Joined: Wed Mar 16, 2016 11:56 am

Post by saeid2016 »

Thank you Sergey, that's OK.
rttt
Posts: 3
Joined: Thu Jul 28, 2022 5:30 am

Post by rttt »

TRVMarkStringOptions:

1. [rvmsoIgnoreCase] does not handle the Turkish characters I and i correctly.

2. The case pairs should be:
I = ı
İ = i

3. Could you fix this, or how can I fix it?
Sergey Tkachenko
Site Admin
Posts: 17553
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

This topic is obsolete, a newer version of MarkSearch.pas is included in RichViewActions (which are included in TRichView setup).

When comparing strings case-insensitively, the new version of MarkSearch.pas converts strings to lower case before comparing, using the RVLowerCaseW function. For the newer Unicode versions of Delphi, RVLowerCaseW calls AnsiLowerCase. Despite its name, this function works with UnicodeString, and it calls the CharLowerBuff WinAPI function.
The documentation for CharLowerBuffW contains the following remark:
"Note that CharLowerBuff always maps uppercase I to lowercase I ("i"), even when the current language is Turkish or Azerbaijani. If you need a function that is linguistically sensitive in this respect, call LCMapString."
Currently, I do not plan to modify this unit. However, you can replace all calls of RVLowerCaseW with calls of your own function that performs the proper conversion for Turkish characters.
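
A minimal sketch of such a function is shown below. The name TurkishLowerCase is hypothetical, and the exact signature of RVLowerCaseW is not shown in this topic, so the parameter and result types here (string, meaning UnicodeString in Unicode versions of Delphi) are an assumption. The function handles the two Turkish pairs explicitly and leaves all other characters to AnsiLowerCase.

```pascal
// Requires SysUtils in the uses clause (for AnsiLowerCase).
// Hypothetical replacement for the RVLowerCaseW calls; illustrative only.
function TurkishLowerCase(const S: string): string;
var
  i: Integer;
begin
  Result := S;
  for i := 1 to Length(Result) do
    case Result[i] of
      'I':    Result[i] := #$0131; // I maps to dotless lowercase i (U+0131)
      #$0130: Result[i] := 'i';    // dotted capital I (U+0130) maps to i
    end;
  // All remaining characters use the standard conversion
  Result := AnsiLowerCase(Result);
end;
```

Because this mapping is applied unconditionally, it is suitable only when the searched text is known to be Turkish (or Azerbaijani); for mixed-language documents, English words containing I would no longer match their lowercase forms.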