Highlight in Lucene with certain numbers of surrounding words - lucene

I want to highlight a search term in Lucene and, besides the highlighted word, also return its context: a fixed number of words surrounding the searched term. For example, if the content of a document is "The quick brown fox jumps over the lazy dog", the searched word is "fox", and the desired number of surrounding words is 2, the result should look like this: quick brown <B> fox </B> jumps over
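To pin down the behaviour described above, here is a minimal sketch in plain Java. It does not use Lucene's highlighter API at all; the class name ContextHighlighter and the method highlight are hypothetical, and the tokenization is a naive whitespace split rather than a Lucene analyzer. It only illustrates the expected output for a given window size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates the desired output only.
class ContextHighlighter {

    // Returns the first occurrence of `term` wrapped in <B> tags, with up to
    // `window` words of context on each side. Returns "" when nothing matches.
    static String highlight(String text, String term, int window) {
        // Assumption: words are separated by whitespace (no analyzer involved).
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (words[i].equalsIgnoreCase(term)) {
                // Clamp the window to the bounds of the document.
                int from = Math.max(0, i - window);
                int to = Math.min(words.length - 1, i + window);
                List<String> out = new ArrayList<>();
                for (int j = from; j <= to; j++) {
                    out.add(j == i ? "<B> " + words[j] + " </B>" : words[j]);
                }
                return String.join(" ", out);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String doc = "The quick brown fox jumps over the lazy dog";
        // prints: quick brown <B> fox </B> jumps over
        System.out.println(highlight(doc, "fox", 2));
    }
}
```

A match near the start or end of the text simply gets fewer context words on that side.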

Related

How to remove text into the editor at current selection in ckeditor5?

Let's say I have this string:
a quick brown fox jumps:
I want to remove the colon : from that string.
There is a method for removing content called "write.remove",
but I don't know how to use it.

Find & Replace Long Blocks of Text (Microsoft Word) without losing the styles

I would like to find and replace long blocks of text with line breaks in Microsoft Word. With Ctrl+H I am able to find and replace text of up to 255 characters. But if I want to search for text which contains line breaks, Ctrl+H does not work. So I looked on the internet and found the following code, which helped me delete the selected text from the document. But the macro also removes the styles in the document.
Here is the code I took from https://answers.microsoft.com/en-us/msoffice/forum/all/find-replace-long-blocks-of-text-microsoft-word/2fa77e32-9085-4c74-9d11-04d86829442f
Sub Remove_text()
    Dim strText As String
    Dim strReplacement As String
    strText = Selection.Range.Text
    strReplacement = "" 'Use this command to delete the instances of the selected text
    'strReplacement = InputBox("Enter the text to be used for the replacement", "Find and replace long text")
    With ActiveDocument.Range
        .Text = Replace(.Text, strText, strReplacement)
    End With
End Sub
For example, if I select two words in a Word document containing the following words and then run the macro, all the words in the resulting text become bold.
Before
Cow
Rabbit
Ducks
Shrimp
Pig
Goat
Crab
Deer
Chicken
Seagull
Ostrich
After
Cow
Rabbit
Ducks
Shrimp
Pig
Goat
Crab
Deer
Chicken
Seagull
Ostrich
If I run the macro when the first word in the list has no bold or italics, all the words become plain, as shown below.
Before
Cow
Rabbit
Ducks
Shrimp
Pig
Goat
Crab
Deer
Chicken
Seagull
Ostrich
After
Cow
Rabbit
Ducks
Shrimp
Pig
Goat
Crab
Deer
Chicken
Seagull
Ostrich
The above list is just an example. The same thing also happens in a large document: all the styles are replaced with the first word's style, so the styling of the whole document depends on the first word's style. Can someone help me remove text containing line breaks while preserving the styles?
What you are describing in your comments has nothing to do with the format of the document. Deleting text does not change the document's format. Deleting Section breaks, though, can do so drastically.
I note that you have radically changed your description of the problem since first posting it and receiving feedback. Clearly, you've wasted everyone's time by giving a poor problem description in the first place.
The examples in your updated description clearly show that you're replacing content, not merely deleting it. In that case, to preserve the formatting, you need to limit your replacement content to the same range as whatever exhibits the format of the content you want to replace. Either that, or you must type and format the replacement content exactly how you want it to appear, then copy it to the clipboard and use ^c for the replacement expression.

How to bold first line of paragraph till comma in word using VBA

If I have a paragraph, for example:
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
What I need to do is bold the start of the paragraph up to the first comma:
The quick brown fox jumps over the lazy dog, The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
for all paragraphs in Word using VBA.
Sub Test()
    Dim p As Paragraph
    Dim idx As Integer
    Dim i As Integer
    For Each p In ActiveDocument.Paragraphs
        With p.Range
            idx = InStr(1, .Text, ",", vbTextCompare)
            If idx > 0 Then
                For i = 1 To idx
                    .Characters(i).Bold = True
                Next i
            End If
        End With
    Next p
End Sub
This loops over each paragraph, finds the first comma, and bolds all characters up to and including that comma.
Input:
Sample paragraph, some text.
Output:
Sample paragraph, some text.
It isn't the most efficient code, since it loops over the characters one by one, but it's tested, it works, and it should give you the idea of what you're after. It skips paragraphs without commas.

MigraDoc Pdf long word rendering

This is about word breaking.
I need to render a LONG word in a PDF and have part of the word wrap onto the next line. Right now the word starts after some indent and runs past the right edge of the page (I can't see the end of the word).
I use something like this:
var addInfo = paragraph.AddFormattedText("LOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOONG TIIIIIIIIIIIITLE");
I didn't find any tips about this in the MigraDoc documentation...
Of course I can implement the logic for splitting words myself, and I'll do that if there is no native solution.
MigraDoc breaks sentences at spaces, hyphens, and soft hyphens. A single long word without any of these won't get broken.
Try something useful ("The quick brown fox ...") and breaking will work.
See also:
https://stackoverflow.com/a/40975554/162529
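Since soft hyphens count as break opportunities, one way to do the manual splitting the question mentions is to insert a soft hyphen (U+00AD) every few characters into long words before passing the text to AddFormattedText. The sketch below shows only the string logic, written in Java for illustration; it would need porting to C#. The names SoftHyphenator and addBreakPoints are hypothetical, the chunk size is an arbitrary choice, and how MigraDoc renders a soft hyphen at a break is not verified here.

```java
// Hypothetical helper: plain string logic, not a MigraDoc API.
class SoftHyphenator {
    static final char SOFT_HYPHEN = '\u00AD';

    // Inserts a soft hyphen after every `chunk` consecutive non-whitespace
    // characters, so an unbroken word gains break opportunities.
    // Words of `chunk` characters or fewer are left unchanged.
    static String addBreakPoints(String text, int chunk) {
        if (chunk < 1) {
            throw new IllegalArgumentException("chunk must be >= 1");
        }
        StringBuilder sb = new StringBuilder();
        int run = 0; // length of the current unbroken run
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                run = 0; // whitespace is already a break opportunity
            } else {
                if (run == chunk) {
                    sb.append(SOFT_HYPHEN);
                    run = 0;
                }
                run++;
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String title = addBreakPoints("LOOOOOOOOOOOOOOOOOOOOONG TIIIIIIIIITLE", 8);
        // The soft hyphens are invisible in most consoles; print the length instead.
        System.out.println(title.length());
    }
}
```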

MS Word VBA - Determine extent of "style run"?

Suppose I have a Range reference in Word VBA, and I want to break that down into smaller ranges where the formatting (font, colour, etc.) is identical. For example, if I start with:
The quick brown fox jumped over the lazy dog.
...then I would want to get back 5 ranges, as follows:
The
quick
brown fox jumped over the
lazy
dog.
I had hoped that there was a built-in way to do this in VBA (and even have a phantom memory of using such a facility), but I can't find anything.
I could do what I need to do in code, but something that works natively would be much (much) quicker.
[In code, I would use the fact that - for example - oRange.Font.Bold will return "undefined" if the range contains a mix of bold and not bold, and so I could use this repeatedly to discover the extent of the uniform ranges. But I'm pretty sure that Word will be doing this under the hood, so if someone can pop that hood for me, I'd be grateful.]
EDIT: removed more complex example that the StackOverflow HTML renderer did not like.
You can't really do this in VBA, as the object model (OM) doesn't support runs (OOXML does, however). The best you could probably do is use the WdUnits value wdCharacterFormatting and write a loop that creates ranges, extracts their properties and then discards them until the loop is finished. You'd probably start with something like:
Dim sel As Selection
Set sel = ActiveWindow.Selection
Dim selRange As Range
' Start from the selection's range; selRange itself is not set yet at this point
Set selRange = sel.Range.Next(wdCharacterFormatting)
to get the start and end of the next set of formatting, like selRange.Start/selRange.End, as well as any properties like selRange.Font.Name/selRange.Bold.