VB.Net Remove Multiple parentheses and text within from strings - vb.net

So basically what I am trying to do in vb.net is remove multiple nested parentheses and all the text inside those parentheses from a string. It's easy to do if there is just one set of parentheses like in the first example below I just find the index of "(" and ")" and then use the str.remove(firstindex, lastindex) and just keep looping until all parentheses have been removed from the string.
str = "This (3) is(fasdf) an (asdas) Example"
Desired output:
str = "This is an example"
However I still can't figure out how to do it if their are multiple nested parentheses in the string.
str = "This ((dsd)sdasd) is ((sd)) an (((d))) an example"
Desired Outcome:
str = "This is an example"

This isn't really a tutorial site, so I shouldn't be answering this, but I couldn't resist.
As Ahmed said, you could use Regex.Replace, but I find Regexes complex and impenetrable. So it would be difficult for someone else to maintain it.
The following code has three loops. The our loop, a While loop, will run the two inner loops as long as the character index is less than the length of the string.
The first inner loop searches for the first "open bracket" in a group and records the position and adds 1 to the number of "open brackets" (the depth). Any subsequent "open brackets" just adds 1 to the number of brackets. This carries on until the first loop finds a "close bracket"
Then the second loop searches for the same number of "close brackets" from that point where the first "close bracket" is found.
When the loop gets to the last "close bracket" in the group, all the characters from the first "open bracket" to the last "close bracket" in the group are removed. Then the While loop starts again if the current index position is not at the end of the updated inputString.
When the While loop finishes, any double spaces are removed and the updated output string is returned from the function
Private Function RemoveBracketsAntContents(inputString As String) As String
Dim i As Integer
While i < inputString.Length
Dim bracketDepth As Integer = 0
Dim firstBracketIndex As Integer = 0
Do
If inputString(i) = "(" Then
If firstBracketIndex = 0 Then
firstBracketIndex = i
End If
bracketDepth += 1
End If
i += 1
Loop Until i = inputString.Length OrElse inputString(i) = ")"
If i = inputString.Length Then Exit While
Do
If inputString(i) = ")" Then
bracketDepth -= 1
End If
i += 1
Loop Until bracketDepth = 0
inputString = inputString.Remove(firstBracketIndex, i - firstBracketIndex)
i = i - (i - firstBracketIndex)
End While
inputString = inputString.Replace(" ", " ")
Return inputString
End Function

Related

MS Access VBA: Split string into pre-defined width

I have MS Access form where the user pastes a string into a field {Vars}, and I want to reformat that string into a new field so that (a) it retains whole words, and (b) "fits" within 70 columns.
Specifically, the user will be cutting/pasting variable names from SPSS. So the string will go into the field as whole names---no spaces allowed---with line breaks between each variable. So the first bit of VBA code looks like this:
Vars = Replace(Vars, vbCrLf, " ")
which removes the line breaks. But from there, I'm stumped---ultimately I want the long string that is pasted in the Vars field to be put on consecutive multiple lines that each are no longer than 70 columns.
Any help is appreciated!
Okay, for posterity, here is a solution:
The field name on the form that captures the user input is VarList. The call to the SPSS_Syntax function below returns the list of variable names (in "Vars") that can then be used elsewhere:
Vars = SPSS_Syntax(me.VarList)
Recall that user input into Varlist comes in as each variable (word) with a line break in between each. The problem is that we want the list to be on one line (horizontal, not vertical) AND a line can be no more than 256 characters in length (I'm setting it to 70 characters below). Here's the function:
Public Function SPSS_Syntax(InputString As String)
InputString = Replace(InputString, vbNewLine, " ") 'Puts the string into one line, separated by a space.
MyLength = Len(InputString) 'Computes length of the string
If MyLength < 70 Then 'if the string is already short enough, just returns it as is.
SPSS_Syntax = InputString
Exit Function
End If
MyArray = Split(InputString, " ") 'Creates the array
Dim i As Long
For i = LBound(MyArray) To UBound(MyArray) 'for each element in the array
MyString = MyString & " " & MyArray(i) 'combines the string with a blank space in between
If Len(MyString) > 70 Then 'when the string gets to be more than 70 characters
Syntax = Syntax & " " & vbNewLine & MyString 'saves the string as a new line
MyString = "" 'erases string value for next iteration
End If
Next
SPSS_Syntax = Syntax
End Function
There's probably a better way to do it but this works. Cheers.

for to next loop every word the first letter will be capital

what will i do to make it accept my equation and able to make all inserted word first letter capital? i f5 this equation and it show some yellow errors if i input a word.
i tried if i = 0 then
so it only shows first word first letter capital the rest isnt.
https://m.facebook.com/story.php?story_fbid=209840356412510&id=100021596419779&refid=17&tn=%2AW-R&_rdr
You didn't post your code on your post so i don't know what you tried but this code works perfectly in order to achieve your goal.
Dim textBoxString As String = TextBox1.Text.Trim
'Create a string array with every words'
Dim words() As String = textBoxString.Split(" ")
'labelString is our final result'
Dim labelString As String = ""
'cycle throught every word'
For i = 0 To textBoxString.Length - 1
Try
'substring(0,1) takes only the first char of the word'
words(i) = words(i).ToUpper().Substring(0, 1) & words(i).Substring(1, words(i).Length - 1)
labelString = labelString & words(i) & " "
Catch ex As Exception
Err.Clear()
Exit For
End Try
Next
Label1.Text = labelString
Little explanation of my code
words(i).ToUpper.Substring(0,1)
it takes only the first char of every word throught the cycle.
words(i).Substring(1, words(i).Lenght -1)
it takes the entire word without the first char
labelString = labelString & words(i) & " "
it concatenates every words back together.

Sampling StringName.SubString(p,1) for paragraph formatting character

please try to specifically answer my question and not offer alternative approaches as I have a very specific problem that needs this ad-hoc solution. Thank you very much.
Automatically my code opens Word through VB.NET, opens the document, finds the table, goes to a cell, moves that cells.range.text into a String variable and in a For loop compares character at position p to a String.
I have tried Strings:
"^p", "^013", "U+00B6"
My code:
Dim nextString As String
'For each cell, extract the cell's text.
For p = 17 To word_Rng.Cells.Count
nextString = word_Rng.Cells(p).Range.Text
'For each text, search for structure.
For q = 0 To nextString.Length - 1
If (nextString.Substring(q, 1) = "U+00B6") Then
Exit For
End If
Next
Next
Is the structural data lost when assigning the cells text to a String variable. I have searched for formatting marks like this in VBA successfully in the past.
Assuming that your string contains the character, you can use ChrW to create the appropriate character from the hex value, and check for that:
If nextString.Substring(q, 1) = ChrW(&h00B6) Then
Exit For
End If
UPDATE
Here's a complete example:
Dim nextString = "This is a test " & ChrW(&H00B6) & " for that char"
Console.WriteLine(nextString)
For q = 0 To nextString.Length - 1
If nextString(q) = ChrW(&H00B6) Then
Console.WriteLine("Found it: {0}", q)
End If
Next
This outputs:
This is a test ¶ for that char
Found it: 15

Word VBA: iterating through characters incredibly slow

I have a macro that changes single quotes in front of a number to an apostrophe (or close single curly quote). Typically when you type something like "the '80s" in word, the apostrophe in front of the "8" faces the wrong way. The macro below works, but it is incredibly slow (like 10 seconds per page). In a regular language (even an interpreted one), this would be a fast procedure. Any insights why it takes so long in VBA on Word 2007? Or if someone has some find+replace skills that can do this without iterating, please let me know.
Sub FixNumericalReverseQuotes()
Dim char As Range
Debug.Print "starting " + CStr(Now)
With Selection
total = .Characters.Count
' Will be looking ahead one character, so we need at least 2 in the selection
If total < 2 Then
Return
End If
For x = 1 To total - 1
a_code = Asc(.Characters(x))
b_code = Asc(.Characters(x + 1))
' We want to convert a single quote in front of a number to an apostrophe
' Trying to use all numerical comparisons to speed this up
If (a_code = 145 Or a_code = 39) And b_code >= 48 And b_code <= 57 Then
.Characters(x) = Chr(146)
End If
Next x
End With
Debug.Print "ending " + CStr(Now)
End Sub
Beside two specified (Why...? and How to do without...?) there is an implied question – how to do proper iteration through Word object collection.
Answer is – to use obj.Next property rather than access by index.
That is, instead of:
For i = 1 to ActiveDocument.Characters.Count
'Do something with ActiveDocument.Characters(i), e.g.:
Debug.Pring ActiveDocument.Characters(i).Text
Next
one should use:
Dim ch as Range: Set ch = ActiveDocument.Characters(1)
Do
'Do something with ch, e.g.:
Debug.Print ch.Text
Set ch = ch.Next 'Note iterating
Loop Until ch is Nothing
Timing: 00:03:30 vs. 00:00:06, more than 3 minutes vs. 6 seconds.
Found on Google, link lost, sorry. Confirmed by personal exploration.
Modified version of #Comintern's "Array method":
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
' Make the change directly in the selection so track changes is sensible.
' I have to use 213 instead of 146 for reasons I don't understand--
' probably has to do with encoding on Mac, but anyway, this shows the change.
Selection.Characters(pos + 1) = Chr(213)
End If
Next pos
End Sub
Maybe this?
Sub FixNumQuotes()
Dim MyArr As Variant, MyString As String, X As Long, Z As Long
Debug.Print "starting " + CStr(Now)
For Z = 145 To 146
MyArr = Split(Selection.Text, Chr(Z))
For X = LBound(MyArr) To UBound(MyArr)
If IsNumeric(Left(MyArr(X), 1)) Then MyArr(X) = "'" & MyArr(X)
Next
MyString = Join(MyArr, Chr(Z))
Selection.Text = MyString
Next
Selection.Text = Replace(Replace(Selection.Text, Chr(146) & "'", "'"), Chr(145) & "'", "'")
Debug.Print "ending " + CStr(Now)
End Sub
I am not 100% sure on your criteria, I have made both an open and close single quote a ' but you can change that quite easily if you want.
It splits the string to an array on chr(145), checks the first char of each element for a numeric and prefixes it with a single quote if found.
Then it joins the array back to a string on chr(145) then repeats the whole things for chr(146). Finally it looks through the string for an occurence of a single quote AND either of those curled quotes next to each other (because that has to be something we just created) and replaces them with just the single quote we want. This leaves any occurence not next to a number intact.
This final replacement part is the bit you would change if you want something other than ' as the character.
I have been struggling with this for days now. My attempted solution was to use a regular expression on document.text. Then, using the matches in a document.range(start,end), replace the text. This preserves formatting.
The problem is that the start and end in the range do not match the index into text. I think I have found the discrepancy - hidden in the range are field codes (in my case they were hyperlinks). In addition, document.text has a bunch of BEL codes that are easy to strip out. If you loop through a range using the character method, append the characters to a string and print it you will see the field codes that don't show up if you use the .text method.
Amazingly you can get the field codes in document.text if you turn on "show field codes" in one of a number of ways. Unfortunately, that version is not exactly the same as what the range/characters shows - the document.text has just the field code, the range/characters has the field code and the field value. Therefore you can never get the character indices to match.
I have a working version where instead of using range(start,end), I do something like:
Set matchRange = doc.Range.Characters(myMatches(j).FirstIndex + 1)
matchRange.Collapse (wdCollapseStart)
Call matchRange.MoveEnd(WdUnits.wdCharacter, myMatches(j).Length)
matchRange.text = Replacement
As I say, this works but the first statement is dreadfully slow - it appears that Word is iterating through all of the characters to get to the correct point. In doing so, it doesn't seem to count the field codes, so we get to the correct point.
Bottom line, I have not been able to come up with a good way to match the indexing of the document.text string to an equivalent range(start,end) that is not a performance disaster.
Ideas welcome, and thanks.
This is a problem begging for regular expressions. Resolving the .Characters calls that many times is probably what is killing you in performance.
I'd do something like this:
Public Sub FixNumericalReverseQuotesFast()
Dim expression As RegExp
Set expression = New RegExp
Dim buffer As String
buffer = Selection.Range.Text
expression.Global = True
expression.MultiLine = True
expression.Pattern = "[" & Chr$(145) & Chr$(39) & "]\d"
Dim matches As MatchCollection
Set matches = expression.Execute(buffer)
Dim found As Match
For Each found In matches
buffer = Replace(buffer, found, Chr$(146) & Right$(found, 1))
Next
Selection.Range.Text = buffer
End Sub
NOTE: Requires a reference to Microsoft VBScript Regular Expressions 5.5 (or late binding).
EDIT:
The solution without using the Regular Expressions library is still avoiding working with Ranges. This can easily be converted to working with a byte array instead:
Sub FixNumericalReverseQuotes()
Dim chars() As Byte
chars = StrConv(Selection.Text, vbFromUnicode)
Dim pos As Long
For pos = 0 To UBound(chars) - 1
If (chars(pos) = 145 Or chars(pos) = 39) _
And (chars(pos + 1) >= 48 And chars(pos + 1) <= 57) Then
chars(pos) = 146
End If
Next pos
Selection.Text = StrConv(chars, vbUnicode)
End Sub
Benchmarks (100 iterations, 3 pages of text with 100 "hits" per page):
Regex method: 1.4375 seconds
Array method: 2.765625 seconds
OP method: (Ended task after 23 minutes)
About half as fast as the Regex, but still roughly 10ms per page.
EDIT 2: Apparently the methods above are not format safe, so method 3:
Sub FixNumericalReverseQuotesVThree()
Dim full_text As Range
Dim cached As Long
Set full_text = ActiveDocument.Range
full_text.Find.ClearFormatting
full_text.Find.MatchWildcards = True
cached = full_text.End
Do While full_text.Find.Execute("[" & Chr$(145) & Chr$(39) & "][0-9]")
full_text.End = full_text.Start + 2
full_text.Characters(1) = Chr$(96)
full_text.Start = full_text.Start + 1
full_text.End = cached
Loop
End Sub
Again, slower than both the above methods, but still runs reasonably fast (on the order of ms).

Listbox Remove Spaces

I am trying to use this code to remove spaces from a listbox but it is not working
Dim word As String() = {" "}
For i As Integer = 0 To ListBox5.Items.Count - 1
For Each Word As String In word
If ListBox5.Items(i).ToString.Contains(Word) Then
ListBox5.Items(i) = ListBox5.Items(i).ToString.Replace(Word, String.Empty)
End If
Next
Next
any help would be appreciated a lot.
You define a string array with one value and its static, why?
It seems you could do this simply by coding it like this
scan every item and replace " " with string.empty,
don't bother checking if it exists, just run the replace statement on every item
For i As Integer = 0 To ListBox5.Items.Count - 1
ListBox5.Items(i) = ListBox5.Items(i).ToString.Replace(" ", String.Empty)
Next
try this code
For i As Integer = 0 To ListBox5.Items.Count - 1
ListBox5.Items(i) = ListBox5.Items(i).ToString.Replace(" ", Nothing)
Next
fist we have to loop through the listbox items. we declared the items from index 0 to the final item index,ie listbox.items.count-1 . this is stored in a variable i. next all the items ,say from 0 to count-1 is replaced.
istBox5.Items(i).ToString.Replace(" ", Nothing) " " indicates a space and'nothing' indicates null.you can also use regex.replace here by importing system.regularexpressions.