Split words but preserve special words - vb.net

I am using VS2012, vb.net.
I am wanting to load a string into an array. The string is a series of words that are separated by a space (" ").
Currently I am using the following code:
Dim stringTempCurrentGameTypes() As String = Split(stringTypeList, " ")
This works perfectly. However, I have an exception to some of the data that gets loaded into the array. Sometimes the data in the string (the string that is separated by spaces) has two words that I want to load into the array as one item, and not two items.
Here is an example of a string that I am talking about:
tourney ffa team ctf clan arena test
The exception is the two words 'clan arena'.
Currently, if I just use the split command, I get an array with the following elements:
item(0) = tourney
item(1) = ffa
item(2) = team
item(3) = ctf
item(4) = clan
item(5) = arena
item(6) = test
I am after the following:
item(0) = tourney
item(1) = ffa
item(2) = team
item(3) = ctf
item(4) = clan arena
item(5) = test
How can I detect if the item being added to the array is the words 'clan arena', and add this as one entry, rather than as two entries? Also, the words 'clan arena' may change, so rather than hard coding the words 'clan arena', I need to do it via a string variable.

There are, of course, several ways to do this.
One way is to replace the whitespace of all your special items in the input string with a temporary character, split the input string, and then change the temporary character back to the original whitespace.
Example:
Dim raw = "tourney ffa team ctf clan arena test"
Dim special_words = new String() {"clan arena"}
Dim tmp_char = "$"
For Each word in special_words
raw = raw.Replace(word, word.Replace(" ", tmp_char))
Next
Dim result = raw.Split(new Char() {" "c})
For i = 0 To result.Length - 1
result(i) = result(i).Replace(tmp_char, " ")
Next
As the temporary character, you could use an unprintable character like Chr(31) (unit separator) or anything else you know will not appear in your input string.
This approach is quite simple and preserves the order of your items.
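Since the special words may change, the same idea can be wrapped in a function that takes them as a parameter. The following is only a sketch: the names SplitPreserving and SpecialWordSplitter are made up for illustration, and it assumes the unit separator character (code 31) never occurs in the input.

```vbnet
Imports System
Imports System.Collections.Generic

Module SpecialWordSplitter
    ' Hypothetical helper: splits on spaces but keeps each phrase in specialWords together.
    Function SplitPreserving(input As String, specialWords As IEnumerable(Of String)) As String()
        ' Assumption: character 31 (unit separator, same as Chr(31)) is not in the input.
        Dim placeholder As Char = Convert.ToChar(31)
        For Each phrase In specialWords
            ' Hide the spaces inside each special phrase before splitting.
            input = input.Replace(phrase, phrase.Replace(" "c, placeholder))
        Next
        Dim parts = input.Split(" "c)
        For i = 0 To parts.Length - 1
            ' Restore the hidden spaces in each item.
            parts(i) = parts(i).Replace(placeholder, " "c)
        Next
        Return parts
    End Function

    Sub Main()
        Dim special = New String() {"clan arena"}
        Dim items = SplitPreserving("tourney ffa team ctf clan arena test", special)
        ' Prints: tourney|ffa|team|ctf|clan arena|test
        Console.WriteLine(String.Join("|", items))
    End Sub
End Module
```

Note that Replace matches plain substrings, so a phrase that also occurs inside longer words would be joined there as well.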

Related

Is it possible to use String.Split() when NewLine is the delimiter?

I have a question which asks me to calculate something from an input file. The problem is, the lines in the file don't use any special character as delimiter, like , or |. I will show it down below.
Data Communication
20
Visual Basic
40
The output I need to write to another file should look like this:
Data communication 20
Visual Basic 40
Total Books : 60
The problem is, how can I specify the delimiter? Like when there is a symbol as in strArray = strLine.Split(","). Since there is nothing I can use as delimiter, how can I split the file content?
There's no real need to split the text in the input file, when you can read a file line by line using standard methods.
You can use, e.g., a StreamReader to read the lines from the source file, check whether the current line is just text or it can be converted to a number, using Integer.TryParse and excluding empty lines.
Here, when the line read is not numeric, it's added as a Key in a Dictionary(Of String, Integer), unless it already exists (to handle duplicate categories in the source file).
If the line represents a number, it's added to the Value corresponding to the category Key previously read, stored in a variable named previousLine.
This setup can handle initial empty lines, empty lines in the text body and duplicate categories, e.g.,
Data Communication
20
Visual Basic
40
C#
100
Visual Basic
10
Other stuff
2
C++
10000
Other stuff
1
If a number is instead found in the first line, it's treated as a category.
Add any other check to handle a different structure of the input file.
Imports System.IO
Imports System.Linq
Dim basePath = "[Path where the input file is stored]"
Dim booksDict = New Dictionary(Of String, Integer)
Dim currentValue As Integer = 0
Dim previousLine As String = String.Empty
Using sr As New StreamReader(Path.Combine(basePath, "Books.txt"))
While sr.Peek > -1
Dim line = sr.ReadLine().Trim()
If Not String.IsNullOrEmpty(line) Then
If Integer.TryParse(line, currentValue) AndAlso (Not String.IsNullOrEmpty(previousLine)) Then
booksDict(previousLine) += currentValue
Else
If Not booksDict.ContainsKey(line) Then
booksDict.Add(line, 0)
End If
End If
' Remember only non-empty lines, so a blank line between a category and its number is harmless
previousLine = line
End If
End While
End Using
Now, you have a Dictionary where the Keys represent categories and the related Value is the sum of all books in that category.
You can Select() each KeyValuePair of the Dictionary and transform it into a string that represents the Key and its Value (Category:Number).
Here, also OrderBy() is used, to order the categories alphabetically, in ascending order; it may be useful.
File.WriteAllLines is then called to store the strings generated.
In the end, a new string is appended to the file, using File.AppendAllText, to write the sum of all books in all categories. The Sum() method sums all the Values in the Dictionary.
Dim newFilePath = Path.Combine(basePath, "BooksNew.txt")
File.WriteAllLines(newFilePath, booksDict.
Select(Function(kvp) $"{kvp.Key}:{kvp.Value}").OrderBy(Function(s) s))
File.AppendAllText(newFilePath, vbCrLf & "Total Books: " & booksDict.Sum(Function(kvp) kvp.Value).ToString())
The output is:
C#:100
C++:10000
Data Communication:20
Other stuff:3
Visual Basic:50
Total Books: 10173
Sure. System.IO.File.ReadAllLines() will read the whole file and split it into an array based on newlines, so you'll get an array of 4 elements. You can process it with a flip-flop Boolean to get alternate lines, or you can try to parse each line as a number: if that works, it's a number, and if not, it's a string. If it's a number, pair it with the string you remembered (in a variable) from the previous iteration of the loop.
Dim arr = File.ReadAllLines(...)
Dim isStr = True
Dim prevString = ""
For Each s as String in arr
If isStr Then
prevString = s
Else
Console.WriteLine($"The string is {prevString} and the number is {s}")
End If
'flip the boolean
isStr = Not isStr
Next s
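The parse-based alternative mentioned above could look roughly like this. It is a sketch only: PairTitlesWithCounts is a hypothetical name, the sample lines are passed in as an array rather than read from a file, and the "Total Books :" label is copied from the question.

```vbnet
Imports System
Imports System.Collections.Generic

Module PairLines
    ' Hypothetical helper: pairs each numeric line with the most recent non-numeric line.
    Function PairTitlesWithCounts(lines As IEnumerable(Of String)) As List(Of String)
        Dim output As New List(Of String)
        Dim total As Integer = 0
        Dim title As String = ""
        For Each line In lines
            Dim count As Integer
            If Integer.TryParse(line, count) Then
                ' A number: emit it together with the remembered title.
                output.Add(title & " " & count.ToString())
                total += count
            ElseIf line.Trim().Length > 0 Then
                ' Not a number and not blank: remember it as the current title.
                title = line.Trim()
            End If
        Next
        output.Add("Total Books : " & total.ToString())
        Return output
    End Function

    Sub Main()
        Dim sample = New String() {"Data Communication", "20", "Visual Basic", "40"}
        For Each row In PairTitlesWithCounts(sample)
            Console.WriteLine(row)
        Next
    End Sub
End Module
```

Unlike the flip-flop version, this does not rely on titles and numbers strictly alternating, so blank lines do not shift the pairing.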
I used File.ReadAllLines to get an array containing each line in the file. Since the file could be larger than the sample shown, I am using a StringBuilder. This saves having to throw away and create a new string on each iteration of the loop.
I am using interpolated strings indicated by the $ preceding the quotes. This allows you to insert variables into the string surrounded by braces.
Note the Step 2 in the For loop. i will increment by 2 instead of the default 1.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim lines = File.ReadAllLines("input.txt")
Dim sb As New StringBuilder
Dim total As Integer
For i = 0 To lines.Length - 2 Step 2
sb.AppendLine($"{lines(i)} {lines(i + 1)}")
total += CInt(lines(i + 1))
Next
sb.AppendLine($"Total Books: {total}")
TextBox1.Text = sb.ToString
End Sub

How to split a string in VBA by more than one character

In C# one can easily split a string by more than one character: one supplies an array of split characters. I was wondering what is the best way to achieve this in VBA. I typically use VBA.Split, but splitting on more than one character requires drilling into the results and sub-splitting the elements. Then one has to re-dimension arrays, etc. Quite painful.
Constraints
VBA responses only please. You may use .NET collection classes if you wish (yes they are creatable and callable in VBA). You may use JSON, XML as vessels for the list of split segments if you wish. You may use the humble VBA.Collection class if you wish, or even a Scripting.Dictionary. You may use even a fabricated recordset if you wish.
I know full well one can write a .NET assembly to call the .NET String.Split method and expose the assembly to VBA with COM interfaces, but where is the challenge in that?
This should be fairly easy to do with a regular expression. If you match on the negation of the passed characters to split on, the matches will be the members of the output array. The upside to doing this is that the output array only needs to be sized once because you can get a count of the matches returned by the RegExp. The pattern is fairly simple to build - it boils down to something like [^abc]+ where 'a', 'b', and 'c' are the characters to split on. About the only thing that you need to do to prepare the expression is to escape a couple characters that have special meaning in that context inside a regular expression (I probably forgot some):
Private Function BuildRegexPattern(ByVal inputString As String) As String
Dim escapeTargets() As String
'Escape the backslash first, so backslashes added for the other targets are not escaped again.
escapeTargets = VBA.Split("\ - ^ ]")
Dim returnValue As String
returnValue = inputString
Dim idx As Long
For idx = LBound(escapeTargets) To UBound(escapeTargets)
returnValue = Replace$(returnValue, escapeTargets(idx), "\" & escapeTargets(idx))
Next
BuildRegexPattern = "[^" & returnValue & "]+"
End Function
Once you have the pattern, it's just a simple matter of sizing the array and iterating over the matches to assign them (plus some other special case handling, etc.):
Public Function MultiSplit(ByVal toSplit As String, Optional ByVal delimiters As String = " ") As String()
Dim returnValue() As String
If toSplit = vbNullString Then
returnValue = VBA.Split(vbNullString)
Else
With New RegExp
.Pattern = BuildRegexPattern(IIf(delimiters = vbNullString, " ", delimiters))
.MultiLine = True
.Global = True
If Not .Test(toSplit) Then
'Only delimiters.
ReDim returnValue(Len(toSplit) - 1)
Else
Dim matches As Object
Set matches = .Execute(toSplit)
ReDim returnValue(matches.Count - 1)
Dim idx As Long
For idx = LBound(returnValue) To UBound(returnValue)
returnValue(idx) = matches(idx)
Next
End If
End With
End If
MultiSplit = returnValue
End Function
In my attempt, I replace all the other characters with space before splitting on space. (So I cheat a little.)
Private Function SplitByMoreThanOneChars(ByVal sLine As String)
'*
'* Brought to you by the Excel Development Platform Blog
'* http://exceldevelopmentplatform.blogspot.com/2018/11/
'*
'* Don't get excited, this splits by spaces only
'* we fake splitting by multiple characters by replacing those characters
'* with spaces
'*
Dim vChars2 As Variant
vChars2 = Array(" ", "<", ">", "[", "]", "(", ")", ";")
Dim sLine2 As String
sLine2 = sLine
Dim lCharLoop As Long
For lCharLoop = LBound(vChars2) To UBound(vChars2)
Debug.Assert Len(vChars2(lCharLoop)) = 1
sLine2 = VBA.Replace(sLine2, vChars2(lCharLoop), " ")
Next
SplitByMoreThanOneChars = VBA.Split(sLine2)
End Function

Parsing through an Array For Next loop Visual Basic

I am stuck here. Spent hours trying many different approaches but nothing is working
I have an array that holds text that looks like this
4456|4450|17
4466|4430|18
4446|4420|19
4436|4410|20
The separator is a pipe ("|").
What I am trying to do is run through the array and extract the first two columns in separate strings to compare the values, look for the max, and min.
I am trying to end up with a string like this
4456,4466,4446,4436
Here is the solution:
Dim source As String = prices
Dim stringSeparators() As String = {vbCrLf}
Dim result() As String
result = source.Split(stringSeparators,
StringSplitOptions.RemoveEmptyEntries)
Dim fString As String = String.Join(Of String)(", ", result.Cast(Of String).Select(Of String)(Function(x) x.Split("|")(0)))
MsgBox(fString)
Let's take your example below...
4456|4450|17
4466|4430|18
4446|4420|19
4436|4410|20
prices = [the array shown above]
For Each i As String In prices
high = (i.Split("|"))(0)
highs = highs & highs1 & ","
MsgBox(highs)
Next
The reason you are getting 4,4,5,6,,4,4,5,0,,1,7 is because for each string you are splitting on the | and then taking the first character adding a comma to it.
If you want to get the first column or index whatever you want to call it before the | you need to loop through each string in that array and select out the values...
'this is my test array...
Dim arr As New ArrayList From {"4456|4450|17", "4466|4430|18", "4446|4420|19", "4436|4410|20"}
Now we can use a String.Join function, cast the array for each item as a string and finally select the first item on the split. This will get every item before the | and put them in a string separated with a comma.
Dim fString As String = String.Join(Of String)(", ", arr.Cast(Of String).Select(Of String)(Function(x) x.Split("|")(0)))
If you want the second section select the 1st index as arrays start at 0...
Dim sString As String = String.Join(Of String)(", ", arr.Cast(Of String).Select(Of String)(Function(x) x.Split("|")(1)))
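Since the question also asks for the max and min of the columns, the same split can feed numeric values into Max() and Min(). This is a rough sketch, not part of the answer above: the Column helper is a made-up name, and it assumes every row has at least two integer fields.

```vbnet
Imports System
Imports System.Linq

Module ColumnStats
    ' Hypothetical helper: returns one pipe-separated column as integers.
    Function Column(rows As String(), index As Integer) As Integer()
        Return rows.Select(Function(r) Integer.Parse(r.Split("|"c)(index))).ToArray()
    End Function

    Sub Main()
        Dim prices = New String() {"4456|4450|17", "4466|4430|18", "4446|4420|19", "4436|4410|20"}
        Dim highs = Column(prices, 0)
        Dim lows = Column(prices, 1)
        ' Prints: 4456,4466,4446,4436
        Console.WriteLine(String.Join(",", highs.Select(Function(h) h.ToString()).ToArray()))
        ' Prints: Max: 4466
        Console.WriteLine("Max: " & highs.Max().ToString())
        ' Prints: Min: 4410
        Console.WriteLine("Min: " & lows.Min().ToString())
    End Sub
End Module
```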

How to separate data with a comma

For example, I have two variables of type String:
users = "Admin, Staff"
pass = "202cb9, caf1a"
These are plain String variables. The two values above are generated, so this is the only form in which I can get the data. The question is:
How can I separate those data by the commas (like Admin -> 202cb9, Staff -> caf1a) and then store them into an array.
users_array(0) = "Admin"
users_array(1) = "Staff"
pass_array(0) = "202cb9"
pass_array(1) = "caf1a"
Thank you.
You can use users.Split(New Char() {","c}) as in this link.
http://www.dotnetperls.com/split-vbnet
I have two options to solve this problem:
Dim users_array() As String = users.Split(New String() {", "}, StringSplitOptions.RemoveEmptyEntries)
and:
Dim pass_array() As String = Split(pass, ", ")
IMO, it is better to use ,<space> as a separator string instead of just ,, to avoid getting <space>staff at index 1.
Here the first solution works for both C# and VB.Net and second one is specific to VB.Net.
Use String.Split to break strings apart on a specific character. It returns an array.
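If the spacing after the commas is not guaranteed, another option is to split on the comma alone and trim each item. A small sketch under that assumption follows; SplitAndTrim is a hypothetical helper name, and it assumes both strings contain the same number of items.

```vbnet
Imports System
Imports System.Linq

Module CredentialSplit
    ' Hypothetical helper: splits on commas and trims whitespace around each item.
    Function SplitAndTrim(list As String) As String()
        Return list.Split(","c).Select(Function(item) item.Trim()).ToArray()
    End Function

    Sub Main()
        Dim users_array = SplitAndTrim("Admin, Staff")
        Dim pass_array = SplitAndTrim("202cb9, caf1a")
        ' Prints: Admin -> 202cb9, then Staff -> caf1a
        For i = 0 To users_array.Length - 1
            Console.WriteLine(users_array(i) & " -> " & pass_array(i))
        Next
    End Sub
End Module
```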

Break a string into Arrays

What I have:
So there is this large project I'm working on for school, and I have everything working except for a small but vital piece. The program I am working on must convert currency, and take the rates from a txt file. The file looks like this:
USD 1,2694
JPY 100,44
BGN 1,955
CZK 25,396
DKK 7,45792
...
There is a tab break between the name and the value and a line break between the value and the next currency name. Values have a floating point, and don't have a fixed length.
What I need:
I need to break this string into two arrays, currencyNames() and currencyValues(), or into a two-dimensional array currency().
What I can do myself:
I can load it from a file into a string with
fileReader = My.Computer.FileSystem.ReadAllText("rates.txt")
And I was able to break it into an array with a simple loop
Do While i < 32
dummyArray = Split(fileReader, " ")
i += 1
Loop
but only when there is a space separating the names and values inside the file.
What you're looking for are the VB Constants, a set of special strings for special characters like tab and new line - there's a list at the link, but yours in particular are vbTab and vbCrLf. You shouldn't need to import anything - they're built in to VB.
To use them, you'd change it to something like:
dummyArray = Split(fileReader, vbCrLf) ' to split on lines
And then:
For Each s As String In dummyArray
otherArray = Split(s, vbTab) ' to split on tab characters
Next
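Putting the two splits together, a fuller sketch might look like the following. ParseRates is a hypothetical name, the sample string stands in for the contents of rates.txt, and the number format with a comma as the decimal separator is an assumption based on the sample data in the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports Microsoft.VisualBasic

Module RateParser
    ' Hypothetical helper: parses "NAME<tab>VALUE" lines where VALUE uses a decimal comma.
    Function ParseRates(content As String) As Dictionary(Of String, Decimal)
        ' Assumption: the comma is the decimal separator, as in the sample file.
        Dim commaFormat As New NumberFormatInfo With {.NumberDecimalSeparator = ",", .NumberGroupSeparator = "."}
        Dim rates As New Dictionary(Of String, Decimal)
        Dim lineBreaks = New String() {vbCrLf, vbLf}
        For Each line In content.Split(lineBreaks, StringSplitOptions.RemoveEmptyEntries)
            Dim fields = line.Split(New String() {vbTab}, StringSplitOptions.RemoveEmptyEntries)
            ' Skip anything that is not exactly a name and a value.
            If fields.Length = 2 Then
                rates(fields(0).Trim()) = Decimal.Parse(fields(1).Trim(), NumberStyles.AllowDecimalPoint, commaFormat)
            End If
        Next
        Return rates
    End Function

    Sub Main()
        Dim sample = "USD" & vbTab & "1,2694" & vbCrLf & "JPY" & vbTab & "100,44"
        For Each pair In ParseRates(sample)
            Console.WriteLine(pair.Key & " = " & pair.Value.ToString(CultureInfo.InvariantCulture))
        Next
    End Sub
End Module
```

A Dictionary is used here instead of two parallel arrays so a rate can be looked up by currency name; rates.Keys and rates.Values can be copied into arrays if those are required.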
The basic idea is something like this:
Read each line from the file
Split the line on the space bar
Store the Country as the first portion of the split
Store the amount as the second portion, formatted as an integer
Project the Country and Amount into separate arrays
Here's a simple implementation in Vb.Net
Sub Main
dim input = System.IO.File.ReadAllLines("c:\yourdata.txt")
dim projection = from line in input
let split = line.Split(new string(){" "},StringSplitOptions.RemoveEmptyEntries)
select Country = split.First(), Amount = split.Last().Replace(",","").Parse()
dim countries = projection.Select(function(p) p.Country).ToArray()
dim amounts = projection.Select(function(p) p.Amount).ToArray()
End Sub
I also used a small extension method to wrap Integer.TryParse
Imports System.Runtime.CompilerServices
namespace ExtensionMethods
public module Extensions
<Extension()> _
public function Parse(byval value as string) as integer
dim i = 0
' VB passes ByRef arguments directly; there is no "out" keyword
if integer.TryParse(value, i) then
return i
end if
return 0
end function
end module
end namespace
A combination of ReadLine() and String.Split() should help you solve your problem.
If you were to read each item line by line, using ReadLine(), you could then split on the space like this:
ReadLine().Split(' ').First();
and
ReadLine().Split(' ').Last();
to get the relevant values from your pair.