Splitting a string in VB.NET produces unusual results

I am using the Split function in VB.NET, and it is producing a weird result when I display the output.
I have an XML document, and I just want to loop through each <backup> tag in the document, so I am using:
Dim XMLString As String
XMLString = "<backup>INFO</backup><backup>ANOTHER INFO</backup>"
Dim SplitBackup As String()
SplitBackup = XMLString.Split("<backup>")
For Each BackupContent In SplitBackup
MsgBox(BackupContent)
Next
Now, one would expect this to show INFO in a MsgBox and, after I click OK, another one to pop up showing 'ANOTHER INFO', but it seems that the Split function is getting tripped up by the '<' and '>' in the separator. Is there some way I can get around this by escaping them, or by parsing the string some other way?
Any suggestions are much appreciated!

Give XML a chance.
Dim bups As XElement = <backups><backup>INFO</backup><backup>ANOTHER INFO</backup></backups>
For Each xe As XElement In bups.Elements
Debug.WriteLine(xe.Value)
Next

One possibility is to use a regular expression to split on the tags (escaping the problematic symbols) and then use LINQ to Objects to get the value from each tag:
Dim XMLString As String = "<backup>INFO</backup><backup>ANOTHER INFO</backup>"
Dim words = Regex.Split(XMLString, "\<backup\>").Where(Function(f) Not String.IsNullOrEmpty(f)).Select(Function(f) f.Replace("</backup>", String.Empty))
for each word in words
Console.WriteLine(word)
next word
The output is:
INFO
ANOTHER INFO
What the code does:
Dim words = Regex
' split on every tag
.Split(XmlString, "\<backup\>")
' remove empty items
.Where(function(f) not String.IsNullOrEmpty(f))
' remove trailing closing tag and thus retrieve the value within
.Select(function(f) f.Replace("</backup>", String.Empty))
As already suggested, you would do better to learn the built-in XML support. It is easier and safer because you do not need to pay attention to the brackets: < and > are handled automatically. A possible solution could look like this (note that you need a valid XML structure with a single root node):
' !!! you need to have a valid XML element - one root node !!!
Dim XMLString As String = "<root><backup>INFO</backup><backup>ANOTHER INFO</backup></root>"
dim words = XDocument.Parse(XMLString).Root.Descendants("backup").Select(function (f) f.Value)
for each word in words
Console.WriteLine(word)
next word
The output is the same as above. How does the code work:
dim words = XDocument
' parse the specified XML structure; use Load("file.xml") to load from file
.Parse(XMLString)
' from the root node
.Root
' take all nodes matching the specified tag
' note - no need to pay attention to the < and >
.Descendants("backup")
' select the value of the XML node
.Select(function (f) f.Value)

You would need to get rid of the close element.
So you could use:
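Dim XMLString As String
XMLString = "<backup>INFO</backup><backup>ANOTHER INFO</backup>"
Dim SplitBackup As String()
' Pass the separator as a String array so the whole tag is used, not just its first character
SplitBackup = XMLString.Split(New String() {"<backup>"}, StringSplitOptions.RemoveEmptyEntries)
For Each BackupContent In SplitBackup
Dim Something() As String
' Cut off the closing tag; the value is the part before it
Something = BackupContent.Split(New String() {"</backup>"}, StringSplitOptions.None)
MsgBox(Something(0))
Next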
Not the most elegant coding though.
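A note on why the original call misbehaves, as far as I can tell: on .NET Framework, String.Split has no overload that takes a single String, so with Option Strict Off VB.NET quietly converts "<backup>" to a Char and splits on '<' alone. Newer runtimes add a String overload, so the behaviour may differ there. The sketch below passes the separator as a String array, which selects the string-based overload explicitly. The module and function names are made up for illustration.

```vbnet
Imports System

Module SplitDemo
    ' Hypothetical helper: returns the text of each <backup> element
    ' using plain string splitting (no XML parsing).
    Function ExtractBackups(xml As String) As String()
        ' A String() separator makes Split use the whole tag, not one character.
        Dim parts = xml.Split(New String() {"<backup>"}, StringSplitOptions.RemoveEmptyEntries)
        For i As Integer = 0 To parts.Length - 1
            ' Each piece still ends with the closing tag; remove it.
            parts(i) = parts(i).Replace("</backup>", String.Empty)
        Next
        Return parts
    End Function

    Sub Main()
        For Each item In ExtractBackups("<backup>INFO</backup><backup>ANOTHER INFO</backup>")
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

This only holds up for flat, predictable input like the sample string; for real XML documents the XElement and XDocument approaches above are the safer choice.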

Related

VB.NET Get text in between multiple Quotations

I need some help.
I need to read the values between quotation marks ("") in a text file into several text boxes (TextBox1, TextBox2, TextBox3), but at the moment I only get one value: the first quoted value, which goes into TextBox1.
text file (2.txt):
C:\contexture\img2itp.exe "\mynetwork\1.png" "\mynetwork\2.png" "148"
code vb:
Using sr As New StreamReader("C:\test\2.txt")
Dim line As String
' Read the stream to a string and write the string to the console.
line = sr.ReadToEnd()
Dim s As String = line
Dim i As Integer = s.IndexOf("""")
Dim f As String = s.Substring(i + 1, s.IndexOf("""", i + 1) - i - 1)
TextBox1.Text = f
End Using
Thanks for the help :)
Regex Match
What Is Regex :
A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs.
Source: https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
Here we could consider two regular expressions to solve this problem: a simple version, and a more complex one that uses capture groups.
Simple one :
"(.*?)"
Here is the explanation: https://regex101.com/r/mzSmH5/1
Complex One :
(["'])(?:(?=(\\?))\2.)*?\1
Here is an explanation: https://regex101.com/r/7ZMVsB/1
VB.NET Implementation
This would be a job for a Regex.Matches which would work like this:
Dim value As String = IO.File.ReadAllText("C:\test\2.txt")
Dim matches As MatchCollection = Regex.Matches(value, """(.*?)""")
' Loop over matches.
For Each m As Match In matches
' Loop over captures.
For Each c As Capture In m.Captures
' Display.
Console.WriteLine("Index={0}, Value={1}", c.Index, c.Value)
Next
Next
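The loop above prints each whole match, which still includes the surrounding quotation marks. A minimal sketch of collecting only the text between the quotes, assuming the simple pattern from above: the first capture group (Groups(1)) holds the inner text. ExtractQuoted is a hypothetical helper name, and the sample line is the one from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module QuotedValues
    ' Hypothetical helper: returns every value found between double quotes.
    Function ExtractQuoted(input As String) As List(Of String)
        Dim values As New List(Of String)()
        For Each m As Match In Regex.Matches(input, """(.*?)""")
            ' Groups(1) is the text inside the quotes; m.Value would include them.
            values.Add(m.Groups(1).Value)
        Next
        Return values
    End Function

    Sub Main()
        Dim line As String = "C:\contexture\img2itp.exe ""\mynetwork\1.png"" ""\mynetwork\2.png"" ""148"""
        ' In the form, these would be assigned to TextBox1.Text, TextBox2.Text and TextBox3.Text.
        For Each v In ExtractQuoted(line)
            Console.WriteLine(v)
        Next
    End Sub
End Module
```

Before assigning values(0), values(1) and values(2) to the text boxes, it is worth checking values.Count, since a line with fewer quoted values would otherwise raise an index error.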

XML elements source contains spaces

I'm very new to reading XML content, and I'm running into the issue that some XML elements contain whitespace, which VB.NET does not accept.
Please have a look at the line of code starting with "Today_CurrentTemp". That line contains the element <weerstation id="6391">; the space and quotes are not accepted by XDocument when written like this.
Please help me work around this. I cannot change the XML source format.
Const URL As String = "http://xml.buienradar.nl/"
Try
Dim xml = XDocument.Load(URL)
Today_DescriptionShort = xml.<buienradarnl>.<weergegevens>.<verwachting_vandaag>.<samenvatting>.Value
Today_DescriptionLong = xml.<buienradarnl>.<weergegevens>.<verwachting_vandaag>.<tekst>.Value
Today_CurrentTemp = xml.<buienradarnl>.<weergegevens>.<actueel_weer>.<weerstations>.<weerstation id="6391">.<temperatuurGC>.Value
The element <weerstation id="6391"> does not contain whitespace in its name.
The whitespace indicates that the next literal is considered as an Xml Attribute with a specific value in double quotes (id="6391").
Here is how you get the current temp:
Today_CurrentTemp = xml.<buienradarnl>.<weergegevens>.<actueel_weer>.<weerstations>.<weerstation>.Where(Function (x) x.Attribute("id") = "6391").First().<temperatuurGC>.Value
I used a lambda expression to get the first occurrence of an element named <weerstation> with an attribute named id whose value is 6391.
I assume that the id is unique, so the .First() approach is correct.
That works perfectly!
Now I have a similar issue with the line
<icoonactueel zin="bewolkt" ID="p">http://xml.buienradar.nl/icons/p.gif</icoonactueel>
<temperatuur10cm>11.3</temperatuur10cm>
Here bewolkt can have different values, and the ID changes as well, depending on the type of weather. The only thing I want out of this element is the URL.
How should I handle this?
See part of the XML below as an example:
<weerstation id="6391">
<stationcode>6391</stationcode>
<stationnaam regio="Venlo">Meetstation Arcen</stationnaam>
<lat>51.30</lat>
<lon>6.12</lon>
<datum>04/13/2016 11:50:00</datum>
<luchtvochtigheid>83</luchtvochtigheid>
<temperatuurGC>10.5</temperatuurGC>
<windsnelheidMS>2.12</windsnelheidMS>
<windsnelheidBF>2</windsnelheidBF>
<windrichtingGR>123.0</windrichtingGR>
<windrichting>OZO</windrichting>
<luchtdruk>-</luchtdruk>
<zichtmeters>-</zichtmeters>
<windstotenMS>3.7</windstotenMS>
<regenMMPU>-</regenMMPU>
<icoonactueel zin="bewolkt" ID="p">http://xml.buienradar.nl/icons/p.gif</icoonactueel>
<temperatuur10cm>11.3</temperatuur10cm>
Try this...
Const URL As String = "http://xml.buienradar.nl/"
Sub Main()
Dim Today_DescriptionShort
Dim Today_DescriptionLong
Dim Today_CurrentTemp
Try
Dim xsrcdoc = XDocument.Load(URL)
Today_DescriptionShort = (From xml In xsrcdoc.Descendants("verwachting_vandaag")
Select New With {.Val = xml.Element("samenvatting").Value}).FirstOrDefault
Today_DescriptionLong = (From xml In xsrcdoc.Descendants("verwachting_vandaag")
Select New With {.Val = xml.Element("tekst").Value}).FirstOrDefault
Today_CurrentTemp = (From xml In xsrcdoc.Descendants("weerstation").Where(Function(x) x.Attribute("id").Value = "6391")
Select New With {.Val = xml.Element("temperatuurGC").Value}).FirstOrDefault
Catch ex As Exception
End Try
End Sub
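For the follow-up about <icoonactueel>, a minimal sketch under the assumption that the feed looks like the excerpt above: looking an element up by name ignores its attributes, so the changing zin and ID values do not matter, and the URL is simply the element's value. GetIconUrl is a hypothetical helper, and the inline XML is a trimmed stand-in for the real feed.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module IconUrlDemo
    ' Hypothetical helper: returns the icon URL for one station, or Nothing if it is missing.
    Function GetIconUrl(doc As XDocument, stationId As String) As String
        Dim station = doc.Descendants("weerstation").FirstOrDefault(
            Function(x) x.Attribute("id") IsNot Nothing AndAlso x.Attribute("id").Value = stationId)
        If station Is Nothing Then Return Nothing

        ' Element("icoonactueel") matches on the name only; its attributes are irrelevant here.
        Dim icon = station.Element("icoonactueel")
        If icon Is Nothing Then Return Nothing
        Return icon.Value
    End Function

    Sub Main()
        Dim doc = XDocument.Parse(
            "<weerstations><weerstation id=""6391"">" &
            "<icoonactueel zin=""bewolkt"" ID=""p"">http://xml.buienradar.nl/icons/p.gif</icoonactueel>" &
            "</weerstation></weerstations>")
        Console.WriteLine(GetIconUrl(doc, "6391"))
    End Sub
End Module
```

If the description text is needed too, it should be available as the zin attribute of the same element.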

Lowercase the first word

Does anybody know how to lowercase the first word for each line in a textbox?
Not the first letter, the first word.
I tried like this but it doesn't work:
For Each iz As String In txtCode.Text.Substring(0, txtCode.Text.IndexOf(" "))
iz = LCase(iz)
Next
When you call Substring, it is making a copy of that portion of the string and returning it as a new string object. So, even if you were successfully changing the value of that returned sub-string, it still would not change the original string in the Text property.
However, strings in .NET are immutable reference-types, so when you set iz = ... all you are doing is re-assigning the iz variable to point to yet another new string object. When you set iz, you aren't even touching the value of that copied sub-string to which it previously pointed.
In order to change the value of the text box, you must actually assign a new string value to its Text property, like this:
txtCode.Text = "the new value"
Since that is the case, I would recommend building a new string with a StringBuilder object and, once the modified string is complete, setting the text box's Text property to that new string, for instance:
Dim builder As New StringBuilder()
For Each line As String In txtCode.Text.Split({Environment.NewLine}, StringSplitOptions.None)
' Fix case and append line to builder
Next
txtCode.Text = builder.ToString()
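One way the placeholder comment in that loop could be filled in, as a sketch rather than the answerer's own code: take everything up to the first space as the first word, lowercase it, and append the rest of the line unchanged. LowerFirstWords is a hypothetical helper that works on a plain string, so the text box would be updated with txtCode.Text = LowerFirstWords(txtCode.Text).

```vbnet
Imports System
Imports System.Text

Module LowerFirstWord
    ' Hypothetical helper: lowercases the first word of every line in the input.
    Function LowerFirstWords(input As String) As String
        Dim builder As New StringBuilder()
        Dim lines = input.Split({Environment.NewLine}, StringSplitOptions.None)
        For i As Integer = 0 To lines.Length - 1
            Dim line = lines(i)
            Dim spaceIndex = line.IndexOf(" "c)
            If spaceIndex < 0 Then
                ' The whole line is a single word (or empty).
                builder.Append(line.ToLower())
            Else
                builder.Append(line.Substring(0, spaceIndex).ToLower())
                builder.Append(line.Substring(spaceIndex))
            End If
            ' Put the line break back between lines, but not after the last one.
            If i < lines.Length - 1 Then builder.Append(Environment.NewLine)
        Next
        Return builder.ToString()
    End Function

    Sub Main()
        Console.WriteLine(LowerFirstWords("HELLO World" & Environment.NewLine & "FOO Bar"))
    End Sub
End Module
```

This treats a space as the only word separator; lines that start with a tab or other whitespace would need extra handling.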
The solutions here are interesting but they are ignoring a fundamental tool of .NET: regular expressions. The solution can be written in one expression:
Dim result = Regex.Replace(txtCode.Text, "^\w+",
Function (match) match.Value.ToLower(), RegexOptions.Multiline)
(This requires the import System.Text.RegularExpressions.)
This solution is likely more efficient than all the other solutions here (It’s definitely more efficient than most), and it’s less code, thus less chance of a bug and easier to understand and to maintain.
The problem with your code is that you are running the loop only on each character of the first word in the whole TextBox text.
This code is looping over each line and takes the first word:
For Each line As String In txtCode.Text.Split({Environment.NewLine}, StringSplitOptions.None)
line = line.Trim().ToLower()
If line.IndexOf(" ") > 0 Then
line = line.Substring(0, line.IndexOf(" ")).Trim()
End If
' do something with 'line' here
Next
Loop through each of the lines of the textbox, splitting all of the words in the line, making sure to .ToLower() the first word:
Dim strResults As String = String.Empty
' Split on the full line break; a bare String argument would only use its first character
For Each strLine As String In IO.File.ReadAllText("C:\Test\StackFlow.txt").Split({ControlChars.NewLine}, StringSplitOptions.None)
Dim lstWords As List(Of String) = strLine.Split(" "c).ToList()
If Not lstWords Is Nothing Then
strResults += lstWords(0).ToLower()
If lstWords.Count > 1 Then
For intCursor As Integer = 1 To (lstWords.Count - 1)
strResults += " " & lstWords(intCursor)
Next
End If
' Restore the line break that Split removed
strResults += ControlChars.NewLine
End If
Next
I used your ideas, guys, and ended up with this:
For Each line As String In txtCode.Text.Split(Environment.NewLine)
Dim abc() As String = line.Split(" ")
txtCode.Text = txtCode.Text.Replace(abc(0), LCase(abc(0)))
Next
It works like this. Thank you all.

Creating Newlines in PDF with VB.net

I have an application which creates a list from items in a collection. Then for each item, I will add it to an empty string, then add a newline character to the end of it. So ideally my string will look something like:
List1\nList2\nList3\n
Once this string is generated, I send it back to be placed in a placeholder for a PDF. If I try this code in a simple console application, it prints every item on a new line. But in my real-world situation I have to print it to a PDF, and there the items show up with spaces between them instead of newlines. How can I format my string so that the PDF recognizes the newline character rather than ignoring it?
Here is my code that generates the string with newlines.
Private Function ConcatPlacardNumbers(ByVal BusinessPlacardCollection As BusinessPlacardCollection) As String
Dim PlacardNumbersList As String = Nothing
Dim numberofBusinessPlacards As Long = BusinessPlacardCollection.LongCount()
For Each BusinessPlacard As BusinessPlacard In BusinessPlacardCollection
numberofBusinessPlacards = numberofBusinessPlacards - 1
PlacardNumbersList = String.Concat(PlacardNumbersList, BusinessPlacard.PlacardNumber)
If numberofBusinessPlacards <> 0 Then
PlacardNumbersList = String.Concat(PlacardNumbersList, Environment.NewLine)
End If
Next
Return PlacardNumbersList
End Function
Try to add \u2028 instead:
Private Function ConcatPlacardNumbers(ByVal BusinessPlacardCollection As _
BusinessPlacardCollection) As String
Dim PlacardNumbersList As New StringBuilder()
For Each BusinessPlacard As BusinessPlacard In BusinessPlacardCollection
PlacardNumbersList.Append(BusinessPlacard.PlacardNumber)
'PlacardNumbersList.Append(ChrW(8232)) '\u2028 line in decimal form
PlacardNumbersList.Append(ChrW(8233)) '\u2029 paragr. in decimal form
Next
Return PlacardNumbersList.ToString
End Function
For paragraphs use \u2029 instead. For more details:
http://blogs.adobe.com/formfeed/2009/01/paragraph_breaks_in_plain_text.html
The answer will depend on the tool that is being used to produce the PDF. Since Environment.NewLine (a carriage return plus line feed on Windows) doesn't work, I would try a bare line feed (\n) instead. The other possibility is that the PDF generation code is not designed to emit multiple lines; you can only determine this by examining the generation code.
However, there is a significant performance issue that you should address in your code: you will be generating a lot of string objects using this code. You should change the design to use System.Text.StringBuilder, which will greatly improve the performance:
Private Function ConcatPlacardNumbers(ByVal BusinessPlacardCollection As BusinessPlacardCollection) As String
Dim PlacardNumbersList As New System.Text.StringBuilder(10000)
For Each BusinessPlacard As BusinessPlacard In BusinessPlacardCollection
If PlacardNumbersList.Length <> 0 Then
' This is equivalent to Environment.NewLine
'PlacardNumbersList.AppendLine()
' The attempt to use a bare line feed; vbLf is used because VB.NET string literals do not treat "\n" as an escape sequence
PlacardNumbersList.Append(vbLf)
End If
PlacardNumbersList.Append(BusinessPlacard.PlacardNumber)
Next
Return PlacardNumbersList.ToString
End Function
Note that you also do not need to keep track of the number of placards remaining: on each pass after the first one, you can add the line break before appending the next item.
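If the separator only ever goes between items, String.Join can do the bookkeeping. The sketch below is an alternative to the loop, not what the answer proposes; it assumes the placard numbers are available as strings, and JoinPlacardNumbers is a hypothetical name. Passing the separator as a parameter makes it easy to try vbLf, Environment.NewLine or ChrW(8232) against the PDF tool.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module JoinDemo
    ' Hypothetical helper: joins the numbers with the given separator between items only.
    Function JoinPlacardNumbers(numbers As IEnumerable(Of String), separator As String) As String
        Return String.Join(separator, numbers)
    End Function

    Sub Main()
        Dim numbers As New List(Of String) From {"List1", "List2", "List3"}
        ' vbLf is a real line feed; the literal "\n" would be a backslash followed by n.
        Dim joined = JoinPlacardNumbers(numbers, vbLf)
        ' Show the separators visibly for the console demo.
        Console.WriteLine(joined.Replace(vbLf, "|"))
    End Sub
End Module
```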

Read text file with tab and carriage return format to store them in array

I have a text file in the following format:
Word[tab][tab]Word[Carriage Return]
Word[tab][tab]Word[Carriage Return]
Word[tab][tab]Word[Carriage Return]
I want to get all the words before the tab into one array (or write them to a new text file), and all the words after the tab into another array (or another new text file).
Here my function to get the words before tab into an array :
Protected Sub MakeWordListBeforeTab()
Dim filename As String = "D:\lao\00001.txt"
'read from file'
Dim MyStream As New StreamReader(filename)
'words before tab
Dim WordBeforeTabArr() As String = MyStream.ReadToEnd.Split(CChar("\t"))
MyStream.Close()
'test to see the word in array
For d As Integer = 0 To WordBeforeTabArr.Length - 1
MsgBox(WordBeforeTabArr(d))
Next
End Sub
I wrote the above function to get all words before tab but I got all the words into array. I've been trying to use the Split method above. What is another method to split those words ? Can anyone show me some code to get this done right ?
I know this can be done with regular expression but I don't know regex yet. If you can show me how to get this done with regex it'll be awesome. Thanks.
You could try the split function on String. It could be used like this:
Dim lines() As String = IO.File.ReadAllLines(filename)
For Each line As String In lines
Dim words() As String = _
line.Split(New Char() {vbTab}, StringSplitOptions.RemoveEmptyEntries)
Next
The words array for each line would hold the two words, one at each position. You could fill your two arrays, or write the values out to one or more text files, as you split the lines of the input file in the loop.
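First of all, the above code does not compile with Option Strict On, because vbTab is a String rather than a Char. Note also that VB.NET string literals have no escape sequences, so "\t" is a backslash followed by a t, not a tab. Use ControlChars.Tab instead:
Dim lines() As String = IO.File.ReadAllLines(test_Filename)
For Each line As String In lines
Dim words() As String = _
line.Split(New Char() {ControlChars.Tab}, StringSplitOptions.RemoveEmptyEntries)
Next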
Next
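To actually end up with the two collections the question asks for, the loop can add the first word of each line to one list and the second to another. A minimal sketch, assuming every useful line has a word on each side of the tabs; SplitColumns and the sample data are hypothetical, and in the real program the lines would come from IO.File.ReadAllLines.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module TabSplitDemo
    ' Hypothetical helper: fills 'before' and 'after' with the words on each side of the tabs.
    Sub SplitColumns(lines As IEnumerable(Of String), before As List(Of String), after As List(Of String))
        For Each line As String In lines
            ' RemoveEmptyEntries collapses the double tab into a single separator.
            Dim words = line.Split(New Char() {ControlChars.Tab}, StringSplitOptions.RemoveEmptyEntries)
            ' Skip blank or malformed lines that do not have both words.
            If words.Length >= 2 Then
                before.Add(words(0))
                after.Add(words(1))
            End If
        Next
    End Sub

    Sub Main()
        Dim sample = {"apple" & vbTab & vbTab & "red", "pear" & vbTab & vbTab & "green"}
        Dim before As New List(Of String)()
        Dim after As New List(Of String)()
        SplitColumns(sample, before, after)
        Console.WriteLine(String.Join(", ", before))
        Console.WriteLine(String.Join(", ", after))
    End Sub
End Module
```

Calling before.ToArray() and after.ToArray() gives plain arrays, and IO.File.WriteAllLines can write either list to a new text file.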