I want to find the highest numerical value in a CSV field as this will determine the next highest number.
Dim founditem() As String = Nothing
For Each line As String In File.ReadAllLines("F:\Computing\Spelling Bee\testtests.csv")
Dim item() As String = line.Split(","c)
Do While item(8) = choice
If weeknumber < item(9) Then
weeknumber += 1
Else
Exit Do
End If
Loop
Next
I am getting an "index is out of bounds" exception. Why?
.NET arrays are zero-based. Their indexes range from 0 to the number
of columns - 1.
Do you have ten columns? Because of item(9) you would at least need
to have ten columns.
Note also that empty fields at the end of a line must still be separated
by commas in a CSV file. If you have 10 columns, every line must have 9 commas.
Also an empty line at the end of the file might cause the problem
because it will yield exactly one empty item for that line, instead
of ten.
Add a test for the number of fields in the line:
Do While item.Length = 10 AndAlso item(8) = choice
If weeknumber < item(9) Then
weeknumber += 1
Else
Exit Do
End If
Loop
If this does not help, set a breakpoint at the beginning of the method, step through it and inspect the variables. The Visual Studio debugger makes it very easy to find most such errors. The exception's stack trace also points you to the line of the faulty spot.
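For the stated goal (finding the highest value in the field so the next number can be derived), a minimal sketch could look like this. It assumes the field indexes from the question (8 for the name, 9 for the number) and that the number is an integer; HighestWeek and the sample rows are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module MaxWeekSketch
    ' Hypothetical helper: highest integer in field 9 among the rows whose
    ' field 8 equals choice. Returns 0 when no row qualifies.
    Function HighestWeek(lines As IEnumerable(Of String), choice As String) As Integer
        Dim highest As Integer = 0
        For Each line As String In lines
            Dim item() As String = line.Split(","c)
            ' Skip short or empty lines instead of indexing past the end
            If item.Length < 10 OrElse item(8) <> choice Then Continue For
            Dim week As Integer
            If Integer.TryParse(item(9), week) AndAlso week > highest Then highest = week
        Next
        Return highest
    End Function

    Sub Main()
        ' Made-up rows; the third one is an empty line like the one described above
        Dim sample As String() = {
            "a,b,c,d,e,f,g,h,Test1,3",
            "a,b,c,d,e,f,g,h,Test1,7",
            "",
            "a,b,c,d,e,f,g,h,Test2,9"}
        ' Next week number for Test1
        Console.WriteLine(HighestWeek(sample, "Test1") + 1)
    End Sub
End Module
```

With a real file, File.ReadLines(path) could be passed in place of the sample array.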
Parsing a CSV file involves much more than splitting on commas. A comma can be part of a value, and splitting on commas will return wrong results for such data. A newline character can also be part of a value, so a single CSV record can span multiple lines. There may be more issues out there, which I don't remember off the top of my head.
You are better off using a third-party CSV parser, such as this one:
KBCsv @ CodePlex.
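To illustrate the embedded-comma problem without any third-party library, here is a small sketch that uses TextFieldParser from Microsoft.VisualBasic.FileIO instead of the parser linked above. The sample line is made up.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module QuotedCommaSketch
    Sub Main()
        ' Three fields; the second one contains a comma inside quotes
        Dim line As String = "1,""Smith,James"",Active"

        ' Naive split cuts the quoted field in two
        Console.WriteLine(line.Split(","c).Length)   ' 4

        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            Dim fields As String() = parser.ReadFields()
            Console.WriteLine(fields.Length)         ' 3
        End Using
    End Sub
End Module
```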
Related
Hi, I am trying to search for a line which contains what the user inputs in a text box, and display the whole line. My code below doesn't display a message box after the button has been clicked, and I am not sure if the record has been found.
Dim filename, sr As String
filename = My.Application.Info.DirectoryPath + "\" + "mul.txt"
Dim file As String()
Dim i As Integer = 0
file = IO.File.ReadAllLines(filename)
Dim found As Boolean
Dim linecontain As Char
sr = txtsr.ToString
For Each line As String In file
If line.Contains(sr) Then
found = True
Exit For
End If
i += 1
If found = True Then
MsgBox(line(i))
End If
Next
End Sub
You should be calling ReadLines here rather than ReadAllLines. The difference is that ReadAllLines reads the entire file contents into an array first, before you can start processing any of it, while ReadLines doesn't read a line until you have processed the previous one. ReadAllLines is good if you want random access to the whole file or you want to process the data multiple times. ReadLines is good if you want to stop processing data when a line satisfies some criterion. If you're looking for a line that contains some text and you have a file with one million lines where the first line matches, ReadAllLines would read all one million lines whereas ReadLines would only read the first.
So, here's how you display the first line that contains specific text:
For Each line In File.ReadLines(filePath)
If line.Contains(substring) Then
MessageBox.Show(line)
Exit For
End If
Next
With regards to your original code, your use of i makes no sense. You seem to be using i as a line counter but there's no point because you're using a For Each loop so line contains the line. If you already have the line, why would you need to get the line by index? Also, when you try to display the message, you are using i to index line, which means that you're going to get a single character from the line rather than a single line from the array. If the index of the line is greater than the number of characters in the line then that is going to throw an IndexOutOfRangeException, which I'm guessing is what's happening to you.
This is what comes from writing code without first knowing what it actually has to do. If you had written out an algorithm before writing the code, it would have been obvious that the code didn't implement the algorithm. If you have no algorithm, though, you have nothing to compare your code against to make sure that it makes sense.
I use this code to check if a String is in another String:
If StringData(1).Contains("-SomeText2.") Then
'some code
End If
'StringData(1) looks like this:
'-SomeText1.1401-|-SomeText2.0802-|-SomeText3.23-|-SomeText4.104-|
'In case I look for -SomeText1. I need 1401
'In case I look for -SomeText2. I need 0802
'In case I look for -SomeText3. I need 23
'In case I look for -SomeText4. I need 104
I first check whether -SomeText2. is in StringData(1). If it is, I need to get the next part of the text, 0802, and that is the part I don't know how to do. How can I do it?
All the strings are separated by | and all substrings start and end with - and have a . separating the first part from the second. I check all the strings starting with - and ending with . because there are some with - and | in the middle, so Split function won't work.
Those strings change quite often, so I need something to check it no matter the length of the strings.
I would just split the string up and get the text between "." and "-" when the search text is found like this:
Dim str As String = "-SomeText1.1401-|-SomeText2.0802-|-SomeText3.23-|-SomeText4.104-"
Dim searches() As String = {"-SomeText1", "-SomeText2", "-SomeText3", "-SomeText4"}
For Each search As String In searches
For Each value As String In str.Split(CChar("|"))
If value.Contains(search) Then
Dim partIwant As String = value.Substring(value.IndexOf(".") + 1, value.Length - value.IndexOf(".") - 2)
MsgBox(partIwant)
'Outputs: 1401, 0802, 23, 104
Exit For
End If
Next
Next
In this example, we just use Contains() to see if our search string is present or not...we can't actually use that function to get any further information because all it returns is a True or False. So once we know that our string has been found, it's just a matter of some string manipulation to grab the text between the "." and "-" characters. IndexOf() will get us the index of the period, and then we just pull the text between there and the last character of the string.
Your question has nothing to do with WPF, so the tag and title are misleading.
To solve your problem, you should use String.IndexOf(string) instead of String.Contains(string). That tells you at which position the given string starts. If that value is -1, it means that the original string does not contain your search string at all.
Once you have that starting index, you can use String.IndexOf(string, int) to search for the next occurrence of -, so you know where the entry stops. The second parameter tells it at which index it should start the search. In this case you should start just past the end of your match (the match index plus the length of the search string), because the search string itself begins with - and would otherwise be found again immediately.
Now that you know the starting index of your match, the end index of the entry and the length of your search string, you can put those together and easily use String.Substring(int, int) to get the part of the string that you are interested in.
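The steps above can be sketched roughly as follows. ValueAfter is a hypothetical helper name, and the sketch assumes the value itself never contains a - character.

```vbnet
Imports System

Module IndexOfSketch
    ' Hypothetical helper: returns the text between the search key and the
    ' next "-", or Nothing when the key does not occur in data.
    Function ValueAfter(data As String, key As String) As String
        Dim keyStart As Integer = data.IndexOf(key, StringComparison.Ordinal)
        If keyStart < 0 Then Return Nothing

        ' The value begins right after the key
        Dim valueStart As Integer = keyStart + key.Length
        ' Look for the closing "-" from there on
        Dim valueEnd As Integer = data.IndexOf("-"c, valueStart)
        If valueEnd < 0 Then valueEnd = data.Length

        Return data.Substring(valueStart, valueEnd - valueStart)
    End Function

    Sub Main()
        Dim data As String = "-SomeText1.1401-|-SomeText2.0802-|-SomeText3.23-|-SomeText4.104-|"
        Console.WriteLine(ValueAfter(data, "-SomeText2."))   ' 0802
    End Sub
End Module
```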
That's the straightforward, naive solution. A more sophisticated solution would build a regular expression from the search string in such a way that the part you are interested in ends up in a capture group. But that's a more elaborate topic.
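As a rough idea of what the regular-expression variant might look like (a sketch only, again assuming the value contains no - character): the search string is escaped so that its dot is matched literally, and the value is captured in group 1.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    Sub Main()
        Dim data As String = "-SomeText1.1401-|-SomeText2.0802-|-SomeText3.23-|-SomeText4.104-|"
        Dim key As String = "-SomeText3."

        ' Escaped key, then everything up to the next "-" as capture group 1
        Dim m As Match = Regex.Match(data, Regex.Escape(key) & "([^-]*)-")
        If m.Success Then
            Console.WriteLine(m.Groups(1).Value)   ' 23
        End If
    End Sub
End Module
```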
I need some direction on how to solve a problem I am working on. The root issue is that I need to work with CSV files in another program. The source system that creates the CSV files does not strip out CRLF in any of the data fields that get exported (meaning some fields have an embedded CRLF). As a result I receive a CSV file that has malformed rows in it. My end goal is a utility that will either
check the first column of each row (which, if correct, is a GUID with a length of 36), or
count the columns in each row (which is the example below).
In the example below I am looking at the column count. If the correct count is 18 then I want it to write that row to a new file. If the column count is not correct I want to remove the CRLF from that row until the column count is correct.
Again, two ways to solve the issue that I know of:
Check the length of the first column for a length of 36 (before the first comma and excluding the first row which is the title row), or
count the columns and remove any trailing CRLF until the column count is equal to 18 (the total column count).
My issue with the code so far is being able to write out a valid row to a new file. Currently it writes out System.String[] instead of the actual row.
Public Class Form1
Private Sub btnFixit_Click(sender As Object, e As EventArgs) Handles btnFixit.Click
Dim iBadRowNumber As Integer = vbNull
Dim strFixedFile As System.IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(Me.txtFixedFile.Text, True)
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(Me.txtBaselineFileToProcess.Text)
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
If currentRow.Count = 18 Then
strFixedFile.WriteLine(currentRow)
Else
' Future code here to fix the line
End If
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
"is not valid and will be skipped.")
End Try
End While
End Using
strFixedFile.Close()
End Sub
End Class
Here is an example of 2 correct rows with one incorrect row in the middle. In this example the row beginning with Sometown is really part of the prior row. I have also seen one true row broken into three or more partial rows, similar to what you see in the Sometown row.
CustomerId,CustomerName,Status,Type,CustomerNumber,DBA,Address1,Address2,City,State,ZipCode,WebAddr,EMail,SalesCode,ServiceCode,DivisionCode,BranchCode,DepartmentCode
6d0125cd-70cf-4048-9ee1-8d9682e426a5,"Smith,James",Active,Customer,8,,103 Long Dr,,AnotherTown,NJ,000000,,,!!S,!%9,!!#,!!#,"!""."
35ed375c-c226-4879-a789-469cae63383c,"Doe, John",Active,Customer,55281,,28 Short Drive,,
Sometown,CA,12345,,
email@domain.com,"!$,",!$^,!!#,!!#,!!K
a5972bce-408f-4def-b77c-4ae0148dd045,"Duck,Donald",Active,Customer,25,,236 North Main St,,Mytown,PA,11111,,,!!2,!%9,!!#,!!#,"!""."
There may be much more elegant ways to perform the specific task. I am open either to corrections to my logic above or a totally different way to solve the problem either in VB.net or PowerShell.
Normally, csv can have multiline fields without a problem. But those need to be surrounded with quotes.
In your example this doesn't seem to be the case. On the other hand, there is no multiline field either: the field with the value Sometown simply starts on a new line. So I wonder if this is the original data.
If your multiline fields are surrounded with quotes, you need to inform your parser about it.
But even with the single lines you will have problems caused by the fields with a separator inside. Luckily those are quoted (as they should be), so make sure the TextFieldParser.HasFieldsEnclosedInQuotes property is set to True as well (True is its default).
Now, if your multiline fields happen to be quoted (as they should be), the above setting should solve everything.
Update
You could do something like this:
currentRow = MyReader.ReadFields()
If currentRow.Count = 18 Then
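' WriteLine(currentRow) would print "System.String[]", so join the fields instead.
' Fields that contain a comma or a quote must be re-quoted before joining.
strFixedFile.WriteLine(String.Join(",", currentRow))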
Else
'Write current row without newline
'Read next line/row
'WriteLine this row
End If
But you'll have to take care of fields like "Smith,James" with a separator inside. Make sure your parser handles quoted fields properly (see above).
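Passing the String() returned by ReadFields straight to WriteLine prints the type name (System.String[]) rather than the fields, and the parser has already stripped the quotes. One possible way to turn a row back into a CSV line is sketched below. QuoteField and ToCsvLine are hypothetical helpers, not part of TextFieldParser.

```vbnet
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module CsvWriteSketch
    ' Hypothetical helper: quote a field when it contains a comma, a double
    ' quote or a line break, and double any embedded quotes.
    Function QuoteField(field As String) As String
        Dim special As Char() = {","c, """"c, ControlChars.Cr, ControlChars.Lf}
        If field.IndexOfAny(special) < 0 Then Return field
        Return """" & field.Replace("""", """""") & """"
    End Function

    ' Hypothetical helper: join already-parsed fields into one CSV line
    Function ToCsvLine(fields As String()) As String
        Return String.Join(",", fields.Select(Function(f) QuoteField(f)))
    End Function

    Sub Main()
        ' Field values as a parser would return them (quotes already removed)
        Dim row As String() = {"Active", "Smith,James", "!""."}
        Console.WriteLine(ToCsvLine(row))   ' Active,"Smith,James","!""."
    End Sub
End Module
```

In the loop above, strFixedFile.WriteLine(ToCsvLine(currentRow)) would then write the row in its original quoted form.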
The most straightforward approach would probably be a variation of your first validation check:
Read the file line-by-line and keep both the current and the previous line in a buffer.
Check if the beginning of the line is a proper GUID (e.g. with a regular expression).
If the current line does not start with a GUID, append it to the previous line.
Otherwise write the previous line to the output file unless it's empty, then replace it with the current line.
I don't know VB.net, but in PowerShell it would look somewhat like this:
$reader = New-Object IO.StreamReader ('C:\path\to\input.csv')
$writer = New-Object IO.StreamWriter ('C:\path\to\output.csv', $false)
$writer.WriteLine($reader.ReadLine()) # copy CSV header
$output = '' # output buffer
$current = '' # pre-buffered current line from input file
while ($reader.Peek() -ge 0) {
# read line into pre-buffer
$current = $reader.ReadLine()
$hasGUID = $current -match '^[a-f0-9]{8}(-[a-f0-9]{4}){3}-[a-f0-9]{12},'
# append line to output buffer if it doesn't have a GUID, otherwise
# write the output buffer to file if it contains data and move the
# current line to the output buffer
if (-not $hasGUID) {
$output += $current
} else {
if ($output) { $writer.WriteLine($output) }
$output = $current
}
}
# write the last buffered record (if there is one); this also covers a
# file whose final line is a continuation of the previous record
if ($output) { $writer.WriteLine($output) }
$reader.Close(); $reader.Dispose()
$writer.Close(); $writer.Dispose()
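Since the question also asks about VB.NET, here is a rough translation of the same idea. It is a sketch under the same assumption as the PowerShell version (every real record starts with a GUID followed by a comma). RejoinRecords is a hypothetical name, and the sample rows are shortened, made-up versions of the ones in the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RejoinSketch
    ' Hypothetical helper: merges physical lines into logical records. A line
    ' that does not start with a GUID is appended to the previous record.
    Function RejoinRecords(lines As IEnumerable(Of String)) As List(Of String)
        Dim guidStart As New Regex("^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12},",
                                   RegexOptions.IgnoreCase)
        Dim records As New List(Of String)
        Dim buffer As String = Nothing
        Dim isHeader As Boolean = True

        For Each line As String In lines
            If isHeader Then
                records.Add(line)          ' copy the header row unchanged
                isHeader = False
            ElseIf guidStart.IsMatch(line) Then
                ' A new record starts: flush the previous one first
                If buffer IsNot Nothing Then records.Add(buffer)
                buffer = line
            Else
                buffer &= line             ' continuation of the previous record
            End If
        Next

        ' Flush the last record
        If buffer IsNot Nothing Then records.Add(buffer)
        Return records
    End Function

    Sub Main()
        Dim sample As String() = {
            "CustomerId,City,EMail",
            "35ed375c-c226-4879-a789-469cae63383c,",
            "Sometown,",
            "email@domain.com",
            "a5972bce-408f-4def-b77c-4ae0148dd045,Mytown,"}

        ' Prints the header plus two repaired records
        For Each record As String In RejoinRecords(sample)
            Console.WriteLine(record)
        Next
    End Sub
End Module
```

For real files, System.IO.File.ReadLines could supply the input and System.IO.File.WriteAllLines could write the result.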
I am having some issues handling strings with carriage returns in them. (I need to keep the carriage returns. And would like to not have to code for them uniquely.) This string has 2 carriage returns...
Press the Switch
Drying
And when I view it in SQL editor of Visual Studio it appears like....
Press the SwitchDrying
Which isn't technically an issue. I can copy and paste the contents into Excel or elsewhere and it is correctly formatted.
The issue comes when I try to compare the string to another variable, even if it has the same value.
I query a set of records from my SQL Server 2008 R2 DB table and then compare them to an external data source.
So as I loop through the records of the SQL result set....
For Each row As DataRow In myTable.Rows
Dim stringVal As String = row(columnName).ToString()
' Eventually added this to see that the row was adding 2 spaces after the carriage return
Dim cstringVal() As Char = stringVal.ToCharArray
Dim csearchValue() As Char = searchValue.ToCharArray
' Originally tried
If row(columnName) = searchValue And row(columnName2) = searchValue2 Then
return True
End If
' Tried this
If stringVal = searchValue And row(columnName2) = searchValue2 Then
Return True
End If
' And this
If String.Compare(stringVal, searchValue, False) = 0 And row(columnName2) = searchValue2 Then
Return True
End If
Next
Return False
After adding the char arrays, I noticed that somehow 2 spaces were being added after the first carriage return. Or maybe it is one space after each of them, as the CRs are not identified in the char array.
I do not have any code that splits this string, and it is only strings that have carriage returns that cause this. The only other difference is that one string comes from an OLE DataAdapter and the other from a SQL DataAdapter.
What gives? Any ideas?
UPDATE: It appears that the SQL DataAdapter is not correctly representing the carriage returns in the dataset that is returned from my query. When I view the table contents in VS SQL Editor, the string can correctly be copied and pasted from the table into any other app. I will be stepping through the code shortly, looking at the comparison of the values.
The new line is likely a combination of two bytes for CR and LF (or 0D 0A in hex, escaped as '\r\n' in C#, etc). If your string comes from a Linux system, it likely uses a single byte LF.
String comparisons would be expected to fail if they don't account for the embedded new line characters. If you would like to ignore newlines for purposes of the string comparison, you will have to do something like a String.Replace to remove newline characters. Of course you can always keep a copy of your original string that includes the newline characters.
I hope this is relevant to the problem you are seeing?
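If the two sources really do differ only in their line endings, one option is to normalize both strings for the comparison instead of removing the line breaks. This is a variation on the Replace idea above, not a confirmed diagnosis of your data; NormalizeNewlines and SameIgnoringLineEndings are hypothetical helper names.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module NewlineCompareSketch
    ' Hypothetical helper: turn CRLF and lone CR into LF
    Function NormalizeNewlines(text As String) As String
        Return text.Replace(vbCrLf, vbLf).Replace(vbCr, vbLf)
    End Function

    ' Hypothetical helper: compare two strings, ignoring the line-ending style
    Function SameIgnoringLineEndings(a As String, b As String) As Boolean
        Return String.Equals(NormalizeNewlines(a), NormalizeNewlines(b),
                             StringComparison.Ordinal)
    End Function

    Sub Main()
        ' Made-up values: same text, different line endings
        Dim fromSql As String = "Press the Switch" & vbCrLf & "Drying"
        Dim fromOle As String = "Press the Switch" & vbLf & "Drying"

        Console.WriteLine(fromSql = fromOle)                          ' False
        Console.WriteLine(SameIgnoringLineEndings(fromSql, fromOle))  ' True
    End Sub
End Module
```

The original strings are left untouched, so the carriage returns are still there when you store or display the values.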
I have a text file that has multiple blank lines and I'm trying to return all the lines between two of them specifically,
so if I have a file that looks like this:
____________________________
1########################
2##########################
3
4########################
5##########################
6#######################
7
8#########################
9##########################
10#######################
11####################
12########################
13#########################
14
15##########################
----------------------------
I would like to grab lines 8-13. Unfortunately, it might not always be 8-13, as it could be 9-20 or 7-8, but it will always be between the 2nd and 3rd blank lines.
I know how to trim characters and pull out singular lines, but I have no idea how to trim entire sections.
Any help would be appreciated, even if you just point me to a tutorial.
Thanks in advance.
The basic idea here is to get the entire thing as a string, split it into groups at the double line breaks, and then reference the group you want (in your case, the third one).
Dim value As String = File.ReadAllText("C:\test.txt")
Dim breakString As String = Environment.NewLine & Environment.NewLine
Dim groups As String() = value.Split({breakString}, StringSplitOptions.None)
Dim desiredString As String = groups(2)
MsgBox(desiredString)
Edit:
In response to the question in your comment -
Environment.NewLine is a more dynamic way of specifying a line break. Assuming you're running on Windows, you could use vbCrLf as well. The idea is that if you were to run the same code on Linux, Environment.NewLine would return an LF instead. You can see here for more information: http://en.wikipedia.org/wiki/Newline
The reason I used Environment.NewLine & Environment.NewLine is because you want to break your information where there are two line breaks (one at the end of the last line of a paragraph, and one for the blank line before the next paragraph)
What I ended up doing was trimming the last part and searching for what I needed in the first part (I know I didn't include the searching part in the question, but I was just trying to figure out a way to narrow down the search results, as it would otherwise have had repeated results). I'm posting this in case anyone else stumbles upon this looking for some answers.
' Requires: Imports System.Text.RegularExpressions
Dim applist() As String = System.IO.File.ReadAllLines("C:\applist.txt")
Dim findICSName As String = "pid"
' Matches from the first ":" up to the next "("
Dim ICSName As New Regex("\:.*?\(")
Dim app As String
Dim x As Integer = 0
' Stop at the marker line, or at the end of the file if the marker is missing
Do Until x >= applist.Length OrElse applist(x).Contains("Total PSS by OOM adjustment:")
    If applist(x).Contains(findICSName) Then
        app = ICSName.Match(applist(x)).Value
        ' Strip the leading ":" and spaces, and the trailing "("
        app = app.TrimStart(":"c, " "c)
        app = app.TrimEnd("("c)
        ListBox1.Items.Add(app)
    End If
    x += 1
Loop
How this works: it goes through the lines one by one, applying the regex to every line that contains "pid", until it reaches the line containing the marker text "Total PSS by OOM adjustment:".