Stored list of array using For Each loop - vb.net

I want to store each "zone_check_value" in an array of strings and, while inserting, check the array to see whether the next value duplicates one that is already stored.
Example.
1st Example
1st loop = zone_check_value = ZD1/01/2014
2nd loop = zone_check_value = ZD1/01/2014
2nd Example
1st loop = zone_check_value = ZD1/01/2014
2nd loop = zone_check_value = ZD2/02/2014
3rd loop = zone_check_value = ZD1/01/2014
Code:
For Each dt As DataTable In xls.Tables
Dim array_of_string as String() 'i want to put the value in here
For Each dr As DataRow In dt.Rows
Dim zone_destination As String = dr(2).ToString
Dim affected_date As String = dr(7).ToString
Dim zone_check_value = zone_destination + affected_date
''''''How can i store zone_check_value in a string array?
Next
Next
EDIT
What if I add a Select Case inside my loop? The array_of_string value becomes Nothing (null). I need to check the current value against array_of_string.
Example Code
For Each dt As DataTable In xls.Tables
Dim array_of_string as String() 'i want to put the value in here
select Case dt.tablename
case "Sheet1"
For Each dr As DataRow In dt.Rows
Dim zone_destination As String = dr(2).ToString
Dim affected_date As String = dr(7).ToString
Dim zone_check_value = zone_destination + affected_date
''''''How can i store zone_check_value in a string array?
Next
case "Sheet 2"
For Each dr As DataRow In dt.Rows
Dim check_value as Boolean = array_of_string.Contains(dr(0).ToString)
'but when i got in sheet 2 the array_of_string is null
Next
Next

The easiest way is to use a List(Of String).
For Each dt As DataTable In xls.Tables
Dim array_of_string as List(Of String) = New List(Of String) 'i want to put the value in here
For Each dr As DataRow In dt.Rows
Dim zone_destination As String = dr(2).ToString
Dim affected_date As String = dr(7).ToString
Dim zone_check_value = zone_destination & affected_date
''''''How can i store zone_check_value in a string array?
array_of_string.Add(zone_check_value)
Next
''' Now if you really need it in array form you can convert it via:
''' Dim values() As String = array_of_string.ToArray()
Next
You might even consider:
For Each dt As DataTable In xls.Tables
Dim values As List(Of String) = New List(Of String)
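' DataRowCollection has no ForEach method, so project the rows with LINQ instead (needs Imports System.Linq).
values.AddRange(dt.Rows.Cast(Of DataRow)().Select(Function(item) item(2).ToString() & item(7).ToString()))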
''' Now do something with values.
Next
Either way, always use the string concatenation operator & to concatenate strings. The arithmetic addition operator + can cause problems on strings: with Option Strict Off, an expression such as "1" + 2 is evaluated as numeric addition rather than concatenation.
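The question also asks how to detect repeated values, and the EDIT asks why the collection is Nothing by the time "Sheet 2" is processed. One possible approach, shown here only as a sketch, is to create the collection once before the outer loop (so it is not re-declared for every table) and to use a HashSet(Of String), whose Add method returns False when the value is already present. The names DuplicateDemo and FindDuplicateKeys are hypothetical, and the code assumes that xls is a DataSet and that columns 2 and 7 hold the zone and date as in the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module DuplicateDemo
    ' Returns each key that was seen again after its first occurrence
    ' (a key that appears three times is listed twice).
    ' The HashSet is created once, outside both loops, so it keeps its
    ' contents from one table (sheet) to the next.
    Function FindDuplicateKeys(tables As DataTableCollection) As List(Of String)
        Dim seen As New HashSet(Of String)()
        Dim duplicates As New List(Of String)()
        For Each dt As DataTable In tables
            For Each dr As DataRow In dt.Rows
                Dim key As String = dr(2).ToString() & dr(7).ToString()
                ' HashSet.Add returns False when the value is already stored.
                If Not seen.Add(key) Then duplicates.Add(key)
            Next
        Next
        Return duplicates
    End Function

    Sub Main()
        ' Small in-memory stand-in for the spreadsheet data (assumed layout).
        Dim ds As New DataSet()
        Dim sheet As DataTable = ds.Tables.Add("Sheet1")
        For i As Integer = 0 To 7
            sheet.Columns.Add("Col" & i.ToString())
        Next
        sheet.Rows.Add("", "", "ZD1", "", "", "", "", "/01/2014")
        sheet.Rows.Add("", "", "ZD2", "", "", "", "", "/02/2014")
        sheet.Rows.Add("", "", "ZD1", "", "", "", "", "/01/2014")

        Dim duplicates As List(Of String) = FindDuplicateKeys(ds.Tables)
        Console.WriteLine(String.Join(", ", duplicates.ToArray()))
    End Sub
End Module
```

If the values must be checked from a different Case branch, the same idea applies: the collection only has to be declared outside the For Each over the tables.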

Related

Search and replace inside string column in DataTable is slow?

I am fetching distinct words in a string column of a DataTable (.dt) and then replacing the unique values with another value, so essentially changing words to other words. Both approaches listed below work, however, for 90k records, the process is not very fast. Is there a way to speed up either approach?
The first approach, is as follows:
'fldNo is column number in dt
For Each Word As String In DistinctWordList
Dim myRow() As DataRow
myRow = dt.Select(MyColumnName & "='" & Word & "'")
For Each row In myRow
row(fldNo) = dicNewWords(Word)
Next
Next
A second LINQ-based approach is as follows, and is actually not very fast either:
Dim flds as new List(of String)
flds.Add(myColumnName)
For Each Word As String In DistinctWordsList
Dim rowData() As DataRow = dt.AsEnumerable().Where(Function(f) flds.Where(Function(el) f(el) IsNot DBNull.Value AndAlso f(el).ToString = Word).Count = flds.Count).ToArray
ReDim foundrecs(rowData.Count)
Cnt = 0
For Each row As DataRow In rowData
Dim Index As Integer = dt.Rows.IndexOf(row)
foundrecs(Cnt) = Index + 1 'row.RowId
Cnt += 1
Next
For i = 0 To Cnt
dt(foundrecs(i))(fldNo) = dicNewWords(Word)
Next
Next
So you have your dictionary of replacements:
Dim d as New Dictionary(Of String, String)
d("foo") = "bar"
d("baz") = "buf"
You can apply them to your table's ReplaceMe column:
Dim rep as String = Nothing
For Each r as DataRow In dt.Rows
If d.TryGetValue(r.Field(Of String)("ReplaceMe"), rep) Then r("ReplaceMe") = rep
Next r
On my machine it takes 340ms for 1 million replacements. I can cut that down to 260ms by using column number rather than name - If d.TryGetValue(r.Field(Of String)(0), rep) Then r(0) = rep
Timing:
'setup, fill a dict with string replacements like "1" -> "11", "7" -> "17"
Dim d As New Dictionary(Of String, String)
For i = 0 To 9
d(i.ToString()) = (i + 10).ToString()
Next
'put a million rows in a datatable, randomly assign dictionary keys as row values
Dim dt As New DataTable
dt.Columns.Add("ReplaceMe")
Dim r As New Random()
Dim k = d.Keys.ToArray()
For i = 1 To 1000000
dt.Rows.Add(k(r.Next(k.Length)))
Next
'what range of values do we have in our dt?
Dim minToMaxBefore = dt.Rows.Cast(Of DataRow).Min(Function(ro) ro.Field(Of String)("ReplaceMe")) & " - " & dt.Rows.Cast(Of DataRow).Max(Function(ro) ro.Field(Of String)("ReplaceMe"))
'it's a crappy way to time, but it'll prove the point
Dim start = DateTime.Now
Dim rep As String = Nothing
For Each ro As DataRow In dt.Rows
If d.TryGetValue(ro.Field(Of String)("ReplaceMe"), rep) Then ro("ReplaceMe") = rep
Next
Dim ennd = DateTime.Now
'what range of values do we have now
Dim minToMaxAfter = dt.Rows.Cast(Of DataRow).Min(Function(ro) ro.Field(Of String)("ReplaceMe")) & " - " & dt.Rows.Cast(Of DataRow).Max(Function(ro) ro.Field(Of String)("ReplaceMe"))
MessageBox.Show($"min to max before of {minToMaxBefore} became {minToMaxAfter} proving replacements occurred, it took {(ennd - start).TotalMilliseconds} ms for 1 million replacements")
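Since the timing above relies on DateTime.Now, here is a hedged variant that measures the same loop with System.Diagnostics.Stopwatch, which is intended for measuring elapsed time. The module name ReplaceTiming and the function ApplyReplacements are hypothetical, and the sample data is tiny and only illustrates the pattern. The sketch also skips cells that do not hold a string, because Dictionary.TryGetValue throws when it is given a Nothing key.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Diagnostics

Module ReplaceTiming
    ' Replaces every value in column 0 that has an entry in the dictionary
    ' and returns the number of rows that were changed.
    Function ApplyReplacements(dt As DataTable, d As Dictionary(Of String, String)) As Integer
        Dim changed As Integer = 0
        Dim rep As String = Nothing
        For Each ro As DataRow In dt.Rows
            ' TryCast yields Nothing for DBNull, so empty cells are skipped.
            Dim current As String = TryCast(ro(0), String)
            If current IsNot Nothing AndAlso d.TryGetValue(current, rep) Then
                ro(0) = rep
                changed += 1
            End If
        Next
        Return changed
    End Function

    Sub Main()
        Dim d As New Dictionary(Of String, String)()
        d("foo") = "bar"
        d("baz") = "buf"

        Dim dt As New DataTable()
        dt.Columns.Add("ReplaceMe")
        dt.Rows.Add("foo")
        dt.Rows.Add("unchanged")
        dt.Rows.Add("baz")

        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim changed As Integer = ApplyReplacements(dt, d)
        sw.Stop()
        Console.WriteLine(changed.ToString() & " rows changed in " & sw.Elapsed.TotalMilliseconds.ToString() & " ms")
    End Sub
End Module
```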

Looking for a way to populate listview from String array

I have a problem populating a ListView with an array I have created. All of the data goes into one column instead of being spread across the columns of each row. Could you help me populate it correctly?
Dim finalas() As String = arrf.ToArray(GetType(System.String))
For Each element As String In finalas
Dim item As New ListViewItem(element)
ListView1.Items.Add(item)
Next
I have found a solution myself. If someone needs it, here you go:
Sub readTextFile()
' Read every line of the file into a string array.
Dim path As String = "D:\data.txt"
Dim st() As String = File.ReadAllLines(path) 'read the file into an array of strings
Dim p_1 As String = ""
Dim p_2 As String = ""
Dim arrl As Integer = 11
For Each itm As String In st 'loop the array of string item by item
Dim Arr() As String = itm.Split(New String() {" "}, StringSplitOptions.RemoveEmptyEntries) 'split the string
Dim name() As String = itm.Split(New String() {"'"}, StringSplitOptions.RemoveEmptyEntries)
' Arr.Skip(1).ToArray - does not read the first element
' Arr = Arr.Take(Arr.Length - 1).ToArray - does not read the last element
'final array
p_1 = Arr(1)
p_2 = Arr(2)
Dim finarr As New List(Of String)
finarr.Add(p_1)
finarr.Add(p_2)
finarr.Add(name(1))
For i As Integer = 4 To arrl
finarr.Add(Arr(((Arr.Length - 1) - arrl) + i))
Next
'MsgBox(finarr(0) & finarr(1) & finarr(2) & finarr(3) & finarr(4) & finarr(5) & finarr(6) & finarr(7) & finarr(8) & finarr(9) & finarr(10))
Dim items As New List(Of ListViewItem)
Dim lvItem = New ListViewItem(finarr(0))
For i = 1 To 10
lvItem.SubItems.Add(finarr(i))
Next i
items.Add(lvItem)
ListView1.Items.AddRange(items.ToArray)
Next
Return
End Sub

VB: Count number of columns in csv

So, quite simple.
I am importing CSVs into a datagrid, but the CSV files have a variable number of columns.
For 3 Columns, I use this code:
Dim sr As New IO.StreamReader("E:\test.txt")
Dim dt As New DataTable
Dim newline() As String = sr.ReadLine.Split(";"c)
dt.Columns.AddRange({New DataColumn(newline(0)), _
New DataColumn(newline(1)), _
New DataColumn(newline(2))})
While (Not sr.EndOfStream)
newline = sr.ReadLine.Split(";"c)
Dim newrow As DataRow = dt.NewRow
newrow.ItemArray = {newline(0), newline(1), newline(2)}
dt.Rows.Add(newrow)
End While
DG1.DataSource = dt
This works perfectly. But how do I count the number of fields in "newline"?
Can I get a count of them somehow? The other example code I have found doesn't set the column headers.
If my CSV file has 5 columns, I would need an AddRange of 5 instead of 3, and so on.
Thanks in advance
Dim sr As New IO.StreamReader(path)
Dim dt As New DataTable
Dim newline() As String = sr.ReadLine.Split(","c)
' MsgBox(newline.Count)
' dt.Columns.AddRange({New DataColumn(newline(0)),
' New DataColumn(newline(1)),
' New DataColumn(newline(2))})
Dim i As Integer
For i = 0 To newline.Count - 1
dt.Columns.AddRange({New DataColumn(newline(i))})
Next
While (Not sr.EndOfStream)
newline = sr.ReadLine.Split(","c)
Dim newrow As DataRow = dt.NewRow
newrow.ItemArray = newline ' assign every field of the line, not just the first two
dt.Rows.Add(newrow)
End While
dgv.DataSource = dt
End Sub
Columns and item values can be added to a DataTable individually, using dt.Columns.Add and newrow.Item, so that these can be done in a loop instead of hard-coding for a specific number of columns. e.g. (this code assumes Option Infer On, so adjust as needed):
Public Function CsvToDataTable(csvName As String, Optional delimiter As Char = ","c) As DataTable
Dim dt = New DataTable()
For Each line In File.ReadLines(csvName)
If dt.Columns.Count = 0 Then
For Each part In line.Split({delimiter})
dt.Columns.Add(New DataColumn(part))
Next
Else
Dim row = dt.NewRow()
Dim parts = line.Split({delimiter})
For i = 0 To parts.Length - 1
row(i) = parts(i)
Next
dt.Rows.Add(row)
End If
Next
Return dt
End Function
You could then use it like:
Dim dt = CsvToDataTable("E:\test.txt", ";"c)
DG1.DataSource = dt

Trimming a DataGridView's duplicate rows except the most recent one

I'm using Visual Studio 2008. I have a DataGridView with multiple columns, the last of which contains a date and time value.
Lots of rows are nearly identical except for their date column.
What I want to do is remove the duplicate rows from the DataGridView, keeping only the most recent row of each group based on the date column.
I have something like this:
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 23:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 21:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 22:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 20:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 11:11:59
Everyone ,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 17:11:59
Everyone ,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 14:11:59
the output i want should be like this:
Administrator 192.168.137.221 2 file://C:\WMPub\WMRoot\industrial.wmv 07.Jul.2014 - 23:11:59
Everyone 192.168.137.201 2 file://C:\WMPub\WMRoot\industrial.wmv 07.Jul.2014 - 17:11:59
....
Please consider "," as the column separator (I don't know how to draw a table here, sorry).
I have this snippet that removes the duplicate rows from a DataGridView, but it does not preserve the latest entry:
Public Function RemoveDuplicateRows(ByVal dTable As DataTable, ByVal colName As String) As DataTable
Dim hTable As New Hashtable()
Dim duplicateList As New ArrayList()
For Each dtRow As DataRow In dTable.Rows
If hTable.Contains(dtRow(colName)) Then
duplicateList.Add(dtRow)
Else
hTable.Add(dtRow(colName), String.Empty)
End If
Next
For Each dtRow As DataRow In duplicateList
dTable.Rows.Remove(dtRow)
Next
Return dTable
End Function
What should I do?
Thanks in advance.
Here is some code that illustrates the approach:
Dim dict As New dictionary(Of String, DataRow)
For Each dtRow As DataRow In dTable.Rows
Dim key As String = dtRow("column1").ToString() & "," & dtRow("column2").ToString() ' & etc.
Dim dictRow As DataRow = Nothing
If dict.TryGetValue(key, dictRow) Then
'check and update date
'you can skip this part, if your data is sorted
If dtRow("dateColumn") > dictRow("dateColumn") Then
dictRow("dateColumn") = dtRow("dateColumn")
End If
Else
dict.Add(key, dtRow)
End If
Next
In the end dict contains the rows you need, you can get them via dict.Values.ToArray()
EDIT: I found the error - dictRow should be dtRow in the above code (now fixed). Then it should work. Here is a full version of self contained example (console app), since I wrote it anyway - focus on RemoveDuplicates, the rest is just prepwork:
' Needs Imports System.Data, System.IO and System.Reflection at the top of the file.
Sub Main()
Dim dt As New DataTable
With dt.Columns
.Add("PublishingPoint")
.Add("Username")
.Add("IP")
.Add("Status")
.Add("Req URL")
.Add("Last seen", GetType(Date))
End With
'this populates the initial data table, use your method
Dim _assembly As Assembly = Assembly.GetExecutingAssembly()
Dim _textStreamReader As New StreamReader(_assembly.GetManifestResourceStream("ConsoleApplication16.data.csv"))
While Not _textStreamReader.EndOfStream
Dim sLine As String = _textStreamReader.ReadLine().TrimEnd
If String.IsNullOrEmpty(sLine) Then Exit While
Dim values() As String = sLine.Split(",")
Dim newRow As DataRow = dt.NewRow
For iColumnIndex As Integer = 0 To dt.Columns.Count - 1
Dim columnName As String = dt.Columns(iColumnIndex).ColumnName
newRow.Item(columnName) = values(iColumnIndex)
Next
dt.Rows.Add(newRow)
End While
Console.WriteLine("Old count: " & dt.Rows.Count)
Dim newDt As DataTable = RemoveDuplicates(dt, "Last seen")
Console.WriteLine("New count: " & newDt.Rows.Count)
Console.ReadLine()
End Sub
Private Function RemoveDuplicates(dt As DataTable, colName As String) As DataTable
Dim keyColumnNames As New List(Of String)
Dim exceptColumnsHash As New HashSet(Of String)({colName})
For Each col As DataColumn In dt.Columns
Dim columnName As String = col.ColumnName
If Not exceptColumnsHash.Contains(col.ColumnName) Then
keyColumnNames.Add(columnName)
End If
Next
Dim dict As New Dictionary(Of String, DataRow)
For Each dtRow As DataRow In dt.Rows
Dim keyColumnValues As New List(Of String)
For Each keyColumnName In keyColumnNames
keyColumnValues.Add(dtRow.Item(keyColumnName).ToString())
Next
Dim key As String = String.Join(",", keyColumnValues)
Dim dictRow As DataRow = Nothing
If dict.TryGetValue(key, dictRow) Then
If CDate(dtRow(colName)) > CDate(dictRow(colName)) Then
dictRow(colName) = dtRow(colName)
End If
Else
dict.Add(key, dtRow)
End If
Next
Dim dtReturn As DataTable = dt.Clone
For Each dtRow As DataRow In dict.Values
dtReturn.ImportRow(dtRow)
Next
Return dtReturn
End Function
To make this code run, you need to manually add a file to the project and set build action to "Embedded resource".

How to select a value from a comma delimited string?

I have a string that contains comma delimited text. The comma delimited text comes from an excel .csv file so there are hundreds of rows of data that are seven columns wide. An example of a row from this file is:
2012-10-01,759.05,765,756.21,761.78,3168000,761.78
I want to search through the hundreds of rows by the date in the first column. Once I find the correct row, I want to extract the number that follows the date (the second field of the comma-delimited string), so in this case I want to extract 759.05 and assign it to the variable "Open".
My code so far is:
strURL = "http://ichart.yahoo.com/table.csv?s=" & tickerValue
strBuffer = RequestWebData(strURL)
Dim Year As String = 2012
Dim Quarter As String = Q4
If Quarter = "Q4" Then
Dim Open As Integer =
End If
Once I can narrow it down to the right row, I think something like row.Split(","c)(1).Trim() might work.
I've done quite a bit of research but I can't solve this on my own. Any suggestions!?!
ADDITIONAL INFORMATION:
Private Function RequestWebData(ByVal pstrURL As String) As String
Dim objWReq As WebRequest
Dim objWResp As WebResponse
Dim strBuffer As String
'Contact the website
objWReq = HttpWebRequest.Create(pstrURL)
objWResp = objWReq.GetResponse()
'Read the answer from the Web site and store it into a stream
Dim objSR As StreamReader
objSR = New StreamReader(objWResp.GetResponseStream)
strBuffer = objSR.ReadToEnd
objSR.Close()
objWResp.Close()
Return strBuffer
End Function
MORE ADDITIONAL INFORMATION:
A more complete picture of my code
Dim tickerArray() As String = {"GOOG", "V", "AAPL", "BBBY", "AMZN"}
For Each tickerValue In Form1.tickerArray
Dim strURL As String
Dim strBuffer As String
'Creates the request URL for Yahoo
strURL = "http://ichart.yahoo.com/table.csv?s=" & tickerValue
strBuffer = RequestWebData(strURL)
'Create Array
Dim lines As Array = strBuffer.Split(New String() {Environment.NewLine}, StringSplitOptions.None)
'Add Rows to DataTable
dr = dt.NewRow()
dr("Ticker") = tickerValue
For Each columnQuarter As DataColumn In dt.Columns
Dim s As String = columnQuarter.ColumnName
If s.Contains("-") Then
Dim words As String() = s.Split("-")
Dim Year As String = words(0)
Dim Quarter As String = words(1)
Dim MyValue As String
Dim Open As Integer
If Quarter = "Q1" Then MyValue = Year & "-01-01"
If Quarter = "Q2" Then MyValue = Year & "-04-01"
If Quarter = "Q3" Then MyValue = Year & "-07-01"
If Quarter = "Q4" Then MyValue = Year & "-10-01"
For Each line In lines
Debug.WriteLine(line)
If line.Split(",")(0).Trim = MyValue Then Open = line.Split(",")(1).Trim
dr(columnQuarter) = Open
Next
End If
Next
dt.Rows.Add(dr)
Next
Right now in the For Each line in lines loop, Debug.WriteLine(line) outputs 2,131 lines:
From
Date,Open,High,Low,Close,Volume,Adj Close
2013-02-05,761.13,771.11,759.47,765.74,1870700,765.74
2013-02-04,767.69,770.47,758.27,759.02,3040500,759.02
2013-02-01,758.20,776.60,758.10,775.60,3746100,775.60
All the way to...
2004-08-19,100.00,104.06,95.96,100.34,22351900,100.34
But, what I expect is for Debug.WriteLine(line) to output one line at a time in the For Each line in lines loop. So I would expect the first output to be Date,Open,High,Low,Close,Volume,Adj Close and the next output to be 2013-02-05,761.13,771.11,759.47,765.74,1870700,765.74. I expect this to happen 2,131 times until the last output is 2004-08-19,100.00,104.06,95.96,100.34,22351900,100.34
You could loop through the lines and call String.Split to parse the columns in each line, for instance:
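' Split on both CRLF and LF: the downloaded text may use bare LF line endings,
' in which case splitting on Environment.NewLine alone yields a single element.
Dim lines() As String = strBuffer.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
For Each line As String In lines
Dim columns() As String = line.Split(","c)
Dim dateText As String = columns(0) ' e.g. "2012-10-01"
Dim openText As String = columns(1) ' e.g. "759.05"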
Next
However, sometimes CSV isn't that simple. For instance, a cell in a spreadsheet could contain a comma character, in which case it would be represented in CSV like this:
example cell 1,"example, with comma",example cell 3
To make sure you're properly handling all possibilities, I'd recommend using the TextFieldParser class. For instance:
' Requires Imports Microsoft.VisualBasic.FileIO and Imports System.IO.
Using parser As New TextFieldParser(New StringReader(strBuffer))
parser.TextFieldType = FieldType.Delimited
parser.SetDelimiters(",")
While Not parser.EndOfData
Try
Dim columns As String() = parser.ReadFields()
Dim dateText As String = columns(0) ' first field: the date
Dim openText As String = columns(1) ' second field: the Open value
Catch ex As MalformedLineException
' Handle the invalid formatting error
End Try
End While
End Using
I would break it up into a List(Of String()), with each row being a new entry in the list.
Then loop through the list and look at Value(0).
If Value(0) = MyValue, then Open = Value(1).
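The List(Of String()) idea above could be sketched roughly as follows. This is only an illustration: the module name OpenLookup and the functions ParseRows and FindOpen are hypothetical, the sample text reuses rows quoted in the question, and the code assumes simple CSV without quoted commas. It splits on both CRLF and LF, because text downloaded from a web server may not use the Windows line ending.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports Microsoft.VisualBasic

Module OpenLookup
    ' Splits the CSV text into rows of fields. Both CRLF and LF are accepted
    ' as line endings, and blank lines are dropped.
    Function ParseRows(csvText As String) As List(Of String())
        Dim rows As New List(Of String())
        Dim separators() As String = New String() {vbCrLf, vbLf}
        For Each line As String In csvText.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            rows.Add(line.Split(","c))
        Next
        Return rows
    End Function

    ' Returns the Open value (field 1) of the first row whose date (field 0)
    ' equals dateText, or Nothing when no row matches or the value is not numeric.
    Function FindOpen(rows As List(Of String()), dateText As String) As Double?
        For Each fields As String() In rows
            If fields.Length > 1 AndAlso fields(0).Trim() = dateText Then
                Dim value As Double
                If Double.TryParse(fields(1), NumberStyles.Float, CultureInfo.InvariantCulture, value) Then
                    Return value
                End If
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        Dim sample As String = "Date,Open,High,Low,Close,Volume,Adj Close" & vbLf &
                               "2013-02-05,761.13,771.11,759.47,765.74,1870700,765.74" & vbLf &
                               "2012-10-01,759.05,765,756.21,761.78,3168000,761.78" & vbLf
        Dim rows As List(Of String()) = ParseRows(sample)
        Dim open As Double? = FindOpen(rows, "2012-10-01")
        If open.HasValue Then
            Console.WriteLine(open.Value.ToString(CultureInfo.InvariantCulture))
        Else
            Console.WriteLine("No row found for that date")
        End If
    End Sub
End Module
```

Parsing with CultureInfo.InvariantCulture is a deliberate choice here so that "759.05" is read the same way regardless of the machine's regional settings.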
You can use String.Split and this linq query:
Dim Year As Int32 = 2012
Dim Month As Int32 = 10
Dim searchMonth = New Date(Year, Month, 1)
' Split on CRLF and LF so that text with bare LF line endings is still broken into lines.
Dim lines = strBuffer.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
Dim dt As Date
Dim open As Double
Dim opens = From line In lines
Let tokens = line.Split({","c}, StringSplitOptions.RemoveEmptyEntries)
Where tokens.Length > 1 AndAlso Date.TryParse(tokens(0), dt) AndAlso dt.Date = searchMonth AndAlso Double.TryParse(tokens(1), open)
If opens.Any() Then
open = Double.Parse(opens.First().tokens(1))
End If