Trimming duplicate rows in a DataGridView, keeping only the most recent one - vb.net

I'm using Visual Studio 2008. I have a DataGridView with multiple columns, the last of which contains a date and time value.
Many rows are identical except for their date column.
I want to remove the duplicate rows from the DataGridView, keeping only the most recent row of each group based on the date column.
I have something like this:
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 23:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 21:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 22:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 20:11:59
Administrator,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 11:11:59
Everyone ,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 17:11:59
Everyone ,192.168.137.221,2,file://C:\WMPub\WMRoot\industrial.wmv , 07.Jul.2014 - 14:11:59
The output I want should look like this:
Administrator 192.168.137.221 2 file://C:\WMPub\WMRoot\industrial.wmv 07.Jul.2014 - 23:11:59
Everyone 192.168.137.221 2 file://C:\WMPub\WMRoot\industrial.wmv 07.Jul.2014 - 17:11:59
....
Please treat "," as the column separator (I don't know how to draw a table here, sorry).
I have this snippet that removes duplicate rows, but it does not preserve the latest entry:
    Public Function RemoveDuplicateRows(ByVal dTable As DataTable, ByVal colName As String) As DataTable
        Dim hTable As New Hashtable()
        Dim duplicateList As New ArrayList()
        For Each dtRow As DataRow In dTable.Rows
            If hTable.Contains(dtRow(colName)) Then
                duplicateList.Add(dtRow)
            Else
                hTable.Add(dtRow(colName), String.Empty)
            End If
        Next
        For Each dtRow As DataRow In duplicateList
            dTable.Rows.Remove(dtRow)
        Next
        Return dTable
    End Function
What should I do?
Thanks in advance.

Here is some code that illustrates the approach:
    Dim dict As New Dictionary(Of String, DataRow)
    For Each dtRow As DataRow In dTable.Rows
        ' Build the key from every column except the date column.
        Dim key As String = dtRow("column1").ToString() & "," & dtRow("column2").ToString() ' & etc.
        Dim dictRow As DataRow = Nothing
        If dict.TryGetValue(key, dictRow) Then
            ' Check and update the date.
            ' You can skip this part if your data is already sorted newest-first.
            If CDate(dtRow("dateColumn")) > CDate(dictRow("dateColumn")) Then
                dictRow("dateColumn") = dtRow("dateColumn")
            End If
        Else
            dict.Add(key, dtRow)
        End If
    Next
In the end, dict contains the rows you need; you can get them via dict.Values.ToArray().
EDIT: I found the error: dictRow should be dtRow in the above code (now fixed), so it should work. Here is a full, self-contained example (console app), since I wrote it anyway. Focus on RemoveDuplicates; the rest is just prep work:
    Imports System.Data
    Imports System.IO
    Imports System.Reflection

    Sub Main()
        Dim dt As New DataTable
        With dt.Columns
            .Add("PublishingPoint")
            .Add("Username")
            .Add("IP")
            .Add("Status")
            .Add("Req URL")
            .Add("Last seen", GetType(Date))
        End With
        'this populates the initial data table, use your method
        Dim _assembly As Assembly = Assembly.GetExecutingAssembly()
        Using _textStreamReader As New StreamReader(_assembly.GetManifestResourceStream("ConsoleApplication16.data.csv"))
            While Not _textStreamReader.EndOfStream
                Dim sLine As String = _textStreamReader.ReadLine().TrimEnd
                If String.IsNullOrEmpty(sLine) Then Exit While
                Dim values() As String = sLine.Split(","c)
                Dim newRow As DataRow = dt.NewRow
                For iColumnIndex As Integer = 0 To dt.Columns.Count - 1
                    Dim columnName As String = dt.Columns(iColumnIndex).ColumnName
                    newRow.Item(columnName) = values(iColumnIndex)
                Next
                dt.Rows.Add(newRow)
            End While
        End Using
        Console.WriteLine("Old count: " & dt.Rows.Count)
        Dim newDt As DataTable = RemoveDuplicates(dt, "Last seen")
        Console.WriteLine("New count: " & newDt.Rows.Count)
        Console.ReadLine()
    End Sub
    Private Function RemoveDuplicates(ByVal dt As DataTable, ByVal colName As String) As DataTable
        ' Every column except the date column takes part in the duplicate key.
        Dim keyColumnNames As New List(Of String)
        Dim exceptColumnsHash As New HashSet(Of String)(New String() {colName})
        For Each col As DataColumn In dt.Columns
            Dim columnName As String = col.ColumnName
            If Not exceptColumnsHash.Contains(columnName) Then
                keyColumnNames.Add(columnName)
            End If
        Next
        Dim dict As New Dictionary(Of String, DataRow)
        For Each dtRow As DataRow In dt.Rows
            Dim keyColumnValues As New List(Of String)
            For Each keyColumnName As String In keyColumnNames
                keyColumnValues.Add(dtRow.Item(keyColumnName).ToString())
            Next
            Dim key As String = String.Join(",", keyColumnValues.ToArray())
            Dim dictRow As DataRow = Nothing
            If dict.TryGetValue(key, dictRow) Then
                ' Same key seen before: keep the later of the two dates on the stored row.
                If CDate(dtRow(colName)) > CDate(dictRow(colName)) Then
                    dictRow(colName) = dtRow(colName)
                End If
            Else
                dict.Add(key, dtRow)
            End If
        Next
        ' Copy the surviving rows into a new table with the same schema.
        Dim dtReturn As DataTable = dt.Clone
        For Each dtRow As DataRow In dict.Values
            dtReturn.ImportRow(dtRow)
        Next
        Return dtReturn
    End Function
To make this code run, you need to manually add the data file (data.csv in this example) to the project and set its Build Action to "Embedded Resource".
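The sample rows in the question store the timestamp as text such as 07.Jul.2014 - 23:11:59, which the default string-to-Date conversion may not accept. The following is a minimal sketch, not the answerer's code, that parses that format explicitly and keeps the newest row per key. The names LatestRowsSketch, ParseStamp and KeepLatest are hypothetical, and the format string is an assumption derived only from the sample rows.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Globalization

Module LatestRowsSketch
    ' Parses timestamps shaped like "07.Jul.2014 - 23:11:59" (format assumed from the sample rows).
    Function ParseStamp(ByVal text As String) As Date
        Return Date.ParseExact(text.Trim(), "dd.MMM.yyyy - HH:mm:ss", CultureInfo.InvariantCulture)
    End Function

    ' Returns a copy of dt that holds, for each combination of the non-date columns, only the newest row.
    Function KeepLatest(ByVal dt As DataTable, ByVal dateColumn As String) As DataTable
        Dim newest As New Dictionary(Of String, DataRow)
        For Each row As DataRow In dt.Rows
            ' The key is every column except the date column, trimmed so "Everyone " matches "Everyone".
            Dim parts As New List(Of String)
            For Each col As DataColumn In dt.Columns
                If col.ColumnName <> dateColumn Then parts.Add(row(col).ToString().Trim())
            Next
            Dim key As String = String.Join("|", parts.ToArray())

            Dim seen As DataRow = Nothing
            If Not newest.TryGetValue(key, seen) Then
                newest(key) = row
            ElseIf ParseStamp(row(dateColumn).ToString()) > ParseStamp(seen(dateColumn).ToString()) Then
                newest(key) = row   ' replace the stored row instead of editing it in place
            End If
        Next

        Dim result As DataTable = dt.Clone()
        For Each row As DataRow In newest.Values
            result.ImportRow(row)
        Next
        Return result
    End Function
End Module
```

Unlike the dictionary loop above, this variant swaps the stored row rather than overwriting its date, so the source table is left unmodified. If the grid is bound to the table, assigning the returned table to DataSource should refresh it.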

Related

convert csv data to DataTable in VB.net, capturing column names from row 0

I've adapted the code from @tim-schmelter's answer to the question "convert csv data to DataTable in VB.net" (see below).
I would like to parse the column titles from row 0 of the CSV file:
DT|Meter Number|Customer Account Number|Serial Number|Port...
but I'm not having any luck figuring out how to do this. Any suggestions would be much appreciated.
    Public Function csvToDatatable_2(ByVal filename As String, ByVal separator As String)
        '////////////////////////////////////////
        'Reads a selected txt or csv file into a datatable
        'based on code from http://stackoverflow.com/questions/11118678/convert-csv-data-to-datatable-in-vb-net
        '////////////////////////////////////////
        Dim dt As System.Data.DataTable
        Try
            dt = New System.Data.DataTable
            Dim lines = IO.File.ReadAllLines(filename)
            Dim colCount = lines.First.Split(separator).Length
            For i As Int32 = 1 To colCount
                dt.Columns.Add(New DataColumn("Column_" & i, GetType(String)))
            Next
            For Each line In lines
                Dim objFields = From field In line.Split(separator)
                Dim newRow = dt.Rows.Add()
                newRow.ItemArray = objFields.ToArray()
            Next
        Catch ex As Exception
            Main.Msg2User(ex.Message.ToString)
            Return Nothing
        End Try
        Return dt
    End Function
Just loop through all the lines of the file, and use a Boolean to detect the first row.
    Public Function csvToDatatable_2(ByVal filename As String, ByVal separator As String) As DataTable
        Dim dt As New System.Data.DataTable
        Dim firstLine As Boolean = True
        If IO.File.Exists(filename) Then
            Using sr As New IO.StreamReader(filename)
                While Not sr.EndOfStream
                    If firstLine Then
                        ' The first line supplies the column names.
                        firstLine = False
                        Dim cols = sr.ReadLine.Split(separator.ToCharArray())
                        For Each col In cols
                            dt.Columns.Add(New DataColumn(col, GetType(String)))
                        Next
                    Else
                        ' Every later line is a data row.
                        Dim data() As String = sr.ReadLine.Split(separator.ToCharArray())
                        dt.Rows.Add(data)
                    End If
                End While
            End Using
        End If
        Return dt
    End Function
Here is a hybrid of the two solutions above, with a few other changes:
    Public Shared Function FileToTable(ByVal fileName As String, ByVal separator As String, ByVal isFirstRowHeader As Boolean) As DataTable
        Dim result As DataTable = Nothing
        Try
            ' ArgumentException takes the message first, then the parameter name.
            If Not System.IO.File.Exists(fileName) Then Throw New ArgumentException(String.Format("The file does not exist : {0}", fileName), "fileName")
            Dim dt As New System.Data.DataTable
            Dim isFirstLine As Boolean = True
            Using sr As New System.IO.StreamReader(fileName)
                While Not sr.EndOfStream
                    Dim data() As String = sr.ReadLine.Split(New String() {separator}, StringSplitOptions.None)
                    If isFirstLine Then
                        If isFirstRowHeader Then
                            For Each columnName As String In data
                                dt.Columns.Add(New DataColumn(columnName, GetType(String)))
                            Next
                            isFirstLine = True ' Signal that this row is NOT to be considered as data.
                        Else
                            For i As Integer = 1 To data.Length
                                dt.Columns.Add(New DataColumn(String.Format("Column_{0}", i), GetType(String)))
                            Next
                            isFirstLine = False ' Signal that this row IS to be considered as data.
                        End If
                    End If
                    If Not isFirstLine Then
                        dt.Rows.Add(data)
                    End If
                    isFirstLine = False ' All subsequent lines shall be considered as data.
                End While
            End Using
            result = dt ' Without this assignment the function would always return Nothing.
        Catch ex As Exception
            Throw New Exception(String.Format("{0}.CSVToDatatable Error", GetType(Table).FullName), ex)
        End Try
        Return result
    End Function

VB: Count number of columns in csv

So, quite simple.
I am importing CSVs into a DataGridView, but the CSV can have a variable number of columns.
For 3 columns, I use this code:
    Dim sr As New IO.StreamReader("E:\test.txt")
    Dim dt As New DataTable
    Dim newline() As String = sr.ReadLine.Split(";"c)
    dt.Columns.AddRange({New DataColumn(newline(0)), _
                         New DataColumn(newline(1)), _
                         New DataColumn(newline(2))})
    While (Not sr.EndOfStream)
        newline = sr.ReadLine.Split(";"c)
        Dim newrow As DataRow = dt.NewRow
        newrow.ItemArray = {newline(0), newline(1), newline(2)}
        dt.Rows.Add(newrow)
    End While
    DG1.DataSource = dt
This works perfectly. But how do I count the number of elements in "newline"?
Can I get a count of them somehow? The other example code I have found doesn't produce column headers.
If my CSV file has 5 columns, I would need an AddRange of 5 instead of 3, and so on.
Thanks in advance.
    Dim sr As New IO.StreamReader(path)
    Dim dt As New DataTable
    Dim newline() As String = sr.ReadLine.Split(","c)
    ' MsgBox(newline.Count)
    ' dt.Columns.AddRange({New DataColumn(newline(0)),
    '                      New DataColumn(newline(1)),
    '                      New DataColumn(newline(2))})
    ' Add one column per header field instead of hard-coding three.
    For i As Integer = 0 To newline.Length - 1
        dt.Columns.Add(New DataColumn(newline(i)))
    Next
    While (Not sr.EndOfStream)
        newline = sr.ReadLine.Split(","c)
        Dim newrow As DataRow = dt.NewRow
        newrow.ItemArray = newline ' copy every field, however many there are
        dt.Rows.Add(newrow)
    End While
    dgv.DataSource = dt
Columns and item values can be added to a DataTable individually, using dt.Columns.Add and newrow.Item, so that these can be done in a loop instead of hard-coding for a specific number of columns. e.g. (this code assumes Option Infer On, so adjust as needed):
    Public Function CsvToDataTable(csvName As String, Optional delimiter As Char = ","c) As DataTable
        Dim dt = New DataTable()
        For Each line In IO.File.ReadLines(csvName)
            If dt.Columns.Count = 0 Then
                ' First line: one column per header field.
                For Each part In line.Split({delimiter})
                    dt.Columns.Add(New DataColumn(part))
                Next
            Else
                Dim row = dt.NewRow()
                Dim parts = line.Split({delimiter})
                For i = 0 To parts.Length - 1
                    row(i) = parts(i)
                Next
                dt.Rows.Add(row)
            End If
        Next
        Return dt
    End Function
You could then use it like:
    Dim dt = CsvToDataTable("E:\test.txt", ";"c)
    DG1.DataSource = dt
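Splitting each line on the delimiter breaks down when a field itself contains the delimiter inside quotes. As a hedged alternative, swapping in the TextFieldParser class from Microsoft.VisualBasic.FileIO in place of String.Split, the same column-count-agnostic loop might look like the sketch below. The names CsvParserSketch and CsvToTableQuoted are hypothetical, and the sketch assumes the header names are unique, since DataTable rejects duplicate column names.

```vbnet
Imports System
Imports System.Data
Imports Microsoft.VisualBasic.FileIO

Module CsvParserSketch
    ' Reads a delimited file into a DataTable, taking the column names from the first line.
    Function CsvToTableQuoted(ByVal csvName As String, ByVal delimiter As String) As DataTable
        Dim dt As New DataTable()
        Using parser As New TextFieldParser(csvName)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(delimiter)
            parser.HasFieldsEnclosedInQuotes = True   ' a quoted field may contain the delimiter
            While Not parser.EndOfData
                Dim fields() As String = parser.ReadFields()
                If dt.Columns.Count = 0 Then
                    ' Header line: however many fields it has, that is the column count.
                    For Each header As String In fields
                        dt.Columns.Add(header)
                    Next
                Else
                    Dim row As DataRow = dt.NewRow()
                    ' Extra trailing fields are ignored; missing ones stay DBNull.
                    For i As Integer = 0 To Math.Min(fields.Length, dt.Columns.Count) - 1
                        row(i) = fields(i)
                    Next
                    dt.Rows.Add(row)
                End If
            End While
        End Using
        Return dt
    End Function
End Module
```

It would be called the same way as the function above, for example CsvToTableQuoted("E:\test.txt", ";").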

Storing values in a string array using a For Each loop

I want to store each "zone_check_value" in an array of strings. While inserting, I want to check the array to see whether the next value is repeated, i.e. already present as a duplicate.
Example.
1st Example
1st loop = zone_check_value = ZD1/01/2014
2nd loop = zone_check_value = ZD1/01/2014
2nd Example
1st loop = zone_check_value = ZD1/01/2014
2nd loop = zone_check_value = ZD2/02/2014
3rd loop = zone_check_value = ZD1/01/2014
Code:
    For Each dt As DataTable In xls.Tables
        Dim array_of_string As String() 'i want to put the value in here
        For Each dr As DataRow In dt.Rows
            Dim zone_destination As String = dr(2).ToString
            Dim affected_date As String = dr(7).ToString
            Dim zone_check_value = zone_destination + affected_date
            ''''''How can i store zone_check_value in a string array?
        Next
    Next
EDIT
What if I add a Select Case to my loop? The array_of_string value becomes Nothing (null). I need to check the current value against array_of_string.
Example code:
    For Each dt As DataTable In xls.Tables
        Dim array_of_string As String() 'i want to put the value in here
        Select Case dt.TableName
            Case "Sheet1"
                For Each dr As DataRow In dt.Rows
                    Dim zone_destination As String = dr(2).ToString
                    Dim affected_date As String = dr(7).ToString
                    Dim zone_check_value = zone_destination + affected_date
                    ''''''How can i store zone_check_value in a string array?
                Next
            Case "Sheet 2"
                For Each dr As DataRow In dt.Rows
                    Dim check_value As Boolean = array_of_string.Contains(dr(0).ToString)
                    'but when i got in sheet 2 the array_of_string is null
                Next
        End Select
    Next
The easiest way is to use a List(Of String).
    For Each dt As DataTable In xls.Tables
        Dim array_of_string As List(Of String) = New List(Of String) 'i want to put the value in here
        For Each dr As DataRow In dt.Rows
            Dim zone_destination As String = dr(2).ToString
            Dim affected_date As String = dr(7).ToString
            Dim zone_check_value = zone_destination & affected_date
            ''''''How can i store zone_check_value in a string array?
            array_of_string.Add(zone_check_value)
        Next
        ''' Now if you really need it in array form you can convert it via:
        ''' Dim values() As String = array_of_string.ToArray()
    Next
You might even consider:
    For Each dt As DataTable In xls.Tables
        ' DataRowCollection has no ForEach method, so project the rows with LINQ instead.
        Dim values As List(Of String) = dt.Rows.Cast(Of DataRow)().Select(Function(item) item(2).ToString & item(7).ToString).ToList()
        ''' Now do something with values.
    Next
Either way, make sure you always use the string concatenation operator & to concatenate strings. The arithmetic addition operator + can cause problems when used on strings: with Option Strict Off, "1" + 2 attempts numeric addition rather than concatenation.
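For the duplicate check itself, and for the follow-up about the collection being Nothing by the time the second sheet is processed, one possible approach is to declare the collection once, outside the loop over the tables, and to use a HashSet(Of String), whose Add method returns False when the value is already present. The sketch below is an illustration under assumptions: the names DuplicateCheckSketch and FindRepeatedKeys are hypothetical, and the column positions are passed in rather than hard-coded to 2 and 7.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module DuplicateCheckSketch
    ' Returns the zone/date keys that occur more than once across all tables of the DataSet.
    Function FindRepeatedKeys(ByVal xls As DataSet, ByVal zoneIndex As Integer, ByVal dateIndex As Integer) As List(Of String)
        ' Declared outside both loops, so the values survive from one sheet to the next.
        Dim seen As New HashSet(Of String)
        Dim repeated As New List(Of String)
        For Each dt As DataTable In xls.Tables
            For Each dr As DataRow In dt.Rows
                Dim key As String = dr(zoneIndex).ToString() & dr(dateIndex).ToString()
                ' Add returns False when the key was stored by an earlier row.
                If Not seen.Add(key) AndAlso Not repeated.Contains(key) Then
                    repeated.Add(key)
                End If
            Next
        Next
        Return repeated
    End Function
End Module
```

With the two examples from the question, both would report ZD1/01/2014 as repeated, the second one on its third row.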

Converting Gridview to DataTable in VB.NET

I am using this function to create a DataTable from GridViews. It works fine with GridViews that have AutoGenerateColumns = False and contain bound fields or template fields. But if I use it with GridViews that have AutoGenerateColumns = True, I only get back an empty DataTable. It seems the GridView view state has been lost or something. The GridView is bound in Page_Load inside If Not IsPostBack. I can't think of anything else to make it work. I hope someone can help me.
Thanks,
    Public Shared Function GridviewToDataTable(gv As GridView) As DataTable
        Dim dt As New DataTable
        For Each col As DataControlField In gv.Columns
            dt.Columns.Add(col.HeaderText)
        Next
        For Each row As GridViewRow In gv.Rows
            Dim nrow As DataRow = dt.NewRow
            Dim z As Integer = 0
            For Each col As DataControlField In gv.Columns
                nrow(z) = row.Cells(z).Text.Replace("&nbsp;", "")
                z += 1
            Next
            dt.Rows.Add(nrow)
        Next
        Return dt
    End Function
Here is a slight modification to your function above. If the AutoGenerateDeleteButton, AutoGenerateEditButton or AutoGenerateSelectButton flags are set, the values for the fields are offset by one. The following code accounts for that:
    Public Shared Function GridviewToDataTable(ByVal PassedGridView As GridView, ByRef Error_Message As String) As DataTable
        '-----------------------------------------------
        'Dim Tbl_StackSheets = New Data.DataTable
        'Tbl_StackSheets = ReportsCommonClass.GridviewToDataTable(StackSheetsGridView)
        '-----------------------------------------------
        Dim dt As New DataTable
        Dim ColInd As Integer = 0
        Dim ValOffset As Integer
        Try
            For Each col As DataControlField In PassedGridView.Columns
                dt.Columns.Add(col.HeaderText)
            Next
            ' An auto-generated command column shifts the data cells one position to the right.
            If (PassedGridView.AutoGenerateDeleteButton Or PassedGridView.AutoGenerateEditButton Or PassedGridView.AutoGenerateSelectButton) Then
                ValOffset = 1
            Else
                ValOffset = 0
            End If
            For Each row As GridViewRow In PassedGridView.Rows
                Dim NewDataRow As DataRow = dt.NewRow
                ColInd = 0
                For Each col As DataControlField In PassedGridView.Columns
                    NewDataRow(ColInd) = row.Cells(ColInd + ValOffset).Text.Replace("&nbsp;", "")
                    ColInd += 1
                Next
                dt.Rows.Add(NewDataRow)
            Next
            Error_Message = Nothing
        Catch ex As Exception
            Error_Message = "GridviewToDataTable: " & ex.Message
        End Try
        Return dt
    End Function
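Regarding the original symptom: the GridView.Columns collection holds only explicitly declared fields, so with AutoGenerateColumns = True it is empty and both loops above have nothing to iterate. A possible workaround, sketched here under assumptions rather than taken from either post, is to read the header names from HeaderRow and the values from the rendered cells. The names GridViewExportSketch and RenderedGridToDataTable are hypothetical. The sketch assumes plain text cells (headers rendered as sort links or template fields leave Text empty) and it does not skip a command-button column.

```vbnet
Imports System
Imports System.Data
Imports System.Web
Imports System.Web.UI.WebControls

Module GridViewExportSketch
    ' Builds a DataTable from the rendered header and data cells, without using gv.Columns.
    Function RenderedGridToDataTable(ByVal gv As GridView) As DataTable
        Dim dt As New DataTable()
        ' HeaderRow is Nothing when the grid has not been bound or ShowHeader is False.
        If gv.HeaderRow Is Nothing Then Return dt
        For i As Integer = 0 To gv.HeaderRow.Cells.Count - 1
            Dim header As String = HttpUtility.HtmlDecode(gv.HeaderRow.Cells(i).Text).Trim()
            ' Fall back to a positional name for empty or repeated headers.
            If header.Length = 0 OrElse dt.Columns.Contains(header) Then header = "Column_" & (i + 1)
            dt.Columns.Add(header)
        Next
        For Each row As GridViewRow In gv.Rows
            Dim newRow As DataRow = dt.NewRow()
            For i As Integer = 0 To Math.Min(row.Cells.Count, dt.Columns.Count) - 1
                ' HtmlDecode turns "&nbsp;" into a non-breaking space, which Trim then removes.
                newRow(i) = HttpUtility.HtmlDecode(row.Cells(i).Text).Trim()
            Next
            dt.Rows.Add(newRow)
        Next
        Return dt
    End Function
End Module
```

This only sees rows that exist on the current request, so the grid still has to be bound (or restored from view state) before the function is called.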

How to add List as value in Hashtable

In VB.NET, if I have a Hashtable whose key is an Integer and whose value is a list of Integers, how do I append integers to the value of a given key?
I have tried, but each time I find only the last integer added (the list contains only the last item added).
Here is my code, where dt is a DataTable object:
    Dim dt = report.getEvaluationReportByObjectiveGroupId(29)
    Dim data As New Hashtable()
    Dim dataEntry As DictionaryEntry
    Dim res As String
    For Each row As DataRow In dt.Rows
        Dim strYear = row.Item("Year")
        Dim strData = row.Item("EmpCount")
        If data.ContainsKey(strYear) Then
            Dim newCountArr As List(Of Int32) = DirectCast(data(strYear), List(Of Int32))
            ' newCountArr.AddRange(data(strYear))
            newCountArr.Add(strData)
            ' data.Remove(strYear)
            ' data.Add(strYear, newCountArr)
        Else
            Dim countArr As New List(Of Integer)
            countArr.Add(strData)
            data.Add(strYear, countArr)
        End If
        ' data.Add(strYear, strData)
    Next row
I would suggest using the strongly typed Dictionary(Of Int32, List(Of Int32)) instead; it works similarly. But anyway, here's the Hashtable approach:
    Dim table = New Hashtable
    Dim list = New List(Of Int32)
    For i = 1 To 999
        list.Add(i)
    Next
    table.Add(1, list)
    ' somewhere else you want to read the list for a given key (here 1) '
    Dim list1 As List(Of Int32) = DirectCast(table(1), List(Of Int32))
    list1.Add(1000) ' add another integer to the end of the list '
    ' note: you don't need to add the list to the Hashtable again '
Edit: Since you've posted your code, here's the corrected version:
    For Each row As DataRow In dt.Rows
        Dim strYear = row.Field(Of Int32)("Year")
        Dim strData = row.Field(Of Int32)("EmpCount")
        Dim list As List(Of Int32)
        If data.ContainsKey(strYear) Then
            list = DirectCast(data(strYear), List(Of Int32))
        Else
            list = New List(Of Int32)
            data.Add(strYear, list)
        End If
        list.Add(strData)
    Next row
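For comparison, the strongly typed Dictionary(Of Int32, List(Of Int32)) suggested above might be used roughly as follows. This is a minimal sketch: the names GroupByYearSketch and GroupCountsByYear are hypothetical, and it assumes the "Year" and "EmpCount" columns hold values that Convert.ToInt32 accepts.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module GroupByYearSketch
    ' Groups the EmpCount values of a table by Year; no casts are needed when reading the lists back.
    Function GroupCountsByYear(ByVal dt As DataTable) As Dictionary(Of Int32, List(Of Int32))
        Dim data As New Dictionary(Of Int32, List(Of Int32))
        For Each row As DataRow In dt.Rows
            Dim yearKey As Int32 = Convert.ToInt32(row("Year"))
            Dim counts As List(Of Int32) = Nothing
            ' TryGetValue does the lookup once; create the list only for a year not seen before.
            If Not data.TryGetValue(yearKey, counts) Then
                counts = New List(Of Int32)
                data.Add(yearKey, counts)
            End If
            counts.Add(Convert.ToInt32(row("EmpCount")))
        Next
        Return data
    End Function
End Module
```

Because the dictionary stores a reference to each list, appending to the list is enough, just as with the Hashtable; the entry never has to be removed and added again.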