Find differences in datatables with linq when datatables have different columns - vb.net

The following function returns the differences between two DataTables, but it only works if both tables have the same columns.
In my case the columns differ, except for "dictkey", which exists in both DataTables.
How can I get my function to return only the rows whose "dictkey" value is different or missing in the other table, regardless of the other columns?
Public Function Check_Desparity(ByVal dtTestSteps As DataTable, ByVal dtLimits As DataTable) As IEnumerable(Of DataRow)
Dim diff = dtLimits.AsEnumerable.Except(dtTestSteps.AsEnumerable, DataRowComparer.Default)
Return diff
End Function

You can create a custom IEqualityComparer that compares only the column you need and pass it to the Except overload that takes an IEqualityComparer, use ExceptBy from MoreLINQ, or write your own ExceptBy equivalent:
Public Function Check_Desparity(Byval dtTestSteps As DataTable, Byval dtLimits As DataTable) As IEnumerable(Of DataRow)
Dim dtTestStepsHash As New HashSet(Of String)(dtTestSteps.AsEnumerable.Select(Function(dr) CType(dr("dictkey"), String)))
Return dtLimits.AsEnumerable.Where(Function(dr) Not dtTestStepsHash.Contains(CType(dr("dictkey"),String)))
End Function
I assumed column dictkey was of type String, but you can just put in the correct type.
This functions as an ExceptBy operation.
The first line in the function creates a HashSet of all the "dictkey" values you want to exclude from the answer, because HashSet provides an efficient (near-constant time) Contains operation.
The second line returns those rows from the dtLimits DataTable that have a "dictkey" value that is not contained in the dtTestSteps DataTable, as determined by checking in the dtTestStepsHash for each row (dr) and excluding those values that are in the HashSet.
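The custom IEqualityComparer route mentioned at the start of this answer could look roughly like the sketch below. The class name DictKeyComparer and the module name DisparityCheck are made up for illustration, and the code assumes "dictkey" is a String column in both tables.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

' Hypothetical comparer: two rows count as equal when their "dictkey" values match.
Public Class DictKeyComparer
    Implements IEqualityComparer(Of DataRow)

    Public Overloads Function Equals(x As DataRow, y As DataRow) As Boolean _
        Implements IEqualityComparer(Of DataRow).Equals
        Return String.Equals(x.Field(Of String)("dictkey"), y.Field(Of String)("dictkey"))
    End Function

    ' Must agree with Equals: rows with the same "dictkey" get the same hash code.
    Public Overloads Function GetHashCode(row As DataRow) As Integer _
        Implements IEqualityComparer(Of DataRow).GetHashCode
        Dim key = row.Field(Of String)("dictkey")
        Return If(key Is Nothing, 0, key.GetHashCode())
    End Function
End Class

Public Module DisparityCheck
    ' Rows of dtLimits whose "dictkey" does not occur in dtTestSteps.
    Public Function Check_Desparity(dtTestSteps As DataTable, dtLimits As DataTable) As IEnumerable(Of DataRow)
        Return dtLimits.AsEnumerable().Except(dtTestSteps.AsEnumerable(), New DictKeyComparer())
    End Function
End Module
```

One difference from the HashSet version: Except is a set operation, so if dtLimits contains several rows with the same "dictkey", only the first of them is returned.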

Related

Pass single column of dataTable to function in vb.net

I have a dataTable with 5 rows and 2 columns(Id,Name). I need to pass the 5 rows with only one column(Id) of the dataTable to a function parameter which is a dataTable.
This is what I tried
myfunction(dt.Rows.Item("Id")
Public function myfunction (dt as dataTable)
// Some code
End Function
But this is not working. Why is it so? Even if a column is excluded the dataTable still remains as a dataTable. How can I pass the "ID" column without including the "Name" column to this function as a parameter.
There are several issues with what you provided.
First, your function call is missing a closing parenthesis.
myfunction(dt.Rows.Item("Id"))
Second, your function is expecting a datatable, but you are passing a cell (and also not specifying which row, so this wouldn't work)...
If you truly want your function to use a datatable, you need to just pass dt like:
myFunction(dt)
If you want to pass the cell value instead, you can change your function definition to take a string or whatever datatype is within that column, and pass the value like:
myFunction(dt.Rows(ROWNUMBER).Item("Id"))
Functions need a DataType and a Return statement.
Your Function is expecting a DataTable and you are passing a DataRow.
dt.Rows.Item takes an Integer, not a String.
You don't Dim a Function, you call it.
Stack Overflow asks for a Minimum, Complete and Reproducible example of your problem. Your code is not that.
I gave your Function a name and something trivial to do so I could demonstrate the proper syntax. If you don't need the whole DataTable, then don't pass it to the function. I am just passing the values in the ID column as a List(Of T). If your Function is not under your control and it wants a DataTable, then just pass dt.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim lst = dt.Rows.OfType(Of DataRow)().Select(Function(dr) dr.Field(Of Integer)("ID")).ToList()
Dim MaxID = GetMaxID(lst)
End Sub
Public Function GetMaxID(lst As List(Of Integer)) As Integer
Dim MaxI = lst.Max
Return MaxI
End Function
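If the function really must receive a DataTable that contains only the Id column, one option is to copy that column into a new table with DataView.ToTable. The sketch below is an illustration only; the helper name OnlyColumns and the sample data are made up.

```vbnet
Imports System
Imports System.Data

Public Module SingleColumnDemo
    ' Returns a new DataTable containing only the named columns of the source.
    Public Function OnlyColumns(source As DataTable, ParamArray columnNames As String()) As DataTable
        ' False = keep duplicate rows; the source table itself is not modified.
        Return source.DefaultView.ToTable(False, columnNames)
    End Function

    Public Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("Id", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add(1, "Alice")
        dt.Rows.Add(2, "Bob")

        Dim idsOnly As DataTable = OnlyColumns(dt, "Id")
        Console.WriteLine(idsOnly.Columns.Count) ' prints 1
        Console.WriteLine(idsOnly.Rows.Count)    ' prints 2
    End Sub
End Module
```

The result can then be passed as myfunction(OnlyColumns(dt, "Id")). Note that this creates a copy, so changes made inside the function do not affect the original table.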

Dataset with Datatable

I am trying to check if my dictionary contains values in my dataset.datatable and if its quantities in the second column of the dataset are less than or greater than the quantities in my datatable. I tried using the SELECT method but it doesn’t seem to work, I get the error BC30469 reference to non-shared member requires object reference?
I was just trying to do a simple search in the table first to see if I can even do that..... apparently not. Thanks for the help!
Dim row As DataRow = DataSet.DataTable.Select("ColumnName1 = 'value3'")
If Not row Is Nothing Then
searchedValue = row.Item("ColumnName2")
End If
You could get a dictionary to compare with the one you already have like this (assuming your key is a string and the amount an Int32 and that your dataset contains only one table):
' Build the dictionary directly from the rows of the first table.
Dim myDBDict As Dictionary(Of String, Int32) =
    myDataSet.Tables(0).Rows.Cast(Of DataRow)().ToDictionary(
        Function(r) r.Field(Of String)("MyIDColumn"),
        Function(r) r.Field(Of Int32)("myAmountColumn"))
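Once both dictionaries exist, the quantity comparison itself might be sketched as follows. The function name QuantityDifferences and its parameter names are hypothetical; it only looks at keys present in both dictionaries.

```vbnet
Imports System
Imports System.Collections.Generic

Public Module QuantityCompare
    ' For every key present in both dictionaries, returns (own amount - table amount):
    ' negative = less than the table, zero = equal, positive = greater.
    Public Function QuantityDifferences(mine As Dictionary(Of String, Integer),
                                        fromTable As Dictionary(Of String, Integer)) As Dictionary(Of String, Integer)
        Dim result As New Dictionary(Of String, Integer)()
        For Each pair In mine
            Dim tableAmount As Integer
            ' Keys that do not exist in the table are skipped.
            If fromTable.TryGetValue(pair.Key, tableAmount) Then
                result(pair.Key) = pair.Value - tableAmount
            End If
        Next
        Return result
    End Function
End Module
```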

Can I use a method that returns a list of strings in SSRS report code as the headers in a tablix?

I have a table that needs to contain 50 columns, one for each half hour in the day (+2 for daylight savings). So each column will be HH1, HH2, HH3... HH50.
I have written this piece of code in the report properties code section.
Function GetHH() As List(Of String)
Dim headers As List(Of String) = new List(Of String)
For index As Integer = 1 to 50
headers.Add("HH" & index)
Next
return headers
End Function
Is there a way to use the output of this function as the headers of my tablix? Or will I need to add the headers to some sort of dataset in the database and add it from there?
The column group functionality would be well suited for this. As you mentioned, you would need to write a SQL statement to return these values in a dataset. Then you can set your column group to group on these values. This way your table always gets the right number of columns and you don't have to add them manually.

Using LINQ to find updated rows in DataTable

I'm building an application in VB.NET where I am pushing data from one database to another. The source database is SQL Server and the target is MySQL.
What I am doing is first creating DataTables for each table in each database which I use to do a comparison. I've written the queries in such a way so that the source and target DataTables contain exactly the same columns and values to make the comparison easier.
This side of the application works fine. What I do next is find rows which do not exist in the target database by finding PKs which do not exist. I then insert these new rows into the target database with no problem.
The Problem
What I now need to do is find rows in each table that have been updated, i.e. are not identical to the corresponding rows in the target DataTable. I have tried using Except() as per the example below:
Public Function GetUpdates(ByVal DSDataSet As MSSQLQuery, ByVal AADataSet As MySQLQuery, Optional ByVal PK As String = Nothing) As List(Of DataRow)
' Determines records to be updated in the AADB and returns list of new Rows
' Param DSDataSet - MSSQLQuery Object for source table
' Param AADataSet - MySQLQuery Object for destination table
' Optional Param PK - String of name common columns to treat as PK
' Returns List(Of DataRow) containing rows to update in table
Dim orig = DSDataSet.GetDataset()
Dim origTable = orig.Tables(0).AsEnumerable()
Dim destination = AADataSet.GetDataset()
Dim destinationTable = destination.Tables(0).AsEnumerable()
' Get Records which are not in destination table
Dim ChangedRows = Nothing
If IsNothing(PK) Then
ChangedRows = destinationTable.AsEnumerable().Except(origTable.AsEnumerable(), DataRowComparer.Default)
End If
Dim List As New List(Of DataRow)
For Each addRow In ChangedRows
List.Add(addRow)
Next
Return List
End Function
The trouble is that it ends up simply returning the entire set of source rows.
How can I check for these changed rows? I could always hardcode queries to return what I want but this introduces problems because I need to make comparisons for 15 tables so it would be a complete mess.
Ideally I need a solution where it will take into account the variable number columns from the source tables for comparison against what is essentially an identical target table and simply compare the DataRows for equality.
There should be a corresponding row in the target tables for every source row since the addition of new rows is performed prior to this check for updated rows.
I am also open to using methods other than LINQ to achieve this.
Solution
In the end I implemented a custom comparer to use in the query, as shown below. It first checks whether the first column value (the PK in my case) matches; if it does, it then checks column-wise that everything else matches.
Any discrepancy will set the flag value to FALSE which we return. If there aren't any issues then TRUE will be returned. In this case I used = to compare equality between values rather than Equals() since I'm not concerned about a strict equality.
The resulting set of DataRows is used to UPDATE the database using the first column value (PK) in the WHERE clause.
Imports System.Data
Class MyDataRowComparer
Inherits EqualityComparer(Of DataRow)
Public Overloads Overrides Function Equals(x As DataRow, y As DataRow) As Boolean
If x.Item(0).ToString().Equals(y.Item(0).ToString()) Then
' If PK matches then check column-wise.
Dim Flag As Boolean = True
For Counter As Integer = 0 To x.ItemArray.Count - 1
If Not x.Item(Counter) = y.Item(Counter) Then
Flag = False
End If
Next
Return Flag
Else
' Otherwise don't bother and just skip.
Return False
End If
End Function
...
End Class
class MyDataRowComparer : IEqualityComparer<DataRow>
{
public bool Equals(DataRow x, DataRow y)
{
return x["ColumnName"].Equals(y["ColumnName"]);
// Can add more columns to the Comparison
}
public int GetHashCode(DataRow obj)
{
return obj["ColumnName"].GetHashCode();
// Can add more columns to calculate HashCode
}
}
Now the Except statement will look like this (note that it takes an instance of the comparer, not the type name):
ChangedRows = destinationTable.AsEnumerable()
.Except(origTable.AsEnumerable(), new MyDataRowComparer());
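As an alternative to a custom comparer, here is a sketch in VB.NET that indexes the target rows by key and reports only the source rows whose matching target row differs. The function name GetChangedRows is made up, and the code assumes that keys are unique, that both tables have the same column layout, and that comparing keys by their string form is acceptable.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Public Module ChangedRowsDemo
    ' Returns source rows whose key exists in target but whose values differ.
    Public Function GetChangedRows(source As DataTable, target As DataTable, keyColumn As String) As List(Of DataRow)
        ' Index the target rows by the string form of the key (assumes unique keys).
        Dim targetByKey = target.AsEnumerable().ToDictionary(Function(r) Convert.ToString(r(keyColumn)))
        ' DataRowComparer.Default compares rows value by value, column by column.
        Dim comparer As IEqualityComparer(Of DataRow) = DataRowComparer.Default
        Dim changed As New List(Of DataRow)()
        For Each row As DataRow In source.Rows
            Dim match As DataRow = Nothing
            If targetByKey.TryGetValue(Convert.ToString(row(keyColumn)), match) AndAlso
               Not comparer.Equals(row, match) Then
                changed.Add(row)
            End If
        Next
        Return changed
    End Function
End Module
```

DataRowComparer is sensitive to the runtime type of each value, so if the two providers return different .NET types for the same column, rows may be reported as changed even when they look identical. Whether that applies here depends on the actual queries.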

Simplest/fastest way to check if value exists in DataTable in VB.net?

I have a DataTable (currently with multiple columns but I could just grab one column if it makes it easier). I want to check if a String value exists in a column of the DataTable. (I'm doing it many times so I want it to be reasonably fast.)
What is a good way to do this? Iterating through the DataTable rows each time seems like a bad way. Can I convert the column to a flat List/Array format, and use a built in function? Something like myStrList.Contains("value")?
You can use Select to check whether the value exists. It returns the matching rows, or an empty array if there are none. Here is some sample code to help you.
Dim foundRow() As DataRow
foundRow = dt.Select("SalesCategory='HP'")
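Turning that into a yes/no check could look like the sketch below. The function name CategoryExists is hypothetical, and the column name SalesCategory is taken from the sample above.

```vbnet
Imports System
Imports System.Data

Public Module SelectDemo
    ' True if at least one row has the given value in the SalesCategory column.
    Public Function CategoryExists(dt As DataTable, category As String) As Boolean
        ' Single quotes inside the value are doubled so the filter expression stays valid.
        Dim filter As String = "SalesCategory = '" & category.Replace("'", "''") & "'"
        Dim foundRows() As DataRow = dt.Select(filter)
        ' Select never returns Nothing; no match yields an empty array.
        Return foundRows.Length > 0
    End Function
End Module
```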
If the data in your DataTable doesn't change very often, and you search the DataTable multiple times, and your DataTable contains many rows, then it's likely going to be a lot faster to build your own index for the data.
The simplest way to do this is to sort the data by the key column so that you can then do a binary search on the sorted list. For instance, you can build an index like this:
Private Function BuildIndex(table As DataTable, keyColumnIndex As Integer) As List(Of String)
Dim index As New List(Of String)(table.Rows.Count)
For Each row As DataRow in table.Rows
index.Add(row(keyColumnIndex))
Next
index.Sort()
Return index
End Function
Then, you can check if a value exists in the index quickly with a binary search, like this:
Private Function ItemExists(index As List(Of String), key As String) As Boolean
' BinarySearch returns a non-negative position when the key is found.
Return index.BinarySearch(key) >= 0
End Function
You could also do the same thing with a simple string array. Or, you could use a Dictionary object (which is an implementation of a hash table) to build a hash index of your DataTable, for instance:
Private Function BuildIndex(table As DataTable, keyColumnIndex As Integer) As Dictionary(Of String, DataRow)
Dim index As New Dictionary(Of String, DataRow)(table.Rows.Count)
For Each row As DataRow in table.Rows
index(row(keyColumnIndex)) = row
Next
Return index
End Function
Then, you can get the matching DataRow for a given key, like this:
Dim index As Dictionary(Of String, DataRow) = BuildIndex(myDataTable, myKeyColumnIndex)
Dim row As DataRow = Nothing
If index.TryGetValue(myKey, row) Then
' row was found, can now use row variable to access all the data in that row
Else
' row with that key does not exist
End If
You may also want to look into using either the SortedList or SortedDictionary class. SortedDictionary is implemented as a binary search tree, while SortedList keeps its keys in a sorted array and finds them by binary search. It's hard to say which of all of these options is going to be fastest in your particular scenario. It all depends on the type of data, how often the index needs to be re-built, how often you search it, how many rows are in the DataTable, and what you need to do with the found items. The best thing to do would be to try each one in a test case and see which one works best for what you need.
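If you only need to know whether a value exists and never need the matching row, a HashSet(Of String) is a lighter variant of the Dictionary index. The following is a sketch under that assumption; the function name BuildKeySet is made up, and the column is assumed to hold String values.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Public Module HashIndexDemo
    ' Collects all values of one String column into a set for fast Contains checks.
    Public Function BuildKeySet(table As DataTable, columnName As String) As HashSet(Of String)
        Return New HashSet(Of String)(
            table.AsEnumerable().Select(Function(r) r.Field(Of String)(columnName)))
    End Function
End Module
```

Build the set once, then call keySet.Contains("value") for each lookup. As with the other indexes, it has to be rebuilt whenever the DataTable changes.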
You should use row filter or DataTable.Rows.Find() instead of select (select does not use indexes). Depending on your table structure, specifically if your field in question is indexed (locally), performance of either way should be much faster than looping through all rows. In .NET, a set of fields needs to be a PrimaryKey to become indexed.
If your field is not indexed, I would avoid both select and row filter, because aside from overhead of class complexity, they don't offer compile time check for correctness of your condition. If it's a long one, you may end up spending lots of time debugging it once in a while.
It is always preferable to have your check strictly typed. Having first defined an underlying type, you can also define this helper method, which you can convert to extension method of DataTable class later:
Shared Function CheckValue(myTable As DataTable, columnName As String, searchValue As String) As Boolean
For Each row As DataRow In myTable.Rows
If row(columnName) = searchValue Then Return True
Next
Return False
End Function
or a more generic version of it:
Shared Function CheckValue(myTable As DataTable, checkFunc As Func(Of DataRow, Boolean)) As Boolean
For Each row As DataRow In myTable.Rows
If checkFunc(row) Then Return True
Next
Return False
End Function
and its usage:
CheckValue(myTable, Function(x) x("myColumn") = "123")
If your row class has MyColumn property of type String, it becomes:
CheckValue(myTable, Function(x) x.myColumn = "123")
One of the benefits of above approach is that you are able to feed calculated fields into your check condition, since myColumn here does not need to match a physical myColumn in the table/database.
bool exists = dt.AsEnumerable().Any(c => c.Field<string>("Author").Equals("your lookup value"));