vb.NET Select distinct... how to use it? - vb.net

Coming from a C# background, I am a bit miffed by my inability to get this simple LINQ query working:
Dim data As List(Of Dictionary(Of String, Object))
Dim dbm As AccessDBManager = GlobalObjectManager.DBManagers("SecondaryAccessDBManager")
data = dbm.Select("*", "T町丁目位置_各務原")
Dim towns As IEnumerable(Of String())
towns = data.Select(Function(d) New String() {d("町名_Trim").ToString(), d("ふりがな").ToString()})
towns = towns.Where(Function(s) s(0).StartsWith(searchTerms) Or s(1).StartsWith(searchTerms)).Distinct()
Call UpdateTownsListView(towns.ToList())
I pasted together the relevant bits, so hopefully there is no error here...
data is loaded from an Access database; it is a list in which each row is stored as a dictionary.
Each element of data has a field containing the name of a Japanese town, a field with its reading, and some other fields such as the row ID.
I have a form with a textbox. When the user types something in, I would like to retrieve from data the town names matching the search terms, without duplicates.
Right now the results contain loads of duplicates. How can I get only distinct results?
I read in some other posts that a key might be needed, but how can I declare this with extension methods?

Distinct uses the default equality comparer to compare values.
Your collection contains arrays of strings, so Distinct won't work the way you expect: arrays are compared by reference, so two different array instances are never equal, even when their contents match.
A solution is to use the Distinct overload which takes an IEqualityComparer.
Class TwoStringArrayEqualityComparer
Implements IEqualityComparer(Of String())
Public Function Equals(s1 As String(), s2 As String()) As Boolean Implements IEqualityComparer(Of String()).Equals
' Note that checking for Nothing is missing
Return s1(0).Equals(s2(0)) AndAlso s1(1).Equals(s2(1))
End Function
Public Function GetHashCode(s As String()) As Integer Implements IEqualityComparer(Of String()).GetHashCode
Return (s(0) + s(1)).GetHashCode() ' probably not perfect :-)
End Function
End Class
...
towns = towns.Where(...).Distinct(new TwoStringArrayEqualityComparer())
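On the "key" part of the question: one alternative to a custom comparer is to project each row into an anonymous type whose properties are marked Key, since VB compares such instances by the values of their Key properties. The sketch below assumes the field names from the question; the module name, FindTowns, SampleData and the sample rows are made up for illustration.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DistinctTownsSketch
    ' Hypothetical sample rows standing in for the database result.
    Function SampleData() As List(Of Dictionary(Of String, Object))
        Return New List(Of Dictionary(Of String, Object)) From {
            New Dictionary(Of String, Object) From {{"町名_Trim", "那加"}, {"ふりがな", "なか"}},
            New Dictionary(Of String, Object) From {{"町名_Trim", "那加"}, {"ふりがな", "なか"}},
            New Dictionary(Of String, Object) From {{"町名_Trim", "鵜沼"}, {"ふりがな", "うぬま"}}
        }
    End Function

    ' Returns the distinct (name, reading) pairs whose name or reading starts with searchTerms.
    Function FindTowns(data As IEnumerable(Of Dictionary(Of String, Object)),
                       searchTerms As String) As List(Of String())
        ' Key properties take part in Equals/GetHashCode, so Distinct compares by value.
        Dim pairs = data.Select(Function(d) New With {
            Key .Name = d("町名_Trim").ToString(),
            Key .Reading = d("ふりがな").ToString()})
        Dim matches = pairs.Where(Function(t) t.Name.StartsWith(searchTerms) OrElse
                                              t.Reading.StartsWith(searchTerms)).Distinct()
        ' Convert back to string arrays, the shape the original code passes on.
        Return matches.Select(Function(t) New String() {t.Name, t.Reading}).ToList()
    End Function

    Sub Main()
        For Each town In FindTowns(SampleData(), "な")
            Console.WriteLine("{0} ({1})", town(0), town(1))
        Next
    End Sub
End Module
```

With the sample rows, the two identical entries collapse into one. The comparer class above remains the better fit when the String() arrays themselves have to flow through the whole query.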

Related

How to get specific index of a Dictionary?

I'm working on code that runs a query through MySqlCommand and returns the result. It all works well, but what I'm trying to do is put the result into a ComboBox. The way I do this is the following:
1. The form's Load event calls the GetAvailableCategories function
2. The function downloads all the values and inserts them into a dictionary
3. The returned dictionary is then iterated, and each item is inserted into the ComboBox
Practical example:
1, 3. The event that fires the function
Private Sub Service_Load(sender As Object, e As EventArgs) Handles MyBase.Load
For Each categoria In Categories.GetAvailableCategories()
service_category.Items.Add(categoria)
Next
End Sub
2. The GetAvailableCategories function
Dim dic As New Dictionary(Of Integer, String)
For Each row In table.Rows
dic.Add(row(0), row(1))
Next
Return dic
As you can see, in points 1 and 3 I call the function that returns the result. What I want to do is use row(0) as the value of the item and row(1) as the item name. But what I actually get in the ComboBox is:
[1, Hair cut]
and I also can't access a specific part of the current item in the iteration. Maybe a dictionary isn't a good choice for this operation?
Sorry if the question is stupid, but it's been a long time since I programmed in vb.net and I need to brush up a bit.
UPDATE
I've understood that I can get at the values through the .Key and .Value of each dictionary entry, so I have what I need if I do:
categoria.Key (returns the ID of the record taken from the db)
categoria.Value (the item name that will be displayed in the ComboBox)
Now the problem is: how do I assign the value of the current item without creating a new class? For example:
service_category.Items.Add(categoria.key, categoria.value)
But I can't do this. Any ideas?
A List as a DataSource sounds like what you are really after. Relying on relative indices in different arrays is sort of flaky. There is not a lot about what these are, but a class would keep the related info together:
Public Class Service
Public Property Name As String
Public Property Category As String
Public Property Id As Int32
End Class
This will keep the different bits of information together. Use instances of the class to store the info read from the db, and use a List to store all of them:
Private Services As New List(of Service)
...
For Each row In table.Rows
Dim s As New Service
s.Name = row(0).ToString() '???
s.Category =...
s.Id = ...
Services.Add(s) ' add this item to list
Next
Finally, bind the List to the CBO:
myCbo.DataSource = Services
myCbo.DisplayMember = "Name" ' what to show in cbo
myCbo.ValueMember = "Id" ' what to use for SelectedValue
I don't really know what you want to show or what the db fields are, so I am guessing. But the larger point is that a Class will keep the different bits of info together better than an array. The List can be the DataSource, so you don't even have to populate the CBO directly. The List can also be sorted, searched, filtered and so forth with LINQ.
When the user picks something, myCbo.SelectedItem should be that item (though it will need to be cast), or you can use SelectedIndex to find it in the list:
thisOne = Services(myCbo.SelectedIndex)
It is also usually a good idea to override ToString in the item/service class. This will determine what shows when a DisplayMember mapping is not available. Without this, WindowsApp2.Service might show for your items:
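Public Overrides Function ToString() As String
' Price is assumed to be an additional property; the Service class above does not define it
Return String.Format("{0} ({1})", Name, Price)
End Function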
This would show something like
Haircut ($12.30)
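To illustrate the remark that such a List can be searched, filtered and sorted, here is a minimal console sketch. It reuses the Service class from this answer; the module name, BuildSampleServices and the sample records are invented for the example, and no ComboBox is involved.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Class Service
    Public Property Name As String
    Public Property Category As String
    Public Property Id As Int32
End Class

Module ServiceListSketch
    ' Hypothetical records standing in for the rows read from the database.
    Function BuildSampleServices() As List(Of Service)
        Return New List(Of Service) From {
            New Service With {.Id = 1, .Name = "Hair cut", .Category = "Hair"},
            New Service With {.Id = 2, .Name = "Manicure", .Category = "Nails"},
            New Service With {.Id = 3, .Name = "Colouring", .Category = "Hair"}
        }
    End Function

    Sub Main()
        Dim services = BuildSampleServices()

        ' Search: Find returns Nothing when no item matches.
        Dim byId = services.Find(Function(s) s.Id = 2)
        Console.WriteLine(byId.Name)

        ' Filter by category, then sort by name.
        Dim hair = services.Where(Function(s) s.Category = "Hair").
                            OrderBy(Function(s) s.Name).ToList()
        For Each s In hair
            Console.WriteLine("{0}: {1}", s.Id, s.Name)
        Next
    End Sub
End Module
```

Because each Service carries its own Id, there is no need to keep a parallel structure in step with the items shown to the user.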

Sorting a SortedDictionary by key length in Visual Basic?

I'm writing a script that anonymizes participant data from a file.
Basically, I have:
A folder of plaintext participant data (sometimes CSV, sometimes XML, sometimes TXT)
A file of known usernames and accompanying anonymous IDs (e.g. jsmith1 as a known username, User123 as an anonymous ID)
I want to replace every instance of the known username with the corresponding anonymous ID.
Generally speaking, what I have works just fine -- it loads in the usernames and anonymous IDs into a dictionary and one by one runs a find-and-replace on the document text for each.
However, this script also strips out names, and it runs into some difficulty when it encounters names contained in other names. So, for example, I have two pairs:
John,User123
Johnny,User456
Now, when I run the find-and-replace, it may first encounter John, and as a result it replaces Johnny with User123ny, and then doesn't trigger Johnny.
The simplest solution I can think of is just to run the find-and-replace from longest key to shortest. To do that, it looks like I need a SortedDictionary.
However, I can't seem to convince Visual Basic to take my custom Comparer for this. How do you specify this? What I have is:
Sub Main()
Dim nameDict As New SortedDictionary(Of String, String)(AddressOf SortKeyByLength)
End Sub
Public Function SortKeyByLength(key1 As String, key2 As String) As Integer
If key1.Length > key2.Length Then
Return 1
ElseIf key1.Length < key2.Length Then
Return -1
Else
Return 0
End If
End Function
(The full details above are in case anyone has any better ideas for how to resolve this problem in general.)
I think it takes a class that implements the IComparer interface, so you'd want something like:
Public Class ByLengthComparer
Implements IComparer(Of String)
Public Function Compare(key1 As String, key2 As String) As Integer Implements IComparer(Of String).Compare
If key1.Length > key2.Length Then
Return 1
ElseIf key1.Length < key2.Length Then
Return -1
Else
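' Do not return 0 here: a SortedDictionary treats keys that compare as equal
' as duplicates, so two different keys of the same length could not both be added.
Return String.CompareOrdinal(key1, key2)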
End If
End Function
End Class
Then, inside your main method, you'd call it like this:
Dim nameDict As New SortedDictionary(Of String, String)(New ByLengthComparer())
You might want to take a look (or a relook) at the documentation for the SortedDictionary constructor, and how to make a class that implements IComparer.
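Note that the comparer above orders keys from shortest to longest, while the question needs the replacements to run from longest to shortest. A small variation, sketched here with made-up names (LongestFirstComparer, Anonymize), reverses the length comparison and then runs the find-and-replace in dictionary order:

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic

' Orders longer keys first; ties are broken ordinally so that
' different keys of the same length are never reported as equal.
Public Class LongestFirstComparer
    Implements IComparer(Of String)

    Public Function Compare(x As String, y As String) As Integer Implements IComparer(Of String).Compare
        Dim byLength = y.Length.CompareTo(x.Length)
        If byLength <> 0 Then Return byLength
        Return String.CompareOrdinal(x, y)
    End Function
End Class

Module AnonymizeSketch
    ' Replaces every known name with its anonymous ID, longest names first.
    Function Anonymize(text As String, names As SortedDictionary(Of String, String)) As String
        For Each pair In names
            text = text.Replace(pair.Key, pair.Value)
        Next
        Return text
    End Function

    Sub Main()
        Dim nameDict As New SortedDictionary(Of String, String)(New LongestFirstComparer())
        nameDict.Add("John", "User123")
        nameDict.Add("Johnny", "User456")
        Console.WriteLine(Anonymize("John met Johnny.", nameDict))
        ' prints "User123 met User456."
    End Sub
End Module
```

"Johnny" is handled before "John", so the shorter name can no longer clobber the longer one. This still assumes that no anonymous ID contains one of the names.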

How to Save/Reload data in vb.net after .exe close?

I am new to vb.net, and this is my first project where I'm fairly certain there is an obvious answer that I just can't find.
Problem: I have a list of a structure I have defined, with many properties. I want to be able to save that list and, after closing the program and starting it back up, load it again with the values I saved beforehand. What is the best way to do this?
This isn't a simple string or bool, otherwise I would use the user settings in the project's properties, as is commonly suggested. I've seen others save it to XML and load it back, but I'm not inclined to do so since this is going to be distributed to others en masse. Since it's a complex structure, what's the commonly preferred method?
Example
Here's a structure:
Structure animal
Dim coloring as string
Dim vaccinesUpToDate as Boolean
Dim species as string
Dim age as integer
End structure
And there's a List(Of animal) to which the user will add, say, 1 cat, 2 dogs, etc. Once the program is closed after the user has added these, I want the list to be saved so that it still has that 1 cat and 2 dogs with those settings and I can display them again. What's the best way to save the data in my program?
Thanks!
Consider serialization. For this, a class is more in order than an old fashioned Struct:
<Serializable>
Class Animal
Public Property Name As String
Public Property Coloring As String
Public Property VaccinesUpToDate As Boolean
Public Property Species As String
Public Property DateOfBirth As DateTime
Public ReadOnly Property Age As Integer
Get
If DateOfBirth <> DateTime.MinValue Then
Return (DateTime.Now.Year - DateOfBirth.Year)
Else
Return 0 ' unknown
End If
End Get
End Property
' many serializers require a simple CTor
Public Sub New()
End Sub
Public Overrides Function ToString() As String
Return String.Format("{0} ({1}, {2})", Name, Species, Age)
End Function
End Class
The ToString() override can be important. It is what will display if you add Animal objects to a ListBox e.g.: "Stripe (Gremlin, 27)"
Friend animalList As New List(of Animal) ' a place to store animals
' create an animal
Dim a As New Animal
a.Coloring = "Orange"
a.Species = "Feline" ' should be an Enum maybe
a.Name = "Ziggy"
a.DateOfBirth = #2/11/2010#
animalList.Add(a)
' animalList(0) is now the Ziggy record. add as many as you like.
In more complex apps, you might write an Animals collection class. In that case, the List might be internal and the collection could save/load the list.
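' Requires: Imports System.IO
' Requires: Imports System.Runtime.Serialization.Formatters.Binary
Friend Sub SaveData(fileName As String)
' FileMode.Create truncates an existing file, so a shorter list leaves no stale bytes behind
Using fs As New FileStream(fileName, FileMode.Create)
Dim bf As New BinaryFormatter
bf.Serialize(fs, animalList)
End Using
End Sub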
Friend Function LoadData(fileName as String) As List(Of Animal)
Dim a As List(of Animal)
Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
Dim bf As New BinaryFormatter
a = CType(bf.Deserialize(fs), List(Of Animal))
End Using
Return a
End Function
XML serialization, ProtoBuf and even JSON use much the same syntax. For a small amount of data, a serialized list is an easy alternative to a database (and has many, many other uses, such as a better Settings approach).
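As an illustration of how similar the XML route looks, here is a minimal sketch using XmlSerializer. This matters in practice because BinaryFormatter is marked obsolete in current .NET releases. The Animal class is trimmed to three properties for brevity, and the module name and the temp-file path are assumptions made for the example.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' XmlSerializer needs a public type with a parameterless constructor.
Public Class Animal
    Public Property Name As String
    Public Property Species As String
    Public Property DateOfBirth As DateTime
End Class

Module AnimalStoreSketch
    Sub SaveData(fileName As String, animals As List(Of Animal))
        Dim xs As New XmlSerializer(GetType(List(Of Animal)))
        ' FileMode.Create overwrites any previous file completely.
        Using fs As New FileStream(fileName, FileMode.Create)
            xs.Serialize(fs, animals)
        End Using
    End Sub

    Function LoadData(fileName As String) As List(Of Animal)
        Dim xs As New XmlSerializer(GetType(List(Of Animal)))
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            Return CType(xs.Deserialize(fs), List(Of Animal))
        End Using
    End Function

    Sub Main()
        ' Hypothetical location; a real app would pick a per-user data folder.
        Dim fileName = Path.Combine(Path.GetTempPath(), "animals.xml")
        SaveData(fileName, New List(Of Animal) From {
            New Animal With {.Name = "Ziggy", .Species = "Feline", .DateOfBirth = #2/11/2010#}})
        Dim loaded = LoadData(fileName)
        Console.WriteLine("{0} ({1})", loaded(0).Name, loaded(0).Species)
    End Sub
End Module
```

Only public read/write properties are written to the file, so a calculated read-only property such as Age is skipped automatically.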
Calculated Fields as Properties
Notice that I added a DateOfBirth property and changed Age to calculate the result. You should not save anything which can be easily calculated: in order to update the Age (or VaccinesUpToDate) you'd have to 'visit' each record, perform a calculation, then save the result, which might be wrong in 24 hours.
The reason for exposing Age as a Property (rather than a function) is for data binding. It is very common to use a List(Of T) as the DataSource:
animalsDGV.DataSource = myAnimals
The result will be a row for each animal with each Property as a column. Fields, as in the original Structure, won't show up. Nor would an Age() function display; wrapping the result in a read-only property makes it show up. In a PropertyGrid, it will appear disabled because it is read-only.
Class versus Structure
So if a Structure using Properties will work, why use a Class instead? From Choosing Between Class and Struct on MSDN, avoid using a Structure unless the type meets all of the following:
It logically represents a single value, similar to primitive types (int, double, etc.)
It has an instance size under 16 bytes
It is immutable
It will not have to be boxed frequently
Animal fails the first 3 points (while it is a local item it is not a value for #1). It may also fail the last depending on how it is used.

Finding distinct lines in large datatables

Currently we have a large DataTable (~152k rows) and are doing a for each over this to find a sub set of distinct entries (~124K rows). This is currently taking about 14 minutes to run which is just far too long.
As we are stuck on .NET 2.0 (our reporting won't work with VS 2008+), I can't use LINQ, though in fairness I don't know if it would be any faster.
Is there a better way to find the distinct lines (invoice numbers in this case) other than this for each loop?
This is the code:
Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
ByVal FieldName As String) As List(Of String)
Dim list As New List(Of String)
For Each row As DataRow In SourceTable.Rows
Dim value As String = CStr(row(FieldName))
If Not list.Contains(value) Then
list.Add(value)
End If
Next
Return list
End Function
Using a Dictionary rather than a List will be quicker:
Dim seen As New Dictionary(Of String, String)
...
If Not seen.ContainsKey(value) Then
seen.Add(value, "")
End If
When you search a List, you're comparing each entry with value, so by the end of the process you're doing ~124K comparisons for each record. A Dictionary, on the other hand, uses hashing to make the lookups much quicker.
When you want to return the list of unique values, use seen.Keys.
(Note that you'd ideally use a Set type for this, but .NET 2.0 doesn't have one.)
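Put together, the function from the question might look like the sketch below. It keeps a separate result list so that the values come back in first-seen order, which the Keys collection does not promise. The column name InvoiceNo and the sample rows are invented for the demonstration, and the code sticks to syntax that works on .NET 2.0.

```vb
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.Data

Module DistinctValuesSketch
    Public Function SelectDistinctList(ByVal sourceTable As DataTable, _
                                       ByVal fieldName As String) As List(Of String)
        ' The dictionary is used only as a hash set; the Boolean value is ignored.
        Dim seen As New Dictionary(Of String, Boolean)
        Dim result As New List(Of String)
        For Each row As DataRow In sourceTable.Rows
            ' As in the original, this assumes the field never holds DBNull.
            Dim value As String = CStr(row(fieldName))
            If Not seen.ContainsKey(value) Then
                seen.Add(value, True)
                result.Add(value)
            End If
        Next
        Return result
    End Function

    Sub Main()
        Dim table As New DataTable()
        table.Columns.Add("InvoiceNo", GetType(String))
        table.Rows.Add("A1")
        table.Rows.Add("B2")
        table.Rows.Add("A1")
        For Each invoice As String In SelectDistinctList(table, "InvoiceNo")
            Console.WriteLine(invoice)
        Next
        ' prints A1 and B2, each once
    End Sub
End Module
```

Each row now costs one hash lookup instead of a scan over everything collected so far, so the total work grows roughly linearly with the number of rows.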

VB.NET - Find a Substring in an ArrayList, StringCollection or List(Of String)

I've got some code that creates a list of AD groups that the user is a member of, with the intention of saying 'if user is a member of GroupX then allow admin access, if not allow basic access'.
I was using a StringCollection to store this list of Groups, and intended to use the Contains method to test for membership of my admin group, but the problem is that this method only compares the full string - but my AD groups values are formatted as cn=GroupX, etc....
I want to be easily able to determine if a particular substring (i.e. 'GroupX') appears in the list of groups. I could always iterate through the groups check each for a substring representing my AD group name, but I'm more interested in finding out if there is a 'better' way.
Clearly there are a number of repositories for the list of Groups, and it appears that Generics (List(Of String)) are more commonly preferred (which I may well implement anyway) but there is no in-built means of checking for a substring using this method either.
Any suggestions? Or should I just iterate through the list of groups?
RESULT:
I've settled on using a List(Of), and I've borrowed from Dan's code to iterate through the list.
I don't think you're going to find a better method than enumerating over the collection*.
That said, here's a good way to do it that will be independent of collection type:
Public Function ContainsSubstring(ByVal objects As IEnumerable, ByVal substring As String) As Boolean
Dim strings = objects.OfType(Of String)()
For Each str As String in strings
If str.Contains(substring) Then Return True
Next
Return False
End Function
This is a good way to address the "which collection to use?" issue since basically all collections, generic or not (ArrayList, List(Of String), etc.), implement IEnumerable.
*Justification for why I believe this forthcoming.
Writing a helper function that iterates through the items, checks each for the substring and returns a Boolean flag seems to be your best bet.
You can use a predicate function for that. It's a boolean function which will help you to filter out some entries.
For example, to get non-hidden files from a list:
Public ReadOnly Property NotHiddenFiles As List(Of FileInfo)
Get
Dim filesDirInfo As New DirectoryInfo(FileStorageDirectory)
Return filesDirInfo.GetFiles.ToList.FindAll(AddressOf NotHiddenPredicate)
End Get
End Property
Private Function NotHiddenPredicate(ByVal f As FileInfo) As Boolean
Return Not ((f.Attributes And FileAttributes.Hidden) = FileAttributes.Hidden)
End Function
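Applied to the AD group question, the same predicate idea can be written inline with a lambda and List(Of String).Exists. This is only a sketch: the module name, IsMemberOf and the distinguished names in the sample list are hypothetical, and the case-insensitive comparison is an assumption about how group names should match.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic

Module GroupCheckSketch
    ' True when any entry in groups contains groupName, ignoring case.
    Function IsMemberOf(groups As List(Of String), groupName As String) As Boolean
        Return groups.Exists(
            Function(g) g.IndexOf(groupName, StringComparison.OrdinalIgnoreCase) >= 0)
    End Function

    Sub Main()
        ' Hypothetical distinguished names, shaped like the cn=GroupX values in the question.
        Dim groups As New List(Of String) From {
            "CN=Staff,OU=Groups,DC=example,DC=com",
            "CN=GroupX,OU=Groups,DC=example,DC=com"
        }
        Console.WriteLine(IsMemberOf(groups, "GroupX"))   ' True
        Console.WriteLine(IsMemberOf(groups, "Admins"))   ' False
    End Sub
End Module
```

A plain substring test also matches longer names such as GroupXY, so searching for "CN=GroupX," may be the safer pattern if group names can share a prefix.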