Parallel.ForEach gives different results each time - vb.net

Please help me convert the following loop to a parallel loop. I tried using Parallel.ForEach and ConcurrentBag instead of HashSet, but the weird thing is that "Matched" returns different results every time.
I can't figure it out... Is it because of thread-safety issues?
The Keywords list contains about 500 unique strings, each 1-3 words in length.
Items contains about 10000 items.
Original code:
Dim Items As IEnumerable(Of Item) = Db.Items.GetAll
Dim Keywords As HashSet(Of String)
Dim Matched As HashSet(Of Item)
For Each Item In Items
    For Each Keyword In Keywords
        If Regex.IsMatch(Headline, String.Format("\b{0}\b", Keyword), RegexOptions.IgnoreCase Or RegexOptions.CultureInvariant) Then
            If Not Matched.Contains(Item) Then
                Matched.Add(Item)
            End If
        End If
    Next
Next
Attempt to convert it to a parallel loop:
Dim Items As IEnumerable(Of Item) = Db.Items.GetAll
Dim Keywords As HashSet(Of String)
Dim Matched As Concurrent.ConcurrentBag(Of Item)
Threading.Tasks.Parallel.ForEach(Of Item)(Items, Sub(Item)
    For Each Keyword In Keywords
        If Regex.IsMatch(Item.Title, String.Format("\b{0}\b", Keyword), RegexOptions.IgnoreCase Or RegexOptions.CultureInvariant) Then
            If Not Matched.Contains(Item) Then
                Matched.Add(Item)
            End If
            Continue For
        End If
    Next
End Sub)

Yes, your code certainly isn't thread-safe. Using thread-safe collections won't automatically make your code thread-safe; you still need to use them correctly.
Your problem is that after Contains() finishes but before Add() is called on one thread, Add() can be called on another thread (and the same can possibly also happen while Contains() executes), so the same item can end up in the bag more than once.
What you need to do is either:
Use locking (this means you won't need to use a thread-safe collection anymore); or
Use something like ConcurrentHashSet. There is no such class in .NET, but you can use ConcurrentDictionary instead (even if it doesn't fit your needs exactly). Instead of your call to Contains() and then Add(), you could do Matched.TryAdd(Item, True), where True is there just because ConcurrentDictionary needs some value.
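A minimal sketch of the ConcurrentDictionary option, assuming plain strings stand in for the Item objects. The module name, function name and sample data are made up for illustration, and Regex.Escape is an addition so that keywords containing regex metacharacters are matched literally:

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions
Imports System.Threading.Tasks

Module ParallelMatchSketch
    ' Hypothetical helper: returns the titles that contain at least one keyword as a whole word.
    Function MatchTitles(titles As IEnumerable(Of String), keywords As String()) As List(Of String)
        ' The Boolean value is a dummy; only the keys matter.
        Dim matched As New ConcurrentDictionary(Of String, Boolean)

        Parallel.ForEach(Of String)(titles,
            Sub(title)
                For Each keyword In keywords
                    Dim pattern = "\b" & Regex.Escape(keyword) & "\b"
                    If Regex.IsMatch(title, pattern, RegexOptions.IgnoreCase Or RegexOptions.CultureInvariant) Then
                        ' TryAdd checks and adds in one atomic step, so no separate Contains() call is needed.
                        matched.TryAdd(title, True)
                        Exit For
                    End If
                Next
            End Sub)

        ' Sort so the result no longer depends on thread scheduling.
        Return matched.Keys.OrderBy(Function(t) t, StringComparer.Ordinal).ToList()
    End Function
End Module
```

Calling MatchTitles({"Big red dog", "Blue sky", "Redder"}, {"red", "sky"}) should give the same two titles on every run, because a duplicate TryAdd simply returns False instead of adding the item a second time.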

Related

Streamwriter: write two listboxes on the same row

I am trying to export the contents of two list boxes, which have the same number of rows, to a text file. I have to export them in the following format: Listbox1, Listbox2. To do this, I tried the following code:
Using writer = New StreamWriter(SaveFileDialog1.FileName)
    For Each o As Object In Form3.ListBox1.Items And Form3.ListBox2.Items
        writer.WriteLine(o)
    Next
End Using
I'm receiving the following error:
BC30452 Operator 'And' is not defined for types 'ListBox.ObjectCollection' and 'ListBox.ObjectCollection'.
I've also tried using three For Each loops, the first for LB1, the second for the commas and the third for LB2, but then the content is exported on separate lines. How could I solve this?
If you use Enumerable.Zip, as suggested in another answer, then you can make the code more succinct by doing away with the explicit loop:
File.WriteAllLines(SaveFileDialog1.FileName,
                   Form3.ListBox1.Items.Cast(Of Object).
                       Zip(Form3.ListBox2.Items.Cast(Of Object),
                           Function(x1, x2) $"{x1}, {x2}"))
If you don't use Zip then you can use a loop this way:
Dim items1 = Form3.ListBox1.Items
Dim items2 = Form3.ListBox2.Items
Using writer = New StreamWriter(SaveFileDialog1.FileName)
    ' Indexes are zero-based, so the last valid index is Count - 1.
    For i = 0 To Math.Min(items1.Count, items2.Count) - 1
        writer.WriteLine($"{items1(i)}, {items2(i)}")
    Next
End Using
The Math.Min part is just in case there are different numbers of items in each ListBox. If you know there aren't then you can do away with that and just use one Count. If there might be different counts but you want to output all items then the code would become slightly more complex to handle that.
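For the case where the counts differ but every item should still be written, one possible sketch is shown here. It assumes a missing partner should become an empty string, and it uses plain lists in place of the ListBox item collections; the module and function names are hypothetical:

```vbnet
Imports System
Imports System.Collections.Generic

Module PairAllSketch
    ' Hypothetical helper: pairs items by index and keeps going until the longer list ends.
    ' Assumption: a missing item on either side is written as an empty string.
    Function PairLines(items1 As IList(Of Object), items2 As IList(Of Object)) As List(Of String)
        Dim lines As New List(Of String)
        For i = 0 To Math.Max(items1.Count, items2.Count) - 1
            Dim left = If(i < items1.Count, items1(i).ToString(), "")
            Dim right = If(i < items2.Count, items2(i).ToString(), "")
            lines.Add($"{left}, {right}")
        Next
        Return lines
    End Function
End Module
```

The returned lines could then be passed to File.WriteAllLines in the same way as the Zip version.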
As the error message says, the syntax you attempted is simply not valid. There's no feature in VB.NET that does that sort of thing.
However, the .NET Framework API does provide a means for something similar, which would probably work in your case. See Enumerable.Zip(). You can use it like this:
Using writer = New StreamWriter(SaveFileDialog1.FileName)
    For Each o As String In Form3.ListBox1.Items.Cast(Of Object).Zip(Form3.ListBox2.Items.Cast(Of Object), Function(x1, x2) x1 & ", " & x2)
        writer.WriteLine(o)
    Next
End Using
Since you said that both list boxes have the same number of items we can use the number of items in the first listbox less one (indexes start at zero) in a For loop.
I used a StringBuilder so the code does not have to throw away and create a new string on each iteration.
I used an interpolated string, indicated by the $ preceding the string. This means I can insert variables in braces, right along with literals.
Call .ToString on the StringBuilder to write to the text file.
Private Sub SaveListBoxes()
    Dim sb As New StringBuilder
    For i = 0 To ListBox1.Items.Count - 1
        sb.AppendLine($"{ListBox1.Items(i)}, {ListBox2.Items(i)}")
    Next
    File.WriteAllText("C:\Users\xxx\Desktop\ListBoxText.txt", sb.ToString)
End Sub

Is this an incorrect way of iterating over a dictionary?

Are there any problems with iterating over a dictionary in the following manner?
Dim dict As New Dictionary(Of String, Integer) From {{"One", 1}, {"Two", 2}, {"Three", 3}}
For i = 0 To dict.Count - 1
    Dim Key = dict.Keys(i)
    Dim Value = dict.Item(Key)
    'Do more work
    dict.Item(Key) = NewValue
Next
I have used it a lot without any problems. But I recently read that the best way to iterate over a dictionary was using a ForEach loop. This led me to question the method that I've used.
Update: Note I am not asking how to iterate over a dictionary, but rather if the method that I've used successfully in the past is wrong and if so why.
Are there any problems with iterating over a dictionary in the following manner?
Yes and no. Technically there's nothing inherently wrong with the way you're doing it as it does what you need it to do, BUT it requires unnecessary computations and is therefore slower than simply using a For Each loop and iterating the key/value-pairs.
Iterating keys, then fetching value
The Keys property is not a separate collection of keys, but is actually just a thin wrapper around the dictionary itself which contains an enumerator for enumerating the keys only. For this reason it also doesn't have an indexer that lets you access the key at a specific index like you are right now.
What's actually happening is that VB.NET is utilizing the extension method ElementAtOrDefault(), which works by stepping through the enumeration until the wanted index has been reached. This means that for every iteration of your main loop, ElementAtOrDefault() also performs a similar step-through iteration until it gets to the index you've specified. You now have two loops, resulting in an O(N * N) = O(N²) operation.
What's more, when you access the value via Item(Key) it has to calculate the hash of the key and determine the respective value to fetch. While this operation is close to O(1), it's still an unnecessary additional operation compared to what I'm talking about below.
Iterating key/value-pairs
The dictionary already has an internal list (array) holding the keys and their respective values, so when iterating the dictionary using a For Each loop all it does is fetch each pair and put them into a KeyValuePair. Since it is fetching directly by index this time (at a specific memory location) you only have one loop, thus the fetch operation is O(1), making your entire loop O(N * 1) = O(N).
Based on this we see that iterating the key/value-pairs is actually faster.
This kind of loop would look like (where kvp is a KeyValuePair(Of String, Integer)):
For Each kvp In dict
    Dim Key = kvp.Key
    Dim Value = kvp.Value
Next
See here:
https://www.dotnetperls.com/dictionary-vbnet
Keys. You can get a List of the Dictionary keys. Dictionary has a get accessor property with the identifier Keys. You can pass the Keys to a List constructor to obtain a List of the keys.
It cites an example similar to yours:
Module Module1
    Sub Main()
        ' Put four keys and values in the Dictionary.
        Dim dictionary As New Dictionary(Of String, Integer)
        dictionary.Add("please", 12)
        dictionary.Add("help", 11)
        dictionary.Add("poor", 10)
        dictionary.Add("people", -11)
        ' Put keys into List Of String.
        Dim list As New List(Of String)(dictionary.Keys)
        ' Loop over each string.
        Dim str As String
        For Each str In list
            ' Print string and also Item(string), which is the value.
            Console.WriteLine("{0}, {1}", str, dictionary.Item(str))
        Next
    End Sub
End Module

Vb Net check if arrayList contains a substring

I am using myArrayList.Contains(myString) and myArrayList.IndexOf(myString) to check if arrayList contains provided string and get its index respectively.
But how could I check whether it contains a substring?
Dim myArrayList As New ArrayList()
myArrayList.Add("sub1;sub2")
myArrayList.Add("sub3;sub4")
So something like myArrayList.Contains("sub3") should return True.
Well you could use the ArrayList to search for substrings with
Dim result = myArrayList.ToArray().Any(Function(x) x.ToString().Contains("sub3"))
Of course the advice to use a strongly typed List(Of String) is absolutely correct.
As far as your question goes, and leaving aside why you need ArrayList at all (it exists only for backwards compatibility): to select the indexes of the items that contain a specific string, a plain loop like this will give you the best performance:
Dim indexes As New List(Of Integer)(100)
For i As Integer = 0 To myArrayList.Count - 1
    If DirectCast(myArrayList(i), String).Contains("sub3") Then
        indexes.Add(i)
    End If
Next
Again, this is if you need to get the indexes. In your case, ArrayList.Contains tests the whole object (the whole string), whereas you need to get each string and test part of it using String.Contains.
If you want to test in a case-insensitive manner, you can use String.IndexOf with a StringComparison argument.
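As a rough sketch of the case-insensitive variant, assuming ordinal (culture-independent) comparison is what is wanted; the module and function names are made up for illustration, and non-string entries are skipped rather than cast:

```vbnet
Imports System
Imports System.Collections
Imports System.Collections.Generic

Module SubstringSearchSketch
    ' Hypothetical helper: returns the indexes of all entries that contain the text, ignoring case.
    Function IndexesContaining(source As ArrayList, text As String) As List(Of Integer)
        Dim indexes As New List(Of Integer)
        For i = 0 To source.Count - 1
            ' TryCast gives Nothing for entries that are not strings, so they are skipped.
            Dim entry = TryCast(source(i), String)
            If entry IsNot Nothing AndAlso entry.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0 Then
                indexes.Add(i)
            End If
        Next
        Return indexes
    End Function
End Module
```

IndexOf returns -1 when the text is not found, which is why the check is ">= 0" rather than "> 0".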

Best way to iterate through Hashtable and conditionally remove entries in VB.NET

In VB.NET, I have a HashTable that I would like to iterate through and conditionally remove entries from. I've written the following code that does the job perfectly, but I'd like to know if there are any creative ways to simplify the code. It just doesn't seem right to have to create a second list to perform this operation.
Here's what I've written:
Dim ModsToRemove As New List(Of String)
For Each ModKey As DictionaryEntry In ModHashTable
    If ModKey.Key.ToString.Contains("Criteria") Then
        ModsToRemove.Add(ModKey.Key.ToString)
    End If
Next
For Each ModKey As String In ModsToRemove
    ModHashTable.Remove(ModKey)
Next
Is there another way to perform the same operation that doesn't require the creation of a second list and a second loop? Is it possible to remove entries from something you are iterating through without throwing an error in VB.NET? Is doing so universally a bad idea?
With a little bit of help from Resharper and LINQ, you can simplify your expression in the following ways.
This code block here can be rewritten to use LINQ instead of the embedded If statement:
For Each ModKey As DictionaryEntry In ModHashTable
    If ModKey.Key.ToString.Contains("Criteria") Then
        ModsToRemove.Add(ModKey.Key.ToString)
    End If
Next
It is equivalent to:
Dim modsToRemove As List(Of String) =
    (From modKey As DictionaryEntry In modHashTable
     Where modKey.Key.ToString.Contains("Criteria")
     Select modKey.Key.ToString).ToList()
Combining this with your actual loop to remove the items from the Hashtable, you should be able to get the equivalent functionality of your example above with the following 3 lines of code:
For Each key As String In (From modkey As DictionaryEntry In modHashTable Where modkey.Key.ToString.Contains("Criteria") Select modkey.Key.ToString).ToList()
    modHashTable.Remove(key)
Next

List Of Multithreading throwing exception

I currently have a thread which at a stage goes through a List(of CustomClass) constantly with a ForEach loop. My problem is if I try to modify that list from the UI thread it throws a:
Collection was modified; enumeration operation may not execute
I tried using SyncLock, which clearly doesn't work the way I thought. I also tried this:
Dim TempList As System.Collections.ObjectModel.ReadOnlyCollection(Of CustomClass) = G_.ActiveEnts.AsReadOnly
For Each _Element In TempList
    'Do stuff
Next
And other variations of it, like converting to an array.
Of course, about 5 seconds after writing this question I decided to put a SyncLock around every access to the List(Of CustomClass), rather than just the one in the thread. So whenever I modify the list, the code takes the same lock first, and this fixed it.
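One way this could look is sketched here, assuming all access goes through a small wrapper class. The class and its member names are hypothetical and not from the original code; handing out a copy for enumeration is an extra design choice so that the lock is not held while the loop body runs:

```vbnet
Imports System.Collections.Generic

' Hypothetical wrapper: every read and write of the inner list takes the same lock.
Public Class LockedList(Of T)
    Private ReadOnly _items As New List(Of T)
    Private ReadOnly _gate As New Object

    Public Sub Add(item As T)
        SyncLock _gate
            _items.Add(item)
        End SyncLock
    End Sub

    Public Function Remove(item As T) As Boolean
        SyncLock _gate
            Return _items.Remove(item)
        End SyncLock
    End Function

    ' Copies the list while holding the lock; the caller can then enumerate
    ' the copy freely, even if another thread modifies the original.
    Public Function Snapshot() As List(Of T)
        SyncLock _gate
            Return New List(Of T)(_items)
        End SyncLock
    End Function
End Class
```

The worker thread would loop over Snapshot() while the UI thread calls Add and Remove. Changes made after the snapshot was taken only show up in the next one.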